Story 3.3: Content Interlinking Injection - Implementation Summary
Status
COMPLETE - All acceptance criteria met, all tests passing
What Was Implemented
Core Module: src/interlinking/content_injection.py
Main function: inject_interlinks() - Injects three types of links into article HTML:
- Tiered Links (Money Site / Lower Tier Articles)
  - Tier 1: Links to money site URL
  - Tier 2+: Links to 2-4 random lower-tier articles
  - Uses tier-appropriate anchor text from anchor_text_generator.py
  - Supports job config overrides (default/override/append modes)
  - Searches for anchor text in content (case-insensitive)
  - Wraps first occurrence or inserts via fallback
- Homepage Links
  - Links to /index.html on the article's domain
  - Uses "Home" as anchor text
  - Searches for "Home" in article content or inserts it
- "See Also" Section
  - Added after last </p> tag
  - Links to ALL other articles in the batch
  - Each link uses article title as anchor text
  - Formatted as <h3> + <ul> list
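The "See Also" step described above can be sketched roughly as follows. This is a minimal, stdlib-only illustration, not the module's actual code (which uses BeautifulSoup); the function names, the dict shape with `id`, `title`, and `url` keys, and the append-at-end behavior when no `</p>` exists are all assumptions made for the example.

```python
import re
from html import escape

def build_see_also_section(current_id, articles):
    """Build an <h3> + <ul> block linking to every other article in the batch.

    `articles` is a hypothetical list of dicts with 'id', 'title', 'url'.
    """
    items = [
        f'<li><a href="{escape(a["url"], quote=True)}">{escape(a["title"])}</a></li>'
        for a in articles
        if a["id"] != current_id  # never link an article to itself
    ]
    if not items:
        return ""  # single-article batch: nothing to link to
    return "<h3>See Also</h3>\n<ul>\n" + "\n".join(items) + "\n</ul>"

def insert_after_last_paragraph(html, section):
    """Insert `section` directly after the last closing </p> tag."""
    if not section:
        return html
    matches = list(re.finditer(r"</p\s*>", html, re.IGNORECASE))
    if not matches:
        # Assumption: with no paragraphs at all, append at the end.
        return html + "\n" + section
    cut = matches[-1].end()
    return html[:cut] + "\n" + section + html[cut:]
```

Article titles are HTML-escaped before being used as anchor text, so a title such as "Gears & Bearings" does not produce invalid markup.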
Template Updates: Navigation Menu
Added responsive navigation menu to all 4 templates (src/templating/templates/):
- basic.html - Clean, simple nav with blue accents
- modern.html - Gradient hover effects matching purple theme
- classic.html - Serif font, muted brown colors
- minimal.html - Uppercase, minimalist black & white
All templates now include:
<nav>
  <ul>
    <li><a href="/index.html">Home</a></li>
    <li><a href="about.html">About</a></li>
    <li><a href="privacy.html">Privacy</a></li>
    <li><a href="contact.html">Contact</a></li>
  </ul>
</nav>
Helper Functions
- _inject_tiered_links() - Handles money site (T1) and lower-tier (T2+) links
- _inject_homepage_link() - Injects "Home" link to /index.html
- _inject_see_also_section() - Builds "See Also" section with batch links
- _get_anchor_texts_for_tier() - Gets anchor text with job config overrides
- _try_inject_link() - Tries to find/wrap anchor text or falls back to insertion
- _find_and_wrap_anchor_text() - Case-insensitive search and wrap (first occurrence only)
- _insert_link_into_random_paragraph() - Fallback insertion into random paragraph
- _extract_homepage_url() - Extracts base domain URL
- _extract_domain_name() - Extracts domain name (removes www.)
- _insert_before_closing_tags() - Inserts content after last </p> tag
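As an illustration of the two URL helpers listed above, here is a small sketch built on `urllib.parse`. The function names (without the leading underscore) are hypothetical, and whether the real `_extract_homepage_url()` appends `/index.html` itself or leaves that to the caller is an assumption.

```python
from urllib.parse import urlparse

def extract_homepage_url(article_url: str) -> str:
    """Return the /index.html URL on the article's own domain.

    Assumption: the helper appends /index.html rather than the caller.
    """
    parsed = urlparse(article_url)
    return f"{parsed.scheme}://{parsed.netloc}/index.html"

def extract_domain_name(article_url: str) -> str:
    """Return the host name, lowercased, with a leading 'www.' removed."""
    host = urlparse(article_url).netloc.lower()
    return host[4:] if host.startswith("www.") else host
```

For example, an article at `https://www.example.com/articles/shaft-machining.html` would get a homepage link to `https://www.example.com/index.html`.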
Database Integration
All injected links are recorded in article_links table:
- Tiered links: link_type="tiered", to_url (money site or lower tier URL)
- Homepage links: link_type="homepage", to_url (domain/index.html)
- See Also links: link_type="wheel_see_also", to_content_id (internal)
Content is updated in generated_content.content field via content_repo.update().
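The three record shapes above could be modeled along these lines. This is only a sketch: `link_type`, `to_url`, and `to_content_id` come from the summary, while the `LinkRecord` class, the `from_content_id` field, and the "exactly one target" check are assumptions made for illustration, not the actual article_links schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class LinkRecord:
    """Hypothetical in-memory view of one row in article_links."""
    from_content_id: int                  # assumed column: the linking article
    link_type: str                        # "tiered", "homepage", or "wheel_see_also"
    to_url: Optional[str] = None          # set for tiered and homepage links
    to_content_id: Optional[int] = None   # set for See Also (internal) links

    def __post_init__(self):
        # Assumption: a link targets either a URL or another article, never both.
        if (self.to_url is None) == (self.to_content_id is None):
            raise ValueError("set exactly one of to_url or to_content_id")

records = [
    LinkRecord(1, "tiered", to_url="https://money.example/"),
    LinkRecord(1, "homepage", to_url="https://a.example/index.html"),
    LinkRecord(1, "wheel_see_also", to_content_id=2),
]
```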
Anchor Text Configuration
Supports three modes in job config:
{
"anchor_text_config": {
"mode": "default|override|append",
"custom_text": ["anchor 1", "anchor 2", ...]
}
}
- default: Use tier-based anchors (T1: main keyword, T2: related searches, T3: main keyword, T4+: entities)
- override: Replace defaults with custom_text
- append: Add custom_text to defaults
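The three modes can be read as a small merge rule over the tier defaults. The sketch below is an assumption about how that resolution might look, not the code of `_get_anchor_texts_for_tier()`; the function name and the choice to treat a missing or unknown mode as "default" are hypothetical.

```python
def resolve_anchor_texts(default_anchors, job_config):
    """Combine tier-based default anchors with the job's custom anchors."""
    config = (job_config or {}).get("anchor_text_config") or {}
    mode = config.get("mode", "default")
    custom = list(config.get("custom_text") or [])

    if mode == "override":
        return custom                          # custom text replaces the defaults
    if mode == "append":
        return list(default_anchors) + custom  # defaults first, then custom text
    # "default", and (by assumption) any unrecognized mode
    return list(default_anchors)
```

With defaults `["shaft machining"]` and custom text `["cnc shafts"]`, override yields only the custom anchor, while append yields both in that order.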
Link Injection Strategy
- Search for anchor text in content (case-insensitive, match within phrases)
- Wrap first occurrence with <a> tag
- Skip existing links (don't link text already inside <a> tags)
- Fallback to insertion if anchor text not found
- Random placement in fallback mode
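The search-and-wrap part of this strategy can be sketched as follows. Note that this is a simplified, regex-based stand-in using only the standard library; the actual module parses HTML with BeautifulSoup, and the function name here is hypothetical. The sketch assumes tags contain no literal `>` inside attribute values, and it leaves the random-paragraph fallback to the caller.

```python
import re
from html import escape

TAG_SPLIT = re.compile(r"(<[^>]+>)")            # splits into text, tag, text, ...
A_OPEN = re.compile(r"<a[\s>]", re.IGNORECASE)  # matches <a ...> but not <abbr>
A_CLOSE = re.compile(r"</a\s*>", re.IGNORECASE)

def wrap_first_occurrence(html, anchor_text, url):
    """Wrap the first match of anchor_text that is not already inside a link.

    Returns (new_html, True) on success, or (html, False) so the caller
    can fall back to inserting the link elsewhere.
    """
    pattern = re.compile(re.escape(anchor_text), re.IGNORECASE)
    parts = TAG_SPLIT.split(html)
    link_depth = 0
    for i, part in enumerate(parts):
        if i % 2 == 1:  # odd indices are the captured tags
            if A_OPEN.match(part):
                link_depth += 1
            elif A_CLOSE.match(part):
                link_depth = max(0, link_depth - 1)
            continue
        if link_depth:  # text already inside an <a> tag: skip it
            continue
        match = pattern.search(part)
        if match:
            # Keep the casing found in the content, not the configured casing.
            linked = f'<a href="{escape(url, quote=True)}">{match.group(0)}</a>'
            parts[i] = part[:match.start()] + linked + part[match.end():]
            return "".join(parts), True
    return html, False
```

Because the match is a plain substring search, the anchor text is also found inside longer phrases, and the original capitalization in the article is preserved.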
Testing
Unit Tests (33 tests in tests/unit/test_content_injection.py):
- Homepage URL extraction
- "See Also" section insertion
- Anchor text finding and wrapping (case-insensitive, within phrases)
- Link insertion into paragraphs
- Anchor text config modes (default, override, append)
- Tiered link injection (T1 money site, T2+ lower tier)
- Error handling
Integration Tests (9 tests in tests/integration/test_content_injection_integration.py):
- Full flow: T1 batch with money site links + See Also section
- Homepage link injection
- T2 batch linking to T1 articles
- Anchor text config overrides (override/append modes)
- Different batch sizes (1 article, 20 articles)
- ArticleLink database records (all link types)
- Internal vs external link handling
All 42 tests pass
Key Design Decisions
- "Home" for homepage links: Using "Home" as anchor text instead of domain name, now that all templates have navigation menus
- Homepage URL: Points to /index.html (not just /)
- Random selection: For T2+ articles, random selection of 2-4 lower-tier URLs to link to
- Case-insensitive matching: "Shaft Machining" matches "shaft machining"
- First occurrence only: Only link the first instance of anchor text to avoid over-optimization
- BeautifulSoup for HTML parsing: Safe, preserves structure, handles malformed HTML
- Fallback insertion: If anchor text not found, insert into random paragraph at random position
- See Also section: Simpler than wheel_next/wheel_prev - all articles link to all others
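The random selection decision for T2+ articles might look like the sketch below. The function name, the injectable `rng` parameter (useful for deterministic tests), and the behavior when fewer than two lower-tier URLs are available are assumptions, not details confirmed by the implementation.

```python
import random

def pick_lower_tier_urls(lower_tier_urls, rng=None, minimum=2, maximum=4):
    """Pick between `minimum` and `maximum` distinct lower-tier URLs at random."""
    rng = rng or random.Random()
    unique = list(dict.fromkeys(lower_tier_urls))  # de-duplicate, keep order
    if len(unique) <= minimum:
        # Assumption: with too few candidates, link to all of them.
        return unique
    count = rng.randint(minimum, min(maximum, len(unique)))
    return rng.sample(unique, count)
```

Passing a seeded `random.Random` instance makes the selection reproducible, which helps when asserting on link counts in tests.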
Files Modified
Created
- src/interlinking/content_injection.py (410 lines)
- tests/unit/test_content_injection.py (363 lines)
- tests/integration/test_content_injection_integration.py (469 lines)
Modified
- src/templating/templates/basic.html - Added navigation menu
- src/templating/templates/modern.html - Added navigation menu
- src/templating/templates/classic.html - Added navigation menu
- src/templating/templates/minimal.html - Added navigation menu
Dependencies
- BeautifulSoup4: HTML parsing and manipulation
- Story 3.1: URL generation (uses generate_urls_for_batch())
- Story 3.2: Tiered link finding (uses find_tiered_links())
- Existing: anchor_text_generator.py for tier-based anchor text
Usage Example
from src.interlinking.content_injection import inject_interlinks
from src.interlinking.tiered_links import find_tiered_links
from src.generation.url_generator import generate_urls_for_batch
# 1. Generate URLs for batch
article_urls = generate_urls_for_batch(content_records, site_repo)
# 2. Find tiered links
tiered_links = find_tiered_links(content_records, job_config, project_repo, content_repo, site_repo)
# 3. Inject all interlinks
inject_interlinks(
    content_records,
    article_urls,
    tiered_links,
    project,
    job_config,
    content_repo,
    link_repo
)
Next Steps
Story 3.3 is complete and ready for:
- Story 4.x: Deployment (will use final HTML with all links)
- Future: Analytics dashboard using article_links table
- Future: Create About, Privacy, Contact pages to match nav menu links
Notes
- Homepage links use "Home" anchor text, pointing to /index.html
- All 4 templates now have consistent navigation structure
- Link relationships fully tracked in database for analytics
- Simple, maintainable code with comprehensive test coverage