CLI Integration Explanation - Story 3.3
The Problem
Story 3.3's inject_interlinks() function and the Story 3.1-3.2 functions are implemented and tested, but the batch generation workflow never calls them.
Current Workflow
When you run:
uv run python main.py generate-batch --job-file jobs/example.json
Here's what actually happens:
Step-by-Step Current Flow
1. CLI Command (src/cli/commands.py)
   └─> generate_batch() function called
       └─> Creates BatchProcessor
           └─> BatchProcessor.process_job()
2. BatchProcessor.process_job() (src/generation/batch_processor.py)
   └─> Reads job file
       └─> For each job:
           └─> _process_single_job()
               └─> Validates deployment targets
               └─> For each tier (tier1, tier2, tier3):
                   └─> _process_tier()
3. _process_tier()
   └─> For each article (1 to count):
       └─> _generate_single_article()
           ├─> Generate title
           ├─> Generate outline
           ├─> Generate content
           ├─> Augment if needed
           └─> SAVE to database
4. END! ⚠️
Nothing happens after articles are generated!
No URLs, no tiered links, no interlinking!
What's Missing
After all articles are generated for a tier, the workflow needs to run the Story 3.1-3.3 steps:
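# THIS CODE DOES NOT EXIST YET!
# Needs to be added at the end of _process_tier() or _process_single_job().
# job / job_config, project, session, site_repo, bunny_client and content_generator
# are placeholders here; bind them to whatever the processor actually holds.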
# 1. Get all generated content for this batch
content_records = self.content_repo.get_by_project_and_tier(project_id, tier_name)
# 2. Assign sites (Story 3.1)
from src.generation.site_assignment import assign_sites_to_batch
assign_sites_to_batch(content_records, job, site_repo, bunny_client, project.main_keyword)
# 3. Generate URLs (Story 3.1)
from src.generation.url_generator import generate_urls_for_batch
article_urls = generate_urls_for_batch(content_records, site_repo)
# 4. Find tiered links (Story 3.2)
from src.interlinking.tiered_links import find_tiered_links
tiered_links = find_tiered_links(
    content_records, job_config, project_repo, content_repo, site_repo
)
# 5. Inject interlinks (Story 3.3)
from src.interlinking.content_injection import inject_interlinks
from src.database.repositories import ArticleLinkRepository
link_repo = ArticleLinkRepository(session)
inject_interlinks(
    content_records, article_urls, tiered_links,
    project, job_config, content_repo, link_repo
)
# 6. Apply templates (existing functionality)
for content in content_records:
    content_generator.apply_template(content.id)
Why This Matters
Current State
✓ Articles are generated
✗ Articles have NO internal links
✗ Articles have NO tiered links
✗ Articles have NO "See Also" section
✗ Articles have NO final URLs assigned
✗ Templates are NOT applied
Result: Articles sit in database with raw HTML, no links, unusable for deployment
With Integration
✓ Articles are generated
✓ Sites are assigned to articles
✓ Final URLs are generated
✓ Tiered links are found
✓ All links are injected
✓ Templates are applied
✓ Articles are ready for deployment
Result: Complete, interlinked articles ready for Story 4.x deployment
Where to Add Integration
Option 1: End of _process_tier() (RECOMMENDED)
Add the integration code at line 162 (after the article generation loop):
def _process_tier(self, project_id, tier_name, tier_config, ...):
    # ... existing article generation loop ...

    # NEW: Post-generation interlinking
    click.echo(f" {tier_name}: Injecting interlinks for {tier_config.count} articles...")
    self._inject_tier_interlinks(project_id, tier_name, job, debug)
Then create new method:
def _inject_tier_interlinks(self, project_id, tier_name, job, debug):
    """Inject interlinks for all articles in a tier"""
    # Get all articles for this tier
    content_records = self.content_repo.get_by_project_and_tier(
        project_id, tier_name
    )
    if not content_records:
        click.echo(f" Warning: No articles found for {tier_name}")
        return
    # Steps 1-6 from above...
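The ordering of steps 1-6 can be sketched without the real modules. This is a minimal sketch under stated assumptions, not the project's implementation: `PostProcessSteps` and `post_process_tier` are hypothetical names, and each callable stands in for the corresponding Story 3.1-3.3 function, whose real signatures take more arguments (repositories, job config, and so on).

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class PostProcessSteps:
    """Hypothetical bundle of callables standing in for the Story 3.1-3.3 functions."""
    assign_sites: Callable[[List[Any]], None]
    generate_urls: Callable[[List[Any]], List[Any]]
    find_tiered_links: Callable[[List[Any]], Dict[str, Any]]
    inject_interlinks: Callable[[List[Any], List[Any], Dict[str, Any]], None]
    apply_template: Callable[[Any], None]


def post_process_tier(content_records: List[Any], steps: PostProcessSteps) -> bool:
    """Run the post-generation steps in order; return False if there is nothing to do."""
    if not content_records:
        return False
    steps.assign_sites(content_records)                      # Story 3.1
    article_urls = steps.generate_urls(content_records)      # Story 3.1
    tiered_links = steps.find_tiered_links(content_records)  # Story 3.2
    steps.inject_interlinks(content_records, article_urls, tiered_links)  # Story 3.3
    for record in content_records:                           # existing template step
        steps.apply_template(record)
    return True


# Usage with recording stubs (assumed, for illustration only).
calls: List[str] = []
stub_steps = PostProcessSteps(
    assign_sites=lambda records: calls.append("assign_sites"),
    generate_urls=lambda records: calls.append("generate_urls") or [],
    find_tiered_links=lambda records: calls.append("find_tiered_links") or {},
    inject_interlinks=lambda records, urls, links: calls.append("inject_interlinks"),
    apply_template=lambda record: calls.append("apply_template"),
)
ran = post_process_tier(["article-1", "article-2"], stub_steps)
```

Passing the steps in as callables keeps the ordering testable on its own; the real method would call the imported functions directly.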
Option 2: End of _process_single_job()
Add integration after ALL tiers are generated (processes entire job at once):
def _process_single_job(self, job, job_idx, debug, continue_on_error):
    # ... existing tier processing ...

    # NEW: Process all tiers together
    click.echo("\nPost-processing: Injecting interlinks...")
    for tier_name in job.tiers:
        self._inject_tier_interlinks(job.project_id, tier_name, job, debug)
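The practical difference between the two options is the order in which generation and interlinking run. The sketch below traces that order with placeholder callables; `run_option_1`, `run_option_2`, and `trace` are hypothetical names used only for illustration.

```python
from typing import Callable, List


def run_option_1(tiers: List[str], generate: Callable[[str], None],
                 post_process: Callable[[str], None]) -> None:
    """Option 1: post-process each tier right after it is generated."""
    for tier in tiers:
        generate(tier)
        post_process(tier)


def run_option_2(tiers: List[str], generate: Callable[[str], None],
                 post_process: Callable[[str], None]) -> None:
    """Option 2: generate every tier first, then post-process them all."""
    for tier in tiers:
        generate(tier)
    for tier in tiers:
        post_process(tier)


def trace(runner) -> List[str]:
    """Record the order of events produced by one of the runners."""
    events: List[str] = []
    runner(
        ["tier1", "tier2"],
        lambda tier: events.append(f"generate {tier}"),
        lambda tier: events.append(f"interlink {tier}"),
    )
    return events


option_1_events = trace(run_option_1)
option_2_events = trace(run_option_2)
```

With Option 2, every tier's articles exist before any links are computed. Whether that matters depends on how find_tiered_links resolves its targets, which is not specified here.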
Why It Wasn't Integrated Yet
Looking at the story implementations, it appears:
- Story 3.1 (URL Generation) - Functions exist but not integrated
- Story 3.2 (Tiered Links) - Functions exist but not integrated
- Story 3.3 (Content Injection) - Functions exist but not integrated
This suggests the stories focused on building the functionality with the expectation that Story 4.x (Deployment) would integrate everything together.
Impact of Missing Integration
Tests Still Pass ✓
- Unit tests test functions in isolation
- Integration tests use the functions directly
- All 42 tests pass because the functions work perfectly
But Real Usage Fails ✗
When you actually run generate-batch:
- Articles are generated
- They're saved to database
- But they have no links, no URLs, nothing
- Story 4.x deployment would fail because articles aren't ready
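A workflow-level test would catch this kind of gap, because it asserts that the entry point actually invokes the post-generation step. The following is a minimal sketch using `unittest.mock`; `TierWorkflow` is a hypothetical, stripped-down stand-in for the tier loop, not the real BatchProcessor, and its method names are assumptions.

```python
from unittest.mock import MagicMock, call


class TierWorkflow:
    """Hypothetical stand-in for the tier loop in BatchProcessor._process_tier()."""

    def __init__(self, generate_article, inject_tier_interlinks):
        self._generate_article = generate_article
        self._inject_tier_interlinks = inject_tier_interlinks

    def process_tier(self, tier_name: str, count: int) -> None:
        for index in range(1, count + 1):
            self._generate_article(tier_name, index)
        # The call a workflow-level test should pin down.
        self._inject_tier_interlinks(tier_name)


def test_process_tier_runs_interlinking_after_generation():
    # A shared parent mock records child calls in order.
    parent = MagicMock()
    workflow = TierWorkflow(parent.generate, parent.inject)
    workflow.process_tier("tier1", 2)
    assert parent.mock_calls == [
        call.generate("tier1", 1),
        call.generate("tier1", 2),
        call.inject("tier1"),
    ]


test_process_tier_runs_interlinking_after_generation()
```

If the interlinking call were missing from `process_tier`, this test would fail, while tests that call the interlinking function directly would still pass.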
Effort to Fix
Time Estimate: 30-60 minutes
Tasks:
- Add imports to batch_processor.py (2 minutes)
- Create _inject_tier_interlinks() method (15 minutes)
- Add call at end of _process_tier() (2 minutes)
- Test with real job file (10 minutes)
- Debug any issues (10-20 minutes)
Complexity: Low - just wiring existing functions together
Testing the Integration
After adding integration:
# 1. Run batch generation
uv run python main.py generate-batch \
  --job-file jobs/test_small.json \
  --username admin \
  --password yourpass
# 2. Check database for links
uv run python -c "
from src.database.session import db_manager
from src.database.repositories import ArticleLinkRepository
session = db_manager.get_session()
link_repo = ArticleLinkRepository(session)
links = link_repo.get_all()
print(f'Total links: {len(links)}')
for link in links[:5]:
    print(f' {link.link_type}: {link.anchor_text} -> {link.to_url or link.to_content_id}')
session.close()
"
# 3. Verify articles have links in content
uv run python -c "
from src.database.session import db_manager
from src.database.repositories import GeneratedContentRepository
session = db_manager.get_session()
content_repo = GeneratedContentRepository(session)
articles = content_repo.get_all(limit=1)
if articles:
    print('Sample article content:')
    print(articles[0].content[:500])
    print(f'Contains links: {\"<a href=\" in articles[0].content}')
    print(f'Has See Also: {\"See Also\" in articles[0].content}')
session.close()
"
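The substring check above only matches anchors written exactly as `<a href=`, so a link with another attribute first (for example `<a class="ref" href="...">`) would be missed. A slightly sturdier check can parse the HTML with the standard library. This is a sketch only: `LinkCollector` and `summarize_links` are hypothetical helpers, and the literal "See Also" heading text is taken from the check above.

```python
from html.parser import HTMLParser
from typing import Any, Dict, List


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, regardless of attribute order."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def summarize_links(html: str) -> Dict[str, Any]:
    """Return the link count, the hrefs found, and whether 'See Also' appears."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return {
        "link_count": len(collector.hrefs),
        "hrefs": collector.hrefs,
        "has_see_also": "See Also" in html,
    }


sample = '<p><a class="ref" href="https://example.com/a">A</a></p><h3>See Also</h3>'
summary = summarize_links(sample)
```

In the check above, `summarize_links(articles[0].content)` could replace the two substring tests.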
Summary
The Good News:
- All Story 3.3 code is perfect ✓
- Tests prove functionality works ✓
- No bugs, no issues ✓
The Bad News:
- Code isn't wired into CLI workflow ✗
- Running generate-batch doesn't use Story 3.1-3.3 ✗
- Articles are incomplete without integration ✗
The Fix:
- Add ~50 lines of integration code
- Wire existing functions into BatchProcessor
- Test with real job file
- Done! ✓
When to Fix:
- Now (before Story 4.x) - RECOMMENDED
- Or during Story 4.x (when deployment needs links)
- Not urgent if not deploying yet
This explains why all tests pass but the feature "isn't done" yet: the plumbing exists but is not connected to the main pipeline.