# CLI Integration Explanation - Story 3.3

## The Problem

Story 3.3's `inject_interlinks()` function, like the Story 3.1 and 3.2 functions, is **implemented and tested**, but it is **never called** in the actual batch generation workflow.

## Current Workflow

When you run:
```bash
uv run python main.py generate-batch --job-file jobs/example.json
```

Here's what actually happens:

### Step-by-Step Current Flow

```
1. CLI Command (src/cli/commands.py)
   └─> generate_batch() function called
       └─> Creates BatchProcessor
           └─> BatchProcessor.process_job()

2. BatchProcessor.process_job() (src/generation/batch_processor.py)
   └─> Reads job file
   └─> For each job:
       └─> _process_single_job()
           └─> Validates deployment targets
           └─> For each tier (tier1, tier2, tier3):
               └─> _process_tier()

3. _process_tier()
   └─> For each article (1 to count):
       └─> _generate_single_article()
           ├─> Generate title
           ├─> Generate outline
           ├─> Generate content
           ├─> Augment if needed
           └─> SAVE to database

4. END! ⚠️

Nothing happens after articles are generated!
No URLs, no tiered links, no interlinking!
```

## What's Missing

After all articles are generated for a tier, the Story 3.1-3.3 steps need to run:

```python
# THIS CODE DOES NOT EXIST YET!
# Needs to be added at the end of _process_tier() or _process_single_job().
# Assumes job, job_config, project, session, bunny_client, content_generator
# and the project/content/site repositories are available in the calling scope.

from src.database.repositories import ArticleLinkRepository
from src.generation.site_assignment import assign_sites_to_batch
from src.generation.url_generator import generate_urls_for_batch
from src.interlinking.content_injection import inject_interlinks
from src.interlinking.tiered_links import find_tiered_links

# 1. Get all generated content for this batch
content_records = self.content_repo.get_by_project_and_tier(project_id, tier_name)

# 2. Assign sites (Story 3.1)
assign_sites_to_batch(content_records, job, site_repo, bunny_client, project.main_keyword)

# 3. Generate URLs (Story 3.1)
article_urls = generate_urls_for_batch(content_records, site_repo)

# 4. Find tiered links (Story 3.2)
tiered_links = find_tiered_links(
    content_records, job_config, project_repo, content_repo, site_repo
)

# 5. Inject interlinks (Story 3.3)
link_repo = ArticleLinkRepository(session)
inject_interlinks(
    content_records, article_urls, tiered_links,
    project, job_config, content_repo, link_repo
)

# 6. Apply templates (existing functionality)
for content in content_records:
    content_generator.apply_template(content.id)
```

## Why This Matters

### Current State
- ✓ Articles are generated
- ✗ Articles have NO internal links
- ✗ Articles have NO tiered links
- ✗ Articles have NO "See Also" section
- ✗ Articles have NO final URLs assigned
- ✗ Templates are NOT applied

**Result**: Articles sit in the database as raw HTML with no links, unusable for deployment.

### With Integration
- ✓ Articles are generated
- ✓ Sites are assigned to articles
- ✓ Final URLs are generated
- ✓ Tiered links are found
- ✓ All links are injected
- ✓ Templates are applied
- ✓ Articles are ready for deployment

**Result**: Complete, interlinked articles ready for Story 4.x deployment.

## Where to Add Integration

### Option 1: End of `_process_tier()` (RECOMMENDED)
Add the integration code at line 162 (after the article generation loop):

```python
def _process_tier(self, project_id, tier_name, tier_config, ...):
    # ... existing article generation loop ...

    # NEW: Post-generation interlinking
    click.echo(f" {tier_name}: Injecting interlinks for {tier_config.count} articles...")
    self._inject_tier_interlinks(project_id, tier_name, job, debug)
```

Then create the new method:
```python
def _inject_tier_interlinks(self, project_id, tier_name, job, debug):
    """Inject interlinks for all articles in a tier"""
    # Get all articles for this tier
    content_records = self.content_repo.get_by_project_and_tier(
        project_id, tier_name
    )

    if not content_records:
        click.echo(f" Warning: No articles found for {tier_name}")
        return

    # Steps 1-6 from above...
```

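The `# Steps 1-6 from above...` placeholder could be filled in roughly as follows. This is a minimal sketch rather than the project's actual code: the step functions are passed in as plain callables so the example runs on its own, and `PostGenerationSteps`, `inject_tier_interlinks`, `demo`, and the recording lambdas are hypothetical stand-ins for the real Story 3.1-3.3 functions, whose signatures take more arguments.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class PostGenerationSteps:
    """Hypothetical bundle of the Story 3.1-3.3 entry points (stand-ins only)."""
    assign_sites: Callable[[List[Any]], None]
    generate_urls: Callable[[List[Any]], List[Any]]
    find_tiered_links: Callable[[List[Any]], Dict[str, Any]]
    inject_interlinks: Callable[[List[Any], List[Any], Dict[str, Any]], None]
    apply_template: Callable[[Any], None]


def inject_tier_interlinks(content_records, steps, log=print):
    """Run the post-generation steps for one tier, in the order listed above."""
    if not content_records:
        log("Warning: no articles found, skipping interlinking")
        return False
    steps.assign_sites(content_records)                      # step 2 (Story 3.1)
    urls = steps.generate_urls(content_records)              # step 3 (Story 3.1)
    tiered = steps.find_tiered_links(content_records)        # step 4 (Story 3.2)
    steps.inject_interlinks(content_records, urls, tiered)   # step 5 (Story 3.3)
    for record in content_records:
        steps.apply_template(record)                         # step 6 (templates)
    return True


def demo():
    """Wire in recording stubs and return the order in which the steps ran."""
    calls = []
    steps = PostGenerationSteps(
        assign_sites=lambda recs: calls.append("assign_sites"),
        generate_urls=lambda recs: calls.append("generate_urls") or [],
        find_tiered_links=lambda recs: calls.append("find_tiered_links") or {},
        inject_interlinks=lambda recs, urls, tiered: calls.append("inject_interlinks"),
        apply_template=lambda rec: calls.append(f"apply_template:{rec}"),
    )
    inject_tier_interlinks(["article-1", "article-2"], steps)
    return calls


print(demo())
```

Passing the steps in as callables is only a convenience for the sketch; in `BatchProcessor` the method would more likely import the real functions directly, as in the "What's Missing" listing.
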
### Option 2: End of `_process_single_job()`
Add the integration after ALL tiers are generated (processes the entire job at once):

```python
def _process_single_job(self, job, job_idx, debug, continue_on_error):
    # ... existing tier processing ...

    # NEW: Process all tiers together
    click.echo("\nPost-processing: Injecting interlinks...")
    for tier_name in job.tiers.keys():
        self._inject_tier_interlinks(job.project_id, tier_name, job, debug)
```

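The practical difference between the two options is when interlinking runs relative to generation. The toy sketch below only records event order; `run_job` and its event strings are illustrative and not part of the codebase.

```python
def run_job(tiers, per_tier):
    """Return the event order for a job; `tiers` maps tier name -> article count."""
    events = []
    for tier_name, count in tiers.items():
        events.extend(f"generate {tier_name}#{i}" for i in range(1, count + 1))
        if per_tier:
            # Option 1: interlink as soon as each tier finishes generating.
            events.append(f"interlink {tier_name}")
    if not per_tier:
        # Option 2: interlink only after every tier has been generated.
        events.extend(f"interlink {tier_name}" for tier_name in tiers)
    return events


job_tiers = {"tier1": 1, "tier2": 1}
print(run_job(job_tiers, per_tier=True))
print(run_job(job_tiers, per_tier=False))
```

Under Option 1 a tier is interlinked before later tiers exist. Whether that matters depends on which tiers `find_tiered_links` needs to see, which should be confirmed against the Story 3.2 implementation.
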
## Why It Wasn't Integrated Yet

Looking at the story implementations, it appears:

1. **Story 3.1** (URL Generation) - Functions exist but not integrated
2. **Story 3.2** (Tiered Links) - Functions exist but not integrated
3. **Story 3.3** (Content Injection) - Functions exist but not integrated

This suggests the stories focused on **building the functionality** with the expectation that **Story 4.x (Deployment)** would integrate everything together.

## Impact of Missing Integration

### Tests Still Pass ✓
- Unit tests exercise the functions in isolation
- Integration tests call the functions directly
- All 42 tests pass because the **functions themselves work**

### But Real Usage Fails ✗
When you actually run `generate-batch`:
- Articles are generated
- They're saved to the database
- But they have no links and no URLs
- Story 4.x deployment would fail because the articles aren't ready

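One way to catch this kind of gap is a workflow-level test that asserts the post-generation step is actually invoked, instead of testing the step in isolation. The sketch below uses a hypothetical `MiniBatchProcessor` stand-in (not the real `BatchProcessor`) together with `unittest.mock.MagicMock` to show the idea; the `wired` flag exists only to reproduce both the current and the integrated behavior in one example.

```python
from unittest.mock import MagicMock


class MiniBatchProcessor:
    """Hypothetical, stripped-down stand-in for BatchProcessor."""

    def __init__(self, interlinker, wired=True):
        self.interlinker = interlinker
        self.wired = wired  # False reproduces the current, unwired workflow
        self.saved = []

    def _generate_single_article(self, tier_name, index):
        # Stand-in for title/outline/content generation plus the database save.
        self.saved.append(f"{tier_name}-article-{index}")

    def process_tier(self, tier_name, count):
        for index in range(1, count + 1):
            self._generate_single_article(tier_name, index)
        if self.wired:
            # The missing call: hand the tier's articles to the interlinking step.
            self.interlinker(tier_name, list(self.saved))


def interlinking_was_called(wired):
    """Return True if processing a tier triggers the interlinking step."""
    interlinker = MagicMock()
    processor = MiniBatchProcessor(interlinker, wired=wired)
    processor.process_tier("tier1", count=2)
    return interlinker.called


print(interlinking_was_called(wired=False))  # current workflow
print(interlinking_was_called(wired=True))   # after integration
```

A test of this shape against the real `generate-batch` path would fail today, which is the signal the existing 42 tests cannot give.
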
## Effort to Fix

**Time Estimate**: 30-60 minutes

**Tasks**:
1. Add imports to `batch_processor.py` (2 minutes)
2. Create `_inject_tier_interlinks()` method (15 minutes)
3. Add call at end of `_process_tier()` (2 minutes)
4. Test with real job file (10 minutes)
5. Debug any issues (10-20 minutes)

**Complexity**: Low - just wiring existing functions together

## Testing the Integration

After adding integration:

```bash
# 1. Run batch generation
uv run python main.py generate-batch \
    --job-file jobs/test_small.json \
    --username admin \
    --password yourpass

# 2. Check database for links
uv run python -c "
from src.database.session import db_manager
from src.database.repositories import ArticleLinkRepository

session = db_manager.get_session()
link_repo = ArticleLinkRepository(session)
links = link_repo.get_all()
print(f'Total links: {len(links)}')
for link in links[:5]:
    print(f'  {link.link_type}: {link.anchor_text} -> {link.to_url or link.to_content_id}')
session.close()
"

# 3. Verify articles have links in content
uv run python -c "
from src.database.session import db_manager
from src.database.repositories import GeneratedContentRepository

session = db_manager.get_session()
content_repo = GeneratedContentRepository(session)
articles = content_repo.get_all(limit=1)
if articles:
    print('Sample article content:')
    print(articles[0].content[:500])
    print(f'Contains links: {\"<a href=\" in articles[0].content}')
    print(f'Has See Also: {\"See Also\" in articles[0].content}')
session.close()
"
```

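The substring checks in step 3 can give false positives, for example when "See Also" appears in ordinary body text. A slightly stricter check, sketched here with the standard library's `html.parser`, counts real anchor tags and looks for a heading that contains "See Also". The name `summarize_links` and the assumption that the section is introduced by an `h2`-`h4` heading are illustrative; the markup actually produced by `inject_interlinks()` may differ.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h2", "h3", "h4"}  # assumed heading levels for "See Also"


class _LinkScanner(HTMLParser):
    """Collect href values and detect a 'See Also' heading."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.has_see_also = False
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)
        elif tag in HEADING_TAGS:
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading and "see also" in data.lower():
            self.has_see_also = True


def summarize_links(html):
    """Return the link count, the hrefs, and whether a 'See Also' heading exists."""
    scanner = _LinkScanner()
    scanner.feed(html)
    scanner.close()
    return {
        "link_count": len(scanner.hrefs),
        "hrefs": scanner.hrefs,
        "has_see_also": scanner.has_see_also,
    }


sample = (
    '<p>Read <a href="https://example.com/a">this</a>.</p>'
    '<h3>See Also</h3><ul><li><a href="https://example.com/b">B</a></li></ul>'
)
print(summarize_links(sample))
```

In the step 3 snippet, `summarize_links(articles[0].content)` could replace the two substring checks.
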
## Summary

**The Good News**:
- All Story 3.3 code is implemented ✓
- Tests show the functionality works ✓
- No known bugs in the functions themselves ✓

**The Bad News**:
- Code isn't wired into the CLI workflow ✗
- Running `generate-batch` doesn't use Story 3.1-3.3 ✗
- Articles are incomplete without integration ✗

**The Fix**:
- Add ~50 lines of integration code
- Wire existing functions into `BatchProcessor`
- Test with a real job file
- Done! ✓

**When to Fix**:
- Now (before Story 4.x) - RECOMMENDED
- Or during Story 4.x (when deployment needs links)
- Not urgent if not deploying yet

---

*This explains why all tests pass but the feature "isn't done" yet - the plumbing exists, it's just not connected to the main pipeline.*