# Visual: The Integration Gap

## What Currently Happens

```
┌─────────────────────────────────────────────────────────────┐
│ uv run python main.py generate-batch --job-file jobs/x.json │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│  BatchProcessor.process_job()                               │
│                                                             │
│  For each tier (tier1, tier2, tier3):                       │
│    For each article (1 to N):                               │
│      ┌──────────────────────────────────┐                   │
│      │  1. Generate title               │                   │
│      │  2. Generate outline             │                   │
│      │  3. Generate content             │                   │
│      │  4. Augment if too short         │                   │
│      │  5. Save to database             │                   │
│      └──────────────────────────────────┘                   │
│                                                             │
│      ⚠️ STOPS HERE! ⚠️                                      │
└─────────────────────────────────────────────────────────────┘

Result in database:
┌──────────────────────────────────────────────────────────────┐
│  generated_content table:                                    │
│    - Raw HTML (no links)                                     │
│    - No site_deployment_id (most articles)                   │
│    - No final URL                                            │
│    - No formatted_html                                       │
│                                                              │
│  article_links table:                                        │
│    - EMPTY (no records)                                      │
└──────────────────────────────────────────────────────────────┘
```

## What SHOULD Happen

```
┌─────────────────────────────────────────────────────────────┐
│ uv run python main.py generate-batch --job-file jobs/x.json │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│  BatchProcessor.process_job()                               │
│                                                             │
│  For each tier (tier1, tier2, tier3):                       │
│    For each article (1 to N):                               │
│      ┌──────────────────────────────────┐                   │
│      │  1. Generate title               │                   │
│      │  2. Generate outline             │                   │
│      │  3. Generate content             │                   │
│      │  4. Augment if too short         │                   │
│      │  5. Save to database             │                   │
│      └──────────────────────────────────┘                   │
│                                                             │
│  ✨ NEW: After all articles in tier generated ✨            │
│      ┌──────────────────────────────────┐                   │
│      │  6. Assign sites (Story 3.1)     │ ← MISSING         │
│      │  7. Generate URLs (Story 3.1)    │ ← MISSING         │
│      │  8. Find tiered links (3.2)      │ ← MISSING         │
│      │  9. Inject interlinks (3.3)      │ ← MISSING         │
│      │  10. Apply templates             │ ← MISSING         │
│      └──────────────────────────────────┘                   │
└─────────────────────────────────────────────────────────────┘

Result in database:
┌──────────────────────────────────────────────────────────────┐
│  generated_content table:                                    │
│    ✅ Final HTML with all links injected                     │
│    ✅ site_deployment_id assigned                            │
│    ✅ Final URL generated                                    │
│    ✅ formatted_html with template applied                   │
│                                                              │
│  article_links table:                                        │
│    ✅ Tiered links (T1→money site, T2→T1)                    │
│    ✅ Homepage links (all→/index.html)                       │
│    ✅ See Also links (all→all in batch)                      │
└──────────────────────────────────────────────────────────────┘
```
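
One way to tell which of the two database states you are in is to list the articles that are still missing post-processing output. The sketch below is illustrative only. The table names `generated_content` and `article_links` and the columns `site_deployment_id` and `formatted_html` come from the diagrams; the `url` and `from_content_id` columns, the simplified schema, and both helper functions are assumptions, not the project's real schema or API.

```python
import sqlite3


def find_incomplete_articles(conn):
    """Return ids of articles that are missing any post-processing output.

    Assumed (hypothetical) schema:
      generated_content(id, site_deployment_id, url, formatted_html)
      article_links(id, from_content_id)
    """
    rows = conn.execute(
        """
        SELECT gc.id
        FROM generated_content AS gc
        WHERE gc.site_deployment_id IS NULL
           OR gc.url IS NULL
           OR gc.formatted_html IS NULL
           OR NOT EXISTS (
               SELECT 1 FROM article_links AS al
               WHERE al.from_content_id = gc.id
           )
        ORDER BY gc.id
        """
    ).fetchall()
    return [row[0] for row in rows]


def build_demo_db():
    """Create an in-memory database with one complete and one incomplete article."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE generated_content (
            id INTEGER PRIMARY KEY,
            site_deployment_id INTEGER,
            url TEXT,
            formatted_html TEXT
        );
        CREATE TABLE article_links (
            id INTEGER PRIMARY KEY,
            from_content_id INTEGER NOT NULL
        );
        INSERT INTO generated_content VALUES
            (1, 10, 'https://example.com/a.html', '<html>...</html>'),
            (2, NULL, NULL, NULL);
        INSERT INTO article_links (from_content_id) VALUES (1);
        """
    )
    return conn


if __name__ == "__main__":
    print(find_incomplete_articles(build_demo_db()))
```

With the demo data, only article 2 is reported, which corresponds to the "What Currently Happens" state for that row.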

## The Gap in Code

### Current Code Structure

```python
# src/generation/batch_processor.py

class BatchProcessor:
    def _process_tier(self, project_id, tier_name, tier_config, ...):
        """Process all articles for a tier"""

        # Generate each article
        for article_num in range(1, tier_config.count + 1):
            self._generate_single_article(...)
            self.stats["generated_articles"] += 1

        # ⚠️ Method ends here!
        # Nothing happens after article generation
```

### What Needs to Be Added

```python
# src/generation/batch_processor.py

class BatchProcessor:
    def _process_tier(self, project_id, tier_name, tier_config, ...):
        """Process all articles for a tier"""

        # Generate each article
        for article_num in range(1, tier_config.count + 1):
            self._generate_single_article(...)
            self.stats["generated_articles"] += 1

        # ✨ NEW: Post-processing
        click.echo(f" {tier_name}: Post-processing {tier_config.count} articles...")
        self._post_process_tier(project_id, tier_name, job, debug)

    def _post_process_tier(self, project_id, tier_name, job, debug):
        """Apply URL generation, interlinking, and templating"""

        # Get all articles for this tier
        content_records = self.content_repo.get_by_project_and_tier(
            project_id, tier_name, status=["generated", "augmented"]
        )

        if not content_records:
            click.echo(f" No articles to post-process")
            return

        project = self.project_repo.get_by_id(project_id)

        # Step 1: Assign sites (Story 3.1)
        # (Site assignment might already be done via deployment_targets)

        # Step 2: Generate URLs (Story 3.1)
        from src.generation.url_generator import generate_urls_for_batch
        click.echo(f" Generating URLs...")
        article_urls = generate_urls_for_batch(content_records, self.site_deployment_repo)

        # Step 3: Find tiered links (Story 3.2)
        from src.interlinking.tiered_links import find_tiered_links
        click.echo(f" Finding tiered links...")
        tiered_links = find_tiered_links(
            content_records, job, self.project_repo,
            self.content_repo, self.site_deployment_repo
        )

        # Step 4: Inject interlinks (Story 3.3)
        from src.interlinking.content_injection import inject_interlinks
        from src.database.repositories import ArticleLinkRepository
        click.echo(f" Injecting interlinks...")

        session = self.content_repo.session  # Use same session
        link_repo = ArticleLinkRepository(session)
        inject_interlinks(
            content_records, article_urls, tiered_links,
            project, job, self.content_repo, link_repo
        )

        # Step 5: Apply templates
        click.echo(f" Applying templates...")
        for content in content_records:
            self.generator.apply_template(content.id)

        click.echo(f" Post-processing complete: {len(content_records)} articles ready")
```

## Files That Need Changes

```
src/generation/batch_processor.py
  ├─ Add imports at top
  ├─ Add call to _post_process_tier() in _process_tier()
  └─ Add new method _post_process_tier()

src/database/repositories.py
  └─ May need to add get_by_project_and_tier() if it doesn't exist
```
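
If `get_by_project_and_tier()` does turn out to be missing, the filtering it needs is small: match on project and tier, and optionally restrict to a list of statuses. The following is a rough sketch of those semantics only. The real repository appears to be session-based, so this in-memory class is not a drop-in implementation; `ContentRecord`, `InMemoryContentRepository`, and the field names `tier` and `status` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class ContentRecord:
    """Hypothetical stand-in for one generated_content row."""
    id: int
    project_id: int
    tier: str
    status: str


class InMemoryContentRepository:
    """Toy repository that illustrates the lookup's filtering rules only."""

    def __init__(self, records: Iterable[ContentRecord]):
        self._records = list(records)

    def get_by_project_and_tier(
        self,
        project_id: int,
        tier: str,
        status: Optional[List[str]] = None,
    ) -> List[ContentRecord]:
        """Return records for one project and tier.

        When `status` is given, only records whose status is in that
        list are returned; when it is None, status is not filtered.
        """
        return [
            record
            for record in self._records
            if record.project_id == project_id
            and record.tier == tier
            and (status is None or record.status in status)
        ]
```

Passing `status=["generated", "augmented"]`, as the post-processing call does, would leave out records in any other state, which is presumably what keeps failed or partial articles away from link injection.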

## Why Tests Still Pass

```
┌─────────────────────────────────────────┐
│  Unit Tests                             │
│  ✅ Test inject_interlinks() directly   │
│  ✅ Test find_tiered_links() directly   │
│  ✅ Test generate_urls_for_batch()      │
│                                         │
│  These call the functions directly,     │
│  so they work perfectly!                │
└─────────────────────────────────────────┘

┌─────────────────────────────────────────┐
│  Integration Tests                      │
│  ✅ Create test database                │
│  ✅ Call functions in sequence          │
│  ✅ Verify results                      │
│                                         │
│  These simulate the workflow manually,  │
│  so they work perfectly!                │
└─────────────────────────────────────────┘

┌─────────────────────────────────────────┐
│  Real CLI Usage                         │
│  ✅ Generates articles                  │
│  ❌ Never calls Story 3.1-3.3 functions │
│  ❌ Articles incomplete                 │
│                                         │
│  This is missing the integration!       │
└─────────────────────────────────────────┘
```
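
The boxes above point at a missing third kind of test: one that enters through the orchestrator and asserts that post-processing is actually reached. Here is a minimal sketch of that idea using `unittest.mock`. The `BatchProcessor` below is a stripped-down stand-in, not the project's real class; its method bodies, the shape of the `tiers` argument, and the test name are assumptions made for illustration.

```python
from unittest import mock


class BatchProcessor:
    """Minimal stand-in for the real class (hypothetical simplification)."""

    def __init__(self):
        self.stats = {"generated_articles": 0}

    def process_job(self, tiers):
        # `tiers` maps tier name -> article count (assumed shape).
        for tier_name, count in tiers.items():
            self._process_tier(tier_name, count)

    def _process_tier(self, tier_name, count):
        for _ in range(count):
            self._generate_single_article(tier_name)
            self.stats["generated_articles"] += 1
        # The call that is missing in the current code:
        self._post_process_tier(tier_name)

    def _generate_single_article(self, tier_name):
        pass  # placeholder for title/outline/content generation

    def _post_process_tier(self, tier_name):
        pass  # placeholder for URLs, interlinks, templates


def test_post_processing_runs_once_per_tier():
    processor = BatchProcessor()
    # Replace the post-processing step with a mock so calls can be inspected.
    with mock.patch.object(processor, "_post_process_tier") as post:
        processor.process_job({"tier1": 2, "tier2": 3})

    assert processor.stats["generated_articles"] == 5
    # One post-processing call per tier, in tier order.
    assert [call.args for call in post.call_args_list] == [("tier1",), ("tier2",)]


if __name__ == "__main__":
    test_post_processing_runs_once_per_tier()
    print("ok")
```

A test of this shape fails if the `_post_process_tier()` call is removed from `_process_tier()`, which is the failure the existing unit and integration tests cannot see because they call the Story 3.1-3.3 functions directly.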

## Summary

**The Analogy**:

Imagine you built a perfect car engine:
- All parts work perfectly ✅
- Each part is tested individually ✅
- The parts fit together ✅

But you never **installed it in the car** ❌

That's the current state:
- The Story 3.1-3.3 functions work
- Tests prove they work
- But the CLI never calls them
- So users get articles with no links

**The Fix**: Install the engine (add about 50 lines to `BatchProcessor`)

**Time**: 30-60 minutes

**Priority**: High (if deploying), Medium (if still developing)