# Visual: The Integration Gap
## What Currently Happens
```
┌─────────────────────────────────────────────────────────────┐
│ uv run python main.py generate-batch --job-file jobs/x.json │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│ BatchProcessor.process_job()                                │
│                                                             │
│   For each tier (tier1, tier2, tier3):                      │
│     For each article (1 to N):                              │
│       ┌──────────────────────────────────┐                  │
│       │ 1. Generate title                │                  │
│       │ 2. Generate outline              │                  │
│       │ 3. Generate content              │                  │
│       │ 4. Augment if too short          │                  │
│       │ 5. Save to database              │                  │
│       └──────────────────────────────────┘                  │
│                                                             │
│   ⚠️ STOPS HERE! ⚠️                                         │
└─────────────────────────────────────────────────────────────┘

Result in database:
┌─────────────────────────────────────────────────────────────┐
│ generated_content table:                                    │
│   - Raw HTML (no links)                                     │
│   - No site_deployment_id (most articles)                   │
│   - No final URL                                            │
│   - No formatted_html                                       │
│                                                             │
│ article_links table:                                        │
│   - EMPTY (no records)                                      │
└─────────────────────────────────────────────────────────────┘
```
## What SHOULD Happen
```
┌─────────────────────────────────────────────────────────────┐
│ uv run python main.py generate-batch --job-file jobs/x.json │
└────────────────────────┬────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│ BatchProcessor.process_job()                                │
│                                                             │
│   For each tier (tier1, tier2, tier3):                      │
│     For each article (1 to N):                              │
│       ┌──────────────────────────────────┐                  │
│       │ 1. Generate title                │                  │
│       │ 2. Generate outline              │                  │
│       │ 3. Generate content              │                  │
│       │ 4. Augment if too short          │                  │
│       │ 5. Save to database              │                  │
│       └──────────────────────────────────┘                  │
│                                                             │
│   ✨ NEW: After all articles in tier generated ✨           │
│       ┌──────────────────────────────────┐                  │
│       │ 6. Assign sites (Story 3.1)      │ ← MISSING        │
│       │ 7. Generate URLs (Story 3.1)     │ ← MISSING        │
│       │ 8. Find tiered links (3.2)       │ ← MISSING        │
│       │ 9. Inject interlinks (3.3)       │ ← MISSING        │
│       │ 10. Apply templates              │ ← MISSING        │
│       └──────────────────────────────────┘                  │
└─────────────────────────────────────────────────────────────┘

Result in database:
┌─────────────────────────────────────────────────────────────┐
│ generated_content table:                                    │
│   ✅ Final HTML with all links injected                     │
│   ✅ site_deployment_id assigned                            │
│   ✅ Final URL generated                                    │
│   ✅ formatted_html with template applied                   │
│                                                             │
│ article_links table:                                        │
│   ✅ Tiered links (T1→money site, T2→T1)                    │
│   ✅ Homepage links (all→/index.html)                       │
│   ✅ See Also links (all→all in batch)                      │
└─────────────────────────────────────────────────────────────┘
```
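The difference between the two database states can be checked mechanically. The following is a minimal sketch, not the project's actual schema: it uses an in-memory SQLite database, and the column names `url`, `from_content_id`, and `to_url` are assumptions made for illustration (only `site_deployment_id` and `formatted_html` are named in this document).

```python
import sqlite3

# Assumed schema: only the columns this check needs. The real tables
# likely have more columns and may use different names.
SCHEMA = """
CREATE TABLE generated_content (
    id INTEGER PRIMARY KEY,
    tier TEXT NOT NULL,
    site_deployment_id INTEGER,
    url TEXT,                -- hypothetical column name for the final URL
    formatted_html TEXT
);
CREATE TABLE article_links (
    id INTEGER PRIMARY KEY,
    from_content_id INTEGER NOT NULL,  -- hypothetical column name
    to_url TEXT NOT NULL               -- hypothetical column name
);
"""


def find_incomplete_articles(conn):
    """Return (id, missing_fields) for each article that skipped post-processing."""
    rows = conn.execute(
        """
        SELECT gc.id, gc.site_deployment_id, gc.url, gc.formatted_html,
               (SELECT COUNT(*) FROM article_links al
                 WHERE al.from_content_id = gc.id) AS link_count
          FROM generated_content gc
         ORDER BY gc.id
        """
    ).fetchall()
    report = []
    for article_id, site_id, url, html, link_count in rows:
        missing = []
        if site_id is None:
            missing.append("site_deployment_id")
        if not url:
            missing.append("url")
        if not html:
            missing.append("formatted_html")
        if link_count == 0:
            missing.append("article_links")
        if missing:
            report.append((article_id, missing))
    return report


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Article 1 went through post-processing; article 2 stopped after step 5.
conn.execute(
    "INSERT INTO generated_content VALUES "
    "(1, 'tier1', 7, 'https://example.com/a.html', '<html>...</html>')"
)
conn.execute("INSERT INTO generated_content VALUES (2, 'tier1', NULL, NULL, NULL)")
conn.execute("INSERT INTO article_links VALUES (1, 1, 'https://example.com/index.html')")

incomplete = find_incomplete_articles(conn)
print(incomplete)
```

Run against a real database after `generate-batch`, a check along these lines would flag every article while the gap exists and return an empty list once steps 6-10 are wired in.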
## The Gap in Code
### Current Code Structure
```python
# src/generation/batch_processor.py
class BatchProcessor:
    def _process_tier(self, project_id, tier_name, tier_config, ...):
        """Process all articles for a tier"""
        # Generate each article
        for article_num in range(1, tier_config.count + 1):
            self._generate_single_article(...)
            self.stats["generated_articles"] += 1

        # ⚠️ Method ends here!
        # Nothing happens after article generation
```
### What Needs to Be Added
```python
# src/generation/batch_processor.py

# New imports for post-processing (add at the top of the file)
from src.database.repositories import ArticleLinkRepository
from src.generation.url_generator import generate_urls_for_batch
from src.interlinking.content_injection import inject_interlinks
from src.interlinking.tiered_links import find_tiered_links


class BatchProcessor:
    def _process_tier(self, project_id, tier_name, tier_config, ...):
        """Process all articles for a tier"""
        # Generate each article
        for article_num in range(1, tier_config.count + 1):
            self._generate_single_article(...)
            self.stats["generated_articles"] += 1

        # ✨ NEW: Post-processing
        # (job and debug are assumed to be among the elided parameters above)
        click.echo(f" {tier_name}: Post-processing {tier_config.count} articles...")
        self._post_process_tier(project_id, tier_name, job, debug)

    def _post_process_tier(self, project_id, tier_name, job, debug):
        """Apply URL generation, interlinking, and templating"""
        # Get all articles for this tier
        content_records = self.content_repo.get_by_project_and_tier(
            project_id, tier_name, status=["generated", "augmented"]
        )
        if not content_records:
            click.echo(" No articles to post-process")
            return

        project = self.project_repo.get_by_id(project_id)

        # Step 1: Assign sites (Story 3.1)
        # (Site assignment might already be done via deployment_targets)

        # Step 2: Generate URLs (Story 3.1)
        click.echo(" Generating URLs...")
        article_urls = generate_urls_for_batch(content_records, self.site_deployment_repo)

        # Step 3: Find tiered links (Story 3.2)
        click.echo(" Finding tiered links...")
        tiered_links = find_tiered_links(
            content_records, job, self.project_repo,
            self.content_repo, self.site_deployment_repo
        )

        # Step 4: Inject interlinks (Story 3.3)
        click.echo(" Injecting interlinks...")
        session = self.content_repo.session  # Reuse the same session
        link_repo = ArticleLinkRepository(session)
        inject_interlinks(
            content_records, article_urls, tiered_links,
            project, job, self.content_repo, link_repo
        )

        # Step 5: Apply templates
        click.echo(" Applying templates...")
        for content in content_records:
            self.generator.apply_template(content.id)

        click.echo(f" Post-processing complete: {len(content_records)} articles ready")
```
## Files That Need Changes
```
src/generation/batch_processor.py
├─ Add imports at top
├─ Add call to _post_process_tier() in _process_tier()
└─ Add new method _post_process_tier()
src/database/repositories.py
└─ May need to add get_by_project_and_tier() if it doesn't exist
```
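If `get_by_project_and_tier()` does turn out to be missing, its intended behavior can be pinned down before touching the real repository. The sketch below is an assumption-laden illustration: `ContentRecord`, `InMemoryContentRepository`, and the `"failed"` status are hypothetical stand-ins, and only the method name and its `status=[...]` argument come from the call shown above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class ContentRecord:
    """Stand-in for a generated_content row; field names are assumptions."""
    id: int
    project_id: int
    tier: str
    status: str


class InMemoryContentRepository:
    """Hypothetical repository that only illustrates the lookup semantics."""

    def __init__(self, records: Iterable[ContentRecord]):
        self._records = list(records)

    def get_by_project_and_tier(
        self,
        project_id: int,
        tier: str,
        status: Optional[List[str]] = None,
    ) -> List[ContentRecord]:
        """Return one tier's records for a project, optionally filtered by status."""
        matches = [
            r for r in self._records
            if r.project_id == project_id and r.tier == tier
        ]
        if status is not None:
            allowed = set(status)
            matches = [r for r in matches if r.status in allowed]
        return matches


repo = InMemoryContentRepository([
    ContentRecord(1, project_id=1, tier="tier1", status="generated"),
    ContentRecord(2, project_id=1, tier="tier1", status="failed"),
    ContentRecord(3, project_id=1, tier="tier2", status="augmented"),
    ContentRecord(4, project_id=2, tier="tier1", status="generated"),
])
ready = repo.get_by_project_and_tier(1, "tier1", status=["generated", "augmented"])
print([r.id for r in ready])  # → [1]
```

In a SQLAlchemy-backed repository the same status filter would probably become an `in_()` clause on the status column, but that depends on how the existing models are defined.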
## Why Tests Still Pass
```
┌─────────────────────────────────────────┐
│ Unit Tests                              │
│ ✅ Test inject_interlinks() directly    │
│ ✅ Test find_tiered_links() directly    │
│ ✅ Test generate_urls_for_batch()       │
│                                         │
│ These call the functions directly,      │
│ so they work perfectly!                 │
└─────────────────────────────────────────┘

┌─────────────────────────────────────────┐
│ Integration Tests                       │
│ ✅ Create test database                 │
│ ✅ Call functions in sequence           │
│ ✅ Verify results                       │
│                                         │
│ These simulate the workflow manually,   │
│ so they work perfectly!                 │
└─────────────────────────────────────────┘

┌─────────────────────────────────────────┐
│ Real CLI Usage                          │
│ ✅ Generates articles                   │
│ ❌ Never calls Story 3.1-3.3 functions  │
│ ❌ Articles incomplete                  │
│                                         │
│ This is missing the integration!        │
└─────────────────────────────────────────┘
```
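A test that drives the orchestrating method, rather than the individual functions, is the kind that would have caught this. The sketch below shows the idea with a deliberately tiny stand-in; `TinyBatchProcessor` and its constructor arguments are hypothetical and are not the project's real `BatchProcessor` API.

```python
class TinyBatchProcessor:
    """Minimal stand-in for BatchProcessor; not the project's real class."""

    def __init__(self, generate_article, post_process_tier):
        self._generate_article = generate_article
        self._post_process_tier = post_process_tier

    def process_tier(self, project_id, tier_name, count):
        # Generate every article first...
        for article_num in range(1, count + 1):
            self._generate_article(project_id, tier_name, article_num)
        # ...then post-process the tier once. Deleting this call reproduces
        # the gap, and the check below fails.
        self._post_process_tier(project_id, tier_name)


def check_tier_is_post_processed():
    """Record calls in order and verify post-processing runs after generation."""
    events = []
    processor = TinyBatchProcessor(
        generate_article=lambda p, t, n: events.append(("generate", n)),
        post_process_tier=lambda p, t: events.append(("post_process", t)),
    )
    processor.process_tier(project_id=1, tier_name="tier1", count=2)
    assert events == [
        ("generate", 1),
        ("generate", 2),
        ("post_process", "tier1"),
    ]
    return events


events = check_tier_is_post_processed()
print(events)
```

Against the real class, a similar check could replace `_generate_single_article` and `_post_process_tier` with mocks (for example via `unittest.mock.patch.object`) and assert that the second one is called once per tier.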
## Summary
**The Analogy**:

Imagine you built a perfect car engine:
- All parts work perfectly ✅
- Each part tested individually ✅
- The parts fit together ✅

But you never **installed it in the car**.

That's the current state:
- The Story 3.1-3.3 functions work perfectly
- Tests prove they work
- But the CLI never calls them
- So users get articles with no links

**The Fix**: Install the engine (add about 50 lines to `BatchProcessor`)

**Time**: 30-60 minutes

**Priority**: High (if deploying), Medium (if still developing)