# Technical Debt & Future Enhancements
This document tracks technical debt, future enhancements, and features that were deferred from the MVP.

---
## Story 1.6: Deployment Infrastructure Management
### Domain Health Check / Verification Status
**Priority**: Medium
**Epic Suggestion**: Epic 4 (Deployment) or Epic 3 (Pre-deployment)
**Estimated Effort**: Small (1-2 days)
#### Problem
After importing or provisioning sites, there's no way to verify:
- Domain ownership is still valid (user didn't let domain expire)
- DNS configuration is correct and pointing to bunny.net
- Custom domain is actually serving content
- SSL certificates are valid
With 50+ domains, manual checking is impractical.
#### Proposed Solution
**Option 1: Active Health Check**
1. Create a health check file in each Storage Zone (e.g., `.health-check.txt`)
2. Periodically attempt to fetch it via the custom domain
3. Record results in database
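Option 1 could be sketched with only the standard library; the sentinel filename and the status strings below mirror the proposed `health_status` values:

```python
import socket
import ssl
import urllib.error
import urllib.request

def classify_failure(exc: Exception) -> str:
    """Map a fetch failure to one of the proposed health_status values."""
    if isinstance(exc, urllib.error.HTTPError):
        return "unreachable"  # domain resolves but serves an error page
    if isinstance(exc, urllib.error.URLError):
        if isinstance(exc.reason, ssl.SSLError):
            return "ssl_error"
        if isinstance(exc.reason, socket.gaierror):
            return "dns_failure"
        return "unreachable"
    return "unknown"

def check_domain_health(domain: str, timeout: float = 10.0) -> str:
    """Fetch the sentinel file over the custom domain and classify the result."""
    url = f"https://{domain}/.health-check.txt"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return "healthy" if resp.status == 200 else "unreachable"
    except Exception as exc:
        return classify_failure(exc)
```

Detecting `expired` reliably would need extra signals (e.g. WHOIS or a parked-page heuristic), so it is omitted here.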
**Option 2: Use bunny.net API**
- Check if bunny.net exposes domain verification status via API
- Query verification status for each custom hostname
**Database Changes**
Add `health_status` field to `SiteDeployment` table:
- `unknown` - Not yet checked
- `healthy` - Domain resolving and serving content
- `dns_failure` - Cannot resolve domain
- `ssl_error` - Certificate issues
- `unreachable` - Domain not responding
- `expired` - Likely domain ownership lost
Add `last_health_check` timestamp field.
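The new fields could be modeled as follows (a sketch; the 24-hour recheck interval is an assumption):

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional

class HealthStatus(str, Enum):
    UNKNOWN = "unknown"
    HEALTHY = "healthy"
    DNS_FAILURE = "dns_failure"
    SSL_ERROR = "ssl_error"
    UNREACHABLE = "unreachable"
    EXPIRED = "expired"

def needs_recheck(last_health_check: Optional[datetime],
                  max_age: timedelta = timedelta(hours=24)) -> bool:
    """A deployment is due for a check if it was never checked or is stale."""
    if last_health_check is None:
        return True
    return datetime.now(timezone.utc) - last_health_check > max_age
```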
**CLI Commands**
```bash
# Check single domain
check-site-health --domain www.example.com
# Check all domains
check-all-sites-health
# List unhealthy sites
list-sites --status unhealthy
```
**Use Cases**
- Automated monitoring to detect when domains expire
- Pre-deployment validation before pushing new content
- Dashboard showing health of entire portfolio
- Alert system for broken domains
#### Impact
- Prevents wasted effort deploying to expired domains
- Early detection of DNS/SSL issues
- Better operational visibility across large domain portfolios
---
## Story 2.3: AI-Powered Content Generation
### Prompt Template A/B Testing & Optimization
**Priority**: Medium
**Epic Suggestion**: Epic 2 (Content Generation) - Post-MVP
**Estimated Effort**: Medium (3-5 days)
#### Problem
Content quality and AI compliance with CORA targets vary with prompt wording. There is no systematic way to:
- Test different prompt variations
- Compare results objectively
- Select optimal prompts for different scenarios
- Track which prompts work best with which models
#### Proposed Solution
**Prompt Versioning System:**
1. Support multiple versions of each prompt template
2. Name prompts with version suffix (e.g., `title_generation_v1.json`, `title_generation_v2.json`)
3. Job config specifies which prompt version to use per stage
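Version resolution could be as simple as the following (a sketch; the prompts directory layout is an assumption, the file naming follows the convention above):

```python
from pathlib import Path

def resolve_prompt(prompts_dir: Path, stage: str, version: str) -> Path:
    """Map a stage + version from the job config to a template file,
    e.g. ("title_generation", "v2") -> title_generation_v2.json."""
    path = prompts_dir / f"{stage}_{version}.json"
    if not path.is_file():
        available = sorted(p.name for p in prompts_dir.glob(f"{stage}_*.json"))
        raise FileNotFoundError(
            f"No template {path.name}; available versions: {available}")
    return path
```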
**Comparison Tool:**
```bash
# Generate with multiple prompt versions
compare-prompts --project-id 1 --variants v1,v2,v3 --stages title,outline
# Outputs:
# - Side-by-side content comparison
# - Validation scores
# - Augmentation requirements
# - Generation time/cost
# - Recommendation
```
**Metrics to Track:**
- Validation pass rate
- Augmentation frequency
- Average attempts per stage
- Word count variance
- Keyword density accuracy
- Generation time
- API cost
**Database Changes:**
Add `prompt_version` fields to `GeneratedContent`:
- `title_prompt_version`
- `outline_prompt_version`
- `content_prompt_version`
#### Impact
- Higher quality content
- Reduced augmentation needs
- Lower API costs
- Model-specific optimizations
- Data-driven prompt improvements
---
### Parallel Article Generation
**Priority**: Low
**Epic Suggestion**: Epic 2 (Content Generation) - Post-MVP
**Estimated Effort**: Medium (3-5 days)
#### Problem
Articles are generated sequentially, which is slow for large batches:
- 15 tier 1 articles: ~10-20 minutes
- 150 tier 2 articles: ~2-3 hours
This could be parallelized since articles are independent.
#### Proposed Solution
**Multi-threading/Multi-processing:**
1. Add `--parallel N` flag to `generate-batch` command
2. Process N articles simultaneously
3. Share database session pool
4. Rate limit API calls to avoid throttling
**Considerations:**
- Database connection pooling
- OpenRouter rate limits
- Memory usage (N concurrent AI calls)
- Progress tracking complexity
- Error handling across threads
**Example:**
```bash
# Generate 4 articles in parallel
generate-batch -j job.json --parallel 4
```
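Internally, the `--parallel` flag might fan work out over a thread pool, with a semaphore capping in-flight API calls (a sketch; `generate_fn` stands in for the existing per-article pipeline):

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed

def generate_parallel(article_ids, generate_fn, parallel=4, max_in_flight=4):
    """Run generate_fn(article_id) for each article across a thread pool.
    The semaphore caps concurrent API calls independently of pool size."""
    gate = threading.Semaphore(max_in_flight)
    results, errors = {}, {}

    def worker(article_id):
        with gate:  # crude rate limiting against OpenRouter throttling
            return generate_fn(article_id)

    with ThreadPoolExecutor(max_workers=parallel) as pool:
        futures = {pool.submit(worker, a): a for a in article_ids}
        for fut in as_completed(futures):
            article_id = futures[fut]
            try:
                results[article_id] = fut.result()
            except Exception as exc:  # keep the batch going on per-article failure
                errors[article_id] = exc
    return results, errors
```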
#### Impact
- 3-4x faster for large batches
- Better resource utilization
- Reduced total job time
---
### Job Folder Auto-Processing
**Priority**: Low
**Epic Suggestion**: Epic 2 (Content Generation) - Post-MVP
**Estimated Effort**: Small (1-2 days)
#### Problem
Currently each job file must be run individually. For large operations with many batches, it should be possible to:
- Queue multiple jobs
- Process a whole jobs folder automatically
- Run overnight batches
#### Proposed Solution
**Job Queue System:**
```bash
# Process all jobs in folder
generate-batch --folder jobs/pending/
# Process and move to completed/
generate-batch --folder jobs/pending/ --move-on-complete jobs/completed/
# Watch folder for new jobs
generate-batch --watch jobs/queue/ --interval 60
```
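The `--folder` mode above might reduce to a loop like this (a sketch; `run_job` stands in for the existing single-job entry point):

```python
from pathlib import Path

def process_job_folder(pending: Path, completed: Path, run_job):
    """Process *.json jobs in name order; move successes to `completed`,
    leave failures in place for inspection or retry."""
    completed.mkdir(parents=True, exist_ok=True)
    report = []
    for job_file in sorted(pending.glob("*.json")):
        try:
            run_job(job_file)
        except Exception:
            report.append((job_file.name, False))
            continue
        job_file.rename(completed / job_file.name)
        report.append((job_file.name, True))
    return report
```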
**Features:**
- Process jobs in order (alphabetical or by timestamp)
- Move completed jobs to archive folder
- Skip failed jobs or retry
- Summary report for all jobs
**Database Changes:**
Add `JobRun` table to track batch job executions:
- `job_file_path`
- `start_time`, `end_time`
- `total_articles`, `successful`, `failed`
- `status` (running/completed/failed)
#### Impact
- Hands-off batch processing
- Better for large-scale operations
- Easier job management
---
### Cost Tracking & Analytics
**Priority**: Medium
**Epic Suggestion**: Epic 2 (Content Generation) - Post-MVP
**Estimated Effort**: Medium (2-4 days)
#### Problem
No visibility into:
- API costs per article/batch
- Which models are most cost-effective
- Cost per tier/quality level
- Budget tracking
#### Proposed Solution
**Track API Usage:**
1. Log tokens used per API call
2. Store in database with cost calculation
3. Dashboard showing costs
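Per-call cost could be derived from token counts and a price table (a sketch; the per-million-token prices below are placeholders, and real figures should come from OpenRouter's pricing data rather than being hard-coded):

```python
# Placeholder prices in USD per million tokens.
PRICES_PER_M = {
    "claude-3.5-sonnet": {"prompt": 3.00, "completion": 15.00},
    "gpt-4o": {"prompt": 2.50, "completion": 10.00},
}

def call_cost_usd(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of one API call given its token usage."""
    p = PRICES_PER_M[model]
    return (prompt_tokens * p["prompt"]
            + completion_tokens * p["completion"]) / 1_000_000
```

Summing `call_cost_usd` over the title, outline, and content calls gives `total_cost_usd` for the article.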
**Cost Fields in GeneratedContent:**
- `title_tokens_used`
- `title_cost_usd`
- `outline_tokens_used`
- `outline_cost_usd`
- `content_tokens_used`
- `content_cost_usd`
- `total_cost_usd`
**Analytics Commands:**
```bash
# Show costs for project
cost-report --project-id 1
# Compare model costs
model-cost-comparison --models claude-3.5-sonnet,gpt-4o
# Budget tracking
cost-summary --date-range 2025-10-01:2025-10-31
```
**Reports:**
- Cost per article by tier
- Model efficiency (cost vs quality)
- Daily/weekly/monthly spend
- Budget alerts
#### Impact
- Cost optimization
- Better budget planning
- Model selection data
- ROI tracking
---
### Model Performance Analytics
**Priority**: Low
**Epic Suggestion**: Epic 2 (Content Generation) - Post-MVP
**Estimated Effort**: Medium (3-5 days)
#### Problem
No data on which models perform best for:
- Different tiers
- Different content types
- Title vs outline vs content generation
- Pass rates and quality scores
#### Proposed Solution
**Performance Tracking:**
1. Track validation metrics per model
2. Generate comparison reports
3. Recommend optimal models for scenarios
**Metrics:**
- First-attempt pass rate
- Average attempts to success
- Augmentation frequency
- Validation score distributions
- Generation time
- Cost per successful article
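Aggregating per-attempt records into the report could look like this (a sketch; the record fields `model`, `stage`, `attempts`, `passed`, and `cost_usd` are assumed names):

```python
from collections import defaultdict
from statistics import mean

def summarize(records):
    """Group records by (model, stage) and compute the report metrics.
    First-attempt passes are those that validated with attempts == 1."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["model"], r["stage"])].append(r)
    summary = {}
    for key, rows in groups.items():
        summary[key] = {
            "first_attempt_pass_rate": sum(
                1 for r in rows if r["passed"] and r["attempts"] == 1) / len(rows),
            "avg_attempts": mean(r["attempts"] for r in rows),
            "avg_cost_usd": mean(r["cost_usd"] for r in rows),
        }
    return summary
```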
**Dashboard:**
```bash
# Model performance report
model-performance --days 30
# Output:
Model: claude-3.5-sonnet
Title: 98% pass rate, 1.02 avg attempts, $0.05 avg cost
Outline: 85% pass rate, 1.35 avg attempts, $0.15 avg cost
Content: 72% pass rate, 1.67 avg attempts, $0.89 avg cost
Model: gpt-4o
...
Recommendations:
- Use claude-3.5-sonnet for titles (best pass rate)
- Use gpt-4o for content (better quality scores)
```
#### Impact
- Data-driven model selection
- Optimize quality vs cost
- Identify model strengths/weaknesses
- Better tier-model mapping
---
### Improved Content Augmentation
**Priority**: Medium
**Epic Suggestion**: Epic 2 (Content Generation) - Enhancement
**Estimated Effort**: Medium (3-5 days)
#### Problem
Current augmentation is basic:
- Random word insertion can break sentence flow
- Doesn't consider context
- Can feel unnatural
- No quality scoring
#### Proposed Solution
**Smarter Augmentation:**
1. Use AI to rewrite sentences with missing terms
2. Analyze sentence structure before insertion
3. Add quality scoring for augmented vs original
4. User-reviewable augmentation suggestions
**Example:**
```python
# Instead of: "The process involves machine learning techniques."
# Random insert: "The process involves keyword machine learning techniques."
# Smarter: "The process involves keyword-driven machine learning techniques."
# Or: "The process, focused on keyword optimization, involves machine learning."
```
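The rewrite path could be implemented with the model call injected (a sketch; `rewrite` is a hypothetical callable wrapping the existing AI client):

```python
def augment_sentence(sentence: str, missing_terms, rewrite) -> str:
    """Ask the model to weave the missing terms into the sentence; fall back
    to the original (flagged for manual review) if any term is still absent."""
    prompt = (
        "Rewrite this sentence to naturally include the terms "
        f"{', '.join(missing_terms)} without changing its meaning:\n{sentence}"
    )
    candidate = rewrite(prompt)
    if all(term.lower() in candidate.lower() for term in missing_terms):
        return candidate
    return sentence  # keep the original rather than accept a lossy rewrite
```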
**Features:**
- Context-aware term insertion
- Sentence rewriting option
- A/B comparison (original vs augmented)
- Quality scoring
- Manual review mode
#### Impact
- More natural augmented content
- Better readability
- Higher quality scores
- User confidence in output
---
## Future Sections
Add new technical debt items below as they're identified during development.