# Technical Debt & Future Enhancements
This document tracks technical debt, future enhancements, and features that were deferred from the MVP.

---
## Story 1.6: Deployment Infrastructure Management
### Domain Health Check / Verification Status
**Priority**: Medium
**Epic Suggestion**: Epic 4 (Deployment) or Epic 3 (Pre-deployment)
**Estimated Effort**: Small (1-2 days)
#### Problem
After importing or provisioning sites, there's no way to verify:
- Domain ownership is still valid (user didn't let domain expire)
- DNS configuration is correct and pointing to bunny.net
- Custom domain is actually serving content
- SSL certificates are valid
With 50+ domains, manual checking is impractical.
#### Proposed Solution
**Option 1: Active Health Check**
1. Create a health check file in each Storage Zone (e.g., `.health-check.txt`)
2. Periodically attempt to fetch it via the custom domain
3. Record results in database
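Option 1 could be sketched with only the standard library; the sentinel filename and the status strings below mirror the proposed `health_status` values:

```python
import socket
import ssl
import urllib.error
import urllib.request

def classify_failure(exc: Exception) -> str:
    """Map a fetch failure to one of the proposed health_status values."""
    if isinstance(exc, urllib.error.HTTPError):
        return "unreachable"  # domain resolves but serves an error page
    if isinstance(exc, urllib.error.URLError):
        if isinstance(exc.reason, ssl.SSLError):
            return "ssl_error"
        if isinstance(exc.reason, socket.gaierror):
            return "dns_failure"
        return "unreachable"
    return "unknown"

def check_domain_health(domain: str, timeout: float = 10.0) -> str:
    """Fetch the sentinel file over the custom domain and classify the result."""
    url = f"https://{domain}/.health-check.txt"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return "healthy" if resp.status == 200 else "unreachable"
    except Exception as exc:
        return classify_failure(exc)
```

Detecting `expired` reliably would need extra signals (e.g. WHOIS or a parked-page heuristic), so it is omitted here.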
**Option 2: Use bunny.net API**
- Check if bunny.net exposes domain verification status via API
- Query verification status for each custom hostname
**Database Changes**
Add `health_status` field to `SiteDeployment` table:
- `unknown` - Not yet checked
- `healthy` - Domain resolving and serving content
- `dns_failure` - Cannot resolve domain
- `ssl_error` - Certificate issues
- `unreachable` - Domain not responding
- `expired` - Likely domain ownership lost
Add `last_health_check` timestamp field.
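The new fields could be modeled as follows (a sketch; the 24-hour recheck interval is an assumption):

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional

class HealthStatus(str, Enum):
    UNKNOWN = "unknown"
    HEALTHY = "healthy"
    DNS_FAILURE = "dns_failure"
    SSL_ERROR = "ssl_error"
    UNREACHABLE = "unreachable"
    EXPIRED = "expired"

def needs_recheck(last_health_check: Optional[datetime],
                  max_age: timedelta = timedelta(hours=24)) -> bool:
    """A deployment is due for a check if it was never checked or is stale."""
    if last_health_check is None:
        return True
    return datetime.now(timezone.utc) - last_health_check > max_age
```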
**CLI Commands**
```bash
# Check single domain
check-site-health --domain www.example.com
# Check all domains
check-all-sites-health
# List unhealthy sites
list-sites --status unhealthy
```
**Use Cases**
- Automated monitoring to detect when domains expire
- Pre-deployment validation before pushing new content
- Dashboard showing health of entire portfolio
- Alert system for broken domains
#### Impact
- Prevents wasted effort deploying to expired domains
- Early detection of DNS/SSL issues
- Better operational visibility across large domain portfolios
---
## Story 2.3: AI-Powered Content Generation
### Prompt Template A/B Testing & Optimization
**Priority**: Medium
**Epic Suggestion**: Epic 2 (Content Generation) - Post-MVP
**Estimated Effort**: Medium (3-5 days)
#### Problem
Content quality and AI compliance with CORA targets vary with prompt wording. There is no systematic way to:
- Test different prompt variations
- Compare results objectively
- Select optimal prompts for different scenarios
- Track which prompts work best with which models
#### Proposed Solution
**Prompt Versioning System:**
1. Support multiple versions of each prompt template
2. Name prompts with version suffix (e.g., `title_generation_v1.json`, `title_generation_v2.json`)
3. Job config specifies which prompt version to use per stage
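Version resolution could be as simple as the following (a sketch; the prompts directory layout is an assumption, the file naming follows the convention above):

```python
from pathlib import Path

def resolve_prompt(prompts_dir: Path, stage: str, version: str) -> Path:
    """Map a stage + version from the job config to a template file,
    e.g. ("title_generation", "v2") -> title_generation_v2.json."""
    path = prompts_dir / f"{stage}_{version}.json"
    if not path.is_file():
        available = sorted(p.name for p in prompts_dir.glob(f"{stage}_*.json"))
        raise FileNotFoundError(
            f"No template {path.name}; available versions: {available}")
    return path
```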
**Comparison Tool:**
```bash
# Generate with multiple prompt versions
compare-prompts --project-id 1 --variants v1,v2,v3 --stages title,outline
# Outputs:
# - Side-by-side content comparison
# - Validation scores
# - Augmentation requirements
# - Generation time/cost
# - Recommendation
```
**Metrics to Track:**
- Validation pass rate
- Augmentation frequency
- Average attempts per stage
- Word count variance
- Keyword density accuracy
- Generation time
- API cost
**Database Changes:**
Add `prompt_version` fields to `GeneratedContent`:
- `title_prompt_version`
- `outline_prompt_version`
- `content_prompt_version`
#### Impact
- Higher quality content
- Reduced augmentation needs
- Lower API costs
- Model-specific optimizations
- Data-driven prompt improvements
---
### Parallel Article Generation
**Priority**: Low
**Epic Suggestion**: Epic 2 (Content Generation) - Post-MVP
**Estimated Effort**: Medium (3-5 days)
#### Problem
Articles are generated sequentially, which is slow for large batches:
- 15 tier 1 articles: ~10-20 minutes
- 150 tier 2 articles: ~2-3 hours
This could be parallelized since articles are independent.
#### Proposed Solution
**Multi-threading/Multi-processing:**
1. Add `--parallel N` flag to `generate-batch` command
2. Process N articles simultaneously
3. Share database session pool
4. Rate limit API calls to avoid throttling
**Considerations:**
- Database connection pooling
- OpenRouter rate limits
- Memory usage (N concurrent AI calls)
- Progress tracking complexity
- Error handling across threads
**Example:**
```bash
# Generate 4 articles in parallel
generate-batch -j job.json --parallel 4
```
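Internally, the `--parallel` flag might fan work out over a thread pool, with a semaphore capping in-flight API calls (a sketch; `generate_fn` stands in for the existing per-article pipeline):

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed

def generate_parallel(article_ids, generate_fn, parallel=4, max_in_flight=4):
    """Run generate_fn(article_id) for each article across a thread pool.
    The semaphore caps concurrent API calls independently of pool size."""
    gate = threading.Semaphore(max_in_flight)
    results, errors = {}, {}

    def worker(article_id):
        with gate:  # crude rate limiting against OpenRouter throttling
            return generate_fn(article_id)

    with ThreadPoolExecutor(max_workers=parallel) as pool:
        futures = {pool.submit(worker, a): a for a in article_ids}
        for fut in as_completed(futures):
            article_id = futures[fut]
            try:
                results[article_id] = fut.result()
            except Exception as exc:  # keep the batch going on per-article failure
                errors[article_id] = exc
    return results, errors
```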
#### Impact
- 3-4x faster for large batches
- Better resource utilization
- Reduced total job time
---
### Job Folder Auto-Processing
**Priority**: Low
**Epic Suggestion**: Epic 2 (Content Generation) - Post-MVP
**Estimated Effort**: Small (1-2 days)
#### Problem
Currently each job file must be run individually. For large operations with many batches, it should be possible to:
- Queue multiple jobs
- Process a whole jobs folder automatically
- Run overnight batches
#### Proposed Solution
**Job Queue System:**
```bash
# Process all jobs in folder
generate-batch --folder jobs/pending/
# Process and move to completed/
generate-batch --folder jobs/pending/ --move-on-complete jobs/completed/
# Watch folder for new jobs
generate-batch --watch jobs/queue/ --interval 60
```
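The `--folder` mode above might reduce to a loop like this (a sketch; `run_job` stands in for the existing single-job entry point):

```python
from pathlib import Path

def process_job_folder(pending: Path, completed: Path, run_job):
    """Process *.json jobs in name order; move successes to `completed`,
    leave failures in place for inspection or retry."""
    completed.mkdir(parents=True, exist_ok=True)
    report = []
    for job_file in sorted(pending.glob("*.json")):
        try:
            run_job(job_file)
        except Exception:
            report.append((job_file.name, False))
            continue
        job_file.rename(completed / job_file.name)
        report.append((job_file.name, True))
    return report
```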
**Features:**
- Process jobs in order (alphabetical or by timestamp)
- Move completed jobs to archive folder
- Skip failed jobs or retry
- Summary report for all jobs
**Database Changes:**
Add `JobRun` table to track batch job executions:
- `job_file_path`
- `start_time`, `end_time`
- `total_articles`, `successful`, `failed`
- `status` (running/completed/failed)
#### Impact
- Hands-off batch processing
- Better for large-scale operations
- Easier job management
---
### Cost Tracking & Analytics
**Priority**: Medium
**Epic Suggestion**: Epic 2 (Content Generation) - Post-MVP
**Estimated Effort**: Medium (2-4 days)
#### Problem
No visibility into:
- API costs per article/batch
- Which models are most cost-effective
- Cost per tier/quality level
- Budget tracking
#### Proposed Solution
**Track API Usage:**
1. Log tokens used per API call
2. Store in database with cost calculation
3. Dashboard showing costs
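Per-call cost could be derived from token counts and a price table (a sketch; the per-million-token prices below are placeholders, and real figures should come from OpenRouter's pricing data rather than being hard-coded):

```python
# Placeholder prices in USD per million tokens.
PRICES_PER_M = {
    "claude-3.5-sonnet": {"prompt": 3.00, "completion": 15.00},
    "gpt-4o": {"prompt": 2.50, "completion": 10.00},
}

def call_cost_usd(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of one API call given its token usage."""
    p = PRICES_PER_M[model]
    return (prompt_tokens * p["prompt"]
            + completion_tokens * p["completion"]) / 1_000_000
```

Summing `call_cost_usd` over the title, outline, and content calls gives `total_cost_usd` for the article.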
**Cost Fields in GeneratedContent:**
- `title_tokens_used`
- `title_cost_usd`
- `outline_tokens_used`
- `outline_cost_usd`
- `content_tokens_used`
- `content_cost_usd`
- `total_cost_usd`
**Analytics Commands:**
```bash
# Show costs for project
cost-report --project-id 1
# Compare model costs
model-cost-comparison --models claude-3.5-sonnet,gpt-4o
# Budget tracking
cost-summary --date-range 2025-10-01:2025-10-31
```
**Reports:**
- Cost per article by tier
- Model efficiency (cost vs quality)
- Daily/weekly/monthly spend
- Budget alerts
#### Impact
- Cost optimization
- Better budget planning
- Model selection data
- ROI tracking
---
### Model Performance Analytics
**Priority**: Low
**Epic Suggestion**: Epic 2 (Content Generation) - Post-MVP
**Estimated Effort**: Medium (3-5 days)
#### Problem
No data on which models perform best for:
- Different tiers
- Different content types
- Title vs outline vs content generation
- Pass rates and quality scores
#### Proposed Solution
**Performance Tracking:**
1. Track validation metrics per model
2. Generate comparison reports
3. Recommend optimal models for scenarios
**Metrics:**
- First-attempt pass rate
- Average attempts to success
- Augmentation frequency
- Validation score distributions
- Generation time
- Cost per successful article
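Aggregating per-attempt records into the report could look like this (a sketch; the record fields `model`, `stage`, `attempts`, `passed`, and `cost_usd` are assumed names):

```python
from collections import defaultdict
from statistics import mean

def summarize(records):
    """Group records by (model, stage) and compute the report metrics.
    First-attempt passes are those that validated with attempts == 1."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["model"], r["stage"])].append(r)
    summary = {}
    for key, rows in groups.items():
        summary[key] = {
            "first_attempt_pass_rate": sum(
                1 for r in rows if r["passed"] and r["attempts"] == 1) / len(rows),
            "avg_attempts": mean(r["attempts"] for r in rows),
            "avg_cost_usd": mean(r["cost_usd"] for r in rows),
        }
    return summary
```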
**Dashboard:**
```bash
# Model performance report
model-performance --days 30
# Output:
Model: claude-3.5-sonnet
Title: 98% pass rate, 1.02 avg attempts, $0.05 avg cost
Outline: 85% pass rate, 1.35 avg attempts, $0.15 avg cost
Content: 72% pass rate, 1.67 avg attempts, $0.89 avg cost
Model: gpt-4o
...
Recommendations:
- Use claude-3.5-sonnet for titles (best pass rate)
- Use gpt-4o for content (better quality scores)
```
#### Impact
- Data-driven model selection
- Optimize quality vs cost
- Identify model strengths/weaknesses
- Better tier-model mapping
---
### Improved Content Augmentation
**Priority**: Medium
**Epic Suggestion**: Epic 2 (Content Generation) - Enhancement
**Estimated Effort**: Medium (3-5 days)
#### Problem
Current augmentation is basic:
- Random word insertion can break sentence flow
- Doesn't consider context
- Can feel unnatural
- No quality scoring
#### Proposed Solution
**Smarter Augmentation:**
1. Use AI to rewrite sentences with missing terms
2. Analyze sentence structure before insertion
3. Add quality scoring for augmented vs original
4. User-reviewable augmentation suggestions
**Example:**
```python
# Instead of: "The process involves machine learning techniques."
# Random insert: "The process involves keyword machine learning techniques."
# Smarter: "The process involves keyword-driven machine learning techniques."
# Or: "The process, focused on keyword optimization, involves machine learning."
```
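The rewrite path could be implemented with the model call injected (a sketch; `rewrite` is a hypothetical callable wrapping the existing AI client):

```python
def augment_sentence(sentence: str, missing_terms, rewrite) -> str:
    """Ask the model to weave the missing terms into the sentence; fall back
    to the original (flagged for manual review) if any term is still absent."""
    prompt = (
        "Rewrite this sentence to naturally include the terms "
        f"{', '.join(missing_terms)} without changing its meaning:\n{sentence}"
    )
    candidate = rewrite(prompt)
    if all(term.lower() in candidate.lower() for term in missing_terms):
        return candidate
    return sentence  # keep the original rather than accept a lossy rewrite
```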
**Features:**
- Context-aware term insertion
- Sentence rewriting option
- A/B comparison (original vs augmented)
- Quality scoring
- Manual review mode
#### Impact
- More natural augmented content
- Better readability
- Higher quality scores
- User confidence in output
---
## Future Sections
Add new technical debt items below as they're identified during development.