# Story 3.1 Implementation Summary

## Overview

Implemented URL generation and site assignment for batch content generation, including full auto-creation capabilities and priority-based site assignment.

## What Was Implemented

### 1. Database Schema Changes

- **Modified**: `src/database/models.py`
  - Made `custom_hostname` nullable in `SiteDeployment` model
  - Added unique constraint to `pull_zone_bcdn_hostname`
  - Updated `__repr__` to handle both custom and bcdn hostnames

- **Migration Script**: `scripts/migrate_story_3.1.sql`
  - SQL script to update existing databases
  - Run this on your dev database before testing

### 2. Repository Layer Updates

- **Modified**: `src/database/interfaces.py`
  - Changed `custom_hostname` to optional parameter in `create()` signature
  - Added `get_by_bcdn_hostname()` method signature
  - Updated `exists()` to check both hostname types

- **Modified**: `src/database/repositories.py`
  - Made `custom_hostname` parameter optional with default `None`
  - Implemented `get_by_bcdn_hostname()` method
  - Updated `exists()` to query both custom and bcdn hostnames

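The repository behavior described above can be illustrated with an in-memory stand-in. This is a minimal sketch, not the real repository: `InMemorySiteRepo` is a hypothetical class, sites are plain dictionaries, and the real `create()` very likely takes more parameters than the two shown here.

```python
from typing import Dict, List, Optional

Site = Dict[str, Optional[str]]


class InMemorySiteRepo:
    """Hypothetical in-memory stand-in for the site deployment repository."""

    def __init__(self) -> None:
        self._sites: List[Site] = []

    def create(self, pull_zone_bcdn_hostname: str,
               custom_hostname: Optional[str] = None) -> Site:
        # custom_hostname is optional, so bcdn-only sites can be stored.
        site: Site = {
            "custom_hostname": custom_hostname,
            "pull_zone_bcdn_hostname": pull_zone_bcdn_hostname,
        }
        self._sites.append(site)
        return site

    def get_by_bcdn_hostname(self, hostname: str) -> Optional[Site]:
        for site in self._sites:
            if site["pull_zone_bcdn_hostname"] == hostname:
                return site
        return None

    def exists(self, hostname: str) -> bool:
        # A hostname counts as taken if it matches either hostname field.
        return any(
            hostname in (s["custom_hostname"], s["pull_zone_bcdn_hostname"])
            for s in self._sites
        )


repo = InMemorySiteRepo()
repo.create("mysite123.b-cdn.net")
repo.create("other456.b-cdn.net", custom_hostname="www.example.com")
print(repo.exists("mysite123.b-cdn.net"))  # True
print(repo.exists("www.example.com"))      # True
print(repo.exists("www.unknown.com"))      # False
```
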
### 3. Template Service Update

- **Modified**: `src/templating/service.py`
  - Line 92: Changed to `hostname = site_deployment.custom_hostname or site_deployment.pull_zone_bcdn_hostname`
  - Now handles sites with only bcdn hostnames

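The fallback expression from line 92 can be tried in isolation. In the sketch below, `SiteStub` is a hypothetical stand-in for the `SiteDeployment` model and `resolve_hostname` is an illustrative helper name, not a function from the codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SiteStub:
    """Hypothetical stand-in for SiteDeployment with only the hostname fields."""
    pull_zone_bcdn_hostname: str
    custom_hostname: Optional[str] = None


def resolve_hostname(site: SiteStub) -> str:
    # Same expression as in the template service: prefer the custom
    # hostname, fall back to the bcdn hostname when it is None or empty.
    return site.custom_hostname or site.pull_zone_bcdn_hostname


custom = SiteStub("mysite123.b-cdn.net", custom_hostname="www.example.com")
bcdn_only = SiteStub("mysite123.b-cdn.net")
print(resolve_hostname(custom))     # www.example.com
print(resolve_hostname(bcdn_only))  # mysite123.b-cdn.net
```

Because `or` treats an empty string as falsy, a site whose `custom_hostname` is `""` also falls back to the bcdn hostname.
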
### 4. CLI Updates

- **Modified**: `src/cli/commands.py`
  - Updated `sync-sites` command to import sites without custom domains
  - Removed filter that skipped bcdn-only sites
  - Now imports all bunny.net sites (with or without custom domains)

### 5. Site Provisioning Module (NEW)

- **Created**: `src/generation/site_provisioning.py`
  - `generate_random_suffix()`: Creates random 4-char suffixes
  - `slugify_keyword()`: Converts keywords to URL-safe slugs
  - `create_bunnynet_site()`: Creates Storage Zone + Pull Zone via API
  - `provision_keyword_sites()`: Pre-creates sites for specific keywords
  - `create_generic_sites()`: Creates generic sites on-demand

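The two pure helpers in this module can be sketched without touching the bunny.net API. The function names match the list above, but the bodies are assumptions: the suffix alphabet (lowercase letters and digits), the slug rules, and the hypothetical `build_site_name` helper that combines them are illustrative and may differ from the actual implementation.

```python
import re
import secrets
import string

# Assumed alphabet; the real module may use a different character set.
SUFFIX_ALPHABET = string.ascii_lowercase + string.digits


def generate_random_suffix(length: int = 4) -> str:
    """Return a random suffix, 4 characters by default."""
    return "".join(secrets.choice(SUFFIX_ALPHABET) for _ in range(length))


def slugify_keyword(keyword: str) -> str:
    """Lowercase the keyword and collapse non-alphanumeric runs into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", keyword.lower())
    return slug.strip("-")


def build_site_name(keyword: str) -> str:
    # Hypothetical helper: the real naming scheme is not documented here.
    return f"{slugify_keyword(keyword)}-{generate_random_suffix()}"


print(slugify_keyword("Engine Repair"))  # engine-repair
print(build_site_name("Engine Repair"))  # random suffix, e.g. engine-repair-x7k2
```
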
### 6. URL Generator Module (NEW)

- **Created**: `src/generation/url_generator.py`
  - `generate_slug()`: Converts article titles to URL-safe slugs
  - `generate_urls_for_batch()`: Generates complete URLs for all articles in batch
  - Handles custom domains and bcdn hostnames
  - Returns full URL mappings with metadata

### 7. Job Config Extensions

- **Modified**: `src/generation/job_config.py`
  - Added `tier1_preferred_sites: Optional[List[str]]` field
  - Added `auto_create_sites: bool` field (default: False)
  - Added `create_sites_for_keywords: Optional[List[Dict]]` field
  - Full validation for all new fields

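To make the new fields and their validation concrete, here is a minimal sketch. The field names and types come from the list above; the `JobExtensions` dataclass, the `parse_extensions` function, and the specific validation rules and error messages are assumptions for illustration and may not match `job_config.py`.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class JobExtensions:
    """Hypothetical container for the three Story 3.1 fields only."""
    tier1_preferred_sites: Optional[List[str]] = None
    auto_create_sites: bool = False
    create_sites_for_keywords: Optional[List[Dict[str, Any]]] = None


def parse_extensions(raw: Dict[str, Any]) -> JobExtensions:
    """Validate the new fields of one job entry and return them."""
    preferred = raw.get("tier1_preferred_sites")
    if preferred is not None:
        if not isinstance(preferred, list) or not all(
            isinstance(site, str) for site in preferred
        ):
            raise ValueError("tier1_preferred_sites must be a list of strings")

    auto_create = raw.get("auto_create_sites", False)
    if not isinstance(auto_create, bool):
        raise ValueError("auto_create_sites must be a boolean")

    keyword_sites = raw.get("create_sites_for_keywords")
    if keyword_sites is not None:
        if not isinstance(keyword_sites, list):
            raise ValueError("create_sites_for_keywords must be a list")
        for entry in keyword_sites:
            if not isinstance(entry, dict):
                raise ValueError("each create_sites_for_keywords entry must be an object")
            keyword = entry.get("keyword")
            count = entry.get("count")
            if not isinstance(keyword, str) or not keyword.strip():
                raise ValueError("each entry needs a non-empty 'keyword' string")
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(count, bool) or not isinstance(count, int) or count < 1:
                raise ValueError("each entry needs a positive integer 'count'")

    return JobExtensions(preferred, auto_create, keyword_sites)


config = parse_extensions({
    "tier1_preferred_sites": ["www.premium.com"],
    "auto_create_sites": True,
    "create_sites_for_keywords": [{"keyword": "engine repair", "count": 3}],
})
print(config.auto_create_sites)  # True
```
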
### 8. Site Assignment Module (NEW)

- **Created**: `src/generation/site_assignment.py`
  - `assign_sites_to_batch()`: Main assignment function with full priority system
  - `_get_keyword_sites()`: Helper to match sites by keyword
  - **Priority system**:
    - Tier1: preferred sites → keyword sites → random
    - Tier2+: keyword sites → random
  - Auto-creates sites when pool is insufficient (if enabled)
  - Prevents duplicate assignments within same batch

### 9. Comprehensive Tests

- **Created**: `tests/unit/test_url_generator.py` - URL generation tests
- **Created**: `tests/unit/test_site_provisioning.py` - Site creation tests
- **Created**: `tests/unit/test_site_assignment.py` - Assignment logic tests
- **Created**: `tests/unit/test_job_config_extensions.py` - Config parsing tests
- **Created**: `tests/integration/test_story_3_1_integration.py` - Full workflow tests

### 10. Example Job Config

- **Created**: `jobs/example_story_3.1_full_features.json`
  - Demonstrates all new features
  - Ready-to-use template

## How to Use

### Step 1: Migrate Your Database

Run the migration script on your development database:

```sql
-- From scripts/migrate_story_3.1.sql
ALTER TABLE site_deployments MODIFY COLUMN custom_hostname VARCHAR(255) NULL;
ALTER TABLE site_deployments ADD CONSTRAINT uq_pull_zone_bcdn_hostname UNIQUE (pull_zone_bcdn_hostname);
```

### Step 2: Sync Existing Bunny.net Sites

Import your 400+ existing bunny.net buckets:

```bash
uv run python main.py sync-sites --admin-user your_admin --dry-run
```

Review the output, then run without `--dry-run` to import.

### Step 3: Create a Job Config

Use the new fields in your job configuration:

```json
{
  "jobs": [{
    "project_id": 1,
    "tiers": {
      "tier1": {"count": 10}
    },
    "tier1_preferred_sites": ["www.premium.com"],
    "auto_create_sites": true,
    "create_sites_for_keywords": [
      {"keyword": "engine repair", "count": 3}
    ]
  }]
}
```

### Step 4: Use in Your Workflow

In your content generation workflow:

```python
from src.generation.site_assignment import assign_sites_to_batch
from src.generation.url_generator import generate_urls_for_batch

# After content generation, assign sites
assign_sites_to_batch(
    content_records=generated_articles,
    job=job_config,
    site_repo=site_repository,
    bunny_client=bunny_client,
    project_keyword=project.main_keyword
)

# Generate URLs
urls = generate_urls_for_batch(
    content_records=generated_articles,
    site_repo=site_repository
)

# urls is a list of:
# [{
#     "content_id": 1,
#     "title": "How to Fix Your Engine",
#     "url": "https://www.example.com/how-to-fix-your-engine.html",
#     "tier": "tier1",
#     "slug": "how-to-fix-your-engine",
#     "hostname": "www.example.com"
# }, ...]
```

## Site Assignment Priority Logic

### For Tier1 Articles:

1. **Preferred Sites** (from `tier1_preferred_sites`) - if specified
2. **Keyword Sites** (matching article keyword in site name)
3. **Random** from available pool

### For Tier2+ Articles:

1. **Keyword Sites** (matching article keyword in site name)
2. **Random** from available pool

### Auto-Creation:

If `auto_create_sites: true` and pool is insufficient:

- Creates minimum number of generic sites needed
- Uses project main keyword in site names
- Creates via bunny.net API (Storage Zone + Pull Zone)

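The priority order above can be sketched as a small selection function. This is a simplified illustration, not the code in `site_assignment.py`: `pick_site` and `sites_to_create` are hypothetical names, sites are represented as bare hostname strings, the keyword match is a plain substring check on the hyphenated keyword, and the shortfall calculation assumes each article in a batch needs its own site.

```python
import random
from typing import List, Optional, Sequence


def pick_site(tier: str, keyword: str, available: List[str],
              preferred: Sequence[str] = (),
              rng: Optional[random.Random] = None) -> Optional[str]:
    """Pick one hostname by priority and remove it from `available`."""
    rng = rng or random.Random()
    keyword_slug = keyword.lower().replace(" ", "-")

    candidates: List[str] = []
    if tier == "tier1":
        # 1. Preferred sites apply to tier1 only.
        candidates = [host for host in available if host in preferred]
    if not candidates:
        # 2. Plain substring match on the site name, no fuzzy matching.
        candidates = [host for host in available if keyword_slug in host]

    if candidates:
        choice = candidates[0]
    elif available:
        # 3. Random fallback from whatever is left in the pool.
        choice = rng.choice(available)
    else:
        return None

    # Removing the site prevents duplicate assignments within the batch.
    available.remove(choice)
    return choice


def sites_to_create(article_count: int, pool_size: int) -> int:
    """Minimum number of generic sites needed, assuming one site per article."""
    return max(0, article_count - pool_size)


pool = ["www.premium.com", "engine-repair-x7k2.b-cdn.net", "misc-site.b-cdn.net"]
print(pick_site("tier1", "engine repair", pool, preferred=["www.premium.com"]))  # www.premium.com
print(pick_site("tier2", "engine repair", pool))  # engine-repair-x7k2.b-cdn.net
print(pick_site("tier2", "engine repair", pool))  # misc-site.b-cdn.net
print(sites_to_create(article_count=10, pool_size=3))  # 7
```
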
## URL Structure

### With Custom Domain:

```
https://www.example.com/how-to-fix-your-engine.html
```

### With Bunny.net CDN Only:

```
https://mysite123.b-cdn.net/how-to-fix-your-engine.html
```

## Slug Generation Rules

- Lowercase
- Replace spaces with hyphens
- Remove special characters
- Max 100 characters
- Fallback: `article-{content_id}` if empty

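The rules above, combined with the URL structure shown earlier, could look roughly like the following. This is a sketch under assumptions: the order in which the rules are applied, the handling of repeated hyphens, and the `build_article_url` helper are illustrative, and the `generate_slug` signature here is a guess rather than the one in `url_generator.py`.

```python
import re

MAX_SLUG_LENGTH = 100


def generate_slug(title: str, content_id: int) -> str:
    """Turn an article title into a URL-safe slug following the rules above."""
    slug = title.lower()
    # Remove special characters, keeping letters, digits, whitespace and hyphens.
    slug = re.sub(r"[^a-z0-9\s-]", "", slug)
    # Replace runs of whitespace (and repeated hyphens) with a single hyphen.
    slug = re.sub(r"[\s-]+", "-", slug)
    # Trim stray hyphens and enforce the 100-character limit.
    slug = slug.strip("-")[:MAX_SLUG_LENGTH].rstrip("-")
    # Fall back to the content ID when nothing usable is left.
    return slug or f"article-{content_id}"


def build_article_url(hostname: str, slug: str) -> str:
    """Hypothetical helper that joins a hostname and slug into an article URL."""
    return f"https://{hostname}/{slug}.html"


slug = generate_slug("How to Fix Your Engine", content_id=1)
print(slug)                                            # how-to-fix-your-engine
print(build_article_url("mysite123.b-cdn.net", slug))  # https://mysite123.b-cdn.net/how-to-fix-your-engine.html
print(generate_slug("!!!", content_id=7))              # article-7
```
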
## Testing

Run the tests:

```bash
# Unit tests
uv run pytest tests/unit/test_url_generator.py
uv run pytest tests/unit/test_site_provisioning.py
uv run pytest tests/unit/test_site_assignment.py
uv run pytest tests/unit/test_job_config_extensions.py

# Integration tests
uv run pytest tests/integration/test_story_3_1_integration.py

# All Story 3.1 tests
uv run pytest tests/ -k "story_3_1 or url_generator or site_provisioning or site_assignment or job_config_extensions"
```

## Key Features

### Simple Over Complex

- No fuzzy keyword matching (as requested)
- Straightforward priority system
- Clear error messages
- Minimal dependencies

### Full Auto-Creation

- Pre-create sites for specific keywords
- Auto-create generic sites when needed
- All sites use bunny.net API

### Full Priority System

- Tier1 preferred sites
- Keyword-based matching
- Random assignment fallback

### Flexible Hostnames

- Supports custom domains
- Supports bcdn-only sites
- Automatically chooses correct hostname

## Production Deployment

When moving to production:

1. On a fresh production database, the model changes apply automatically (SQLAlchemy creates the tables with the new schema)
2. No additional migration scripts are needed in that case; `scripts/migrate_story_3.1.sql` is only for updating existing databases
3. Ensure your production `.env` has `BUNNY_ACCOUNT_API_KEY` set
4. Run `sync-sites` to import existing bunny.net infrastructure

## Files Changed/Created

### Modified (6 files):

- `src/database/models.py`
- `src/database/interfaces.py`
- `src/database/repositories.py`
- `src/templating/service.py`
- `src/cli/commands.py`
- `src/generation/job_config.py`

### Created (11 files):

- `scripts/migrate_story_3.1.sql`
- `src/generation/site_provisioning.py`
- `src/generation/url_generator.py`
- `src/generation/site_assignment.py`
- `tests/unit/test_url_generator.py`
- `tests/unit/test_site_provisioning.py`
- `tests/unit/test_site_assignment.py`
- `tests/unit/test_job_config_extensions.py`
- `tests/integration/test_story_3_1_integration.py`
- `jobs/example_story_3.1_full_features.json`
- `STORY_3.1_IMPLEMENTATION_SUMMARY.md`

## Total Effort

Completed all 10 tasks from the story specification.