Story 3.1 Implementation Summary

Overview

Story 3.1 adds URL generation and priority-based site assignment to batch content generation, including automatic creation of bunny.net sites when the existing pool is too small.

What Was Implemented

1. Database Schema Changes

  • Modified: src/database/models.py

    • Made custom_hostname nullable in SiteDeployment model
    • Added unique constraint to pull_zone_bcdn_hostname
    • Updated __repr__ to handle both custom and bcdn hostnames
  • Migration Script: scripts/migrate_story_3.1.sql

    • SQL script to update existing databases
    • Run this on your dev database before testing

2. Repository Layer Updates

  • Modified: src/database/interfaces.py

    • Changed custom_hostname to optional parameter in create() signature
    • Added get_by_bcdn_hostname() method signature
    • Updated exists() to check both hostname types
  • Modified: src/database/repositories.py

    • Made custom_hostname parameter optional with default None
    • Implemented get_by_bcdn_hostname() method
    • Updated exists() to query both custom and bcdn hostnames
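
The repository behaviour described above can be illustrated with an in-memory stand-in. This is only a sketch: the real repository is backed by SQLAlchemy, and the class names `SiteRecord` and `InMemorySiteRepository`, as well as the exact `create()` signature, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SiteRecord:
    """Hypothetical stand-in for the SiteDeployment model."""
    pull_zone_bcdn_hostname: str
    custom_hostname: Optional[str] = None  # nullable as of Story 3.1


class InMemorySiteRepository:
    """Hypothetical in-memory repository; the real one queries the database."""

    def __init__(self) -> None:
        self._sites: List[SiteRecord] = []

    def create(
        self,
        pull_zone_bcdn_hostname: str,
        custom_hostname: Optional[str] = None,
    ) -> SiteRecord:
        # Mirror the unique constraint on pull_zone_bcdn_hostname.
        if self.get_by_bcdn_hostname(pull_zone_bcdn_hostname) is not None:
            raise ValueError(f"duplicate bcdn hostname: {pull_zone_bcdn_hostname}")
        site = SiteRecord(pull_zone_bcdn_hostname, custom_hostname)
        self._sites.append(site)
        return site

    def get_by_bcdn_hostname(self, hostname: str) -> Optional[SiteRecord]:
        for site in self._sites:
            if site.pull_zone_bcdn_hostname == hostname:
                return site
        return None

    def exists(self, hostname: str) -> bool:
        # Check both hostname types, so bcdn-only sites are found too.
        return any(
            hostname in (site.custom_hostname, site.pull_zone_bcdn_hostname)
            for site in self._sites
        )
```

With this shape, a site imported without a custom domain is still discoverable through `exists()` and `get_by_bcdn_hostname()`.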

3. Template Service Update

  • Modified: src/templating/service.py
    • Line 92: Changed to hostname = site_deployment.custom_hostname or site_deployment.pull_zone_bcdn_hostname
    • Now handles sites with only bcdn hostnames

4. CLI Updates

  • Modified: src/cli/commands.py
    • Updated sync-sites command to import sites without custom domains
    • Removed filter that skipped bcdn-only sites
    • Now imports all bunny.net sites (with or without custom domains)

5. Site Provisioning Module (NEW)

  • Created: src/generation/site_provisioning.py
    • generate_random_suffix(): Creates random 4-char suffixes
    • slugify_keyword(): Converts keywords to URL-safe slugs
    • create_bunnynet_site(): Creates Storage Zone + Pull Zone via API
    • provision_keyword_sites(): Pre-creates sites for specific keywords
    • create_generic_sites(): Creates generic sites on-demand
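
As a rough illustration of the two naming helpers listed above, the following sketch shows one way they could work. The function bodies, the suffix alphabet (lowercase letters and digits), and the `build_site_name()` helper are assumptions, not the module's actual code.

```python
import random
import re
import string


def generate_random_suffix(length: int = 4) -> str:
    """Return a random suffix; the lowercase alphanumeric alphabet is an assumption."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(random.choices(alphabet, k=length))


def slugify_keyword(keyword: str) -> str:
    """Lowercase the keyword and collapse every non-alphanumeric run into one hyphen."""
    slug = re.sub(r"[^a-z0-9]+", "-", keyword.lower())
    return slug.strip("-")


def build_site_name(keyword: str) -> str:
    """Hypothetical helper: keyword slug plus a random suffix to avoid name clashes."""
    return f"{slugify_keyword(keyword)}-{generate_random_suffix()}"
```

The random suffix keeps names distinct when several sites are created for the same keyword.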

6. URL Generator Module (NEW)

  • Created: src/generation/url_generator.py
    • generate_slug(): Converts article titles to URL-safe slugs
    • generate_urls_for_batch(): Generates complete URLs for all articles in batch
    • Handles custom domains and bcdn hostnames
    • Returns full URL mappings with metadata

7. Job Config Extensions

  • Modified: src/generation/job_config.py
    • Added tier1_preferred_sites: Optional[List[str]] field
    • Added auto_create_sites: bool field (default: False)
    • Added create_sites_for_keywords: Optional[List[Dict]] field
    • Full validation for all new fields
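
The three new fields and their validation might look roughly like the sketch below. `JobSiteOptions` and `parse_site_options()` are hypothetical names, and the specific validation rules (non-empty keyword, positive integer count) are assumptions about what "full validation" covers.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class JobSiteOptions:
    """Hypothetical container for the three Story 3.1 job fields."""
    tier1_preferred_sites: Optional[List[str]] = None
    auto_create_sites: bool = False
    create_sites_for_keywords: Optional[List[Dict[str, Any]]] = None


def parse_site_options(raw: Dict[str, Any]) -> JobSiteOptions:
    """Validate the new fields of one job entry and return them as a dataclass."""
    preferred = raw.get("tier1_preferred_sites")
    if preferred is not None:
        if not isinstance(preferred, list) or not all(isinstance(h, str) for h in preferred):
            raise ValueError("tier1_preferred_sites must be a list of hostnames")

    auto_create = raw.get("auto_create_sites", False)
    if not isinstance(auto_create, bool):
        raise ValueError("auto_create_sites must be a boolean")

    keyword_sites = raw.get("create_sites_for_keywords")
    if keyword_sites is not None:
        if not isinstance(keyword_sites, list):
            raise ValueError("create_sites_for_keywords must be a list")
        for entry in keyword_sites:
            if not isinstance(entry, dict):
                raise ValueError("each keyword entry must be an object")
            keyword = entry.get("keyword")
            count = entry.get("count")
            if not isinstance(keyword, str) or not keyword.strip():
                raise ValueError("keyword must be a non-empty string")
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(count, bool) or not isinstance(count, int) or count < 1:
                raise ValueError("count must be a positive integer")

    return JobSiteOptions(preferred, auto_create, keyword_sites)
```

Omitting all three fields yields the defaults, so existing job configs would keep working unchanged under this sketch.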

8. Site Assignment Module (NEW)

  • Created: src/generation/site_assignment.py
    • assign_sites_to_batch(): Main assignment function with full priority system
    • _get_keyword_sites(): Helper to match sites by keyword
    • Priority system:
      • Tier1: preferred sites → keyword sites → random
      • Tier2+: keyword sites → random
    • Auto-creates sites when pool is insufficient (if enabled)
    • Prevents duplicate assignments within same batch

9. Comprehensive Tests

  • Created: tests/unit/test_url_generator.py - URL generation tests
  • Created: tests/unit/test_site_provisioning.py - Site creation tests
  • Created: tests/unit/test_site_assignment.py - Assignment logic tests
  • Created: tests/unit/test_job_config_extensions.py - Config parsing tests
  • Created: tests/integration/test_story_3_1_integration.py - Full workflow tests

10. Example Job Config

  • Created: jobs/example_story_3.1_full_features.json
    • Demonstrates all new features
    • Ready-to-use template

How to Use

Step 1: Migrate Your Database

Run the migration script on your development database:

-- From scripts/migrate_story_3.1.sql
ALTER TABLE site_deployments MODIFY COLUMN custom_hostname VARCHAR(255) NULL;
ALTER TABLE site_deployments ADD CONSTRAINT uq_pull_zone_bcdn_hostname UNIQUE (pull_zone_bcdn_hostname);

Step 2: Sync Existing Bunny.net Sites

Import your 400+ existing bunny.net buckets:

uv run python main.py sync-sites --admin-user your_admin --dry-run

Review the output, then run without --dry-run to import.

Step 3: Create a Job Config

Use the new fields in your job configuration:

{
  "jobs": [{
    "project_id": 1,
    "tiers": {
      "tier1": {"count": 10}
    },
    "tier1_preferred_sites": ["www.premium.com"],
    "auto_create_sites": true,
    "create_sites_for_keywords": [
      {"keyword": "engine repair", "count": 3}
    ]
  }]
}

Step 4: Use in Your Workflow

In your content generation workflow:

from src.generation.site_assignment import assign_sites_to_batch
from src.generation.url_generator import generate_urls_for_batch

# After content generation, assign sites
assign_sites_to_batch(
    content_records=generated_articles,
    job=job_config,
    site_repo=site_repository,
    bunny_client=bunny_client,
    project_keyword=project.main_keyword
)

# Generate URLs
urls = generate_urls_for_batch(
    content_records=generated_articles,
    site_repo=site_repository
)

# urls is a list of:
# [{
#   "content_id": 1,
#   "title": "How to Fix Your Engine",
#   "url": "https://www.example.com/how-to-fix-your-engine.html",
#   "tier": "tier1",
#   "slug": "how-to-fix-your-engine",
#   "hostname": "www.example.com"
# }, ...]

Site Assignment Priority Logic

For Tier1 Articles:

  1. Preferred Sites (from tier1_preferred_sites) - if specified
  2. Keyword Sites (matching article keyword in site name)
  3. Random from available pool

For Tier2+ Articles:

  1. Keyword Sites (matching article keyword in site name)
  2. Random from available pool

Auto-Creation:

If auto_create_sites: true and pool is insufficient:

  • Creates minimum number of generic sites needed
  • Uses project main keyword in site names
  • Creates via bunny.net API (Storage Zone + Pull Zone)
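
The priority order and the shortfall that triggers auto-creation can be sketched as follows. This is a simplified illustration, not the code in `site_assignment.py`: `pick_site()` and `assign_sites()` are hypothetical names, sites are represented by plain hostname strings, keyword matching is assumed to be a plain substring test against the hostname, and preferred sites are assumed to be part of the available pool.

```python
import random
from typing import Dict, List, Optional, Set, Tuple


def pick_site(
    tier: str,
    keyword: str,
    available: List[str],
    preferred: List[str],
    used: Set[str],
) -> Optional[str]:
    """Pick one hostname for an article, or None if the pool is exhausted."""
    free = [h for h in available if h not in used]

    # 1. Tier1 only: preferred sites come first.
    if tier == "tier1":
        for hostname in preferred:
            if hostname in free:
                return hostname

    # 2. Keyword sites: exact substring match, no fuzzy matching.
    keyword_slug = keyword.lower().replace(" ", "-")
    for hostname in free:
        if keyword_slug in hostname.lower():
            return hostname

    # 3. Random fallback from whatever is left.
    return random.choice(free) if free else None


def assign_sites(
    articles: List[Dict],
    available: List[str],
    preferred: List[str],
) -> Tuple[Dict[int, str], List[int]]:
    """Assign each article a distinct site; return assignments and unassigned ids."""
    used: Set[str] = set()
    assignments: Dict[int, str] = {}
    unassigned: List[int] = []
    for article in articles:
        hostname = pick_site(article["tier"], article["keyword"], available, preferred, used)
        if hostname is None:
            unassigned.append(article["id"])
        else:
            used.add(hostname)  # prevents duplicate assignment within the batch
            assignments[article["id"]] = hostname
    return assignments, unassigned
```

In this sketch, `len(unassigned)` is the minimum number of generic sites that would have to be created when `auto_create_sites` is enabled.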

URL Structure

With Custom Domain:

https://www.example.com/how-to-fix-your-engine.html

With Bunny.net CDN Only:

https://mysite123.b-cdn.net/how-to-fix-your-engine.html
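
Both URL forms follow from the same hostname fallback used in the template service (`custom_hostname or pull_zone_bcdn_hostname`). A minimal sketch, where `build_article_url()` is a hypothetical helper rather than a function from `url_generator.py`:

```python
from typing import Optional


def build_article_url(
    slug: str,
    custom_hostname: Optional[str],
    pull_zone_bcdn_hostname: str,
) -> str:
    """Build the public URL, preferring the custom domain when one is set."""
    hostname = custom_hostname or pull_zone_bcdn_hostname
    return f"https://{hostname}/{slug}.html"
```

Because `or` treats both `None` and an empty string as missing, a site with a blank custom hostname also falls back to its b-cdn.net hostname.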

Slug Generation Rules

  • Lowercase
  • Replace spaces with hyphens
  • Remove special characters
  • Max 100 characters
  • Fallback: article-{content_id} if empty
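
The rules above can be expressed in a few lines. This is a sketch under stated assumptions, not the code in `url_generator.py`: the signature (title plus content id), the treatment of "special characters" as anything other than ASCII letters, digits, whitespace, and hyphens, and the trimming of a trailing hyphen after truncation are all guesses.

```python
import re

MAX_SLUG_LENGTH = 100  # from the rules above


def generate_slug(title: str, content_id: int) -> str:
    """Turn an article title into a URL-safe slug (assumed signature)."""
    slug = title.lower()
    # Remove special characters, keeping letters, digits, whitespace, and hyphens.
    slug = re.sub(r"[^a-z0-9\s-]", "", slug)
    # Replace runs of whitespace or hyphens with a single hyphen.
    slug = re.sub(r"[\s-]+", "-", slug).strip("-")
    # Enforce the length limit without leaving a dangling hyphen.
    slug = slug[:MAX_SLUG_LENGTH].rstrip("-")
    # Fall back to the content id when nothing usable remains.
    return slug or f"article-{content_id}"
```

For example, a title made up only of punctuation produces an empty slug, which is where the `article-{content_id}` fallback applies.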

Testing

Run the tests:

# Unit tests
uv run pytest tests/unit/test_url_generator.py
uv run pytest tests/unit/test_site_provisioning.py
uv run pytest tests/unit/test_site_assignment.py
uv run pytest tests/unit/test_job_config_extensions.py

# Integration tests
uv run pytest tests/integration/test_story_3_1_integration.py

# All Story 3.1 tests
uv run pytest tests/ -k "story_3_1 or url_generator or site_provisioning or site_assignment or job_config_extensions"

Key Features

Simple Over Complex

  • No fuzzy keyword matching (as requested)
  • Straightforward priority system
  • Clear error messages
  • Minimal dependencies

Full Auto-Creation

  • Pre-create sites for specific keywords
  • Auto-create generic sites when needed
  • All sites use bunny.net API

Full Priority System

  • Tier1 preferred sites
  • Keyword-based matching
  • Random assignment fallback

Flexible Hostnames

  • Supports custom domains
  • Supports bcdn-only sites
  • Automatically chooses correct hostname

Production Deployment

When moving to production:

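  1. On a fresh database, the model changes apply automatically (SQLAlchemy creates the tables with the new schema)
  2. No additional migration scripts are needed in that case; if the database already has a site_deployments table, run scripts/migrate_story_3.1.sql first, because creating tables does not alter existing ones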
  3. Just ensure your production .env has BUNNY_ACCOUNT_API_KEY set
  4. Run sync-sites to import existing bunny.net infrastructure

Files Changed/Created

Modified (6 files):

  • src/database/models.py
  • src/database/interfaces.py
  • src/database/repositories.py
  • src/templating/service.py
  • src/cli/commands.py
  • src/generation/job_config.py

Created (11 files):

  • scripts/migrate_story_3.1.sql
  • src/generation/site_provisioning.py
  • src/generation/url_generator.py
  • src/generation/site_assignment.py
  • tests/unit/test_url_generator.py
  • tests/unit/test_site_provisioning.py
  • tests/unit/test_site_assignment.py
  • tests/unit/test_job_config_extensions.py
  • tests/integration/test_story_3_1_integration.py
  • jobs/example_story_3.1_full_features.json
  • STORY_3.1_IMPLEMENTATION_SUMMARY.md

Total Effort

Completed all 10 tasks from the story specification.