Add Epic 6: Multi-Cloud Storage Support planning and merge images branch changes

2025-12-10 11:37:56 -06:00 · 2025-12-10 11:37:56 -06:00 · 7e21482419
parent 62074cd995
commit 7e21482419
20 changed files with 1127 additions and 80 deletions
--- a/IMAGE_TEMPLATE_ISSUES_ANALYSIS.md
+++ b/IMAGE_TEMPLATE_ISSUES_ANALYSIS.md
@@ -0,0 +1,89 @@
 # Image and Template Issues Analysis
 ## Problems Identified
 ### 1. Missing Image CSS in Templates
 **Issue**: None of the templates (basic, modern, classic) have CSS for `<img>` tags.
 **Impact**: Images display at full size, breaking the layout, especially in the modern template, which constrains article width to 850px.
 **Solution**: Add responsive image CSS to all templates:
 ```css
 img {
    max-width: 100%;
    height: auto;
    display: block;
    margin: 1.5rem auto;
    border-radius: 8px;
 }
 ```
 ### 2. Template Storage Inconsistency
 **Issue**: `template_used` field is only set when `apply_template()` is called. If:
 - Templates are applied at different times
 - Some articles skip template application
 - Articles are moved between sites with different templates
 - Template application fails silently
 Then the database may show incorrect or missing template values.
 **Evidence**: User reports articles showing "basic" when they're actually "modern".
 **Solution**: 
 - Always apply templates before deployment
 - Re-apply templates if `template_used` doesn't match site's `template_name`
 - Add validation to ensure `template_used` matches site template
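The mismatch check described in the last two bullets could look roughly like the sketch below. Only the field names `template_used` and `template_name` come from this analysis; the `Article` and `Site` classes and the `reapply` callback are hypothetical stand-ins for the real models and for whatever wraps `apply_template()`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Article:  # hypothetical stand-in for the real content model
    template_used: Optional[str]


@dataclass
class Site:  # hypothetical stand-in for SiteDeployment
    template_name: str


def ensure_template_matches(article: Article, site: Site,
                            reapply: Callable[[Article, str], None]) -> bool:
    """Re-apply the site's template when the stored value is missing or stale.

    Returns True if a re-application was triggered, False if nothing changed.
    """
    if article.template_used == site.template_name:
        return False
    # `reapply` is assumed to regenerate formatted_html for the article.
    reapply(article, site.template_name)
    article.template_used = site.template_name
    return True
```

Running a check like this just before deployment would cover both the "skipped template" and "moved between sites" cases listed above.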
 ### 3. Images Lost During Interlink Injection
 **Issue**: Processing order:
 1. Images inserted into `content` → saved
 2. Interlinks injected → BeautifulSoup parses/rewrites HTML → saved
 3. Template applied → reads `content` → creates `formatted_html`
 BeautifulSoup parsing may break image tags or lose them during HTML rewriting.
 **Evidence**: User reports images were generated and uploaded (URLs in database) but don't appear in deployed articles.
 **Solution Options**:
 - **Option A**: Re-insert images after interlink injection (read from `hero_image_url` and `content_images` fields)
 - **Option B**: Use more robust HTML parsing that preserves all tags
 - **Option C**: Apply template immediately after image insertion, then inject interlinks into `formatted_html` instead of `content`
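As a rough illustration of Option A, the sketch below re-inserts any image URL that is stored in the database but no longer present in the HTML. It assumes `content_images` is a list of URLs, and its placement policy (hero image first, other images appended at the end) is hypothetical, since the original insertion positions are not recorded here. It uses only the standard library, not the project's BeautifulSoup code.

```python
from html import escape
from html.parser import HTMLParser
from typing import Iterable, Optional, Set


class _ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.add(src)


def reinsert_missing_images(content: str, hero_image_url: Optional[str],
                            content_images: Iterable[str]) -> str:
    """Return `content` with any missing stored images added back."""
    collector = _ImgSrcCollector()
    collector.feed(content)
    collector.close()
    present = collector.sources

    # Hypothetical placement: hero image at the top of the article.
    if hero_image_url and hero_image_url not in present:
        content = f'<img src="{escape(hero_image_url, quote=True)}" alt="" />\n' + content
    # Hypothetical placement: remaining images appended at the end.
    for url in content_images:
        if url not in present:
            content += f'\n<img src="{escape(url, quote=True)}" alt="" />'
    return content
```

Because images that are already present are skipped, the function is safe to run more than once on the same content.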
 ### 4. Image Size Not Constrained
 **Issue**: Even if images are present, they're not constrained by template CSS, causing layout issues.
 **Solution**: Add image CSS (see #1) and ensure images are inserted with proper attributes:
 ```html
 <img src="..." alt="..." style="max-width: 100%; height: auto;" />
 ```
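A minimal sketch of building that tag safely in Python follows. The helper name `build_img_tag` is hypothetical and is not taken from `image_injection.py`; the point is only that `src` and `alt` should be escaped before being placed in attributes.

```python
from html import escape


def build_img_tag(src: str, alt: str) -> str:
    """Return an <img> tag carrying the inline responsive style shown above."""
    return (
        f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}" '
        'style="max-width: 100%; height: auto;" />'
    )
```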
 ## Recommended Fixes
 ### Priority 1: Add Image CSS to All Templates
 Add responsive image styling to:
 - `src/templating/templates/basic.html`
 - `src/templating/templates/modern.html`
 - `src/templating/templates/classic.html`
 ### Priority 2: Fix Image Preservation
 Modify `src/interlinking/content_injection.py` to preserve images:
 - Try a more lenient BeautifulSoup parser such as `html5lib` in place of `html.parser`
 - Or re-insert images after interlink injection using database fields
 ### Priority 3: Fix Template Tracking
 - Add validation in deployment to ensure `template_used` matches site template
 - Re-apply templates if mismatch detected
 - Add script to backfill/correct `template_used` values
 ### Priority 4: Improve Image Insertion
 - Add `max-width` style attribute when inserting images
 - Ensure images are inserted with proper responsive attributes
 ## Code Locations
 - Image insertion: `src/generation/image_injection.py`
 - Interlink injection: `src/interlinking/content_injection.py` (lines 53-76)
 - Template application: `src/generation/service.py` (lines 409-460)
 - Template files: `src/templating/templates/*.html`
 - Deployment: `src/deployment/deployment_service.py` (uses `formatted_html`)
--- a/docs/job-schema.md
+++ b/docs/job-schema.md
@@ -41,6 +41,7 @@ Each job object defines a complete content generation batch for a specific proje
 | `auto_create_sites` | `boolean` | `false` | Whether to auto-create sites when pool is insufficient (Story 3.1) |
 | `create_sites_for_keywords` | `Array<Object>` | `null` | Array of keyword site creation configs (Story 3.1) |
 | `tiered_link_count_range` | `Object` | `null` | Configuration for tiered link counts (Story 3.2) |
 | `image_theme_prompt` | `string` | `null` | Override image theme prompt for all images in this job (Story 7.1) |
 ## Tier Configuration
@@ -212,6 +213,36 @@ Each tier in the `tiers` object defines content generation parameters for that s
 ### Implementation Status
 **Implemented** - The `models` field is fully functional. Different models can be specified for title, outline, and content generation stages. If a job file contains a `models` configuration and you also use the `--model` CLI flag, the system will warn you that the CLI flag is being ignored in favor of the job config.
 ## Image Theme Configuration (Story 7.1)
 ### `image_theme_prompt`
 - **Type**: `string` (optional)
 - **Purpose**: Override the image theme prompt for all images (hero and content) generated in this job
 - **Behavior**: 
  - If provided, this string is used directly as the theme prompt for all image generation
  - If not provided, the system checks for a cached theme in the project database
  - If no cached theme exists, a new theme is generated using AI based on the project's keyword, entities, and related searches
 - **Format**: A single string describing visual style, color scheme, lighting, environment, and overall aesthetic
 - **Note**: This is the prompt sent directly to the image generation API (fal.ai FLUX.1 schnell), not split into system/user messages
 ### Example
 ```json
 {
  "image_theme_prompt": "Modern industrial workspace, warm amber lighting, deep burgundy accents, professional photography style, clean minimalist aesthetic"
 }
 ```
 ### Theme Prompt Priority
 1. **Job override** (`image_theme_prompt` in job.json) - Highest priority
 2. **Database cache** (`Project.image_theme_prompt`) - Used if no override
 3. **AI generation** - Generated using `image_theme_generation.json` template if no cache exists
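The priority order above can be sketched as a small resolver. The function name and parameters are hypothetical; `generate_theme` stands in for whatever calls the AI with the `image_theme_generation.json` template. Whether the generated theme is then written back to `Project.image_theme_prompt` is not shown here.

```python
from typing import Callable, Optional


def resolve_theme_prompt(job_override: Optional[str],
                         cached_theme: Optional[str],
                         generate_theme: Callable[[], str]) -> str:
    """Pick the theme prompt: job override, then database cache, then AI."""
    if job_override:
        return job_override      # 1. image_theme_prompt from job.json
    if cached_theme:
        return cached_theme      # 2. Project.image_theme_prompt
    return generate_theme()      # 3. only called when nothing else exists
```

Passing the generator as a callable means the AI call is skipped entirely when an override or cached theme is available.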
 ### Best Practices
 - Use descriptive color schemes to avoid default blue tones
 - Include lighting, environment, and style details
 - Keep it concise (2-3 sentences recommended)
 - Consider the industry/product when choosing colors and aesthetic
 ## Tiered Link Configuration (Story 3.2)
 ### `tiered_link_count_range`
@@ -270,6 +301,7 @@ Each tier in the `tiers` object defines content generation parameters for that s
        "min": 3,
        "max": 5
      },
      "image_theme_prompt": "Modern industrial workspace, warm amber lighting, deep burgundy accents, professional photography style, clean minimalist aesthetic",
      "tiers": {
        "tier1": {
          "count": 10,
@@ -305,6 +337,7 @@ Each tier in the `tiers` object defines content generation parameters for that s
 - `auto_create_sites` must be a boolean (if specified)
 - `create_sites_for_keywords` must be an array of objects with `keyword` and `count` fields (if specified)
 - `tiered_link_count_range` must have `min` >= 1 and `max` >= `min` (if specified)
 - `image_theme_prompt` must be a non-empty string (if specified)
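As an illustration, the last two rules could be checked as in the sketch below. The function name is hypothetical, and two details are assumptions rather than documented behaviour: an explicit `null` is treated as "not specified", and a whitespace-only prompt is treated as empty.

```python
from typing import Any, Dict, List


def validate_job_fields(job: Dict[str, Any]) -> List[str]:
    """Return a list of validation error messages (empty when valid)."""
    errors: List[str] = []

    prompt = job.get("image_theme_prompt")
    if prompt is not None:
        if not isinstance(prompt, str) or not prompt.strip():
            errors.append("image_theme_prompt must be a non-empty string")

    link_range = job.get("tiered_link_count_range")
    if link_range is not None:
        low = link_range.get("min") if isinstance(link_range, dict) else None
        high = link_range.get("max") if isinstance(link_range, dict) else None
        valid = (isinstance(low, int) and isinstance(high, int)
                 and low >= 1 and high >= low)
        if not valid:
            errors.append("tiered_link_count_range must have min >= 1 and max >= min")

    return errors
```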
 ### Tier Level Validation
 - `count` must be a positive integer
@@ -362,6 +395,11 @@ uv run python main.py generate-batch --job-file jobs/example.json --username adm
 - Integrated with tiered link generation system
 - Added validation for link count ranges
 ### Story 7.1: Image Generation
 - Added `image_theme_prompt` for overriding image theme prompts
 - Allows manual control over visual style and color schemes
 - Overrides database cache and AI generation when specified
 ## Future Extensions
 The schema is designed to be extensible for future features:
--- a/docs/prd/epic-6-multi-cloud-storage.md
+++ b/docs/prd/epic-6-multi-cloud-storage.md
@@ -17,29 +17,40 @@ Currently, the system only supports Bunny.net storage, creating vendor lock-in a
 - **Story 6.3**: 🔄 PLANNING (Database Schema Updates)
 - **Story 6.4**: 🔄 PLANNING (URL Generation for S3)
 - **Story 6.5**: 🔄 PLANNING (S3-Compatible Services Support)
 - **Story 6.6**: 🔄 PLANNING (Bucket Provisioning Script)
 ## Stories
 ### Story 6.1: Storage Provider Abstraction Layer
-**Estimated Effort**: 5 story points
+**Estimated Effort**: 3 story points
-**As a developer**, I want a unified storage interface that abstracts provider-specific details, so that the deployment service can work with any storage provider without code changes.
+**As a developer**, I want a simple way to support multiple storage providers without cluttering `DeploymentService` with if/elif chains, so that adding new providers (eventually 8+) is straightforward.
 **Acceptance Criteria**:
-* Create a `StorageClient` protocol/interface with common methods:
+* Create a simple factory function `create_storage_client(site: SiteDeployment)` that returns the appropriate client:
-  - `upload_file(file_path: str, content: str, content_type: str) -> UploadResult`
+  - `'bunny'` → `BunnyStorageClient()`
-  - `file_exists(file_path: str) -> bool`
+  - `'s3'` → `S3StorageClient()`
-  - `list_files(prefix: str = '') -> List[str]`
+  - `'s3_compatible'` → `S3StorageClient()` (with custom endpoint)
-* Refactor `BunnyStorageClient` to implement the interface
+  - Future providers added here
-* Create a `StorageClientFactory` that returns the appropriate client based on provider type
+* Refactor `BunnyStorageClient.upload_file()` to accept `site: SiteDeployment` parameter:
-* Update `DeploymentService` to use the factory instead of hardcoding `BunnyStorageClient`
+  - Change from: `upload_file(zone_name, zone_password, zone_region, file_path, content)`
  - Change to: `upload_file(site: SiteDeployment, file_path: str, content: str)`
  - Client extracts bunny-specific fields from `site` internally
 * Update `DeploymentService` to use factory and unified interface:
  - Remove hardcoded `BunnyStorageClient` from `__init__`
  - In `deploy_article()` and `deploy_boilerplate_page()`: create client per site
  - Call: `client.upload_file(site, file_path, content)` (same signature for all providers)
 * Optional: Add `StorageClient` Protocol for type hints (helps with 8+ providers)
 * All existing Bunny.net deployments continue to work without changes
-* Unit tests verify interface compliance
+* Unit tests verify factory returns correct clients
 **Technical Notes**:
-* Use Python `Protocol` (typing) or ABC for interface definition
+* Factory function is a simple if/elif chain (one place to maintain)
-* Factory pattern: `create_storage_client(site: SiteDeployment) -> StorageClient`
+* All clients use same method signature: `upload_file(site, file_path, content)`
-* Maintain backward compatibility: default provider is "bunny" if not specified
+* Each client extracts provider-specific fields from `site` object internally
 * Protocol is optional but recommended for type safety with many providers
 * Factory pattern keeps `DeploymentService` clean (no provider-specific logic)
 * Backward compatibility: default provider is "bunny" if not specified
 ---
@@ -52,8 +63,12 @@ Currently, the system only supports Bunny.net storage, creating vendor lock-in a
 * Create `S3StorageClient` implementing `StorageClient` interface
 * Use boto3 library for AWS S3 operations
 * Support standard AWS S3 regions
-* Authentication via AWS credentials (access key ID, secret access key)
+* Authentication via AWS credentials from environment variables
-* Handle bucket permissions (public read access required)
+* Automatically configure bucket for public READ access only (not write):
  - Apply public-read ACL or bucket policy on first upload
  - Ensure bucket allows public read access (disable block public access settings)
  - Verify public read access is enabled before deployment
  - **Security**: Never enable public write access - only read permissions
 * Upload files with correct content-type headers
 * Generate public URLs from bucket name and region
 * Support custom domain mapping (if configured)
@@ -62,14 +77,14 @@ Currently, the system only supports Bunny.net storage, creating vendor lock-in a
 * Unit tests with mocked boto3 calls
 **Configuration**:
-* AWS credentials from environment variables:
+* AWS credentials from environment variables (global):
  - `AWS_ACCESS_KEY_ID`
  - `AWS_SECRET_ACCESS_KEY`
  - `AWS_REGION` (default region, can be overridden per-site)
 * Per-site configuration stored in database:
-  - `bucket_name`: S3 bucket name
+  - `s3_bucket_name`: S3 bucket name
-  - `bucket_region`: AWS region (optional, uses default if not set)
+  - `s3_bucket_region`: AWS region (optional, uses default if not set)
-  - `custom_domain`: Optional custom domain for URL generation
+  - `s3_custom_domain`: Optional custom domain for URL generation (manual setup)
 **URL Generation**:
 * Default: `https://{bucket_name}.s3.{region}.amazonaws.com/{file_path}`
@@ -79,7 +94,12 @@ Currently, the system only supports Bunny.net storage, creating vendor lock-in a
 **Technical Notes**:
 * boto3 session management (reuse sessions for performance)
 * Content-type detection (text/html for HTML files)
-* Public read ACL or bucket policy required for public URLs
+* Automatic public read access configuration (read-only, never write):
  - Check and configure bucket policy for public read access only
  - Disable "Block Public Access" settings for read access
  - Apply public-read ACL to uploaded objects (not public-write)
  - Validate public read access before deployment
  - **Security**: Uploads require authenticated credentials, only reads are public
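For reference, a read-only bucket policy of the kind described above can be generated as in this sketch. The helper name is hypothetical. The policy grants only `s3:GetObject` to anonymous callers, and the resulting JSON string is what one would pass as the `Policy` argument of boto3's `put_bucket_policy`. The boto3 call itself is left out so the example stays self-contained.

```python
import json


def public_read_policy(bucket_name: str) -> str:
    """Build a bucket policy that allows anonymous GET of objects only."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                # Read-only: no PutObject/DeleteObject actions are granted.
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
    return json.dumps(policy)
```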
 ---
@@ -139,30 +159,37 @@ Currently, the system only supports Bunny.net storage, creating vendor lock-in a
 ### Story 6.5: S3-Compatible Services Support
 **Estimated Effort**: 5 story points
-**As a user**, I want to deploy to S3-compatible services (DigitalOcean Spaces, Backblaze B2, Linode Object Storage), so that I can use cost-effective alternatives to AWS.
+**As a user**, I want to deploy to S3-compatible services (Linode Object Storage, DreamHost Object Storage, DigitalOcean Spaces), so that I can use S3-compatible storage providers the same way I use Bunny.net.
 **Acceptance Criteria**:
 * Extend `S3StorageClient` to support S3-compatible endpoints
 * Support provider-specific configurations:
  - **DigitalOcean Spaces**: Custom endpoint (e.g., `https://nyc3.digitaloceanspaces.com`)
  - **Backblaze B2**: Custom endpoint and authentication
  - **Linode Object Storage**: Custom endpoint
  - **DreamHost Object Storage**: Custom endpoint
  - **DigitalOcean Spaces**: Custom endpoint (e.g., `https://nyc3.digitaloceanspaces.com`)
 * Store `s3_endpoint_url` per site for custom endpoints
 * Handle provider-specific authentication differences
 * Support provider-specific URL generation
 * Configuration examples in documentation
 * Unit tests for each supported service
-**Supported Services** (Initial):
+**Supported Services**:
-* DigitalOcean Spaces
+* AWS S3 (standard)
 * Backblaze B2
 * Linode Object Storage
-* (Others can be added as needed)
+* DreamHost Object Storage
 * DigitalOcean Spaces
 * Backblaze
 * Cloudflare
 * (Other S3-compatible services can be added as needed)
 **Configuration**:
-* Per-service credentials in `.env` or per-site in database
+* Per-service credentials in `.env` (global environment variables):
  - `LINODE_ACCESS_KEY` / `LINODE_SECRET_KEY` (for Linode)
  - `DREAMHOST_ACCESS_KEY` / `DREAMHOST_SECRET_KEY` (for DreamHost)
  - `DO_SPACES_ACCESS_KEY` / `DO_SPACES_SECRET_KEY` (for DigitalOcean)
 * Endpoint URLs stored per-site in `s3_endpoint_url` field
 * Provider type stored in `storage_provider` ('s3_compatible')
 * Automatic public access configuration (same as AWS S3)
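One way the per-service environment variables could be looked up is sketched below. The variable names are the ones listed above; the `service` keys (`"linode"`, `"dreamhost"`, `"digitalocean"`, `"aws"`) and the function itself are hypothetical, since this story does not say how a site's service is identified beyond `storage_provider` and `s3_endpoint_url`.

```python
import os
from typing import Mapping, Tuple

# Service key (hypothetical) -> (access key variable, secret key variable)
CREDENTIAL_ENV_VARS = {
    "aws": ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"),
    "linode": ("LINODE_ACCESS_KEY", "LINODE_SECRET_KEY"),
    "dreamhost": ("DREAMHOST_ACCESS_KEY", "DREAMHOST_SECRET_KEY"),
    "digitalocean": ("DO_SPACES_ACCESS_KEY", "DO_SPACES_SECRET_KEY"),
}


def load_credentials(service: str,
                     env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (access_key, secret_key) for a service, or raise ValueError."""
    try:
        key_var, secret_var = CREDENTIAL_ENV_VARS[service]
    except KeyError:
        raise ValueError(f"No credential mapping for service: {service}") from None
    missing = [name for name in (key_var, secret_var) if not env.get(name)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return env[key_var], env[secret_var]
```

Accepting `env` as a parameter keeps the lookup easy to unit test without touching the real environment.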
 **Technical Notes**:
 * Most S3-compatible services work with boto3 using custom endpoints
@@ -171,61 +198,126 @@ Currently, the system only supports Bunny.net storage, creating vendor lock-in a
 ---
 ### Story 6.6: S3 Bucket Provisioning Script
 **Estimated Effort**: 3 story points
 **As a user**, I want a script to automatically create and configure S3 buckets with proper public access settings, so that I can quickly set up new storage targets without manual AWS console work.
 **Acceptance Criteria**:
 * Create CLI command: `provision-s3-bucket --name <bucket> --region <region> [--provider <s3|linode|dreamhost|do>]`
 * Automatically create bucket if it doesn't exist
 * Configure bucket for public read access only (not write):
  - Apply bucket policy allowing public read (GET requests only)
  - Disable "Block Public Access" settings for read access
  - Set appropriate CORS headers if needed
  - **Security**: Never enable public write access - uploads require authentication
 * Support multiple providers:
  - AWS S3 (standard regions)
  - Linode Object Storage
  - DreamHost Object Storage
  - DigitalOcean Spaces
 * Validate bucket configuration after creation
 * Option to link bucket to existing site deployment
 * Clear error messages for common issues (bucket name conflicts, permissions, etc.)
 * Documentation with examples for each provider
 **Usage Examples**:
 ```bash
 # Create AWS S3 bucket
 provision-s3-bucket --name my-site-bucket --region us-east-1
 # Create Linode bucket
 provision-s3-bucket --name my-site-bucket --region us-east-1 --provider linode
 # Create and link to site
 provision-s3-bucket --name my-site-bucket --region us-east-1 --site-id 5
 ```
 **Technical Notes**:
 * Uses boto3 for all providers (with custom endpoints for S3-compatible)
 * Bucket naming validation (AWS rules apply)
 * Idempotent: safe to run multiple times
 * Optional: Can be integrated into `provision-site` command later
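The bucket naming validation mentioned above might start from a check like the following. It is a partial sketch covering the core AWS rules for general-purpose buckets: 3 to 63 characters; lowercase letters, digits, dots, and hyphens only; a letter or digit at both ends; no adjacent dots; and not formatted like an IP address. Reserved prefixes and suffixes are deliberately not checked, and the function name is hypothetical.

```python
import re

# 3-63 chars, allowed characters only, alphanumeric at both ends.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
# Names that look like an IPv4 address are not allowed.
_IP_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a bucket name against a subset of the AWS naming rules."""
    if not _BUCKET_RE.fullmatch(name):
        return False
    if ".." in name:
        return False
    if _IP_RE.fullmatch(name):
        return False
    return True
```

Rejecting bad names before calling the provider gives the clearer error messages asked for in the acceptance criteria.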
 ---
 ## Technical Considerations
 ### Architecture Changes
-1. **Interface/Protocol Design**:
+1. **Unified Method Signature**:
   ```python
-   class StorageClient(Protocol):
+   # All storage clients use the same signature
-       def upload_file(...) -> UploadResult: ...
+   class BunnyStorageClient:
-       def file_exists(...) -> bool: ...
+       def upload_file(self, site: SiteDeployment, file_path: str, content: str) -> UploadResult:
-       def list_files(...) -> List[str]: ...
+           # Extract bunny-specific fields from site
           zone_name = site.storage_zone_name
           zone_password = site.storage_zone_password
           # ... do upload
   class S3StorageClient:
       def upload_file(self, site: SiteDeployment, file_path: str, content: str) -> UploadResult:
           # Extract S3-specific fields from site
           bucket_name = site.s3_bucket_name
           # ... do upload
   ```
-2. **Factory Pattern**:
+2. **Simple Factory Function**:
   ```python
-   def create_storage_client(site: SiteDeployment) -> StorageClient:
+   def create_storage_client(site: SiteDeployment):
       """Create appropriate storage client based on site provider"""
       if site.storage_provider == 'bunny':
           return BunnyStorageClient()
-       elif site.storage_provider in ('s3', 's3_compatible'):
+       elif site.storage_provider == 's3':
-           return S3StorageClient(site)
+           return S3StorageClient()
       elif site.storage_provider == 's3_compatible':
           return S3StorageClient()  # Same client, uses site.s3_endpoint_url
       # Future: elif site.storage_provider == 'cloudflare': ...
       else:
           raise ValueError(f"Unknown provider: {site.storage_provider}")
   ```
-3. **Dependency Injection**:
+3. **Clean DeploymentService**:
-   - `DeploymentService` receives `StorageClient` from factory
+   ```python
-   - No hardcoded provider dependencies
+   # In deploy_article():
   client = create_storage_client(site)  # One line, works for all providers
   client.upload_file(site, file_path, content)  # Same call for all
   ```
 4. **Optional Protocol** (recommended for type safety with 8+ providers):
   ```python
   from typing import Protocol
   class StorageClient(Protocol):
       def upload_file(self, site: SiteDeployment, file_path: str, content: str) -> UploadResult: ...
   ```
 ### Credential Management
-**Option A: Environment Variables (Recommended for AWS)**
+**Decision: Global Environment Variables**
- Global AWS credentials in `.env`
+- All credentials stored in `.env` file (global)
- Simple, secure, follows AWS best practices
+- AWS: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`
 - Linode: `LINODE_ACCESS_KEY`, `LINODE_SECRET_KEY`
 - DreamHost: `DREAMHOST_ACCESS_KEY`, `DREAMHOST_SECRET_KEY`
 - DigitalOcean: `DO_SPACES_ACCESS_KEY`, `DO_SPACES_SECRET_KEY`
 - Simple, secure, follows cloud provider best practices
 - Works well for single-account deployments
-
+- Per-site credentials can be added later if needed for multi-account scenarios
 **Option B: Per-Site Credentials**
 - Store credentials in database (encrypted)
 - Required for multi-account or S3-compatible services
 - More complex but more flexible
 **Decision Needed**: Which approach for initial implementation?
 ### URL Generation Strategy
 **Bunny.net**: Uses CDN hostname (custom or bunny.net domain)
-**AWS S3**: Uses bucket name + region or custom domain
+**AWS S3**: Uses bucket name + region or custom domain (manual setup)
-**S3-Compatible**: Uses service-specific endpoint or custom domain
+**S3-Compatible**: Uses service-specific endpoint or custom domain (manual setup)
-All providers should support custom domain mapping for consistent URLs.
+Custom domain mapping is supported but requires manual configuration (documented, not automated).
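A minimal sketch of the AWS S3 case follows. The default pattern is the one given under Story 6.2 (`https://{bucket_name}.s3.{region}.amazonaws.com/{file_path}`). The custom-domain form, which assumes HTTPS and files served from the domain root, is an assumption. S3-compatible endpoints are omitted because their URL layout is provider-specific.

```python
from typing import Optional


def build_public_url(bucket_name: str, region: str, file_path: str,
                     custom_domain: Optional[str] = None) -> str:
    """Return the public URL for an uploaded file on AWS S3."""
    path = file_path.lstrip("/")  # avoid a double slash after the host
    if custom_domain:
        # Assumed form; the domain itself must be configured manually.
        return f"https://{custom_domain}/{path}"
    return f"https://{bucket_name}.s3.{region}.amazonaws.com/{path}"
```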
 ### Backward Compatibility
 - All existing Bunny.net sites continue to work
 - Default `storage_provider='bunny'` for existing records
 - No breaking changes to existing APIs
- Migration is optional (sites can stay on Bunny.net)
+- No migration tools provided (sites can stay on Bunny.net or be manually reconfigured)
 ### Testing Strategy
@@ -240,26 +332,42 @@ All providers should support custom domain mapping for consistent URLs.
 - Existing deployment infrastructure (Epic 4)
 - Database migration tools
-## Open Questions
+## Decisions Made
-1. **Credential Storage**: Per-site in DB vs. global env vars? (Recommendation: Start with env vars, add per-site later if needed)
+1. **Credential Storage**: ✅ Global environment variables (Option A)
   - All credentials in `.env` file
   - Simple, secure, follows cloud provider best practices
-2. **S3-Compatible Priority**: Which services to support first? (Recommendation: DigitalOcean Spaces, then Backblaze B2)
+2. **S3-Compatible Services**: ✅ Support Linode, DreamHost, and DigitalOcean
   - All services supported equally - no priority/decision logic in this epic
   - Provider selection happens elsewhere in the codebase
   - This epic just enables S3-compatible services to work the same as Bunny.net
-3. **Custom Domains**: How are custom domains configured? Manual setup or automated? (Recommendation: Manual for now, document process)
+3. **Custom Domains**: ✅ Manual setup (deferred automation)
   - Custom domains require manual configuration
   - Documented process, no automation in this epic
-4. **Bucket Provisioning**: Should we automate S3 bucket creation, or require manual setup? (Recommendation: Manual for now, similar to current Bunny.net approach)
+4. **Bucket Provisioning**: ✅ Manual with optional script (Story 6.6)
   - Primary: Manual bucket creation
   - Optional: `provision-s3-bucket` CLI script for automated setup
-5. **Public Access**: How to ensure buckets are publicly readable? (Recommendation: Document requirements, validate in tests)
+5. **Public Access**: ✅ Automatic configuration (read-only)
   - System automatically configures buckets for public READ access only
   - Applies bucket policies for read access, disables block public access, sets public-read ACLs
   - **Security**: Never enables public write access - all uploads require authenticated credentials
-6. **Migration Path**: Should we provide tools to migrate existing Bunny.net sites to S3? (Recommendation: Defer to future story)
+6. **Migration Path**: ✅ No migration tools
   - No automated migration from Bunny.net to S3
   - Sites can be manually reconfigured if needed
 ## Success Metrics
 - ✅ Deploy content to AWS S3 successfully
- ✅ Deploy content to at least one S3-compatible service
+- ✅ Deploy content to S3-compatible services (Linode, DreamHost, DigitalOcean) successfully
 - ✅ All existing Bunny.net deployments continue working
 - ✅ URL generation works correctly for all providers
 - ✅ Buckets automatically configured for public read access (not write)
 - ✅ Zero breaking changes to existing functionality
 - ✅ Bucket provisioning script works for all supported providers
--- a/jobs/README.md
+++ b/jobs/README.md
@@ -32,6 +32,7 @@ Job files define batch content generation parameters using JSON format.
 - `tiers` (required): Dictionary of tier configurations
 - `deployment_targets` (optional): Array of site custom_hostnames or site_deployment_ids to cycle through
 - `deployment_overflow` (optional): Strategy when batch size exceeds deployment_targets ("round_robin", "random_available", or "none"). Default: "round_robin"
 - `image_theme_prompt` (optional): Override the image theme prompt for all images in this job. If not specified, uses the cached theme from the database or generates a new one using AI. This is a single string that describes the visual style, color scheme, lighting, and overall aesthetic for generated images.
 ### Tier Level
 - `count` (required): Number of articles to generate for this tier
@@ -155,6 +156,25 @@ If tier parameters are not specified, these defaults are used:
 }
 ```
 ### Custom Image Theme
 ```json
 {
  "jobs": [
    {
      "project_id": 1,
      "image_theme_prompt": "Modern industrial workspace, warm amber lighting, deep burgundy accents, professional photography style, clean minimalist aesthetic",
      "tiers": {
        "tier1": {
          "count": 5
        }
      }
    }
  ]
 }
 ```
 The `image_theme_prompt` overrides the default AI-generated theme for all images (hero and content) in this job. Use it to ensure consistent visual styling or to avoid default color schemes. If omitted, the system will use the cached theme from the project database, or generate a new one if none exists.
 ## Usage
 Run batch generation with:
--- a/scripts/assign_templates_to_domains.py
+++ b/scripts/assign_templates_to_domains.py
@@ -0,0 +1,76 @@
 """
 Randomly assign templates to all domains still using the "basic" template
 Usage:
    uv run python scripts/assign_templates_to_domains.py
 """
 import sys
 from pathlib import Path
 import random
 project_root = Path(__file__).parent.parent
 sys.path.insert(0, str(project_root))
 from src.database.session import db_manager
 from src.database.models import SiteDeployment
 from src.templating.service import TemplateService
 def assign_templates():
    """Randomly assign a template to every site deployment currently using 'basic'"""
    db_manager.initialize()
    session = db_manager.get_session()
    try:
        template_service = TemplateService()
        available_templates = template_service.get_available_templates()
        if not available_templates:
            print("Error: No templates found!")
            return
        print(f"Available templates: {', '.join(available_templates)}")
        sites = session.query(SiteDeployment).all()
        fqdn_sites = [s for s in sites if s.custom_hostname is not None]
        bcdn_sites = [s for s in sites if s.custom_hostname is None]
        print(f"\nTotal sites: {len(sites)}")
        print(f"  FQDN domains: {len(fqdn_sites)}")
        print(f"  b-cdn.net domains: {len(bcdn_sites)}")
        updated_fqdn = 0
        updated_bcdn = 0
        for site in fqdn_sites:
            if site.template_name == "basic":
                site.template_name = random.choice(available_templates)
                updated_fqdn += 1
        for site in bcdn_sites:
            if site.template_name == "basic":
                site.template_name = random.choice(available_templates)
                updated_bcdn += 1
        session.commit()
        print(f"\nUpdated templates:")
        print(f"  FQDN domains: {updated_fqdn}")
        print(f"  b-cdn.net domains: {updated_bcdn}")
        print(f"  Total: {updated_fqdn + updated_bcdn}")
    except Exception as e:
        session.rollback()
        print(f"Error: {e}")
        raise
    finally:
        session.close()
 if __name__ == "__main__":
    assign_templates()
--- a/scripts/check_theme_prompts.py
+++ b/scripts/check_theme_prompts.py
@@ -0,0 +1,110 @@
 """
 Script to check image theme prompts in the database
 """
 import sys
 from pathlib import Path
 sys.path.insert(0, str(Path(__file__).parent.parent))
 from src.database.session import db_manager
 from src.database.repositories import ProjectRepository
 def check_theme_prompts():
    """Check all projects and their image theme prompts"""
    db_manager.initialize()
    session = db_manager.get_session()
    try:
        project_repo = ProjectRepository(session)
        projects = project_repo.get_all()
        print("=" * 80)
        print("IMAGE THEME PROMPTS IN DATABASE")
        print("=" * 80)
        print()
        projects_with_themes = []
        projects_without_themes = []
        for project in projects:
            if project.image_theme_prompt:
                projects_with_themes.append(project)
            else:
                projects_without_themes.append(project)
        print(f"Total projects: {len(projects)}")
        print(f"Projects WITH theme prompts: {len(projects_with_themes)}")
        print(f"Projects WITHOUT theme prompts: {len(projects_without_themes)}")
        print()
        if projects_with_themes:
            print("=" * 80)
            print("PROJECTS WITH THEME PROMPTS:")
            print("=" * 80)
            print()
            for project in projects_with_themes:
                print(f"Project ID: {project.id}")
                print(f"Name: {project.name}")
                print(f"Main Keyword: {project.main_keyword}")
                print(f"Theme Prompt:")
                print(f"  {project.image_theme_prompt}")
                print()
                print("-" * 80)
                print()
        if projects_without_themes:
            print("=" * 80)
            print("PROJECTS WITHOUT THEME PROMPTS:")
            print("=" * 80)
            print()
            for project in projects_without_themes:
                print(f"  ID {project.id}: {project.name} ({project.main_keyword})")
            print()
        # Check for common patterns
        if projects_with_themes:
            print("=" * 80)
            print("ANALYSIS:")
            print("=" * 80)
            print()
            blue_mentions = []
            for project in projects_with_themes:
                theme_lower = project.image_theme_prompt.lower()
                if 'blue' in theme_lower:
                    blue_mentions.append((project.id, project.name, project.image_theme_prompt))
            print(f"Projects mentioning 'blue': {len(blue_mentions)}/{len(projects_with_themes)}")
            if blue_mentions:
                print()
                print("Projects with 'blue' in theme:")
                for proj_id, name, theme in blue_mentions:
                    print(f"  ID {proj_id}: {name}")
                    print(f"    Theme: {theme}")
                    print()
            # Check for other common color mentions
            # (plain substring match, so 'red' also matches words like 'textured')
            colors = ['red', 'green', 'yellow', 'orange', 'purple', 'gray', 'grey', 'black', 'white']
            color_counts = {}
            for color in colors:
                count = sum(1 for p in projects_with_themes if color in p.image_theme_prompt.lower())
                if count > 0:
                    color_counts[color] = count
            if color_counts:
                print("Other color mentions:")
                for color, count in sorted(color_counts.items(), key=lambda x: x[1], reverse=True):
                    print(f"  {color}: {count} projects")
                print()
    finally:
        session.close()
        db_manager.close()
 if __name__ == "__main__":
    check_theme_prompts()
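Review note: the color checks above use plain substring matching, so a prompt containing "covered" or "blurred" also counts as a mention of "red". A minimal sketch of a word-boundary variant follows; `count_color_mentions` is a hypothetical helper, not part of this commit, and it takes plain strings rather than project rows.

```python
import re
from collections import Counter
from typing import Iterable, Sequence


def count_color_mentions(prompts: Sequence[str], colors: Iterable[str]) -> Counter:
    """Count how many prompts mention each color as a whole word."""
    counts: Counter = Counter()
    for color in colors:
        # \b anchors keep "red" from matching inside "covered" or "blurred".
        pattern = re.compile(rf"\b{re.escape(color)}\b", re.IGNORECASE)
        hits = sum(1 for prompt in prompts if pattern.search(prompt))
        if hits:
            counts[color] = hits
    return counts
```

Whether whole-word matching is preferable depends on how the theme prompts are written; hyphenated forms such as "blue-gray" still match because the hyphen is a word boundary.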
--- a/scripts/count_templates_by_domain.py
+++ b/scripts/count_templates_by_domain.py
@ -0,0 +1,44 @@
 """
 Count how many domains have each template assigned
 Usage:
    uv run python scripts/count_templates_by_domain.py
 """
 import sys
 from pathlib import Path
 from collections import Counter
 project_root = Path(__file__).parent.parent
 sys.path.insert(0, str(project_root))
 from src.database.session import db_manager
 from src.database.models import SiteDeployment
 def count_templates():
    """Count templates across all site deployments"""
    db_manager.initialize()
    session = db_manager.get_session()
    try:
        sites = session.query(SiteDeployment.template_name).all()
        template_counts = Counter()
        for (template_name,) in sites:
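            # Group sites without a template under a label so sorting and
            # formatting below do not fail on None
            template_counts[template_name or "(none)"] += 1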
        print(f"\nTotal sites: {sum(template_counts.values())}")
        print("\nTemplate distribution:")
        print("-" * 40)
        for template, count in sorted(template_counts.items()):
            print(f"  {template:20} : {count:4}")
        print("-" * 40)
    finally:
        session.close()
 if __name__ == "__main__":
    count_templates()
--- a/scripts/list_t1_articles.py
+++ b/scripts/list_t1_articles.py
@ -0,0 +1,66 @@
 """
 List all Tier 1 articles for a project with their URLs, templates, and hero URLs
 Usage:
    uv run python scripts/list_t1_articles.py [project_id]
 If project_id is not provided, defaults to project 30.
 """
 import sys
 from pathlib import Path
 project_root = Path(__file__).parent.parent
 sys.path.insert(0, str(project_root))
 from src.database.session import db_manager
 from src.database.repositories import GeneratedContentRepository, ProjectRepository
 def list_t1_articles(project_id: int = 30):
    """List all Tier 1 articles for a project"""
    session = db_manager.get_session()
    try:
        content_repo = GeneratedContentRepository(session)
        project_repo = ProjectRepository(session)
        project = project_repo.get_by_id(project_id)
        if not project:
            print(f"Project {project_id} not found")
            return
        articles = content_repo.get_by_project_and_tier(project_id, "tier1", require_site=False)
        if not articles:
            print(f"No Tier 1 articles found for project {project_id}")
            return
        print(f"\nProject {project_id}: {project.name}")
        print("=" * 140)
        print(f"{'Article URL':<60} {'Template':<20} {'Hero URL':<60}")
        print("=" * 140)
        for article in articles:
            article_url = article.deployed_url or "(Not deployed)"
            template = article.template_used or "(No template)"
            hero_url = article.hero_image_url or "(No hero image)"
            print(f"{article_url:<60} {template:<20} {hero_url:<60}")
        print("=" * 140)
        print(f"\nTotal Tier 1 articles: {len(articles)}")
    finally:
        session.close()
 if __name__ == "__main__":
    project_id = 30
    if len(sys.argv) > 1:
        try:
            project_id = int(sys.argv[1])
        except ValueError:
            print(f"Invalid project_id: {sys.argv[1]}. Using default: 30")
    list_t1_articles(project_id)
--- a/scripts/test_image_generation.py
+++ b/scripts/test_image_generation.py
@ -266,8 +266,7 @@ def test_image_generation(project_id: int):
            except Exception as e:
                click.echo(f"   [ERROR] {str(e)[:200]}")
-            click.echo("\n2. Content Images:")
+         
            click.echo("   (Skipped - T2 articles don't get content images by default)")
        click.echo(f"\n\n{'='*60}")
        click.echo("TEST COMPLETE")
--- a/scripts/test_image_reinsertion.py
+++ b/scripts/test_image_reinsertion.py
@ -0,0 +1,173 @@
 """
 Test script to verify image reinsertion after interlink injection
 Tests the new flow:
 1. Get existing articles (2 T1, 2 T2) from project 30
 2. Simulate interlink injection (already done, just read current content)
 3. Re-insert images using _reinsert_images logic
 4. Apply templates
 5. Save formatted HTML locally to verify images display
 Usage:
    uv run python scripts/test_image_reinsertion.py
 """
 import sys
 from pathlib import Path
 project_root = Path(__file__).parent.parent
 sys.path.insert(0, str(project_root))
 from src.database.session import db_manager
 from src.database.repositories import GeneratedContentRepository, ProjectRepository, SiteDeploymentRepository
 from src.generation.image_injection import insert_hero_after_h1, insert_content_images_after_h2s, generate_alt_text
 from src.templating.service import TemplateService
 def test_image_reinsertion(project_id: int = 30):
    """Test image reinsertion on existing articles"""
    session = db_manager.get_session()
    try:
        content_repo = GeneratedContentRepository(session)
        project_repo = ProjectRepository(session)
        site_repo = SiteDeploymentRepository(session)
        project = project_repo.get_by_id(project_id)
        if not project:
            print(f"Project {project_id} not found")
            return
        # Get 2 T1 and 2 T2 articles
        t1_articles = content_repo.get_by_project_and_tier(project_id, "tier1", require_site=False)
        t2_articles = content_repo.get_by_project_and_tier(project_id, "tier2", require_site=False)
        if len(t1_articles) < 2:
            print(f"Not enough T1 articles (found {len(t1_articles)}, need 2)")
            return
        if len(t2_articles) < 2:
            print(f"Not enough T2 articles (found {len(t2_articles)}, need 2)")
            return
        test_articles = t1_articles[:2] + t2_articles[:2]
        print(f"\nTesting image reinsertion for project {project_id}: {project.name}")
        print(f"Selected {len(test_articles)} articles:")
        for article in test_articles:
            has_hero = article.hero_image_url or "None"
            has_content = f"{len(article.content_images) if article.content_images else 0} images"
            existing_imgs = article.content.count("<img")
            print(f"  - {article.tier}: {article.title[:50]}")
            print(f"    Hero URL in DB: {has_hero}")
            print(f"    Content images in DB: {has_content}")
            print(f"    Existing <img> tags in content: {existing_imgs}")
        # Create output directory
        output_dir = Path("test_output")
        output_dir.mkdir(exist_ok=True)
        # Initialize template service
        template_service = TemplateService()
        # Process each article
        for article in test_articles:
            print(f"\nProcessing: {article.title[:50]}...")
            # Step 1: Get current content (after interlink injection)
            html = article.content
            print(f"  Content length: {len(html)} chars")
            # Step 2: Re-insert images (simulating _reinsert_images)
            if article.hero_image_url or article.content_images:
                print(f"  Re-inserting images...")
                # Remove existing images first (to avoid duplicates)
                import re
                existing_count = html.count("<img")
                if existing_count > 0:
                    print(f"    Removing {existing_count} existing image(s)...")
                    html = re.sub(r'<img[^>]*>', '', html)
                # Insert hero image if exists
                if article.hero_image_url:
                    alt_text = generate_alt_text(project)
                    html = insert_hero_after_h1(html, article.hero_image_url, alt_text)
                    print(f"    Hero image inserted: {article.hero_image_url}")
                else:
                    print(f"    No hero image URL in database")
                # Insert content images if exist
                if article.content_images:
                    alt_texts = [generate_alt_text(project) for _ in article.content_images]
                    html = insert_content_images_after_h2s(html, article.content_images, alt_texts)
                    print(f"    {len(article.content_images)} content images inserted")
            else:
                print(f"  No images to insert (hero_image_url and content_images both empty)")
            # Step 3: Apply template
            print(f"  Applying template...")
            try:
                # Get template name from site or use default
                template_name = template_service.select_template_for_content(
                    site_deployment_id=article.site_deployment_id,
                    site_deployment_repo=site_repo
                )
                # Generate meta description
                import re
                from html import unescape
                text = re.sub(r'<[^>]+>', '', html)
                text = unescape(text)
                words = text.split()[:25]
                meta_description = ' '.join(words) + '...'
                # Format content with template
                formatted_html = template_service.format_content(
                    content=html,
                    title=article.title,
                    meta_description=meta_description,
                    template_name=template_name,
                    canonical_url=article.deployed_url
                )
                print(f"    Template '{template_name}' applied")
                # Step 4: Save to file
                # Truncate first, then strip, so the filename never ends in whitespace
                safe_title = "".join(c for c in article.title if c.isalnum() or c in (' ', '-', '_'))[:50].rstrip()
                filename = f"{article.tier}_{article.id}_{safe_title}.html"
                filepath = output_dir / filename
                with open(filepath, 'w', encoding='utf-8') as f:
                    f.write(formatted_html)
                print(f"    Saved to: {filepath}")
                # Check if images are in the HTML
                hero_count = formatted_html.count(article.hero_image_url) if article.hero_image_url else 0
                content_count = sum(formatted_html.count(url) for url in (article.content_images or []))
                print(f"    Image check: Hero={hero_count}, Content={content_count}")
            except Exception as e:
                print(f"    ERROR applying template: {e}")
                import traceback
                traceback.print_exc()
        print(f"\n✓ Test complete! Check files in {output_dir}/")
        print(f"  Open the HTML files in a browser to verify images display correctly.")
    finally:
        session.close()
 if __name__ == "__main__":
    project_id = 30
    if len(sys.argv) > 1:
        try:
            project_id = int(sys.argv[1])
        except ValueError:
            print(f"Invalid project_id: {sys.argv[1]}. Using default: 30")
    test_image_reinsertion(project_id)
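Review note: the strip-then-reinsert step in this script can be illustrated in isolation. The sketch below assumes nothing about `src.generation.image_injection`; `strip_images` and `insert_after_first_h1` are hypothetical stand-ins that use only the standard library, and the regex approach is a simplification that is adequate for well-formed article HTML only.

```python
import re
from html import escape

IMG_TAG = re.compile(r"<img[^>]*>", re.IGNORECASE)
H1_CLOSE = re.compile(r"</h1\s*>", re.IGNORECASE)


def strip_images(html: str) -> str:
    """Remove every <img> tag so re-insertion cannot create duplicates."""
    return IMG_TAG.sub("", html)


def insert_after_first_h1(html: str, url: str, alt: str) -> str:
    """Place a hero <img> right after the first closing </h1>, or at the top."""
    tag = f'<img src="{escape(url, quote=True)}" alt="{escape(alt, quote=True)}">'
    match = H1_CLOSE.search(html)
    if match is None:
        # No heading found: fall back to prepending the image.
        return tag + html
    return html[:match.end()] + tag + html[match.end():]
```

Because existing tags are stripped first, running the two steps repeatedly on the same content yields the same HTML, which is the property the test script relies on when it processes articles that already contain images.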
--- a/src/cli/commands.py
+++ b/src/cli/commands.py
@ -23,6 +23,7 @@ from src.database.repositories import GeneratedContentRepository, SitePageReposi
 from src.deployment.bunny_storage import BunnyStorageClient, BunnyStorageError
 from src.deployment.deployment_service import DeploymentService
 from src.deployment.url_logger import URLLogger
 from src.templating.service import TemplateService
 from dotenv import load_dotenv
 import os
 import requests
@ -433,6 +434,15 @@ def provision_site(name: str, domain: str, storage_name: str, region: str,
                pull_zone_bcdn_hostname=pull_result.hostname
            )
            # Randomly assign template
            template_service = TemplateService()
            available_templates = template_service.get_available_templates()
            if available_templates:
                deployment.template_name = random.choice(available_templates)
                session.commit()
                session.refresh(deployment)
                click.echo(f"  Template assigned: {deployment.template_name}")
            click.echo("\n" + "=" * 70)
            click.echo("Site provisioned successfully!")
            click.echo("=" * 70)
@ -540,6 +550,15 @@ def attach_domain(name: str, domain: str, storage_name: str,
                pull_zone_bcdn_hostname=pull_result.hostname
            )
            # Randomly assign template
            template_service = TemplateService()
            available_templates = template_service.get_available_templates()
            if available_templates:
                deployment.template_name = random.choice(available_templates)
                session.commit()
                session.refresh(deployment)
                click.echo(f"  Template assigned: {deployment.template_name}")
            click.echo("\n" + "=" * 70)
            click.echo("Domain attached successfully!")
            click.echo("=" * 70)
@ -841,11 +860,20 @@ def sync_sites(admin_user: Optional[str], admin_password: Optional[str], dry_run
                                custom_hostname=custom_hostname
                            )
                            # Randomly assign template
                            template_service = TemplateService()
                            available_templates = template_service.get_available_templates()
                            if available_templates:
                                deployment.template_name = random.choice(available_templates)
                                session.commit()
                                session.refresh(deployment)
                            click.echo(f"IMPORTED: {check_hostname}")
                            click.echo(f"  Storage Zone: {storage_zone['Name']} (Region: {storage_zone.get('Region', 'Unknown')})")
                            click.echo(f"  Pull Zone: {pz['Name']} (ID: {pz['Id']})")
                            if custom_hostname:
                                click.echo(f"  Custom Domain: {custom_hostname}")
                            click.echo(f"  Template: {deployment.template_name}")
                            imported += 1
                    except Exception as e:
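Review note: the same "randomly assign template" block now appears in `provision_site`, `attach_domain`, and `sync_sites`. One possible way to factor out the selection step is sketched below; `pick_template` is a hypothetical helper, not something this commit defines, and it leaves the `session.commit()` and `session.refresh()` calls to the caller.

```python
import random
from typing import Optional, Sequence


def pick_template(
    available: Sequence[str],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Return a random template name, or None when no templates exist."""
    if not available:
        return None
    # An injected Random instance makes the choice reproducible in tests.
    chooser = rng if rng is not None else random
    return chooser.choice(list(available))
```

A caller would assign the result to `deployment.template_name` only when it is not None, which matches the `if available_templates:` guard used in the three commands.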
--- a/src/generation/.service.py.swp
+++ b/src/generation/.service.py.swp
--- a/src/generation/batch_processor.py
+++ b/src/generation/batch_processor.py
@ -401,7 +401,8 @@ class BatchProcessor:
            tier_config=tier_config,
            title=title,
            site_deployment_id=site_deployment_id,
-            prefix=prefix
+            prefix=prefix,
            theme_override=job.image_theme_prompt
        )
        # Update article with image URLs
@ -420,7 +421,8 @@ class BatchProcessor:
        title: str,
        content: str,
        site_deployment_id: Optional[int],
-        prefix: str
+        prefix: str,
        theme_override: Optional[str] = None
    ) -> tuple[str, Optional[str], List[str]]:
        """
        Generate images and insert into HTML content
@ -444,7 +446,8 @@ class BatchProcessor:
        image_generator = ImageGenerator(
            ai_client=self.generator.ai_client,
            prompt_manager=self.generator.prompt_manager,
-            project_repo=self.project_repo
+            project_repo=self.project_repo,
            theme_override=theme_override
        )
        storage_client = BunnyStorageClient()
@ -539,7 +542,8 @@ class BatchProcessor:
        tier_config: TierConfig,
        title: str,
        site_deployment_id: Optional[int],
-        prefix: str
+        prefix: str,
        theme_override: Optional[str] = None
    ) -> tuple[Optional[str], List[str]]:
        """
        Generate images and upload to storage, but don't insert into HTML.
@ -559,7 +563,8 @@ class BatchProcessor:
        image_generator = ImageGenerator(
            ai_client=self.generator.ai_client,
            prompt_manager=self.generator.prompt_manager,
-            project_repo=self.project_repo
+            project_repo=self.project_repo,
            theme_override=theme_override
        )
        storage_client = BunnyStorageClient()
@ -896,7 +901,8 @@ class BatchProcessor:
            thread_image_generator = ImageGenerator(
                ai_client=thread_generator.ai_client,
                prompt_manager=thread_generator.prompt_manager,
-                project_repo=thread_project_repo
+                project_repo=thread_project_repo,
                theme_override=job.image_theme_prompt
            )
            hero_url = None
--- a/src/generation/image_generator.py
+++ b/src/generation/image_generator.py
@ -19,13 +19,56 @@ logger = logging.getLogger(__name__)
 def truncate_title(title: str, max_words: int = 4) -> str:
-    """Truncate title to max_words and convert to UPPERCASE"""
+    """Truncate a title to a maximum number of words and convert to uppercase.
    Takes the first max_words from the title, joins them with spaces, and converts
    the result to uppercase. Useful for creating short, prominent text overlays
    on images.
    Args:
        title: The title text to truncate. Can contain any number of words.
        max_words: Maximum number of words to keep from the beginning of the title.
            Defaults to 4.
    Returns:
        A string containing the first max_words of the title in UPPERCASE format.
        If the title has fewer words than max_words, returns the entire title
        in uppercase.
    Example:
        >>> truncate_title("The Quick Brown Fox Jumps Over", 4)
        'THE QUICK BROWN FOX'
        >>> truncate_title("Short Title", 4)
        'SHORT TITLE'
    """
    words = title.split()[:max_words]
    return " ".join(words).upper()
 def slugify(text: str) -> str:
-    """Convert text to URL-friendly slug"""
+    """Convert text to a URL-friendly slug format.
    Transforms text into a lowercase slug suitable for use in URLs or filenames.
    Replaces all non-alphanumeric characters with hyphens and removes leading/trailing
    hyphens. Multiple consecutive non-alphanumeric characters are collapsed into
    a single hyphen.
    Args:
        text: The text string to convert to a slug. Can contain any characters.
    Returns:
        A lowercase string containing only alphanumeric characters and hyphens,
        with no leading or trailing hyphens. Multiple consecutive hyphens are
        collapsed into a single hyphen.
    Example:
        >>> slugify("Hello World! 123")
        'hello-world-123'
        >>> slugify("  Test---String  ")
        'test-string'
        >>> slugify("Special@#$Characters")
        'special-characters'
    """
    text = text.lower()
    text = re.sub(r'[^a-z0-9]+', '-', text)
    text = text.strip('-')
@ -33,17 +76,57 @@ def slugify(text: str) -> str:
 class ImageGenerator:
-    """Generate images using fal.ai API"""
+    """Generate images using fal.ai FLUX.1 schnell API.
    This class handles image generation for projects, including hero images with
    text overlays and content images. It manages theme prompts, coordinates with
    AI services for prompt generation, and uses the fal.ai API for actual image
    creation. Images are generated asynchronously using a thread pool executor
    for concurrent processing.
    The generator maintains project-specific theme prompts that are either
    retrieved from the database or generated on-demand using AI. Hero images
    include text overlays with automatic wrapping and styling, while content
    images focus on specific entities and related search terms.
    """
    def __init__(
        self,
        ai_client: AIClient,
        prompt_manager: PromptManager,
-        project_repo: ProjectRepository
+        project_repo: ProjectRepository,
        theme_override: Optional[str] = None
    ):
        """Initialize the ImageGenerator with required dependencies.
        Sets up the image generator with AI client for prompt generation, prompt
        manager for formatting prompts, and project repository for database access.
        Configures the fal.ai API key from environment variables and creates a
        thread pool executor for concurrent image generation.
        Args:
            ai_client: Client for generating AI completions (used for theme prompts).
            prompt_manager: Manager for formatting and retrieving prompt templates.
            project_repo: Repository for accessing and updating project data.
            theme_override: Optional theme prompt that takes precedence over the
                project's cached or generated theme, for example the
                image_theme_prompt value from a job. Defaults to None.
        Note:
            The fal_client library expects the FAL_KEY environment variable, but this
            implementation uses FAL_API_KEY. The constructor automatically sets
            FAL_KEY from FAL_API_KEY if needed for compatibility. If neither is
            set, a warning is logged and image generation will fail.
        Attributes:
            ai_client: AI client instance for generating completions.
            prompt_manager: Prompt manager for template handling.
            project_repo: Project repository for database operations.
            theme_override: Optional theme prompt override, or None.
            fal_key: API key for fal.ai service (from FAL_API_KEY or FAL_KEY env var).
            max_concurrent: Maximum number of concurrent image generation tasks (default: 5).
            executor: ThreadPoolExecutor for managing concurrent image generation.
        """
        self.ai_client = ai_client
        self.prompt_manager = prompt_manager
        self.project_repo = project_repo
        self.theme_override = theme_override
        # fal_client library expects FAL_KEY, but we use FAL_API_KEY in our env
        # Set both for compatibility
        self.fal_key = os.getenv("FAL_API_KEY") or os.getenv("FAL_KEY")
@ -55,15 +138,44 @@ class ImageGenerator:
        self.executor = ThreadPoolExecutor(max_workers=self.max_concurrent)
    def get_theme_prompt(self, project_id: int) -> str:
-        """Get or generate theme prompt for project"""
+        """Get or generate a theme prompt for a project.
        Returns the theme override passed to the constructor if one was given.
        Otherwise retrieves the cached theme prompt from the project if it exists,
        or generates a new one using AI based on the project's main keyword,
        entities, and related searches. The generated prompt is saved to the
        database for future use, ensuring consistency across image generations
        for the same project.
        Args:
            project_id: The unique identifier of the project to get/generate
                the theme prompt for.
        Returns:
            A string containing the theme prompt that describes the visual style
            and theme for images in this project.
        Raises:
            ValueError: If the project with the given project_id is not found
                in the database.
        Note:
            The theme prompt is generated using the "image_theme_generation" prompt
            template with the project's main keyword, entities, and related searches.
            Once generated, it is persisted to the database and reused for all
            subsequent image generations for this project.
        """
        project = self.project_repo.get_by_id(project_id)
        if not project:
            raise ValueError(f"Project {project_id} not found")
        # Check for override first (from job.json)
        if self.theme_override:
            return self.theme_override
        # Then check cached theme in database
        if project.image_theme_prompt:
            return project.image_theme_prompt
-        # Generate theme prompt using AI
+        # Finally, generate new theme using AI
        entities_str = ", ".join(project.entities or [])
        related_str = ", ".join(project.related_searches or [])
@ -95,7 +207,33 @@ class ImageGenerator:
        width: int,
        height: int
    ) -> bytes:
-        """Overlay text on image using PIL"""
+        """Overlay text on an image with automatic wrapping and styling.
        Takes an image in bytes format and overlays centered text with automatic
        word wrapping, a semi-transparent dark background box for readability,
        and white text with a black outline for contrast. The text is positioned
        in the center of the image and wrapped to fit within 80% of the image width.
        Args:
            image_bytes: Raw image data in bytes format (JPEG, PNG, etc.).
            text: The text string to overlay on the image. Will be automatically
                wrapped to fit within the image boundaries.
            width: The width of the image in pixels. Used for calculating font
                size and text positioning.
            height: The height of the image in pixels. Used for vertical centering
                of the text.
        Returns:
            Image bytes in JPEG format with the text overlay applied. The image
            is converted to RGB mode if necessary and saved with 95% quality.
        Note:
            Font size is calculated as width // 15. If Arial font is not available,
            falls back to the default PIL font. The text is rendered with a
            semi-transparent black background (alpha=180) and white text with
            a black outline for maximum readability across different image backgrounds.
            Line spacing is set to 130% of the line height for comfortable reading.
        """
        img = Image.open(io.BytesIO(image_bytes))
        if img.mode != 'RGBA':
            img = img.convert('RGBA')
@ -183,7 +321,41 @@ class ImageGenerator:
        width: int = 1280,
        height: int = 720
    ) -> Optional[bytes]:
-        """Generate hero image with title text"""
+        """Generate a hero image with title text overlay.
        Creates a hero image using the project's theme prompt via the fal.ai
        FLUX.1 schnell API, then overlays the provided title text on the generated
        image. The image is generated with optimized settings for fast generation
        (4 inference steps) and downloaded from the API response URL.
        The workflow:
        1. Retrieves or generates the project's theme prompt
        2. Calls fal.ai API with the theme prompt to generate the base image
        3. Downloads the generated image from the API response URL
        4. Overlays the title text with automatic wrapping and styling
        5. Returns the final image as JPEG bytes
        Args:
            project_id: The unique identifier of the project. Used to retrieve
                the project's theme prompt for image generation.
            title: The title text to overlay on the hero image. Will be automatically
                wrapped and styled for readability.
            width: Desired width of the generated image in pixels. Defaults to 1280
                (standard HD width).
            height: Desired height of the generated image in pixels. Defaults to 720
                (standard HD height).
        Returns:
            Bytes containing the JPEG image data with title overlay, or None if
            generation fails. Failure can occur due to missing API key, API errors,
            network issues, or malformed API responses.
        Note:
            Uses fal.ai FLUX.1 schnell model with 4 inference steps and guidance
            scale of 3.5 for fast generation. The API response structure is
            handled flexibly to accommodate different response formats. All errors
            are logged with detailed information for debugging.
        """
        if not self.fal_key:
            logger.error("FAL_API_KEY not set")
            return None
@ -254,7 +426,42 @@ class ImageGenerator:
        width: int = 512,
        height: int = 512
    ) -> Optional[bytes]:
-        """Generate content image with entity and related search"""
+        """Generate a content image focused on a specific entity and related search.
        Creates a content image that combines the project's theme prompt with
        specific focus on an entity and related search term. Unlike hero images,
        content images do not include text overlays and are optimized for smaller
        dimensions (default 512x512). The prompt explicitly requests a professional
        illustration style.
        The workflow:
        1. Retrieves or generates the project's theme prompt
        2. Constructs a focused prompt combining theme, entity, and related search
        3. Calls fal.ai API to generate the image
        4. Downloads and returns the image as JPEG bytes
        Args:
            project_id: The unique identifier of the project. Used to retrieve
                the project's theme prompt for consistent styling.
            entity: The main entity or subject to focus on in the image. This
                is incorporated into the generation prompt.
            related_search: A related search term to include in the image context.
                Combined with the entity to create a more specific image.
            width: Desired width of the generated image in pixels. Defaults to 512.
            height: Desired height of the generated image in pixels. Defaults to 512.
        Returns:
            Bytes containing the JPEG image data, or None if generation fails.
            Failure can occur due to missing API key, API errors, network issues,
            or malformed API responses.
        Note:
            The generated prompt format is: "{theme} Focus on {entity} and
            {related_search}, professional illustration style." Uses the same
            API settings as hero images (4 inference steps, guidance scale 3.5)
            but without text overlay processing. All errors are logged with
            detailed information for debugging.
        """
        if not self.fal_key:
            logger.error("FAL_API_KEY not set")
            return None
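Review note: after this change `get_theme_prompt` resolves the theme in three steps: job override, then the value cached on the project, then AI generation. The sketch below isolates that ordering; `resolve_theme_prompt` and its `generate` callback are hypothetical names, and persistence of a newly generated prompt is deliberately left out.

```python
from typing import Callable, Optional


def resolve_theme_prompt(
    override: Optional[str],
    cached: Optional[str],
    generate: Callable[[], str],
) -> str:
    """Pick the theme prompt: override first, then cache, then generation."""
    if override:
        # Supplied per job; never written back to the project.
        return override
    if cached:
        return cached
    # Only reached when neither source has a usable value.
    return generate()
```

Passing the generator as a callable keeps the expensive AI call lazy, so it runs only when both earlier sources are empty.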
--- a/src/generation/job_config.py
+++ b/src/generation/job_config.py
@ -120,6 +120,7 @@ class Job:
    failure_config: Optional[FailureConfig] = None
    interlinking: Optional[InterlinkingConfig] = None
    max_workers: Optional[int] = None
    image_theme_prompt: Optional[str] = None
 class JobConfig:
@ -319,6 +320,15 @@ class JobConfig:
            if not isinstance(max_workers, int) or max_workers < 1:
                raise ValueError("'max_workers' must be a positive integer")
        # Parse image_theme_prompt (optional override)
        image_theme_prompt = job_data.get("image_theme_prompt")
        if image_theme_prompt is not None:
            if not isinstance(image_theme_prompt, str):
                raise ValueError("'image_theme_prompt' must be a string")
            image_theme_prompt = image_theme_prompt.strip()
            if not image_theme_prompt:
                raise ValueError("'image_theme_prompt' cannot be empty")
        return Job(
            project_id=project_id,
            tiers=tiers,
@ -331,7 +341,8 @@ class JobConfig:
            anchor_text_config=anchor_text_config,
            failure_config=failure_config,
            interlinking=interlinking,
-            max_workers=max_workers
+            max_workers=max_workers,
            image_theme_prompt=image_theme_prompt
        )
    def _parse_tier(self, tier_name: str, tier_data: dict) -> TierConfig:
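Review note: the new `image_theme_prompt` parsing accepts a missing key, rejects non-strings, and rejects strings that are empty after stripping. A standalone sketch of the same rules is shown below for quick experimentation; `parse_image_theme_prompt` is a hypothetical function name, since the commit performs these checks inline in the job parser.

```python
from typing import Any, Optional


def parse_image_theme_prompt(job_data: dict[str, Any]) -> Optional[str]:
    """Return the stripped override, or None when the key is absent."""
    value = job_data.get("image_theme_prompt")
    if value is None:
        return None
    if not isinstance(value, str):
        raise ValueError("'image_theme_prompt' must be a string")
    value = value.strip()
    if not value:
        # Whitespace-only input is treated as a configuration error.
        raise ValueError("'image_theme_prompt' cannot be empty")
    return value
```

Note that an explicit `null` in the job file behaves the same as omitting the key, because both produce None from `dict.get`.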
--- a/src/generation/service.py
+++ b/src/generation/service.py
@ -424,11 +424,15 @@ class ContentGenerator:
            True if successful, False otherwise
        """
        try:
            # Refresh to ensure we have latest content (especially after image reinsertion)
            content_record = self.content_repo.get_by_id(content_id)
            if not content_record:
                print(f"Warning: Content {content_id} not found")
                return False
            # Force refresh from database to get latest content
            self.content_repo.session.refresh(content_record)
            if not meta_description:
                text = re.sub(r'<[^>]+>', '', content_record.content)
                text = unescape(text)
@ -452,11 +456,19 @@ class ContentGenerator:
            content_record.template_used = template_name
            self.content_repo.update(content_record)
            # Verify it was saved
            self.content_repo.session.refresh(content_record)
            if content_record.template_used != template_name:
                print(f"ERROR: template_used not saved! Expected '{template_name}', got '{content_record.template_used}'")
                return False
            print(f"Applied template '{template_name}' to content {content_id}")
            return True
        except Exception as e:
            print(f"Error applying template to content {content_id}: {e}")
            import traceback
            traceback.print_exc()
            return False
    def _clean_markdown_fences(self, content: str) -> str:
--- a/src/templating/templates/basic.html
+++ b/src/templating/templates/basic.html
@ -100,11 +100,26 @@
            background-color: #e7f1ff;
            text-decoration: none;
        }
        img {
            max-width: 100%;
            height: auto;
            display: block;
            margin: 1.5rem auto;
            border-radius: 4px;
            box-shadow: 0 2px 8px rgba(0,0,0,0.1);
        }
        h1 + img {
            margin-top: 1rem;
            margin-bottom: 2rem;
        }
        @media (max-width: 768px) {
            nav ul {
                flex-wrap: wrap;
                gap: 1rem;
            }
            img {
                margin: 1rem auto;
            }
        }
    </style>
 </head>
--- a/src/templating/templates/classic.html
+++ b/src/templating/templates/classic.html
@ -106,6 +106,19 @@
            background-color: #f9f6f2;
            color: #5d4a37;
        }
        img {
            max-width: 100%;
            height: auto;
            display: block;
            margin: 2rem auto;
            border-radius: 4px;
            border: 1px solid #e0d7c9;
            box-shadow: 0 2px 6px rgba(0,0,0,0.08);
        }
        h1 + img {
            margin-top: 1.5rem;
            margin-bottom: 2.5rem;
        }
        @media (max-width: 768px) {
            body {
                padding: 10px;
@ -132,6 +145,9 @@
                flex-wrap: wrap;
                gap: 1rem;
            }
            img {
                margin: 1.5rem auto;
            }
        }
    </style>
 </head>
--- a/src/templating/templates/minimal.html
+++ b/src/templating/templates/minimal.html
@ -91,6 +91,16 @@
        nav a:hover {
            border-bottom-color: #000;
        }
        img {
            max-width: 100%;
            height: auto;
            display: block;
            margin: 2rem auto;
        }
        h1 + img {
            margin-top: 1.5rem;
            margin-bottom: 2.5rem;
        }
        @media (max-width: 768px) {
            body {
                padding: 20px 15px;
@ -108,6 +118,9 @@
                flex-wrap: wrap;
                gap: 1rem;
            }
            img {
                margin: 1.5rem auto;
            }
        }
    </style>
 </head>
--- a/src/templating/templates/modern.html
+++ b/src/templating/templates/modern.html
@ -115,6 +115,18 @@
            text-decoration: none;
            transform: translateY(-2px);
        }
        img {
            max-width: 100%;
            height: auto;
            display: block;
            margin: 2rem auto;
            border-radius: 8px;
            box-shadow: 0 4px 12px rgba(0,0,0,0.15);
        }
        h1 + img {
            margin-top: 1.5rem;
            margin-bottom: 2.5rem;
        }
        @media (max-width: 768px) {
            body {
                padding: 20px 10px;
@ -138,6 +150,10 @@
                flex-wrap: wrap;
                gap: 1rem;
            }
            img {
                margin: 1.5rem auto;
                border-radius: 6px;
            }
        }
    </style>
 </head>