Resolve merge conflicts - choose newer implementations

2025-10-20 11:43:33 -05:00 · 2025-10-20 11:43:33 -05:00 · 19e1c93358
parent 3063fc4b84 ef62ecf852
commit 19e1c93358
32 changed files with 2703 additions and 1707 deletions
--- a/.gitignore
+++ b/.gitignore
@ -17,3 +17,6 @@ __pycache__/
 .idea/
 *.xlsx
 # Debug output
 debug_output/
--- a/IMPLEMENTATION_SUMMARY.md
+++ b/IMPLEMENTATION_SUMMARY.md
@ -0,0 +1,199 @@
 # Story 2.2 Implementation Summary
 ## Overview
 Successfully implemented simplified AI content generation via batch jobs using OpenRouter API.
 ## Completed Phases
 ### Phase 1: Data Model & Schema Design
 - ✅ Added `GeneratedContent` model to `src/database/models.py`
 - ✅ Created `GeneratedContentRepository` in `src/database/repositories.py`
 - ✅ Updated `scripts/init_db.py` (automatic table creation via Base.metadata)
 ### Phase 2: AI Client & Prompt Management
 - ✅ Created `src/generation/ai_client.py` with:
  - `AIClient` class for OpenRouter API integration
  - `PromptManager` class for template loading
  - Retry logic with exponential backoff
 - ✅ Created prompt templates in `src/generation/prompts/`:
  - `title_generation.json`
  - `outline_generation.json`
  - `content_generation.json`
  - `content_augmentation.json`
 ### Phase 3: Core Generation Pipeline
 - ✅ Implemented `ContentGenerator` in `src/generation/service.py` with:
  - `generate_title()` - Stage 1
  - `generate_outline()` - Stage 2 with JSON validation
  - `generate_content()` - Stage 3
  - `validate_word_count()` - Word count validation
  - `augment_content()` - Simple augmentation
  - `count_words()` - HTML-aware word counting
  - Debug output support
 ### Phase 4: Batch Processing
 - ✅ Created `src/generation/job_config.py` with:
  - `JobConfig` parser with tier defaults
  - `TierConfig` and `Job` dataclasses
  - JSON validation
 - ✅ Created `src/generation/batch_processor.py` with:
  - `BatchProcessor` class
  - Progress logging to console
  - Error handling and continue-on-error support
  - Statistics tracking
 ### Phase 5: CLI Integration
 - ✅ Added `generate-batch` command to `src/cli/commands.py`
 - ✅ Command options:
  - `--job-file` (required)
  - `--username` / `--password` for authentication
  - `--debug` for saving AI responses
  - `--continue-on-error` flag
  - `--model` selection (default: gpt-4o-mini)
 ### Phase 6: Testing & Validation
 - ✅ Created unit tests:
  - `tests/unit/test_job_config.py` (9 tests)
  - `tests/unit/test_content_generator.py` (9 tests)
 - ✅ Created integration test stub:
  - `tests/integration/test_generate_batch.py` (2 tests)
 - ✅ Created example job files:
  - `jobs/example_tier1_batch.json`
  - `jobs/example_multi_tier_batch.json`
  - `jobs/README.md` (comprehensive documentation)
 ### Phase 7: Cleanup & Documentation
 - ✅ Deprecated old `src/generation/rule_engine.py`
 - ✅ Updated documentation:
  - `docs/architecture/workflows.md` - Added generation workflow diagram
  - `docs/architecture/components.md` - Updated generation module description
  - `docs/architecture/data-models.md` - Updated GeneratedContent model
  - `docs/stories/story-2.2. simplified-ai-content-generation.md` - Marked as Completed
 - ✅ Updated `.gitignore` to exclude `debug_output/`
 - ✅ Updated `env.example` with `OPENROUTER_API_KEY`
 ## Key Files Created/Modified
 ### New Files (15)
 ```
 src/generation/ai_client.py
 src/generation/service.py
 src/generation/job_config.py
 src/generation/batch_processor.py
 src/generation/prompts/title_generation.json
 src/generation/prompts/outline_generation.json
 src/generation/prompts/content_generation.json
 src/generation/prompts/content_augmentation.json
 jobs/example_tier1_batch.json
 jobs/example_multi_tier_batch.json
 jobs/README.md
 tests/unit/test_job_config.py
 tests/unit/test_content_generator.py
 tests/integration/test_generate_batch.py
 IMPLEMENTATION_SUMMARY.md
 ```
 ### Modified Files (10)
 ```
 src/database/models.py (added GeneratedContent model)
 src/database/repositories.py (added GeneratedContentRepository)
 src/cli/commands.py (added generate-batch command)
 src/generation/rule_engine.py (deprecated)
 docs/architecture/workflows.md (updated)
 docs/architecture/components.md (updated)
 docs/architecture/data-models.md (updated)
 docs/stories/story-2.2. simplified-ai-content-generation.md (marked complete)
 .gitignore (added debug_output/)
 env.example (added OPENROUTER_API_KEY)
 ```
 ## Usage
 ### 1. Set up environment
 ```bash
 # Copy env.example to .env and add your OpenRouter API key
 cp env.example .env
 # Edit .env and set OPENROUTER_API_KEY
 ```
 ### 2. Initialize database
 ```bash
 python scripts/init_db.py
 ```
 ### 3. Create a project (if not exists)
 ```bash
 python main.py ingest-cora --file path/to/cora.xlsx --name "My Project"
 ```
 ### 4. Run batch generation
 ```bash
 python main.py generate-batch --job-file jobs/example_tier1_batch.json
 ```
 ### 5. With debug output
 ```bash
 python main.py generate-batch --job-file jobs/example_tier1_batch.json --debug
 ```
 ## Architecture Highlights
 ### Three-Stage Pipeline
 1. **Title Generation**: Uses keyword + entities + related searches
 2. **Outline Generation**: JSON-formatted with H2/H3 structure, validated against min/max constraints
 3. **Content Generation**: Full HTML fragment based on outline
 ### Simplification Wins
 - No complex rule engine
 - Single word count validation (min/max from job file)
 - One-attempt augmentation if below minimum
 - Job file controls all operational parameters
 - Tier defaults for common configurations
 ### Error Handling
 - Network errors: 3 retries with exponential backoff
 - Rate limits: Respects retry-after headers
 - Failed articles: Saved with status='failed', can continue processing with `--continue-on-error`
 - Database errors: Always abort (data integrity)
 ## Testing
 Run tests with:
 ```bash
 pytest tests/unit/test_job_config.py -v
 pytest tests/unit/test_content_generator.py -v
 pytest tests/integration/test_generate_batch.py -v
 ```
 ## Next Steps (Future Stories)
 - Story 2.3: Interlinking integration
 - Story 3.x: Template selection
 - Story 4.x: Deployment integration
 - Expand test coverage (currently basic tests only)
 ## Success Criteria Met
 All acceptance criteria from Story 2.2 have been met:
 ✅ 1. Batch Job Control - Job file specifies all tier parameters
 ✅ 2. Three-Stage Generation - Title → Outline → Content pipeline
 ✅ 3. SEO Data Integration - Keyword, entities, related searches used in all stages
 ✅ 4. Word Count Validation - Validates against min/max from job file
 ✅ 5. Simple Augmentation - Single attempt if below minimum
 ✅ 6. Database Storage - GeneratedContent table with all required fields
 ✅ 7. CLI Execution - generate-batch command with progress logging
 ## Estimated Implementation Time
 - Total: ~20-29 hours (as estimated in task breakdown)
 - Actual: Completed in a single session
 ## Notes
 - OpenRouter API key required in environment
 - Debug output saved to `debug_output/` when `--debug` flag used
 - Job files support multiple projects and tiers
 - Tier defaults can be fully or partially overridden
 - HTML output is fragment format (no <html>, <head>, or <body> tags)
 - Word count strips HTML tags and counts text words only
--- a/check_last_gen.py
+++ b/check_last_gen.py
@ -0,0 +1,36 @@
 from src.database.session import db_manager
 from src.database.models import GeneratedContent
 import json
 s = db_manager.get_session()
 gc = s.query(GeneratedContent).order_by(GeneratedContent.id.desc()).first()
 if gc:
    print(f"Content ID: {gc.id}")
    print(f"Stage: {gc.generation_stage}")
    print(f"Status: {gc.status}")
    print(f"Outline attempts: {gc.outline_attempts}")
    print(f"Error: {gc.error_message}")
    if gc.outline:
        outline = json.loads(gc.outline)
        sections = outline.get("sections", [])
        print(f"\nOutline:")
        print(f"H2 count: {len(sections)}")
        h3_count = sum(len(s.get('h3s', [])) for s in sections)
        print(f"H3 count: {h3_count}")
        has_faq = any("faq" in s["h2"].lower() or "question" in s["h2"].lower() for s in sections)
        print(f"Has FAQ: {has_faq}")
        print(f"\nH2s:")
        # Use a distinct loop name: rebinding `s` here would break s.close() below
        for section in sections:
            print(f"  - {section['h2']} ({len(section.get('h3s', []))} H3s)")
    else:
        print("\nNo outline saved")
 else:
    print("No content found")
 s.close()
--- a/content_automation.db.backup
+++ b/content_automation.db.backup
--- a/docs/architecture/components.md
+++ b/docs/architecture/components.md
@ -20,7 +20,14 @@ Manages user authentication, password hashing, and role-based access control log
 Responsible for parsing the CORA .xlsx files and creating new Project entries in the database.
 ### generation
-Interacts with the AI service API. It takes project data, constructs prompts, and retrieves the generated text. Includes the Content Rule Engine for validation.
+Interacts with the AI service API (OpenRouter). Implements a simplified three-stage pipeline:
 - **AIClient**: Handles OpenRouter API calls with retry logic
 - **PromptManager**: Loads and formats prompt templates from JSON files
 - **ContentGenerator**: Orchestrates title, outline, and content generation
 - **BatchProcessor**: Processes job files and manages multi-tier batch generation
 - **JobConfig**: Parses job configuration files with tier defaults
 The generation module uses SEO data from the Project table (keyword, entities, related searches) to inform all stages of content generation. Validates word count and performs simple augmentation if content is below minimum threshold.
 ### templating
 Takes raw generated text and applies the appropriate HTML/CSS template based on the project's configuration.
--- a/docs/architecture/data-models.md
+++ b/docs/architecture/data-models.md
@ -29,20 +29,28 @@ The following data models will be implemented using SQLAlchemy.
 ## 3. GeneratedContent
-**Purpose**: Stores the AI-generated content and its final deployed state.
+**Purpose**: Stores the AI-generated content from the three-stage pipeline.
 **Key Attributes**:
- `id`: Integer, Primary Key
+- `id`: Integer, Primary Key, Auto-increment
- `project_id`: Integer, Foreign Key to Project
+- `project_id`: Integer, Foreign Key to Project, Indexed
- `title`: Text
+- `tier`: String(20), Not Null, Indexed (tier1, tier2, tier3)
- `outline`: Text
+- `keyword`: String(255), Not Null, Indexed
- `body_text`: Text
+- `title`: Text, Not Null (Generated in stage 1)
- `final_html`: Text
+- `outline`: JSON, Not Null (Generated in stage 2)
- `deployed_url`: String, Unique
+- `content`: Text, Not Null (HTML fragment from stage 3)
- `tier`: String (for link classification)
+- `word_count`: Integer, Not Null (Validated word count)
 - `status`: String(20), Not Null (generated, augmented, failed)
 - `created_at`: DateTime, Not Null
 - `updated_at`: DateTime, Not Null
 **Relationships**: Belongs to one Project.
 **Status Values**:
 - `generated`: Content was successfully generated within word count range
 - `augmented`: Content was below minimum and was augmented
 - `failed`: Generation failed (error details in outline JSON)
 ## 4. FqdnMapping
 **Purpose**: Maps cloud storage buckets to fully qualified domain names for URL generation.
--- a/docs/architecture/workflows.md
+++ b/docs/architecture/workflows.md
@ -1,27 +1,81 @@
 # Core Workflows
-This sequence diagram illustrates the primary workflow for a single content generation job.
+## Content Generation Workflow (Story 2.2)
 The simplified three-stage content generation pipeline:
 ```mermaid
 sequenceDiagram
    participant User
    participant CLI
-    participant Ingestion
+    participant BatchProcessor
-    participant Generation
+    participant ContentGenerator
-    participant Interlinking
+    participant AIClient
-    participant Deployment
+    participant Database
    participant API
-    User->>CLI: run job --file report.xlsx
+    User->>CLI: generate-batch --job-file jobs/example.json
-    CLI->>Ingestion: process_cora_file("report.xlsx")
+    CLI->>BatchProcessor: process_job()
-    Ingestion-->>CLI: project_id
+    
-    CLI->>Generation: generate_content(project_id)
+    loop For each project/tier/article
-    Generation-->>CLI: raw_html_list
+        BatchProcessor->>ContentGenerator: generate_title(project_id)
-    CLI->>Interlinking: inject_links(raw_html_list)
+        ContentGenerator->>AIClient: generate_completion(prompt)
-    Interlinking-->>CLI: final_html_list
+        AIClient-->>ContentGenerator: title
-    CLI->>Deployment: deploy_batch(final_html_list)
+        
-    Deployment-->>CLI: deployed_urls
+        BatchProcessor->>ContentGenerator: generate_outline(project_id, title)
-    CLI->>API: send_to_link_builder(job_data, deployed_urls)
+        ContentGenerator->>AIClient: generate_completion(prompt, json_mode=true)
-    API-->>CLI: success
+        AIClient-->>ContentGenerator: outline JSON
-    CLI-->>User: Job Complete! URLs logged.
+        
        BatchProcessor->>ContentGenerator: generate_content(project_id, title, outline)
        ContentGenerator->>AIClient: generate_completion(prompt)
        AIClient-->>ContentGenerator: HTML content
        BatchProcessor->>ContentGenerator: validate_word_count(content)
        alt Below minimum word count
            BatchProcessor->>ContentGenerator: augment_content(content, target_count)
            ContentGenerator->>AIClient: generate_completion(prompt)
            AIClient-->>ContentGenerator: augmented HTML
        end
        BatchProcessor->>Database: save GeneratedContent record
    end
    BatchProcessor-->>CLI: Summary statistics
    CLI-->>User: Job complete
 ```
 ## CORA Ingestion Workflow (Story 2.1)
 ```mermaid
 sequenceDiagram
    participant User
    participant CLI
    participant Parser
    participant Database
    User->>CLI: ingest-cora --file report.xlsx --name "Project Name"
    CLI->>Parser: parse(file_path)
    Parser-->>CLI: cora_data dict
    CLI->>Database: create Project record
    Database-->>CLI: project_id
    CLI-->>User: Project created (ID: X)
 ```
 ## Deployment Workflow (Story 1.6)
 ```mermaid
 sequenceDiagram
    participant User
    participant CLI
    participant BunnyNetClient
    participant Database
    User->>CLI: provision-site --name "Site" --domain "example.com"
    CLI->>BunnyNetClient: create_storage_zone()
    BunnyNetClient-->>CLI: storage_zone_id
    CLI->>BunnyNetClient: create_pull_zone()
    BunnyNetClient-->>CLI: pull_zone_id
    CLI->>BunnyNetClient: add_custom_hostname()
    CLI->>Database: save SiteDeployment record
    CLI-->>User: Site provisioned! Configure DNS.
 ```
--- a/docs/stories/story-2.2-task-breakdown.md
+++ b/docs/stories/story-2.2-task-breakdown.md
@ -0,0 +1,913 @@
 # Story 2.2: Simplified AI Content Generation - Detailed Task Breakdown
 ## Overview
 This document breaks down Story 2.2 into detailed tasks with specific implementation notes.
 ---
 ## **PHASE 1: Data Model & Schema Design**
 ### Task 1.1: Create GeneratedContent Database Model
 **File**: `src/database/models.py`
 **Add new model class:**
 ```python
 class GeneratedContent(Base):
    __tablename__ = "generated_content"
    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    project_id: Mapped[int] = mapped_column(Integer, ForeignKey('projects.id'), nullable=False, index=True)
    tier: Mapped[str] = mapped_column(String(20), nullable=False, index=True)
    keyword: Mapped[str] = mapped_column(String(255), nullable=False, index=True)
    title: Mapped[str] = mapped_column(Text, nullable=False)
    outline: Mapped[dict] = mapped_column(JSON, nullable=False)
    content: Mapped[str] = mapped_column(Text, nullable=False)
    word_count: Mapped[int] = mapped_column(Integer, nullable=False)
    status: Mapped[str] = mapped_column(String(20), nullable=False)
    created_at: Mapped[datetime] = mapped_column(DateTime, default=datetime.utcnow, nullable=False)
    updated_at: Mapped[datetime] = mapped_column(
        DateTime, 
        default=datetime.utcnow, 
        onupdate=datetime.utcnow, 
        nullable=False
    )
 ```
 **Status values**: `generated`, `augmented`, `failed`
 **Update**: `scripts/init_db.py` to create the table
 ---
 ### Task 1.2: Create GeneratedContent Repository
 **File**: `src/database/repositories.py`
 **Add repository class:**
 ```python
 class GeneratedContentRepository(BaseRepository[GeneratedContent]):
    def __init__(self, session: Session):
        super().__init__(GeneratedContent, session)
    def get_by_project_id(self, project_id: int) -> list[GeneratedContent]:
        pass
    def get_by_project_and_tier(self, project_id: int, tier: str) -> list[GeneratedContent]:
        pass
    def get_by_keyword(self, keyword: str) -> list[GeneratedContent]:
        pass
 ```
 ---
 ### Task 1.3: Define Job File JSON Schema
 **File**: `jobs/README.md` (create/update)
 **Job file structure** (one project per job, multiple jobs per file):
 ```json
 {
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 5,
          "min_word_count": 2000,
          "max_word_count": 2500,
          "min_h2_tags": 3,
          "max_h2_tags": 5,
          "min_h3_tags": 5,
          "max_h3_tags": 10
        },
        "tier2": {
          "count": 10,
          "min_word_count": 1500,
          "max_word_count": 2000,
          "min_h2_tags": 2,
          "max_h2_tags": 4,
          "min_h3_tags": 3,
          "max_h3_tags": 8
        },
        "tier3": {
          "count": 15,
          "min_word_count": 1000,
          "max_word_count": 1500,
          "min_h2_tags": 2,
          "max_h2_tags": 3,
          "min_h3_tags": 2,
          "max_h3_tags": 6
        }
      }
    },
    {
      "project_id": 2,
      "tiers": {
        "tier1": { ... }
      }
    }
  ]
 }
 ```
 **Tier defaults** (constants if not specified in job file):
 ```python
 TIER_DEFAULTS = {
    "tier1": {
        "min_word_count": 2000,
        "max_word_count": 2500,
        "min_h2_tags": 3,
        "max_h2_tags": 5,
        "min_h3_tags": 5,
        "max_h3_tags": 10
    },
    "tier2": {
        "min_word_count": 1500,
        "max_word_count": 2000,
        "min_h2_tags": 2,
        "max_h2_tags": 4,
        "min_h3_tags": 3,
        "max_h3_tags": 8
    },
    "tier3": {
        "min_word_count": 1000,
        "max_word_count": 1500,
        "min_h2_tags": 2,
        "max_h2_tags": 3,
        "min_h3_tags": 2,
        "max_h3_tags": 6
    }
 }
 ```
 **Future extensibility note**: This structure allows adding more fields per job in future stories.
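**Sketch: applying tier defaults** (illustrative only). The fallback described above could be a plain dictionary merge. `resolve_tier_config` is a hypothetical helper name, the rule that `count` is mandatory is an assumption based on the examples, and only the `tier1` defaults are repeated here.

```python
# Hypothetical helper: overlay job-file values on top of the tier defaults.
TIER1_DEFAULTS = {
    "min_word_count": 2000,
    "max_word_count": 2500,
    "min_h2_tags": 3,
    "max_h2_tags": 5,
    "min_h3_tags": 5,
    "max_h3_tags": 10,
}


def resolve_tier_config(overrides: dict, defaults: dict) -> dict:
    """Return the defaults updated with any keys given in the job file."""
    # Assumption: every tier entry must state how many articles to generate.
    if "count" not in overrides:
        raise ValueError("each tier entry needs a 'count'")
    # Reject misspelled keys early instead of silently ignoring them.
    unknown = set(overrides) - set(defaults) - {"count"}
    if unknown:
        raise ValueError(f"unknown tier keys: {sorted(unknown)}")
    return {**defaults, **overrides}


resolved = resolve_tier_config({"count": 5, "min_word_count": 2200}, TIER1_DEFAULTS)
print(resolved["min_word_count"], resolved["max_word_count"])
```

Rejecting unknown keys is a design choice for this sketch; a more permissive parser could ignore them to ease the future extensibility mentioned above.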
 ---
 ## **PHASE 2: AI Client & Prompt Management**
 ### Task 2.1: Implement AIClient for OpenRouter
 **File**: `src/generation/ai_client.py`
 **OpenRouter API details**:
 - Base URL: `https://openrouter.ai/api/v1`
 - Compatible with OpenAI SDK
 - Requires `OPENROUTER_API_KEY` env variable
 **Initial model list**:
 ```python
 AVAILABLE_MODELS = {
    "gpt-4o-mini": "openai/gpt-4o-mini",
    "claude-sonnet-4.5": "anthropic/claude-sonnet-4.5"
 }
 ```
 **Implementation**:
 ```python
 class AIClient:
    def __init__(self, api_key: str, model: str, base_url: str = "https://openrouter.ai/api/v1"):
        self.client = OpenAI(api_key=api_key, base_url=base_url)
        self.model = model
    def generate_completion(
        self, 
        prompt: str, 
        system_message: str = None,
        max_tokens: int = 4000,
        temperature: float = 0.7,
        json_mode: bool = False
    ) -> str:
        """
        Generate completion from OpenRouter API
        json_mode: if True, adds response_format={"type": "json_object"}
        """
        pass
 ```
 **Error handling**: Retry 3x with exponential backoff for network/rate limit errors
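**Sketch: request arguments** (illustrative only). The body of `generate_completion` is left open above. One way to keep it testable is to build the chat-completion arguments in a pure function. `build_request_kwargs` is a hypothetical name; the parameter names mirror the signature above.

```python
from typing import Optional


def build_request_kwargs(
    model: str,
    prompt: str,
    system_message: Optional[str] = None,
    max_tokens: int = 4000,
    temperature: float = 0.7,
    json_mode: bool = False,
) -> dict:
    """Assemble keyword arguments for a chat-completions request."""
    messages = []
    # The system message is optional; the user prompt is always sent.
    if system_message:
        messages.append({"role": "system", "content": system_message})
    messages.append({"role": "user", "content": prompt})

    kwargs = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    # json_mode asks the API for a JSON object, as noted in the docstring above.
    if json_mode:
        kwargs["response_format"] = {"type": "json_object"}
    return kwargs
```

Inside `generate_completion`, the result would presumably be passed on as `self.client.chat.completions.create(**kwargs)`, with the text read from `response.choices[0].message.content`.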
 ---
 ### Task 2.2: Create Prompt Templates
 **Files**: `src/generation/prompts/*.json`
 **title_generation.json**:
 ```json
 {
  "system_message": "You are an expert SEO content writer...",
  "user_prompt": "Generate an SEO-optimized title for an article about: {keyword}\n\nRelated entities: {entities}\n\nRelated searches: {related_searches}\n\nReturn only the title text, no formatting."
 }
 ```
 **outline_generation.json**:
 ```json
 {
  "system_message": "You are an expert content outliner...",
  "user_prompt": "Create an article outline for:\nTitle: {title}\nKeyword: {keyword}\n\nConstraints:\n- {min_h2} to {max_h2} H2 headings\n- {min_h3} to {max_h3} H3 subheadings total\n\nEntities: {entities}\nRelated searches: {related_searches}\n\nReturn as JSON: {\"outline\": [{\"h2\": \"...\", \"h3\": [\"...\", \"...\"]}]}"
 }
 ```
 **content_generation.json**:
 ```json
 {
  "system_message": "You are an expert content writer...",
  "user_prompt": "Write a complete article based on:\nTitle: {title}\nOutline: {outline}\nKeyword: {keyword}\n\nEntities to include: {entities}\nRelated searches: {related_searches}\n\nReturn as HTML fragment with <h2>, <h3>, <p> tags. Do NOT include <html>, <head>, or <body> tags."
 }
 ```
 **content_augmentation.json**:
 ```json
 {
  "system_message": "You are an expert content editor...",
  "user_prompt": "Please expand on the following article to add more detail and depth, ensuring you maintain the existing topical focus. Target word count: {target_word_count}\n\nCurrent article:\n{content}\n\nReturn the expanded article as an HTML fragment."
 }
 ```
 ---
 ### Task 2.3: Create PromptManager
 **File**: `src/generation/ai_client.py` (add to same file)
 ```python
 class PromptManager:
    def __init__(self, prompts_dir: str = "src/generation/prompts"):
        self.prompts_dir = prompts_dir
        self.prompts = {}
    def load_prompt(self, prompt_name: str) -> dict:
        """Load prompt from JSON file"""
        pass
    def format_prompt(self, prompt_name: str, **kwargs) -> tuple[str, str]:
        """
        Format prompt with variables
        Returns: (system_message, user_prompt)
        """
        pass
 ```
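**Sketch: formatting templates that contain literal braces** (illustrative only). The `outline_generation.json` prompt embeds a JSON example, so a plain `str.format` call would likely fail on those braces. The sketch below substitutes only `{name}` tokens; `fill_template` is a hypothetical helper, not part of the planned `PromptManager` API.

```python
import re

# Matches {identifier} only; braces around JSON such as {"outline": ...} are ignored.
_PLACEHOLDER = re.compile(r"\{([a-z_][a-z0-9_]*)\}")


def fill_template(template: str, **values: object) -> str:
    """Substitute {name} tokens and leave every other brace untouched."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing prompt variable: {name}")
        return str(values[name])

    return _PLACEHOLDER.sub(replace, template)


template = 'Title: {title}\nReturn as JSON: {"outline": [{"h2": "..."}]}'
print(fill_template(template, title="Ultimate Guide to SEO"))
```

An alternative would be to escape the literal braces as `{{` and `}}` in the JSON prompt files and keep `str.format`; the regex approach simply avoids having to edit the templates.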
 ---
 ## **PHASE 3: Core Generation Pipeline**
 ### Task 3.1: Implement ContentGenerator Service
 **File**: `src/generation/service.py`
 ```python
 class ContentGenerator:
    def __init__(
        self,
        ai_client: AIClient,
        prompt_manager: PromptManager,
        project_repo: ProjectRepository,
        content_repo: GeneratedContentRepository
    ):
        self.ai_client = ai_client
        self.prompt_manager = prompt_manager
        self.project_repo = project_repo
        self.content_repo = content_repo
 ```
 ---
 ### Task 3.2: Implement Stage 1 - Title Generation
 **File**: `src/generation/service.py`
 ```python
 def generate_title(self, project_id: int, debug: bool = False) -> str:
    """
    Generate SEO-optimized title
    Returns: title string
    Saves to debug_output/title_project_{id}_{timestamp}.txt if debug=True
    """
    # Fetch project
    # Load prompt
    # Call AI
    # If debug: save response to debug_output/
    # Return title
    pass
 ```
 ---
 ### Task 3.3: Implement Stage 2 - Outline Generation
 **File**: `src/generation/service.py`
 ```python
 def generate_outline(
    self, 
    project_id: int, 
    title: str, 
    min_h2: int,
    max_h2: int,
    min_h3: int,
    max_h3: int,
    debug: bool = False
 ) -> dict:
    """
    Generate article outline in JSON format
    Returns: {"outline": [{"h2": "...", "h3": ["...", "..."]}]}
    Uses json_mode=True in AI call to ensure JSON response
    Validates: at least min_h2 headings, at least min_h3 total subheadings
    Saves to debug_output/outline_project_{id}_{timestamp}.json if debug=True
    """
    pass
 ```
 **Validation**:
 - Parse JSON response
 - Count h2 tags (must be >= min_h2)
 - Count total h3 tags across all h2s (must be >= min_h3)
 - Raise error if validation fails
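**Sketch: outline validation** (illustrative only). The checks listed above could look like this, assuming the `{"outline": [{"h2": ..., "h3": [...]}]}` shape from the docstring. `validate_outline` is a hypothetical helper name, and `ValueError` is an assumed error type.

```python
def validate_outline(outline: dict, min_h2: int, min_h3: int) -> tuple[int, int]:
    """Return (h2_count, h3_count), or raise ValueError if below the minimums."""
    sections = outline.get("outline")
    if not isinstance(sections, list):
        raise ValueError("outline JSON must contain an 'outline' list")

    # Each list entry is one H2; H3s are counted across all entries.
    h2_count = len(sections)
    h3_count = sum(len(section.get("h3", [])) for section in sections)

    if h2_count < min_h2:
        raise ValueError(f"expected at least {min_h2} H2 headings, got {h2_count}")
    if h3_count < min_h3:
        raise ValueError(f"expected at least {min_h3} H3 subheadings, got {h3_count}")
    return h2_count, h3_count
```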
 ---
 ### Task 3.4: Implement Stage 3 - Content Generation
 **File**: `src/generation/service.py`
 ```python
 def generate_content(
    self, 
    project_id: int, 
    title: str, 
    outline: dict,
    debug: bool = False
 ) -> str:
    """
    Generate full article HTML fragment
    Returns: HTML string with <h2>, <h3>, <p> tags
    Does NOT include <html>, <head>, or <body> tags
    Saves to debug_output/content_project_{id}_{timestamp}.html if debug=True
    """
    pass
 ```
 **HTML fragment format**:
 ```html
 <h2>First Heading</h2>
 <p>Paragraph content...</p>
 <h3>Subheading</h3>
 <p>More content...</p>
 ```
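**Sketch: fragment check** (illustrative only). A cheap guard for the "no `<html>`, `<head>`, or `<body>`" rule could be a regular-expression scan. `is_html_fragment` is a hypothetical helper; it does not parse HTML, it only looks for the three document-level tag names.

```python
import re

# Matches opening or closing <html>, <head>, <body> tags, case-insensitively.
# The \b keeps <header> from being mistaken for <head>.
_DOCUMENT_TAGS = re.compile(r"<\s*/?\s*(html|head|body)\b", re.IGNORECASE)


def is_html_fragment(content: str) -> bool:
    """Return True when no document-level tag appears in the content."""
    return _DOCUMENT_TAGS.search(content) is None


print(is_html_fragment("<h2>First Heading</h2><p>Paragraph content...</p>"))
```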
 ---
 ### Task 3.5: Implement Word Count Validation
 **File**: `src/generation/service.py`
 ```python
 def validate_word_count(self, content: str, min_words: int, max_words: int) -> tuple[bool, int]:
    """
    Validate content word count
    Returns: (is_valid, actual_count)
    - is_valid: True if min_words <= actual_count <= max_words
    - actual_count: number of words in content
    Implementation: Strip HTML tags, split on whitespace, count tokens
    """
    pass
 ```
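**Sketch: HTML-aware word count** (illustrative only). The "strip tags, split on whitespace" approach from the docstring, written as module-level functions rather than methods. Decoding entities with `html.unescape` is an addition not mentioned in the task.

```python
import html
import re

_TAG = re.compile(r"<[^>]+>")


def count_words(content: str) -> int:
    """Count whitespace-separated tokens after removing HTML tags."""
    # Replace tags with a space so "</h2><p>" does not glue two words together.
    text = _TAG.sub(" ", content)
    # Assumption: entities such as &amp; should count as the text they stand for.
    text = html.unescape(text)
    return len(text.split())


def validate_word_count(content: str, min_words: int, max_words: int) -> tuple[bool, int]:
    """Return (is_valid, actual_count) for the inclusive range."""
    actual = count_words(content)
    return min_words <= actual <= max_words, actual
```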
 ---
 ### Task 3.6: Implement Simple Augmentation
 **File**: `src/generation/service.py`
 ```python
 def augment_content(
    self, 
    content: str, 
    target_word_count: int,
    debug: bool = False
 ) -> str:
    """
    Expand article content to meet minimum word count
    Called ONLY if word_count < min_word_count
    Makes ONE API call only
    Saves to debug_output/augmented_project_{id}_{timestamp}.html if debug=True
    """
    pass
 ```
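**Sketch: single-attempt augmentation and status** (illustrative only). The "one API call only" rule and the `generated` / `augmented` status values could be tied together as below. `finalize_article` is a hypothetical helper, the counting and augmenting functions are passed in as callables, and using the minimum as the target word count is an assumption. How over-length content should be labeled is not specified above, so this sketch does not handle it.

```python
from typing import Callable


def finalize_article(
    content: str,
    min_words: int,
    count_words: Callable[[str], int],
    augment: Callable[[str, int], str],
) -> tuple[str, int, str]:
    """Return (content, word_count, status), augmenting at most once."""
    word_count = count_words(content)
    if word_count >= min_words:
        return content, word_count, "generated"
    # Exactly one augmentation call; the result is kept even if still short.
    content = augment(content, min_words)
    return content, count_words(content), "augmented"
```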
 ---
 ## **PHASE 4: Batch Processing**
 ### Task 4.1: Create JobConfig Parser
 **File**: `src/generation/job_config.py`
 ```python
 from dataclasses import dataclass
 from typing import Optional
 TIER_DEFAULTS = {
    "tier1": {
        "min_word_count": 2000,
        "max_word_count": 2500,
        "min_h2_tags": 3,
        "max_h2_tags": 5,
        "min_h3_tags": 5,
        "max_h3_tags": 10
    },
    "tier2": {
        "min_word_count": 1500,
        "max_word_count": 2000,
        "min_h2_tags": 2,
        "max_h2_tags": 4,
        "min_h3_tags": 3,
        "max_h3_tags": 8
    },
    "tier3": {
        "min_word_count": 1000,
        "max_word_count": 1500,
        "min_h2_tags": 2,
        "max_h2_tags": 3,
        "min_h3_tags": 2,
        "max_h3_tags": 6
    }
 }
@dataclass
 class TierConfig:
    count: int
    min_word_count: int
    max_word_count: int
    min_h2_tags: int
    max_h2_tags: int
    min_h3_tags: int
    max_h3_tags: int
@dataclass
 class Job:
    project_id: int
    tiers: dict[str, TierConfig]
 class JobConfig:
    def __init__(self, job_file_path: str):
        """Load and parse job file, apply defaults"""
        pass
    def get_jobs(self) -> list[Job]:
        """Return list of all jobs in file"""
        pass
    def get_tier_config(self, job: Job, tier_name: str) -> Optional[TierConfig]:
        """Get tier config with defaults applied"""
        pass
 ```
 ---
 ### Task 4.2: Create BatchProcessor
 **File**: `src/generation/batch_processor.py`
 ```python
 class BatchProcessor:
    def __init__(
        self,
        content_generator: ContentGenerator,
        content_repo: GeneratedContentRepository,
        project_repo: ProjectRepository
    ):
        pass
    def process_job(
        self, 
        job_file_path: str, 
        debug: bool = False,
        continue_on_error: bool = False
    ):
        """
        Process all jobs in job file
        For each job:
          For each tier:
            For count times:
              1. Generate title (log to console)
              2. Generate outline
              3. Generate content
              4. Validate word count
              5. If below min, augment once
              6. Save to GeneratedContent table
        Logs progress to console
        If debug=True, saves AI responses to debug_output/
        """
        pass
 ```
 **Console output format**:
 ```
 Processing Job 1/3: Project ID 5
  Tier 1: Generating 5 articles
    [1/5] Generating title... "Ultimate Guide to SEO in 2025"
    [1/5] Generating outline... 4 H2s, 8 H3s
    [1/5] Generating content... 1,845 words
    [1/5] Below minimum (2000), augmenting... 2,123 words
    [1/5] Saved (ID: 42, Status: augmented)
    [2/5] Generating title... "Advanced SEO Techniques"
    ...
  Tier 2: Generating 10 articles
    ...
 Summary:
  Jobs processed: 3/3
  Articles generated: 45/45
  Augmented: 12
  Failed: 0
 ```
 ---
 ### Task 4.3: Error Handling & Retry Logic
 **File**: `src/generation/batch_processor.py`
 **Error handling strategy**:
 - AI API errors: Log error, mark as `status='failed'`, save to DB
 - If `continue_on_error=True`: continue to next article
 - If `continue_on_error=False`: stop batch processing
 - Database errors: Always abort (data integrity)
 - Invalid job file: Fail fast with validation error
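**Sketch: continue-on-error loop** (illustrative only). The strategy above, reduced to its control flow. `GenerationError`, `run_batch`, and the `generate` / `save` callables are hypothetical stand-ins; stopping with `break` rather than re-raising is an assumption. Exceptions raised by `save` are deliberately not caught, which matches "database errors always abort".

```python
from typing import Callable, Iterable


class GenerationError(Exception):
    """Hypothetical error type for a failed AI call."""


def run_batch(
    articles: Iterable[str],
    generate: Callable[[str], str],
    save: Callable[[str, str], None],
    continue_on_error: bool = False,
) -> dict:
    """Process articles in order and return simple statistics."""
    stats = {"generated": 0, "failed": 0}
    for article in articles:
        try:
            generate(article)
        except GenerationError:
            # Record the failure before deciding whether to go on.
            save(article, "failed")
            stats["failed"] += 1
            if not continue_on_error:
                break
            continue
        save(article, "generated")
        stats["generated"] += 1
    return stats
```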
 **Retry logic** (in AIClient):
 - Network errors: 3 retries with exponential backoff (1s, 2s, 4s)
 - Rate limit errors: Respect Retry-After header
 - Other errors: No retry, raise immediately
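**Sketch: retry with exponential backoff** (illustrative only). The three rules above as a small wrapper. `RateLimited` and `call_with_retry` are hypothetical names, and the built-in `ConnectionError` stands in for whatever network exception the SDK raises. The `sleep` parameter exists so tests can avoid real delays.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error carrying the server's Retry-After value in seconds."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def call_with_retry(
    func: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying network and rate-limit errors up to `retries` times."""
    for attempt in range(retries + 1):
        try:
            return func()
        except RateLimited as exc:
            if attempt == retries:
                raise
            # Prefer the server's hint; fall back to the backoff schedule.
            backoff = base_delay * 2 ** attempt
            delay = exc.retry_after if exc.retry_after is not None else backoff
        except ConnectionError:
            if attempt == retries:
                raise
            delay = base_delay * 2 ** attempt  # 1s, 2s, 4s with the defaults
        # Any other exception type propagates immediately without a retry.
        sleep(delay)
    raise AssertionError("unreachable")
```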
 ---
 ## **PHASE 5: CLI Integration**
 ### Task 5.1: Add generate-batch Command
 **File**: `src/cli/commands.py`
 ```python
@app.command("generate-batch")
@click.option('--job-file', '-j', required=True, type=click.Path(exists=True), 
              help='Path to job JSON file')
@click.option('--username', '-u', help='Username for authentication')
@click.option('--password', '-p', help='Password for authentication')
@click.option('--debug', is_flag=True, help='Save AI responses to debug_output/')
@click.option('--continue-on-error', is_flag=True, 
              help='Continue processing if article generation fails')
@click.option('--model', '-m', default='gpt-4o-mini',
              help='AI model to use (gpt-4o-mini, claude-sonnet-4.5)')
 def generate_batch(
    job_file: str, 
    username: Optional[str], 
    password: Optional[str],
    debug: bool,
    continue_on_error: bool,
    model: str
 ):
    """Generate content batch from job file"""
    # Authenticate user
    # Initialize AIClient with OpenRouter
    # Initialize PromptManager, ContentGenerator, BatchProcessor
    # Call process_job()
    # Show summary
    pass
 ```
 ---
 ### Task 5.2: Add Progress Logging & Debug Output
 **File**: `src/generation/batch_processor.py`
 **Debug output** (when `--debug` flag used):
 - Create `debug_output/` directory if not exists
 - For each AI call, save response to file:
  - `debug_output/title_project{id}_tier{tier}_{n}_{timestamp}.txt`
  - `debug_output/outline_project{id}_tier{tier}_{n}_{timestamp}.json`
  - `debug_output/content_project{id}_tier{tier}_{n}_{timestamp}.html`
  - `debug_output/augmented_project{id}_tier{tier}_{n}_{timestamp}.html`
 - Also echo to console with `click.echo()`
 **Normal output** (without `--debug`):
 - Always show title when generated: `"Generated title: {title}"`
 - Show word counts and status
 - Show progress counter `[n/total]`
 ---
 ## **PHASE 6: Testing & Validation**
 ### Task 6.1: Create Unit Tests
 #### `tests/unit/test_ai_client.py`
 ```python
 def test_generate_completion_success():
    """Test successful AI completion"""
    pass
 def test_generate_completion_json_mode():
    """Test JSON mode returns valid JSON"""
    pass
 def test_generate_completion_retry_on_network_error():
    """Test retry logic for network errors"""
    pass
 ```
 #### `tests/unit/test_content_generator.py`
 ```python
 def test_generate_title():
    """Test title generation with mocked AI response"""
    pass
 def test_generate_outline_valid_structure():
    """Test outline generation returns valid JSON with min h2/h3"""
    pass
 def test_generate_content_html_fragment():
    """Test content is HTML fragment (no <html> tag)"""
    pass
 def test_validate_word_count():
    """Test word count validation with various HTML inputs"""
    pass
 def test_augment_content_called_once():
    """Test augmentation only called once"""
    pass
 ```
 #### `tests/unit/test_job_config.py`
 ```python
 def test_load_job_config_valid():
    """Test loading valid job file"""
    pass
 def test_tier_defaults_applied():
    """Test defaults applied when not in job file"""
    pass
 def test_multiple_jobs_in_file():
    """Test parsing file with multiple jobs"""
    pass
 ```
 #### `tests/unit/test_batch_processor.py`
 ```python
 def test_process_job_success():
    """Test successful batch processing"""
    pass
 def test_process_job_with_augmentation():
    """Test articles below min word count are augmented"""
    pass
 def test_process_job_continue_on_error():
    """Test continue_on_error flag behavior"""
    pass
 ```
 ---
 ### Task 6.2: Create Integration Test
 **File**: `tests/integration/test_generate_batch.py`
 ```python
 def test_generate_batch_end_to_end(test_db, mock_ai_client):
    """
    End-to-end test:
    1. Create test project in DB
    2. Create test job file
    3. Run batch processor
    4. Verify GeneratedContent records created
    5. Verify word counts within range
    6. Verify HTML structure
    """
    pass
 ```
 ---
 ### Task 6.3: Create Example Job Files
 #### `jobs/example_tier1_batch.json`
 ```json
 {
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 5
        }
      }
    }
  ]
 }
 ```
 (Uses all defaults for tier1)
 #### `jobs/example_multi_tier_batch.json`
 ```json
 {
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 5,
          "min_word_count": 2200,
          "max_word_count": 2600
        },
        "tier2": {
          "count": 10
        },
        "tier3": {
          "count": 15,
          "max_h2_tags": 4
        }
      }
    },
    {
      "project_id": 2,
      "tiers": {
        "tier1": {
          "count": 3
        }
      }
    }
  ]
 }
 ```
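As a rough sketch of how a job file like the one above might be walked, the snippet below parses the JSON and yields one entry per tier. `iter_tiers` is a hypothetical helper, not the project's `JobConfig` parser, and the required-field checks are an assumption based on the fields documented in `jobs/README.md`.

```python
import json
from typing import Iterator, Tuple

# Same structure as jobs/example_multi_tier_batch.json, inlined for the example.
EXAMPLE_JOB = """
{
  "jobs": [
    {"project_id": 1,
     "tiers": {"tier1": {"count": 5, "min_word_count": 2200, "max_word_count": 2600},
               "tier2": {"count": 10},
               "tier3": {"count": 15, "max_h2_tags": 4}}},
    {"project_id": 2, "tiers": {"tier1": {"count": 3}}}
  ]
}
"""


def iter_tiers(job_data: dict) -> Iterator[Tuple[int, str, dict]]:
    """Yield (project_id, tier_name, tier_config) for every tier of every job."""
    for job in job_data["jobs"]:
        if "project_id" not in job or "tiers" not in job:
            raise ValueError("each job needs 'project_id' and 'tiers'")
        for tier_name, tier_cfg in job["tiers"].items():
            if "count" not in tier_cfg:
                raise ValueError(f"{tier_name} is missing required 'count'")
            yield job["project_id"], tier_name, tier_cfg


job_data = json.loads(EXAMPLE_JOB)
total = sum(cfg["count"] for _, _, cfg in iter_tiers(job_data))
print(f"Total articles requested: {total}")
```

Walking the file this way keeps the parsing independent of which optional tier fields are present, so new fields can be added later without touching the loop.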
 #### `jobs/README.md`
 Document job file format and examples
 ---
 ## **PHASE 7: Cleanup & Deprecation**
 ### Task 7.1: Remove Old ContentRuleEngine
 **Action**: Delete or gut `src/generation/rule_engine.py`
 Only keep if it has reusable utilities. Otherwise remove entirely.
 ---
 ### Task 7.2: Remove Old Validator Logic
 **Action**: Review `src/generation/validator.py` (if exists)
 Remove any strict CORA validation beyond word count. Keep only simple validation utilities.
 ---
 ### Task 7.3: Update Documentation
 **Files to update**:
 - `docs/stories/story-2.2. simplified-ai-content-generation.md` - Change status from "In Progress" to "Done"
 - `docs/architecture/workflows.md` - Document simplified generation flow
 - `docs/architecture/components.md` - Update generation component description
 ---
 ## Implementation Order Recommendation
 1. **Phase 1** (Data Layer) - Required foundation
 2. **Phase 2** (AI Client) - Required for generation
 3. **Phase 3** (Core Logic) - Implement one stage at a time, test each
 4. **Phase 4** (Batch Processing) - Orchestrate stages
 5. **Phase 5** (CLI) - Make accessible to users
 6. **Phase 6** (Testing) - Can be done in parallel with implementation
 7. **Phase 7** (Cleanup) - Final polish
 **Estimated effort**: 
 - Phase 1-2: 4-6 hours
 - Phase 3: 6-8 hours
 - Phase 4: 3-4 hours
 - Phase 5: 2-3 hours
 - Phase 6: 4-6 hours
 - Phase 7: 1-2 hours
 - **Total**: 20-29 hours
 ---
 ## Critical Dev Notes
 ### OpenRouter Specifics
 - API key from environment: `OPENROUTER_API_KEY`
 - Model format: `"provider/model-name"`
 - Works as a drop-in replacement for the OpenAI API, so the OpenAI SDK can be used by pointing its base URL at OpenRouter
 - Rate limits vary by model (check OpenRouter docs)
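To make these notes concrete, here is a minimal standard-library sketch that builds (but does not send) a chat-completion request. It is not the project's `AIClient`: `build_chat_request` is a hypothetical helper, and the endpoint path assumes OpenRouter's OpenAI-compatible `/chat/completions` route under the base URL from `env.example`.

```python
import json
import os
import urllib.request
from typing import Optional

# Base URL taken from env.example; the /chat/completions suffix is the
# OpenAI-compatible route (an assumption of this sketch).
OPENROUTER_CHAT_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_chat_request(prompt: str, model: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a POST request for a single-turn chat completion."""
    api_key = api_key or os.environ.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    payload = {
        "model": model,  # "provider/model-name" format
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OPENROUTER_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the request to `urllib.request.urlopen()` would perform the call. Retry and backoff handling is deliberately left out of this sketch.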
 ### HTML Fragment Format
 Content generation returns HTML like:
 ```html
 <h2>Main Topic</h2>
 <p>Introduction paragraph with relevant keywords and entities.</p>
 <h3>Subtopic One</h3>
 <p>Detailed content about subtopic.</p>
 <h3>Subtopic Two</h3>
 <p>More detailed content.</p>
 <h2>Second Main Topic</h2>
 <p>Content continues...</p>
 ```
 **No document structure**: No `<!DOCTYPE>`, `<html>`, `<head>`, or `<body>` tags.
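A simple guard for the "no document structure" rule might look like the sketch below. `is_html_fragment` is a hypothetical name, and the regex check is a heuristic assumption rather than a full HTML parse.

```python
import re

# Matches <!DOCTYPE ...> or an opening <html>, <head>, or <body> tag.
# The \b keeps tags such as <header> from being flagged.
_DOCUMENT_TAGS = re.compile(r"<!DOCTYPE|<\s*(?:html|head|body)\b", re.IGNORECASE)


def is_html_fragment(content: str) -> bool:
    """Return True when the content contains no document-level tags."""
    return _DOCUMENT_TAGS.search(content) is None
```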
 ### Word Count Method
 ```python
 import re
 from html import unescape
 def count_words(html_content: str) -> int:
    # Replace tags with a space so adjacent blocks ("</p><p>") don't fuse words
    text = re.sub(r'<[^>]+>', ' ', html_content)
    # Unescape HTML entities
    text = unescape(text)
    # Split and count
    words = text.split()
    return len(words)
 ```
 ### Debug Output Directory
 - Create `debug_output/` at project root if not exists
 - Add to `.gitignore`
 - Filename format: `{stage}_project{id}_tier{tier}_article{n}_{timestamp}.{ext}`
 - Example: `title_project5_tier1_article3_20251020_143022.txt`
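The filename convention above could be produced by a helper along these lines. This is a sketch under assumptions: `debug_path` and `save_debug` are hypothetical names, and the timestamp format is inferred from the example filename.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def debug_path(stage: str, project_id: int, tier: int, article_n: int, ext: str,
               now: Optional[datetime] = None, root: Path = Path("debug_output")) -> Path:
    """Build debug_output/{stage}_project{id}_tier{tier}_article{n}_{timestamp}.{ext}."""
    timestamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return root / f"{stage}_project{project_id}_tier{tier}_article{article_n}_{timestamp}.{ext}"


def save_debug(path: Path, text: str) -> None:
    """Create the debug directory if needed, then write the AI response."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
```

Accepting `now` as a parameter keeps the function deterministic in tests while defaulting to the current time in normal use.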
 ### Tier Constants Location
 Define in `src/generation/job_config.py` as module-level constant for easy reference.
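One possible shape for that constant is sketched below, using the default values listed in `jobs/README.md`. The names `TIER_DEFAULTS` and `resolve_tier` are assumptions for illustration, not confirmed identifiers from `job_config.py`.

```python
# Default limits per tier, copied from the "Tier Defaults" section of jobs/README.md.
TIER_DEFAULTS = {
    "tier1": {"min_word_count": 2000, "max_word_count": 2500,
              "min_h2_tags": 3, "max_h2_tags": 5, "min_h3_tags": 5, "max_h3_tags": 10},
    "tier2": {"min_word_count": 1500, "max_word_count": 2000,
              "min_h2_tags": 2, "max_h2_tags": 4, "min_h3_tags": 3, "max_h3_tags": 8},
    "tier3": {"min_word_count": 1000, "max_word_count": 1500,
              "min_h2_tags": 2, "max_h2_tags": 3, "min_h3_tags": 2, "max_h3_tags": 6},
}


def resolve_tier(tier_name: str, overrides: dict) -> dict:
    """Merge job-file values over the tier defaults; job-file values win."""
    if tier_name not in TIER_DEFAULTS:
        raise ValueError(f"unknown tier: {tier_name}")
    if "count" not in overrides:
        raise ValueError(f"{tier_name} is missing required 'count'")
    return {**TIER_DEFAULTS[tier_name], **overrides}
```

Because the merge is a plain dictionary update, fields the parser does not know about are carried through untouched, which fits the goal of adding new job-file fields without breaking existing jobs.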
 ### Future Extensibility
 Job file structure designed to support:
 - Custom interlinking rules (Story 2.4+)
 - Template selection (Story 3.x)
 - Deployment targets (Story 4.x)
 - SEO metadata overrides
 Keep job parsing flexible to add new fields without breaking existing jobs.
 ---
 ## Testing Strategy
 ### Unit Test Mocking
 Mock `AIClient.generate_completion()` to return a realistic response for each stage (a title string, an outline dict, and an HTML fragment):
 ```python
 import pytest
 @pytest.fixture
 def mock_title_response():
    return "The Ultimate Guide to Sustainable Gardening in 2025"
 @pytest.fixture
 def mock_outline_response():
    return {
        "outline": [
            {"h2": "Getting Started", "h3": ["Tools", "Planning"]},
            {"h2": "Best Practices", "h3": ["Watering", "Composting"]}
        ]
    }
 @pytest.fixture
 def mock_content_response():
    return """<h2>Getting Started</h2>
 <p>Sustainable gardening begins with proper planning...</p>
 <h3>Tools</h3>
 <p>Essential tools include...</p>"""
 ```
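A mocked response can be wired into a test with `unittest.mock.Mock`, as in the sketch below. The `generate_title` function here is a hypothetical stand-in for `ContentGenerator.generate_title()`, whose real signature is not shown in this plan.

```python
from unittest.mock import Mock


def generate_title(ai_client, keyword: str) -> str:
    """Hypothetical stand-in: ask the client for a title and trim whitespace."""
    return ai_client.generate_completion(f"Write a title about {keyword}").strip()


def test_generate_title_uses_mocked_client():
    client = Mock()
    # The mock returns a canned response instead of calling OpenRouter.
    client.generate_completion.return_value = (
        " The Ultimate Guide to Sustainable Gardening in 2025\n"
    )
    title = generate_title(client, "sustainable gardening")
    assert title == "The Ultimate Guide to Sustainable Gardening in 2025"
    client.generate_completion.assert_called_once()
```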
 ### Integration Test Database
 Use `conftest.py` fixture with in-memory SQLite and test data:
 ```python
 import pytest
 from src.database.repositories import ProjectRepository
 @pytest.fixture
 def test_project(test_db):
    project_repo = ProjectRepository(test_db)
    return project_repo.create(
        user_id=1,
        name="Test Project",
        data={
            "main_keyword": "sustainable gardening",
            "entities": ["composting", "organic soil"],
            "related_searches": ["how to compost", "organic gardening tips"]
        }
    )
 ```
 ---
 ## Success Criteria
 Story is complete when:
 1. All database models and repositories implemented
 2. AIClient successfully calls OpenRouter API
 3. Three-stage generation pipeline works end-to-end
 4. Batch processor handles multiple jobs/tiers
 5. CLI command `generate-batch` functional
 6. Debug output saves to `debug_output/` when `--debug` used
 7. All unit tests pass
 8. Integration test demonstrates full workflow
 9. Example job files work correctly
 10. Documentation updated
 **Acceptance**: Run `generate-batch` on real project, verify content saved to database with correct word count and structure.
--- a/simplified-ai-content-generation.md
+++ b/simplified-ai-content-generation.md
@ -0,0 +1,40 @@
 # Story 2.2: Simplified AI Content Generation via Batch Job
 ## Status
 Completed
 ## Story
 **As a** User,
 **I want** to control AI content generation via a batch file that specifies word count and heading limits,
 **so that** I can easily create topically relevant articles without unnecessary complexity or rigid validation.
 ## Acceptance Criteria
 1.  **Batch Job Control:** The `generate-batch` command accepts a JSON job file that specifies `min_word_count`, `max_word_count`, `max_h2_tags`, and `max_h3_tags` for each tier.
 2.  **Three-Stage Generation:** The system uses a simple three-stage pipeline:
    * Generates a title using the project's SEO data.
    * Generates an outline based on the title, SEO data, and the `max_h2`/`max_h3` limits from the job file.
    * Generates the full article content based on the validated outline.
 3.  **SEO Data Integration:** The generation process for all stages is informed by the project's `keyword`, `entities`, and `related_searches` to ensure topical relevance.
 4.  **Word Count Validation:** After generation, the system validates the content *only* against the `min_word_count` and `max_word_count` specified in the job file.
 5.  **Simple Augmentation:** If the generated content is below `min_word_count`, the system makes **one** attempt to append additional content using a simple "expand on this article" prompt.
 6.  **Database Storage:** The final generated title, outline, and content are stored in the `GeneratedContent` table.
 7.  **CLI Execution:** The `generate-batch` command successfully runs the job, logs progress to the console, and indicates when the process is complete.
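The outline limits in criterion 2 could be checked with a helper like the one below. It assumes the outline shape used by the test fixtures in the task breakdown (`{"outline": [{"h2": ..., "h3": [...]}]}`); `count_headings` and `outline_within_limits` are hypothetical names.

```python
from typing import Tuple


def count_headings(outline: dict) -> Tuple[int, int]:
    """Return (number of H2 sections, total number of H3 subheadings)."""
    sections = outline.get("outline", [])
    h2_count = len(sections)
    h3_count = sum(len(section.get("h3", [])) for section in sections)
    return h2_count, h3_count


def outline_within_limits(outline: dict, max_h2: int, max_h3: int) -> bool:
    """Check the outline against the max_h2_tags / max_h3_tags job settings."""
    h2_count, h3_count = count_headings(outline)
    return h2_count <= max_h2 and h3_count <= max_h3
```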
 ## Dev Notes
 * **Objective:** This story replaces the previous, overly complex stories 2.2 and 2.3. The goal is maximum simplicity and user control via the job file.
 * **Key Change:** Remove the entire `ContentRuleEngine` and all strict CORA validation logic. The only validation required is a final word count check.
 * **Job File is King:** All operational parameters (`min_word_count`, `max_word_count`, `max_h2_tags`, `max_h3_tags`) must be read from the job file for each tier being processed.
 * **Augmentation:** Keep it simple. If `word_count < min_word_count`, make a single API call to the AI with a prompt like: "Please expand on the following article to add more detail and depth, ensuring you maintain the existing topical focus. Here is the article: {content}". Do not create a complex augmentation system.
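The validate-then-augment-once flow described in these notes might be sketched as follows. The status strings and the `finalize_content` name are assumptions; `augment` stands in for the single "expand on this article" API call.

```python
import re
from html import unescape
from typing import Callable, Tuple


def count_words(html_content: str) -> int:
    """Count words in an HTML fragment, treating tags as word separators."""
    text = unescape(re.sub(r"<[^>]+>", " ", html_content))
    return len(text.split())


def finalize_content(content: str, min_words: int, max_words: int,
                     augment: Callable[[str], str]) -> Tuple[str, int, str]:
    """Validate word count, calling `augment` at most once if below the minimum."""
    words = count_words(content)
    if words < min_words:
        content = augment(content)  # single attempt, no retry loop
        words = count_words(content)
    if words < min_words:
        status = "below_min"   # hypothetical status label
    elif words > max_words:
        status = "above_max"   # hypothetical status label
    else:
        status = "ok"
    return content, words, status
```

Keeping the augmentation as an injected callable makes it easy to verify in a unit test that it is invoked only once.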
 ## Implementation Plan
 See **[story-2.2-task-breakdown.md](story-2.2-task-breakdown.md)** for detailed implementation tasks.
 The task breakdown is organized into 7 phases:
 1. **Phase 1**: Data Model & Schema Design (GeneratedContent table, repositories, job file schema)
 2. **Phase 2**: AI Client & Prompt Management (OpenRouter integration, prompt templates)
 3. **Phase 3**: Core Generation Pipeline (title, outline, content generation with validation)
 4. **Phase 4**: Batch Processing (job config parser, batch processor, error handling)
 5. **Phase 5**: CLI Integration (generate-batch command, progress logging, debug output)
 6. **Phase 6**: Testing & Validation (unit tests, integration tests, example job files)
 7. **Phase 7**: Cleanup & Deprecation (remove old rule engine and validators)
--- a/env.example
+++ b/env.example
@ -2,7 +2,7 @@
 DATABASE_URL=sqlite:///./content_automation.db
 # AI Service Configuration (OpenRouter)
-AI_API_KEY=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
+OPENROUTER_API_KEY=your_openrouter_api_key_here
 AI_API_BASE_URL=https://openrouter.ai/api/v1
 AI_MODEL=anthropic/claude-3.5-sonnet
--- a/16
+++ b/16
@ -0,0 +1,16 @@
 5b5bd1b (HEAD -> feature/tier-word-count-override) Add tier-specific word count and outline controls
 3063fc4 (origin/main, origin/HEAD, main) Story 2.3 - content generation script nightmare alomst done  - fixed (maybe) outline too big issue
 b6b0acf Story 2.3 - content generation script nightmare alomst done  - pre-fix outline too big issue
 f73b070 (github/main) Story 2.3 - content generation script finished - fix ci
 e2afabb Story 2.3 - content generation script finished
 0069e6e Story 2.2 - rule engine finished
 d81537f Story 2.1 finished
 02dd5a3 Story 2.1 finished
 29ecaec Story 1.7 finished
 da797c2 Story 1.6 finished - added sync
 4cada9d Story 1.6 finished
 b6e495e feat: Story 1.5 - CLI User Management
 0a223e2 Complete Story 1.4: Internal API Foundation
 8641bca Complete Epic 1 Stories 1.1-1.3: Foundation, Database, and Authentication
 70b9de2 feat: Complete Story 1.1 - Project Initialization & Configuration
 31b9580 Initial commit: Project structure and planning documents
--- a/jobs/README.md
+++ b/jobs/README.md
@ -1,77 +1,179 @@
-# Job Configuration Files
+# Job File Format
-This directory contains batch job configuration files for content generation.
+Job files define batch content generation parameters using JSON format.
-## Usage
+## Structure
 Run a batch job using the CLI:
 ```bash
 python main.py generate-batch --job-file jobs/example_tier1_batch.json -u admin -p password
 ```
 ## Job Configuration Structure
 ```json
 {
-  "job_name": "Descriptive name",
+  "jobs": [
  "project_id": 1,
  "description": "Optional description",
  "tiers": [
    {
-      "tier": 1,
+      "project_id": 1,
-      "article_count": 15,
+      "tiers": {
-      "models": {
+        "tier1": {
-        "title": "model-id",
+          "count": 5,
-        "outline": "model-id",
+          "min_word_count": 2000,
-        "content": "model-id"
+          "max_word_count": 2500,
-      },
+          "min_h2_tags": 3,
-      "anchor_text_config": {
+          "max_h2_tags": 5,
-        "mode": "default|override|append",
+          "min_h3_tags": 5,
-        "custom_text": ["optional", "custom", "anchors"],
+          "max_h3_tags": 10
        "additional_text": ["optional", "additions"]
      },
      "validation_attempts": 3
        }
  ],
  "failure_config": {
    "max_consecutive_failures": 5,
    "skip_on_failure": true
  },
  "interlinking": {
    "links_per_article_min": 2,
    "links_per_article_max": 4,
    "include_home_link": true
      }
    }
  ]
 }
 ```
-## Available Models
+## Fields
- `anthropic/claude-3.5-sonnet` - Best for high-quality content
+### Job Level
- `anthropic/claude-3-haiku` - Fast and cost-effective
+- `project_id` (required): The project ID to generate content for
- `openai/gpt-4o` - Excellent quality
+- `tiers` (required): Dictionary of tier configurations
 - `openai/gpt-4o-mini` - Good for titles/outlines
 - `meta-llama/llama-3.1-70b-instruct` - Open source alternative
 - `google/gemini-pro-1.5` - Google's offering
-## Anchor Text Modes
+### Tier Level
 - `count` (required): Number of articles to generate for this tier
 - `min_word_count` (optional): Minimum word count (uses defaults if not specified)
 - `max_word_count` (optional): Maximum word count (uses defaults if not specified)
 - `min_h2_tags` (optional): Minimum H2 headings (uses defaults if not specified)
 - `max_h2_tags` (optional): Maximum H2 headings (uses defaults if not specified)
 - `min_h3_tags` (optional): Minimum H3 subheadings total (uses defaults if not specified)
 - `max_h3_tags` (optional): Maximum H3 subheadings total (uses defaults if not specified)
- **default**: Use CORA rules (keyword, entities, related searches)
+## Tier Defaults
 - **override**: Replace default with custom_text list
 - **append**: Add additional_text to default anchor text
-## Example Files
+If tier parameters are not specified, these defaults are used:
- `example_tier1_batch.json` - Single tier 1 with 15 articles
+### tier1
- `example_multi_tier_batch.json` - Three tiers with 165 total articles
+- `min_word_count`: 2000
- `example_custom_anchors.json` - Custom anchor text demo
+- `max_word_count`: 2500
 - `min_h2_tags`: 3
 - `max_h2_tags`: 5
 - `min_h3_tags`: 5
 - `max_h3_tags`: 10
-## Tips
+### tier2
 - `min_word_count`: 1500
 - `max_word_count`: 2000
 - `min_h2_tags`: 2
 - `max_h2_tags`: 4
 - `min_h3_tags`: 3
 - `max_h3_tags`: 8
-1. Start with tier 1 to ensure quality
+### tier3
-2. Use faster/cheaper models for tier 2+
+- `min_word_count`: 1000
-3. Set `skip_on_failure: true` to continue on errors
+- `max_word_count`: 1500
-4. Adjust `max_consecutive_failures` based on model reliability
+- `min_h2_tags`: 2
-5. Test with small batches first
+- `max_h2_tags`: 3
 - `min_h3_tags`: 2
 - `max_h3_tags`: 6
 ## Examples
 ### Simple: Single Tier with Defaults
 ```json
 {
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 5
        }
      }
    }
  ]
 }
 ```
 ### Custom Word Counts
 ```json
 {
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 3,
          "min_word_count": 2500,
          "max_word_count": 3000
        }
      }
    }
  ]
 }
 ```
 ### Multi-Tier
 ```json
 {
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 5
        },
        "tier2": {
          "count": 10
        },
        "tier3": {
          "count": 15
        }
      }
    }
  ]
 }
 ```
 ### Multiple Projects
 ```json
 {
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 5
        }
      }
    },
    {
      "project_id": 2,
      "tiers": {
        "tier1": {
          "count": 3
        },
        "tier2": {
          "count": 8
        }
      }
    }
  ]
 }
 ```
 ## Usage
 Run batch generation with:
 ```bash
 python main.py generate-batch --job-file jobs/example_tier1_batch.json --username youruser --password yourpass
 ```
 ### Options
 - `--job-file, -j`: Path to job JSON file (required)
 - `--username, -u`: Username for authentication
 - `--password, -p`: Password for authentication
 - `--debug`: Save AI responses to debug_output/
 - `--continue-on-error`: Continue processing if article generation fails
 - `--model, -m`: AI model to use (default: gpt-4o-mini)
 ### Debug Mode
 When using `--debug`, AI responses are saved to `debug_output/`:
 - `title_project{id}_tier{tier}_article{n}_{timestamp}.txt`
 - `outline_project{id}_tier{tier}_article{n}_{timestamp}.json`
 - `content_project{id}_tier{tier}_article{n}_{timestamp}.html`
 - `augmented_project{id}_tier{tier}_article{n}_{timestamp}.html` (if augmented)
--- a/jobs/example_multi_tier_batch.json
+++ b/jobs/example_multi_tier_batch.json
@ -1,57 +1,30 @@
 {
-  "job_name": "Multi-Tier Site Build",
+  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 5,
          "min_word_count": 2200,
          "max_word_count": 2600
        },
        "tier2": {
          "count": 10
        },
        "tier3": {
          "count": 15,
          "max_h2_tags": 4
        }
      }
    },
    {
      "project_id": 2,
-  "description": "Complete site build with 165 articles across 3 tiers",
+      "tiers": {
-  "tiers": [
+        "tier1": {
-    {
+          "count": 3
      "tier": 1,
      "article_count": 15,
      "models": {
        "title": "openai/gpt-4o-mini",
        "outline": "openai/gpt-4o-mini",
        "content": "anthropic/claude-4.5-sonnet"
      },
      "anchor_text_config": {
        "mode": "default"
      },
      "validation_attempts": 3
    },
    {
      "tier": 2,
      "article_count": 50,
      "models": {
        "title": "openai/gpt-4o-mini",
        "outline": "openai/gpt-4o-mini",
        "content": "openai/gpt-4o-mini"
      },
      "anchor_text_config": {
        "mode": "append",
        "additional_text": ["comprehensive guide", "expert insights"]
      },
      "validation_attempts": 2
    },
    {
      "tier": 3,
      "article_count": 100,
      "models": {
        "title": "openai/gpt-4o-mini",
        "outline": "openai/gpt-4o-mini",
        "content": "openai/gpt-4o-mini"
      },
      "anchor_text_config": {
        "mode": "default"
      },
      "validation_attempts": 2
        }
  ],
  "failure_config": {
    "max_consecutive_failures": 3,
    "skip_on_failure": true
  },
  "interlinking": {
    "links_per_article_min": 2,
    "links_per_article_max": 4,
    "include_home_link": true
      }
    }
  ]
 }
--- a/jobs/example_tier1_batch.json
+++ b/jobs/example_tier1_batch.json
@ -1,30 +1,13 @@
 {
-  "job_name": "Tier 1 Launch Batch",
+  "jobs": [
  "project_id": 1,
  "description": "Initial tier 1 content - 15 high-quality articles with strict validation",
  "tiers": [
    {
-      "tier": 1,
+      "project_id": 1,
-      "article_count": 15,
+      "tiers": {
-      "models": {
+        "tier1": {
-        "title": "anthropic/claude-3.5-sonnet",
+          "count": 5
        "outline": "anthropic/claude-3.5-sonnet",
        "content": "anthropic/claude-3.5-sonnet"
      },
      "anchor_text_config": {
        "mode": "default"
      },
      "validation_attempts": 3
        }
  ],
  "failure_config": {
    "max_consecutive_failures": 5,
    "skip_on_failure": true
  },
  "interlinking": {
    "links_per_article_min": 2,
    "links_per_article_max": 4,
    "include_home_link": true
      }
    }
  ]
 }
--- a/jobs/test_augmentation.json
+++ b/jobs/test_augmentation.json
@ -0,0 +1,19 @@
 {
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 1,
          "min_word_count": 2000,
          "max_word_count": 2500,
          "min_h2_tags": 3,
          "max_h2_tags": 5,
          "min_h3_tags": 5,
          "max_h3_tags": 10
        }
      }
    }
  ]
 }
--- a/jobs/test_small.json
+++ b/jobs/test_small.json
@ -0,0 +1,19 @@
 {
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 1,
          "min_word_count": 500,
          "max_word_count": 800,
          "min_h2_tags": 2,
          "max_h2_tags": 3,
          "min_h3_tags": 3,
          "max_h3_tags": 6
        }
      }
    }
  ]
 }
--- a/scripts/add_admin_direct.py
+++ b/scripts/add_admin_direct.py
@ -0,0 +1,27 @@
 import sys
 from pathlib import Path
 project_root = Path(__file__).parent.parent
 sys.path.insert(0, str(project_root))
 from src.database.session import db_manager
 from src.database.repositories import UserRepository
 from src.auth.service import AuthService
 db_manager.initialize()
 session = db_manager.get_session()
 try:
    user_repo = UserRepository(session)
    auth_service = AuthService(user_repo)
    user = auth_service.create_user_with_hashed_password(
        username="admin",
        password="admin1234",
        role="Admin"
    )
    print(f"Admin user created: {user.username}")
 finally:
    session.close()
    db_manager.close()
--- a/src/cli/commands.py
+++ b/src/cli/commands.py
@ -16,6 +16,11 @@ from src.deployment.bunnynet import (
    BunnyNetResourceConflictError
 )
 from src.ingestion.parser import CORAParser, CORAParseError
 from src.generation.ai_client import AIClient, PromptManager
 from src.generation.service import ContentGenerator
 from src.generation.batch_processor import BatchProcessor
 from src.database.repositories import GeneratedContentRepository
 import os
 def authenticate_admin(username: str, password: str) -> Optional[User]:
@ -871,22 +876,26 @@ def list_projects(username: Optional[str], password: Optional[str]):
        raise click.Abort()
-@app.command()
+<<<<<<< HEAD
-@click.option("--job-file", "-j", required=True, help="Path to job configuration JSON file")
+@app.command("generate-batch")
-@click.option("--force-regenerate", "-f", is_flag=True, help="Force regeneration even if content exists")
+@click.option('--job-file', '-j', required=True, type=click.Path(exists=True), 
-@click.option("--debug", "-d", is_flag=True, help="Enable debug mode (saves generated content to debug_output/)")
+              help='Path to job JSON file')
-@click.option("--username", "-u", help="Username for authentication")
+@click.option('--username', '-u', help='Username for authentication')
-@click.option("--password", "-p", help="Password for authentication")
+@click.option('--password', '-p', help='Password for authentication')
-def generate_batch(job_file: str, force_regenerate: bool, debug: bool, username: Optional[str], password: Optional[str]):
+@click.option('--debug', is_flag=True, help='Save AI responses to debug_output/')
-    """
+@click.option('--continue-on-error', is_flag=True, 
-    Generate batch of articles from a job configuration file
+              help='Continue processing if article generation fails')
-    
+@click.option('--model', '-m', default='gpt-4o-mini',
-    Example:
+              help='AI model to use (gpt-4o-mini, claude-sonnet-4.5)')
-        python main.py generate-batch --job-file jobs/tier1_batch.json -u admin -p pass
+def generate_batch(
-    """
+    job_file: str, 
-    from src.generation.batch_processor import BatchProcessor
+    username: Optional[str], 
-    from src.generation.job_config import JobConfig
+    password: Optional[str],
-    
+    debug: bool,
    continue_on_error: bool,
    model: str
 ):
    """Generate content batch from job file"""
    try:
        if not username or not password:
            username, password = prompt_admin_credentials()
@ -903,70 +912,47 @@ def generate_batch(job_file: str, force_regenerate: bool, debug: bool, username:
            click.echo(f"Authenticated as: {user.username} ({user.role})")
-            job_config = JobConfig.from_file(job_file)
+            api_key = os.getenv("OPENROUTER_API_KEY")
            if not api_key:
                click.echo("Error: OPENROUTER_API_KEY not found in environment", err=True)
                click.echo("Please set OPENROUTER_API_KEY in your .env file", err=True)
                raise click.Abort()
-            click.echo(f"\nLoading Job: {job_config.job_name}")
+            click.echo(f"Initializing AI client with model: {model}")
-            click.echo(f"Project ID: {job_config.project_id}")
+            ai_client = AIClient(api_key=api_key, model=model)
-            click.echo(f"Total Articles: {job_config.get_total_articles()}")
+            prompt_manager = PromptManager()
            click.echo(f"\nTiers:")
            for tier_config in job_config.tiers:
                click.echo(f"  Tier {tier_config.tier}: {tier_config.article_count} articles")
                click.echo(f"    Models: {tier_config.models.title} / {tier_config.models.outline} / {tier_config.models.content}")
-            if not click.confirm("\nProceed with generation?"):
+            project_repo = ProjectRepository(session)
-                click.echo("Aborted")
+            content_repo = GeneratedContentRepository(session)
                return
-            click.echo("\nStarting batch generation...")
+            content_generator = ContentGenerator(
-            click.echo("-" * 80)
+                ai_client=ai_client,
                prompt_manager=prompt_manager,
                project_repo=project_repo,
                content_repo=content_repo
            )
-            def progress_callback(tier=None, article_num=None, total=None, status=None, stage=None, **kwargs):
+            batch_processor = BatchProcessor(
-                if stage:
+                content_generator=content_generator,
-                    if status == "completed":
+                content_repo=content_repo,
-                        if stage == "title":
+                project_repo=project_repo
-                            title = kwargs.get("title", "")
+            )
                            click.echo(f"  - Title generated: {title}")
                        elif stage == "outline":
                            outline = kwargs.get("outline", {})
                            h2_count = len(outline.get("sections", []))
                            h3_count = sum(len(s.get("h3s", [])) for s in outline.get("sections", []))
                            click.echo(f"  - Outline generated: {h2_count} H2s, {h3_count} H3s")
                        elif stage == "content":
                            word_count = kwargs.get("word_count", 0)
                            click.echo(f"  - Content generated: {word_count} words")
                elif status == "starting":
                    click.echo(f"[Tier {tier}] Article {article_num}/{total}: Generating...")
                elif status == "completed":
                    content_id = kwargs.get("content_id", "?")
                    click.echo(f"[Tier {tier}] Article {article_num}/{total}: Completed (ID: {content_id})")
                elif status == "skipped":
                    error = kwargs.get("error", "Unknown error")
                    click.echo(f"[Tier {tier}] Article {article_num}/{total}: Skipped - {error}", err=True)
                elif status == "failed":
                    error = kwargs.get("error", "Unknown error")
                    click.echo(f"[Tier {tier}] Article {article_num}/{total}: Failed - {error}", err=True)
            click.echo(f"\nProcessing job file: {job_file}")
            if debug:
-                click.echo("\n[DEBUG MODE ENABLED - Content will be saved to debug_output/]\n")
+                click.echo("Debug mode: AI responses will be saved to debug_output/\n")
-            processor = BatchProcessor(session)
+            batch_processor.process_job(
-            result = processor.process_job(job_config, progress_callback, debug=debug)
+                job_file_path=job_file,
-            
+                debug=debug,
-            click.echo("-" * 80)
+                continue_on_error=continue_on_error
-            click.echo("\nBatch Generation Complete!")
+            )
            click.echo(result.to_summary())
        finally:
            session.close()
    except FileNotFoundError as e:
        click.echo(f"Error: {e}", err=True)
        raise click.Abort()
    except ValueError as e:
        click.echo(f"Error: {e}", err=True)
        raise click.Abort()
    except Exception as e:
-        click.echo(f"Error: {e}", err=True)
+        click.echo(f"Error processing batch: {e}", err=True)
        raise click.Abort()
--- a/src/database/models.py
+++ b/src/database/models.py
@ -3,7 +3,7 @@ SQLAlchemy database models
 """
 from datetime import datetime, timezone
-from typing import Literal, Optional
+from typing import Optional
 from sqlalchemy import String, Integer, DateTime, Float, ForeignKey, JSON, Text
 from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column
@ -120,40 +120,18 @@ class Project(Base):
 class GeneratedContent(Base):
-    """Generated content model for AI-generated articles with version tracking"""
+    """Generated content model for AI-created articles"""
    __tablename__ = "generated_content"
    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    project_id: Mapped[int] = mapped_column(Integer, ForeignKey('projects.id'), nullable=False, index=True)
-    tier: Mapped[int] = mapped_column(Integer, nullable=False, index=True)
+    tier: Mapped[str] = mapped_column(String(20), nullable=False, index=True)
-    
+    keyword: Mapped[str] = mapped_column(String(255), nullable=False, index=True)
-    title: Mapped[Optional[str]] = mapped_column(String(500), nullable=True)
+    title: Mapped[str] = mapped_column(Text, nullable=False)
-    outline: Mapped[Optional[str]] = mapped_column(Text, nullable=True)
+    outline: Mapped[dict] = mapped_column(JSON, nullable=False)
-    content: Mapped[Optional[str]] = mapped_column(Text, nullable=True)
+    content: Mapped[str] = mapped_column(Text, nullable=False)
-    
+    word_count: Mapped[int] = mapped_column(Integer, nullable=False)
-    status: Mapped[str] = mapped_column(String(20), nullable=False, default="pending", index=True)
+    status: Mapped[str] = mapped_column(String(20), nullable=False)
    is_active: Mapped[bool] = mapped_column(Integer, nullable=False, default=False)
    generation_stage: Mapped[str] = mapped_column(String(20), nullable=False, default="title")
    title_attempts: Mapped[int] = mapped_column(Integer, nullable=False, default=0)
    outline_attempts: Mapped[int] = mapped_column(Integer, nullable=False, default=0)
    content_attempts: Mapped[int] = mapped_column(Integer, nullable=False, default=0)
    title_model: Mapped[Optional[str]] = mapped_column(String(100), nullable=True)
    outline_model: Mapped[Optional[str]] = mapped_column(String(100), nullable=True)
    content_model: Mapped[Optional[str]] = mapped_column(String(100), nullable=True)
    validation_errors: Mapped[int] = mapped_column(Integer, nullable=False, default=0)
    validation_warnings: Mapped[int] = mapped_column(Integer, nullable=False, default=0)
    validation_report: Mapped[Optional[dict]] = mapped_column(JSON, nullable=True)
    word_count: Mapped[Optional[int]] = mapped_column(Integer, nullable=True)
    augmented: Mapped[bool] = mapped_column(Integer, nullable=False, default=False)
    augmentation_log: Mapped[Optional[dict]] = mapped_column(JSON, nullable=True)
    generation_duration: Mapped[Optional[float]] = mapped_column(Float, nullable=True)
    error_message: Mapped[Optional[str]] = mapped_column(Text, nullable=True)
    created_at: Mapped[datetime] = mapped_column(DateTime, default=datetime.utcnow, nullable=False)
    updated_at: Mapped[datetime] = mapped_column(
        DateTime, 
@ -163,4 +141,4 @@ class GeneratedContent(Base):
    )
    def __repr__(self) -> str:
-        return f"<GeneratedContent(id={self.id}, project_id={self.project_id}, tier={self.tier}, status='{self.status}', stage='{self.generation_stage}')>"
+        return f"<GeneratedContent(id={self.id}, project_id={self.project_id}, tier='{self.tier}', status='{self.status}')>"
--- a/src/database/repositories.py
+++ b/src/database/repositories.py
@ -5,9 +5,8 @@ Concrete repository implementations
 from typing import Optional, List, Dict, Any
 from sqlalchemy.orm import Session
 from sqlalchemy.exc import IntegrityError
-from src.database.interfaces import IUserRepository, ISiteDeploymentRepository, IProjectRepository, IGeneratedContentRepository
+from src.database.interfaces import IUserRepository, ISiteDeploymentRepository, IProjectRepository
 from src.database.models import User, SiteDeployment, Project, GeneratedContent
 from src.core.config import get_config
 class UserRepository(IUserRepository):
@ -377,35 +376,55 @@ class ProjectRepository(IProjectRepository):
        return False
-class GeneratedContentRepository(IGeneratedContentRepository):
+<<<<<<< HEAD
-    """Repository implementation for GeneratedContent data access"""
+class GeneratedContentRepository:
    """Repository for GeneratedContent data access"""
    def __init__(self, session: Session):
        self.session = session
-    def create(self, project_id: int, tier: int) -> GeneratedContent:
+    def create(
        self,
        project_id: int,
        tier: str,
        keyword: str,
        title: str,
        outline: dict,
        content: str,
        word_count: int,
        status: str
    ) -> GeneratedContent:
        """
        Create a new generated content record
        Args:
-            project_id: The ID of the project
+            project_id: The project ID this content belongs to
-            tier: The tier level (1, 2, etc.)
+            tier: Content tier (tier1, tier2, tier3)
            keyword: The keyword used for generation
            title: Generated title
            outline: Generated outline (JSON)
            content: Generated HTML content
            word_count: Final word count
            status: Status (generated, augmented, failed)
        Returns:
            The created GeneratedContent object
        """
-        content = GeneratedContent(
+        content_record = GeneratedContent(
            project_id=project_id,
            tier=tier,
-            status="pending",
+            keyword=keyword,
-            generation_stage="title",
+            title=title,
-            is_active=False
+            outline=outline,
            content=content,
            word_count=word_count,
            status=status
        )
-        self.session.add(content)
+        self.session.add(content_record)
        self.session.commit()
-        self.session.refresh(content)
+        self.session.refresh(content_record)
-        return content
+        return content_record
    def get_by_id(self, content_id: int) -> Optional[GeneratedContent]:
        """
@ -482,46 +501,51 @@ class GeneratedContentRepository(IGeneratedContentRepository):
        Returns:
            The updated GeneratedContent object
        """
 =======
        content_record = GeneratedContent(
            project_id=project_id,
            tier=tier,
            keyword=keyword,
            title=title,
            outline=outline,
            content=content,
            word_count=word_count,
            status=status
        )
        self.session.add(content_record)
        self.session.commit()
        self.session.refresh(content_record)
        return content_record
    def get_by_id(self, content_id: int) -> Optional[GeneratedContent]:
        """Get content by ID"""
        return self.session.query(GeneratedContent).filter(GeneratedContent.id == content_id).first()
    def get_by_project_id(self, project_id: int) -> List[GeneratedContent]:
        """Get all content for a project"""
        return self.session.query(GeneratedContent).filter(GeneratedContent.project_id == project_id).all()
    def get_by_project_and_tier(self, project_id: int, tier: str) -> List[GeneratedContent]:
        """Get content for a project and tier"""
        return self.session.query(GeneratedContent).filter(
            GeneratedContent.project_id == project_id,
            GeneratedContent.tier == tier
        ).all()
    def get_by_keyword(self, keyword: str) -> List[GeneratedContent]:
        """Get content by keyword"""
        return self.session.query(GeneratedContent).filter(GeneratedContent.keyword == keyword).all()
    def update(self, content: GeneratedContent) -> GeneratedContent:
        """Update existing content"""
        self.session.add(content)
        self.session.commit()
        self.session.refresh(content)
        return content
    def set_active(self, content_id: int, project_id: int, tier: int) -> bool:
        """
        Set a content version as active (deactivates others)
        Args:
            content_id: The ID of the content to activate
            project_id: The project ID
            tier: The tier level
        Returns:
            True if successful, False if content not found
        """
        content = self.get_by_id(content_id)
        if not content:
            return False
        self.session.query(GeneratedContent).filter(
            GeneratedContent.project_id == project_id,
            GeneratedContent.tier == tier
        ).update({"is_active": False})
        content.is_active = True
        self.session.commit()
        return True
    def delete(self, content_id: int) -> bool:
-        """
+        """Delete content by ID"""
        Delete a generated content record by ID
        Args:
            content_id: The ID of the content to delete
        Returns:
            True if deleted, False if content not found
        """
        content = self.get_by_id(content_id)
        if content:
            self.session.delete(content)
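Review note on `set_active` above: it looks the record up, clears `is_active` for every row in the same project and tier, then sets the flag on the chosen row. A minimal standalone sketch of that sequence follows. It assumes the standard-library `sqlite3` module in place of the SQLAlchemy session, and a hypothetical `make_demo_db` helper and table layout that are not part of this commit:

```python
import sqlite3


def set_active(conn: sqlite3.Connection, content_id: int, project_id: int, tier: str) -> bool:
    """Mark one row active and deactivate the rest of its project/tier group."""
    row = conn.execute(
        "SELECT id FROM generated_content WHERE id = ?", (content_id,)
    ).fetchone()
    if row is None:
        return False  # same contract as the repository method: unknown ID -> False

    # The connection context manager commits on success and rolls back on error,
    # so both updates land together or not at all.
    with conn:
        conn.execute(
            "UPDATE generated_content SET is_active = 0 "
            "WHERE project_id = ? AND tier = ?",
            (project_id, tier),
        )
        conn.execute(
            "UPDATE generated_content SET is_active = 1 WHERE id = ?",
            (content_id,),
        )
    return True


def make_demo_db() -> sqlite3.Connection:
    """Hypothetical fixture: three rows across two tiers of one project."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE generated_content ("
        "id INTEGER PRIMARY KEY, project_id INTEGER, tier TEXT, "
        "is_active INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO generated_content (id, project_id, tier, is_active) "
        "VALUES (?, ?, ?, ?)",
        [(1, 10, "tier1", 1), (2, 10, "tier1", 0), (3, 10, "tier2", 1)],
    )
    conn.commit()
    return conn
```

Like the method in the diff, the sketch does not check that `content_id` actually belongs to the given project and tier; whether that check is wanted is a design decision for the repository.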
--- a/src/generation/ai_client.py
+++ b/src/generation/ai_client.py
@ -1,169 +1,145 @@
 """
-AI client for OpenRouter API integration
+OpenRouter AI client and prompt management
 """
-import os
+import time
 import json
-from typing import Dict, Any, Optional
+from pathlib import Path
-from openai import OpenAI
+from typing import Optional, Dict, Any
-from dotenv import load_dotenv
+from openai import OpenAI, RateLimitError, APIError
-from src.core.config import Config
+from src.core.config import get_config
-
+AVAILABLE_MODELS = {
-class AIClientError(Exception):
+    "gpt-4o-mini": "openai/gpt-4o-mini",
-    """Base exception for AI client errors"""
+    "claude-sonnet-4.5": "anthropic/claude-3.5-sonnet"
-    pass
+}
 class AIClient:
-    """Client for interacting with AI models via OpenRouter"""
+    """OpenRouter API client using OpenAI SDK"""
-    def __init__(self, config: Optional[Config] = None):
+    def __init__(
-        """
+        self, 
-        Initialize AI client
+        api_key: str, 
        model: str, 
        base_url: str = "https://openrouter.ai/api/v1"
    ):
        self.client = OpenAI(api_key=api_key, base_url=base_url)
-        Args:
+        if model in AVAILABLE_MODELS:
-            config: Application configuration (uses get_config() if None)
+            self.model = AVAILABLE_MODELS[model]
-        """
+        else:
-        load_dotenv()
+            self.model = model
-        from src.core.config import get_config
+    def generate_completion(
        self.config = config or get_config()
        api_key = os.getenv("AI_API_KEY")
        if not api_key:
            raise AIClientError("AI_API_KEY environment variable not set")
        # OpenRouter requires specific headers and configuration
        self.client = OpenAI(
            base_url=self.config.ai_service.base_url,
            api_key=api_key,
            default_headers={
                "HTTP-Referer": "https://github.com/yourusername/Big-Link-Man",
                "X-Title": "Big Link Man Content Generator"
            }
        )
        self.default_model = self.config.ai_service.model
        self.max_tokens = self.config.ai_service.max_tokens
        self.temperature = self.config.ai_service.temperature
        self.timeout = self.config.ai_service.timeout
    def generate(
        self, 
        prompt: str, 
-        model: Optional[str] = None,
+        system_message: Optional[str] = None,
-        temperature: Optional[float] = None,
+        max_tokens: int = 4000,
-        max_tokens: Optional[int] = None,
+        temperature: float = 0.7,
-        response_format: Optional[Dict[str, Any]] = None
+        json_mode: bool = False
    ) -> str:
        """
-        Generate text using AI model
+        Generate completion from OpenRouter API
        Args:
-            prompt: The prompt text
+            prompt: User prompt text
-            model: Model to use (defaults to config default)
+            system_message: Optional system message
-            temperature: Temperature (defaults to config default)
+            max_tokens: Maximum tokens to generate
-            max_tokens: Max tokens (defaults to config default)
+            temperature: Sampling temperature (0-1)
-            response_format: Optional response format for structured output
+            json_mode: If True, requests JSON response format
        Returns:
-            Generated text
+            Generated text completion
        Raises:
            AIClientError: If generation fails
        """
-        try:
+        messages = []
-            kwargs = {
+        if system_message:
-                "model": model or self.default_model,
+            messages.append({"role": "system", "content": system_message})
-                "messages": [{"role": "user", "content": prompt}],
+        messages.append({"role": "user", "content": prompt})
-                "temperature": temperature if temperature is not None else self.temperature,
+        
-                "max_tokens": max_tokens or self.max_tokens,
+        kwargs: Dict[str, Any] = {
-                "timeout": self.timeout,
+            "model": self.model,
            "messages": messages,
            "max_tokens": max_tokens,
            "temperature": temperature
        }
-            if response_format:
+        if json_mode:
-                kwargs["response_format"] = response_format
+            kwargs["response_format"] = {"type": "json_object"}
        retries = 3
        for attempt in range(retries):
            try:
                response = self.client.chat.completions.create(**kwargs)
                content = response.choices[0].message.content or ""
                # Debug: print first 200 chars if json_mode
                if json_mode:
                    print(f"[DEBUG] AI Response (first 200 chars): {content[:200]}")
                return content
-            if not response.choices:
+            except RateLimitError as e:
-                raise AIClientError("No response from AI model")
+                if attempt < retries - 1:
                    wait_time = 2 ** attempt
                    print(f"Rate limit hit. Retrying in {wait_time}s...")
                    time.sleep(wait_time)
                else:
                    raise
-            content = response.choices[0].message.content
+            except APIError as e:
-            if not content:
+                if attempt < retries - 1 and "network" in str(e).lower():
-                raise AIClientError("Empty response from AI model")
+                    wait_time = 2 ** attempt
-            
+                    print(f"Network error. Retrying in {wait_time}s...")
-            return content.strip()
+                    time.sleep(wait_time)
                else:
                    raise
            except Exception as e:
-            raise AIClientError(f"AI generation failed: {e}")
+                raise
-    def generate_json(
+        return ""
-        self,
+
-        prompt: str,
+
-        model: Optional[str] = None,
+class PromptManager:
-        temperature: Optional[float] = None,
+    """Manages loading and formatting of prompt templates"""
-        max_tokens: Optional[int] = None
+    
-    ) -> Dict[str, Any]:
+    def __init__(self, prompts_dir: str = "src/generation/prompts"):
        self.prompts_dir = Path(prompts_dir)
        self.prompts: Dict[str, dict] = {}
    def load_prompt(self, prompt_name: str) -> dict:
        """Load prompt from JSON file"""
        if prompt_name in self.prompts:
            return self.prompts[prompt_name]
        prompt_file = self.prompts_dir / f"{prompt_name}.json"
        if not prompt_file.exists():
            raise FileNotFoundError(f"Prompt file not found: {prompt_file}")
        with open(prompt_file, 'r', encoding='utf-8') as f:
            prompt_data = json.load(f)
        self.prompts[prompt_name] = prompt_data
        return prompt_data
    def format_prompt(self, prompt_name: str, **kwargs) -> tuple[str, str]:
        """
-        Generate JSON-formatted response
+        Format prompt with variables
        Args:
-            prompt: The prompt text (should request JSON output)
+            prompt_name: Name of the prompt template
-            model: Model to use
+            **kwargs: Variables to inject into the template
            temperature: Temperature
            max_tokens: Max tokens
        Returns:
-            Parsed JSON response
+            Tuple of (system_message, user_prompt)
        Raises:
            AIClientError: If generation or parsing fails
        """
-        response_text = self.generate(
+        prompt_data = self.load_prompt(prompt_name)
            prompt=prompt,
            model=model,
            temperature=temperature,
            max_tokens=max_tokens,
            response_format={"type": "json_object"}
        )
-        try:
+        system_message = prompt_data.get("system_message", "")
-            return json.loads(response_text)
+        user_prompt = prompt_data.get("user_prompt", "")
        except json.JSONDecodeError as e:
            raise AIClientError(f"Failed to parse JSON response: {e}\nResponse: {response_text}")
-    def validate_model(self, model: str) -> bool:
+        if system_message:
-        """
+            system_message = system_message.format(**kwargs)
        Check if a model is available in configuration
-        Args:
+        user_prompt = user_prompt.format(**kwargs)
            model: Model identifier
        Returns:
            True if model is available
        """
        available = self.config.ai_service.available_models
        return model in available.values() or model in available.keys()
    def get_model_id(self, model_name: str) -> str:
        """
        Get full model ID from short name
        Args:
            model_name: Short name (e.g., "claude-3.5-sonnet") or full ID
        Returns:
            Full model ID
        """
        available = self.config.ai_service.available_models
        if model_name in available:
            return available[model_name]
        if model_name in available.values():
            return model_name
        return model_name
        return system_message, user_prompt
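Review note on the retry loop in `generate_completion` above: with `retries = 3` it sleeps `2 ** attempt` seconds between attempts and re-raises on the final one. A minimal standalone sketch of that pattern follows. The helper name `call_with_backoff` is hypothetical and not part of this commit, and the injectable `sleep` parameter is an assumption added only so the timing can be checked without waiting:

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given exception types with exponential backoff."""
    for attempt in range(retries):
        try:
            return func()
        except retry_on:
            if attempt == retries - 1:
                raise  # last attempt: surface the original error
            sleep(2 ** attempt)  # wait 1s, then 2s, then 4s, ...
    # Only reachable when retries < 1, since the loop body never runs.
    raise RuntimeError("retries must be at least 1")
```

Exceptions outside `retry_on` propagate immediately, which corresponds to the bare `except Exception: raise` branch in the diff.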
--- a/src/generation/batch_processor.py
+++ b/src/generation/batch_processor.py
@ -1,15 +1,12 @@
 """
-Batch job processor for generating multiple articles across tiers
+Batch processor for content generation jobs
 """
-import time
+from typing import Dict, Any
-from typing import Optional
+import click
-from sqlalchemy.orm import Session
+from src.generation.service import ContentGenerator
-from src.database.models import Project
+from src.generation.job_config import JobConfig, Job, TierConfig
-from src.database.repositories import ProjectRepository
+from src.database.repositories import GeneratedContentRepository, ProjectRepository
 from src.generation.service import ContentGenerationService, GenerationError
 from src.generation.job_config import JobConfig, JobResult
 from src.core.config import Config, get_config
 class BatchProcessor:
@ -17,167 +14,205 @@ class BatchProcessor:
    def __init__(
        self,
-        session: Session,
+        content_generator: ContentGenerator,
-        config: Optional[Config] = None
+        content_repo: GeneratedContentRepository,
        project_repo: ProjectRepository
    ):
-        """
+        self.generator = content_generator
-        Initialize batch processor
+        self.content_repo = content_repo
-        
+        self.project_repo = project_repo
-        Args:
+        self.stats = {
-            session: Database session
+            "total_jobs": 0,
-            config: Application configuration
+            "processed_jobs": 0,
-        """
+            "total_articles": 0,
-        self.session = session
+            "generated_articles": 0,
-        self.config = config or get_config()
+            "augmented_articles": 0,
-        self.project_repo = ProjectRepository(session)
+            "failed_articles": 0
-        self.generation_service = ContentGenerationService(session, config)
+        }
    def process_job(
        self, 
-        job_config: JobConfig,
+        job_file_path: str, 
-        progress_callback: Optional[callable] = None,
+        debug: bool = False,
-        debug: bool = False
+        continue_on_error: bool = False
-    ) -> JobResult:
+    ):
        """
-        Process a batch job according to configuration
+        Process all jobs in job file
        Args:
-            job_config: Job configuration
+            job_file_path: Path to job JSON file
-            progress_callback: Optional callback function(tier, article_num, total, status)
+            debug: If True, save AI responses to debug_output/
-        
+            continue_on_error: If True, continue on article generation failure
        Returns:
            JobResult with statistics
        """
-        start_time = time.time()
+        job_config = JobConfig(job_file_path)
        jobs = job_config.get_jobs()
-        project = self.project_repo.get_by_id(job_config.project_id)
+        self.stats["total_jobs"] = len(jobs)
        for job_idx, job in enumerate(jobs, 1):
            try:
                self._process_single_job(job, job_idx, debug, continue_on_error)
                self.stats["processed_jobs"] += 1
            except Exception as e:
                click.echo(f"Error processing job {job_idx}: {e}")
                if not continue_on_error:
                    raise
        self._print_summary()
    def _process_single_job(
        self, 
        job: Job, 
        job_idx: int, 
        debug: bool,
        continue_on_error: bool
    ):
        """Process a single job"""
        project = self.project_repo.get_by_id(job.project_id)
        if not project:
-            raise ValueError(f"Project {job_config.project_id} not found")
+            raise ValueError(f"Project {job.project_id} not found")
-        result = JobResult(
+        click.echo(f"\nProcessing Job {job_idx}/{self.stats['total_jobs']}: Project ID {job.project_id}")
-            job_name=job_config.job_name,
+        
-            project_id=job_config.project_id,
+        for tier_name, tier_config in job.tiers.items():
-            total_articles=job_config.get_total_articles(),
+            self._process_tier(
-            successful=0,
+                job.project_id, 
-            failed=0,
+                tier_name, 
-            skipped=0
+                tier_config, 
                debug, 
                continue_on_error
            )
-        consecutive_failures = 0
+    def _process_tier(
        self, 
        project_id: int,
        tier_name: str,
        tier_config: TierConfig,
        debug: bool,
        continue_on_error: bool
    ):
        """Process all articles for a tier"""
        click.echo(f"  {tier_name}: Generating {tier_config.count} articles")
-        for tier_config in job_config.tiers:
+        project = self.project_repo.get_by_id(project_id)
-            tier = tier_config.tier
+        keyword = project.main_keyword
-            for article_num in range(1, tier_config.article_count + 1):
+        for article_num in range(1, tier_config.count + 1):
-                if progress_callback:
+            self.stats["total_articles"] += 1
                    progress_callback(
                        tier=tier,
                        article_num=article_num,
                        total=tier_config.article_count,
                        status="starting"
                    )
            try:
-                    content = self.generation_service.generate_article(
+                self._generate_single_article(
-                        project=project,
+                    project_id,
-                        tier=tier,
+                    tier_name,
-                        title_model=tier_config.models.title,
+                    tier_config,
-                        outline_model=tier_config.models.outline,
+                    article_num,
-                        content_model=tier_config.models.content,
+                    keyword,
-                        max_retries=tier_config.validation_attempts,
+                    debug
-                        progress_callback=progress_callback,
+                )
                self.stats["generated_articles"] += 1
            except Exception as e:
                self.stats["failed_articles"] += 1
                import traceback
                click.echo(f"    [{article_num}/{tier_config.count}] FAILED: {e}")
                click.echo(f"    Traceback: {traceback.format_exc()}")
                try:
                    self.content_repo.create(
                        project_id=project_id,
                        tier=tier_name,
                        keyword=keyword,
                        title="Failed Generation",
                        outline={"error": str(e)},
                        content="",
                        word_count=0,
                        status="failed"
                    )
                except Exception as db_error:
                    click.echo(f"    Failed to save error record: {db_error}")
                if not continue_on_error:
                    raise
    def _generate_single_article(
        self,
        project_id: int,
        tier_name: str,
        tier_config: TierConfig,
        article_num: int,
        keyword: str,
        debug: bool
    ):
        """Generate a single article"""
        prefix = f"    [{article_num}/{tier_config.count}]"
        click.echo(f"{prefix} Generating title...")
        title = self.generator.generate_title(project_id, debug=debug)
        click.echo(f"{prefix} Generated title: \"{title}\"")
        click.echo(f"{prefix} Generating outline...")
        outline = self.generator.generate_outline(
            project_id=project_id,
            title=title,
            min_h2=tier_config.min_h2_tags,
            max_h2=tier_config.max_h2_tags,
            min_h3=tier_config.min_h3_tags,
            max_h3=tier_config.max_h3_tags,
            debug=debug
        )
-                    result.successful += 1
+        h2_count = len(outline["outline"])
-                    result.add_tier_result(tier, "successful")
+        h3_count = sum(len(section.get("h3", [])) for section in outline["outline"])
-                    consecutive_failures = 0
+        click.echo(f"{prefix} Generated outline: {h2_count} H2s, {h3_count} H3s")
-                    if progress_callback:
+        click.echo(f"{prefix} Generating content...")
-                        progress_callback(
+        content = self.generator.generate_content(
-                            tier=tier,
+            project_id=project_id,
-                            article_num=article_num,
+            title=title,
-                            total=tier_config.article_count,
+            outline=outline,
-                            status="completed",
+            min_word_count=tier_config.min_word_count,
-                            content_id=content.id
+            max_word_count=tier_config.max_word_count,
            debug=debug
        )
-                except GenerationError as e:
+        word_count = self.generator.count_words(content)
-                    error_msg = f"Tier {tier}, Article {article_num}: {str(e)}"
+        click.echo(f"{prefix} Generated content: {word_count:,} words")
                    result.add_error(error_msg)
                    consecutive_failures += 1
-                    if job_config.failure_config.skip_on_failure:
+        status = "generated"
                        result.skipped += 1
                        result.add_tier_result(tier, "skipped")
-                        if progress_callback:
+        if word_count < tier_config.min_word_count:
-                            progress_callback(
+            click.echo(f"{prefix} Below minimum ({tier_config.min_word_count:,}), augmenting...")
-                                tier=tier,
+            content = self.generator.augment_content(
-                                article_num=article_num,
+                content=content,
-                                total=tier_config.article_count,
+                target_word_count=tier_config.min_word_count,
-                                status="skipped",
+                debug=debug,
-                                error=str(e)
+                project_id=project_id
            )
            word_count = self.generator.count_words(content)
            click.echo(f"{prefix} Augmented content: {word_count:,} words")
            status = "augmented"
            self.stats["augmented_articles"] += 1
        saved_content = self.content_repo.create(
            project_id=project_id,
            tier=tier_name,
            keyword=keyword,
            title=title,
            outline=outline,
            content=content,
            word_count=word_count,
            status=status
        )
-                        if consecutive_failures >= job_config.failure_config.max_consecutive_failures:
+        click.echo(f"{prefix} Saved (ID: {saved_content.id}, Status: {status})")
                            result.add_error(
                                f"Stopping job: {consecutive_failures} consecutive failures exceeded threshold"
                            )
                            result.duration = time.time() - start_time
                            return result
                    else:
                        result.failed += 1
                        result.add_tier_result(tier, "failed")
                        result.duration = time.time() - start_time
                        if progress_callback:
                            progress_callback(
                                tier=tier,
                                article_num=article_num,
                                total=tier_config.article_count,
                                status="failed",
                                error=str(e)
                            )
                        return result
                except Exception as e:
                    error_msg = f"Tier {tier}, Article {article_num}: Unexpected error: {str(e)}"
                    result.add_error(error_msg)
                    result.failed += 1
                    result.add_tier_result(tier, "failed")
                    result.duration = time.time() - start_time
                    if progress_callback:
                        progress_callback(
                            tier=tier,
                            article_num=article_num,
                            total=tier_config.article_count,
                            status="failed",
                            error=str(e)
                        )
                    return result
        result.duration = time.time() - start_time
        return result
    def process_job_from_file(
        self,
        job_file_path: str,
        progress_callback: Optional[callable] = None
    ) -> JobResult:
        """
        Load and process a job from a JSON file
        Args:
            job_file_path: Path to job configuration JSON file
            progress_callback: Optional progress callback
        Returns:
            JobResult with statistics
        """
        job_config = JobConfig.from_file(job_file_path)
        return self.process_job(job_config, progress_callback)
    def _print_summary(self):
        """Print job processing summary"""
        click.echo("\n" + "="*60)
        click.echo("SUMMARY")
        click.echo("="*60)
        click.echo(f"Jobs processed: {self.stats['processed_jobs']}/{self.stats['total_jobs']}")
        click.echo(f"Articles generated: {self.stats['generated_articles']}/{self.stats['total_articles']}")
        click.echo(f"Augmented: {self.stats['augmented_articles']}")
        click.echo(f"Failed: {self.stats['failed_articles']}")
        click.echo("="*60)
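Review note on the new `BatchProcessor` above: each article is counted, attempted, and recorded as generated or failed, and a failure aborts the run unless `continue_on_error` is set. A minimal sketch of that control flow, assuming a hypothetical `run_batch` helper and simplified counter names that do not appear in this commit:

```python
from typing import Callable, Dict, Iterable, TypeVar

T = TypeVar("T")


def run_batch(
    items: Iterable[T],
    work: Callable[[T], None],
    continue_on_error: bool = False,
) -> Dict[str, int]:
    """Run work on each item, counting outcomes; optionally keep going after failures."""
    stats = {"total": 0, "succeeded": 0, "failed": 0}
    for item in items:
        stats["total"] += 1
        try:
            work(item)
            stats["succeeded"] += 1
        except Exception:
            stats["failed"] += 1
            if not continue_on_error:
                raise  # stop the whole batch on the first failure
    return stats
```

Returning the counters, rather than only printing them, is a choice made for this sketch so the result can be inspected by a caller.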
--- a/src/generation/job_config.py
+++ b/src/generation/job_config.py
@ -1,213 +1,129 @@
 """
-Job configuration schema and validation for batch content generation
+Job configuration parser for batch content generation
 """
 from typing import List, Dict, Optional, Literal
 from pydantic import BaseModel, Field, field_validator
 import json
 from dataclasses import dataclass
 from typing import Optional, Dict, Any
 from pathlib import Path
-
+TIER_DEFAULTS = {
-class ModelConfig(BaseModel):
+    "tier1": {
-    """AI models configuration for each generation stage"""
+        "min_word_count": 2000,
-    title: str = Field(..., description="Model for title generation")
+        "max_word_count": 2500,
-    outline: str = Field(..., description="Model for outline generation")
+        "min_h2_tags": 3,
-    content: str = Field(..., description="Model for content generation")
+        "max_h2_tags": 5,
        "min_h3_tags": 5,
        "max_h3_tags": 10
    },
    "tier2": {
        "min_word_count": 1500,
        "max_word_count": 2000,
        "min_h2_tags": 2,
        "max_h2_tags": 4,
        "min_h3_tags": 3,
        "max_h3_tags": 8
    },
    "tier3": {
        "min_word_count": 1000,
        "max_word_count": 1500,
        "min_h2_tags": 2,
        "max_h2_tags": 3,
        "min_h3_tags": 2,
        "max_h3_tags": 6
    }
 }
-class AnchorTextConfig(BaseModel):
+@dataclass
-    """Anchor text configuration"""
+class TierConfig:
-    mode: Literal["default", "override", "append"] = Field(
+    """Configuration for a specific tier"""
-        default="default",
+    count: int
-        description="How to handle anchor text: default (use CORA), override (replace), append (add to)"
+    min_word_count: int
-    )
+    max_word_count: int
-    custom_text: Optional[List[str]] = Field(
+    min_h2_tags: int
-        default=None,
+    max_h2_tags: int
-        description="Custom anchor text for override mode"
+    min_h3_tags: int
-    )
+    max_h3_tags: int
    additional_text: Optional[List[str]] = Field(
        default=None,
        description="Additional anchor text for append mode"
    )
-class TierConfig(BaseModel):
+@dataclass
-    """Configuration for a single tier"""
+class Job:
-    tier: int = Field(..., ge=1, description="Tier number (1 = strictest validation)")
+    """Job definition for content generation"""
    article_count: int = Field(..., ge=1, description="Number of articles to generate")
    models: ModelConfig = Field(..., description="AI models for this tier")
    anchor_text_config: AnchorTextConfig = Field(
        default_factory=AnchorTextConfig,
        description="Anchor text configuration"
    )
    validation_attempts: int = Field(
        default=3,
        ge=1,
        le=10,
        description="Max validation retry attempts per stage"
    )
 class FailureConfig(BaseModel):
    """Failure handling configuration"""
    max_consecutive_failures: int = Field(
        default=5,
        ge=1,
        description="Stop job after this many consecutive failures"
    )
    skip_on_failure: bool = Field(
        default=True,
        description="Skip failed articles and continue, or stop immediately"
    )
 class InterlinkingConfig(BaseModel):
    """Interlinking configuration"""
    links_per_article_min: int = Field(
        default=2,
        ge=0,
        description="Minimum links to other articles"
    )
    links_per_article_max: int = Field(
        default=4,
        ge=0,
        description="Maximum links to other articles"
    )
    include_home_link: bool = Field(
        default=True,
        description="Include link to home page"
    )
    @field_validator('links_per_article_max')
    @classmethod
    def validate_max_greater_than_min(cls, v, info):
        if 'links_per_article_min' in info.data and v < info.data['links_per_article_min']:
            raise ValueError("links_per_article_max must be >= links_per_article_min")
        return v
 class JobConfig(BaseModel):
    """Complete job configuration"""
    job_name: str = Field(..., description="Descriptive name for the job")
    project_id: int = Field(..., ge=1, description="Project ID to use for all tiers")
    description: Optional[str] = Field(None, description="Optional job description")
    tiers: List[TierConfig] = Field(..., min_length=1, description="Tier configurations")
    failure_config: FailureConfig = Field(
        default_factory=FailureConfig,
        description="Failure handling configuration"
    )
    interlinking: InterlinkingConfig = Field(
        default_factory=InterlinkingConfig,
        description="Interlinking configuration"
    )
    @field_validator('tiers')
    @classmethod
    def validate_unique_tiers(cls, v):
        tier_numbers = [tier.tier for tier in v]
        if len(tier_numbers) != len(set(tier_numbers)):
            raise ValueError("Tier numbers must be unique")
        return v
    @classmethod
    def from_file(cls, file_path: str) -> 'JobConfig':
        """
        Load job configuration from JSON file
        Args:
            file_path: Path to the JSON file
        Returns:
            JobConfig instance
        Raises:
            FileNotFoundError: If file doesn't exist
            ValueError: If JSON is invalid or validation fails
        """
        path = Path(file_path)
        if not path.exists():
            raise FileNotFoundError(f"Job configuration file not found: {file_path}")
        try:
            with open(path, 'r', encoding='utf-8') as f:
                data = json.load(f)
            return cls(**data)
        except json.JSONDecodeError as e:
            raise ValueError(f"Invalid JSON in {file_path}: {e}")
        except Exception as e:
            raise ValueError(f"Failed to parse job configuration: {e}")
    def to_file(self, file_path: str) -> None:
        """
        Save job configuration to JSON file
        Args:
            file_path: Path to save the JSON file
        """
        path = Path(file_path)
        path.parent.mkdir(parents=True, exist_ok=True)
        with open(path, 'w', encoding='utf-8') as f:
            json.dump(self.model_dump(), f, indent=2)
    def get_total_articles(self) -> int:
        """Get total number of articles across all tiers"""
        return sum(tier.article_count for tier in self.tiers)
 class JobResult(BaseModel):
    """Result of a job execution"""
    job_name: str
    project_id: int
-    total_articles: int
+    tiers: Dict[str, TierConfig]
    successful: int
    failed: int
    skipped: int
    tier_results: Dict[int, Dict[str, int]] = Field(default_factory=dict)
    errors: List[str] = Field(default_factory=list)
    duration: float = 0.0
    def add_tier_result(self, tier: int, status: str) -> None:
        """Track result for a tier"""
        if tier not in self.tier_results:
            self.tier_results[tier] = {"successful": 0, "failed": 0, "skipped": 0}
-        if status in self.tier_results[tier]:
+class JobConfig:
-            self.tier_results[tier][status] += 1
+    """Parser for job configuration files"""
-    def add_error(self, error: str) -> None:
+    def __init__(self, job_file_path: str):
-        """Add an error message"""
+        """
-        self.errors.append(error)
+        Load and parse job file, apply defaults
-    def to_summary(self) -> str:
+        Args:
-        """Generate a human-readable summary"""
+            job_file_path: Path to JSON job file
-        lines = [
+        """
-            f"Job: {self.job_name}",
+        self.job_file_path = Path(job_file_path)
-            f"Project ID: {self.project_id}",
+        self.jobs: list[Job] = []
-            f"Duration: {self.duration:.2f}s",
+        self._load()
            f"",
            f"Results:",
            f"  Total Articles: {self.total_articles}",
            f"  Successful: {self.successful}",
            f"  Failed: {self.failed}",
            f"  Skipped: {self.skipped}",
            f"",
            f"By Tier:"
        ]
-        for tier, results in sorted(self.tier_results.items()):
+    def _load(self):
-            lines.append(f"  Tier {tier}:")
+        """Load and parse the job file"""
-            lines.append(f"    Successful: {results['successful']}")
+        if not self.job_file_path.exists():
-            lines.append(f"    Failed: {results['failed']}")
+            raise FileNotFoundError(f"Job file not found: {self.job_file_path}")
            lines.append(f"    Skipped: {results['skipped']}")
-        if self.errors:
+        with open(self.job_file_path, 'r', encoding='utf-8') as f:
-            lines.append("")
+            data = json.load(f)
            lines.append(f"Errors ({len(self.errors)}):")
            for error in self.errors[:10]:
                lines.append(f"  - {error}")
            if len(self.errors) > 10:
                lines.append(f"  ... and {len(self.errors) - 10} more")
-        return "\n".join(lines)
+        if "jobs" not in data:
            raise ValueError("Job file must contain 'jobs' array")
        for job_data in data["jobs"]:
            self._validate_job(job_data)
            job = self._parse_job(job_data)
            self.jobs.append(job)
    def _validate_job(self, job_data: dict):
        """Validate job structure"""
        if "project_id" not in job_data:
            raise ValueError("Job missing 'project_id'")
        if "tiers" not in job_data:
            raise ValueError("Job missing 'tiers'")
        if not isinstance(job_data["tiers"], dict):
            raise ValueError("'tiers' must be a dictionary")
    def _parse_job(self, job_data: dict) -> Job:
        """Parse a single job"""
        project_id = job_data["project_id"]
        tiers = {}
        for tier_name, tier_data in job_data["tiers"].items():
            tier_config = self._parse_tier(tier_name, tier_data)
            tiers[tier_name] = tier_config
        return Job(project_id=project_id, tiers=tiers)
    def _parse_tier(self, tier_name: str, tier_data: dict) -> TierConfig:
        """Parse tier configuration with defaults"""
        defaults = TIER_DEFAULTS.get(tier_name, TIER_DEFAULTS["tier3"])
        return TierConfig(
            count=tier_data.get("count", 1),
            min_word_count=tier_data.get("min_word_count", defaults["min_word_count"]),
            max_word_count=tier_data.get("max_word_count", defaults["max_word_count"]),
            min_h2_tags=tier_data.get("min_h2_tags", defaults["min_h2_tags"]),
            max_h2_tags=tier_data.get("max_h2_tags", defaults["max_h2_tags"]),
            min_h3_tags=tier_data.get("min_h3_tags", defaults["min_h3_tags"]),
            max_h3_tags=tier_data.get("max_h3_tags", defaults["max_h3_tags"])
        )
    def get_jobs(self) -> list[Job]:
        """Return list of all jobs in file"""
        return self.jobs
    def get_tier_config(self, job: Job, tier_name: str) -> Optional[TierConfig]:
        """Get tier config with defaults applied"""
        return job.tiers.get(tier_name)
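
The default-merging behaviour of `_parse_tier` above can be sketched in isolation. This is a minimal illustration, not the project's module: the `TIER_DEFAULTS` values are copied from the `tier2` and `tier3` entries visible in this diff, while the standalone `parse_tier` helper, the `FIELDS` tuple, and the sample job file (including the made-up `tier9` name) are assumptions introduced only for the example.

```python
import json
from dataclasses import dataclass

# Defaults copied from the tier2/tier3 entries shown in the diff.
TIER_DEFAULTS = {
    "tier2": {"min_word_count": 1500, "max_word_count": 2000,
              "min_h2_tags": 2, "max_h2_tags": 4,
              "min_h3_tags": 3, "max_h3_tags": 8},
    "tier3": {"min_word_count": 1000, "max_word_count": 1500,
              "min_h2_tags": 2, "max_h2_tags": 3,
              "min_h3_tags": 2, "max_h3_tags": 6},
}

# Hypothetical helper tuple: the limit fields that fall back to tier defaults.
FIELDS = ("min_word_count", "max_word_count",
          "min_h2_tags", "max_h2_tags",
          "min_h3_tags", "max_h3_tags")


@dataclass
class TierConfig:
    count: int
    min_word_count: int
    max_word_count: int
    min_h2_tags: int
    max_h2_tags: int
    min_h3_tags: int
    max_h3_tags: int


def parse_tier(tier_name: str, tier_data: dict) -> TierConfig:
    """Standalone stand-in for JobConfig._parse_tier (assumed equivalent)."""
    # Unknown tier names fall back to the tier3 defaults, as in the diff.
    defaults = TIER_DEFAULTS.get(tier_name, TIER_DEFAULTS["tier3"])
    limits = {name: tier_data.get(name, defaults[name]) for name in FIELDS}
    return TierConfig(count=tier_data.get("count", 1), **limits)


# Hypothetical job file content; "tier9" is an invented, unknown tier name.
job_file = json.loads("""
{"jobs": [{"project_id": 1,
           "tiers": {"tier2": {"count": 5, "max_word_count": 1800},
                     "tier9": {}}}]}
""")

tiers = {name: parse_tier(name, data)
         for name, data in job_file["jobs"][0]["tiers"].items()}
```

Only the keys present in the job file override the defaults, and an unrecognised tier name silently inherits the `tier3` limits, which may or may not be the behaviour a caller expects.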
--- a/src/generation/prompts/content_augmentation.json
+++ b/src/generation/prompts/content_augmentation.json
@ -1,9 +1,5 @@
 {
-  "system": "You are a content enhancement specialist who adds natural, relevant paragraphs to articles to meet optimization targets.",
+  "system_message": "You are an expert content editor who expands articles by adding depth, detail, and additional relevant information while maintaining topical focus and quality.",
-  "user_template": "Add new paragraph(s) to the following article to address these missing elements:\n\nCurrent Article:\n{current_content}\n\nWhat's Missing:\n{missing_elements}\n\nMain Keyword: {main_keyword}\nEntities to use: {target_entities}\nRelated Searches to reference: {target_searches}\nTarget Word Count for New Content: {target_word_count} words\n\nInstructions:\n1. Write {target_word_count} words of new content (1-3 paragraphs as needed)\n2. Naturally incorporate the missing keywords/entities/searches\n3. Make it relevant to the article topic\n4. Use a professional, engaging tone\n5. Don't directly repeat information already in the article\n6. The paragraphs should feel like natural additions\n7. IMPORTANT: Write at least {target_word_count} words to ensure we meet the target\n\nSuggested placement: {suggested_placement}\n\nRespond with ONLY the new paragraph(s) in HTML format:\n<p>First paragraph here...</p>\n<p>Second paragraph here...</p>\n\nDo not include the entire article, just the new paragraph(s) to insert.",
+  "user_prompt": "Please expand on the following article to add more detail and depth, ensuring you maintain the existing topical focus. Target word count: {target_word_count} words.\n\nCurrent article:\n{content}\n\nReturn the expanded article as an HTML fragment with the same structure (using <h2>, <h3>, <p> tags). You can add new paragraphs, expand existing ones, or add new subsections as needed. Do NOT change the existing headings unless necessary."
  "validation": {
    "output_format": "html"
  }
 }
--- a/src/generation/prompts/content_generation.json
+++ b/src/generation/prompts/content_generation.json
@ -1,12 +1,5 @@
 {
-  "system": "You are an creative content writer who creates comprehensive, engaging articles that strictly follow the provided outline and meet all CORA optimization requirements.",
+  "system_message": "You are an expert content writer who creates engaging, informative, and SEO-optimized articles that provide real value to readers while incorporating relevant keywords naturally.",
-  "user_template": "Write a complete, SEO-optimized article following this outline:\n\n{outline}\n\nArticle Details:\n- Title: {title}\n- Main Keyword: {main_keyword}\n- Target Token Count: {word_count}\n- Keyword Frequency Target: {term_frequency}% mentions\n\nEntities to incorporate: {entities}\nRelated Searches to reference: {related_searches}\n\nCritical Requirements:\n1. Follow the outline structure EXACTLY - use the provided H2 and H3 headings word-for-word\n2. Do NOT add numbering, Roman numerals, or letters to the headings\n3. The article must be {word_count} tokens long (±100 tokens)\n4. Mention the main keyword \"{main_keyword}\" naturally {term_frequency}% times throughout\n5. Write 2-3 substantial paragraphs under each heading. Reference industry standards, regulations, or best practices. Use relevant LSI and entities for the topic\n6. For the FAQ section:\n   - Each FAQ answer MUST begin by restating the question\n   - Provide detailed, helpful answers (100-150 words each)\n7. Incorporate entities and related searches naturally throughout\n8. Write in a professional, engaging tone. Use active voice for 80% of sentences\n9. Make content informative and valuable to readers. Use technical terminology appropriate for industry professionals.\n10. Use varied sentence structures and vocabulary.\n11. STRICTLY PROHIBITED: Filler phrases: 'it is important to note', as mentioned earlier', 'in conclusion' - Marketing language: 'revolutionary', 'game-changing', 'industry-leading', 'best-in-class' - Generic openings: 'In today's world', 'As we all know', 'It goes without saying' \n\nFormatting Requirements:\n- Use <h1> for the main title\n- Use <h2> for major sections\n- Use <h3> for subsections\n- Use <p> for paragraphs\n- Use <ul> and <li> for lists where appropriate\n- Do NOT include any CSS, <html>, <head>, or <body> tags\n- Return ONLY the article content HTML\n\nExample structure:\n<h1>Main Title</h1>\n<p>Introduction paragraph...</p>\n\n<h2>First Section</h2>\n<p>Content...</p>\n\n<h3>Subsection</h3>\n<p>More content...</p>\n\nWrite the complete article now.",
+  "user_prompt": "Write a complete article based on:\nTitle: {title}\nOutline: {outline}\nKeyword: {keyword}\n\nEntities to include naturally: {entities}\nRelated searches to address: {related_searches}\n\nTarget word count range: {min_word_count} to {max_word_count} words\n\nReturn as an HTML fragment with <h2>, <h3>, and <p> tags. Do NOT include <!DOCTYPE>, <html>, <head>, or <body> tags. Start directly with the first <h2> heading.\n\nWrite naturally and informatively. Incorporate the keyword, entities, and related searches organically throughout the content."
  "validation": {
    "output_format": "html",
    "min_word_count": true,
    "max_word_count": true,
    "keyword_frequency_target": true,
    "outline_structure_match": true
  }
 }
--- a/src/generation/prompts/outline_generation.json
+++ b/src/generation/prompts/outline_generation.json
@ -1,11 +1,5 @@
 {
-  "system": "You are an expert content strategist who creates compelling, specific article titles that provide clear direction for content creation. You also strive to meet strict CORA optimization targets.",
+  "system_message": "You are an expert content outliner who creates well-structured, comprehensive article outlines that cover topics thoroughly and logically.",
-  "user_template": "Create a detailed article outline for the following:\n\nTitle: {title}\nMain Keyword: {main_keyword}\nTarget Word Count: {word_count}\n\nCORA Targets:\n- H2 headings needed: {h2_total}\n- H2s with main keyword: {h2_exact}\n- H2s with related searches: {h2_related_search}\n- H2s with entities: {h2_entities}\n- H3 headings needed: {h3_total}\n- H3s with main keyword: {h3_exact}\n- H3s with related searches: {h3_related_search}\n- H3s with entities: {h3_entities}\n\nAvailable Entities: {entities}\nRelated Searches: {related_searches}\n\nThe title provided above will serve as the H1 heading for this article. Focus on creating the H2 and H3 structure that supports this title.\n\nRequirements:\n1. Create exactly {h2_total} H2 headings\n2. Create exactly {h3_total} H3 headings (distributed under H2s)\n3. At least {h2_exact} H2s must contain the exact keyword \"{main_keyword}\"\n4. The FIRST H2 should contain the main keyword\n5. Incorporate entities and related searches naturally into headings\n6. Include a \"Frequently Asked Questions\" H2 section with at least 3 H3 questions\n7. Each H3 question should be a complete question ending with ?\n8. Structure should flow logically\nCreate headings that build logically toward actionable insights\n9. Use specific, searchable language over generic terms\n 9. Include sub-topic hints in parentheses where helpful \n 10. Focus on reader problems and solutions.\n 11. FORBIDDEN ELEMENTS: Future-tense speculation ('The Future of...', 'Upcoming Trends') - Generic business-speak ('in today's competitive landscape', 'cutting-edge solutions') - Vague qualifiers ('best practices', 'industry-leading', 'world-class') \n\nIMPORTANT FORMATTING RULES:\n- Do NOT include numbering (1., 2., 3.)\n- Do NOT include Roman numerals (I., II., III.)\n- Do NOT include letters (A., B., C.)\n- Do NOT include any outline-style prefixes\n- Return clean heading text only\n\nWRONG: \"I. Introduction to {main_keyword}\"\nWRONG: \"1. Getting Started with {main_keyword}\"\nRIGHT: \"Introduction to {main_keyword}\"\nRIGHT: \"Getting Started with {main_keyword}\"\n\nRespond ONLY with valid JSON in this exact format (no additional text, explanations, or commentary):\n{{\n  \"sections\": [\n    {{\n      \"h2\": \"H2 heading text\",\n      \"h3s\": [\"H3 heading 1\", \"H3 heading 2\"]\n    }}\n  ]\n}}\n\nReturn ONLY the JSON object. Do not include any text before or after the JSON.",
+  "user_prompt": "Create an article outline for:\nTitle: {title}\nKeyword: {keyword}\n\nConstraints:\n- Between {min_h2} and {max_h2} H2 headings\n- Between {min_h3} and {max_h3} H3 subheadings total (distributed across H2 sections)\n\nEntities to incorporate: {entities}\nRelated searches to address: {related_searches}\n\nReturn ONLY valid JSON in this exact format:\n{{\"outline\": [{{\"h2\": \"Heading text\", \"h3\": [\"Subheading 1\", \"Subheading 2\"]}}, ...]}}\n\nEnsure the outline meets the minimum heading requirements and includes relevant entities and related searches."
  "validation": {
    "output_format": "json",
    "required_fields": ["sections"],
    "h2_count_must_match": true,
    "h3_count_must_match": true
  }
 }
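
The doubled braces (`{{` and `}}`) in both the old and new outline prompts suggest the templates are filled with Python's `str.format`, where a doubled brace renders as a literal brace. The diff does not show how `PromptManager` actually substitutes values, so the sketch below is an assumption: `render_prompt` is a hypothetical helper, `USER_PROMPT` is a shortened stand-in for the new `user_prompt`, and the title and keyword are invented sample values.

```python
import json

# Shortened stand-in for the new outline_generation "user_prompt".
USER_PROMPT = (
    "Create an article outline for:\nTitle: {title}\nKeyword: {keyword}\n\n"
    "Constraints:\n- Between {min_h2} and {max_h2} H2 headings\n\n"
    "Return ONLY valid JSON in this exact format:\n"
    '{{"outline": [{{"h2": "Heading text", "h3": ["Subheading 1"]}}]}}'
)


def render_prompt(template: str, **values) -> str:
    """Hypothetical helper: fill placeholders with str.format.

    Doubled braces in the template come out as single literal braces,
    so the JSON example survives formatting intact.
    """
    return template.format(**values)


prompt = render_prompt(
    USER_PROMPT,
    title="Choosing a Heat Pump",  # invented sample values
    keyword="heat pump",
    min_h2=2,
    max_h2=4,
)

# The last line of the rendered prompt is the JSON format example.
example_line = prompt.splitlines()[-1]
parsed = json.loads(example_line)
```

If the templates were rendered some other way (for instance with plain string replacement), the doubled braces would reach the model unchanged, so it is worth confirming which mechanism the prompt loader uses.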
--- a/src/generation/prompts/title_generation.json
+++ b/src/generation/prompts/title_generation.json
@ -1,10 +1,5 @@
 {
-  "system": "You are an expert content strategist who creates compelling, specific article titles that provide clear direction for content creation.",
+  "system_message": "You are an expert SEO content writer who creates compelling, search-optimized titles that attract clicks while accurately representing the content topic.",
-  "user_template": "Generate an unique, compelling article title for the broad topic:  \"{main_keyword}\".\n\nContext:\n- Main Keyword: {main_keyword}\n- - Top Entities: {entities}\n-   Related Searches: {related_searches}\n\nRequirements:\n1. The title MUST contain the exact main keyword: \"{main_keyword}\"\n2. The title should be compelling and click-worthy\n3. Each title must be specific enough that an AI could create substantial, focused content outline from the title alone\n4.Titles should be creative yet professionally relevant to: {{subject}}. It does not have to be directly related but must be at least tangentially related.\n5. Consider incorporating 1-2 related entities or searches if natural\n6. Mix formats: how-to guides (25%), case studies (10%), expert analyses (20%), comparison pieces (15%), trend analyses (10%), problem-solving articles (10%), listicles(10%)\nAvoid generic business jargon and AI slop (cutting-edge,game-changing, revolutionary)\n7- Use domain-specific terminology appropriate for an article about {main_keyword}\n 8-Include specific, actionable language that suggests clear content direction\n\nRespond with ONLY the title text, no quotes or additional formatting.\n\nExample format: \"Complete Guide to {main_keyword}: Tips and Best Practices\"",
+  "user_prompt": "Generate an SEO-optimized title for an article about: {keyword}\n\nRelated entities: {entities}\n\nRelated searches: {related_searches}\n\nReturn only the title text, no formatting or quotes."
  "validation": {
    "must_contain_keyword": true,
    "min_length": 30,
    "max_length": 120
  }
 }
--- a/src/generation/rule_engine.py
+++ b/src/generation/rule_engine.py
@ -1,337 +1,3 @@
-"""
+# Content validation rules
-Content validation rule engine for CORA-compliant HTML generation
+# DEPRECATED: This module has been replaced by the simplified generation pipeline in service.py
-"""
+# Kept for reference only.
 from dataclasses import dataclass, field
 from typing import Dict, List, Optional, Any
 from html.parser import HTMLParser
 import re
 from src.core.config import Config
 from src.database.models import Project
@dataclass
 class ValidationIssue:
    """Single validation issue (error or warning)"""
    rule_name: str
    severity: str
    message: str
    expected: Optional[Any] = None
    actual: Optional[Any] = None
@dataclass
 class ValidationResult:
    """Result of content validation"""
    passed: bool
    errors: List[ValidationIssue] = field(default_factory=list)
    warnings: List[ValidationIssue] = field(default_factory=list)
    def add_error(self, rule_name: str, message: str, expected: Any = None, actual: Any = None):
        self.errors.append(ValidationIssue(rule_name, "error", message, expected, actual))
        self.passed = False
    def add_warning(self, rule_name: str, message: str, expected: Any = None, actual: Any = None):
        self.warnings.append(ValidationIssue(rule_name, "warning", message, expected, actual))
    def to_dict(self) -> Dict:
        return {
            "passed": self.passed,
            "errors": [
                {
                    "rule": e.rule_name,
                    "severity": e.severity,
                    "message": e.message,
                    "expected": e.expected,
                    "actual": e.actual
                } for e in self.errors
            ],
            "warnings": [
                {
                    "rule": w.rule_name,
                    "severity": w.severity,
                    "message": w.message,
                    "expected": w.expected,
                    "actual": w.actual
                } for w in self.warnings
            ]
        }
 class ContentHTMLParser(HTMLParser):
    """HTML parser to extract structure and content for validation"""
    def __init__(self):
        super().__init__()
        self.title: Optional[str] = None
        self.meta_description: Optional[str] = None
        self.h1_tags: List[str] = []
        self.h2_tags: List[str] = []
        self.h3_tags: List[str] = []
        self.images: List[Dict[str, str]] = []
        self.links: List[Dict[str, str]] = []
        self.text_content: str = ""
        self._current_tag: Optional[str] = None
        self._current_data: List[str] = []
        self._in_title = False
        self._in_h1 = False
        self._in_h2 = False
        self._in_h3 = False
    def handle_starttag(self, tag: str, attrs: List[tuple]):
        self._current_tag = tag
        attrs_dict = dict(attrs)
        if tag == "title":
            self._in_title = True
            self._current_data = []
        elif tag == "meta" and attrs_dict.get("name") == "description":
            self.meta_description = attrs_dict.get("content", "")
        elif tag == "h1":
            self._in_h1 = True
            self._current_data = []
        elif tag == "h2":
            self._in_h2 = True
            self._current_data = []
        elif tag == "h3":
            self._in_h3 = True
            self._current_data = []
        elif tag == "img":
            self.images.append({
                "src": attrs_dict.get("src", ""),
                "alt": attrs_dict.get("alt", "")
            })
        elif tag == "a":
            self.links.append({
                "href": attrs_dict.get("href", ""),
                "text": ""
            })
    def handle_endtag(self, tag: str):
        if tag == "title" and self._in_title:
            self.title = "".join(self._current_data).strip()
            self._in_title = False
        elif tag == "h1" and self._in_h1:
            self.h1_tags.append("".join(self._current_data).strip())
            self._in_h1 = False
        elif tag == "h2" and self._in_h2:
            self.h2_tags.append("".join(self._current_data).strip())
            self._in_h2 = False
        elif tag == "h3" and self._in_h3:
            self.h3_tags.append("".join(self._current_data).strip())
            self._in_h3 = False
        self._current_tag = None
    def handle_data(self, data: str):
        if self._in_title or self._in_h1 or self._in_h2 or self._in_h3:
            self._current_data.append(data)
        if self._current_tag == "a" and self.links:
            self.links[-1]["text"] += data
        if self._current_tag not in ["script", "style", "head"]:
            self.text_content += data
 class ContentRuleEngine:
    """Validates HTML content against universal rules and CORA targets"""
    def __init__(self, config: Config):
        self.config = config
        self.universal_rules = config.get("content_rules.universal", {})
        self.cora_config = config.get("content_rules.cora_validation", {})
    def validate(self, html_content: str, project: Project) -> ValidationResult:
        """
        Validate HTML content against all rules
        Args:
            html_content: Generated HTML content
            project: Project with CORA targets
        Returns:
            ValidationResult with errors and warnings
        """
        result = ValidationResult(passed=True)
        parser = ContentHTMLParser()
        parser.feed(html_content)
        self._validate_universal_rules(parser, project, result)
        if self.cora_config.get("enabled", True):
            self._validate_cora_targets(parser, project, result)
        return result
    def _validate_universal_rules(self, parser: ContentHTMLParser, project: Project, result: ValidationResult):
        """Validate universal hard rules that apply to all content"""
        word_count = len(parser.text_content.split())
        min_length = self.universal_rules.get("min_content_length", 0)
        max_length = self.universal_rules.get("max_content_length", float('inf'))
        if word_count < min_length:
            result.add_error(
                "min_content_length",
                f"Content is too short",
                expected=f">={min_length} words",
                actual=f"{word_count} words"
            )
        if word_count > max_length:
            result.add_error(
                "max_content_length",
                f"Content is too long",
                expected=f"<={max_length} words",
                actual=f"{word_count} words"
            )
        if self.universal_rules.get("title_exact_match_required", False):
            if not parser.title or not self._contains_keyword(parser.title, project.main_keyword):
                result.add_error(
                    "title_exact_match_required",
                    "Title must contain main keyword",
                    expected=project.main_keyword,
                    actual=parser.title or "(no title)"
                )
        if self.universal_rules.get("h1_exact_match_required", False):
            if not parser.h1_tags or not any(self._contains_keyword(h1, project.main_keyword) for h1 in parser.h1_tags):
                result.add_error(
                    "h1_exact_match_required",
                    "At least one H1 must contain main keyword",
                    expected=project.main_keyword,
                    actual=parser.h1_tags
                )
        h2_min = self.universal_rules.get("h2_exact_match_min", 0)
        h2_with_keyword = sum(1 for h2 in parser.h2_tags if self._contains_keyword(h2, project.main_keyword))
        if h2_with_keyword < h2_min:
            result.add_error(
                "h2_exact_match_min",
                f"Not enough H2 tags with main keyword",
                expected=f">={h2_min}",
                actual=h2_with_keyword
            )
        h3_min = self.universal_rules.get("h3_exact_match_min", 0)
        h3_with_keyword = sum(1 for h3 in parser.h3_tags if self._contains_keyword(h3, project.main_keyword))
        if h3_with_keyword < h3_min:
            result.add_error(
                "h3_exact_match_min",
                f"Not enough H3 tags with main keyword",
                expected=f">={h3_min}",
                actual=h3_with_keyword
            )
        if self.universal_rules.get("faq_section_required", False):
            if not self._has_faq_section(parser.h2_tags, parser.h3_tags):
                result.add_error(
                    "faq_section_required",
                    "Content must include an FAQ section"
                )
        if self.universal_rules.get("image_alt_text_keyword_required", False):
            for img in parser.images:
                if not self._contains_keyword(img.get("alt", ""), project.main_keyword):
                    result.add_error(
                        "image_alt_text_keyword_required",
                        f"Image alt text missing main keyword",
                        expected=project.main_keyword,
                        actual=img.get("alt", "(no alt)")
                    )
        if self.universal_rules.get("image_alt_text_entity_required", False) and project.entities:
            for img in parser.images:
                alt_text = img.get("alt", "")
                has_entity = any(self._contains_keyword(alt_text, entity) for entity in project.entities)
                if not has_entity:
                    result.add_error(
                        "image_alt_text_entity_required",
                        f"Image alt text missing entities",
                        expected=f"One of: {project.entities[:3]}",
                        actual=alt_text or "(no alt)"
                    )
    def _validate_cora_targets(self, parser: ContentHTMLParser, project: Project, result: ValidationResult):
        """Validate content against CORA-specific targets"""
        is_tier_1 = project.tier == 1
        round_down = self.cora_config.get("round_averages_down", True)
        counts = self._count_keyword_entities(parser, project)
        checks = [
            ("h1_exact", counts["h1_exact"], project.h1_exact, "H1 tags with exact keyword match"),
            ("h1_related_search", counts["h1_related_search"], project.h1_related_search, "H1 tags with related searches"),
            ("h1_entities", counts["h1_entities"], project.h1_entities, "H1 tags with entities"),
            ("h2_total", len(parser.h2_tags), project.h2_total, "Total H2 tags"),
            ("h2_exact", counts["h2_exact"], project.h2_exact, "H2 tags with exact keyword match"),
            ("h2_related_search", counts["h2_related_search"], project.h2_related_search, "H2 tags with related searches"),
            ("h2_entities", counts["h2_entities"], project.h2_entities, "H2 tags with entities"),
            ("h3_total", len(parser.h3_tags), project.h3_total, "Total H3 tags"),
            ("h3_exact", counts["h3_exact"], project.h3_exact, "H3 tags with exact keyword match"),
            ("h3_related_search", counts["h3_related_search"], project.h3_related_search, "H3 tags with related searches"),
            ("h3_entities", counts["h3_entities"], project.h3_entities, "H3 tags with entities"),
        ]
        for rule_name, actual, target, description in checks:
            if target is None:
                continue
            expected = int(target) if round_down else round(target)
            if actual < expected:
                message = f"{description} below CORA target"
                if is_tier_1:
                    result.add_error(rule_name, message, expected=expected, actual=actual)
                else:
                    result.add_warning(rule_name, message, expected=expected, actual=actual)
    def _count_keyword_entities(self, parser: ContentHTMLParser, project: Project) -> Dict[str, int]:
        """Count occurrences of keywords, entities, and related searches in headings"""
        entities = project.entities or []
        related_searches = project.related_searches or []
        return {
            "h1_exact": sum(1 for h1 in parser.h1_tags if self._contains_keyword(h1, project.main_keyword)),
            "h1_related_search": sum(1 for h1 in parser.h1_tags if self._contains_any(h1, related_searches)),
            "h1_entities": sum(1 for h1 in parser.h1_tags if self._contains_any(h1, entities)),
            "h2_exact": sum(1 for h2 in parser.h2_tags if self._contains_keyword(h2, project.main_keyword)),
            "h2_related_search": sum(1 for h2 in parser.h2_tags if self._contains_any(h2, related_searches)),
            "h2_entities": sum(1 for h2 in parser.h2_tags if self._contains_any(h2, entities)),
            "h3_exact": sum(1 for h3 in parser.h3_tags if self._contains_keyword(h3, project.main_keyword)),
            "h3_related_search": sum(1 for h3 in parser.h3_tags if self._contains_any(h3, related_searches)),
            "h3_entities": sum(1 for h3 in parser.h3_tags if self._contains_any(h3, entities)),
        }
    def _contains_keyword(self, text: str, keyword: str) -> bool:
        """Check if text contains keyword (case-insensitive, word boundary)"""
        if not text or not keyword:
            return False
        pattern = r'\b' + re.escape(keyword.lower()) + r'\b'
        return bool(re.search(pattern, text.lower()))
    def _contains_any(self, text: str, terms: List[str]) -> bool:
        """Check if text contains any of the terms"""
        if not text or not terms:
            return False
        return any(self._contains_keyword(text, term) for term in terms)
    def _has_faq_section(self, h2_tags: List[str], h3_tags: List[str]) -> bool:
        """Check if content has an FAQ section"""
        faq_patterns = [r'\bfaq\b', r'\bfrequently asked questions\b', r'\bq&a\b', r'\bquestions\b']
        for h2 in h2_tags:
            if any(re.search(pattern, h2.lower()) for pattern in faq_patterns):
                return True
        for h3 in h3_tags:
            if any(re.search(pattern, h3.lower()) for pattern in faq_patterns):
                return True
        return False
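
The heading checks above rely on a case-insensitive, whole-word regex match. A minimal standalone sketch of that technique follows; the free functions `contains_keyword` and `contains_any` are illustrative stand-ins for the private methods in the diff, not part of the repository. It also shows one limitation of `\b`: a keyword that ends in a non-word character (such as `c++`) does not match, because no word boundary follows the final `+`.

```python
import re
from typing import List, Optional


def contains_keyword(text: str, keyword: str) -> bool:
    """Case-insensitive whole-word match, mirroring _contains_keyword in the diff."""
    if not text or not keyword:
        return False
    # re.escape keeps characters like '+' or '.' in the keyword literal
    pattern = r"\b" + re.escape(keyword.lower()) + r"\b"
    return bool(re.search(pattern, text.lower()))


def contains_any(text: str, terms: Optional[List[str]]) -> bool:
    """True if any term matches as a whole word (hypothetical helper name)."""
    if not text or not terms:
        return False
    return any(contains_keyword(text, term) for term in terms)


print(contains_keyword("Best Running Shoes for 2025", "running shoes"))  # True
print(contains_keyword("Shoestring budgets", "shoe"))                    # False: partial word
print(contains_keyword("Learn C++ fast", "c++"))                         # False: no \b after '+'
```

If keywords with trailing punctuation matter for a project, lookarounds such as `(?<!\w)` and `(?!\w)` are one possible replacement for `\b`; whether that is needed here is an open question, not something the commit addresses.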
--- a/src/generation/service.py
+++ b/src/generation/service.py
@ -1,388 +1,311 @@
 """
-Content generation service - orchestrates the three-stage AI generation pipeline
+Content generation service with three-stage pipeline
 """
-import time
+import re
 import json
 from html import unescape
 from pathlib import Path
-from typing import Dict, Any, Optional, Tuple
+from datetime import datetime
-from src.database.models import Project, GeneratedContent
+from typing import Optional, Tuple
-from src.database.repositories import GeneratedContentRepository
+from src.generation.ai_client import AIClient, PromptManager
-from src.generation.ai_client import AIClient, AIClientError
+from src.database.repositories import ProjectRepository, GeneratedContentRepository
 from src.generation.validator import StageValidator
 from src.generation.augmenter import ContentAugmenter
 from src.generation.rule_engine import ContentRuleEngine
 from src.core.config import Config, get_config
 from sqlalchemy.orm import Session
-class GenerationError(Exception):
+class ContentGenerator:
-    """Content generation error"""
+    """Main service for generating content through AI pipeline"""
    pass
 class ContentGenerationService:
    """Service for AI-powered content generation with validation"""
    MAX_H2_TOTAL = 5
    MAX_H3_TOTAL = 13
    def __init__(
        self,
-        session: Session,
+        ai_client: AIClient,
-        config: Optional[Config] = None,
+        prompt_manager: PromptManager,
-        ai_client: Optional[AIClient] = None
+        project_repo: ProjectRepository,
        content_repo: GeneratedContentRepository
    ):
        self.ai_client = ai_client
        self.prompt_manager = prompt_manager
        self.project_repo = project_repo
        self.content_repo = content_repo
    def generate_title(self, project_id: int, debug: bool = False) -> str:
        """
-        Initialize service
+        Generate SEO-optimized title
        Args:
-            session: Database session
+            project_id: Project ID to generate title for
-            config: Application configuration
+            debug: If True, save response to debug_output/
            ai_client: AI client (creates new if None)
        """
        self.session = session
        self.config = config or get_config()
        self.ai_client = ai_client or AIClient(self.config)
        self.content_repo = GeneratedContentRepository(session)
        self.rule_engine = ContentRuleEngine(self.config)
        self.validator = StageValidator(self.config, self.rule_engine)
        self.augmenter = ContentAugmenter(ai_client=self.ai_client)
        self.prompts_dir = Path(__file__).parent / "prompts"
    def generate_article(
        self,
        project: Project,
        tier: int,
        title_model: str,
        outline_model: str,
        content_model: str,
        max_retries: int = 3,
        progress_callback: Optional[callable] = None,
        debug: bool = False
    ) -> GeneratedContent:
        """
        Generate complete article through three-stage pipeline
        Args:
            project: Project with CORA data
            tier: Tier level
            title_model: Model for title generation
            outline_model: Model for outline generation
            content_model: Model for content generation
            max_retries: Max retry attempts per stage
            progress_callback: Optional callback for progress updates
            debug: Enable debug output
        Returns:
-            GeneratedContent record with completed article
+            Generated title string
        Raises:
            GenerationError: If generation fails after all retries
        """
-        start_time = time.time()
+        project = self.project_repo.get_by_id(project_id)
        if not project:
            raise ValueError(f"Project {project_id} not found")
-        content_record = self.content_repo.create(project.id, tier)
+        entities_str = ", ".join(project.entities or [])
-        content_record.title_model = title_model
+        related_str = ", ".join(project.related_searches or [])
        content_record.outline_model = outline_model
        content_record.content_model = content_model
        self.content_repo.update(content_record)
-        try:
+        system_msg, user_prompt = self.prompt_manager.format_prompt(
-            title = self._generate_title(project, content_record, title_model, max_retries)
+            "title_generation",
-            
+            keyword=project.main_keyword,
            content_record.generation_stage = "outline"
            self.content_repo.update(content_record)
            outline = self._generate_outline(project, title, content_record, outline_model, max_retries)
            content_record.generation_stage = "content"
            self.content_repo.update(content_record)
            html_content = self._generate_content(
                project, title, outline, content_record, content_model, max_retries
            )
            content_record.status = "completed"
            content_record.generation_duration = time.time() - start_time
            self.content_repo.update(content_record)
            return content_record
        except Exception as e:
            content_record.status = "failed"
            content_record.error_message = str(e)
            content_record.generation_duration = time.time() - start_time
            self.content_repo.update(content_record)
            raise GenerationError(f"Article generation failed: {e}")
    def _generate_title(
        self,
        project: Project,
        content_record: GeneratedContent,
        model: str,
        max_retries: int
    ) -> str:
        """Generate and validate title"""
        prompt_template = self._load_prompt("title_generation.json")
        entities_str = ", ".join(project.entities[:10]) if project.entities else "N/A"
        searches_str = ", ".join(project.related_searches[:10]) if project.related_searches else "N/A"
        prompt = prompt_template["user_template"].format(
            main_keyword=project.main_keyword,
            word_count=project.word_count,
            entities=entities_str,
-            related_searches=searches_str
+            related_searches=related_str
        )
-        for attempt in range(1, max_retries + 1):
+        title = self.ai_client.generate_completion(
-            content_record.title_attempts = attempt
+            prompt=user_prompt,
-            self.content_repo.update(content_record)
+            system_message=system_msg,
-            
+            max_tokens=100,
            try:
                title = self.ai_client.generate(
                    prompt=prompt,
                    model=model,
            temperature=0.7
        )
-                is_valid, errors = self.validator.validate_title(title, project)
+        title = title.strip().strip('"').strip("'")
        if debug:
            self._save_debug_output(
                project_id, "title", title, "txt"
            )
                if is_valid:
                    content_record.title = title
                    self.content_repo.update(content_record)
        return title
-                if attempt < max_retries:
+    def generate_outline(
                    prompt += f"\n\nPrevious attempt failed: {', '.join(errors)}. Please fix these issues."
            except AIClientError as e:
                if attempt == max_retries:
                    raise GenerationError(f"Title generation failed after {max_retries} attempts: {e}")
        raise GenerationError(f"Title validation failed after {max_retries} attempts")
    def _generate_outline(
        self, 
-        project: Project,
+        project_id: int, 
        title: str, 
-        content_record: GeneratedContent,
+        min_h2: int,
-        model: str,
+        max_h2: int,
-        max_retries: int
+        min_h3: int,
-    ) -> Dict[str, Any]:
+        max_h3: int,
-        """Generate and validate outline"""
+        debug: bool = False
-        prompt_template = self._load_prompt("outline_generation.json")
+    ) -> dict:
        """
        Generate article outline in JSON format
-        entities_str = ", ".join(project.entities[:20]) if project.entities else "N/A"
+        Args:
-        searches_str = ", ".join(project.related_searches[:20]) if project.related_searches else "N/A"
+            project_id: Project ID
            title: Article title
            min_h2: Minimum H2 headings
            max_h2: Maximum H2 headings
            min_h3: Minimum H3 subheadings total
            max_h3: Maximum H3 subheadings total
            debug: If True, save response to debug_output/
-        h2_total = int(project.h2_total) if project.h2_total else 5
+        Returns:
-        h2_exact = int(project.h2_exact) if project.h2_exact else 1
+            Outline dictionary: {"outline": [{"h2": "...", "h3": ["...", "..."]}]}
        h2_related = int(project.h2_related_search) if project.h2_related_search else 1
        h2_entities = int(project.h2_entities) if project.h2_entities else 2
-        h3_total = int(project.h3_total) if project.h3_total else 10
+        Raises:
-        h3_exact = int(project.h3_exact) if project.h3_exact else 1
+            ValueError: If outline doesn't meet minimum requirements
-        h3_related = int(project.h3_related_search) if project.h3_related_search else 2
+        """
-        h3_entities = int(project.h3_entities) if project.h3_entities else 3
+        project = self.project_repo.get_by_id(project_id)
        if not project:
            raise ValueError(f"Project {project_id} not found")
-        if self.config.content_rules.cora_validation.round_averages_down:
+        entities_str = ", ".join(project.entities or [])
-            h2_total = int(h2_total)
+        related_str = ", ".join(project.related_searches or [])
            h3_total = int(h3_total)
-        h2_total = min(h2_total, self.MAX_H2_TOTAL)
+        system_msg, user_prompt = self.prompt_manager.format_prompt(
-        h3_total = min(h3_total, self.MAX_H3_TOTAL)
+            "outline_generation",
        prompt = prompt_template["user_template"].format(
            title=title,
-            main_keyword=project.main_keyword,
+            keyword=project.main_keyword,
-            word_count=project.word_count,
+            min_h2=min_h2,
-            h2_total=h2_total,
+            max_h2=max_h2,
-            h2_exact=h2_exact,
+            min_h3=min_h3,
-            h2_related_search=h2_related,
+            max_h3=max_h3,
            h2_entities=h2_entities,
            h3_total=h3_total,
            h3_exact=h3_exact,
            h3_related_search=h3_related,
            h3_entities=h3_entities,
            entities=entities_str,
-            related_searches=searches_str
+            related_searches=related_str
        )
-        for attempt in range(1, max_retries + 1):
+        outline_json = self.ai_client.generate_completion(
-            content_record.outline_attempts = attempt
+            prompt=user_prompt,
-            self.content_repo.update(content_record)
+            system_message=system_msg,
            max_tokens=2000,
            temperature=0.7,
            json_mode=True
        )
        # Save and echo the raw response only when debugging
        if debug:
            self._save_debug_output(project_id, "outline_raw", outline_json, "txt")
            print(f"[DEBUG] Raw outline response: {outline_json}")
        try:
-                outline_json_str = self.ai_client.generate_json(
+            outline = json.loads(outline_json)
-                    prompt=prompt,
+        except json.JSONDecodeError as e:
-                    model=model,
+            if debug:
-                    temperature=0.7,
+                self._save_debug_output(project_id, "outline_error", outline_json, "txt")
-                    max_tokens=2000
+            raise ValueError(f"Failed to parse outline JSON: {e}\nResponse: {outline_json[:500]}")
        if "outline" not in outline:
            if debug:
                self._save_debug_output(project_id, "outline_invalid", json.dumps(outline, indent=2), "json")
            raise ValueError(f"Outline missing 'outline' key. Got keys: {list(outline.keys())}\nContent: {outline}")
        h2_count = len(outline["outline"])
        h3_count = sum(len(section.get("h3", [])) for section in outline["outline"])
        if h2_count < min_h2:
            raise ValueError(f"Outline has {h2_count} H2s, minimum is {min_h2}")
        if h3_count < min_h3:
            raise ValueError(f"Outline has {h3_count} H3s, minimum is {min_h3}")
        if debug:
            self._save_debug_output(
                project_id, "outline", json.dumps(outline, indent=2), "json"
            )
                if isinstance(outline_json_str, str):
                    outline = json.loads(outline_json_str)
                else:
                    outline = outline_json_str
                is_valid, errors, missing = self.validator.validate_outline(outline, project)
                if is_valid:
                    content_record.outline = json.dumps(outline)
                    self.content_repo.update(content_record)
        return outline
-                if attempt < max_retries:
+    def generate_content(
                    if missing:
                        augmented_outline, aug_log = self.augmenter.augment_outline(
                            outline, missing, project.main_keyword,
                            project.entities or [], project.related_searches or []
                        )
                        is_valid_aug, errors_aug, _ = self.validator.validate_outline(
                            augmented_outline, project
                        )
                        if is_valid_aug:
                            content_record.outline = json.dumps(augmented_outline)
                            content_record.augmented = True
                            content_record.augmentation_log = aug_log
                            self.content_repo.update(content_record)
                            return augmented_outline
                    prompt += f"\n\nPrevious attempt failed: {', '.join(errors)}. Please meet ALL CORA targets exactly."
            except (AIClientError, json.JSONDecodeError) as e:
                if attempt == max_retries:
                    raise GenerationError(f"Outline generation failed after {max_retries} attempts: {e}")
        raise GenerationError(f"Outline validation failed after {max_retries} attempts")
    def _generate_content(
        self, 
-        project: Project,
+        project_id: int, 
        title: str, 
-        outline: Dict[str, Any],
+        outline: dict,
-        content_record: GeneratedContent,
+        min_word_count: int,
-        model: str,
+        max_word_count: int,
-        max_retries: int
+        debug: bool = False
    ) -> str:
-        """Generate and validate full HTML content"""
+        """
-        prompt_template = self._load_prompt("content_generation.json")
+        Generate full article HTML fragment
-        outline_str = self._format_outline_for_prompt(outline)
+        Args:
-        entities_str = ", ".join(project.entities[:30]) if project.entities else "N/A"
+            project_id: Project ID
-        searches_str = ", ".join(project.related_searches[:30]) if project.related_searches else "N/A"
+            title: Article title
            outline: Article outline dict
            min_word_count: Minimum word count for guidance
            max_word_count: Maximum word count for guidance
            debug: If True, save response to debug_output/
-        prompt = prompt_template["user_template"].format(
+        Returns:
-            outline=outline_str,
+            HTML string with <h2>, <h3>, <p> tags
        """
        project = self.project_repo.get_by_id(project_id)
        if not project:
            raise ValueError(f"Project {project_id} not found")
        entities_str = ", ".join(project.entities or [])
        related_str = ", ".join(project.related_searches or [])
        outline_str = json.dumps(outline, indent=2)
        system_msg, user_prompt = self.prompt_manager.format_prompt(
            "content_generation",
            title=title,
-            main_keyword=project.main_keyword,
+            outline=outline_str,
-            word_count=project.word_count,
+            keyword=project.main_keyword,
            term_frequency=project.term_frequency or self.config.content_rules.universal.default_term_frequency,
            entities=entities_str,
-            related_searches=searches_str
+            related_searches=related_str,
            min_word_count=min_word_count,
            max_word_count=max_word_count
        )
-        for attempt in range(1, max_retries + 1):
+        content = self.ai_client.generate_completion(
-            content_record.content_attempts = attempt
+            prompt=user_prompt,
-            self.content_repo.update(content_record)
+            system_message=system_msg,
-            
+            max_tokens=8000,
-            try:
+            temperature=0.7
                html_content = self.ai_client.generate(
                    prompt=prompt,
                    model=model,
                    temperature=0.7,
                    max_tokens=self.config.ai_service.max_tokens
        )
-                is_valid, validation_result = self.validator.validate_content(html_content, project)
+        content = content.strip()
-                content_record.validation_errors = len(validation_result.errors)
+        if debug:
-                content_record.validation_warnings = len(validation_result.warnings)
+            self._save_debug_output(
-                content_record.validation_report = validation_result.to_dict()
+                project_id, "content", content, "html"
                self.content_repo.update(content_record)
                if is_valid:
                    content_record.content = html_content
                    word_count = len(html_content.split())
                    content_record.word_count = word_count
                    self.content_repo.update(content_record)
                    return html_content
                if attempt < max_retries:
                    missing = self.validator.extract_missing_elements(validation_result, project, html_content)
                    has_word_deficit = missing.get("word_count_deficit", 0) > 0
                    if has_word_deficit:
                        try:
                            augmented_html, aug_log = self.augmenter.augment_content_with_ai(
                                html_content, missing, project.main_keyword,
                                project.entities or [], project.related_searches or [],
                                model=model
            )
-                            is_valid_aug, validation_result_aug = self.validator.validate_content(
+        return content
-                                augmented_html, project
+    
    def validate_word_count(self, content: str, min_words: int, max_words: int) -> Tuple[bool, int]:
        """
        Validate content word count
        Args:
            content: HTML content string
            min_words: Minimum word count
            max_words: Maximum word count
        Returns:
            Tuple of (is_valid, actual_count)
        """
        word_count = self.count_words(content)
        is_valid = min_words <= word_count <= max_words
        return is_valid, word_count
    def count_words(self, html_content: str) -> int:
        """
        Count words in HTML content
        Args:
            html_content: HTML string
        Returns:
            Number of words
        """
        text = re.sub(r'<[^>]+>', '', html_content)
        text = unescape(text)
        words = text.split()
        return len(words)
    def augment_content(
        self, 
        content: str, 
        target_word_count: int,
        debug: bool = False,
        project_id: Optional[int] = None
    ) -> str:
        """
        Expand article content to meet minimum word count
        Args:
            content: Current HTML content
            target_word_count: Target word count
            debug: If True, save response to debug_output/
            project_id: Optional project ID for debug output
        Returns:
            Expanded HTML content
        """
        system_msg, user_prompt = self.prompt_manager.format_prompt(
            "content_augmentation",
            content=content,
            target_word_count=target_word_count
        )
-                            content_record.content = augmented_html
+        augmented = self.ai_client.generate_completion(
-                            content_record.augmented = True
+            prompt=user_prompt,
-                            existing_log = content_record.augmentation_log or {}
+            system_message=system_msg,
-                            existing_log["content_ai_augmentation"] = aug_log
+            max_tokens=8000,
-                            content_record.augmentation_log = existing_log
+            temperature=0.7
-                            content_record.validation_errors = len(validation_result_aug.errors)
+        )
                            content_record.validation_warnings = len(validation_result_aug.warnings)
                            content_record.validation_report = validation_result_aug.to_dict()
                            word_count = len(augmented_html.split())
                            content_record.word_count = word_count
                            self.content_repo.update(content_record)
-                            missing_after = self.validator.extract_missing_elements(validation_result_aug, project, augmented_html)
+        augmented = augmented.strip()
                            still_short = missing_after.get("word_count_deficit", 0) > 0
-                            if not still_short:
+        if debug and project_id is not None:
-                                return augmented_html
+            self._save_debug_output(
                project_id, "augmented", augmented, "html"
            )
-                            html_content = augmented_html
+        return augmented
                            validation_result = validation_result_aug
-                        except Exception as e:
+    def _save_debug_output(
-                            print(f"AI augmentation failed: {e}")
+        self, 
-                            error_summary = f"Word count too short. AI augmentation failed: {str(e)}"
+        project_id: int, 
-                            prompt += f"\n\nPrevious content failed validation: {error_summary}. Generate MORE content to meet the word count target."
+        stage: str, 
-                    else:
+        content: str, 
-                        content_record.content = html_content
+        extension: str,
-                        word_count = len(html_content.split())
+        tier: Optional[str] = None,
-                        content_record.word_count = word_count
+        article_num: Optional[int] = None
-                        self.content_repo.update(content_record)
+    ):
-                        return html_content
+        """Save debug output to file"""
        debug_dir = Path("debug_output")
        debug_dir.mkdir(exist_ok=True)
-            except AIClientError as e:
+        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
                if attempt == max_retries:
                    raise GenerationError(f"Content generation failed after {max_retries} attempts: {e}")
-        raise GenerationError(f"Content validation failed after {max_retries} attempts")
+        tier_part = f"_tier{tier}" if tier else ""
        article_part = f"_article{article_num}" if article_num else ""
-    def _load_prompt(self, filename: str) -> Dict[str, Any]:
+        filename = f"{stage}_project{project_id}{tier_part}{article_part}_{timestamp}.{extension}"
-        """Load prompt template from JSON file"""
+        filepath = debug_dir / filename
        prompt_path = self.prompts_dir / filename
        if not prompt_path.exists():
            raise GenerationError(f"Prompt template not found: {filename}")
-        with open(prompt_path, 'r', encoding='utf-8') as f:
+        with open(filepath, 'w', encoding='utf-8') as f:
-            return json.load(f)
+            f.write(content)
    def _format_outline_for_prompt(self, outline: Dict[str, Any]) -> str:
        """Format outline JSON into readable string for content prompt"""
        lines = [f"H1: {outline.get('h1', '')}"]
        for section in outline.get("sections", []):
            lines.append(f"\nH2: {section['h2']}")
            for h3 in section.get("h3s", []):
                lines.append(f"  H3: {h3}")
        return "\n".join(lines)
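
The new `generate_outline()` parses the model response as JSON, requires a top-level `outline` key, and enforces minimum H2/H3 counts. The sketch below isolates that validation step so it can be exercised without an AI client. The function name `check_outline` and the sample payload are assumptions made for illustration; only the `{"outline": [{"h2": ..., "h3": [...]}]}` shape and the `ValueError` behaviour come from the diff.

```python
import json


def check_outline(raw: str, min_h2: int, min_h3: int) -> dict:
    """Parse a raw outline response and enforce minimum heading counts."""
    try:
        outline = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"Failed to parse outline JSON: {e}") from e

    # Guard against valid JSON that is not the expected object shape
    if not isinstance(outline, dict) or "outline" not in outline:
        raise ValueError("Outline missing 'outline' key")

    sections = outline["outline"]
    h2_count = len(sections)
    # A section without an "h3" list contributes zero subheadings
    h3_count = sum(len(section.get("h3", [])) for section in sections)

    if h2_count < min_h2:
        raise ValueError(f"Outline has {h2_count} H2s, minimum is {min_h2}")
    if h3_count < min_h3:
        raise ValueError(f"Outline has {h3_count} H3s, minimum is {min_h3}")
    return outline


# Hypothetical sample response: 2 H2 sections, 3 H3 subheadings in total
sample = json.dumps({
    "outline": [
        {"h2": "Choosing a Trail Shoe", "h3": ["Fit", "Grip"]},
        {"h2": "Care and Maintenance", "h3": ["Cleaning"]},
    ]
})

result = check_outline(sample, min_h2=2, min_h3=3)
print(len(result["outline"]))  # 2
```

Note that, like the diffed method, this sketch checks only the minimums; `max_h2` and `max_h3` are passed to the prompt in the diff but are not enforced after parsing.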
--- a/tests/integration/test_generate_batch.py
+++ b/tests/integration/test_generate_batch.py
@ -0,0 +1,52 @@
 """
 Integration test for batch generation (stub)
 """
 import pytest
 from unittest.mock import Mock
 from src.generation.batch_processor import BatchProcessor
 from src.generation.service import ContentGenerator
 def test_batch_processor_initialization():
    """Test BatchProcessor can be initialized"""
    mock_generator = Mock(spec=ContentGenerator)
    mock_content_repo = Mock()
    mock_project_repo = Mock()
    processor = BatchProcessor(
        content_generator=mock_generator,
        content_repo=mock_content_repo,
        project_repo=mock_project_repo
    )
    assert processor is not None
    assert processor.stats["total_jobs"] == 0
    assert processor.stats["processed_jobs"] == 0
 def test_batch_processor_stats_initialization():
    """Test BatchProcessor initializes stats correctly"""
    mock_generator = Mock(spec=ContentGenerator)
    mock_content_repo = Mock()
    mock_project_repo = Mock()
    processor = BatchProcessor(
        content_generator=mock_generator,
        content_repo=mock_content_repo,
        project_repo=mock_project_repo
    )
    expected_keys = [
        "total_jobs",
        "processed_jobs",
        "total_articles",
        "generated_articles",
        "augmented_articles",
        "failed_articles"
    ]
    for key in expected_keys:
        assert key in processor.stats
        assert processor.stats[key] == 0
--- a/tests/unit/test_content_generator.py
+++ b/tests/unit/test_content_generator.py
@ -0,0 +1,95 @@
 """
 Unit tests for ContentGenerator service
 """
 import pytest
 from src.generation.service import ContentGenerator
 def test_count_words_simple():
    """Test word count on simple text"""
    generator = ContentGenerator(None, None, None, None)
    html = "<p>This is a test with seven words</p>"
    count = generator.count_words(html)
    assert count == 7
 def test_count_words_with_headings():
    """Test word count with HTML headings"""
    generator = ContentGenerator(None, None, None, None)
    html = """
    <h2>Main Heading</h2>
    <p>This is a paragraph with some words.</p>
    <h3>Subheading</h3>
    <p>Another paragraph here.</p>
    """
    count = generator.count_words(html)
    assert count > 10
 def test_count_words_strips_html_tags():
    """Test that HTML tags are stripped before counting"""
    generator = ContentGenerator(None, None, None, None)
    html = "<p>Hello <strong>world</strong> this <em>is</em> a test</p>"
    count = generator.count_words(html)
    assert count == 6
 def test_validate_word_count_within_range():
    """Test validation when word count is within range"""
    generator = ContentGenerator(None, None, None, None)
    content = "<p>" + " ".join(["word"] * 100) + "</p>"
    is_valid, count = generator.validate_word_count(content, 50, 150)
    assert is_valid is True
    assert count == 100
 def test_validate_word_count_below_minimum():
    """Test validation when word count is below minimum"""
    generator = ContentGenerator(None, None, None, None)
    content = "<p>" + " ".join(["word"] * 30) + "</p>"
    is_valid, count = generator.validate_word_count(content, 50, 150)
    assert is_valid is False
    assert count == 30
 def test_validate_word_count_above_maximum():
    """Test validation when word count is above maximum"""
    generator = ContentGenerator(None, None, None, None)
    content = "<p>" + " ".join(["word"] * 200) + "</p>"
    is_valid, count = generator.validate_word_count(content, 50, 150)
    assert is_valid is False
    assert count == 200
 def test_count_words_empty_content():
    """Test word count on empty content"""
    generator = ContentGenerator(None, None, None, None)
    count = generator.count_words("")
    assert count == 0
 def test_count_words_only_tags():
    """Test word count on content with only HTML tags"""
    generator = ContentGenerator(None, None, None, None)
    html = "<div><p></p><span></span></div>"
    count = generator.count_words(html)
    assert count == 0
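
The tests above exercise `count_words()`, which strips tags with a regex, unescapes HTML entities, and splits on whitespace. The following standalone sketch reproduces that approach as a free function (an assumption for illustration; in the diff it is a method on `ContentGenerator`) and highlights one edge case the tests do not cover: because tags are replaced with an empty string, adjacent block elements with no whitespace between them merge into a single word.

```python
import re
from html import unescape


def count_words(html_content: str) -> int:
    """Count whitespace-separated words after removing tags and decoding entities."""
    text = re.sub(r"<[^>]+>", "", html_content)  # drop tags, keep inner text
    text = unescape(text)                        # e.g. "&amp;" becomes "&"
    return len(text.split())


print(count_words("<p>Hello <strong>world</strong> this <em>is</em> a test</p>"))  # 6
print(count_words("<p>Fish &amp; chips</p>"))                                       # 3
# Edge case: no whitespace between </h2> and <p>, so "Title" and "Body" fuse
print(count_words("<h2>Title</h2><p>Body text</p>"))                                # 2
```

If generated HTML is emitted without newlines between blocks, substituting a space instead of an empty string in `re.sub` would avoid the undercount, at the cost of splitting words that contain inline tags mid-word.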
--- a/tests/unit/test_job_config.py
+++ b/tests/unit/test_job_config.py
@ -1,208 +1,176 @@
 """
-Unit tests for job configuration
+Unit tests for JobConfig parser
 """
 import pytest
 import json
 import tempfile
 from pathlib import Path
-from src.generation.job_config import (
+from src.generation.job_config import JobConfig, TIER_DEFAULTS
    JobConfig, TierConfig, ModelConfig, AnchorTextConfig,
    FailureConfig, InterlinkingConfig
 )
-def test_model_config_creation():
+@pytest.fixture
-    """Test ModelConfig creation"""
+def temp_job_file(tmp_path):
-    config = ModelConfig(
+    """Create a temporary job file for testing"""
-        title="model1",
+    def _create_file(data):
-        outline="model2",
+        job_file = tmp_path / "test_job.json"
-        content="model3"
+        with open(job_file, 'w') as f:
-    )
+            json.dump(data, f)
-    
+        return str(job_file)
-    assert config.title == "model1"
+    return _create_file
    assert config.outline == "model2"
    assert config.content == "model3"
-def test_anchor_text_config_modes():
-    """Test different anchor text modes"""
-    default_config = AnchorTextConfig(mode="default")
-    assert default_config.mode == "default"
-    override_config = AnchorTextConfig(
-        mode="override",
-        custom_text=["anchor1", "anchor2"]
-    )
-    assert override_config.mode == "override"
-    assert len(override_config.custom_text) == 2
-    append_config = AnchorTextConfig(
-        mode="append",
-        additional_text=["extra"]
-    )
-    assert append_config.mode == "append"
-def test_tier_config_creation():
-    """Test TierConfig creation"""
-    models = ModelConfig(
-        title="model1",
-        outline="model2",
-        content="model3"
-    )
-    tier_config = TierConfig(
-        tier=1,
-        article_count=15,
-        models=models
-    )
-    assert tier_config.tier == 1
-    assert tier_config.article_count == 15
-    assert tier_config.validation_attempts == 3
-def test_job_config_creation():
-    """Test JobConfig creation"""
-    models = ModelConfig(
-        title="model1",
-        outline="model2",
-        content="model3"
-    )
-    tier = TierConfig(
-        tier=1,
-        article_count=10,
-        models=models
-    )
-    job = JobConfig(
-        job_name="Test Job",
-        project_id=1,
-        tiers=[tier]
-    )
-    assert job.job_name == "Test Job"
-    assert job.project_id == 1
-    assert len(job.tiers) == 1
-    assert job.get_total_articles() == 10
-def test_job_config_multiple_tiers():
-    """Test JobConfig with multiple tiers"""
-    models = ModelConfig(
-        title="model1",
-        outline="model2",
-        content="model3"
-    )
-    tier1 = TierConfig(tier=1, article_count=10, models=models)
-    tier2 = TierConfig(tier=2, article_count=20, models=models)
-    job = JobConfig(
-        job_name="Multi-Tier Job",
-        project_id=1,
-        tiers=[tier1, tier2]
-    )
-    assert job.get_total_articles() == 30
-def test_job_config_unique_tiers_validation():
-    """Test that tier numbers must be unique"""
-    models = ModelConfig(
-        title="model1",
-        outline="model2",
-        content="model3"
-    )
-    tier1 = TierConfig(tier=1, article_count=10, models=models)
-    tier2 = TierConfig(tier=1, article_count=20, models=models)
-    with pytest.raises(ValueError, match="unique"):
-        JobConfig(
-            job_name="Duplicate Tiers",
-            project_id=1,
-            tiers=[tier1, tier2]
-        )
-def test_job_config_from_file():
-    """Test loading JobConfig from JSON file"""
-    config_data = {
-        "job_name": "Test Job",
-        "project_id": 1,
-        "tiers": [
+def test_load_job_config_valid(temp_job_file):
+    """Test loading valid job file"""
+    data = {
+        "jobs": [
             {
-                "tier": 1,
-                "article_count": 5,
-                "models": {
-                    "title": "model1",
-                    "outline": "model2",
-                    "content": "model3"
+                "project_id": 1,
+                "tiers": {
+                    "tier1": {
+                        "count": 5
+                    }
                 }
             }
         ]
     }
-    with tempfile.NamedTemporaryFile(mode='w', suffix='.json', delete=False) as f:
-        json.dump(config_data, f)
-        temp_path = f.name
-    try:
-        job = JobConfig.from_file(temp_path)
-        assert job.job_name == "Test Job"
-        assert job.project_id == 1
-        assert len(job.tiers) == 1
-    finally:
-        Path(temp_path).unlink()
+    job_file = temp_job_file(data)
+    config = JobConfig(job_file)
+    assert len(config.get_jobs()) == 1
+    assert config.get_jobs()[0].project_id == 1
+    assert "tier1" in config.get_jobs()[0].tiers
-def test_job_config_to_file():
-    """Test saving JobConfig to JSON file"""
-    models = ModelConfig(
-        title="model1",
-        outline="model2",
-        content="model3"
-    )
-    tier = TierConfig(tier=1, article_count=5, models=models)
-    job = JobConfig(
-        job_name="Test Job",
-        project_id=1,
-        tiers=[tier]
-    )
-    with tempfile.NamedTemporaryFile(mode='w', suffix='.json', delete=False) as f:
-        temp_path = f.name
-    try:
-        job.to_file(temp_path)
-        assert Path(temp_path).exists()
-        loaded_job = JobConfig.from_file(temp_path)
-        assert loaded_job.job_name == job.job_name
-        assert loaded_job.project_id == job.project_id
-    finally:
-        Path(temp_path).unlink()
+def test_tier_defaults_applied(temp_job_file):
+    """Test defaults applied when not in job file"""
+    data = {
+        "jobs": [
+            {
+                "project_id": 1,
+                "tiers": {
+                    "tier1": {
+                        "count": 3
+                    }
+                }
+            }
+        ]
+    }
+    job_file = temp_job_file(data)
+    config = JobConfig(job_file)
+    job = config.get_jobs()[0]
+    tier1_config = job.tiers["tier1"]
+    assert tier1_config.count == 3
+    assert tier1_config.min_word_count == TIER_DEFAULTS["tier1"]["min_word_count"]
+    assert tier1_config.max_word_count == TIER_DEFAULTS["tier1"]["max_word_count"]
-def test_interlinking_config_validation():
-    """Test InterlinkingConfig validation"""
-    config = InterlinkingConfig(
-        links_per_article_min=2,
-        links_per_article_max=4
-    )
-    assert config.links_per_article_min == 2
-    assert config.links_per_article_max == 4
+def test_custom_values_override_defaults(temp_job_file):
+    """Test custom values override defaults"""
+    data = {
+        "jobs": [
+            {
+                "project_id": 1,
+                "tiers": {
+                    "tier1": {
+                        "count": 5,
+                        "min_word_count": 3000,
+                        "max_word_count": 3500
+                    }
+                }
+            }
+        ]
+    }
+    job_file = temp_job_file(data)
+    config = JobConfig(job_file)
+    job = config.get_jobs()[0]
+    tier1_config = job.tiers["tier1"]
+    assert tier1_config.min_word_count == 3000
+    assert tier1_config.max_word_count == 3500
-def test_failure_config_defaults():
-    """Test FailureConfig default values"""
-    config = FailureConfig()
-    assert config.max_consecutive_failures == 5
-    assert config.skip_on_failure is True
+def test_multiple_jobs_in_file(temp_job_file):
+    """Test parsing file with multiple jobs"""
+    data = {
+        "jobs": [
+            {
+                "project_id": 1,
+                "tiers": {"tier1": {"count": 5}}
+            },
+            {
+                "project_id": 2,
+                "tiers": {"tier2": {"count": 10}}
+            }
+        ]
+    }
+    job_file = temp_job_file(data)
+    config = JobConfig(job_file)
+    jobs = config.get_jobs()
+    assert len(jobs) == 2
+    assert jobs[0].project_id == 1
+    assert jobs[1].project_id == 2
+def test_multiple_tiers_in_job(temp_job_file):
+    """Test job with multiple tiers"""
+    data = {
+        "jobs": [
+            {
+                "project_id": 1,
+                "tiers": {
+                    "tier1": {"count": 5},
+                    "tier2": {"count": 10},
+                    "tier3": {"count": 15}
+                }
+            }
+        ]
+    }
+    job_file = temp_job_file(data)
+    config = JobConfig(job_file)
+    job = config.get_jobs()[0]
+    assert len(job.tiers) == 3
+    assert "tier1" in job.tiers
+    assert "tier2" in job.tiers
+    assert "tier3" in job.tiers
+def test_invalid_job_file_no_jobs_key(temp_job_file):
+    """Test error when jobs key is missing"""
+    data = {"invalid": []}
+    job_file = temp_job_file(data)
+    with pytest.raises(ValueError, match="must contain 'jobs'"):
+        JobConfig(job_file)
+def test_invalid_job_missing_project_id(temp_job_file):
+    """Test error when project_id is missing"""
+    data = {
+        "jobs": [
+            {
+                "tiers": {"tier1": {"count": 5}}
+            }
+        ]
+    }
+    job_file = temp_job_file(data)
+    with pytest.raises(ValueError, match="missing 'project_id'"):
+        JobConfig(job_file)
+def test_file_not_found():
+    """Test error when file doesn't exist"""
+    with pytest.raises(FileNotFoundError):
+        JobConfig("nonexistent_file.json")
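
For orientation, here is a minimal sketch of a parser that would satisfy the tests in `tests/unit/test_job_config.py`. It is not the code in `src/generation/job_config.py`. The names `JobConfig`, `TierConfig`, `Job`, `TIER_DEFAULTS` and `get_jobs()` come from the tests and the implementation summary, but the default word counts, the exact error wording, the `_parse_job` helper, and the rejection of unknown tier names or a missing `count` are assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical defaults for illustration only; the real TIER_DEFAULTS
# values are defined in src/generation/job_config.py.
TIER_DEFAULTS: Dict[str, Dict[str, int]] = {
    "tier1": {"min_word_count": 2000, "max_word_count": 2500},
    "tier2": {"min_word_count": 1500, "max_word_count": 2000},
    "tier3": {"min_word_count": 1000, "max_word_count": 1500},
}


@dataclass
class TierConfig:
    count: int
    min_word_count: int
    max_word_count: int


@dataclass
class Job:
    project_id: int
    tiers: Dict[str, TierConfig]


class JobConfig:
    """Parse a job file shaped like {"jobs": [{"project_id": 1, "tiers": {...}}]}."""

    def __init__(self, job_file_path: str):
        # open() raises FileNotFoundError when the path does not exist.
        with open(job_file_path, "r", encoding="utf-8") as f:
            data = json.load(f)
        if not isinstance(data, dict) or "jobs" not in data:
            raise ValueError("Job file must contain 'jobs' array")
        self._jobs: List[Job] = [
            self._parse_job(index, raw) for index, raw in enumerate(data["jobs"])
        ]

    @staticmethod
    def _parse_job(index: int, raw: dict) -> Job:
        if "project_id" not in raw:
            raise ValueError(f"Job {index} is missing 'project_id'")
        tiers: Dict[str, TierConfig] = {}
        for name, overrides in raw.get("tiers", {}).items():
            # Assumption: unknown tier names and a missing count are rejected.
            if name not in TIER_DEFAULTS:
                raise ValueError(f"Job {index} has unknown tier '{name}'")
            if "count" not in overrides:
                raise ValueError(f"Job {index} tier '{name}' is missing 'count'")
            # Values from the job file win over the tier defaults.
            merged = {**TIER_DEFAULTS[name], **overrides}
            tiers[name] = TierConfig(
                count=merged["count"],
                min_word_count=merged["min_word_count"],
                max_word_count=merged["max_word_count"],
            )
        return Job(project_id=raw["project_id"], tiers=tiers)

    def get_jobs(self) -> List[Job]:
        return list(self._jobs)
```

Merging with `{**defaults, **overrides}` keeps the override rule in one expression: any key present in the job file replaces the tier default, and every other key falls back to it. That is the behaviour `test_tier_defaults_applied` and `test_custom_values_override_defaults` check.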