# Story 2.2: Simplified AI Content Generation via Batch Job
## Status
Completed
## Story
**As a** User,
**I want** to control AI content generation via a batch job file that specifies word count and heading limits,
**so that** I can easily create topically relevant articles without unnecessary complexity or rigid validation.
## Acceptance Criteria
1. **Batch Job Control:** The `generate-batch` command accepts a JSON job file that specifies `min_word_count`, `max_word_count`, `max_h2_tags`, and `max_h3_tags` for each tier.
2. **Three-Stage Generation:** The system uses a simple three-stage pipeline:
* Generates a title using the project's SEO data.
* Generates an outline based on the title, SEO data, and the `max_h2_tags`/`max_h3_tags` limits from the job file.
* Generates the full article content based on the generated outline.
3. **SEO Data Integration:** The generation process for all stages is informed by the project's `keyword`, `entities`, and `related_searches` to ensure topical relevance.
4. **Word Count Validation:** After generation, the system validates the content *only* against the `min_word_count` and `max_word_count` specified in the job file.
5. **Simple Augmentation:** If the generated content is below `min_word_count`, the system makes **one** attempt to append additional content using a simple "expand on this article" prompt.
6. **Database Storage:** The final generated title, outline, and content are stored in the `GeneratedContent` table.
7. **CLI Execution:** The `generate-batch` command successfully runs the job, logs progress to the console, and indicates when the process is complete.
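To make AC 1 concrete, a job file might look like the sketch below. The four field names come from the acceptance criteria; the surrounding `tiers` array and the specific values are illustrative assumptions, not a finalized schema.

```json
{
  "tiers": [
    {
      "min_word_count": 800,
      "max_word_count": 1200,
      "max_h2_tags": 5,
      "max_h3_tags": 10
    },
    {
      "min_word_count": 400,
      "max_word_count": 700,
      "max_h2_tags": 3,
      "max_h3_tags": 6
    }
  ]
}
```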
## Dev Notes
* **Objective:** This story replaces the previous, overly complex stories 2.2 and 2.3. The goal is maximum simplicity and user control via the job file.
* **Key Change:** Remove the entire `ContentRuleEngine` and all strict CORA validation logic. The only validation required is a final word count check.
* **Job File is King:** All operational parameters (`min_word_count`, `max_word_count`, `max_h2_tags`, `max_h3_tags`) must be read from the job file for each tier being processed.
* **Augmentation:** Keep it simple. If `word_count < min_word_count`, make a single API call to the AI with a prompt like: "Please expand on the following article to add more detail and depth, ensuring you maintain the existing topical focus. Here is the article: {content}". Do not create a complex augmentation system.
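The validation-plus-augmentation rule above can be sketched as a single function. This is a minimal illustration, not the implementation: `ask_ai` stands in for whatever callable wraps the OpenRouter client, and the function names are hypothetical.

```python
def word_count(text: str) -> int:
    """Naive whitespace-based word count."""
    return len(text.split())


def validate_and_augment(content: str, min_wc: int, max_wc: int, ask_ai):
    """Validate content against the job file's word counts.

    If the article is under min_wc, make exactly ONE augmentation
    attempt by appending an AI-generated expansion, then re-check.
    Returns (final_content, within_limits).
    """
    if word_count(content) < min_wc:
        prompt = (
            "Please expand on the following article to add more detail "
            "and depth, ensuring you maintain the existing topical focus. "
            f"Here is the article: {content}"
        )
        # Single attempt only -- no retry loop, per the Dev Notes.
        content = content + "\n\n" + ask_ai(prompt)
    within = min_wc <= word_count(content) <= max_wc
    return content, within
```

If the single attempt still leaves the article outside the limits, the batch processor simply records the failure and moves on; no further augmentation passes are made.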
## Implementation Plan
See **[story-2.2-task-breakdown.md](story-2.2-task-breakdown.md)** for detailed implementation tasks.
The task breakdown is organized into 7 phases:
1. **Phase 1**: Data Model & Schema Design (GeneratedContent table, repositories, job file schema)
2. **Phase 2**: AI Client & Prompt Management (OpenRouter integration, prompt templates)
3. **Phase 3**: Core Generation Pipeline (title, outline, content generation with validation)
4. **Phase 4**: Batch Processing (job config parser, batch processor, error handling)
5. **Phase 5**: CLI Integration (generate-batch command, progress logging, debug output)
6. **Phase 6**: Testing & Validation (unit tests, integration tests, example job files)
7. **Phase 7**: Cleanup & Deprecation (remove old rule engine and validators)