diff --git a/docs/prd/epic-7-image-generation.md b/docs/prd/epic-7-image-generation.md
new file mode 100644
index 0000000..bc36ac7
--- /dev/null
+++ b/docs/prd/epic-7-image-generation.md
@@ -0,0 +1,41 @@
+# Epic 7: AI-Powered Image Generation
+
+## Epic Goal
+To automatically generate and insert high-quality images into generated articles, including hero images for all tiers and content images for T1 articles, using AI-powered image generation via fal.ai FLUX.1 schnell API.
+
+## Status
+- **Story 7.1**: 📋 PLANNING (TBD story points)
+
+## Stories
+
+### Story 7.1: Generate and Insert Images into Articles
+**Status:** 📋 PLANNING
+**Document:** [story-7.1-image-generation.md](../stories/story-7.1-image-generation.md)
+
+**As a developer**, I want to automatically generate hero images for all article tiers and content images for T1 articles, so that articles are visually enhanced without manual image creation.
+
+**Acceptance Criteria**
+* Hero images generated for all tiers (configurable per tier in job JSON)
+* Content images generated for T1 articles (1-3 images by default, configurable)
+* Images generated using fal.ai FLUX.1 schnell API
+* Theme prompt generated once per project using AI, reused for all images
+* Images uploaded to storage zone `images/` subdirectory
+* Images inserted into HTML content (hero after H1, content images after H2s)
+* Alt text includes 3 entities and 2 related searches
+* Graceful failure handling (continue without images if generation fails)
+* Job JSON configuration for image sizes and counts per tier
+
+**Implementation Notes**
+* Hero images: 1280x720 default, title truncated to 3-4 words, uppercase
+* Content images: 512x512 default, named with entity + related_search
+* Max 5 concurrent fal.ai API calls (same pattern as OpenRouter)
+* Images are optional enhancements (failures don't block article generation)
+* Theme prompt cached in Project model for reuse
+
+## Notes
+- Images enhance SEO and user engagement
+- fal.ai API provides fast generation (<3 seconds per image)
+- Theme consistency across project ensures visual coherence
+- Job JSON allows fine-grained control per tier
+- Technical debt: Image optimization, CDN caching, format conversion deferred
+
diff --git a/docs/stories/story-7.1-image-generation.md b/docs/stories/story-7.1-image-generation.md
new file mode 100644
index 0000000..c8b297a
--- /dev/null
+++ b/docs/stories/story-7.1-image-generation.md
@@ -0,0 +1,435 @@
+# Story 7.1: Generate and Insert Images into Articles
+
+## Status
+**PLANNING** - Ready for Implementation
+
+## Story
+**As a developer**, I want to automatically generate hero images for all article tiers and content images for T1 articles using AI-powered image generation, so that articles are visually enhanced without manual image creation.
+
+## Context
+- Epic 7 introduces AI-powered image generation to enhance article quality
+- fal.ai FLUX.1 schnell API provides fast text-to-image generation (<3 seconds)
+- Hero images needed for all tiers to improve visual appeal
+- Content images needed for T1 articles (1-3 images) to break up text
+- Images should be thematically consistent across a project
+- Images are optional enhancements (failures shouldn't block article generation)
+- Job JSON configuration allows per-tier customization
+
+## Acceptance Criteria
+
+### Core Image Generation Functionality
+- Hero images generated for all tiers (if enabled in job config)
+- Content images generated for T1 articles (1-3 by default, configurable)
+- Images generated using fal.ai FLUX.1 schnell API
+- Theme prompt generated once per project using AI, cached in database
+- Images uploaded to storage zone `images/` subdirectory
+- Images inserted into HTML content before saving to database
+- Alt text includes 3 random entities and 2 random related searches
+- Graceful failure handling (log errors, continue without images)
+
+### Hero Image Specifications
+- Default size: 1280x720 (configurable per tier in job JSON)
+- Title truncated to 3-4 words, converted to UPPERCASE
+- Prompt format: `{theme_prompt} Text: '{TITLE_UPPERCASE}' in clean simple uppercase letters, positioned in middle of image.`
+- Naming: `{main-keyword}.jpg` (slugified)
+- Placement: Inserted immediately after first `<h1>` tag
+- Format: Always JPG
+- Title truncation: Extract first 3-4 words from title, convert to UPPERCASE
+
+### Content Image Specifications
+- Default size: 512x512 (configurable per tier in job JSON)
+- Default count: T1 = 1-3 images (configurable min/max in job JSON)
+- Prompt format: `{theme_prompt} Focus on {entity} and {related_search}, professional illustration style.`
+- Naming: `{main-keyword}-{random-entity}-{random-related-search}.jpg` (all slugified)
+- Placement: Distributed after H2 sections (one image per H2, evenly distributed)
+- Format: Always JPG
+- Alt text: `alt="entity1 related_search1 entity2 related_search2 entity3"` (3 entities, 2 related searches)
+
+### Theme Prompt Generation
+- Generated once per project using AI (OpenRouter)
+- Cached in `Project.image_theme_prompt` field (TEXT, nullable)
+- Prompt template: `src/generation/prompts/image_theme_generation.json`
+- Reused for all images in the project for visual consistency
+- Example: "Professional industrial scene, modern manufacturing equipment, clean factory environment, high-quality industrial photography, dramatic lighting."
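The hero and content image specifications above (title truncation, prompt formats, file naming) can be sketched as a few pure helpers. This is a minimal sketch, not the planned implementation: the function names (`slugify`, `truncate_title`, `hero_prompt`, `content_prompt`, `hero_filename`, `content_filename`) are hypothetical, the slug rule (lowercase, non-alphanumerics collapsed to hyphens) is an assumption, and "3-4 words" is read here as "at most the first 4 words".

```python
import re


def slugify(text: str) -> str:
    """Assumed slug rule: lowercase, runs of non-alphanumerics become one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def truncate_title(title: str, max_words: int = 4) -> str:
    """Keep at most the first `max_words` words and convert to UPPERCASE."""
    return " ".join(title.split()[:max_words]).upper()


def hero_prompt(theme_prompt: str, title: str) -> str:
    """Fill the hero prompt format from the specification."""
    return (
        f"{theme_prompt} Text: '{truncate_title(title)}' in clean simple "
        "uppercase letters, positioned in middle of image."
    )


def content_prompt(theme_prompt: str, entity: str, related_search: str) -> str:
    """Fill the content prompt format from the specification."""
    return (
        f"{theme_prompt} Focus on {entity} and {related_search}, "
        "professional illustration style."
    )


def hero_filename(main_keyword: str) -> str:
    """`{main-keyword}.jpg`, slugified."""
    return f"{slugify(main_keyword)}.jpg"


def content_filename(main_keyword: str, entity: str, related_search: str) -> str:
    """`{main-keyword}-{entity}-{related-search}.jpg`, all parts slugified."""
    parts = (main_keyword, entity, related_search)
    return "-".join(slugify(p) for p in parts) + ".jpg"
```

Keeping these helpers free of I/O would let the unit tests for truncation and naming run without mocking fal.ai or storage.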
+
+### Job JSON Configuration
+- New `image_config` section per tier in job JSON
+- Structure:
+  ```json
+  {
+    "tiers": {
+      "tier1": {
+        "image_config": {
+          "hero": {
+            "width": 1280,
+            "height": 720
+          },
+          "content": {
+            "min_num_images": 1,
+            "max_num_images": 3,
+            "width": 512,
+            "height": 512
+          }
+        }
+      }
+    }
+  }
+  ```
+- Defaults if not specified:
+  - Hero: 1280x720 (all tiers)
+  - Content: T1 = 1-3, T2/T3 = 0-0 (disabled)
+- If `image_config` missing entirely: Use tier defaults
+- If `image_config` present but `hero`/`content` missing: Use defaults for missing keys
+
+### Database Schema Updates
+- Add `image_theme_prompt` (TEXT, nullable) to `Project` model
+- Add `hero_image_url` (TEXT, nullable) to `GeneratedContent` model
+- Add `content_images` (JSON, nullable) to `GeneratedContent` model
+  - Format: `["url1", "url2", "url3"]` (array of image URLs)
+- Migration script required
+
+### Image Storage
+- Images uploaded to `images/` subdirectory in storage zone
+- Check if file exists before upload (use existing if collision)
+- Return full public URL for database storage
+- Handle file collisions gracefully (use existing file)
+
+### Error Handling
+- Log errors with prompt used for debugging
+- Continue article generation if image generation fails
+- Images are "nice to have", not required
+- Max 5 concurrent fal.ai API calls (same pattern as OpenRouter)
+- Retry logic similar to OpenRouter (3 attempts with backoff)
+
+## Tasks / Subtasks
+
+### 1. Create Image Generation Module
+**Effort:** 3 story points
+
+- [ ] Create `src/generation/image_generator.py` module
+- [ ] Implement `ImageGenerator` class:
+  - `get_theme_prompt(project_id) -> str` (get or generate)
+  - `generate_hero_image(project_id, title, width, height) -> bytes`
+  - `generate_content_image(project_id, entity, related_search, width, height) -> bytes`
+- [ ] Integrate fal.ai API client (`fal-client` package)
+- [ ] Use `FAL_API_KEY` from `.env` for authentication (note: env var is `FAL_API_KEY`, not `FAL_KEY`)
+- [ ] Implement concurrency control (max 5 concurrent calls)
+- [ ] Implement retry logic (3 attempts with exponential backoff)
+- [ ] Log errors with prompts used
+
+### 2. Create Theme Prompt Generation
+**Effort:** 2 story points
+
+- [ ] Create `src/generation/prompts/image_theme_generation.json` prompt template
+- [ ] Implement theme prompt generation using AI (OpenRouter)
+- [ ] Cache theme prompt in `Project.image_theme_prompt` field
+- [ ] Generate only once per project (reuse for all articles)
+- [ ] Update `Project` model with `image_theme_prompt` field
+- [ ] Create migration script for database schema update
+
+### 3. Extend Job Configuration
+**Effort:** 2 story points
+
+- [ ] Update `TierConfig` in `src/generation/job_config.py`:
+  - Parse `image_config` from JSON
+  - Provide defaults if missing
+  - Validate min/max ranges
+- [ ] Support `hero` and `content` sub-configs
+- [ ] Handle missing `image_config` gracefully (use tier defaults)
+
+### 4. Implement Image Upload & Storage
+**Effort:** 2 story points
+
+- [ ] Extend `BunnyStorageClient` or create image upload method
+- [ ] Upload to `images/` subdirectory in storage zone
+- [ ] Check if file exists before upload (use existing if collision)
+- [ ] Generate full public URL for database storage
+- [ ] Handle file naming (slugify main_keyword, entities, related_searches)
+- [ ] Return URL for database storage
+
+### 5. Implement HTML Image Insertion
+**Effort:** 3 story points
+
+- [ ] Create `src/generation/image_injection.py` module
+- [ ] Implement `insert_hero_after_h1(html, hero_url, alt_text) -> str`
+- [ ] Implement `insert_content_images_after_h2s(html, image_urls, alt_texts) -> str`
+- [ ] Parse HTML to find H1/H2 tags
+- [ ] Distribute content images evenly across H2 sections
+- [ ] Generate alt text: 3 random entities + 2 random related searches
+- [ ] Preserve HTML structure and formatting
+
+### 6. Integrate into Article Generation Flow
+**Effort:** 3 story points
+
+- [ ] Update `BatchProcessor._generate_single_article()`:
+  - Generate hero image after title is available (all tiers)
+  - Generate content images after content generation (T1 only)
+  - Insert images into HTML before saving
+- [ ] Update `BatchProcessor._generate_single_article_thread_safe()` (same changes)
+- [ ] Handle tier-specific image config
+- [ ] Store image URLs in database (hero_image_url, content_images JSON)
+- [ ] Continue on image generation failure (log but don't block)
+
+### 7. Database Schema Updates
+**Effort:** 2 story points
+
+- [ ] Update `Project` model: Add `image_theme_prompt` field
+- [ ] Update `GeneratedContent` model:
+  - Add `hero_image_url` field
+  - Add `content_images` field (JSON)
+- [ ] Create migration script `scripts/migrate_add_image_fields.py`
+- [ ] Update repository interfaces if needed
+
+### 8. Title Truncation Logic
+**Effort:** 1 story point
+
+- [ ] Implement title truncation to 3-4 words (extract first 3-4 words)
+- [ ] Convert to UPPERCASE
+- [ ] Handle edge cases (short titles, special characters)
+- [ ] Use in hero image prompt generation
+
+### 9. Environment Variable Setup
+**Effort:** 1 story point
+
+- [ ] Add `FAL_API_KEY` to `env.example`
+- [ ] Document in README or setup guide
+- [ ] Validate key exists before image generation
+
+### 10. Unit Tests
+**Effort:** 3 story points
+
+- [ ] Test `ImageGenerator` class (mock fal.ai API)
+- [ ] Test theme prompt generation and caching
+- [ ] Test title truncation logic
+- [ ] Test HTML image insertion
+- [ ] Test job config parsing
+- [ ] Test error handling and retry logic
+- [ ] Achieve >80% code coverage
+
+### 11. Integration Tests
+**Effort:** 2 story points
+
+- [ ] Test end-to-end image generation for small batch
+- [ ] Test hero image generation (all tiers)
+- [ ] Test content image generation (T1 only)
+- [ ] Test image insertion into HTML
+- [ ] Test database storage of image URLs
+- [ ] Test file collision handling
+- [ ] Test graceful failure (API errors)
+
+## Technical Notes
+
+### fal.ai API Integration
+
+**Installation:**
+```bash
+pip install fal-client
+```
+
+**Environment Variable:**
+```bash
+FAL_API_KEY=your_fal_api_key_here
+```
+Note: The environment variable is `FAL_API_KEY` (not `FAL_KEY`).
+
+**API Usage:**
+```python
+import fal_client
+
+result = fal_client.subscribe(
+    "fal-ai/flux-1/schnell",
+    arguments={
+        "prompt": "Your prompt here",
+        "image_size": {"width": 1280, "height": 720},
+        "num_inference_steps": 4,
+        "guidance_scale": 3.5,
+        "output_format": "jpeg"
+    },
+    with_logs=True
+)
+
+# result["images"][0]["url"] contains the image URL
+# Download image from URL or use data URI if sync_mode=True
+```
+
+**Concurrency:**
+- Max 5 concurrent calls (same pattern as OpenRouter)
+- Use ThreadPoolExecutor or similar
+- Queue requests if limit exceeded
+
+### Theme Prompt Generation
+
+**Prompt Template:**
+```json
+{
+  "system_message": "You are an expert at creating visual style descriptions for professional images.",
+  "user_prompt": "Create a concise visual style description (2-3 sentences) for professional images related to: {main_keyword}\n\nEntities: {entities}\nRelated searches: {related_searches}\n\nReturn only the style description, focusing on visual elements like lighting, environment, color scheme, and overall aesthetic. This will be used as a base for all images in this project.\n\nExample format: 'Professional industrial scene, modern manufacturing equipment, clean factory environment, high-quality industrial photography, dramatic lighting.'"
+}
+```
+
+**Caching:**
+- Store in `Project.image_theme_prompt` field
+- Generate once per project
+- Reuse for all articles in project
+
+### Image Naming Convention
+
+**Hero Image:**
+- Format: `{main-keyword}.jpg`
+- Example: `shaft-machining.jpg`
+- Slugify main_keyword
+
+**Content Images:**
+- Format: `{main-keyword}-{entity}-{related-search}.jpg`
+- Example: `shaft-machining-cnc-lathe-precision-turning.jpg`
+- Pick random entity from `project.entities`
+- Pick random related_search from `project.related_searches`
+- Slugify all parts
+- Handle missing entities/related_searches gracefully
+
+### HTML Insertion Logic
+
+**Hero Image:**
+- Find first `<h1>` tag in HTML
+- Insert `<img>` tag immediately after closing `</h1>`
+- Format: `<img src="{hero_url}" alt="{alt_text}" />`
+
+**Content Images:**
+- Find all `<h2>` sections
+- Distribute images evenly across H2s
+- If 1 image: after first H2
+- If 2 images: after first and second H2
+- If 3 images: after first, second, third H2
+- Insert after closing `</h2>`, before next content
+
+**Alt Text Generation:**
+- Pick 3 random entities from `project.entities`
+- Pick 2 random related_searches from `project.related_searches`
+- Format: `alt="entity1 related_search1 entity2 related_search2 entity3"`
+- Handle missing entities/searches gracefully
+
+### Job JSON Example
+
+```json
+{
+  "project_id": 1,
+  "tiers": {
+    "tier1": {
+      "count": 5,
+      "image_config": {
+        "hero": {
+          "width": 1280,
+          "height": 720
+        },
+        "content": {
+          "min_num_images": 1,
+          "max_num_images": 3,
+          "width": 512,
+          "height": 512
+        }
+      }
+    },
+    "tier2": {
+      "count": 20,
+      "image_config": {
+        "hero": {
+          "width": 1280,
+          "height": 720
+        }
+      }
+    }
+  }
+}
+```
+
+### Database Schema Updates
+
+```sql
+-- Add theme prompt to projects
+ALTER TABLE projects ADD COLUMN image_theme_prompt TEXT NULL;
+
+-- Add image fields to generated_content
+ALTER TABLE generated_content ADD COLUMN hero_image_url TEXT NULL;
+ALTER TABLE generated_content ADD COLUMN content_images JSON NULL;
+```
+
+## Dependencies
+- fal.ai API account and `FAL_API_KEY`
+- `fal-client` Python package
+- OpenRouter API for theme prompt generation
+- Bunny.net storage for image hosting
+- Story 4.1: Deployment infrastructure (for image uploads)
+
+## Future Considerations
+- Image optimization (compression, WebP conversion)
+- CDN caching for images
+- Image format conversion (PNG, WebP options)
+- Custom image styles per project
+- Image regeneration on content updates
+- Image analytics (views, engagement)
+
+## Technical Debt Created
+- Image optimization deferred (current: full-size JPGs)
+- No CDN cache purging for images
+- No image format conversion (always JPG)
+- Theme prompt regeneration not implemented (manual update required)
+- Image deletion not implemented (orphaned images may accumulate)
+
+## Total Effort
+24 story points
+
+### Effort Breakdown
+1. Image Generation Module (3 points)
+2. Theme Prompt Generation (2 points)
+3. Job Configuration Extension (2 points)
+4. Image Upload & Storage (2 points)
+5. HTML Image Insertion (3 points)
+6. Article Generation Integration (3 points)
+7. Database Schema Updates (2 points)
+8. Title Truncation Logic (1 point)
+9. Environment Variable Setup (1 point)
+10. Unit Tests (3 points)
+11. Integration Tests (2 points)
+
+## Questions & Clarifications
+
+### Question 1: Image Format
+**Status:** ✓ RESOLVED
+
+Decision: Always JPG
+- Consistent format across all images
+- Good balance of quality and file size
+- Future: Could add format options in job config
+
+### Question 2: Hero Image Title Truncation
+**Status:** ✓ RESOLVED
+
+Decision: Truncate to 3-4 words, UPPERCASE
+- Prevents overly long text in images
+- Improves readability
+- Example: "Overview of Shaft Machining" → "OVERVIEW OF SHAFT MACHINING"
+
+### Question 3: Content Image Count
+**Status:** ✓ RESOLVED
+
+Decision: T1 default 1-3 images, configurable per tier
+- Job JSON allows fine-grained control
+- Defaults: T1 = 1-3, T2/T3 = 0-0 (disabled)
+- Can override in job config
+
+### Question 4: Image Failure Handling
+**Status:** ✓ RESOLVED
+
+Decision: Log errors, continue without images
+- Images are enhancements, not required
+- Article generation continues even if images fail
+- Log includes prompt used for debugging
+
+## Notes
+- Images enhance SEO and user engagement
+- fal.ai provides fast generation (<3 seconds per image)
+- Theme consistency ensures visual coherence across project
+- Job JSON provides flexibility for different use cases
+- Graceful degradation if image generation fails
+- All API keys from `.env` file only
+
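The per-tier defaults settled under Question 3, together with the fallback rules in the Job JSON Configuration section, can be sketched as a small resolver. This is an illustrative sketch only: `resolve_image_config` and the constant names are hypothetical and are not the actual `TierConfig` API in `src/generation/job_config.py`; treating a tier as a plain dict and raising `ValueError` on an invalid min/max range are also assumptions.

```python
from typing import Any, Dict, Tuple

# Defaults taken from the story: hero 1280x720 for all tiers, content 512x512.
HERO_DEFAULTS: Dict[str, int] = {"width": 1280, "height": 720}
CONTENT_SIZE_DEFAULTS: Dict[str, int] = {"width": 512, "height": 512}
# Content image counts: T1 = 1-3; any other tier falls back to 0-0 (disabled).
CONTENT_COUNT_DEFAULTS: Dict[str, Tuple[int, int]] = {"tier1": (1, 3)}


def resolve_image_config(tier_name: str, tier: Dict[str, Any]) -> Dict[str, Dict[str, int]]:
    """Merge a tier's optional `image_config` over the tier defaults."""
    raw = tier.get("image_config") or {}
    min_default, max_default = CONTENT_COUNT_DEFAULTS.get(tier_name, (0, 0))

    # Keys present in the job JSON override defaults; missing keys keep them.
    hero = {**HERO_DEFAULTS, **(raw.get("hero") or {})}
    content = {
        "min_num_images": min_default,
        "max_num_images": max_default,
        **CONTENT_SIZE_DEFAULTS,
        **(raw.get("content") or {}),
    }

    # Assumed validation rule for "Validate min/max ranges".
    if not 0 <= content["min_num_images"] <= content["max_num_images"]:
        raise ValueError(
            f"{tier_name}: invalid content image range "
            f"{content['min_num_images']}-{content['max_num_images']}"
        )
    return {"hero": hero, "content": content}
```

With the Job JSON example from the story, `tier2` supplies only a `hero` block, so its content range resolves to 0-0 and no content images would be requested for that tier.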