diff --git a/docs/prd/epic-2-content-generation.md b/docs/prd/epic-2-content-generation.md
index fc1f710..c36ec8a 100644
--- a/docs/prd/epic-2-content-generation.md
+++ b/docs/prd/epic-2-content-generation.md
@@ -54,3 +54,19 @@ Implement the core workflow for ingesting CORA data and using AI to generate and
 - The function correctly selects and applies the appropriate template based on the configuration mapping.
 - The content is structured into a valid HTML document with the selected CSS.
 - The final HTML content is stored and associated with the project in the database.
+
+**Dependencies**
+- Story 2.5 (optional): If no site_deployment_id is assigned, template selection defaults to random.
+
+### Story 2.5: Deployment Target Assignment
+**As a developer**, I want to assign deployment targets to generated content during the content generation process, so that each article knows which site/bucket it will be deployed to and can use the appropriate template.
+
+**Acceptance Criteria**
+- The job configuration file supports an optional `deployment_targets` array containing site custom_hostnames or site_deployment_ids.
+- The job configuration file supports an optional `deployment_overflow` strategy ("round_robin", "random_available", or "none").
+- During content generation, each article is assigned a `site_deployment_id` based on its index in the batch:
+  - If `deployment_targets` is specified, cycle through the list (round-robin by default).
+  - If the batch size exceeds the target list, apply the overflow strategy.
+  - If no `deployment_targets` specified, `site_deployment_id` remains null (random template in Story 2.4).
+- The `site_deployment_id` is stored in the `GeneratedContent` record at creation time.
+- Invalid site references in `deployment_targets` cause graceful errors with clear messages.
\ No newline at end of file
diff --git a/docs/stories/story-2.4-html-formatting-templates.md b/docs/stories/story-2.4-html-formatting-templates.md
new file mode 100644
index 0000000..c59b3da
--- /dev/null
+++ b/docs/stories/story-2.4-html-formatting-templates.md
@@ -0,0 +1,141 @@
+# Story 2.4: HTML Formatting with Multiple Templates
+
+## Status
+Completed
+
+## Story
+**As a developer**, I want a module that takes the generated text content and formats it into a standard HTML file using one of a few predefined CSS templates, assigning one template per bucket/subdomain, so that all deployed content has a consistent look and feel per site.
+
+## Acceptance Criteria
+- A directory of multiple, predefined HTML/CSS templates exists.
+- The master JSON configuration file maps a specific template to each deployment target (e.g., S3 bucket, subdomain).
+- A function accepts the generated content and a target identifier (e.g., bucket name).
+- The function correctly selects and applies the appropriate template based on the configuration mapping.
+- The content is structured into a valid HTML document with the selected CSS.
+- The final HTML content is stored and associated with the project in the database.
+
+## Dependencies
+- **Story 2.5**: Deployment Target Assignment must run before this story to set `site_deployment_id` on GeneratedContent
+- If `site_deployment_id` is null, a random template will be selected
+
+## Tasks / Subtasks
+
+### 1. Create Template Infrastructure
+**Effort:** 3 story points
+
+- [x] Create template file structure under `src/templating/templates/`
+  - Basic template (default)
+  - Modern template
+  - Classic template
+  - Minimal template
+- [x] Each template should include:
+  - HTML structure with placeholders for title, meta, content
+  - Embedded or inline CSS for styling
+  - Responsive design (mobile-friendly)
+  - SEO-friendly structure (proper heading hierarchy, meta tags)
+
+### 2. Implement Template Loading Service
+**Effort:** 3 story points
+
+- [x] Implement `TemplateService` class in `src/templating/service.py`
+- [x] Add `load_template(template_name: str)` method that reads template file
+- [x] Add `get_available_templates()` method that lists all templates
+- [x] Handle template file not found errors gracefully with fallback to default
+- [x] Cache loaded templates in memory for performance
+
+### 3. Implement Template Selection Logic
+**Effort:** 2 story points
+
+- [x] Add `select_template_for_content(site_deployment_id: Optional[int])` method
+- [x] If `site_deployment_id` exists:
+  - Query SiteDeployment table for custom_hostname
+  - Check `master.config.json` templates.mappings for hostname
+  - If mapping exists, use it
+  - If no mapping, randomly select template and save to config
+- [x] If `site_deployment_id` is null: randomly select template
+- [x] Return template name
+
+### 4. Implement Content Formatting
+**Effort:** 5 story points
+
+- [x] Create `format_content(content: str, title: str, meta_description: str, template_name: str)` method
+- [x] Parse HTML content and extract components
+- [x] Replace template placeholders with actual content
+- [x] Ensure proper escaping of HTML entities where needed
+- [x] Validate output is well-formed HTML
+- [x] Return formatted HTML string
+
+### 5. Database Integration
+**Effort:** 2 story points
+
+- [x] Add `formatted_html` field to `GeneratedContent` model (Text type, nullable)
+- [x] Add `template_used` field to `GeneratedContent` model (String(50), nullable)
+- [x] Add `site_deployment_id` field to `GeneratedContent` model (FK to site_deployments, nullable, indexed)
+- [x] Create database migration script
+- [x] Update repository to save formatted HTML and template_used alongside raw content
+
+### 6. Integration with Content Generation Flow
+**Effort:** 2 story points
+
+- [x] Update `src/generation/service.py` to call template service after content generation
+- [x] Template service reads `site_deployment_id` from GeneratedContent
+- [x] Store formatted HTML and template_used in database
+- [x] Handle template formatting errors without breaking content generation
+
+### 7. Unit Tests
+**Effort:** 3 story points
+
+- [x] Test template loading with valid and invalid names
+- [x] Test template selection with site_deployment_id present
+- [x] Test template selection with site_deployment_id null (random)
+- [x] Test content formatting with different templates
+- [x] Test fallback behavior when template not found
+- [x] Test error handling for malformed templates
+- [x] Achieve >80% code coverage for templating module
+
+### 8. Integration Tests
+**Effort:** 2 story points
+
+- [x] Test end-to-end flow: content generation → template application → database storage
+- [x] Test with site_deployment_id assigned (consistent template per site)
+- [x] Test with site_deployment_id null (random template)
+- [x] Verify formatted HTML is valid and renders correctly
+- [x] Test new site gets random template assigned and persisted to config
+
+## Dev Notes
+
+### Current State
+- `master.config.json` already has templates section with mappings (lines 52-59)
+- `src/templating/service.py` exists but is empty (only 2 lines)
+- `src/templating/templates/` directory exists but only contains `__init__.py`
+- `GeneratedContent` model stores raw content in Text field but no formatted HTML field yet
+
+### Dependencies
+- Story 2.2/2.3: Content must be generated before it can be formatted
+- Story 2.5: Deployment target assignment (optional - defaults to random if not assigned)
+- Configuration system: Uses existing master.config.json structure
+
+### Technical Decisions
+1. **Template format:** Jinja2 or simple string replacement (to be decided during implementation)
+2. **CSS approach:** Embedded `
+
+
+