# Story 3.4: Generate Boilerplate Site Pages

## Status
**QA COMPLETE** - Ready for Production

## Story
**As a developer**, I want to automatically generate boilerplate `about.html`, `contact.html`, and `privacy.html` pages for each site in my batch, so that the navigation menu links from Story 3.3 work and the sites appear complete.

## Context
- Story 3.3 added navigation menus to all HTML templates with links to:
  - `/index.html` (homepage)
  - `about.html` (about page)
  - `privacy.html` (privacy policy)
  - `contact.html` (contact page)
- Currently, these pages don't exist, creating broken links
- Each site needs its own set of these pages
- Pages should use the same template as the articles (basic/modern/classic/minimal)
- Content should be generic but professional enough for a real site
- Privacy policy needs to be comprehensive and legally sound (generic template)

## Acceptance Criteria

### Core Functionality
- A function generates the three boilerplate pages for a given site
- Pages are created AFTER articles are generated but BEFORE deployment
- Each page uses the same template as the articles for that site
- Pages are stored in the database for deployment
- Pages are associated with the correct site (via `site_deployment_id`)

### Page Content Requirements

#### About Page (`about.html`)
- Empty page with just the template applied
- No content text required (just template navigation/structure)
- User can add content later if needed

#### Contact Page (`contact.html`)
- Empty page with just the template applied
- No content text required (just template navigation/structure)
- User can add content later if needed

#### Privacy Policy (`privacy.html`)
- **Option 1 (Minimal):** Empty page like about/contact
- No content text required (just template navigation/structure)
- User can add content later if needed

**Decision:** Start with Option 1 (empty pages) for all three pages. Privacy policy content can be added later via backfill update or manual edit if needed.
### Template Integration
- Use same template engine as article content (`src/templating/service.py`)
- Read template from `site.template_name` field in database
- Pages use same template as articles on the same site (consistent look)
- Include navigation menu (which will link to these same pages)

### Database Storage
- Create new `site_pages` table (clean separation from articles):
  - `id`, `site_deployment_id`, `page_type`, `content`, `created_at`, `updated_at`
  - Foreign key to `site_deployments` with CASCADE delete
  - Unique constraint on (site_deployment_id, page_type)
  - Indexes on site_deployment_id and page_type
- Each site can have one of each page type (about, contact, privacy)
- Pages are fundamentally different from articles and deserve their own table

### URL Generation
- Pages use simple filenames: `about.html`, `contact.html`, `privacy.html`
- Full URLs: `https://{hostname}/about.html`
- No slug generation needed (fixed filenames)
- Pages tracked separately from article URLs

### Integration Point
- Hook into batch generation workflow in `src/generation/batch_processor.py`
- After site assignment (Story 3.1) and before deployment (Epic 4)
- Generate pages ONLY for newly created sites (not existing sites)
- One-time backfill script to add pages to all existing imported sites

### Two Use Cases
1. **One-time backfill**: Script to generate pages for all existing sites in database (hundreds of sites)
2. **Ongoing generation**: Automatically generate pages only when new sites are created (provision-site, auto_create_sites, etc.)

## Tasks / Subtasks

### 1. Create SitePage Database Table
**Effort:** 2 story points

- [ ] Create new `site_pages` table with schema:
  - `id`, `site_deployment_id`, `page_type`, `content`, `created_at`, `updated_at`
- [ ] Add `SitePage` model to `src/database/models.py`
- [ ] Create migration script `scripts/migrate_add_site_pages.py`
- [ ] Add unique constraint on (site_deployment_id, page_type)
- [ ] Add indexes on site_deployment_id and page_type
- [ ] Add CASCADE delete (if site deleted, pages deleted)
- [ ] Test migration on development database

### 2. Create SitePage Repository
**Effort:** 2 story points

- [ ] Create `ISitePageRepository` interface in `src/database/interfaces.py`:
  - `create(site_deployment_id, page_type, content) -> SitePage`
  - `get_by_site(site_deployment_id) -> List[SitePage]`
  - `get_by_site_and_type(site_deployment_id, page_type) -> Optional[SitePage]`
  - `update_content(page_id, content) -> SitePage`
  - `exists(site_deployment_id, page_type) -> bool`
  - `delete(page_id) -> bool`
- [ ] Implement `SitePageRepository` in `src/database/repositories.py`
- [ ] Add to repository factory/dependency injection

### 3. Create Page Content Templates (SIMPLIFIED)
**Effort:** 1 story point (reduced from 3)

- [ ] Create `src/generation/page_templates.py` module
- [ ] Implement `get_page_content(page_type: str, domain: str) -> str`:
  - Returns just a heading: `<h1>About Us</h1>`, `<h1>Contact</h1>`, `<h1>Privacy Policy</h1>`
  - All three pages use same heading-only approach
  - No other content text
- [ ] No need for extensive content generation
- [ ] Pages are just placeholders until user adds content manually

### 4. Implement Page Generation Logic (SIMPLIFIED)
**Effort:** 2 story points (reduced from 3)

- [ ] Create `src/generation/site_page_generator.py` module
- [ ] Implement `generate_site_pages(site_deployment: SiteDeployment, template_name: str, page_repo, template_service) -> List[SitePage]`:
  - Get domain from site (custom_hostname or bcdn_hostname)
  - For each page type (about, contact, privacy):
    - Get heading-only content from `page_templates.py`
    - Wrap heading in HTML template using `template_service`
    - Store page in database
  - Return list of created pages
- [ ] Pages have just heading (e.g., `<h1>About Us</h1>`) wrapped in template
- [ ] Log page generation at INFO level

### 5. Integrate with Site Creation (Not Batch Processor)
**Effort:** 2 story points

- [ ] Update `src/generation/site_provisioning.py`:
  - After creating new site via bunny.net API, generate boilerplate pages
  - Call `generate_site_pages()` immediately after site creation
  - Log page generation results
- [ ] Update `provision-site` CLI command:
  - Generate pages after site is provisioned
- [ ] Handle errors gracefully (log warning if page generation fails, continue with site creation)
- [ ] **DO NOT generate pages in batch processor** (only for new sites, not existing sites)

### 6. Update Template Service (No Changes Needed)
**Effort:** 0 story points

- [x] Template service already handles simple content
- [x] Just pass heading HTML through existing `format_content()` method
- [x] No changes needed to template service

### 7. Create Backfill Script for Existing Sites
**Effort:** 2 story points

- [ ] Create `scripts/backfill_site_pages.py`:
  - Query all sites in database that don't have pages
  - For each site: generate about, contact, privacy pages
  - Use default template (or infer from site name if possible)
  - Progress reporting (e.g., "Generating pages for site 50/400...")
  - Dry-run mode to preview changes
  - CLI arguments: `--dry-run`, `--template`, `--batch-size`
- [ ] Add error handling for individual site failures (continue with next site)
- [ ] Log results: successful, failed, skipped counts

### 8. Homepage Generation (Optional - Deferred)
**Effort:** 2 story points (if implemented)

- [ ] **DEFER to Epic 4 or later**
- [ ] Homepage (`index.html`) requires knowing all articles on the site
- [ ] Not needed for Story 3.4 (navigation menu links to `/index.html` can 404 for now)
- [ ] Document in technical notes

### 9. Unit Tests (SIMPLIFIED)
**Effort:** 2 story points (reduced from 3)

- [ ] Test heading-only page content generation
- [ ] Test domain extraction from SiteDeployment (custom vs bcdn hostname)
- [ ] Test page HTML wrapping with each template type
- [ ] Test SitePage repository CRUD operations
- [ ] Test duplicate page prevention (unique constraint)
- [ ] Test page generation for single site
- [ ] Test backfill script logic
- [ ] Mock template service and repositories
- [ ] Achieve >80% code coverage for new modules

### 10. Integration Tests (SIMPLIFIED)
**Effort:** 1 story point (reduced from 2)

- [ ] Test site creation triggers page generation
- [ ] Test with different template types (basic, modern, classic, minimal)
- [ ] Test with custom domain sites vs bunny.net-only sites
- [ ] Test pages stored correctly in database
- [ ] Test backfill script on real database
- [ ] Verify navigation menu links work (pages exist at expected paths)

## Technical Notes

### SitePage Model
```python
from datetime import datetime

from sqlalchemy import DateTime, ForeignKey, Integer, String, Text, UniqueConstraint
from sqlalchemy.orm import Mapped, mapped_column, relationship

# Base and SiteDeployment are defined elsewhere in src/database/models.py


class SitePage(Base):
    __tablename__ = "site_pages"

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    site_deployment_id: Mapped[int] = mapped_column(
        Integer,
        ForeignKey('site_deployments.id', ondelete='CASCADE'),
        nullable=False
    )
    page_type: Mapped[str] = mapped_column(String(20), nullable=False)  # about, contact, privacy, homepage
    content: Mapped[str] = mapped_column(Text, nullable=False)  # Full HTML
    created_at: Mapped[datetime] = mapped_column(DateTime, default=datetime.utcnow, nullable=False)
    updated_at: Mapped[datetime] = mapped_column(
        DateTime,
        default=datetime.utcnow,
        onupdate=datetime.utcnow,
        nullable=False
    )

    # Relationships
    site_deployment: Mapped["SiteDeployment"] = relationship("SiteDeployment", back_populates="pages")

    # Unique constraint
    __table_args__ = (
        UniqueConstraint('site_deployment_id', 'page_type', name='uq_site_page_type'),
    )
```

### Database Migration
```sql
CREATE TABLE site_pages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    site_deployment_id INTEGER NOT NULL,
    page_type VARCHAR(20) NOT NULL,
    content TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (site_deployment_id) REFERENCES site_deployments(id) ON DELETE CASCADE,
    UNIQUE (site_deployment_id, page_type)
);

CREATE INDEX idx_site_pages_site ON site_pages(site_deployment_id);
CREATE INDEX idx_site_pages_type ON site_pages(page_type);
```

### Page Content Template Examples (SIMPLIFIED)

#### Implementation - Heading Only
```python
# src/generation/page_templates.py

def get_page_content(page_type: str, domain: str) -> str:
    """
    Generate minimal content for boilerplate pages.
    Just a heading - no other content text.
    """
    page_titles = {
        "about": "About Us",
        "contact": "Contact",
        "privacy": "Privacy Policy"
    }
    return f"<h1>{page_titles.get(page_type, page_type.title())}</h1>"
```

#### Result - Heading Only Example
```html
<title>About Us</title>
<!-- template navigation and layout omitted -->
<h1>About Us</h1>
```

#### Why Heading-Only Pages Work
1. **Fixes broken nav links** - Pages exist, no 404 errors
2. **Better UX than completely empty** - User sees something when they click the link
3. **User can customize** - Add content manually later for specific sites
4. **Minimal effort** - No need to generate/maintain generic content
5. **Deployment ready** - Pages can be deployed as-is
6. **Future enhancement** - Can add content generation later if needed

### Integration with Site Creation
```python
# In src/generation/site_provisioning.py

def create_bunnynet_site(name_prefix: str, region: str = "DE", template: str = "basic"):
    # Step 1: Create Storage Zone
    storage = bunny_client.create_storage_zone(...)

    # Step 2: Create Pull Zone
    pull = bunny_client.create_pull_zone(...)

    # Step 3: Save to database
    site = site_repo.create(...)

    # Step 4: Generate boilerplate pages (NEW - Story 3.4)
    logger.info(f"Generating boilerplate pages for new site {site.id}...")
    try:
        generate_site_pages(site, template, page_repo, template_service)
        logger.info(f"Successfully created about, contact, privacy pages for site {site.id}")
    except Exception as e:
        logger.warning(f"Failed to generate pages for site {site.id}: {e}")
        # Don't fail site creation if page generation fails

    return site
```

### Backfill Script Usage
```bash
# One-time backfill for all existing sites (dry-run first)
uv run python scripts/backfill_site_pages.py \
  --username admin \
  --password yourpass \
  --template basic \
  --dry-run

# Output:
# Found 423 sites without boilerplate pages
# [DRY RUN] Would generate pages for site 1 (www.example.com)
# [DRY RUN] Would generate pages for site 2 (site123.b-cdn.net)
# ...
# [DRY RUN] Total: 423 sites would be updated

# Actually generate pages
uv run python scripts/backfill_site_pages.py \
  --username admin \
  --password yourpass \
  --template basic

# Output:
# Generating pages for site 1/423 (www.example.com)... ✓
# Generating pages for site 2/423 (site123.b-cdn.net)... ✓
# ...
# Complete: 423 successful, 0 failed, 0 skipped

# Use a different template for the whole run (default: basic)
uv run python scripts/backfill_site_pages.py \
  --username admin \
  --password yourpass \
  --template modern \
  --batch-size 50  # Process 50 sites at a time
```

### Page URL Structure
```
Homepage:  https://example.com/index.html
About:     https://example.com/about.html
Contact:   https://example.com/contact.html
Privacy:   https://example.com/privacy.html
Article 1: https://example.com/how-to-fix-engines.html
Article 2: https://example.com/engine-maintenance-tips.html
```

### Template Application Example
```python
# For articles (existing)
template_service.apply_template(
    content=article.content,
    template_name="modern",
    title=article.title,
    meta_description=article.meta_description,
    url=article_url
)

# For pages (new)
template_service.apply_template_to_page(
    content=page_content,  # Markdown or HTML from page_templates.py
    template_name="modern",
    page_title="About Us",  # Static title
    domain=site.custom_hostname or site.pull_zone_bcdn_hostname
)
```

### Backfill Script Implementation
```python
# scripts/backfill_site_pages.py
import logging

from src.generation.site_page_generator import generate_site_pages

logger = logging.getLogger(__name__)


def backfill_site_pages(
    page_repo,
    site_repo,
    template_service,
    template: str = "basic",
    dry_run: bool = False,
    batch_size: int = 100
):
    """Generate boilerplate pages for all sites that don't have them"""
    # Get all sites
    all_sites = site_repo.get_all()
    logger.info(f"Found {len(all_sites)} total sites in database")

    # Filter to sites without a full set of pages.
    # Note: a site with 1-2 existing pages is also selected here, so
    # generate_site_pages must tolerate existing rows or the unique
    # constraint will reject the duplicate page types.
    sites_needing_pages = []
    for site in all_sites:
        existing_pages = page_repo.get_by_site(site.id)
        if len(existing_pages) < 3:  # Should have about, contact, privacy
            sites_needing_pages.append(site)

    logger.info(f"Found {len(sites_needing_pages)} sites without boilerplate pages")

    if dry_run:
        for site in sites_needing_pages:
            domain = site.custom_hostname or site.pull_zone_bcdn_hostname
            logger.info(f"[DRY RUN] Would generate pages for site {site.id} ({domain})")
        logger.info(f"[DRY RUN] Total: {len(sites_needing_pages)} sites would be updated")
        return

    # Generate pages for each site
    successful = 0
    failed = 0

    for idx, site in enumerate(sites_needing_pages, 1):
        domain = site.custom_hostname or site.pull_zone_bcdn_hostname
        logger.info(f"Generating pages for site {idx}/{len(sites_needing_pages)} ({domain})...")

        try:
            generate_site_pages(site, template, page_repo, template_service)
            successful += 1
        except Exception as e:
            logger.error(f"Failed to generate pages for site {site.id}: {e}")
            failed += 1

        # Progress checkpoint every batch_size sites
        if idx % batch_size == 0:
            logger.info(f"Progress: {idx}/{len(sites_needing_pages)} sites processed")

    logger.info(f"Complete: {successful} successful, {failed} failed")
```

### Domain Extraction
```python
def get_domain_from_site(site_deployment: SiteDeployment) -> str:
    """Extract domain for use in page content (email addresses, etc.)"""
    if site_deployment.custom_hostname:
        return site_deployment.custom_hostname
    else:
        return site_deployment.pull_zone_bcdn_hostname
```

### Privacy Policy Legal Note
The privacy policy template should be:
- Generic enough to apply to blog/content sites
- Comprehensive enough to cover common scenarios (cookies, analytics, third-party links)
- NOT legal advice - users should consult a lawyer for specific requirements
- Include standard disclaimers
- Regularly reviewed and updated (document version/date)

Recommended approach: Use a well-tested generic template from a reputable source (e.g., Privacy Policy Generator) and adapt it to fit our template structure.
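
### Page Generation Sketch (Illustrative)
Task 4 lists the steps of `generate_site_pages()`, but the story does not show them as code. The following is a minimal, self-contained sketch of those steps under stated assumptions, not the project's implementation. The dataclasses and `InMemorySitePageRepository` are hypothetical stand-ins for the real models and repository, `FakeTemplateService.format_content()` uses an assumed signature (the story only names the method), and the heading is assumed to be an `<h1>`. Skipping page types that already exist is an added design choice, so that a site with only some of its pages does not trip the unique constraint.

```python
import logging
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

logger = logging.getLogger(__name__)

PAGE_TITLES = {"about": "About Us", "contact": "Contact", "privacy": "Privacy Policy"}


@dataclass
class SiteDeployment:  # hypothetical stand-in for the real model
    id: int
    pull_zone_bcdn_hostname: str
    custom_hostname: Optional[str] = None


@dataclass
class SitePage:  # hypothetical stand-in for the real model
    site_deployment_id: int
    page_type: str
    content: str


class InMemorySitePageRepository:
    """Hypothetical test double covering part of ISitePageRepository."""

    def __init__(self) -> None:
        self._pages: Dict[Tuple[int, str], SitePage] = {}

    def exists(self, site_deployment_id: int, page_type: str) -> bool:
        return (site_deployment_id, page_type) in self._pages

    def create(self, site_deployment_id: int, page_type: str, content: str) -> SitePage:
        key = (site_deployment_id, page_type)
        if key in self._pages:  # mirrors the unique constraint
            raise ValueError(f"duplicate page {key}")
        page = SitePage(site_deployment_id, page_type, content)
        self._pages[key] = page
        return page

    def get_by_site(self, site_deployment_id: int) -> List[SitePage]:
        return [p for (sid, _), p in self._pages.items() if sid == site_deployment_id]


class FakeTemplateService:
    """Assumed signature; the real format_content() is not shown in this story."""

    def format_content(self, content: str, title: str, template_name: str) -> str:
        return (
            f"<html><head><title>{title}</title></head>"
            f'<body class="{template_name}">{content}</body></html>'
        )


def get_page_content(page_type: str, domain: str) -> str:
    # Heading only; domain is accepted but unused, as in page_templates.py
    return f"<h1>{PAGE_TITLES.get(page_type, page_type.title())}</h1>"


def generate_site_pages(site_deployment, template_name, page_repo, template_service):
    domain = site_deployment.custom_hostname or site_deployment.pull_zone_bcdn_hostname
    created = []
    for page_type, title in PAGE_TITLES.items():
        if page_repo.exists(site_deployment.id, page_type):
            continue  # assumption: skip rather than overwrite existing pages
        heading = get_page_content(page_type, domain)
        html = template_service.format_content(heading, title, template_name)
        created.append(page_repo.create(site_deployment.id, page_type, html))
    logger.info("Created %d pages for site %s (%s)", len(created), site_deployment.id, domain)
    return created
```

With this shape, calling the function a second time for the same site creates nothing and returns an empty list, which would make re-running the backfill safe.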
## Dependencies
- Story 3.1: Site assignment must be complete (need to know which sites are in use)
- Story 3.3: Navigation menu is already in templates (pages fulfill those links)
- Story 2.4: Template service exists and can apply HTML templates
- Story 1.6: SiteDeployment table exists

## Future Considerations
- Story 4.1 will deploy these pages along with articles
- Future: Custom page content per project (override generic templates)
- Future: Homepage generation with dynamic article listing
- Future: Allow users to edit boilerplate page content via CLI or web interface
- Future: Additional pages (terms of service, disclaimer, etc.)
- Future: Page templates with more customization options (site name, tagline, etc.)

## Deferred to Later
- **Homepage (`index.html`) generation** - Could be part of this story or deferred to Epic 4
  - If generated here: Simple page listing all articles on the site
  - If deferred: Epic 4 deployment could create a basic redirect or placeholder
- **Custom page content per project** - Allow projects to override default templates
- **Multi-language support** - Generate pages in different languages based on project settings

## Total Effort
14 story points (reduced from 20 due to heading-only simplification and no template service changes)

### Effort Breakdown
1. Database Schema (2 points) - site_pages table only
2. Repository Layer (2 points) - SitePageRepository
3. Page Content Templates (1 point) - heading-only
4. Generation Logic (2 points) - reads site.template_name from DB
5. Site Creation Integration (2 points)
6. Template Service Updates (0 points) - no changes needed
7. Backfill Script (2 points)
8. Homepage Generation (deferred)
9. Unit Tests (2 points)
10. Integration Tests (1 point)

**Total: 14 story points**

### Effort Reduction
- Original estimate: 20 story points (with full page content)
- Simplified (heading-only pages): 14 story points
- Savings: 6 story points (no complex content generation needed)

## Notes
- Pages should be visually consistent with articles (same template)
- **Pages have heading only** - just `<h1>` tag, no body content
- Better UX than completely empty (user sees page title when they click nav link)
- User can manually add content later for specific sites if desired
- Pages are generated once per site at creation time
- Future enhancement: Add content generation for privacy policy if legally required
- Future enhancement: CLI command to update page content for specific sites
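
The one-page-per-type rule and the cascade delete can be checked directly against the schema from the Database Migration section. The sketch below assumes the database is SQLite (suggested by the `AUTOINCREMENT` syntax) and uses a stripped-down, hypothetical `site_deployments` table in place of the real one. In SQLite, foreign key enforcement, and therefore `ON DELETE CASCADE`, is off by default and has to be enabled on each connection with `PRAGMA foreign_keys = ON`.

```python
import sqlite3

SCHEMA = """
CREATE TABLE site_deployments (
    id INTEGER PRIMARY KEY AUTOINCREMENT  -- hypothetical minimal stand-in
);
CREATE TABLE site_pages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    site_deployment_id INTEGER NOT NULL,
    page_type VARCHAR(20) NOT NULL,
    content TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (site_deployment_id) REFERENCES site_deployments(id) ON DELETE CASCADE,
    UNIQUE (site_deployment_id, page_type)
);
"""

INSERT_PAGE = "INSERT INTO site_pages (site_deployment_id, page_type, content) VALUES (?, ?, ?)"

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # without this, the cascade is silently ignored
conn.executescript(SCHEMA)

conn.execute("INSERT INTO site_deployments (id) VALUES (1)")
for page_type in ("about", "contact", "privacy"):
    conn.execute(INSERT_PAGE, (1, page_type, "<h1>placeholder</h1>"))

# A second "about" page for the same site should violate the unique constraint
try:
    conn.execute(INSERT_PAGE, (1, "about", "duplicate"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting the site should remove its pages through the cascade
conn.execute("DELETE FROM site_deployments WHERE id = 1")
remaining_pages = conn.execute("SELECT COUNT(*) FROM site_pages").fetchone()[0]

print(duplicate_rejected, remaining_pages)
```

If the application connects through SQLAlchemy, the same pragma would need to be issued on every new connection; how that is wired up in this project is not covered by the story.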