Story 3.4: Generate Boilerplate Site Pages
Status
Not Started
Story
As a developer, I want to automatically generate boilerplate about.html, contact.html, and privacy.html pages for each site in my batch, so that the navigation menu links from Story 3.3 work and the sites appear complete.
Context
- Story 3.3 added navigation menus to all HTML templates with links to:
  - /index.html (homepage)
  - about.html (about page)
  - privacy.html (privacy policy)
  - contact.html (contact page)
- Currently, these pages don't exist, creating broken links
- Each site needs its own set of these pages
- Pages should use the same template as the articles (basic/modern/classic/minimal)
- Content should be generic but professional enough for a real site
- Privacy policy needs to be comprehensive and legally sound (generic template)
Acceptance Criteria
Core Functionality
- A function generates the three boilerplate pages for a given site
- Pages are created AFTER articles are generated but BEFORE deployment
- Each page uses the same template as the articles for that site
- Pages are stored in the database for deployment
- Pages are associated with the correct site (via site_deployment_id)
Page Content Requirements
About Page (about.html)
- Empty page with just the template applied
- No content text required (just template navigation/structure)
- User can add content later if needed
Contact Page (contact.html)
- Empty page with just the template applied
- No content text required (just template navigation/structure)
- User can add content later if needed
Privacy Policy (privacy.html)
- Option 1 (Minimal): Empty page like about/contact
- No content text required (just template navigation/structure)
- User can add content later if needed
Decision: Start with Option 1 for all three pages: no body content, only the site template plus an <h1> heading (see Task 3). Privacy policy content can be added later via backfill update or manual edit if needed.
Template Integration
- Use same template engine as article content (src/templating/service.py)
- Apply the site's assigned template (basic/modern/classic/minimal)
- Pages should visually match the articles on the same site
- Include navigation menu (which will link to these same pages)
Database Storage
- Create new site_pages table (clean separation from articles):
  - id, site_deployment_id, page_type, content, created_at, updated_at
  - Foreign key to site_deployments with CASCADE delete
  - Unique constraint on (site_deployment_id, page_type)
  - Indexes on site_deployment_id and page_type
- Each site can have one of each page type (about, contact, privacy)
- Pages are fundamentally different from articles, deserve own table
URL Generation
- Pages use simple filenames: about.html, contact.html, privacy.html
- Full URLs: https://{hostname}/about.html
- No slug generation needed (fixed filenames)
- Pages tracked separately from article URLs
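Because the filenames are fixed, building a page URL reduces to a lookup plus a string format. A minimal sketch, assuming a hypothetical helper named build_page_url and a PAGE_FILENAMES mapping (neither name appears in this story):

```python
# Hypothetical mapping of page types to their fixed filenames.
PAGE_FILENAMES = {
    "about": "about.html",
    "contact": "contact.html",
    "privacy": "privacy.html",
}


def build_page_url(hostname: str, page_type: str) -> str:
    """Return the public URL of a boilerplate page on the given hostname."""
    try:
        filename = PAGE_FILENAMES[page_type]
    except KeyError:
        raise ValueError(f"Unknown page type: {page_type!r}") from None
    return f"https://{hostname}/{filename}"
```

Rejecting unknown page types here keeps a typo from silently producing a URL that no deployed file backs.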
Integration Point
- Hook into batch generation workflow in src/generation/batch_processor.py
- After site assignment (Story 3.1) and before deployment (Epic 4)
- Generate pages ONLY for newly created sites (not existing sites)
- One-time backfill script to add pages to all existing imported sites
Two Use Cases
- One-time backfill: Script to generate pages for all existing sites in database (hundreds of sites)
- Ongoing generation: Automatically generate pages only when new sites are created (provision-site, auto_create_sites, etc.)
Tasks / Subtasks
1. Create SitePage Database Table
Effort: 2 story points
- Create new site_pages table with schema:
  - id, site_deployment_id, page_type, content, created_at, updated_at
- Add SitePage model to src/database/models.py
- Create migration script scripts/migrate_add_site_pages.py
- Add unique constraint on (site_deployment_id, page_type)
- Add indexes on site_deployment_id and page_type
- Add CASCADE delete (if site deleted, pages deleted)
- Test migration on development database
2. Create SitePage Repository
Effort: 2 story points
- Create ISitePageRepository interface in src/database/interfaces.py:
  - create(site_deployment_id, page_type, content) -> SitePage
  - get_by_site(site_deployment_id) -> List[SitePage]
  - get_by_site_and_type(site_deployment_id, page_type) -> Optional[SitePage]
  - update_content(page_id, content) -> SitePage
  - exists(site_deployment_id, page_type) -> bool
  - delete(page_id) -> bool
- Implement SitePageRepository in src/database/repositories.py
- Add to repository factory/dependency injection
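The repository contract listed in this task could look roughly like the following. This is a sketch only: it assumes the interface is expressed as a typing.Protocol (the existing src/database/interfaces.py may use an abstract base class instead), and both the SitePage dataclass and InMemorySitePageRepository are hypothetical stand-ins for illustration and testing, not the database-backed implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol


@dataclass
class SitePage:
    """Stand-in for the SQLAlchemy SitePage model (illustration only)."""
    id: int
    site_deployment_id: int
    page_type: str
    content: str


class ISitePageRepository(Protocol):
    """Method set listed in Task 2."""

    def create(self, site_deployment_id: int, page_type: str, content: str) -> SitePage: ...
    def get_by_site(self, site_deployment_id: int) -> List[SitePage]: ...
    def get_by_site_and_type(self, site_deployment_id: int, page_type: str) -> Optional[SitePage]: ...
    def update_content(self, page_id: int, content: str) -> SitePage: ...
    def exists(self, site_deployment_id: int, page_type: str) -> bool: ...
    def delete(self, page_id: int) -> bool: ...


class InMemorySitePageRepository:
    """Hypothetical test double that mimics the unique (site, page_type) rule."""

    def __init__(self) -> None:
        self._pages: Dict[int, SitePage] = {}
        self._next_id = 1

    def create(self, site_deployment_id: int, page_type: str, content: str) -> SitePage:
        if self.exists(site_deployment_id, page_type):
            raise ValueError(f"{page_type!r} page already exists for site {site_deployment_id}")
        page = SitePage(self._next_id, site_deployment_id, page_type, content)
        self._pages[page.id] = page
        self._next_id += 1
        return page

    def get_by_site(self, site_deployment_id: int) -> List[SitePage]:
        return [p for p in self._pages.values() if p.site_deployment_id == site_deployment_id]

    def get_by_site_and_type(self, site_deployment_id: int, page_type: str) -> Optional[SitePage]:
        for page in self.get_by_site(site_deployment_id):
            if page.page_type == page_type:
                return page
        return None

    def update_content(self, page_id: int, content: str) -> SitePage:
        page = self._pages[page_id]  # raises KeyError for an unknown id
        page.content = content
        return page

    def exists(self, site_deployment_id: int, page_type: str) -> bool:
        return self.get_by_site_and_type(site_deployment_id, page_type) is not None

    def delete(self, page_id: int) -> bool:
        return self._pages.pop(page_id, None) is not None
```

An in-memory double like this may also be handy for the unit tests in Task 9, which call for mocked repositories.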
3. Create Page Content Templates (SIMPLIFIED)
Effort: 1 story point (reduced from 3)
- Create src/generation/page_templates.py module
- Implement get_page_content(page_type: str, domain: str) -> str:
  - Returns just a heading: <h1>About Us</h1>, <h1>Contact</h1>, <h1>Privacy Policy</h1>
  - All three pages use same heading-only approach
  - No other content text
- No need for extensive content generation
- Pages are just placeholders until user adds content manually
4. Implement Page Generation Logic (SIMPLIFIED)
Effort: 2 story points (reduced from 3)
- Create src/generation/site_page_generator.py module
- Implement generate_site_pages(site_deployment: SiteDeployment, template_name: str, page_repo, template_service) -> List[SitePage]:
  - Get domain from site (custom_hostname or pull_zone_bcdn_hostname)
  - For each page type (about, contact, privacy):
    - Get heading-only content from page_templates.py
    - Wrap heading in HTML template using template_service
    - Store page in database
  - Return list of created pages
- Pages have just heading (e.g., <h1>About Us</h1>) wrapped in template
- Log page generation at INFO level
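The steps in this task can be sketched as follows. Treat it as one possible shape, not the final implementation: the SiteDeployment dataclass, FakeTemplateService, and FakePageRepo are hypothetical stand-ins so the example runs on its own, and skipping page types that already exist is an assumption this story does not spell out (it keeps repeated runs from hitting the unique constraint).

```python
import logging
from dataclasses import dataclass
from typing import List, Optional

logger = logging.getLogger(__name__)

PAGE_TITLES = {"about": "About Us", "contact": "Contact", "privacy": "Privacy Policy"}


@dataclass
class SiteDeployment:
    """Stand-in exposing only the fields this sketch reads."""
    id: int
    custom_hostname: Optional[str]
    pull_zone_bcdn_hostname: str


def get_page_content(page_type: str, domain: str) -> str:
    """Heading-only content, mirroring page_templates.py."""
    return f"<h1>{PAGE_TITLES.get(page_type, page_type.title())}</h1>"


def generate_site_pages(site_deployment, template_name: str, page_repo, template_service) -> List:
    """Create about, contact, and privacy pages for one site."""
    domain = site_deployment.custom_hostname or site_deployment.pull_zone_bcdn_hostname
    created = []
    for page_type, title in PAGE_TITLES.items():
        # Assumption: skip existing pages instead of violating the unique constraint.
        if page_repo.exists(site_deployment.id, page_type):
            logger.info("Site %s already has a %s page, skipping", site_deployment.id, page_type)
            continue
        heading = get_page_content(page_type, domain)
        html = template_service.apply_template_to_page(
            content=heading, template_name=template_name, page_title=title, domain=domain
        )
        created.append(page_repo.create(site_deployment.id, page_type, html))
    logger.info("Generated %d pages for site %s (%s)", len(created), site_deployment.id, domain)
    return created


class FakeTemplateService:
    """Hypothetical double: wraps content in a bare HTML shell."""

    def apply_template_to_page(self, content, template_name, page_title, domain):
        return f"<html><head><title>{page_title}</title></head><body><main>{content}</main></body></html>"


class FakePageRepo:
    """Hypothetical double: stores HTML keyed by (site id, page type)."""

    def __init__(self):
        self.rows = {}

    def exists(self, site_deployment_id, page_type):
        return (site_deployment_id, page_type) in self.rows

    def create(self, site_deployment_id, page_type, content):
        self.rows[(site_deployment_id, page_type)] = content
        return (site_deployment_id, page_type)
```

Calling generate_site_pages twice for the same site with these doubles creates three pages the first time and none the second.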
5. Integrate with Site Creation (Not Batch Processor)
Effort: 2 story points
- Update src/generation/site_provisioning.py:
  - After creating new site via bunny.net API, generate boilerplate pages
  - Call generate_site_pages() immediately after site creation
  - Log page generation results
- Update provision-site CLI command:
  - Generate pages after site is provisioned
  - Handle errors gracefully (log warning if page generation fails, continue with site creation)
- DO NOT generate pages in batch processor (only for new sites, not existing sites)
6. Update Template Service
Effort: 1 story point
- Verify src/templating/service.py can handle page content:
  - Pages don't have titles/outlines like articles
  - May need simpler template application for pages
  - Ensure navigation menu is included
- Add helper method if needed: apply_template_to_page(content, template_name, page_title, domain)
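A page-level helper could be as small as the sketch below. It is written as a standalone function over string.Template purely for illustration; how the real template service loads and renders templates is not described in this story, so BASIC_TEMPLATE and the TEMPLATES registry are hypothetical. The page_title parameter follows the Template Application Example in the Technical Notes.

```python
from html import escape
from string import Template

# Hypothetical stand-in for a template the service would load from disk.
BASIC_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head>
<title>$title</title>
</head>
<body>
<nav>
<ul>
<li><a href="/index.html">Home</a></li>
<li><a href="about.html">About</a></li>
<li><a href="privacy.html">Privacy</a></li>
<li><a href="contact.html">Contact</a></li>
</ul>
</nav>
<main>
$content
</main>
</body>
</html>
""")

# Hypothetical registry; the real service supports basic/modern/classic/minimal.
TEMPLATES = {"basic": BASIC_TEMPLATE}


def apply_template_to_page(content: str, template_name: str, page_title: str, domain: str) -> str:
    """Wrap page content in the named template, falling back to basic."""
    template = TEMPLATES.get(template_name, BASIC_TEMPLATE)
    # domain is accepted for signature parity; this sketch does not use it.
    return template.substitute(title=escape(page_title), content=content)
```

The title is escaped because it lands in a text node, while content is inserted as-is because it is already HTML.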
7. Create Backfill Script for Existing Sites
Effort: 2 story points
- Create scripts/backfill_site_pages.py:
  - Query all sites in database that don't have pages
  - For each site: generate about, contact, privacy pages
  - Use default template (or infer from site name if possible)
  - Progress reporting (e.g., "Generating pages for site 50/400...")
  - Dry-run mode to preview changes
- CLI arguments: --dry-run, --template, --batch-size
- Add error handling for individual site failures (continue with next site)
- Log results: successful, failed, skipped counts
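The CLI arguments for the backfill script could be parsed with argparse along these lines. This is a sketch: the build_parser name is hypothetical, the --username and --password flags are taken from the usage examples in the Technical Notes, and the default batch size of 100 matches the backfill_site_pages signature shown there.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Argument parser for scripts/backfill_site_pages.py (illustrative)."""
    parser = argparse.ArgumentParser(
        description="Generate boilerplate pages for existing sites"
    )
    parser.add_argument("--username", required=True, help="Admin username")
    parser.add_argument("--password", required=True, help="Admin password")
    parser.add_argument(
        "--template",
        default="basic",
        choices=["basic", "modern", "classic", "minimal"],
        help="Template applied to every site in this run",
    )
    parser.add_argument(
        "--batch-size",
        type=int,
        default=100,
        help="Log a progress checkpoint after this many sites",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="List the sites that would be updated without writing anything",
    )
    return parser
```

Restricting --template to the four known names makes a typo fail at parse time instead of partway through a run over hundreds of sites.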
8. Homepage Generation (Optional - Deferred)
Effort: 2 story points (if implemented)
- DEFER to Epic 4 or later
- Homepage (index.html) requires knowing all articles on the site
- Not needed for Story 3.4 (navigation menu links to /index.html can 404 for now)
- Document in technical notes
9. Unit Tests (SIMPLIFIED)
Effort: 2 story points (reduced from 3)
- Test heading-only page content generation
- Test domain extraction from SiteDeployment (custom vs bcdn hostname)
- Test page HTML wrapping with each template type
- Test SitePage repository CRUD operations
- Test duplicate page prevention (unique constraint)
- Test page generation for single site
- Test backfill script logic
- Mock template service and repositories
- Achieve >80% code coverage for new modules
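As a starting point, the first two test bullets might look like this with the standard unittest module. The two functions under test are copied from the Technical Notes so the example is self-contained; in the real suite they would be imported from their modules, and the TestBoilerplatePages class name is hypothetical.

```python
import unittest
from types import SimpleNamespace


def get_page_content(page_type: str, domain: str) -> str:
    """Copy of the heading-only generator from page_templates.py."""
    page_titles = {"about": "About Us", "contact": "Contact", "privacy": "Privacy Policy"}
    return f"<h1>{page_titles.get(page_type, page_type.title())}</h1>"


def get_domain_from_site(site_deployment) -> str:
    """Copy of the domain helper: prefer the custom hostname."""
    return site_deployment.custom_hostname or site_deployment.pull_zone_bcdn_hostname


class TestBoilerplatePages(unittest.TestCase):
    def test_heading_only_content(self):
        self.assertEqual(get_page_content("about", "example.com"), "<h1>About Us</h1>")
        self.assertEqual(get_page_content("contact", "example.com"), "<h1>Contact</h1>")
        self.assertEqual(get_page_content("privacy", "example.com"), "<h1>Privacy Policy</h1>")

    def test_unknown_page_type_falls_back_to_title_case(self):
        self.assertEqual(get_page_content("terms", "example.com"), "<h1>Terms</h1>")

    def test_custom_hostname_is_preferred(self):
        site = SimpleNamespace(
            custom_hostname="www.example.com",
            pull_zone_bcdn_hostname="site123.b-cdn.net",
        )
        self.assertEqual(get_domain_from_site(site), "www.example.com")

    def test_bcdn_hostname_is_the_fallback(self):
        site = SimpleNamespace(custom_hostname=None, pull_zone_bcdn_hostname="site123.b-cdn.net")
        self.assertEqual(get_domain_from_site(site), "site123.b-cdn.net")
```

Saved as a test module, this runs with python -m unittest.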
10. Integration Tests (SIMPLIFIED)
Effort: 1 story point (reduced from 2)
- Test site creation triggers page generation
- Test with different template types (basic, modern, classic, minimal)
- Test with custom domain sites vs bunny.net-only sites
- Test pages stored correctly in database
- Test backfill script on real database
- Verify navigation menu links work (pages exist at expected paths)
Technical Notes
SitePage Model
from datetime import datetime

from sqlalchemy import DateTime, ForeignKey, Integer, String, Text, UniqueConstraint
from sqlalchemy.orm import Mapped, mapped_column, relationship

# Base and SiteDeployment are assumed to be defined already in src/database/models.py


class SitePage(Base):
    __tablename__ = "site_pages"

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    site_deployment_id: Mapped[int] = mapped_column(
        Integer,
        ForeignKey('site_deployments.id', ondelete='CASCADE'),
        nullable=False
    )
    page_type: Mapped[str] = mapped_column(String(20), nullable=False)  # about, contact, privacy, homepage
    content: Mapped[str] = mapped_column(Text, nullable=False)  # Full HTML
    created_at: Mapped[datetime] = mapped_column(DateTime, default=datetime.utcnow, nullable=False)
    updated_at: Mapped[datetime] = mapped_column(
        DateTime,
        default=datetime.utcnow,
        onupdate=datetime.utcnow,
        nullable=False
    )

    # Relationships (SiteDeployment needs a matching "pages" relationship)
    site_deployment: Mapped["SiteDeployment"] = relationship("SiteDeployment", back_populates="pages")

    # Unique constraint: one page of each type per site
    __table_args__ = (
        UniqueConstraint('site_deployment_id', 'page_type', name='uq_site_page_type'),
    )
Database Migration
CREATE TABLE site_pages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    site_deployment_id INTEGER NOT NULL,
    page_type VARCHAR(20) NOT NULL,
    content TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (site_deployment_id) REFERENCES site_deployments(id) ON DELETE CASCADE,
    UNIQUE (site_deployment_id, page_type)
);

CREATE INDEX idx_site_pages_site ON site_pages(site_deployment_id);
CREATE INDEX idx_site_pages_type ON site_pages(page_type);
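The migration can be exercised against an in-memory database before touching the development one. The sketch below assumes SQLite, which the AUTOINCREMENT syntax suggests but the story does not state, and uses a one-column site_deployments table as a stand-in for the real one. Note that SQLite only enforces foreign keys, and therefore the CASCADE delete, when PRAGMA foreign_keys is switched on for the connection.

```python
import sqlite3

MIGRATION_SQL = """
CREATE TABLE site_pages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    site_deployment_id INTEGER NOT NULL,
    page_type VARCHAR(20) NOT NULL,
    content TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (site_deployment_id) REFERENCES site_deployments(id) ON DELETE CASCADE,
    UNIQUE (site_deployment_id, page_type)
);
CREATE INDEX idx_site_pages_site ON site_pages(site_deployment_id);
CREATE INDEX idx_site_pages_type ON site_pages(page_type);
"""

INSERT_PAGE = (
    "INSERT INTO site_pages (site_deployment_id, page_type, content) "
    "VALUES (?, ?, ?)"
)


def run_demo() -> dict:
    """Apply the migration in memory and check both constraints."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # required for ON DELETE CASCADE
    # Minimal stand-in for the real site_deployments table.
    conn.execute("CREATE TABLE site_deployments (id INTEGER PRIMARY KEY)")
    conn.executescript(MIGRATION_SQL)

    conn.execute("INSERT INTO site_deployments (id) VALUES (1)")
    conn.execute(INSERT_PAGE, (1, "about", "<h1>About Us</h1>"))

    # A second 'about' page for the same site must be rejected.
    try:
        conn.execute(INSERT_PAGE, (1, "about", "<h1>Duplicate</h1>"))
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True

    # Deleting the site should remove its pages through the cascade.
    conn.execute("DELETE FROM site_deployments WHERE id = 1")
    remaining = conn.execute("SELECT COUNT(*) FROM site_pages").fetchone()[0]
    conn.close()
    return {"duplicate_rejected": duplicate_rejected, "pages_after_site_delete": remaining}
```

If the application opens its own connections without that pragma, deleting a site would leave orphaned rows in site_pages, so it is worth confirming how the project configures its engine.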
Page Content Template Examples (SIMPLIFIED)
Implementation - Heading Only
# src/generation/page_templates.py
def get_page_content(page_type: str, domain: str) -> str:
    """
    Generate minimal content for boilerplate pages.
    Just a heading - no other content text.
    """
    page_titles = {
        "about": "About Us",
        "contact": "Contact",
        "privacy": "Privacy Policy"
    }
    return f"<h1>{page_titles.get(page_type, page_type.title())}</h1>"
Result - Heading Only Example
<!DOCTYPE html>
<html>
<head>
    <title>About Us</title>
    <!-- Same template/styling as articles -->
</head>
<body>
    <nav>
        <ul>
            <li><a href="/index.html">Home</a></li>
            <li><a href="about.html">About</a></li>
            <li><a href="privacy.html">Privacy</a></li>
            <li><a href="contact.html">Contact</a></li>
        </ul>
    </nav>
    <main>
        <h1>About Us</h1>
        <!-- No other content - user can add later if needed -->
    </main>
</body>
</html>
Why Heading-Only Pages Work
- Fixes broken nav links - Pages exist, no 404 errors
- Better UX than completely empty - User sees something when they click the link
- User can customize - Add content manually later for specific sites
- Minimal effort - No need to generate/maintain generic content
- Deployment ready - Pages can be deployed as-is
- Future enhancement - Can add content generation later if needed
Integration with Site Creation
# In src/generation/site_provisioning.py
def create_bunnynet_site(name_prefix: str, region: str = "DE", template: str = "basic"):
    # Step 1: Create Storage Zone
    storage = bunny_client.create_storage_zone(...)

    # Step 2: Create Pull Zone
    pull = bunny_client.create_pull_zone(...)

    # Step 3: Save to database
    site = site_repo.create(...)

    # Step 4: Generate boilerplate pages (NEW - Story 3.4)
    logger.info(f"Generating boilerplate pages for new site {site.id}...")
    try:
        generate_site_pages(site, template, page_repo, template_service)
        logger.info(f"Successfully created about, contact, privacy pages for site {site.id}")
    except Exception as e:
        logger.warning(f"Failed to generate pages for site {site.id}: {e}")
        # Don't fail site creation if page generation fails

    return site
Backfill Script Usage
# One-time backfill for all existing sites (dry-run first)
uv run python scripts/backfill_site_pages.py \
--username admin \
--password yourpass \
--template basic \
--dry-run
# Output:
# Found 423 sites without boilerplate pages
# [DRY RUN] Would generate pages for site 1 (www.example.com)
# [DRY RUN] Would generate pages for site 2 (site123.b-cdn.net)
# ...
# [DRY RUN] Total: 423 sites would be updated
# Actually generate pages
uv run python scripts/backfill_site_pages.py \
--username admin \
--password yourpass \
--template basic
# Output:
# Generating pages for site 1/423 (www.example.com)... ✓
# Generating pages for site 2/423 (site123.b-cdn.net)... ✓
# ...
# Complete: 423 successful, 0 failed, 0 skipped
# Use a different template for every site in this run (default: basic)
uv run python scripts/backfill_site_pages.py \
--username admin \
--password yourpass \
--template modern \
--batch-size 50 # Process 50 sites at a time
Page URL Structure
Homepage: https://example.com/index.html
About: https://example.com/about.html
Contact: https://example.com/contact.html
Privacy: https://example.com/privacy.html
Article 1: https://example.com/how-to-fix-engines.html
Article 2: https://example.com/engine-maintenance-tips.html
Template Application Example
# For articles (existing)
template_service.apply_template(
    content=article.content,
    template_name="modern",
    title=article.title,
    meta_description=article.meta_description,
    url=article_url
)

# For pages (new)
template_service.apply_template_to_page(
    content=page_content,  # Heading-only HTML from page_templates.py
    template_name="modern",
    page_title="About Us",  # Static title
    domain=site.custom_hostname or site.pull_zone_bcdn_hostname
)
Backfill Script Implementation
# scripts/backfill_site_pages.py
import logging

# Assumes generate_site_pages lives in the module created in Task 4
from src.generation.site_page_generator import generate_site_pages

logger = logging.getLogger(__name__)


def backfill_site_pages(
    page_repo,
    site_repo,
    template_service,
    template: str = "basic",
    dry_run: bool = False,
    batch_size: int = 100
):
    """Generate boilerplate pages for all sites that don't have them"""
    # Get all sites
    all_sites = site_repo.get_all()
    logger.info(f"Found {len(all_sites)} total sites in database")

    # Filter to sites without a full set of pages.
    # Sites with only some pages are included too, so generate_site_pages
    # must skip page types that already exist or the unique constraint
    # will reject the insert.
    sites_needing_pages = []
    for site in all_sites:
        existing_pages = page_repo.get_by_site(site.id)
        if len(existing_pages) < 3:  # Should have about, contact, privacy
            sites_needing_pages.append(site)
    skipped = len(all_sites) - len(sites_needing_pages)
    logger.info(f"Found {len(sites_needing_pages)} sites without boilerplate pages")

    if dry_run:
        for site in sites_needing_pages:
            domain = site.custom_hostname or site.pull_zone_bcdn_hostname
            logger.info(f"[DRY RUN] Would generate pages for site {site.id} ({domain})")
        logger.info(f"[DRY RUN] Total: {len(sites_needing_pages)} sites would be updated")
        return

    # Generate pages for each site
    successful = 0
    failed = 0
    for idx, site in enumerate(sites_needing_pages, 1):
        domain = site.custom_hostname or site.pull_zone_bcdn_hostname
        logger.info(f"Generating pages for site {idx}/{len(sites_needing_pages)} ({domain})...")
        try:
            generate_site_pages(site, template, page_repo, template_service)
            successful += 1
        except Exception as e:
            # Keep going: one bad site should not stop the backfill
            logger.error(f"Failed to generate pages for site {site.id}: {e}")
            failed += 1

        # Progress checkpoint every batch_size sites
        if idx % batch_size == 0:
            logger.info(f"Progress: {idx}/{len(sites_needing_pages)} sites processed")

    logger.info(f"Complete: {successful} successful, {failed} failed, {skipped} skipped")
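The len(existing_pages) < 3 check treats a site with one or two pages the same as a site with none. One way to make partially populated sites explicit is to compute exactly which page types are missing. A small sketch, where missing_page_types and REQUIRED_PAGE_TYPES are hypothetical names and the only assumption about page objects is that they expose a page_type attribute, as the SitePage model does:

```python
from types import SimpleNamespace
from typing import Iterable, List

# The three page types every site is expected to have.
REQUIRED_PAGE_TYPES = frozenset({"about", "contact", "privacy"})


def missing_page_types(existing_pages: Iterable) -> List[str]:
    """Return the required page types a site does not have yet, sorted by name."""
    present = {page.page_type for page in existing_pages}
    return sorted(REQUIRED_PAGE_TYPES - present)


# Example: a site that already has only its about page.
partial_site_pages = [SimpleNamespace(page_type="about")]
still_needed = missing_page_types(partial_site_pages)
```

A site whose list comes back empty can be counted as skipped, and any other site needs only the listed page types generated.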
Domain Extraction
def get_domain_from_site(site_deployment: SiteDeployment) -> str:
    """Extract domain for use in page content (email addresses, etc.)"""
    if site_deployment.custom_hostname:
        return site_deployment.custom_hostname
    else:
        return site_deployment.pull_zone_bcdn_hostname
Privacy Policy Legal Note
This story ships a heading-only privacy page. If privacy policy content is added later, the template should be:
- Generic enough to apply to blog/content sites
- Comprehensive enough to cover common scenarios (cookies, analytics, third-party links)
- NOT legal advice - users should consult a lawyer for specific requirements
- Include standard disclaimers
- Regularly reviewed and updated (document version/date)
Recommended approach: Use a well-tested generic template from a reputable source (e.g., Privacy Policy Generator) and adapt it to fit our template structure.
Dependencies
- Story 3.1: Site assignment must be complete (need to know which sites are in use)
- Story 3.3: Navigation menu is already in templates (pages fulfill those links)
- Story 2.4: Template service exists and can apply HTML templates
- Story 1.6: SiteDeployment table exists
Future Considerations
- Story 4.1 will deploy these pages along with articles
- Future: Custom page content per project (override generic templates)
- Future: Homepage generation with dynamic article listing
- Future: Allow users to edit boilerplate page content via CLI or web interface
- Future: Additional pages (terms of service, disclaimer, etc.)
- Future: Page templates with more customization options (site name, tagline, etc.)
Deferred to Later
- Homepage (index.html) generation - Could be part of this story or deferred to Epic 4
  - If generated here: Simple page listing all articles on the site
  - If deferred: Epic 4 deployment could create a basic redirect or placeholder
- Custom page content per project - Allow projects to override default templates
- Multi-language support - Generate pages in different languages based on project settings
Total Effort
15 story points (reduced from 20 due to heading-only simplification)
Effort Breakdown
- Database Schema (2 points)
- Repository Layer (2 points)
- Page Content Templates (1 point)
- Generation Logic (2 points)
- Site Creation Integration (2 points)
- Template Service Updates (1 point)
- Backfill Script (2 points)
- Homepage Generation (deferred)
- Unit Tests (2 points)
- Integration Tests (1 point)
Total: 15 story points
Effort Reduction
- Original estimate: 20 story points (with full page content)
- Simplified (heading-only pages): 15 story points
- Savings: 5 story points (no complex content generation needed)
Notes
- Pages should be visually consistent with articles (same template)
- Pages have heading only - just <h1> tag, no body content
- Better UX than completely empty (user sees page title when they click nav link)
- User can manually add content later for specific sites if desired
- Pages are generated once per site at creation time
- Future enhancement: Add content generation for privacy policy if legally required
- Future enhancement: CLI command to update page content for specific sites