# Story 4.1: Deploy Content to Cloud Storage
## Status

**DRAFT** - Needs Review

## Story

**As a developer**, I want to upload all generated HTML files for a batch to their designated Bunny.net storage buckets so that the content is hosted and ready to be served.

## Context

- Epic 4 is about deploying finalized content to cloud storage
- Story 3.4 implemented boilerplate site pages (about, contact, privacy)
- Articles have URLs and are assigned to sites (Story 3.1)
- Interlinking is complete (Story 3.3)
- Content is ready to deploy after batch processing completes
- Bunny.net is the only cloud provider for now (multi-cloud is technical debt)
## Acceptance Criteria

### Core Deployment Functionality

- CLI command `deploy-batch --batch_id <id>` deploys all content in a batch
- Deployment is also triggered automatically after batch generation completes
- Deployment uploads both articles and boilerplate pages (about, contact, privacy)
- For boilerplate pages: check the `site_pages` table, deploy pages that exist
- Read HTML content directly from the `site_pages.content` field (stored in Story 3.4)
- Authentication uses:
  - `BUNNY_API_KEY` from `.env` (storage API operations)
  - `storage_zone_password` from the SiteDeployment model (per-zone)
  - `BUNNY_ACCOUNT_API_KEY` from `.env` (only for creating zones, not uploads)
- For each piece of content, identify the correct destination storage bucket/path
- Upload the final HTML to the target path (e.g., `about.html`, `my-article-slug.html`)

### Error Handling

- Continue on error (don't halt the entire deployment if one file fails)
- Log errors for individual file failures
- Report a summary at the end: successful uploads, failed uploads, total time
- Write the summary to both screen output and the log file

### URL Tracking (Story 4.2 Preview)

- After an article is successfully deployed, log its public URL to a tier-segregated text file
- Create the `deployment_logs/` folder if it doesn't exist
- Two files per day: `YYYY-MM-DD_tier1_urls.txt` and `YYYY-MM-DD_other_tiers_urls.txt`
- URLs for Tier 1 articles → `_tier1_urls.txt`
- URLs for Tier 2+ articles → `_other_tiers_urls.txt`
- Boilerplate pages (about, contact, privacy) are NOT logged to these files
- **Must avoid duplicate URLs**: read the file, check if the URL exists, only append if new
  - Prevents duplicates from manual re-runs after automatic deployment

### Database Updates (Story 4.3 Preview)

- Update article status to 'deployed' after a successful upload
- Store the final public URL in the database
- Use transactional updates to ensure data integrity
## Tasks / Subtasks

### 1. Create Bunny.net Storage Upload Client

**Effort:** 3 story points

- [ ] Create `src/deployment/bunny_storage.py` module
- [ ] Implement `BunnyStorageClient` class for uploading files
- [ ] Use the Bunny.net Storage API (different from the Account API)
- [ ] Authentication using:
  - `BUNNY_API_KEY` from `.env` (account-level storage API key)
  - `storage_zone_password` from the SiteDeployment model (per-zone password)
  - Determine the correct authentication method during implementation
- [ ] Methods:
  - `upload_file(zone_name, zone_password, file_path, content, content_type='text/html')`
  - `file_exists(zone_name, zone_password, file_path) -> bool`
  - `list_files(zone_name, zone_password, prefix='') -> List[str]`
- [ ] Handle HTTP errors, timeouts, and retries (3 retries with exponential backoff)
- [ ] Log at INFO level for uploads, ERROR for failures
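The retry requirement above (3 retries with exponential backoff) can be sketched as a small wrapper the storage client applies around its HTTP calls. This is a minimal sketch; the injectable `sleep` parameter is an assumption added for testability, not part of the planned client API:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying up to `retries` times with exponential backoff.

    Delays grow as base_delay * 2**attempt: 1s, 2s, 4s with the defaults.
    """
    last_error: Exception | None = None
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception as exc:  # the real client should catch only HTTP/timeout errors
            last_error = exc
            if attempt < retries:
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

The storage client would call `with_retries(lambda: self._put(...))` so upload, exists, and list operations share one backoff policy.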
### 2. Create Deployment Service

**Effort:** 3 story points

- [ ] Create `src/deployment/deployment_service.py` module
- [ ] Implement `DeploymentService` class with:
  - `deploy_batch(batch_id, project_id, continue_on_error=True)`
  - `deploy_article(content_id, site_deployment)`
  - `deploy_boilerplate_page(site_page, site_deployment)`
- [ ] Query all `GeneratedContent` records for the project_id
- [ ] Query all `SitePage` records for sites in the batch
- [ ] For each article:
  - Get site deployment info (storage zone, region, hostname)
  - Generate the file path (slug-based, e.g., `my-article-slug.html`)
  - Upload the HTML content to Bunny.net storage
  - Log success/failure
- [ ] For each boilerplate page (if it exists):
  - Get site deployment info
  - Generate the file path (e.g., `about.html`, `contact.html`, `privacy.html`)
  - Upload the HTML content
  - Log success/failure
- [ ] Track deployment results (successful, failed, skipped)
- [ ] Return a deployment summary

### 3. Implement URL Generation for Deployment

**Effort:** 2 story points

- [ ] Extend `src/generation/url_generator.py` module
- [ ] Add `generate_public_url(site_deployment, file_path) -> str`:
  - Use `custom_hostname` if available, else `pull_zone_bcdn_hostname`
  - Return the full URL: `https://{hostname}/{file_path}`
- [ ] Add `generate_file_path(content) -> str`:
  - For articles: use a slug from the title or keyword (lowercase, hyphens, `.html` extension)
  - For boilerplate pages: fixed names (`about.html`, `contact.html`, `privacy.html`)
- [ ] Handle edge cases (special characters, long slugs, conflicts)
### 4. Implement URL Logging to Text Files

**Effort:** 2 story points

- [ ] Create `src/deployment/url_logger.py` module
- [ ] Implement `URLLogger` class with:
  - `log_article_url(url, tier, date=None)`
  - `get_existing_urls(tier, date=None) -> Set[str]`
- [ ] Create the `deployment_logs/` directory if it doesn't exist
- [ ] Determine the file based on tier and date:
  - Tier 1: `deployment_logs/YYYY-MM-DD_tier1_urls.txt`
  - Tier 2+: `deployment_logs/YYYY-MM-DD_other_tiers_urls.txt`
- [ ] Check whether the URL already exists in the file before appending
- [ ] Append the URL to the file (one per line)
- [ ] Thread-safe file writing (use file locks)

### 5. Implement Database Status Updates

**Effort:** 2 story points

- [ ] Update `src/database/models.py`:
  - Add `deployed_url` field to `GeneratedContent` (nullable string)
  - Add `deployed_at` field to `GeneratedContent` (nullable datetime)
- [ ] Create migration script `scripts/migrate_add_deployment_fields.py`
- [ ] Update `GeneratedContentRepository` with:
  - `mark_as_deployed(content_id, url, timestamp=None)`
  - `get_deployed_content(project_id) -> List[GeneratedContent]`
- [ ] Use transactions to ensure atomicity
- [ ] Log status updates at INFO level
### 6. Create CLI Command: deploy-batch

**Effort:** 2 story points

- [ ] Add `deploy-batch` command to `src/cli/commands.py`
- [ ] Arguments:
  - `--batch_id` (required): Batch/project ID to deploy
  - `--admin-user` (optional): Admin username for authentication
  - `--admin-password` (optional): Admin password
  - `--continue-on-error` (default: True): Continue if a file fails
  - `--dry-run` (default: False): Preview what would be deployed
- [ ] Authenticate the admin user
- [ ] Load Bunny.net credentials from `.env`
- [ ] Call `DeploymentService.deploy_batch()`
- [ ] Display progress (articles uploaded, pages uploaded, errors)
- [ ] Show a final summary with statistics
- [ ] Exit code 0 if all uploads succeeded, 1 if any failed
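The argument surface and exit-code rule above can be sketched with stdlib `argparse`. The real command lives in `src/cli/commands.py` and may use a different CLI framework; this is illustrative only:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="deploy-batch")
    parser.add_argument("--batch_id", type=int, required=True, help="Batch/project ID to deploy")
    parser.add_argument("--admin-user", help="Admin username for authentication")
    parser.add_argument("--admin-password", help="Admin password")
    # BooleanOptionalAction (Python 3.9+) also generates --no-continue-on-error
    parser.add_argument("--continue-on-error", action=argparse.BooleanOptionalAction, default=True)
    parser.add_argument("--dry-run", action="store_true", default=False)
    return parser


def exit_code(results: dict) -> int:
    """0 if everything deployed, 1 if any article or page failed."""
    return 1 if results["articles_failed"] or results["pages_failed"] else 0
```

The command would end with `sys.exit(exit_code(results))` so CI and cron wrappers can detect partial failures.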
### 7. Integrate Deployment into Batch Processing

**Effort:** 2 story points

- [ ] Update `src/generation/batch_processor.py`
- [ ] Add optional `auto_deploy` parameter to `process_job()`
- [ ] After interlinking completes, trigger deployment if `auto_deploy=True`
- [ ] Use the same deployment service as the CLI command
- [ ] Log deployment results
- [ ] Handle deployment errors gracefully (don't fail the batch if deployment fails)
- [ ] Make `auto_deploy=True` the default (deploy immediately after generation)
- [ ] Allow `auto_deploy=False` for testing/debugging scenarios
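The "don't fail the batch" requirement amounts to a try/except around the deployment call. A sketch with hypothetical names (`run_post_generation_deploy` and `deploy_fn` are illustrative, not the real `batch_processor` API):

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def run_post_generation_deploy(
    project_id: int,
    deploy_fn: Callable[[int], dict],
    auto_deploy: bool = True,
) -> Optional[dict]:
    """Called after interlinking completes. Deployment errors are logged,
    never re-raised, so a failed deploy cannot fail the already completed batch."""
    if not auto_deploy:
        logger.info("auto_deploy disabled, skipping deployment for project %s", project_id)
        return None
    try:
        results = deploy_fn(project_id)
        logger.info("Auto-deploy finished for project %s: %s", project_id, results)
        return results
    except Exception:
        logger.exception("Auto-deploy failed for project %s; batch still succeeds", project_id)
        return None
```

Returning `None` on failure lets the batch processor record "generated but not deployed", and the manual `deploy-batch` command remains the recovery path.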
### 8. Environment Variable Validation

**Effort:** 1 story point

- [ ] Confirm `src/core/config.py` loads Bunny.net keys from `.env` only
- [ ] Add validation in the deployment service to check required env vars:
  - `BUNNY_API_KEY` (for storage uploads)
  - `BUNNY_ACCOUNT_API_KEY` (for account operations, if needed)
- [ ] Raise a clear error if keys are missing
- [ ] Document in the technical notes which keys are required
- [ ] Do NOT reference `master.config.json` for any API keys
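A minimal validation helper for the checklist above. The injectable `environ` parameter is an assumption added for testability; reporting every missing key at once avoids fix-one-rerun loops:

```python
import os

REQUIRED_ENV_VARS = ("BUNNY_API_KEY", "BUNNY_ACCOUNT_API_KEY")


def validate_deployment_env(environ=os.environ) -> None:
    """Fail fast with a clear message listing every missing or empty key."""
    missing = [name for name in REQUIRED_ENV_VARS if not environ.get(name)]
    if missing:
        raise EnvironmentError(
            f"Missing required environment variables: {', '.join(missing)}. "
            "Add them to your .env file (do NOT put API keys in master.config.json)."
        )
```

The deployment service would call this once at startup, before touching the database or the network.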
### 9. Unit Tests

**Effort:** 3 story points

- [ ] Test `BunnyStorageClient` upload functionality (mock HTTP calls)
- [ ] Test URL generation for various content types
- [ ] Test file path generation (slug creation, special characters)
- [ ] Test the URL logger (file creation, duplicate prevention)
- [ ] Test the deployment service (successful upload, failed upload, mixed results)
- [ ] Test database status updates
- [ ] Mock Bunny.net API responses
- [ ] Achieve >80% code coverage for the new modules

### 10. Integration Tests

**Effort:** 2 story points

- [ ] Test end-to-end deployment of a small batch (2-3 articles)
- [ ] Test deployment with boilerplate pages
- [ ] Test deployment without boilerplate pages
- [ ] Test URL logging (multiple deployments, different days)
- [ ] Test database updates (status changes, URLs stored)
- [ ] Test the CLI command with dry-run mode
- [ ] Test continue-on-error behavior
- [ ] Verify no duplicate URLs in the log files
## Technical Notes

### Bunny.net Storage API

Bunny.net has two separate APIs:

1. **Account API** (existing `BunnyNetClient`): For creating storage zones and pull zones
   - Uses `BUNNY_ACCOUNT_API_KEY` from `.env`
2. **Storage API** (new `BunnyStorageClient`): For uploading/managing files
   - Uses `BUNNY_API_KEY` from `.env` (account-level storage access)
   - Uses `storage_zone_password` from the `SiteDeployment` model (per-zone password)
   - May require both credentials for authentication

Storage API authentication:

- Base URL: `https://storage.bunnycdn.com/{zone_name}/{file_path}`
- Authentication method to be determined during implementation:
  - `BUNNY_API_KEY` from `.env` (account-level)
  - `storage_zone_password` from the database (per-zone, returned in JSON when the zone is created)
  - May require one or both keys depending on Bunny.net's API requirements
- The storage zone password can be extracted from the Bunny.net JSON response during zone creation
- If implementation issues arise, reference code/examples can be provided

Upload example (a runnable sketch; which key goes in the `AccessKey` header is still TBD during implementation):

```python
import os

import requests

# Get site from database
site = site_repo.get_by_id(site_deployment_id)

# Get API key from .env
bunny_api_key = os.getenv("BUNNY_API_KEY")

# Upload (authentication method TBD: bunny_api_key OR site.storage_zone_password)
response = requests.put(
    f"https://storage.bunnycdn.com/{site.storage_zone_name}/my-article.html",
    headers={
        "AccessKey": site.storage_zone_password,  # TBD: may need bunny_api_key instead
        "Content-Type": "text/html",
    },
    data="<html>...</html>",
    timeout=30,
)
response.raise_for_status()
```
### File Path Structure

```
Storage Zone: my-zone
Region: DE (Germany)

Articles:
  /my-article-slug.html
  /another-article.html
  /third-article-title.html

Boilerplate pages:
  /about.html
  /contact.html
  /privacy.html
```

Not using subdirectories for simplicity.
Future: could organize by date or category.
### URL Logger Implementation

```python
# src/deployment/url_logger.py

import fcntl  # For file locking on Unix
from datetime import datetime, timezone
from pathlib import Path
from typing import Set


class URLLogger:
    def __init__(self, logs_dir: str = "deployment_logs"):
        self.logs_dir = Path(logs_dir)
        self.logs_dir.mkdir(exist_ok=True)

    def log_article_url(self, url: str, tier: str, date: datetime = None):
        if date is None:
            date = datetime.now(timezone.utc)

        filepath = self._log_filepath(tier, date)

        # Check for duplicates
        existing = self.get_existing_urls(tier, date)
        if url in existing:
            return  # Skip duplicate

        # Append to file (with lock)
        with open(filepath, 'a') as f:
            fcntl.flock(f, fcntl.LOCK_EX)
            f.write(f"{url}\n")
            fcntl.flock(f, fcntl.LOCK_UN)

    def get_existing_urls(self, tier: str, date: datetime = None) -> Set[str]:
        """
        Get existing URLs from the log file to prevent duplicates.

        This is critical for preventing duplicate entries when:
        - Auto-deployment runs, then a manual re-run happens
        - Deployment fails partway and is restarted
        """
        if date is None:
            date = datetime.now(timezone.utc)

        filepath = self._log_filepath(tier, date)
        if not filepath.exists():
            return set()

        with open(filepath, 'r') as f:
            return set(line.strip() for line in f if line.strip())

    def _log_filepath(self, tier: str, date: datetime) -> Path:
        # Tier 1 goes to its own file; all other tiers share one file per day
        tier_num = self._extract_tier_number(tier)
        if tier_num == 1:
            filename = f"{date.strftime('%Y-%m-%d')}_tier1_urls.txt"
        else:
            filename = f"{date.strftime('%Y-%m-%d')}_other_tiers_urls.txt"
        return self.logs_dir / filename

    def _extract_tier_number(self, tier: str) -> int:
        # Extract the number from "tier1", "tier2", etc.
        return int(''.join(c for c in tier if c.isdigit()))
```
### Deployment Service Implementation

```python
# src/deployment/deployment_service.py

import logging
import time
from typing import Any, Dict

from src.database.repositories import (
    GeneratedContentRepository,
    SiteDeploymentRepository,
    SitePageRepository,
)
from src.deployment.bunny_storage import BunnyStorageClient
from src.deployment.url_logger import URLLogger
from src.generation.url_generator import generate_file_path, generate_public_url

logger = logging.getLogger(__name__)


class DeploymentService:
    def __init__(
        self,
        storage_client: BunnyStorageClient,
        content_repo: GeneratedContentRepository,
        site_repo: SiteDeploymentRepository,
        page_repo: SitePageRepository,
        url_logger: URLLogger,
    ):
        self.storage = storage_client
        self.content_repo = content_repo
        self.site_repo = site_repo
        self.page_repo = page_repo
        self.url_logger = url_logger

    def deploy_batch(self, project_id: int, continue_on_error: bool = True) -> Dict[str, Any]:
        """
        Deploy all content for a project/batch.

        Returns:
            Dict with deployment statistics:
            {
                'articles_deployed': 10,
                'articles_failed': 1,
                'pages_deployed': 6,
                'pages_failed': 0,
                'total_time': 45.2
            }
        """
        start_time = time.monotonic()
        results = {
            'articles_deployed': 0,
            'articles_failed': 0,
            'pages_deployed': 0,
            'pages_failed': 0,
            'errors': [],
        }

        # Get all articles for the project
        articles = self.content_repo.get_by_project_id(project_id)
        logger.info(f"Found {len(articles)} articles to deploy for project {project_id}")

        # Deploy articles
        for article in articles:
            if not article.site_deployment_id:
                logger.warning(f"Article {article.id} has no site assigned, skipping")
                continue

            try:
                site = self.site_repo.get_by_id(article.site_deployment_id)
                if not site:
                    raise ValueError(f"Site {article.site_deployment_id} not found")

                # Deploy article
                url = self.deploy_article(article, site)

                # Log URL to text file
                self.url_logger.log_article_url(url, article.tier)

                # Update database
                self.content_repo.mark_as_deployed(article.id, url)

                results['articles_deployed'] += 1
                logger.info(f"Deployed article {article.id} to {url}")

            except Exception as e:
                results['articles_failed'] += 1
                results['errors'].append({
                    'type': 'article',
                    'id': article.id,
                    'error': str(e),
                })
                logger.error(f"Failed to deploy article {article.id}: {e}")

                if not continue_on_error:
                    raise

        # Get the unique sites referenced by the articles
        site_ids = set(a.site_deployment_id for a in articles if a.site_deployment_id)

        # Deploy boilerplate pages for each site
        for site_id in site_ids:
            site = self.site_repo.get_by_id(site_id)
            pages = self.page_repo.get_by_site(site_id)

            if not pages:
                logger.debug(f"Site {site_id} has no boilerplate pages, skipping")
                continue

            logger.info(f"Found {len(pages)} boilerplate pages for site {site_id}")

            for page in pages:
                try:
                    # Read HTML from the database (stored in page.content from Story 3.4)
                    url = self.deploy_boilerplate_page(page, site)
                    results['pages_deployed'] += 1
                    logger.info(f"Deployed page {page.page_type} to {url}")

                except Exception as e:
                    results['pages_failed'] += 1
                    results['errors'].append({
                        'type': 'page',
                        'site_id': site_id,
                        'page_type': page.page_type,
                        'error': str(e),
                    })
                    logger.error(f"Failed to deploy page {page.page_type} for site {site_id}: {e}")

                    if not continue_on_error:
                        raise

        results['total_time'] = time.monotonic() - start_time
        return results

    def deploy_article(self, article, site) -> str:
        """Deploy a single article and return its public URL."""
        file_path = generate_file_path(article)
        url = generate_public_url(site, file_path)

        # Upload using both BUNNY_API_KEY and the zone password;
        # BunnyStorageClient determines which auth method to use
        self.storage.upload_file(
            zone_name=site.storage_zone_name,
            zone_password=site.storage_zone_password,  # Per-zone password from DB
            file_path=file_path,
            content=article.formatted_html,
            content_type='text/html',
        )

        return url

    def deploy_boilerplate_page(self, page, site) -> str:
        """
        Deploy a boilerplate page and return its public URL.

        Note: Uses stored HTML from page.content (from Story 3.4).
        Technical debt: could regenerate on-the-fly instead of storing.
        """
        file_path = f"{page.page_type}.html"
        url = generate_public_url(site, file_path)

        self.storage.upload_file(
            zone_name=site.storage_zone_name,
            zone_password=site.storage_zone_password,
            file_path=file_path,
            content=page.content,  # Full HTML stored in DB
            content_type='text/html',
        )

        return url
```
### CLI Command Example

```bash
# Deploy a batch manually
uv run python -m src.cli deploy-batch \
  --batch_id 123 \
  --admin-user admin \
  --admin-password mypass

# Output:
# Authenticating...
# Loading Bunny.net credentials...
# Deploying batch 123...
# [1/50] Deploying article "How to Fix Engines"... ✓
# [2/50] Deploying article "Engine Maintenance Tips"... ✓
# ...
# [50/50] Deploying article "Common Engine Problems"... ✓
# Deploying boilerplate pages...
# [1/6] Deploying about.html for site1.b-cdn.net... ✓
# [2/6] Deploying contact.html for site1.b-cdn.net... ✓
# ...
#
# Deployment Summary:
# ==================
# Articles deployed: 48
# Articles failed: 2
# Pages deployed: 6
# Pages failed: 0
# Total time: 2m 34s
#
# Failed articles:
# - Article 15: Connection timeout
# - Article 32: Invalid HTML content

# Dry-run mode
uv run python -m src.cli deploy-batch \
  --batch_id 123 \
  --dry-run

# Output shows what would be deployed without actually uploading
```
### Environment Variables

Required in the `.env` file:

```bash
# Bunny.net Account API (for creating/managing storage zones and pull zones)
BUNNY_ACCOUNT_API_KEY=your_account_api_key_here

# Bunny.net Storage API (for uploading files to storage)
BUNNY_API_KEY=your_storage_api_key_here

# Note: storage_zone_password is per-zone and stored in the database
# Both BUNNY_API_KEY and storage_zone_password may be needed for uploads
# API keys should ONLY be in the .env file, NOT in master.config.json
```
### Database Schema Updates

```sql
-- Add deployment tracking fields to generated_content
ALTER TABLE generated_content ADD COLUMN deployed_url TEXT NULL;
ALTER TABLE generated_content ADD COLUMN deployed_at TIMESTAMP NULL;

CREATE INDEX idx_generated_content_deployed ON generated_content(deployed_at);
```
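The migration script from Task 5 (`scripts/migrate_add_deployment_fields.py`) could be an idempotent sketch like this, assuming SQLite; the `PRAGMA table_info` check makes re-runs safe:

```python
# scripts/migrate_add_deployment_fields.py (sketch, assuming SQLite)
import sqlite3


def migrate(conn: sqlite3.Connection) -> None:
    """Add deployed_url/deployed_at to generated_content; safe to run twice."""
    # Column names already present on the table (row[1] is the column name)
    existing = {row[1] for row in conn.execute("PRAGMA table_info(generated_content)")}
    with conn:  # one transaction: commit on success, rollback on error
        if "deployed_url" not in existing:
            conn.execute("ALTER TABLE generated_content ADD COLUMN deployed_url TEXT NULL")
        if "deployed_at" not in existing:
            conn.execute("ALTER TABLE generated_content ADD COLUMN deployed_at TIMESTAMP NULL")
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_generated_content_deployed "
            "ON generated_content(deployed_at)"
        )
```

If the project uses a migration framework instead, the same existence checks carry over; the key property is that re-running the script never raises "duplicate column".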
## Dependencies

- Story 3.1: Site assignment (need site_deployment_id on articles)
- Story 3.3: Content interlinking (HTML must be finalized)
- Story 3.4: Boilerplate pages (need SitePage table)
- Bunny.net Storage API access
- Environment variables configured in `.env`

## Future Considerations

- Story 4.2: URL logging (partially implemented here)
- Story 4.3: Database status updates (partially implemented here)
- Story 4.4: Post-deployment verification
- Multi-cloud support (AWS S3, Azure, DigitalOcean, etc.)
- CDN cache purging after deployment
- Parallel uploads for faster deployment
- Resumable uploads for large files
- Deployment rollback mechanism

## Technical Debt Created

- Multi-cloud support deferred (only Bunny.net for now)
- No CDN cache purging yet (Story 4.x)
- No deployment verification yet (Story 4.4)
- URL logging is simple (no database tracking of logged URLs)
- Boilerplate pages stored as full HTML in the database (inefficient)
  - Better approach: store just a page_type marker and regenerate HTML on-the-fly at deployment
  - Reduces storage and ensures consistency with current templates
  - Defer this optimization to a later story
## Total Effort

22 story points

### Effort Breakdown

1. Bunny Storage Client (3 points)
2. Deployment Service (3 points)
3. URL Generation (2 points)
4. URL Logging (2 points)
5. Database Updates (2 points)
6. CLI Command (2 points)
7. Batch Integration (2 points)
8. Environment Audit (1 point)
9. Unit Tests (3 points)
10. Integration Tests (2 points)

## Questions & Clarifications

### Question 1: Boilerplate Page Deployment Strategy

**Status:** ✓ RESOLVED

The approach:

- Check the `site_pages` table in the database
- Only deploy boilerplate pages if they exist in the DB
- Read HTML content from the `site_pages.content` field
- Most sites won't have them (only newly created sites from Story 3.4+)
- Don't check remote buckets (the database is the source of truth)

### Question 2: URL Duplicate Prevention

**Status:** ✓ RESOLVED

Approach:

- Read the entire file before appending
- Check whether the URL exists in memory (a set); skip if it is a duplicate
- File locking for thread-safety
- This prevents duplicate URLs from manual re-runs after automatic deployment
- No database tracking needed (the file is the source of truth)

### Question 3: Auto-deploy Default Behavior

**Status:** ✓ RESOLVED

Decision: **ON by default**

- Auto-deploy after batch generation completes
- No reason to delay deployment in the normal workflow
- The CLI command is still available for manual re-deployment if auto-deploy fails
- Can be disabled for testing via a flag if needed

### Question 4: API Keys in master.config.json

**Status:** ✓ RESOLVED

Decision: **Ignore master.config.json for API keys**

- All API keys come from the `.env` file only
- Even if keys exist in master.config.json now, they'll be removed in future epics
- Don't reference master.config.json for any authentication
- Only use `.env` for credentials

## Notes

- Keep deployment simple for the first iteration
- Focus on reliability over speed
- Auto-deploy is ON by default (deploy immediately after batch generation)
- Manual CLI command available for re-deployment or testing
- Comprehensive error reporting is critical
- URL logging format is simple (one URL per line)
- All API keys come from the `.env` file, NOT master.config.json
- Storage API authentication details will be determined during implementation