197 lines
7.7 KiB
Markdown
197 lines
7.7 KiB
Markdown
# Story 4.1 Implementation Summary
|
|
|
|
## Status: COMPLETE
|
|
|
|
## Overview
|
|
Successfully implemented deployment of generated content to Bunny.net cloud storage with tier-segregated URL logging and automatic deployment after batch generation.
|
|
|
|
## Implementation Details
|
|
|
|
### 1. Bunny.net Storage Client (`src/deployment/bunny_storage.py`)
|
|
- `BunnyStorageClient` class for uploading files to Bunny.net storage zones
|
|
- Uses per-zone `storage_zone_password` from database for authentication
|
|
- Region-aware URL generation:
|
|
- Frankfurt (DE): `storage.bunnycdn.com` (no prefix)
|
|
- Other regions: `{region}.storage.bunnycdn.com` (e.g., `la.storage.bunnycdn.com`)
|
|
- Uses `application/octet-stream` content-type per Bunny.net API requirements
|
|
- Implements retry logic with exponential backoff (3 attempts)
|
|
- Methods:
|
|
- `upload_file()`: Upload HTML content to storage zone
|
|
- `file_exists()`: Check if file exists in storage
|
|
- `list_files()`: List files in storage zone
|
|
- **Tested with real Bunny.net storage** - successful upload verified
|
|
|
|
### 2. Database Updates
|
|
- Added `deployed_url` (TEXT, nullable) to `generated_content` table
|
|
- Added `deployed_at` (TIMESTAMP, nullable, indexed) to `generated_content` table
|
|
- Created migration script: `scripts/migrate_add_deployment_fields.py`
|
|
- Added repository methods:
|
|
- `GeneratedContentRepository.mark_as_deployed()`: Update deployment status
|
|
- `GeneratedContentRepository.get_deployed_content()`: Query deployed articles
|
|
|
|
### 3. URL Logger (`src/deployment/url_logger.py`)
|
|
- `URLLogger` class for tier-segregated URL logging
|
|
- Creates daily log files in `deployment_logs/` directory:
|
|
- `YYYY-MM-DD_tier1_urls.txt` for Tier 1 articles
|
|
- `YYYY-MM-DD_other_tiers_urls.txt` for Tier 2+ articles
|
|
- Automatic duplicate prevention by reading existing URLs before appending
|
|
- Boilerplate pages (about, contact, privacy) are NOT logged
|
|
|
|
### 4. URL Generation (`src/generation/url_generator.py`)
|
|
Extended with new functions:
|
|
- `generate_public_url()`: Create full HTTPS URL from site + file path
|
|
- `generate_file_path()`: Generate storage path for articles (slug-based)
|
|
- `generate_page_file_path()`: Generate storage path for boilerplate pages
|
|
|
|
### 5. Deployment Service (`src/deployment/deployment_service.py`)
|
|
- `DeploymentService` class orchestrates deployment workflow
|
|
- `deploy_batch()`: Deploy all content for a project
|
|
- Uploads articles with `formatted_html`
|
|
- Uploads boilerplate pages (about, contact, privacy)
|
|
- Logs article URLs to tier-segregated files
|
|
- Updates database with deployment status and URLs
|
|
- Returns detailed statistics
|
|
- `deploy_article()`: Deploy single article
|
|
- `deploy_boilerplate_page()`: Deploy single boilerplate page
|
|
- Continues on error by default (configurable)
|
|
|
|
### 6. CLI Command (`src/cli/commands.py`)
|
|
Added `deploy-batch` command:
|
|
```bash
|
|
uv run python -m src.cli deploy-batch \
|
|
--batch-id 123 \
|
|
--admin-user admin \
|
|
--admin-password mypass
|
|
```
|
|
|
|
Options:
|
|
- `--batch-id` (required): Project/batch ID to deploy
|
|
- `--admin-user` / `--admin-password`: Authentication
|
|
- `--continue-on-error`: Continue if file fails (default: True)
|
|
- `--dry-run`: Preview what would be deployed
|
|
|
|
### 7. Automatic Deployment Integration (`src/generation/batch_processor.py`)
|
|
- Added `auto_deploy` parameter to `process_job()` (default: True)
|
|
- Deployment triggers automatically after all tiers complete
|
|
- Uses same `DeploymentService` as manual CLI command
|
|
- Graceful error handling (logs warning, continues batch processing)
|
|
- Can be disabled via `auto_deploy=False` for testing
|
|
|
|
### 8. Configuration (`src/core/config.py`)
|
|
- Added `get_bunny_storage_api_key()` validation function
|
|
- Checks for `BUNNY_API_KEY` in `.env` file
|
|
- Clear error messages if keys are missing
|
|
|
|
### 9. Testing (`tests/integration/test_deployment.py`)
|
|
Comprehensive integration tests covering:
|
|
- URL generation and slug creation
|
|
- Tier-segregated URL logging with duplicate prevention
|
|
- Bunny.net storage client uploads
|
|
- Deployment service (articles, pages, batches)
|
|
- All 13 tests passing
|
|
|
|
## File Structure
|
|
|
|
```
|
|
deployment_logs/
|
|
YYYY-MM-DD_tier1_urls.txt # Tier 1 article URLs
|
|
YYYY-MM-DD_other_tiers_urls.txt # Tier 2+ article URLs
|
|
|
|
src/deployment/
|
|
bunny_storage.py # Storage upload client
|
|
deployment_service.py # Main deployment orchestration
|
|
url_logger.py # Tier-segregated URL logging
|
|
|
|
scripts/
|
|
migrate_add_deployment_fields.py # Database migration
|
|
|
|
tests/integration/
|
|
test_deployment.py # Integration tests
|
|
```
|
|
|
|
## Database Schema
|
|
|
|
```sql
|
|
ALTER TABLE generated_content ADD COLUMN deployed_url TEXT NULL;
|
|
ALTER TABLE generated_content ADD COLUMN deployed_at TIMESTAMP NULL;
|
|
CREATE INDEX idx_generated_content_deployed ON generated_content(deployed_at);
|
|
```
|
|
|
|
## Usage Examples
|
|
|
|
### Manual Deployment
|
|
```bash
|
|
# Deploy a specific batch
|
|
uv run python -m src.cli deploy-batch --batch-id 123 --admin-user admin --admin-password pass
|
|
|
|
# Dry run to preview
|
|
uv run python -m src.cli deploy-batch --batch-id 123 --dry-run
|
|
```
|
|
|
|
### Automatic Deployment
|
|
```bash
|
|
# Generate batch with auto-deployment (default)
|
|
uv run python -m src.cli generate-batch --job-file jobs/my_job.json
|
|
|
|
# Generate without auto-deployment
|
|
# (Add --auto-deploy flag to generate-batch command if needed)
|
|
```
|
|
|
|
## Key Design Decisions
|
|
|
|
1. **Authentication**: Uses per-zone `storage_zone_password` from database for uploads. No API key from `.env` needed for storage operations. The `BUNNY_ACCOUNT_API_KEY` is only for zone creation/management.
|
|
2. **File Locking**: Skipped for simplicity - duplicate prevention via file reading is sufficient
|
|
3. **Auto-deploy Default**: ON by default for convenience, can be disabled for testing
|
|
4. **Continue on Error**: Enabled by default to ensure partial deployments complete
|
|
5. **URL Logging**: Simple text files (one URL per line) for easy parsing by Story 4.2
|
|
6. **Boilerplate Pages**: Deploy stored HTML from `site_pages.content` (from Story 3.4)
|
|
|
|
## Dependencies Met
|
|
|
|
- Story 3.1: Site assignment (articles have `site_deployment_id`)
|
|
- Story 3.3: Content interlinking (HTML is finalized)
|
|
- Story 3.4: Boilerplate pages (`SitePage` table exists)
|
|
|
|
## Environment Variables Required
|
|
|
|
```bash
|
|
BUNNY_ACCOUNT_API_KEY=your_account_api_key_here # For zone creation (already existed)
|
|
```
|
|
|
|
**Important**: File uploads do NOT use an API key from `.env`. They use the per-zone `storage_zone_password` stored in the database (in the `site_deployments` table). This password is set automatically when zones are created via `provision-site` or `sync-sites` commands.
|
|
|
|
## Testing Results
|
|
|
|
All 18 tests passing:
|
|
- URL generation (4 tests)
|
|
- URL logging (4 tests)
|
|
- Storage client (2 tests)
|
|
- Deployment service (3 tests)
|
|
- Storage URL generation (5 tests - including DE region special case)
|
|
|
|
**Real-world validation:** Successfully uploaded test file to Bunny.net storage and verified HTTP 201 response.
|
|
|
|
## Known Limitations / Technical Debt
|
|
|
|
1. Only supports Bunny.net (multi-cloud deferred to future stories)
|
|
2. No CDN cache purging after deployment (Story 4.x)
|
|
3. No deployment verification/validation (Story 4.4)
|
|
4. URL logging is file-based (no database tracking)
|
|
5. Boilerplate pages stored as full HTML in DB (inefficient, works for now)
|
|
|
|
## Next Steps
|
|
|
|
- Story 4.2: URL logging enhancements (partially implemented here)
|
|
- Story 4.3: Database status updates (partially implemented here)
|
|
- Story 4.4: Post-deployment verification
|
|
- Future: Multi-cloud support, CDN cache purging, parallel uploads
|
|
|
|
## Notes
|
|
|
|
- Simple and reliable implementation prioritized over complex features
|
|
- Auto-deployment is the default happy path
|
|
- Manual CLI command available for re-deployment or troubleshooting
|
|
- Comprehensive error reporting for debugging
|
|
- All API keys managed via `.env` only (not `master.config.json`)
|
|
|