Add CLI documentation, S3 scripts, and update deployment code

main
PeninsulaInd 2025-12-29 12:51:59 -06:00
parent 56d7e4f642
commit 3cd62e4135
15 changed files with 2102 additions and 39 deletions

View File

@ -516,11 +516,22 @@ Verify `storage_zone_password` in database (set during site provisioning)
## Documentation
- **CLI Command Reference**: `docs/CLI_COMMAND_REFERENCE.md` - Comprehensive documentation for all CLI commands
- Product Requirements: `docs/prd.md`
- Architecture: `docs/architecture/`
- Implementation Summaries: `STORY_*.md` files
- Quick Start Guides: `*_QUICKSTART.md` files
### Regenerating CLI Documentation
To regenerate the CLI command reference after adding or modifying commands:
```bash
uv run python scripts/generate_cli_docs.py
```
This will update `docs/CLI_COMMAND_REFERENCE.md` with all current commands and their options.
## License
All rights reserved.

View File

@ -0,0 +1,545 @@
# CLI Command Reference
Comprehensive documentation for all CLI commands.
> **Note:** This documentation is auto-generated from the Click command definitions. To regenerate after adding or modifying commands, run:
> ```bash
> uv run python scripts/generate_cli_docs.py
> ```
## Quick Reference
### System Commands
- `config` - Show current configuration
- `health` - Check system health
- `models` - List available AI models
### User Management
- `add-user` - Create a new user (requires admin)
- `delete-user` - Delete a user (requires admin)
- `list-users` - List all users (requires admin)
### Site Management
- `provision-site` - Provision a new site with Storage Zone and Pull Zone
- `attach-domain` - Attach a domain to an existing Storage Zone
- `list-sites` - List all site deployments
- `get-site` - Get detailed information about a site
- `remove-site` - Remove a site deployment record
- `sync-sites` - Sync existing bunny.net sites to database
### Project Management
- `ingest-cora` - Ingest a CORA .xlsx report and create a new project
- `ingest-simple` - Ingest a simple spreadsheet and create a new project
- `list-projects` - List all projects for the authenticated user
### Content Generation
- `generate-batch` - Generate content batch from job file
### Deployment
- `deploy-batch` - Deploy all content in a batch to cloud storage
- `verify-deployment` - Verify deployed URLs return 200 OK status
### Link Export
- `get-links` - Export article URLs with optional link details
## Table of Contents
- [System](#system)
- [User Management](#user-management)
- [Site Management](#site-management)
- [Project Management](#project-management)
- [Content Generation](#content-generation)
- [Deployment](#deployment)
- [Link Export](#link-export)
---
## System
### `config`
Show current configuration
No options required.
**Example:**
```bash
uv run python main.py config
```
---
### `health`
Check system health
No options required.
**Example:**
```bash
uv run python main.py health
```
---
### `models`
List available AI models
No options required.
**Example:**
```bash
uv run python main.py models
```
---
## User Management
### `add-user`
Create a new user (requires admin authentication)
**Options:**
- `--username`
- Type: STRING | Username for the new user
- `--password`
- Type: STRING | Password for the new user
- `--role`
- Type: Choice: `Admin`, `User` | Role for the new user
- `--admin-user`
- Type: STRING | Admin username for authentication
- `--admin-password`
- Type: STRING | Admin password for authentication
**Example:**
```bash
uv run python main.py add-user
```
---
### `delete-user`
Delete a user by username (requires admin authentication)
**Options:**
- `--username`
- Type: STRING | Username to delete
- `--admin-user`
- Type: STRING | Admin username for authentication
- `--admin-password`
- Type: STRING | Admin password for authentication
- `--yes`
- Type: BOOL | Flag (boolean) | Confirm the action without prompting.
**Example:**
```bash
uv run python main.py delete-user
```
---
### `list-users`
List all users (requires admin authentication)
**Options:**
- `--admin-user`
- Type: STRING | Admin username for authentication
- `--admin-password`
- Type: STRING | Admin password for authentication
**Example:**
```bash
uv run python main.py list-users
```
---
## Site Management
### `attach-domain`
Attach a domain to an existing Storage Zone (requires admin)
**Options:**
- `--name`
- Type: STRING | Site name
- `--domain`
- Type: STRING | Custom domain (FQDN, e.g., www.example.com)
- `--storage-name`
- Type: STRING | Existing Storage Zone name
- `--admin-user`
- Type: STRING | Admin username for authentication
- `--admin-password`
- Type: STRING | Admin password for authentication
**Example:**
```bash
uv run python main.py attach-domain
```
---
### `get-site`
Get detailed information about a site deployment (requires admin)
**Options:**
- `--domain`
- Type: STRING | Custom domain to lookup
- `--admin-user`
- Type: STRING | Admin username for authentication
- `--admin-password`
- Type: STRING | Admin password for authentication
**Example:**
```bash
uv run python main.py get-site
```
---
### `list-sites`
List all site deployments (requires admin)
**Options:**
- `--admin-user`
- Type: STRING | Admin username for authentication
- `--admin-password`
- Type: STRING | Admin password for authentication
**Example:**
```bash
uv run python main.py list-sites
```
---
### `provision-site`
Provision a new site with Storage Zone and Pull Zone (requires admin)
**Options:**
- `--name`
- Type: STRING | Site name
- `--domain`
- Type: STRING | Custom domain (FQDN, e.g., www.example.com)
- `--storage-name`
- Type: STRING | Storage Zone name (must be globally unique)
- `--region`
- Type: Choice: `DE`, `NY`, `LA`, `SG`, `SYD` | Storage region
- `--admin-user`
- Type: STRING | Admin username for authentication
- `--admin-password`
- Type: STRING | Admin password for authentication
**Example:**
```bash
uv run python main.py provision-site
```
---
### `remove-site`
Remove a site deployment record (requires admin)
**Options:**
- `--domain`
- Type: STRING | Custom domain to remove
- `--admin-user`
- Type: STRING | Admin username for authentication
- `--admin-password`
- Type: STRING | Admin password for authentication
- `--yes`
- Type: BOOL | Flag (boolean) | Confirm the action without prompting.
**Example:**
```bash
uv run python main.py remove-site
```
---
### `sync-sites`
Sync existing bunny.net sites with custom domains to database (requires admin)
**Options:**
- `--admin-user`
- Type: STRING | Admin username for authentication
- `--admin-password`
- Type: STRING | Admin password for authentication
- `--dry-run`
- Type: BOOL | Flag (boolean) | Show what would be imported without making changes
**Example:**
```bash
uv run python main.py sync-sites
```
---
## Project Management
### `ingest-cora`
Ingest a CORA .xlsx report and create a new project
**Options:**
- `--file`, `-f` **(required)**
- Type: Path (must exist) | Path to CORA .xlsx file
- `--name`, `-n` **(required)**
- Type: STRING | Project name
- `--money-site-url`, `-m`
- Type: STRING | Money site URL (e.g., https://example.com)
- `--custom-anchors`, `-a`
- Type: STRING | Comma-separated list of custom anchor text (optional)
- `--username`, `-u`
- Type: STRING | Username for authentication
- `--password`, `-p`
- Type: STRING | Password for authentication
**Example:**
```bash
uv run python main.py ingest-cora --file path/to/file.xlsx --name "My Project"
```
---
### `ingest-simple`
Ingest a simple spreadsheet and create a new project
Expected spreadsheet format:
- First row: Headers (main_keyword, project_name, related_searches, entities)
- Second row: Data values
Required columns: main_keyword, project_name, related_searches, entities
- main_keyword: Single phrase keyword
- project_name: Name for the project
- related_searches: Comma-delimited list (e.g., "term1, term2, term3")
- entities: Comma-delimited list (e.g., "entity1, entity2, entity3")
Optional columns (with defaults):
- word_count: Default 1500
- term_frequency: Default 3
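The column rules above can be sketched as a small validation step. This is an illustrative example only, not the actual ingest code; `parse_row`, `REQUIRED`, and `DEFAULTS` are hypothetical names:

```python
# Illustrative sketch: validate one ingest-simple row and apply the
# documented defaults (word_count=1500, term_frequency=3).
REQUIRED = ("main_keyword", "project_name", "related_searches", "entities")
DEFAULTS = {"word_count": 1500, "term_frequency": 3}

def parse_row(row: dict) -> dict:
    missing = [c for c in REQUIRED if not row.get(c)]
    if missing:
        raise ValueError(f"Missing required columns: {', '.join(missing)}")
    parsed = {
        "main_keyword": row["main_keyword"].strip(),
        "project_name": row["project_name"].strip(),
        # Comma-delimited cells become lists
        "related_searches": [t.strip() for t in row["related_searches"].split(",")],
        "entities": [t.strip() for t in row["entities"].split(",")],
    }
    for col, default in DEFAULTS.items():
        parsed[col] = int(row.get(col) or default)
    return parsed

row = {
    "main_keyword": "precision machining",
    "project_name": "Demo",
    "related_searches": "term1, term2, term3",
    "entities": "entity1, entity2",
}
print(parse_row(row)["word_count"])  # 1500
```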
**Options:**
- `--file`, `-f` **(required)**
- Type: Path (must exist) | Path to simple .xlsx spreadsheet file
- `--name`, `-n`
- Type: STRING | Project name (overrides project_name from spreadsheet if provided)
- `--money-site-url`, `-m`
- Type: STRING | Money site URL (e.g., https://example.com)
- `--username`, `-u`
- Type: STRING | Username for authentication
- `--password`, `-p`
- Type: STRING | Password for authentication
**Example:**
```bash
uv run python main.py ingest-simple --file path/to/file.xlsx
```
---
### `list-projects`
List all projects for the authenticated user
**Options:**
- `--username`, `-u`
- Type: STRING | Username for authentication
- `--password`, `-p`
- Type: STRING | Password for authentication
**Example:**
```bash
uv run python main.py list-projects
```
---
## Content Generation
### `generate-batch`
Generate content batch from job file
**Options:**
- `--job-file`, `-j` **(required)**
- Type: Path (must exist) | Path to job JSON file
- `--username`, `-u`
- Type: STRING | Username for authentication
- `--password`, `-p`
- Type: STRING | Password for authentication
- `--debug`
- Type: BOOL | Flag (boolean) | Save AI responses to debug_output/
- `--continue-on-error`
- Type: BOOL | Flag (boolean) | Continue processing if article generation fails
- `--model`, `-m`
- Type: STRING | Default: `gpt-4o-mini` | AI model to use (gpt-4o-mini, x-ai/grok-4-fast)
**Example:**
```bash
uv run python main.py generate-batch --job-file jobs/example.json --debug
```
---
## Deployment
### `deploy-batch`
Deploy all content in a batch to cloud storage
**Options:**
- `--batch-id`, `-b` **(required)**
- Type: INT | Project/batch ID to deploy
- `--username`, `-u`
- Type: STRING | Username for authentication
- `--password`, `-p`
- Type: STRING | Password for authentication
- `--continue-on-error`
- Type: BOOL | Flag (boolean) | Continue if file fails (default: True)
- `--dry-run`
- Type: BOOL | Flag (boolean) | Preview what would be deployed
**Example:**
```bash
uv run python main.py deploy-batch --batch-id 1
```
---
### `verify-deployment`
Verify deployed URLs return 200 OK status
**Options:**
- `--batch-id`, `-b` **(required)**
- Type: INT | Project/batch ID to verify
- `--sample`, `-s`
- Type: INT | Number of random URLs to check (default: check all)
- `--timeout`, `-t`
- Type: INT | Default: `10` | Request timeout in seconds (default: 10)
**Example:**
```bash
uv run python main.py verify-deployment --batch-id 1
```
---
## Link Export
### `get-links`
Export article URLs with optional link details for a project and tier
**Options:**
- `--project-id`, `-p` **(required)**
- Type: INT | Project ID to get links for
- `--tier`, `-t` **(required)**
- Type: STRING | Tier to filter (e.g., "1" or "2+" for tier 2 and above)
- `--with-anchor-text`
- Type: BOOL | Flag (boolean) | Include anchor text used for tiered links
- `--with-destination-url`
- Type: BOOL | Flag (boolean) | Include destination URL that the article links to
**Example:**
```bash
uv run python main.py get-links --project-id 1 --tier 1
```
---

View File

@ -33,6 +33,7 @@ Job files define batch content generation parameters using JSON format.
- `deployment_targets` (optional): Array of site custom_hostnames or site_deployment_ids to cycle through
- `deployment_overflow` (optional): Strategy when batch size exceeds deployment_targets ("round_robin", "random_available", or "none"). Default: "round_robin"
- `image_theme_prompt` (optional): Override the image theme prompt for all images in this job. If not specified, uses the cached theme from the database or generates a new one using AI. This is a single string that describes the visual style, color scheme, lighting, and overall aesthetic for generated images.
- `anchor_text_config` (optional): Control anchor text used for interlinking. Can be set at job-level (applies to all tiers) or tier-level (overrides for specific tier). See [Anchor Text Configuration](#anchor-text-configuration) section below.
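The `round_robin` overflow strategy above can be sketched as cycling through the targets until every article has one. This is illustrative only; `assign_targets` is a hypothetical helper, not the project's implementation:

```python
# Illustrative sketch of "round_robin" deployment_overflow: when the batch
# is larger than deployment_targets, targets repeat in order.
from itertools import cycle

def assign_targets(article_count: int, deployment_targets: list) -> list:
    targets = cycle(deployment_targets)
    return [next(targets) for _ in range(article_count)]

print(assign_targets(5, ["site-a.com", "site-b.com"]))
# ['site-a.com', 'site-b.com', 'site-a.com', 'site-b.com', 'site-a.com']
```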
### Tier Level
- `count` (required): Number of articles to generate for this tier
@ -42,6 +43,7 @@ Job files define batch content generation parameters using JSON format.
- `max_h2_tags` (optional): Maximum H2 headings (uses defaults if not specified)
- `min_h3_tags` (optional): Minimum H3 subheadings total (uses defaults if not specified)
- `max_h3_tags` (optional): Maximum H3 subheadings total (uses defaults if not specified)
- `anchor_text_config` (optional): Override anchor text for this tier only. See [Anchor Text Configuration](#anchor-text-configuration) section below.
## Tier Defaults
@ -175,6 +177,111 @@ If tier parameters are not specified, these defaults are used:
The `image_theme_prompt` overrides the default AI-generated theme for all images (hero and content) in this job. Use it to ensure consistent visual styling or to avoid default color schemes. If omitted, the system will use the cached theme from the project database, or generate a new one if none exists.
## Anchor Text Configuration
Control the anchor text used when injecting links between articles. Anchor text configuration can be set at the job level (applies to all tiers) or tier level (overrides for that specific tier).
### Modes
- **`default`**: Use algorithm-generated anchor text based on tier rules (Tier 1: main_keyword variations, Tier 2: related_searches, etc.)
- **`override`**: Replace algorithm-generated terms with `custom_text` array
- **`append`**: Add `custom_text` array to algorithm-generated terms
- **`explicit`**: Use only explicitly provided terms (no algorithm-generated terms)
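The four modes can be sketched roughly as follows. This is a hypothetical `resolve_anchor_terms` helper, not the actual resolution code; in particular, the real handling of per-tier keys in `explicit` mode may differ:

```python
# Hypothetical sketch of how each mode could combine algorithm-generated
# terms with user-supplied custom_text.
def resolve_anchor_terms(mode, generated, custom_text=None):
    custom_text = custom_text or []
    if mode == "default":
        # Algorithm-generated terms only (tier rules decide their source)
        return list(generated)
    if mode == "override":
        # custom_text replaces the generated terms
        return list(custom_text)
    if mode == "append":
        # custom_text is added to the generated terms
        return list(generated) + list(custom_text)
    if mode == "explicit":
        # Only explicitly provided terms, no generated ones
        return list(custom_text)
    raise ValueError(f"Unknown anchor text mode: {mode}")

print(resolve_anchor_terms("append", ["keyword variation"], ["precision machining"]))
# ['keyword variation', 'precision machining']
```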
### Job-Level Configuration
Set anchor text for all tiers at once:
```json
{
"jobs": [{
"project_id": 26,
"anchor_text_config": {
"mode": "explicit",
"tier1": ["high volume", "precision machining", "custom manufacturing"],
"tier2": ["high volume production", "bulk manufacturing", "large scale"]
},
"tiers": {
"tier1": {"count": 12},
"tier2": {"count": 38}
}
}]
}
```
### Tier-Level Configuration
Override job-level config for a specific tier:
```json
{
"jobs": [{
"project_id": 26,
"anchor_text_config": {
"mode": "override",
"custom_text": ["custom term 1", "custom term 2"]
},
"tiers": {
"tier1": {
"count": 12,
"anchor_text_config": {
"mode": "explicit",
"terms": ["high volume", "precision"]
}
},
"tier2": {"count": 38}
}
}]
}
```
### Examples
**Override mode (same terms for all tiers):**
```json
{
"anchor_text_config": {
"mode": "override",
"custom_text": ["precision machining", "custom manufacturing"]
}
}
```
**Explicit mode (different terms per tier):**
```json
{
"anchor_text_config": {
"mode": "explicit",
"tier1": ["high volume", "precision machining"],
"tier2": ["bulk production", "large scale"],
"tier3": ["mass production"]
}
}
```
**Mixed modes (explicit for tier1, default for tier2):**
```json
{
"tiers": {
"tier1": {
"count": 12,
"anchor_text_config": {
"mode": "explicit",
"terms": ["high volume", "precision"]
}
},
"tier2": {
"count": 38,
"anchor_text_config": {
"mode": "default"
}
}
}
}
```
**Note:** When using `explicit` mode, terms are randomized across articles so all provided terms are used, not just the first one.
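That randomization might look something like the sketch below (illustrative only; `spread_terms` is a hypothetical name and the real shuffling logic may differ): terms are shuffled once and then cycled, so every provided term is used across the batch.

```python
# Illustrative sketch: shuffle the explicit terms, then cycle them across
# articles so all terms are used rather than just the first one.
import random
from itertools import cycle, islice

def spread_terms(terms, article_count, seed=None):
    shuffled = list(terms)
    random.Random(seed).shuffle(shuffled)
    return list(islice(cycle(shuffled), article_count))

assignments = spread_terms(["high volume", "precision"], 4, seed=1)
print(sorted(set(assignments)))  # both terms appear
```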
## Usage
Run batch generation with:

View File

@ -0,0 +1,404 @@
"""
S3 Bucket Discovery and Registration Script
Discovers all AWS S3 buckets and allows interactive selection to register them
as SiteDeployment records for use in the site assignment pool.
"""
import os
import sys
import hashlib
import logging
from typing import List, Dict, Optional
from datetime import datetime
import boto3
import click
from botocore.exceptions import ClientError, BotoCoreError, NoCredentialsError
# Add parent directory to path for imports
sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from src.database.session import db_manager
from src.database.repositories import SiteDeploymentRepository
from src.deployment.s3_storage import map_aws_region_to_short_code
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
class BucketInfo:
"""Information about an S3 bucket"""
def __init__(self, name: str, region: str, creation_date: Optional[datetime] = None):
self.name = name
self.region = region
self.creation_date = creation_date
self.is_registered = False
def __repr__(self):
return f"BucketInfo(name={self.name}, region={self.region})"
def get_s3_client():
"""
Create and return a boto3 S3 client
Raises:
SystemExit: If AWS credentials are not found
"""
try:
access_key = os.getenv('AWS_ACCESS_KEY_ID')
secret_key = os.getenv('AWS_SECRET_ACCESS_KEY')
if not access_key or not secret_key:
click.echo("Error: AWS credentials not found.", err=True)
click.echo("Please set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.", err=True)
sys.exit(1)
return boto3.client('s3')
except Exception as e:
click.echo(f"Error creating S3 client: {e}", err=True)
sys.exit(1)
def list_all_buckets(s3_client) -> List[BucketInfo]:
"""
List all S3 buckets and retrieve their metadata
Args:
s3_client: boto3 S3 client
Returns:
List of BucketInfo objects
Raises:
SystemExit: If unable to list buckets
"""
try:
response = s3_client.list_buckets()
buckets = []
for bucket in response.get('Buckets', []):
bucket_name = bucket['Name']
creation_date = bucket.get('CreationDate')
# Get bucket region
try:
region_response = s3_client.get_bucket_location(Bucket=bucket_name)
region = region_response.get('LocationConstraint', 'us-east-1')
# AWS returns None for us-east-1, so normalize it
if region is None or region == '':
region = 'us-east-1'
except ClientError as e:
error_code = e.response.get('Error', {}).get('Code', '')
if error_code == 'AccessDenied':
logger.warning(f"Access denied to get region for bucket {bucket_name}, using default")
region = 'us-east-1'
else:
logger.warning(f"Could not get region for bucket {bucket_name}: {e}, using default")
region = 'us-east-1'
buckets.append(BucketInfo(
name=bucket_name,
region=region,
creation_date=creation_date
))
return buckets
except NoCredentialsError:
click.echo("Error: AWS credentials not found or invalid.", err=True)
click.echo("Please configure AWS credentials using:", err=True)
click.echo(" - Environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY", err=True)
click.echo(" - AWS credentials file: ~/.aws/credentials", err=True)
click.echo(" - IAM role (if running on EC2)", err=True)
sys.exit(1)
except ClientError as e:
error_code = e.response.get('Error', {}).get('Code', '')
error_message = e.response.get('Error', {}).get('Message', str(e))
click.echo(f"Error listing buckets: {error_code} - {error_message}", err=True)
if error_code == 'AccessDenied':
click.echo("Insufficient permissions. Ensure your AWS credentials have s3:ListAllMyBuckets permission.", err=True)
sys.exit(1)
except Exception as e:
click.echo(f"Unexpected error listing buckets: {e}", err=True)
sys.exit(1)
def check_existing_deployments(site_repo: SiteDeploymentRepository, bucket_names: List[str]) -> Dict[str, bool]:
"""
Check which buckets are already registered in the database
Args:
site_repo: SiteDeploymentRepository instance
bucket_names: List of bucket names to check
Returns:
Dictionary mapping bucket names to boolean (True if registered)
"""
existing = {}
all_sites = site_repo.get_all()
registered_buckets = {
site.s3_bucket_name
for site in all_sites
if site.s3_bucket_name and site.storage_provider in ('s3', 's3_compatible')
}
for bucket_name in bucket_names:
existing[bucket_name] = bucket_name in registered_buckets
return existing
def generate_unique_hostname(bucket_name: str, site_repo: SiteDeploymentRepository, attempt: int = 0) -> str:
"""
Generate a unique hostname for the pull_zone_bcdn_hostname field
Args:
bucket_name: S3 bucket name
site_repo: SiteDeploymentRepository to check for existing hostnames
attempt: Retry attempt number (for appending suffix)
Returns:
Unique hostname string
"""
if attempt == 0:
base_hostname = f"s3-{bucket_name}.b-cdn.net"
else:
base_hostname = f"s3-{bucket_name}-{attempt}.b-cdn.net"
# Check if hostname already exists
existing = site_repo.get_by_bcdn_hostname(base_hostname)
if existing is None:
return base_hostname
# Try again with incremented suffix
return generate_unique_hostname(bucket_name, site_repo, attempt + 1)
def generate_bucket_hash(bucket_name: str) -> int:
"""
Generate a numeric hash from bucket name for placeholder IDs
Args:
bucket_name: S3 bucket name
Returns:
Integer hash (positive, within reasonable range)
"""
hash_obj = hashlib.md5(bucket_name.encode())
hash_int = int(hash_obj.hexdigest(), 16)
# Take modulo to keep it reasonable, but ensure it's positive
return abs(hash_int % 1000000)
def register_bucket(
bucket_info: BucketInfo,
site_repo: SiteDeploymentRepository,
site_name: Optional[str] = None,
custom_domain: Optional[str] = None
) -> bool:
"""
Register an S3 bucket as a SiteDeployment record
Args:
bucket_info: BucketInfo object with bucket details
site_repo: SiteDeploymentRepository instance
site_name: Optional site name (defaults to bucket name)
custom_domain: Optional custom domain for S3
Returns:
True if successful, False otherwise
"""
bucket_name = bucket_info.name
bucket_region = bucket_info.region
# Check if already registered
all_sites = site_repo.get_all()
for site in all_sites:
if site.s3_bucket_name == bucket_name and site.storage_provider == 's3':
click.echo(f" [SKIP] Bucket '{bucket_name}' is already registered (site_id={site.id})")
return False
# Generate placeholder values for Bunny.net fields
bucket_hash = generate_bucket_hash(bucket_name)
short_region = map_aws_region_to_short_code(bucket_region)
unique_hostname = generate_unique_hostname(bucket_name, site_repo)
# Use provided site_name or default to bucket name
final_site_name = site_name or bucket_name
try:
deployment = site_repo.create(
site_name=final_site_name,
storage_provider='s3',
storage_zone_id=bucket_hash,
storage_zone_name=f"s3-{bucket_name}",
storage_zone_password="s3-placeholder",
storage_zone_region=short_region,
pull_zone_id=bucket_hash,
pull_zone_bcdn_hostname=unique_hostname,
custom_hostname=None,
s3_bucket_name=bucket_name,
s3_bucket_region=bucket_region,
s3_custom_domain=custom_domain,
s3_endpoint_url=None
)
click.echo(f" [OK] Registered bucket '{bucket_name}' as site_id={deployment.id}")
return True
except ValueError as e:
click.echo(f" [ERROR] Failed to register bucket '{bucket_name}': {e}", err=True)
return False
except Exception as e:
click.echo(f" [ERROR] Unexpected error registering bucket '{bucket_name}': {e}", err=True)
return False
def display_buckets(buckets: List[BucketInfo], existing_map: Dict[str, bool]):
"""
Display buckets in a formatted table
Args:
buckets: List of BucketInfo objects
existing_map: Dictionary mapping bucket names to registration status
"""
click.echo("\n" + "=" * 80)
click.echo("Available S3 Buckets")
click.echo("=" * 80)
click.echo(f"{'#':<4} {'Bucket Name':<40} {'Region':<15} {'Status':<15}")
click.echo("-" * 80)
for idx, bucket in enumerate(buckets, 1):
bucket.is_registered = existing_map.get(bucket.name, False)
status = "[REGISTERED]" if bucket.is_registered else "[AVAILABLE]"
click.echo(f"{idx:<4} {bucket.name:<40} {bucket.region:<15} {status:<15}")
click.echo("=" * 80)
def main():
"""Main entry point for the discovery script"""
click.echo("S3 Bucket Discovery and Registration")
click.echo("=" * 80)
# Initialize database
try:
db_manager.initialize()
except Exception as e:
click.echo(f"Error initializing database: {e}", err=True)
sys.exit(1)
session = db_manager.get_session()
site_repo = SiteDeploymentRepository(session)
try:
# Get S3 client
click.echo("\nConnecting to AWS S3...")
s3_client = get_s3_client()
# List all buckets
click.echo("Discovering S3 buckets...")
buckets = list_all_buckets(s3_client)
if not buckets:
click.echo("No S3 buckets found in your AWS account.")
return
# Check which buckets are already registered
bucket_names = [b.name for b in buckets]
existing_map = check_existing_deployments(site_repo, bucket_names)
# Display buckets
display_buckets(buckets, existing_map)
# Filter out already registered buckets
available_buckets = [b for b in buckets if not existing_map.get(b.name, False)]
if not available_buckets:
click.echo("\nAll buckets are already registered.")
return
# Prompt for bucket selection
click.echo(f"\nFound {len(available_buckets)} available bucket(s) to register.")
click.echo("Enter bucket numbers to register (comma-separated, e.g., 1,3,5):")
click.echo("Or press Enter to skip registration.")
selection_input = click.prompt("Selection", default="", type=str).strip()
if not selection_input:
click.echo("No buckets selected. Exiting.")
return
# Parse selection
try:
selected_indices = [int(x.strip()) - 1 for x in selection_input.split(',')]
except ValueError:
click.echo("Error: Invalid selection format. Use comma-separated numbers (e.g., 1,3,5)", err=True)
return
# Validate indices
valid_selections = []
for idx in selected_indices:
if 0 <= idx < len(buckets):
if buckets[idx].name in [b.name for b in available_buckets]:
valid_selections.append(buckets[idx])
else:
click.echo(f"Warning: Bucket #{idx + 1} is already registered, skipping.", err=True)
else:
click.echo(f"Warning: Invalid bucket number {idx + 1}, skipping.", err=True)
if not valid_selections:
click.echo("No valid buckets selected.")
return
# Register selected buckets
click.echo(f"\nRegistering {len(valid_selections)} bucket(s)...")
success_count = 0
for bucket_info in valid_selections:
click.echo(f"\nRegistering bucket: {bucket_info.name}")
# Prompt for site name
default_site_name = bucket_info.name
site_name = click.prompt("Site name", default=default_site_name, type=str).strip()
if not site_name:
site_name = default_site_name
# Prompt for custom domain (optional)
custom_domain = click.prompt(
"Custom domain (optional, press Enter to skip)",
default="",
type=str
).strip()
if not custom_domain:
custom_domain = None
# Confirm registration
if click.confirm(f"Register '{bucket_info.name}' as '{site_name}'?"):
if register_bucket(bucket_info, site_repo, site_name, custom_domain):
success_count += 1
else:
click.echo(f" [SKIP] Registration cancelled for '{bucket_info.name}'")
click.echo(f"\n{'=' * 80}")
click.echo(f"Registration complete: {success_count}/{len(valid_selections)} bucket(s) registered.")
click.echo("=" * 80)
except KeyboardInterrupt:
click.echo("\n\nOperation cancelled by user.")
sys.exit(0)
except Exception as e:
click.echo(f"\nUnexpected error: {e}", err=True)
logger.exception("Unexpected error in bucket discovery")
sys.exit(1)
finally:
session.close()
if __name__ == "__main__":
main()

View File

@ -0,0 +1,238 @@
#!/usr/bin/env python3
"""
Generate comprehensive CLI documentation from Click commands
"""
import sys
from pathlib import Path
# Add project root to Python path
project_root = Path(__file__).parent.parent
sys.path.insert(0, str(project_root))
# Import after path setup
from src.cli.commands import app
import click
def format_option(option):
"""Format a Click option for documentation"""
names = []
if option.opts:
names.extend(option.opts)
if option.secondary_opts:
names.extend(option.secondary_opts)
name_str = ", ".join(f"`{n}`" for n in names)
# Get option type info
type_info = ""
if hasattr(option, 'type') and option.type:
if isinstance(option.type, click.Choice):
choices = ", ".join(f"`{c}`" for c in option.type.choices)
type_info = f"Choice: {choices}"
elif isinstance(option.type, click.Path):
type_info = "Path"
if hasattr(option.type, 'exists') and option.type.exists:
type_info += " (must exist)"
elif hasattr(option.type, '__name__'):
type_info = option.type.__name__
else:
type_info = str(option.type)
# Get default value
default_info = ""
if option.is_flag:
default_info = "Flag (boolean)"
elif option.default is not None and not callable(option.default):
# Filter out Click's Sentinel.UNSET
default_val = str(option.default)
if 'Sentinel' not in default_val and 'UNSET' not in default_val:
default_info = f"Default: `{option.default}`"
# Required indicator
required = option.required
return {
'name': name_str,
'help': option.help or "",
'type': type_info,
'default': default_info,
'required': required
}
def format_command(cmd):
"""Format a Click command for documentation"""
if not isinstance(cmd, click.Command):
return None
doc = {
'name': cmd.name,
'help': cmd.get_short_help_str() or cmd.help or "",
'description': cmd.help or "",
'options': []
}
# Get all options
for param in cmd.params:
if isinstance(param, click.Option):
doc['options'].append(format_option(param))
return doc
def generate_docs():
"""Generate comprehensive CLI documentation"""
commands = []
for name, cmd in app.commands.items():
cmd_doc = format_command(cmd)
if cmd_doc:
commands.append(cmd_doc)
# Sort commands alphabetically
commands.sort(key=lambda x: x['name'])
# Group commands by category
categories = {
'System': ['config', 'health', 'models'],
'User Management': ['add-user', 'delete-user', 'list-users'],
'Site Management': ['provision-site', 'attach-domain', 'list-sites', 'get-site', 'remove-site', 'sync-sites'],
'Project Management': ['ingest-cora', 'ingest-simple', 'list-projects'],
'Content Generation': ['generate-batch'],
'Deployment': ['deploy-batch', 'verify-deployment'],
'Link Export': ['get-links']
}
# Build markdown
md_lines = []
md_lines.append("# CLI Command Reference")
md_lines.append("")
md_lines.append("Comprehensive documentation for all CLI commands.")
md_lines.append("")
md_lines.append("## Table of Contents")
md_lines.append("")
for category in categories.keys():
md_lines.append(f"- [{category}](#{category.lower().replace(' ', '-')})")
md_lines.append("")
md_lines.append("---")
md_lines.append("")
# Generate documentation for each category
for category, command_names in categories.items():
md_lines.append(f"## {category}")
md_lines.append("")
for cmd_doc in commands:
if cmd_doc['name'] in command_names:
md_lines.append(f"### `{cmd_doc['name']}`")
md_lines.append("")
if cmd_doc['description']:
md_lines.append(cmd_doc['description'])
md_lines.append("")
if cmd_doc['options']:
md_lines.append("**Options:**")
md_lines.append("")
for opt in cmd_doc['options']:
parts = [opt['name']]
if opt['required']:
parts.append("**(required)**")
md_lines.append(f"- {' '.join(parts)}")
details = []
if opt['type']:
details.append(f"Type: {opt['type']}")
if opt['default'] and 'Sentinel' not in opt['default']:
details.append(opt['default'])
if opt['help']:
details.append(opt['help'])
if details:
md_lines.append(f" - {' | '.join(details)}")
md_lines.append("")
else:
md_lines.append("No options required.")
md_lines.append("")
md_lines.append("**Example:**")
md_lines.append("")
md_lines.append("```bash")
example_cmd = f"uv run python main.py {cmd_doc['name']}"
# Build example with required options and common optional ones
example_parts = []
for opt in cmd_doc['options']:
opt_name = opt['name'].replace('`', '').split(',')[0].strip()
if opt['required']:
# Add required options with example values
if '--username' in opt_name or '--admin-user' in opt_name:
example_parts.append("--username admin")
elif '--password' in opt_name or '--admin-password' in opt_name:
example_parts.append("--password yourpass")
elif '--job-file' in opt_name or '-j' in opt_name:
example_parts.append("--job-file jobs/example.json")
elif '--file' in opt_name or '-f' in opt_name:
# Must come after the --job-file check: '--file' is a substring of '--job-file'
example_parts.append("--file path/to/file.xlsx")
elif '--project-id' in opt_name or '-p' in opt_name:
example_parts.append("--project-id 1")
elif '--batch-id' in opt_name or '-b' in opt_name:
example_parts.append("--batch-id 1")
elif '--domain' in opt_name:
example_parts.append("--domain www.example.com")
elif '--name' in opt_name:
example_parts.append("--name \"My Project\"")
elif '--tier' in opt_name or '-t' in opt_name:
example_parts.append("--tier 1")
elif '--storage-name' in opt_name:
example_parts.append("--storage-name my-storage-zone")
elif '--region' in opt_name:
example_parts.append("--region DE")
else:
example_parts.append(f"{opt_name} <value>")
elif not opt['required'] and '--debug' in opt_name:
# Include common flags in examples
example_parts.append("--debug")
if example_parts:
example_cmd += " " + " ".join(example_parts)
md_lines.append(example_cmd)
md_lines.append("```")
md_lines.append("")
md_lines.append("---")
md_lines.append("")
# Add any commands not in categories
uncategorized = [c for c in commands if not any(c['name'] in names for names in categories.values())]
if uncategorized:
md_lines.append("## Other Commands")
md_lines.append("")
for cmd_doc in uncategorized:
md_lines.append(f"### `{cmd_doc['name']}`")
md_lines.append("")
if cmd_doc['description']:
md_lines.append(cmd_doc['description'])
md_lines.append("")
return "\n".join(md_lines)
if __name__ == "__main__":
docs = generate_docs()
output_file = Path(__file__).parent.parent / "docs" / "CLI_COMMAND_REFERENCE.md"
output_file.parent.mkdir(exist_ok=True)
with open(output_file, 'w', encoding='utf-8') as f:
f.write(docs)
print(f"CLI documentation generated: {output_file}")
print(f"Total commands documented: {len(app.commands)}")
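The option-rendering loop near the top of this script can be exercised in isolation. The standalone sketch below (a hypothetical `format_option_lines` helper, not part of the script itself) shows the markdown bullets produced for a single Click option dict:

```python
def format_option_lines(opt):
    """Render one Click option dict as markdown bullet lines."""
    lines = []
    parts = [opt['name']]
    if opt['required']:
        parts.append("**(required)**")
    lines.append(f"- {' '.join(parts)}")
    details = []
    if opt['type']:
        details.append(f"Type: {opt['type']}")
    if opt['default'] and 'Sentinel' not in str(opt['default']):
        details.append(str(opt['default']))
    if opt['help']:
        details.append(opt['help'])
    if details:
        # Detail line is indented two spaces so it nests under the bullet
        lines.append(f"  - {' | '.join(details)}")
    return lines

opt = {'name': '`--region`', 'required': True, 'type': 'TEXT',
       'default': None, 'help': 'Storage zone region'}
print("\n".join(format_option_lines(opt)))
# - `--region` **(required)**
#   - Type: TEXT | Storage zone region
```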


@ -0,0 +1,142 @@
"""
Real S3 integration test - actually uploads to S3 bucket
Requires AWS credentials in environment:
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- AWS_REGION (optional, can be set per-site)
Usage:
Set environment variables and run:
python scripts/test_s3_real.py
"""
import os
import sys
import time
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent))
from src.deployment.s3_storage import S3StorageClient, S3StorageError, S3StorageAuthError
from src.database.models import SiteDeployment
from unittest.mock import Mock
def test_real_s3_upload():
"""Test actual S3 upload with real bucket"""
print("Testing Real S3 Upload\n")
# Check for credentials
access_key = os.environ.get('AWS_ACCESS_KEY_ID')
secret_key = os.environ.get('AWS_SECRET_ACCESS_KEY')
region = os.environ.get('AWS_REGION', 'us-east-1')
if not access_key or not secret_key:
print("[SKIP] AWS credentials not found in environment")
print("Set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to run this test")
return
# Get bucket name from environment or use default
bucket_name = os.environ.get('TEST_S3_BUCKET')
if not bucket_name:
print("[ERROR] TEST_S3_BUCKET environment variable not set")
print("Set TEST_S3_BUCKET to your test bucket name")
return
print(f"Using bucket: {bucket_name}")
print(f"Region: {region}\n")
# Create mock site with S3 config
site = Mock(spec=SiteDeployment)
site.s3_bucket_name = bucket_name
site.s3_bucket_region = region
site.s3_custom_domain = None
site.s3_endpoint_url = None
client = S3StorageClient(max_retries=3)
try:
# Test 1: Upload a simple HTML file
print("1. Uploading test file to S3...")
timestamp = int(time.time())
file_path = f"test-{timestamp}.html"
content = f"<html><body>Test upload at {timestamp}</body></html>"
result = client.upload_file(
site=site,
file_path=file_path,
content=content
)
if result.success:
print(f" [OK] Upload successful!")
print(f" [OK] File: {result.file_path}")
print(f" [OK] URL: {result.message}")
# Try to verify the file is accessible
import boto3
s3_client = boto3.client('s3', region_name=region)
try:
response = s3_client.head_object(Bucket=bucket_name, Key=file_path)
print(f" [OK] File verified in S3 (size: {response['ContentLength']} bytes)")
except Exception as e:
print(f" [WARN] Could not verify file in S3: {e}")
else:
print(f" [FAIL] Upload failed: {result.message}")
return
# Test 2: Upload with custom domain (if configured)
if os.environ.get('TEST_S3_CUSTOM_DOMAIN'):
print("\n2. Testing custom domain URL generation...")
site.s3_custom_domain = os.environ.get('TEST_S3_CUSTOM_DOMAIN')
file_path2 = f"test-custom-{timestamp}.html"
result2 = client.upload_file(
site=site,
file_path=file_path2,
content="<html><body>Custom domain test</body></html>"
)
if result2.success:
print(f" [OK] Upload with custom domain successful")
print(f" [OK] URL: {result2.message}")
else:
print(f" [FAIL] Upload failed: {result2.message}")
# Test 3: Test error handling (missing bucket name)
print("\n3. Testing error handling...")
site_no_bucket = Mock(spec=SiteDeployment)
site_no_bucket.s3_bucket_name = None
try:
client.upload_file(
site=site_no_bucket,
file_path="test.html",
content="<html>Test</html>"
)
print(" [FAIL] Should have raised ValueError for missing bucket")
except ValueError as e:
print(f" [OK] Correctly raised ValueError: {e}")
print("\n" + "="*50)
print("[SUCCESS] Real S3 tests passed!")
print("="*50)
print(f"\nTest file uploaded: {file_path}")
print(f"Clean up: aws s3 rm s3://{bucket_name}/{file_path}")
except S3StorageAuthError as e:
print(f"\n[ERROR] Authentication failed: {e}")
print("Check your AWS credentials")
except S3StorageError as e:
print(f"\n[ERROR] S3 error: {e}")
except Exception as e:
print(f"\n[ERROR] Unexpected error: {e}")
import traceback
traceback.print_exc()
if __name__ == "__main__":
test_real_s3_upload()
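The script above fakes a `SiteDeployment` with `Mock(spec=...)` instead of touching the database. A minimal standalone illustration of why `spec` matters (the `Site` class here is a toy stand-in for the real model):

```python
from unittest.mock import Mock

class Site:
    # Toy stand-in for SiteDeployment; real model has many more columns
    s3_bucket_name = None
    s3_bucket_region = None

site = Mock(spec=Site)
site.s3_bucket_name = "my-bucket"  # fine: attribute exists on the spec
print(site.s3_bucket_name)         # my-bucket

try:
    site.storage_zone_nmae  # typo: not on the spec, so it raises
except AttributeError:
    print("typo caught by spec")
```

Without `spec`, the misspelled attribute would silently return a child `Mock` and the test would pass for the wrong reason.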


@ -0,0 +1,105 @@
"""
Quick test to verify Story 6.3 works - creates and retrieves site deployments with new fields
"""
import sys
import time
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent))
from src.database.session import db_manager
from src.database.repositories import SiteDeploymentRepository
def test():
"""Test creating and retrieving site deployments with new multi-cloud fields"""
print("Testing Story 6.3: Multi-Cloud Storage Fields\n")
db_manager.initialize()
session = db_manager.get_session()
repo = SiteDeploymentRepository(session)
try:
timestamp = int(time.time())
# Test 1: Create bunny deployment (backward compatibility)
print("1. Creating bunny deployment (default)...")
bunny_site = repo.create(
site_name="Test Bunny Site",
storage_zone_id=100 + timestamp,
storage_zone_name="test-zone",
storage_zone_password="test-pass",
storage_zone_region="DE",
pull_zone_id=100 + timestamp,
pull_zone_bcdn_hostname=f"test-bunny-{timestamp}.b-cdn.net"
)
print(f" [OK] Created: ID={bunny_site.id}, provider={bunny_site.storage_provider}")
assert bunny_site.storage_provider == "bunny", "Should default to bunny"
# Test 2: Create S3 deployment
print("\n2. Creating S3 deployment...")
s3_site = repo.create(
site_name="Test S3 Site",
storage_zone_id=200 + timestamp,
storage_zone_name="s3-zone",
storage_zone_password="s3-pass",
storage_zone_region="NY",
pull_zone_id=200 + timestamp,
pull_zone_bcdn_hostname=f"test-s3-{timestamp}.b-cdn.net",
storage_provider="s3",
s3_bucket_name="my-bucket",
s3_bucket_region="us-east-1",
s3_custom_domain="cdn.example.com"
)
print(f" [OK] Created: ID={s3_site.id}, provider={s3_site.storage_provider}")
print(f" [OK] S3 fields: bucket={s3_site.s3_bucket_name}, region={s3_site.s3_bucket_region}")
assert s3_site.storage_provider == "s3", "Should be s3"
assert s3_site.s3_bucket_name == "my-bucket", "Bucket name should match"
# Test 3: Retrieve and verify
print("\n3. Retrieving S3 deployment...")
retrieved = repo.get_by_id(s3_site.id)
assert retrieved is not None, "Should retrieve the site"
assert retrieved.storage_provider == "s3", "Provider should be s3"
assert retrieved.s3_bucket_name == "my-bucket", "Bucket should match"
print(f" [OK] Retrieved: provider={retrieved.storage_provider}, bucket={retrieved.s3_bucket_name}")
# Test 4: Create S3-compatible deployment
print("\n4. Creating S3-compatible deployment...")
s3c_site = repo.create(
site_name="Test DO Spaces",
storage_zone_id=300 + timestamp,
storage_zone_name="do-zone",
storage_zone_password="do-pass",
storage_zone_region="LA",
pull_zone_id=300 + timestamp,
pull_zone_bcdn_hostname=f"test-do-{timestamp}.b-cdn.net",
storage_provider="s3_compatible",
s3_bucket_name="spaces-bucket",
s3_bucket_region="nyc3",
s3_endpoint_url="https://nyc3.digitaloceanspaces.com"
)
print(f" [OK] Created: provider={s3c_site.storage_provider}, endpoint={s3c_site.s3_endpoint_url}")
assert s3c_site.storage_provider == "s3_compatible", "Should be s3_compatible"
assert s3c_site.s3_endpoint_url == "https://nyc3.digitaloceanspaces.com", "Endpoint should match"
session.commit()
print("\n" + "="*50)
print("[SUCCESS] ALL TESTS PASSED - Story 6.3 works!")
print("="*50)
except Exception as e:
session.rollback()
print(f"\n[ERROR] {e}")
import traceback
traceback.print_exc()
sys.exit(1)
finally:
session.close()
db_manager.close()
if __name__ == "__main__":
test()


@ -20,7 +20,7 @@ from src.generation.ai_client import AIClient, PromptManager
from src.generation.service import ContentGenerator
from src.generation.batch_processor import BatchProcessor
from src.database.repositories import GeneratedContentRepository, SitePageRepository
from src.deployment.bunny_storage import BunnyStorageClient, BunnyStorageError
from src.deployment.bunny_storage import BunnyStorageError
from src.deployment.deployment_service import DeploymentService
from src.deployment.url_logger import URLLogger
from src.templating.service import TemplateService
@ -638,6 +638,39 @@ def list_sites(admin_user: Optional[str], admin_password: Optional[str]):
raise click.Abort()
@app.command("discover-s3-buckets")
def discover_s3_buckets():
"""Discover and register AWS S3 buckets as site deployments"""
try:
# Import here to avoid circular dependencies
import subprocess
import sys
from pathlib import Path
# Get the script path
script_dir = Path(__file__).parent.parent.parent
script_path = script_dir / "scripts" / "discover_s3_buckets.py"
if not script_path.exists():
click.echo(f"Error: Discovery script not found at {script_path}", err=True)
raise click.Abort()
# Run the discovery script
click.echo("Running S3 bucket discovery script...\n")
result = subprocess.run([sys.executable, str(script_path)], check=False)
if result.returncode != 0:
click.echo(f"\nDiscovery script exited with code {result.returncode}", err=True)
raise click.Abort()
except FileNotFoundError:
click.echo("Error: Discovery script not found", err=True)
raise click.Abort()
except Exception as e:
click.echo(f"Error running discovery script: {e}", err=True)
raise click.Abort()
@app.command("get-site")
@click.option("--domain", prompt=True, help="Custom domain to lookup")
@click.option("--admin-user", help="Admin username for authentication")
@ -1346,11 +1379,9 @@ def deploy_batch(
click.echo("\nDry run complete. Use without --dry-run to actually deploy.")
return
storage_client = BunnyStorageClient(max_retries=3)
url_logger = URLLogger()
deployment_service = DeploymentService(
storage_client=storage_client,
content_repo=content_repo,
site_repo=site_repo,
page_repo=page_repo,


@ -5,9 +5,12 @@ Bunny.net Storage API client for uploading files to storage zones
import requests
import time
import logging
from typing import List, Optional
from typing import List, Optional, TYPE_CHECKING
from dataclasses import dataclass
if TYPE_CHECKING:
from src.database.models import SiteDeployment
logger = logging.getLogger(__name__)
@ -65,9 +68,7 @@ class BunnyStorageClient:
def upload_file(
self,
zone_name: str,
zone_password: str,
zone_region: str,
site: "SiteDeployment",
file_path: str,
content: str,
content_type: str = 'application/octet-stream'
@ -76,9 +77,7 @@ class BunnyStorageClient:
Upload a file to Bunny.net storage zone
Args:
zone_name: Storage zone name
zone_password: Storage zone password (from database)
zone_region: Storage zone region (e.g., 'LA', 'NY', 'DE')
site: SiteDeployment object with storage zone configuration
file_path: Path within storage zone (e.g., 'my-article.html')
content: File content to upload
content_type: MIME type (default: application/octet-stream per Bunny.net docs)
@ -94,6 +93,10 @@ class BunnyStorageClient:
Per Bunny.net docs, content must be raw binary and content-type should be
application/octet-stream. Success response is HTTP 201.
"""
zone_name = site.storage_zone_name
zone_password = site.storage_zone_password
zone_region = site.storage_zone_region
base_url = self._get_storage_url(zone_region)
url = f"{base_url}/{zone_name}/{file_path}"
headers = {


@ -7,7 +7,8 @@ import logging
import time
from typing import Dict, Any, List
from datetime import datetime
from src.deployment.bunny_storage import BunnyStorageClient, BunnyStorageError
from src.deployment.bunny_storage import BunnyStorageError
from src.deployment.storage_factory import create_storage_client
from src.deployment.url_logger import URLLogger
from src.database.repositories import (
GeneratedContentRepository,
@ -29,7 +30,6 @@ class DeploymentService:
def __init__(
self,
storage_client: BunnyStorageClient,
content_repo: GeneratedContentRepository,
site_repo: SiteDeploymentRepository,
page_repo: SitePageRepository,
@ -39,13 +39,11 @@ class DeploymentService:
Initialize deployment service
Args:
storage_client: BunnyStorageClient for uploads
content_repo: Repository for content records
site_repo: Repository for site deployments
page_repo: Repository for boilerplate pages
url_logger: URLLogger for tracking deployed URLs
"""
self.storage = storage_client
self.content_repo = content_repo
self.site_repo = site_repo
self.page_repo = page_repo
@ -183,10 +181,9 @@ class DeploymentService:
file_path = generate_file_path(article)
url = generate_public_url(site, file_path)
self.storage.upload_file(
zone_name=site.storage_zone_name,
zone_password=site.storage_zone_password,
zone_region=site.storage_zone_region,
client = create_storage_client(site)
client.upload_file(
site=site,
file_path=file_path,
content=article.formatted_html
)
@ -217,10 +214,9 @@ class DeploymentService:
file_path = generate_page_file_path(page)
url = generate_public_url(site, file_path)
self.storage.upload_file(
zone_name=site.storage_zone_name,
zone_password=site.storage_zone_password,
zone_region=site.storage_zone_region,
client = create_storage_client(site)
client.upload_file(
site=site,
file_path=file_path,
content=page.content
)
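The `create_storage_client(site)` factory itself is not shown in this diff. Assuming it dispatches on `site.storage_provider`, a minimal sketch might look like the following (class names and bodies are hypothetical; the real implementation lives in `src/deployment/storage_factory.py`):

```python
# Hypothetical sketch of the provider-dispatching factory
class BunnyClient:
    def upload_file(self, site, file_path, content):
        return f"bunny:{site.storage_zone_name}/{file_path}"

class S3Client:
    def upload_file(self, site, file_path, content):
        return f"s3:{site.s3_bucket_name}/{file_path}"

def create_storage_client(site):
    """Pick a storage client based on the site's configured provider."""
    provider = getattr(site, "storage_provider", "bunny") or "bunny"
    if provider == "bunny":
        return BunnyClient()
    if provider in ("s3", "s3_compatible"):
        return S3Client()
    raise ValueError(f"Unknown storage provider: {provider}")
```

Because every client exposes the same `upload_file(site=..., file_path=..., content=...)` shape, the deployment service no longer needs provider-specific arguments at the call site.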


@ -412,3 +412,44 @@ class S3StorageClient:
raise S3StorageError(f"Upload failed after {self.max_retries} attempts")
def map_aws_region_to_short_code(aws_region: str) -> str:
"""
Map AWS region code (e.g., 'us-east-1') to short region code used by the system
Args:
aws_region: AWS region code (e.g., 'us-east-1', 'eu-west-1')
Returns:
Short region code (e.g., 'US', 'EU')
Note:
Returns 'US' as default for unknown regions
"""
region_mapping = {
# US regions
'us-east-1': 'US',
'us-east-2': 'US',
'us-west-1': 'US',
'us-west-2': 'US',
# EU regions
'eu-west-1': 'EU',
'eu-west-2': 'EU',
'eu-west-3': 'EU',
'eu-central-1': 'EU',
'eu-north-1': 'EU',
'eu-south-1': 'EU',
# Asia Pacific
'ap-southeast-1': 'SG',
'ap-southeast-2': 'SYD',
'ap-northeast-1': 'JP',
'ap-northeast-2': 'KR',
'ap-south-1': 'IN',
# Other
'ca-central-1': 'CA',
'sa-east-1': 'SA',
'af-south-1': 'AF',
'me-south-1': 'ME',
}
return region_mapping.get(aws_region.lower(), 'US')
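A quick check of the mapper's behavior, using an abridged standalone copy of the table above so the snippet runs on its own:

```python
# Abridged copy of the region table, for a self-contained example
_REGIONS = {'us-east-1': 'US', 'eu-central-1': 'EU',
            'ap-southeast-1': 'SG', 'sa-east-1': 'SA'}

def map_aws_region_to_short_code(aws_region: str) -> str:
    # Lookup is case-insensitive; unknown regions fall back to 'US'
    return _REGIONS.get(aws_region.lower(), 'US')

print(map_aws_region_to_short_code('EU-CENTRAL-1'))  # EU
print(map_aws_region_to_short_code('mars-north-1'))  # US (fallback)
```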


@ -18,7 +18,6 @@ from src.generation.url_generator import generate_urls_for_batch
from src.interlinking.tiered_links import find_tiered_links
from src.interlinking.content_injection import inject_interlinks
from src.generation.site_assignment import assign_sites_to_batch, assign_site_to_single_article
from src.deployment.bunny_storage import BunnyStorageClient
from src.deployment.deployment_service import DeploymentService
from src.deployment.url_logger import URLLogger
from src.generation.image_generator import ImageGenerator
@ -1146,12 +1145,10 @@ class BatchProcessor:
"""
click.echo(f"\n Deployment: Starting automatic deployment for project {project_id}...")
storage_client = BunnyStorageClient(max_retries=3)
url_logger = URLLogger()
page_repo = SitePageRepository(self.content_repo.session)
deployment_service = DeploymentService(
storage_client=storage_client,
content_repo=self.content_repo,
site_repo=self.site_deployment_repo,
page_repo=page_repo,


@ -138,10 +138,13 @@ class TestBunnyStorageClient:
client = BunnyStorageClient(max_retries=3)
site = Mock(spec=SiteDeployment)
site.storage_zone_name = "test-zone"
site.storage_zone_password = "test-password"
site.storage_zone_region = "DE"
result = client.upload_file(
zone_name="test-zone",
zone_password="test-password",
zone_region="DE",
site=site,
file_path="test.html",
content="<html>Test</html>"
)
@ -166,11 +169,14 @@ class TestBunnyStorageClient:
client = BunnyStorageClient(max_retries=3)
site = Mock(spec=SiteDeployment)
site.storage_zone_name = "test-zone"
site.storage_zone_password = "bad-password"
site.storage_zone_region = "DE"
with pytest.raises(Exception) as exc_info:
client.upload_file(
zone_name="test-zone",
zone_password="bad-password",
zone_region="DE",
site=site,
file_path="test.html",
content="<html>Test</html>"
)
@ -181,7 +187,8 @@ class TestBunnyStorageClient:
class TestDeploymentService:
"""Test deployment service integration"""
def test_deploy_article(self, tmp_path):
@patch('src.deployment.deployment_service.create_storage_client')
def test_deploy_article(self, mock_create_client, tmp_path):
"""Test deploying a single article"""
mock_storage = Mock(spec=BunnyStorageClient)
mock_storage.upload_file.return_value = UploadResult(
@ -189,6 +196,7 @@ class TestDeploymentService:
file_path="test-article.html",
message="Success"
)
mock_create_client.return_value = mock_storage
mock_content_repo = Mock()
mock_site_repo = Mock()
@ -197,7 +205,6 @@ class TestDeploymentService:
url_logger = URLLogger(logs_dir=str(tmp_path))
service = DeploymentService(
storage_client=mock_storage,
content_repo=mock_content_repo,
site_repo=mock_site_repo,
page_repo=mock_page_repo,
@ -219,9 +226,15 @@ class TestDeploymentService:
url = service.deploy_article(article, site)
assert url == "https://www.example.com/test-article.html"
mock_storage.upload_file.assert_called_once()
mock_create_client.assert_called_once_with(site)
mock_storage.upload_file.assert_called_once_with(
site=site,
file_path="test-article.html",
content="<html>Test Content</html>"
)
def test_deploy_boilerplate_page(self, tmp_path):
@patch('src.deployment.deployment_service.create_storage_client')
def test_deploy_boilerplate_page(self, mock_create_client, tmp_path):
"""Test deploying a boilerplate page"""
mock_storage = Mock(spec=BunnyStorageClient)
mock_storage.upload_file.return_value = UploadResult(
@ -229,6 +242,7 @@ class TestDeploymentService:
file_path="about.html",
message="Success"
)
mock_create_client.return_value = mock_storage
mock_content_repo = Mock()
mock_site_repo = Mock()
@ -237,7 +251,6 @@ class TestDeploymentService:
url_logger = URLLogger(logs_dir=str(tmp_path))
service = DeploymentService(
storage_client=mock_storage,
content_repo=mock_content_repo,
site_repo=mock_site_repo,
page_repo=mock_page_repo,
@ -258,9 +271,15 @@ class TestDeploymentService:
url = service.deploy_boilerplate_page(page, site)
assert url == "https://www.example.com/about.html"
mock_storage.upload_file.assert_called_once()
mock_create_client.assert_called_once_with(site)
mock_storage.upload_file.assert_called_once_with(
site=site,
file_path="about.html",
content="<html>About Page</html>"
)
def test_deploy_batch(self, tmp_path):
@patch('src.deployment.deployment_service.create_storage_client')
def test_deploy_batch(self, mock_create_client, tmp_path):
"""Test deploying an entire batch"""
mock_storage = Mock(spec=BunnyStorageClient)
mock_storage.upload_file.return_value = UploadResult(
@ -268,6 +287,7 @@ class TestDeploymentService:
file_path="test.html",
message="Success"
)
mock_create_client.return_value = mock_storage
mock_content_repo = Mock()
mock_site_repo = Mock()
@ -302,7 +322,6 @@ class TestDeploymentService:
url_logger = URLLogger(logs_dir=str(tmp_path))
service = DeploymentService(
storage_client=mock_storage,
content_repo=mock_content_repo,
site_repo=mock_site_repo,
page_repo=mock_page_repo,
@ -314,6 +333,7 @@ class TestDeploymentService:
assert results['articles_deployed'] == 2
assert results['articles_failed'] == 0
assert results['pages_deployed'] == 0
assert mock_create_client.call_count == 2
assert mock_storage.upload_file.call_count == 2
assert mock_content_repo.mark_as_deployed.call_count == 2


@ -0,0 +1,239 @@
"""
Integration tests for Story 6.3: Database Schema Updates for Multi-Cloud
Tests the migration script and verifies the new fields work correctly with a real database.
"""
import pytest
from sqlalchemy import text, inspect
from src.database.session import db_manager
from src.database.models import SiteDeployment
from src.database.repositories import SiteDeploymentRepository
@pytest.fixture
def db_connection():
"""Get a database connection for testing"""
db_manager.initialize()
engine = db_manager.get_engine()
connection = engine.connect()
yield connection
connection.close()
db_manager.close()
def test_migration_adds_storage_provider_column(db_connection):
"""Test that migration adds storage_provider column"""
inspector = inspect(db_connection)
columns = {col['name']: col for col in inspector.get_columns('site_deployments')}
assert 'storage_provider' in columns
assert columns['storage_provider']['nullable'] is False
assert columns['storage_provider']['type'].length == 20
def test_migration_adds_s3_columns(db_connection):
"""Test that migration adds all S3-specific columns"""
inspector = inspect(db_connection)
columns = {col['name']: col for col in inspector.get_columns('site_deployments')}
assert 's3_bucket_name' in columns
assert columns['s3_bucket_name']['nullable'] is True
assert 's3_bucket_region' in columns
assert columns['s3_bucket_region']['nullable'] is True
assert 's3_custom_domain' in columns
assert columns['s3_custom_domain']['nullable'] is True
assert 's3_endpoint_url' in columns
assert columns['s3_endpoint_url']['nullable'] is True
def test_migration_creates_storage_provider_index(db_connection):
"""Test that migration creates index on storage_provider"""
inspector = inspect(db_connection)
indexes = inspector.get_indexes('site_deployments')
index_names = [idx['name'] for idx in indexes]
assert any('storage_provider' in name for name in index_names)
def test_existing_records_have_bunny_default(db_connection):
"""Test that existing records have storage_provider='bunny'"""
result = db_connection.execute(text("""
SELECT COUNT(*) as count
FROM site_deployments
WHERE storage_provider = 'bunny' OR storage_provider IS NULL
"""))
total_result = db_connection.execute(text("""
SELECT COUNT(*) as count FROM site_deployments
"""))
bunny_count = result.fetchone()[0]
total_count = total_result.fetchone()[0]
if total_count > 0:
assert bunny_count == total_count
def test_create_bunny_deployment_with_repository(db_connection):
"""Test creating a bunny deployment using repository (backward compatibility)"""
from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=db_connection)
session = Session()
try:
repo = SiteDeploymentRepository(session)
deployment = repo.create(
site_name="Test Bunny Site",
storage_zone_id=999,
storage_zone_name="test-zone",
storage_zone_password="test-password",
storage_zone_region="DE",
pull_zone_id=999,
pull_zone_bcdn_hostname="test-bunny.b-cdn.net"
)
assert deployment.storage_provider == "bunny"
assert deployment.s3_bucket_name is None
assert deployment.s3_bucket_region is None
assert deployment.id is not None
session.commit()
retrieved = repo.get_by_id(deployment.id)
assert retrieved is not None
assert retrieved.storage_provider == "bunny"
finally:
session.rollback()
session.close()
def test_create_s3_deployment_with_repository(db_connection):
"""Test creating an S3 deployment using repository with new fields"""
from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=db_connection)
session = Session()
try:
repo = SiteDeploymentRepository(session)
deployment = repo.create(
site_name="Test S3 Site",
storage_zone_id=888,
storage_zone_name="s3-zone",
storage_zone_password="s3-password",
storage_zone_region="NY",
pull_zone_id=888,
pull_zone_bcdn_hostname="test-s3.b-cdn.net",
storage_provider="s3",
s3_bucket_name="my-test-bucket",
s3_bucket_region="us-east-1",
s3_custom_domain="cdn.example.com"
)
assert deployment.storage_provider == "s3"
assert deployment.s3_bucket_name == "my-test-bucket"
assert deployment.s3_bucket_region == "us-east-1"
assert deployment.s3_custom_domain == "cdn.example.com"
assert deployment.s3_endpoint_url is None
session.commit()
retrieved = repo.get_by_id(deployment.id)
assert retrieved is not None
assert retrieved.storage_provider == "s3"
assert retrieved.s3_bucket_name == "my-test-bucket"
finally:
session.rollback()
session.close()
def test_create_s3_compatible_deployment_with_repository(db_connection):
"""Test creating an S3-compatible deployment with custom endpoint"""
from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=db_connection)
session = Session()
try:
repo = SiteDeploymentRepository(session)
deployment = repo.create(
site_name="Test DO Spaces Site",
storage_zone_id=777,
storage_zone_name="do-zone",
storage_zone_password="do-password",
storage_zone_region="LA",
pull_zone_id=777,
pull_zone_bcdn_hostname="test-do.b-cdn.net",
storage_provider="s3_compatible",
s3_bucket_name="my-spaces-bucket",
s3_bucket_region="nyc3",
s3_endpoint_url="https://nyc3.digitaloceanspaces.com"
)
assert deployment.storage_provider == "s3_compatible"
assert deployment.s3_bucket_name == "my-spaces-bucket"
assert deployment.s3_bucket_region == "nyc3"
assert deployment.s3_endpoint_url == "https://nyc3.digitaloceanspaces.com"
session.commit()
retrieved = repo.get_by_id(deployment.id)
assert retrieved is not None
assert retrieved.storage_provider == "s3_compatible"
assert retrieved.s3_endpoint_url == "https://nyc3.digitaloceanspaces.com"
finally:
session.rollback()
session.close()
def test_model_fields_accessible(db_connection):
"""Test that all new model fields are accessible"""
from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=db_connection)
session = Session()
try:
repo = SiteDeploymentRepository(session)
deployment = repo.create(
site_name="Model Test Site",
storage_zone_id=666,
storage_zone_name="model-zone",
storage_zone_password="model-password",
storage_zone_region="DE",
pull_zone_id=666,
pull_zone_bcdn_hostname="model-test.b-cdn.net",
storage_provider="s3",
s3_bucket_name="model-bucket",
s3_bucket_region="us-west-2",
s3_custom_domain="model.example.com",
s3_endpoint_url="https://s3.us-west-2.amazonaws.com"
)
assert hasattr(deployment, 'storage_provider')
assert hasattr(deployment, 's3_bucket_name')
assert hasattr(deployment, 's3_bucket_region')
assert hasattr(deployment, 's3_custom_domain')
assert hasattr(deployment, 's3_endpoint_url')
assert deployment.storage_provider == "s3"
assert deployment.s3_bucket_name == "model-bucket"
assert deployment.s3_bucket_region == "us-west-2"
assert deployment.s3_custom_domain == "model.example.com"
assert deployment.s3_endpoint_url == "https://s3.us-west-2.amazonaws.com"
finally:
session.rollback()
session.close()


@ -0,0 +1,184 @@
"""
Unit tests for SiteDeploymentRepository
Story 6.3: Database Schema Updates for Multi-Cloud
"""
import pytest
from unittest.mock import Mock, MagicMock
from sqlalchemy.exc import IntegrityError
from src.database.repositories import SiteDeploymentRepository
from src.database.models import SiteDeployment
@pytest.fixture
def mock_session():
return Mock()
@pytest.fixture
def site_deployment_repo(mock_session):
return SiteDeploymentRepository(mock_session)
def test_create_site_deployment_bunny_default(site_deployment_repo, mock_session):
"""Test creating site deployment with default bunny provider (backward compatibility)"""
mock_session.add = Mock()
mock_session.commit = Mock()
mock_session.refresh = Mock()
deployment = site_deployment_repo.create(
site_name="Test Site",
storage_zone_id=1,
storage_zone_name="test-zone",
storage_zone_password="test-password",
storage_zone_region="DE",
pull_zone_id=1,
pull_zone_bcdn_hostname="test.b-cdn.net"
)
assert mock_session.add.called
assert mock_session.commit.called
assert mock_session.refresh.called
added_deployment = mock_session.add.call_args[0][0]
assert added_deployment.storage_provider == "bunny"
assert added_deployment.s3_bucket_name is None
assert added_deployment.s3_bucket_region is None
def test_create_site_deployment_with_s3_fields(site_deployment_repo, mock_session):
"""Test creating site deployment with S3 provider and S3 fields"""
mock_session.add = Mock()
mock_session.commit = Mock()
mock_session.refresh = Mock()
deployment = site_deployment_repo.create(
site_name="S3 Site",
storage_zone_id=1,
storage_zone_name="test-zone",
storage_zone_password="test-password",
storage_zone_region="DE",
pull_zone_id=1,
pull_zone_bcdn_hostname="test.b-cdn.net",
storage_provider="s3",
s3_bucket_name="my-bucket",
s3_bucket_region="us-east-1",
s3_custom_domain="cdn.example.com",
s3_endpoint_url=None
)
assert mock_session.add.called
assert mock_session.commit.called
added_deployment = mock_session.add.call_args[0][0]
assert added_deployment.storage_provider == "s3"
assert added_deployment.s3_bucket_name == "my-bucket"
assert added_deployment.s3_bucket_region == "us-east-1"
assert added_deployment.s3_custom_domain == "cdn.example.com"
assert added_deployment.s3_endpoint_url is None
def test_create_site_deployment_s3_compatible(site_deployment_repo, mock_session):
"""Test creating site deployment with S3-compatible provider and custom endpoint"""
mock_session.add = Mock()
mock_session.commit = Mock()
mock_session.refresh = Mock()
deployment = site_deployment_repo.create(
site_name="DO Spaces Site",
storage_zone_id=1,
storage_zone_name="test-zone",
storage_zone_password="test-password",
storage_zone_region="DE",
pull_zone_id=1,
pull_zone_bcdn_hostname="test.b-cdn.net",
storage_provider="s3_compatible",
s3_bucket_name="my-spaces-bucket",
s3_bucket_region="nyc3",
s3_endpoint_url="https://nyc3.digitaloceanspaces.com"
)
added_deployment = mock_session.add.call_args[0][0]
assert added_deployment.storage_provider == "s3_compatible"
assert added_deployment.s3_bucket_name == "my-spaces-bucket"
assert added_deployment.s3_bucket_region == "nyc3"
assert added_deployment.s3_endpoint_url == "https://nyc3.digitaloceanspaces.com"
def test_create_site_deployment_explicit_bunny(site_deployment_repo, mock_session):
"""Test creating site deployment with explicit bunny provider"""
mock_session.add = Mock()
mock_session.commit = Mock()
mock_session.refresh = Mock()
deployment = site_deployment_repo.create(
site_name="Bunny Site",
storage_zone_id=1,
storage_zone_name="test-zone",
storage_zone_password="test-password",
storage_zone_region="DE",
pull_zone_id=1,
pull_zone_bcdn_hostname="test.b-cdn.net",
storage_provider="bunny"
)
added_deployment = mock_session.add.call_args[0][0]
assert added_deployment.storage_provider == "bunny"
def test_create_site_deployment_duplicate_hostname_raises_error(site_deployment_repo, mock_session):
"""Test creating site deployment with duplicate hostname raises error"""
mock_session.add = Mock()
mock_session.commit = Mock(side_effect=IntegrityError("", "", ""))
mock_session.rollback = Mock()
with pytest.raises(ValueError) as exc_info:
site_deployment_repo.create(
site_name="Test Site",
storage_zone_id=1,
storage_zone_name="test-zone",
storage_zone_password="test-password",
storage_zone_region="DE",
pull_zone_id=1,
pull_zone_bcdn_hostname="test.b-cdn.net"
)
assert "already exists" in str(exc_info.value)
assert mock_session.rollback.called
def test_get_by_id(site_deployment_repo, mock_session):
"""Test getting site deployment by ID"""
mock_deployment = Mock(spec=SiteDeployment, id=1, site_name="Test Site")
mock_query = Mock()
mock_filter = Mock()
mock_filter.first = Mock(return_value=mock_deployment)
mock_query.filter = Mock(return_value=mock_filter)
mock_session.query = Mock(return_value=mock_query)
deployment = site_deployment_repo.get_by_id(1)
assert deployment == mock_deployment
assert mock_session.query.called
def test_get_all(site_deployment_repo, mock_session):
"""Test getting all site deployments"""
mock_deployments = [
Mock(spec=SiteDeployment, id=1, site_name="Site 1"),
Mock(spec=SiteDeployment, id=2, site_name="Site 2")
]
mock_query = Mock()
mock_query.all = Mock(return_value=mock_deployments)
mock_session.query = Mock(return_value=mock_query)
deployments = site_deployment_repo.get_all()
assert len(deployments) == 2
assert mock_session.query.called