Big-Link-Man/docs/prd/epic-6-multi-cloud-storage.md

Epic 6: Multi-Cloud Storage Support

Epic Goal

To extend the deployment system to support AWS S3 and S3-compatible cloud storage providers (DigitalOcean Spaces, Backblaze B2, Linode Object Storage, etc.), providing flexibility beyond Bunny.net while maintaining backward compatibility with existing deployments.

Rationale

Currently, the system only supports Bunny.net storage, creating vendor lock-in and limiting deployment options. Many users have existing infrastructure on AWS S3 or prefer S3-compatible services for cost, performance, or compliance reasons. This epic will:

  • Increase Flexibility: Support multiple cloud storage providers
  • Reduce Vendor Lock-in: Enable migration between providers
  • Leverage Existing Infrastructure: Use existing S3 buckets and credentials
  • Maintain Compatibility: Existing Bunny.net deployments continue to work unchanged

Status

  • Story 6.1: 🔄 PLANNING (Storage Provider Abstraction)
  • Story 6.2: 🔄 PLANNING (S3 Client Implementation)
  • Story 6.3: 🔄 PLANNING (Database Schema Updates)
  • Story 6.4: 🔄 PLANNING (URL Generation for S3)
  • Story 6.5: 🔄 PLANNING (S3-Compatible Services Support)
  • Story 6.6: 🔄 PLANNING (Bucket Provisioning Script)

Stories

Story 6.1: Storage Provider Abstraction Layer

Estimated Effort: 3 story points

As a developer, I want a simple way to support multiple storage providers without cluttering DeploymentService with if/elif chains, so that adding new providers (eventually 8+) is straightforward.

Acceptance Criteria:

  • Create a simple factory function create_storage_client(site: SiteDeployment) that returns the appropriate client:
    • 'bunny' → BunnyStorageClient()
    • 's3' → S3StorageClient()
    • 's3_compatible' → S3StorageClient() (with custom endpoint)
    • Future providers added here
  • Refactor BunnyStorageClient.upload_file() to accept site: SiteDeployment parameter:
    • Change from: upload_file(zone_name, zone_password, zone_region, file_path, content)
    • Change to: upload_file(site: SiteDeployment, file_path: str, content: str)
    • Client extracts bunny-specific fields from site internally
  • Update DeploymentService to use factory and unified interface:
    • Remove hardcoded BunnyStorageClient from __init__
    • In deploy_article() and deploy_boilerplate_page(): create client per site
    • Call: client.upload_file(site, file_path, content) (same signature for all providers)
  • Optional: Add StorageClient Protocol for type hints (helps with 8+ providers)
  • All existing Bunny.net deployments continue to work without changes
  • Unit tests verify factory returns correct clients

Technical Notes:

  • Factory function is simple if/elif chain (one place to maintain)
  • All clients use same method signature: upload_file(site, file_path, content)
  • Each client extracts provider-specific fields from site object internally
  • Protocol is optional but recommended for type safety with many providers
  • Factory pattern keeps DeploymentService clean (no provider-specific logic)
  • Backward compatibility: default provider is "bunny" if not specified

Story 6.2: AWS S3 Client Implementation

Estimated Effort: 8 story points

As a user, I want to deploy content to AWS S3 buckets, so that I can use my existing AWS infrastructure.

Acceptance Criteria:

  • Create S3StorageClient implementing StorageClient interface
  • Use boto3 library for AWS S3 operations
  • Support standard AWS S3 regions
  • Authentication via AWS credentials from environment variables
  • Automatically configure bucket for public READ access only (not write):
    • Apply public-read ACL or bucket policy on first upload
    • Ensure bucket allows public read access (disable block public access settings)
    • Verify public read access is enabled before deployment
    • Security: Never enable public write access - only read permissions
  • Upload files with correct content-type headers
  • Generate public URLs from bucket name and region
  • Support custom domain mapping (if configured)
  • Error handling for common S3 errors (403, 404, bucket not found, etc.)
  • Retry logic with exponential backoff (consistent with BunnyStorageClient)
  • Unit tests with mocked boto3 calls

Configuration:

  • AWS credentials from environment variables (global):
    • AWS_ACCESS_KEY_ID
    • AWS_SECRET_ACCESS_KEY
    • AWS_REGION (default region, can be overridden per-site)
  • Per-site configuration stored in database:
    • s3_bucket_name: S3 bucket name
    • s3_bucket_region: AWS region (optional, uses default if not set)
    • s3_custom_domain: Optional custom domain for URL generation (manual setup)

URL Generation:

  • Default: https://{bucket_name}.s3.{region}.amazonaws.com/{file_path}
  • With custom domain: https://{custom_domain}/{file_path}
  • Support for path-style URLs if needed: https://s3.{region}.amazonaws.com/{bucket_name}/{file_path}

Technical Notes:

  • boto3 session management (reuse sessions for performance)
  • Content-type detection (text/html for HTML files)
  • Automatic public read access configuration (read-only, never write):
    • Check and configure bucket policy for public read access only
    • Disable "Block Public Access" settings for read access
    • Apply public-read ACL to uploaded objects (not public-write)
    • Validate public read access before deployment
    • Security: Uploads require authenticated credentials, only reads are public
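In boto3 terms, the upload flow reduces to one put_object call. A minimal sketch of the argument assembly (the SiteDeployment stand-in and helper name here are illustrative, not the real model; retry logic and error mapping are omitted):

```python
import mimetypes
from dataclasses import dataclass

@dataclass
class SiteDeployment:
    # Minimal stand-in for the real model; only the fields used here.
    s3_bucket_name: str
    s3_bucket_region: str

def build_put_object_args(site: SiteDeployment, file_path: str, content: str) -> dict:
    """Build kwargs for boto3 S3 put_object: correct content type plus a
    public-read ACL (reads are public, writes stay authenticated)."""
    # Detect content type from the extension; default to text/html since
    # deployed articles are HTML pages.
    content_type = mimetypes.guess_type(file_path)[0] or "text/html"
    return {
        "Bucket": site.s3_bucket_name,
        "Key": file_path.lstrip("/"),
        "Body": content.encode("utf-8"),
        "ContentType": content_type,
        "ACL": "public-read",
    }

# The client would then reuse one boto3 session per deployment:
#   session = boto3.session.Session(region_name=site.s3_bucket_region)
#   session.client("s3").put_object(**build_put_object_args(site, path, html))
```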

Story 6.3: Database Schema Updates for Multi-Cloud

Estimated Effort: 3 story points

As a developer, I want to store provider-specific configuration in the database, so that each site can use its preferred storage provider.

Acceptance Criteria:

  • Add storage_provider field to site_deployments table:
    • Type: String(20), Not Null, Default: 'bunny'
    • Values: 'bunny', 's3', 's3_compatible'
    • Indexed for query performance
  • Add S3-specific fields (nullable, only used when provider is 's3' or 's3_compatible'):
    • s3_bucket_name: String(255), Nullable
    • s3_bucket_region: String(50), Nullable
    • s3_custom_domain: String(255), Nullable
    • s3_endpoint_url: String(500), Nullable (for S3-compatible services)
  • Create migration script to:
    • Add new fields with appropriate defaults
    • Set storage_provider='bunny' for all existing records
    • Preserve all existing Bunny.net fields
  • Update SiteDeployment model with new fields
  • Update repository methods to handle new fields
  • Backward compatibility: existing queries continue to work

Migration Strategy:

  • Existing sites default to 'bunny' provider
  • No data loss or breaking changes
  • New fields are nullable to support gradual migration
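The additive nature of the migration can be illustrated with plain SQL (shown here against an in-memory SQLite table for brevity; the real migration would use the project's migration tooling against the actual site_deployments schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE site_deployments (id INTEGER PRIMARY KEY, storage_zone_name TEXT)")
conn.execute("INSERT INTO site_deployments (storage_zone_name) VALUES ('zone-a')")

# Additive migration: the new columns are nullable except storage_provider,
# which defaults to 'bunny' so every existing row stays a Bunny.net site.
conn.execute("ALTER TABLE site_deployments ADD COLUMN storage_provider TEXT NOT NULL DEFAULT 'bunny'")
conn.execute("ALTER TABLE site_deployments ADD COLUMN s3_bucket_name TEXT")
conn.execute("ALTER TABLE site_deployments ADD COLUMN s3_bucket_region TEXT")
conn.execute("ALTER TABLE site_deployments ADD COLUMN s3_custom_domain TEXT")
conn.execute("ALTER TABLE site_deployments ADD COLUMN s3_endpoint_url TEXT")
conn.execute("CREATE INDEX ix_site_deployments_storage_provider ON site_deployments (storage_provider)")

# The pre-existing record now reads ('bunny', None): no data loss,
# no breaking change to existing queries.
row = conn.execute("SELECT storage_provider, s3_bucket_name FROM site_deployments").fetchone()
```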

Story 6.4: URL Generation for S3 Providers

Estimated Effort: 3 story points

As a user, I want public URLs for S3-deployed content to be generated correctly, so that articles are accessible via the expected URLs.

Acceptance Criteria:

  • Update generate_public_url() in url_generator.py to handle S3 providers
  • Support multiple URL formats:
    • Virtual-hosted style: https://bucket.s3.region.amazonaws.com/file.html
    • Path-style: https://s3.region.amazonaws.com/bucket/file.html (if needed)
    • Custom domain: https://custom-domain.com/file.html
  • URL generation logic based on storage_provider field
  • Maintain existing behavior for Bunny.net (no changes)
  • Handle S3-compatible services with custom endpoints
  • Unit tests for all URL generation scenarios

Technical Notes:

  • Virtual-hosted style is default for AWS S3
  • Custom domain takes precedence if configured
  • S3-compatible services may need path-style URLs depending on endpoint
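Putting the formats and precedence rules together, the dispatch in generate_public_url() might look like this (a sketch over the Story 6.3 fields; the Bunny.net branch stays in the existing url_generator.py code path and is omitted here):

```python
def generate_public_url(site, file_path: str) -> str:
    """Build the public URL for a deployed file from the site's
    provider and domain fields (sketch; Bunny.net handled elsewhere)."""
    path = file_path.lstrip("/")
    if site.s3_custom_domain:
        # Custom domain takes precedence for every provider.
        return f"https://{site.s3_custom_domain}/{path}"
    if site.storage_provider == "s3":
        # Virtual-hosted style, the AWS S3 default.
        return f"https://{site.s3_bucket_name}.s3.{site.s3_bucket_region}.amazonaws.com/{path}"
    if site.storage_provider == "s3_compatible":
        # Path-style against the stored endpoint; some services need this.
        endpoint = site.s3_endpoint_url.rstrip("/")
        return f"{endpoint}/{site.s3_bucket_name}/{path}"
    raise ValueError(f"Unknown provider: {site.storage_provider}")
```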

Story 6.5: S3-Compatible Services Support

Estimated Effort: 5 story points

As a user, I want to deploy to S3-compatible services (Linode Object Storage, DreamHost Object Storage, DigitalOcean Spaces), so that I can use S3-compatible storage providers the same way I use Bunny.net.

Acceptance Criteria:

  • Extend S3StorageClient to support S3-compatible endpoints
  • Support provider-specific configurations:
    • Linode Object Storage: Custom endpoint
    • DreamHost Object Storage: Custom endpoint
    • DigitalOcean Spaces: Custom endpoint (e.g., https://nyc3.digitaloceanspaces.com)
  • Store s3_endpoint_url per site for custom endpoints
  • Handle provider-specific authentication differences
  • Support provider-specific URL generation
  • Configuration examples in documentation
  • Unit tests for each supported service

Supported Services:

  • AWS S3 (standard)
  • Linode Object Storage
  • DreamHost Object Storage
  • DigitalOcean Spaces
  • Backblaze B2
  • Cloudflare R2
  • (Other S3-compatible services can be added as needed)

Configuration:

  • Per-service credentials in .env (global environment variables):
    • LINODE_ACCESS_KEY / LINODE_SECRET_KEY (for Linode)
    • DREAMHOST_ACCESS_KEY / DREAMHOST_SECRET_KEY (for DreamHost)
    • DO_SPACES_ACCESS_KEY / DO_SPACES_SECRET_KEY (for DigitalOcean)
  • Endpoint URLs stored per-site in s3_endpoint_url field
  • Provider type stored in storage_provider ('s3_compatible')
  • Automatic public access configuration (same as AWS S3)

Technical Notes:

  • Most S3-compatible services work with boto3 using custom endpoints
  • Some may require minor authentication adjustments
  • URL generation may differ (e.g., DigitalOcean uses different domain structure)
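With boto3, supporting a compatible service is mostly a matter of passing endpoint_url alongside the standard credentials. A sketch of the client construction (the helper name is illustrative; it returns the kwargs so the assembly is testable without a network call):

```python
def s3_client_kwargs(site, access_key: str, secret_key: str) -> dict:
    """Kwargs for boto3.client('s3', ...): S3-compatible services only
    need an endpoint_url on top of the usual credentials and region."""
    kwargs = {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }
    if site.s3_bucket_region:
        kwargs["region_name"] = site.s3_bucket_region
    if site.s3_endpoint_url:
        # e.g. https://nyc3.digitaloceanspaces.com; plain AWS S3 leaves
        # this unset and boto3 uses the standard AWS endpoint.
        kwargs["endpoint_url"] = site.s3_endpoint_url
    return kwargs

# client = boto3.client("s3", **s3_client_kwargs(site, key, secret))
```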

Story 6.6: S3 Bucket Provisioning Script

Estimated Effort: 3 story points

As a user, I want a script to automatically create and configure S3 buckets with proper public access settings, so that I can quickly set up new storage targets without manual AWS console work.

Acceptance Criteria:

  • Create CLI command: provision-s3-bucket --name <bucket> --region <region> [--provider <s3|linode|dreamhost|do>]
  • Automatically create bucket if it doesn't exist
  • Configure bucket for public read access only (not write):
    • Apply bucket policy allowing public read (GET requests only)
    • Disable "Block Public Access" settings for read access
    • Set appropriate CORS headers if needed
    • Security: Never enable public write access - uploads require authentication
  • Support multiple providers:
    • AWS S3 (standard regions)
    • Linode Object Storage
    • DreamHost Object Storage
    • DigitalOcean Spaces
  • Validate bucket configuration after creation
  • Option to link bucket to existing site deployment
  • Clear error messages for common issues (bucket name conflicts, permissions, etc.)
  • Documentation with examples for each provider

Usage Examples:

# Create AWS S3 bucket
provision-s3-bucket --name my-site-bucket --region us-east-1

# Create Linode bucket
provision-s3-bucket --name my-site-bucket --region us-east-1 --provider linode

# Create and link to site
provision-s3-bucket --name my-site-bucket --region us-east-1 --site-id 5

Technical Notes:

  • Uses boto3 for all providers (with custom endpoints for S3-compatible)
  • Bucket naming validation (AWS rules apply)
  • Idempotent: safe to run multiple times
  • Optional: Can be integrated into provision-site command later
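The read-only policy the script applies could be built like this (a sketch; the provisioning calls in the trailing comment are indicative boto3 usage, not the final script):

```python
import json

def public_read_policy(bucket_name: str) -> str:
    """Bucket policy granting anonymous s3:GetObject only; no write
    actions are granted, so uploads still require credentials."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "PublicReadOnly",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket_name}/*",
        }],
    })

# Provisioning then runs roughly (idempotent -- create_bucket and
# put_bucket_policy can be re-applied safely):
#   s3.create_bucket(Bucket=name)   # plus CreateBucketConfiguration outside us-east-1
#   s3.put_public_access_block(Bucket=name, PublicAccessBlockConfiguration={...})
#   s3.put_bucket_policy(Bucket=name, Policy=public_read_policy(name))
```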

Technical Considerations

Architecture Changes

  1. Unified Method Signature:

    # All storage clients use the same signature
    class BunnyStorageClient:
        def upload_file(self, site: SiteDeployment, file_path: str, content: str) -> UploadResult:
            # Extract bunny-specific fields from site
            zone_name = site.storage_zone_name
            zone_password = site.storage_zone_password
            # ... do upload
    
    class S3StorageClient:
        def upload_file(self, site: SiteDeployment, file_path: str, content: str) -> UploadResult:
            # Extract S3-specific fields from site
            bucket_name = site.s3_bucket_name
            # ... do upload
    
  2. Simple Factory Function:

    def create_storage_client(site: SiteDeployment):
        """Create appropriate storage client based on site provider"""
        if site.storage_provider == 'bunny':
            return BunnyStorageClient()
        elif site.storage_provider == 's3':
            return S3StorageClient()
        elif site.storage_provider == 's3_compatible':
            return S3StorageClient()  # Same client, uses site.s3_endpoint_url
        # Future: elif site.storage_provider == 'cloudflare': ...
        else:
            raise ValueError(f"Unknown provider: {site.storage_provider}")
    
  3. Clean DeploymentService:

    # In deploy_article():
    client = create_storage_client(site)  # One line, works for all providers
    client.upload_file(site, file_path, content)  # Same call for all
    
  4. Optional Protocol (recommended for type safety with 8+ providers):

    from typing import Protocol
    
    class StorageClient(Protocol):
        def upload_file(self, site: SiteDeployment, file_path: str, content: str) -> UploadResult: ...
    

Credential Management

Decision: Global Environment Variables

  • All credentials stored in .env file (global)
  • AWS: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION
  • Linode: LINODE_ACCESS_KEY, LINODE_SECRET_KEY
  • DreamHost: DREAMHOST_ACCESS_KEY, DREAMHOST_SECRET_KEY
  • DigitalOcean: DO_SPACES_ACCESS_KEY, DO_SPACES_SECRET_KEY
  • Simple, secure, follows cloud provider best practices
  • Works well for single-account deployments
  • Per-site credentials can be added later if needed for multi-account scenarios
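A hypothetical lookup table makes the mapping explicit (the service keys here are illustrative and not stored in the schema; the env var names are the ones listed above):

```python
import os

# Hypothetical helper: maps each service to its global env-var pair.
_CREDENTIAL_VARS = {
    "aws": ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"),
    "linode": ("LINODE_ACCESS_KEY", "LINODE_SECRET_KEY"),
    "dreamhost": ("DREAMHOST_ACCESS_KEY", "DREAMHOST_SECRET_KEY"),
    "digitalocean": ("DO_SPACES_ACCESS_KEY", "DO_SPACES_SECRET_KEY"),
}

def resolve_credentials(service: str) -> tuple[str, str]:
    """Look up the (access_key, secret_key) pair for a service,
    failing loudly when a variable is missing from the environment."""
    key_var, secret_var = _CREDENTIAL_VARS[service]
    try:
        return os.environ[key_var], os.environ[secret_var]
    except KeyError as missing:
        raise RuntimeError(f"Missing credential env var: {missing}") from None
```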

URL Generation Strategy

  • Bunny.net: Uses CDN hostname (custom or bunny.net domain)
  • AWS S3: Uses bucket name + region or custom domain (manual setup)
  • S3-Compatible: Uses service-specific endpoint or custom domain (manual setup)

Custom domain mapping is supported but requires manual configuration (documented, not automated).

Backward Compatibility

  • All existing Bunny.net sites continue to work
  • Default storage_provider='bunny' for existing records
  • No breaking changes to existing APIs
  • No migration tools provided (sites can stay on Bunny.net or be manually reconfigured)

Testing Strategy

  • Unit tests with mocked boto3/requests
  • Integration tests with test S3 buckets (optional)
  • Backward compatibility tests for Bunny.net
  • URL generation tests for all providers

Dependencies

  • boto3 library for AWS S3 operations
  • Existing deployment infrastructure (Epic 4)
  • Database migration tools

Decisions Made

  1. Credential Storage: Global environment variables (Option A)

    • All credentials in .env file
    • Simple, secure, follows cloud provider best practices
  2. S3-Compatible Services: Support Linode, DreamHost, and DigitalOcean

    • All services supported equally - no priority/decision logic in this epic
    • Provider selection happens elsewhere in the codebase
    • This epic just enables S3-compatible services to work the same as Bunny.net
  3. Custom Domains: Manual setup (deferred automation)

    • Custom domains require manual configuration
    • Documented process, no automation in this epic
  4. Bucket Provisioning: Manual with optional script (Story 6.6)

    • Primary: Manual bucket creation
    • Optional: provision-s3-bucket CLI script for automated setup
  5. Public Access: Automatic configuration (read-only)

    • System automatically configures buckets for public READ access only
    • Applies bucket policies for read access, disables block public access, sets public-read ACLs
    • Security: Never enables public write access - all uploads require authenticated credentials
  6. Migration Path: No migration tools

    • No automated migration from Bunny.net to S3
    • Sites can be manually reconfigured if needed

Success Metrics

  • Deploy content to AWS S3 successfully
  • Deploy content to S3-compatible services (Linode, DreamHost, DigitalOcean) successfully
  • All existing Bunny.net deployments continue working
  • URL generation works correctly for all providers
  • Buckets automatically configured for public read access (not write)
  • Zero breaking changes to existing functionality
  • Bucket provisioning script works for all supported providers