
Epic 6: Multi-Cloud Storage Support

Epic Goal

To extend the deployment system to support AWS S3 and S3-compatible cloud storage providers (DigitalOcean Spaces, Backblaze B2, Linode Object Storage, etc.), providing flexibility beyond Bunny.net while maintaining backward compatibility with existing deployments.

Rationale

Currently, the system only supports Bunny.net storage, creating vendor lock-in and limiting deployment options. Many users have existing infrastructure on AWS S3 or prefer S3-compatible services for cost, performance, or compliance reasons. This epic will:

  • Increase Flexibility: Support multiple cloud storage providers
  • Reduce Vendor Lock-in: Enable migration between providers
  • Leverage Existing Infrastructure: Use existing S3 buckets and credentials
  • Maintain Compatibility: Existing Bunny.net deployments continue to work unchanged

Status

  • Story 6.1: 🔄 PLANNING (Storage Provider Abstraction)
  • Story 6.2: 🔄 PLANNING (S3 Client Implementation)
  • Story 6.3: 🔄 PLANNING (Database Schema Updates)
  • Story 6.4: 🔄 PLANNING (URL Generation for S3)
  • Story 6.5: 🔄 PLANNING (S3-Compatible Services Support)

Stories

Story 6.1: Storage Provider Abstraction Layer

Estimated Effort: 5 story points

As a developer, I want a unified storage interface that abstracts provider-specific details, so that the deployment service can work with any storage provider without code changes.

Acceptance Criteria:

  • Create a StorageClient protocol/interface with common methods:
    • upload_file(file_path: str, content: str, content_type: str) -> UploadResult
    • file_exists(file_path: str) -> bool
    • list_files(prefix: str = '') -> List[str]
  • Refactor BunnyStorageClient to implement the interface
  • Create a StorageClientFactory that returns the appropriate client based on provider type
  • Update DeploymentService to use the factory instead of hardcoding BunnyStorageClient
  • All existing Bunny.net deployments continue to work without changes
  • Unit tests verify interface compliance

Technical Notes:

  • Use Python Protocol (typing) or ABC for interface definition
  • Factory pattern: create_storage_client(site: SiteDeployment) -> StorageClient
  • Maintain backward compatibility: default provider is "bunny" if not specified
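The acceptance criterion "unit tests verify interface compliance" can be sketched with a `runtime_checkable` Protocol and an in-memory test double. Names here (`StorageClient`, `UploadResult`, `InMemoryStorageClient`) follow the interface proposed above; actual signatures may differ in the implementation.

```python
# Sketch: structural interface-compliance check via typing.Protocol.
from dataclasses import dataclass
from typing import List, Protocol, runtime_checkable


@dataclass
class UploadResult:
    success: bool
    url: str = ""


@runtime_checkable
class StorageClient(Protocol):
    def upload_file(self, file_path: str, content: str, content_type: str) -> UploadResult: ...
    def file_exists(self, file_path: str) -> bool: ...
    def list_files(self, prefix: str = "") -> List[str]: ...


class InMemoryStorageClient:
    """Test double that satisfies the protocol structurally (no inheritance needed)."""

    def __init__(self) -> None:
        self._files: dict = {}

    def upload_file(self, file_path: str, content: str, content_type: str) -> UploadResult:
        self._files[file_path] = content
        return UploadResult(success=True, url=f"https://example.com/{file_path}")

    def file_exists(self, file_path: str) -> bool:
        return file_path in self._files

    def list_files(self, prefix: str = "") -> List[str]:
        return [p for p in self._files if p.startswith(prefix)]


# runtime_checkable only verifies method *presence*, not signatures,
# so the unit tests should also exercise each method.
assert isinstance(InMemoryStorageClient(), StorageClient)
```

Because Protocols are structural, `BunnyStorageClient` needs no base-class change to "implement" the interface — the same `isinstance` check covers it.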

Story 6.2: AWS S3 Client Implementation

Estimated Effort: 8 story points

As a user, I want to deploy content to AWS S3 buckets, so that I can use my existing AWS infrastructure.

Acceptance Criteria:

  • Create S3StorageClient implementing StorageClient interface
  • Use boto3 library for AWS S3 operations
  • Support standard AWS S3 regions
  • Authentication via AWS credentials (access key ID, secret access key)
  • Handle bucket permissions (public read access required)
  • Upload files with correct content-type headers
  • Generate public URLs from bucket name and region
  • Support custom domain mapping (if configured)
  • Error handling for common S3 errors (403, 404, bucket not found, etc.)
  • Retry logic with exponential backoff (consistent with BunnyStorageClient)
  • Unit tests with mocked boto3 calls

Configuration:

  • AWS credentials from environment variables:
    • AWS_ACCESS_KEY_ID
    • AWS_SECRET_ACCESS_KEY
    • AWS_REGION (default region, can be overridden per-site)
  • Per-site configuration stored in database:
    • bucket_name: S3 bucket name
    • bucket_region: AWS region (optional, uses default if not set)
    • custom_domain: Optional custom domain for URL generation

URL Generation:

  • Default: https://{bucket_name}.s3.{region}.amazonaws.com/{file_path}
  • With custom domain: https://{custom_domain}/{file_path}
  • Support for path-style URLs if needed: https://s3.{region}.amazonaws.com/{bucket_name}/{file_path}

Technical Notes:

  • boto3 session management (reuse sessions for performance)
  • Content-type detection (text/html for HTML files)
  • Public read ACL or bucket policy required for public URLs
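The retry-with-exponential-backoff criterion, combined with "unit tests with mocked boto3 calls", suggests injecting the client so a stub can stand in for boto3. A minimal sketch (function and parameter names are illustrative, not the final API):

```python
# Sketch: exponential-backoff retry around an S3 put_object call.
# The client is injected so unit tests can substitute a stub for boto3.
import time
from typing import Any


def upload_with_retry(client: Any, bucket: str, key: str, body: str,
                      content_type: str, max_retries: int = 3,
                      base_delay: float = 0.5) -> None:
    """Try put_object up to max_retries times, doubling the delay each attempt."""
    for attempt in range(max_retries):
        try:
            client.put_object(Bucket=bucket, Key=key,
                              Body=body.encode("utf-8"),
                              ContentType=content_type)
            return
        except Exception:
            if attempt == max_retries - 1:
                raise  # exhausted retries: surface the error
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...


class FlakyStub:
    """Stands in for a mocked boto3 client: fails twice, then succeeds."""

    def __init__(self) -> None:
        self.calls = 0

    def put_object(self, **kwargs: Any) -> None:
        self.calls += 1
        if self.calls < 3:
            raise RuntimeError("transient S3 error")
```

The real boto3 client exposes the same `put_object(Bucket=..., Key=..., Body=..., ContentType=...)` call, so the production path only swaps the injected object.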

Story 6.3: Database Schema Updates for Multi-Cloud

Estimated Effort: 3 story points

As a developer, I want to store provider-specific configuration in the database, so that each site can use its preferred storage provider.

Acceptance Criteria:

  • Add storage_provider field to site_deployments table:
    • Type: String(20), Not Null, Default: 'bunny'
    • Values: 'bunny', 's3', 's3_compatible'
    • Indexed for query performance
  • Add S3-specific fields (nullable, only used when provider is 's3' or 's3_compatible'):
    • s3_bucket_name: String(255), Nullable
    • s3_bucket_region: String(50), Nullable
    • s3_custom_domain: String(255), Nullable
    • s3_endpoint_url: String(500), Nullable (for S3-compatible services)
  • Create migration script to:
    • Add new fields with appropriate defaults
    • Set storage_provider='bunny' for all existing records
    • Preserve all existing Bunny.net fields
  • Update SiteDeployment model with new fields
  • Update repository methods to handle new fields
  • Backward compatibility: existing queries continue to work

Migration Strategy:

  • Existing sites default to 'bunny' provider
  • No data loss or breaking changes
  • New fields are nullable to support gradual migration
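The migration steps above can be sketched against a throwaway SQLite copy of `site_deployments`; the real migration would go through the project's migration tooling, but the DDL sequence is the same. Column names follow this story's schema; the starting table shape is illustrative.

```python
# Illustrative migration applying the Story 6.3 schema changes.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE site_deployments (id INTEGER PRIMARY KEY, site_name TEXT)")
conn.execute("INSERT INTO site_deployments (site_name) VALUES ('legacy-site')")

# Step 1: add the provider column with the backward-compatible default,
# so every existing row becomes storage_provider='bunny' automatically.
conn.execute("ALTER TABLE site_deployments "
             "ADD COLUMN storage_provider TEXT NOT NULL DEFAULT 'bunny'")

# Step 2: add the nullable S3-specific columns.
for col in ("s3_bucket_name", "s3_bucket_region",
            "s3_custom_domain", "s3_endpoint_url"):
    conn.execute(f"ALTER TABLE site_deployments ADD COLUMN {col} TEXT")

# Step 3: index the provider column for query performance.
conn.execute("CREATE INDEX ix_site_deployments_storage_provider "
             "ON site_deployments (storage_provider)")

row = conn.execute("SELECT storage_provider FROM site_deployments").fetchone()
```

Because the new columns are nullable and the default is `'bunny'`, the migration is additive: no existing row changes meaning and the down-migration is a plain column drop.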

Story 6.4: URL Generation for S3 Providers

Estimated Effort: 3 story points

As a user, I want public URLs for S3-deployed content to be generated correctly, so that articles are accessible via the expected URLs.

Acceptance Criteria:

  • Update generate_public_url() in url_generator.py to handle S3 providers
  • Support multiple URL formats:
    • Virtual-hosted style: https://bucket.s3.region.amazonaws.com/file.html
    • Path-style: https://s3.region.amazonaws.com/bucket/file.html (if needed)
    • Custom domain: https://custom-domain.com/file.html
  • URL generation logic based on storage_provider field
  • Maintain existing behavior for Bunny.net (no changes)
  • Handle S3-compatible services with custom endpoints
  • Unit tests for all URL generation scenarios

Technical Notes:

  • Virtual-hosted style is default for AWS S3
  • Custom domain takes precedence if configured
  • S3-compatible services may need path-style URLs depending on endpoint
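The precedence rules above (custom domain first, then provider-specific format) can be sketched as a branch in `generate_public_url()`. This uses a plain dict in place of the `SiteDeployment` model, with the field names from Story 6.3; the Bunny.net branch is unchanged and omitted.

```python
# Sketch of the S3 branches of generate_public_url() (url_generator.py).
def generate_public_url(site: dict, file_path: str) -> str:
    # Custom domain takes precedence for every provider.
    if site.get("s3_custom_domain"):
        return f"https://{site['s3_custom_domain']}/{file_path}"
    if site["storage_provider"] == "s3_compatible":
        # The stored endpoint already names the service host; many
        # S3-compatible services accept path-style addressing.
        endpoint = site["s3_endpoint_url"].rstrip("/")
        return f"{endpoint}/{site['s3_bucket_name']}/{file_path}"
    if site["storage_provider"] == "s3":
        # Virtual-hosted style, the AWS default.
        return (f"https://{site['s3_bucket_name']}.s3."
                f"{site['s3_bucket_region']}.amazonaws.com/{file_path}")
    # 'bunny' falls through to the existing Bunny.net logic (not shown).
    raise ValueError(f"Unhandled provider in this sketch: {site['storage_provider']}")
```

Keeping the dispatch on `storage_provider` in one place means the unit tests enumerate exactly one scenario per (provider, custom-domain) combination.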

Story 6.5: S3-Compatible Services Support

Estimated Effort: 5 story points

As a user, I want to deploy to S3-compatible services (DigitalOcean Spaces, Backblaze B2, Linode Object Storage), so that I can use cost-effective alternatives to AWS.

Acceptance Criteria:

  • Extend S3StorageClient to support S3-compatible endpoints
  • Support provider-specific configurations:
    • DigitalOcean Spaces: Custom endpoint (e.g., https://nyc3.digitaloceanspaces.com)
    • Backblaze B2: Custom endpoint and authentication
    • Linode Object Storage: Custom endpoint
  • Store s3_endpoint_url per site for custom endpoints
  • Handle provider-specific authentication differences
  • Support provider-specific URL generation
  • Configuration examples in documentation
  • Unit tests for each supported service

Supported Services (Initial):

  • DigitalOcean Spaces
  • Backblaze B2
  • Linode Object Storage
  • (Others can be added as needed)

Configuration:

  • Per-service credentials in .env or per-site in database
  • Endpoint URLs stored per-site in s3_endpoint_url field
  • Provider type stored in storage_provider ('s3_compatible')

Technical Notes:

  • Most S3-compatible services work with boto3 using custom endpoints
  • Some may require minor authentication adjustments
  • URL generation may differ (e.g., DigitalOcean uses different domain structure)
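Since boto3 reaches S3-compatible services purely through `endpoint_url`, the per-site connection settings reduce to a keyword-argument builder. A sketch (credential field names are assumptions; the real client would be created with `boto3.client(**make_client_kwargs(site))`):

```python
# Sketch: build boto3.client(...) keyword arguments from per-site config,
# without importing boto3 itself.
def make_client_kwargs(site: dict) -> dict:
    kwargs = {
        "service_name": "s3",
        "aws_access_key_id": site["access_key_id"],
        "aws_secret_access_key": site["secret_access_key"],
    }
    if site["storage_provider"] == "s3_compatible":
        # Spaces, B2, and Linode are all addressed via a custom endpoint.
        kwargs["endpoint_url"] = site["s3_endpoint_url"]
    if site.get("s3_bucket_region"):
        kwargs["region_name"] = site["s3_bucket_region"]
    return kwargs
```

This keeps `S3StorageClient` identical for AWS and S3-compatible targets: the only difference between providers is which kwargs reach `boto3.client`.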

Technical Considerations

Architecture Changes

  1. Interface/Protocol Design:

    from typing import List, Protocol

    class StorageClient(Protocol):
        def upload_file(self, file_path: str, content: str,
                        content_type: str) -> 'UploadResult': ...
        def file_exists(self, file_path: str) -> bool: ...
        def list_files(self, prefix: str = '') -> List[str]: ...
    
  2. Factory Pattern:

    def create_storage_client(site: SiteDeployment) -> StorageClient:
        if site.storage_provider == 'bunny':
            return BunnyStorageClient()
        elif site.storage_provider in ('s3', 's3_compatible'):
            return S3StorageClient(site)
        else:
            raise ValueError(f"Unknown provider: {site.storage_provider}")
    
  3. Dependency Injection:

    • DeploymentService receives StorageClient from factory
    • No hardcoded provider dependencies

Credential Management

Option A: Environment Variables (Recommended for AWS)

  • Global AWS credentials in .env
  • Simple, secure, follows AWS best practices
  • Works well for single-account deployments

Option B: Per-Site Credentials

  • Store credentials in database (encrypted)
  • Required for multi-account or S3-compatible services
  • More complex but more flexible

Decision Needed: Which approach for initial implementation?

URL Generation Strategy

  • Bunny.net: uses CDN hostname (custom or bunny.net domain)
  • AWS S3: uses bucket name + region, or custom domain
  • S3-Compatible: uses service-specific endpoint or custom domain

All providers should support custom domain mapping for consistent URLs.

Backward Compatibility

  • All existing Bunny.net sites continue to work
  • Default storage_provider='bunny' for existing records
  • No breaking changes to existing APIs
  • Migration is optional (sites can stay on Bunny.net)

Testing Strategy

  • Unit tests with mocked boto3/requests
  • Integration tests with test S3 buckets (optional)
  • Backward compatibility tests for Bunny.net
  • URL generation tests for all providers

Dependencies

  • boto3 library for AWS S3 operations
  • Existing deployment infrastructure (Epic 4)
  • Database migration tools

Open Questions

  1. Credential Storage: Per-site in DB vs. global env vars? (Recommendation: Start with env vars, add per-site later if needed)

  2. S3-Compatible Priority: Which services to support first? (Recommendation: DigitalOcean Spaces, then Backblaze B2)

  3. Custom Domains: How are custom domains configured? Manual setup or automated? (Recommendation: Manual for now, document process)

  4. Bucket Provisioning: Should we automate S3 bucket creation, or require manual setup? (Recommendation: Manual for now, similar to current Bunny.net approach)

  5. Public Access: How to ensure buckets are publicly readable? (Recommendation: Document requirements, validate in tests)

  6. Migration Path: Should we provide tools to migrate existing Bunny.net sites to S3? (Recommendation: Defer to future story)

Success Metrics

  • Deploy content to AWS S3 successfully
  • Deploy content to at least one S3-compatible service
  • All existing Bunny.net deployments continue working
  • URL generation works correctly for all providers
  • Zero breaking changes to existing functionality