Epic 6: Multi-Cloud Storage Support
Epic Goal
To extend the deployment system to support AWS S3 and S3-compatible cloud storage providers (DigitalOcean Spaces, Backblaze B2, Linode Object Storage, etc.), providing flexibility beyond Bunny.net while maintaining backward compatibility with existing deployments.
Rationale
Currently, the system only supports Bunny.net storage, creating vendor lock-in and limiting deployment options. Many users have existing infrastructure on AWS S3 or prefer S3-compatible services for cost, performance, or compliance reasons. This epic will:
- Increase Flexibility: Support multiple cloud storage providers
- Reduce Vendor Lock-in: Enable migration between providers
- Leverage Existing Infrastructure: Use existing S3 buckets and credentials
- Maintain Compatibility: Existing Bunny.net deployments continue to work unchanged
Status
- Story 6.1: 🔄 PLANNING (Storage Provider Abstraction)
- Story 6.2: 🔄 PLANNING (S3 Client Implementation)
- Story 6.3: 🔄 PLANNING (Database Schema Updates)
- Story 6.4: 🔄 PLANNING (URL Generation for S3)
- Story 6.5: 🔄 PLANNING (S3-Compatible Services Support)
Stories
Story 6.1: Storage Provider Abstraction Layer
Estimated Effort: 5 story points
As a developer, I want a unified storage interface that abstracts provider-specific details, so that the deployment service can work with any storage provider without code changes.
Acceptance Criteria:
- Create a `StorageClient` protocol/interface with common methods:
  - `upload_file(file_path: str, content: str, content_type: str) -> UploadResult`
  - `file_exists(file_path: str) -> bool`
  - `list_files(prefix: str = '') -> List[str]`
- Refactor `BunnyStorageClient` to implement the interface
- Create a `StorageClientFactory` that returns the appropriate client based on provider type
- Update `DeploymentService` to use the factory instead of hardcoding `BunnyStorageClient`
- All existing Bunny.net deployments continue to work without changes
- Unit tests verify interface compliance
Technical Notes:
- Use Python `Protocol` (typing) or ABC for the interface definition
- Factory pattern: `create_storage_client(site: SiteDeployment) -> StorageClient`
- Maintain backward compatibility: default provider is "bunny" if not specified
Story 6.2: AWS S3 Client Implementation
Estimated Effort: 8 story points
As a user, I want to deploy content to AWS S3 buckets, so that I can use my existing AWS infrastructure.
Acceptance Criteria:
- Create `S3StorageClient` implementing the `StorageClient` interface
- Use the boto3 library for AWS S3 operations
- Support standard AWS S3 regions
- Authentication via AWS credentials (access key ID, secret access key)
- Handle bucket permissions (public read access required)
- Upload files with correct content-type headers
- Generate public URLs from bucket name and region
- Support custom domain mapping (if configured)
- Error handling for common S3 errors (403, 404, bucket not found, etc.)
- Retry logic with exponential backoff (consistent with BunnyStorageClient)
- Unit tests with mocked boto3 calls
Configuration:
- AWS credentials from environment variables:
  - `AWS_ACCESS_KEY_ID`
  - `AWS_SECRET_ACCESS_KEY`
  - `AWS_REGION` (default region, can be overridden per-site)
- Per-site configuration stored in database:
  - `bucket_name`: S3 bucket name
  - `bucket_region`: AWS region (optional, uses default if not set)
  - `custom_domain`: Optional custom domain for URL generation
URL Generation:
- Default: `https://{bucket_name}.s3.{region}.amazonaws.com/{file_path}`
- With custom domain: `https://{custom_domain}/{file_path}`
- Support for path-style URLs if needed: `https://s3.{region}.amazonaws.com/{bucket_name}/{file_path}`
Technical Notes:
- boto3 session management (reuse sessions for performance)
- Content-type detection (text/html for HTML files)
- Public read ACL or bucket policy required for public URLs
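The notes above can be sketched as follows. This is a minimal illustration, not the implementation: the `UploadResult` type here is a hypothetical stand-in for whatever Story 6.1 defines, and it assumes a bucket policy that already allows public reads.

```python
from dataclasses import dataclass


# Hypothetical stand-in; the real UploadResult comes from Story 6.1.
@dataclass
class UploadResult:
    success: bool
    public_url: str


def virtual_hosted_url(bucket_name: str, region: str, file_path: str) -> str:
    """Default public URL for AWS S3 (virtual-hosted style)."""
    return f"https://{bucket_name}.s3.{region}.amazonaws.com/{file_path}"


class S3StorageClient:
    """Sketch of a boto3-backed client; assumes a public-read bucket policy."""

    def __init__(self, bucket_name: str, region: str):
        import boto3  # imported lazily so the URL helper works without boto3 installed

        self.bucket_name = bucket_name
        self.region = region
        # Reuse one session/client per instance (cheaper than creating one per call).
        self._s3 = boto3.session.Session(region_name=region).client("s3")

    def upload_file(self, file_path: str, content: str, content_type: str) -> UploadResult:
        # Content-type must be set explicitly, e.g. "text/html" for HTML files.
        self._s3.put_object(
            Bucket=self.bucket_name,
            Key=file_path,
            Body=content.encode("utf-8"),
            ContentType=content_type,
        )
        return UploadResult(True, virtual_hosted_url(self.bucket_name, self.region, file_path))
```

Retry logic with exponential backoff (mirroring `BunnyStorageClient`) would wrap the `put_object` call and is omitted here for brevity.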
Story 6.3: Database Schema Updates for Multi-Cloud
Estimated Effort: 3 story points
As a developer, I want to store provider-specific configuration in the database, so that each site can use its preferred storage provider.
Acceptance Criteria:
- Add `storage_provider` field to the `site_deployments` table:
  - Type: String(20), Not Null, Default: 'bunny'
  - Values: 'bunny', 's3', 's3_compatible'
  - Indexed for query performance
- Add S3-specific fields (nullable, only used when provider is 's3' or 's3_compatible'):
  - `s3_bucket_name`: String(255), Nullable
  - `s3_bucket_region`: String(50), Nullable
  - `s3_custom_domain`: String(255), Nullable
  - `s3_endpoint_url`: String(500), Nullable (for S3-compatible services)
- Create migration script to:
  - Add new fields with appropriate defaults
  - Set `storage_provider='bunny'` for all existing records
  - Preserve all existing Bunny.net fields
- Update the `SiteDeployment` model with new fields
- Update repository methods to handle new fields
- Backward compatibility: existing queries continue to work
Migration Strategy:
- Existing sites default to 'bunny' provider
- No data loss or breaking changes
- New fields are nullable to support gradual migration
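A purely additive migration along these lines would satisfy the strategy above. Alembic is an assumption here (the epic only says "database migration tools"); table and column names follow Story 6.3.

```python
# Sketch of the additive migration, assuming Alembic as the migration tool.
import sqlalchemy as sa
from alembic import op


def upgrade() -> None:
    # server_default='bunny' keeps existing rows valid under the NOT NULL constraint.
    op.add_column(
        "site_deployments",
        sa.Column("storage_provider", sa.String(20), nullable=False, server_default="bunny"),
    )
    # New S3 fields are nullable, so existing Bunny.net rows need no backfill.
    op.add_column("site_deployments", sa.Column("s3_bucket_name", sa.String(255), nullable=True))
    op.add_column("site_deployments", sa.Column("s3_bucket_region", sa.String(50), nullable=True))
    op.add_column("site_deployments", sa.Column("s3_custom_domain", sa.String(255), nullable=True))
    op.add_column("site_deployments", sa.Column("s3_endpoint_url", sa.String(500), nullable=True))
    op.create_index(
        "ix_site_deployments_storage_provider", "site_deployments", ["storage_provider"]
    )


def downgrade() -> None:
    op.drop_index("ix_site_deployments_storage_provider", table_name="site_deployments")
    for col in ("s3_endpoint_url", "s3_custom_domain", "s3_bucket_region", "s3_bucket_name", "storage_provider"):
        op.drop_column("site_deployments", col)
```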
Story 6.4: URL Generation for S3 Providers
Estimated Effort: 3 story points
As a user, I want public URLs for S3-deployed content to be generated correctly, so that articles are accessible via the expected URLs.
Acceptance Criteria:
- Update `generate_public_url()` in `url_generator.py` to handle S3 providers
- Support multiple URL formats:
  - Virtual-hosted style: `https://bucket.s3.region.amazonaws.com/file.html`
  - Path-style: `https://s3.region.amazonaws.com/bucket/file.html` (if needed)
  - Custom domain: `https://custom-domain.com/file.html`
- URL generation logic based on the `storage_provider` field
- Maintain existing behavior for Bunny.net (no changes)
- Handle S3-compatible services with custom endpoints
- Unit tests for all URL generation scenarios
Technical Notes:
- Virtual-hosted style is default for AWS S3
- Custom domain takes precedence if configured
- S3-compatible services may need path-style URLs depending on endpoint
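The precedence rules above (custom domain first, then the provider default) might look like the sketch below. The `Site` dataclass is a stand-in for the real `SiteDeployment` model; its field names follow Story 6.3.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-in for the real SiteDeployment model (field names per Story 6.3).
@dataclass
class Site:
    storage_provider: str = "bunny"
    s3_bucket_name: Optional[str] = None
    s3_bucket_region: Optional[str] = None
    s3_custom_domain: Optional[str] = None
    s3_endpoint_url: Optional[str] = None


def generate_s3_public_url(site: Site, file_path: str) -> str:
    # Custom domain takes precedence regardless of provider.
    if site.s3_custom_domain:
        return f"https://{site.s3_custom_domain}/{file_path}"
    # S3-compatible services: build a path-style URL from the stored endpoint.
    if site.storage_provider == "s3_compatible" and site.s3_endpoint_url:
        base = site.s3_endpoint_url.rstrip("/")
        return f"{base}/{site.s3_bucket_name}/{file_path}"
    # AWS S3 default: virtual-hosted style.
    return f"https://{site.s3_bucket_name}.s3.{site.s3_bucket_region}.amazonaws.com/{file_path}"
```

Whether a given S3-compatible service needs path-style or virtual-hosted URLs depends on its endpoint, so the `s3_compatible` branch here is one plausible default, not a universal rule.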
Story 6.5: S3-Compatible Services Support
Estimated Effort: 5 story points
As a user, I want to deploy to S3-compatible services (DigitalOcean Spaces, Backblaze B2, Linode Object Storage), so that I can use cost-effective alternatives to AWS.
Acceptance Criteria:
- Extend `S3StorageClient` to support S3-compatible endpoints
- Support provider-specific configurations:
  - DigitalOcean Spaces: Custom endpoint (e.g., `https://nyc3.digitaloceanspaces.com`)
  - Backblaze B2: Custom endpoint and authentication
  - Linode Object Storage: Custom endpoint
- Store `s3_endpoint_url` per site for custom endpoints
- Handle provider-specific authentication differences
- Support provider-specific URL generation
- Configuration examples in documentation
- Unit tests for each supported service
Supported Services (Initial):
- DigitalOcean Spaces
- Backblaze B2
- Linode Object Storage
- (Others can be added as needed)
Configuration:
- Per-service credentials in `.env` or per-site in database
- Endpoint URLs stored per-site in the `s3_endpoint_url` field
- Provider type stored in `storage_provider` ('s3_compatible')
Technical Notes:
- Most S3-compatible services work with boto3 using custom endpoints
- Some may require minor authentication adjustments
- URL generation may differ (e.g., DigitalOcean uses different domain structure)
Technical Considerations
Architecture Changes
- Interface/Protocol Design:

  ```python
  class StorageClient(Protocol):
      def upload_file(...) -> UploadResult: ...
      def file_exists(...) -> bool: ...
      def list_files(...) -> List[str]: ...
  ```

- Factory Pattern:

  ```python
  def create_storage_client(site: SiteDeployment) -> StorageClient:
      if site.storage_provider == 'bunny':
          return BunnyStorageClient()
      elif site.storage_provider in ('s3', 's3_compatible'):
          return S3StorageClient(site)
      else:
          raise ValueError(f"Unknown provider: {site.storage_provider}")
  ```

- Dependency Injection:
  - `DeploymentService` receives a `StorageClient` from the factory
  - No hardcoded provider dependencies
Credential Management
Option A: Environment Variables (Recommended for AWS)
- Global AWS credentials in `.env`
- Simple, secure, follows AWS best practices
- Works well for single-account deployments
Option B: Per-Site Credentials
- Store credentials in database (encrypted)
- Required for multi-account or S3-compatible services
- More complex but more flexible
Decision Needed: Which approach for initial implementation?
URL Generation Strategy
- Bunny.net: Uses CDN hostname (custom or bunny.net domain)
- AWS S3: Uses bucket name + region or custom domain
- S3-Compatible: Uses service-specific endpoint or custom domain
All providers should support custom domain mapping for consistent URLs.
Backward Compatibility
- All existing Bunny.net sites continue to work
- Default `storage_provider='bunny'` for existing records
- No breaking changes to existing APIs
- Migration is optional (sites can stay on Bunny.net)
Testing Strategy
- Unit tests with mocked boto3/requests
- Integration tests with test S3 buckets (optional)
- Backward compatibility tests for Bunny.net
- URL generation tests for all providers
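Unit tests can stub the boto3 client so no AWS calls are ever made. A minimal sketch of the pattern, where `upload` stands in for whichever `S3StorageClient` code path is under test:

```python
from unittest.mock import MagicMock


def upload(s3_client, bucket: str, key: str, body: str) -> None:
    # Stand-in for the S3StorageClient code path under test.
    s3_client.put_object(
        Bucket=bucket, Key=key, Body=body.encode(), ContentType="text/html"
    )


def test_upload_sets_content_type() -> None:
    fake_s3 = MagicMock()  # replaces the real boto3 client
    upload(fake_s3, "my-bucket", "a.html", "<h1>hi</h1>")
    # Verify the exact boto3 call without touching AWS.
    fake_s3.put_object.assert_called_once_with(
        Bucket="my-bucket",
        Key="a.html",
        Body=b"<h1>hi</h1>",
        ContentType="text/html",
    )


test_upload_sets_content_type()
```

The same pattern covers the Bunny.net backward-compatibility tests by mocking `requests` instead of boto3.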
Dependencies
- boto3 library for AWS S3 operations
- Existing deployment infrastructure (Epic 4)
- Database migration tools
Open Questions
- Credential Storage: Per-site in DB vs. global env vars? (Recommendation: Start with env vars, add per-site later if needed)
- S3-Compatible Priority: Which services to support first? (Recommendation: DigitalOcean Spaces, then Backblaze B2)
- Custom Domains: How are custom domains configured? Manual setup or automated? (Recommendation: Manual for now, document process)
- Bucket Provisioning: Should we automate S3 bucket creation, or require manual setup? (Recommendation: Manual for now, similar to current Bunny.net approach)
- Public Access: How to ensure buckets are publicly readable? (Recommendation: Document requirements, validate in tests)
- Migration Path: Should we provide tools to migrate existing Bunny.net sites to S3? (Recommendation: Defer to future story)
Success Metrics
- ✅ Deploy content to AWS S3 successfully
- ✅ Deploy content to at least one S3-compatible service
- ✅ All existing Bunny.net deployments continue working
- ✅ URL generation works correctly for all providers
- ✅ Zero breaking changes to existing functionality