# Epic 6: Multi-Cloud Storage Support

## Epic Goal

To extend the deployment system to support AWS S3 and S3-compatible cloud storage providers (DigitalOcean Spaces, Backblaze B2, Linode Object Storage, etc.), providing flexibility beyond Bunny.net while maintaining backward compatibility with existing deployments.

## Rationale

Currently, the system only supports Bunny.net storage, creating vendor lock-in and limiting deployment options. Many users have existing infrastructure on AWS S3 or prefer S3-compatible services for cost, performance, or compliance reasons. This epic will:

- **Increase Flexibility**: Support multiple cloud storage providers
- **Reduce Vendor Lock-in**: Enable migration between providers
- **Leverage Existing Infrastructure**: Use existing S3 buckets and credentials
- **Maintain Compatibility**: Existing Bunny.net deployments continue to work unchanged

## Status

- **Story 6.1**: 🔄 PLANNING (Storage Provider Abstraction)
- **Story 6.2**: 🔄 PLANNING (S3 Client Implementation)
- **Story 6.3**: 🔄 PLANNING (Database Schema Updates)
- **Story 6.4**: 🔄 PLANNING (URL Generation for S3)
- **Story 6.5**: 🔄 PLANNING (S3-Compatible Services Support)
- **Story 6.6**: 🔄 PLANNING (Bucket Provisioning Script)
## Stories

### Story 6.1: Storage Provider Abstraction Layer

**Estimated Effort**: 3 story points

**As a developer**, I want a simple way to support multiple storage providers without cluttering `DeploymentService` with if/elif chains, so that adding new providers (eventually 8+) is straightforward.

**Acceptance Criteria**:

* Create a simple factory function `create_storage_client(site: SiteDeployment)` that returns the appropriate client:
  - `'bunny'` → `BunnyStorageClient()`
  - `'s3'` → `S3StorageClient()`
  - `'s3_compatible'` → `S3StorageClient()` (with custom endpoint)
  - Future providers added here
* Refactor `BunnyStorageClient.upload_file()` to accept a `site: SiteDeployment` parameter:
  - Change from: `upload_file(zone_name, zone_password, zone_region, file_path, content)`
  - Change to: `upload_file(site: SiteDeployment, file_path: str, content: str)`
  - The client extracts Bunny-specific fields from `site` internally
* Update `DeploymentService` to use the factory and unified interface:
  - Remove the hardcoded `BunnyStorageClient` from `__init__`
  - In `deploy_article()` and `deploy_boilerplate_page()`: create a client per site
  - Call: `client.upload_file(site, file_path, content)` (same signature for all providers)
* Optional: Add a `StorageClient` Protocol for type hints (helps with 8+ providers)
* All existing Bunny.net deployments continue to work without changes
* Unit tests verify the factory returns the correct clients

**Technical Notes**:

* The factory function is a simple if/elif chain (one place to maintain)
* All clients use the same method signature: `upload_file(site, file_path, content)`
* Each client extracts provider-specific fields from the `site` object internally
* The Protocol is optional but recommended for type safety with many providers
* The factory pattern keeps `DeploymentService` clean (no provider-specific logic)
* Backward compatibility: the default provider is `'bunny'` if not specified
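The "factory returns correct clients" criterion can be pinned down with a small unit test. The stub classes below stand in for the real `SiteDeployment` model and storage clients; this is a sketch of the test, not the final implementation:

```python
# Sketch of the factory unit test. The stubs below stand in for the real
# SiteDeployment model and storage clients; the factory mirrors this story's design.
from dataclasses import dataclass


@dataclass
class SiteDeployment:
    storage_provider: str = "bunny"  # default preserves backward compatibility


class BunnyStorageClient: ...


class S3StorageClient: ...


def create_storage_client(site: SiteDeployment):
    """Return the storage client matching the site's provider."""
    if site.storage_provider == "bunny":
        return BunnyStorageClient()
    if site.storage_provider in ("s3", "s3_compatible"):
        return S3StorageClient()  # s3_compatible reuses the client via site.s3_endpoint_url
    raise ValueError(f"Unknown provider: {site.storage_provider}")


def test_factory_returns_correct_clients():
    assert isinstance(create_storage_client(SiteDeployment("bunny")), BunnyStorageClient)
    assert isinstance(create_storage_client(SiteDeployment("s3")), S3StorageClient)
    assert isinstance(create_storage_client(SiteDeployment("s3_compatible")), S3StorageClient)
    # Sites without an explicit provider fall back to Bunny.net
    assert isinstance(create_storage_client(SiteDeployment()), BunnyStorageClient)
```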
---
### Story 6.2: AWS S3 Client Implementation

**Estimated Effort**: 8 story points

**As a user**, I want to deploy content to AWS S3 buckets, so that I can use my existing AWS infrastructure.

**Acceptance Criteria**:

* Create `S3StorageClient` implementing the `StorageClient` interface
* Use the boto3 library for AWS S3 operations
* Support standard AWS S3 regions
* Authenticate via AWS credentials from environment variables
* Automatically configure the bucket for public READ access only (not write):
  - Apply a public-read ACL or bucket policy on first upload
  - Ensure the bucket allows public read access (disable Block Public Access settings)
  - Verify public read access is enabled before deployment
  - **Security**: Never enable public write access - only read permissions
* Upload files with correct content-type headers
* Generate public URLs from the bucket name and region
* Support custom domain mapping (if configured)
* Handle common S3 errors (403, 404, bucket not found, etc.)
* Retry logic with exponential backoff (consistent with `BunnyStorageClient`)
* Unit tests with mocked boto3 calls

**Configuration**:

* AWS credentials from environment variables (global):
  - `AWS_ACCESS_KEY_ID`
  - `AWS_SECRET_ACCESS_KEY`
  - `AWS_REGION` (default region, can be overridden per-site)
* Per-site configuration stored in the database:
  - `s3_bucket_name`: S3 bucket name
  - `s3_bucket_region`: AWS region (optional, uses the default if not set)
  - `s3_custom_domain`: Optional custom domain for URL generation (manual setup)

**URL Generation**:

* Default: `https://{bucket_name}.s3.{region}.amazonaws.com/{file_path}`
* With custom domain: `https://{custom_domain}/{file_path}`
* Support for path-style URLs if needed: `https://s3.{region}.amazonaws.com/{bucket_name}/{file_path}`

**Technical Notes**:

* boto3 session management (reuse sessions for performance)
* Content-type detection (`text/html` for HTML files)
* Automatic public read access configuration (read-only, never write):
  - Check and configure the bucket policy for public read access only
  - Disable Block Public Access settings for read access
  - Apply a public-read ACL to uploaded objects (not public-write)
  - Validate public read access before deployment
  - **Security**: Uploads require authenticated credentials; only reads are public
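The content-type requirement can lean entirely on the standard library. A minimal sketch; the `put_object` wiring shown in the closing comment is indicative of how `S3StorageClient` might pass the header, not the final implementation:

```python
import mimetypes


def detect_content_type(file_path: str) -> str:
    """Guess the Content-Type header for a file being uploaded.

    Falls back to application/octet-stream when the extension is unknown,
    the conventional default for object storage uploads.
    """
    content_type, _ = mimetypes.guess_type(file_path)
    return content_type or "application/octet-stream"


# Inside S3StorageClient.upload_file(), the header would be passed to boto3
# roughly like this (indicative wiring, not the final implementation):
#
#   s3.put_object(
#       Bucket=site.s3_bucket_name,
#       Key=file_path,
#       Body=content.encode("utf-8"),
#       ContentType=detect_content_type(file_path),
#       ACL="public-read",  # public READ only; uploads stay authenticated
#   )
```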
---
### Story 6.3: Database Schema Updates for Multi-Cloud

**Estimated Effort**: 3 story points

**As a developer**, I want to store provider-specific configuration in the database, so that each site can use its preferred storage provider.

**Acceptance Criteria**:

* Add a `storage_provider` field to the `site_deployments` table:
  - Type: String(20), Not Null, Default: `'bunny'`
  - Values: `'bunny'`, `'s3'`, `'s3_compatible'`
  - Indexed for query performance
* Add S3-specific fields (nullable, only used when the provider is `'s3'` or `'s3_compatible'`):
  - `s3_bucket_name`: String(255), Nullable
  - `s3_bucket_region`: String(50), Nullable
  - `s3_custom_domain`: String(255), Nullable
  - `s3_endpoint_url`: String(500), Nullable (for S3-compatible services)
* Create a migration script to:
  - Add the new fields with appropriate defaults
  - Set `storage_provider='bunny'` for all existing records
  - Preserve all existing Bunny.net fields
* Update the `SiteDeployment` model with the new fields
* Update repository methods to handle the new fields
* Backward compatibility: existing queries continue to work

**Migration Strategy**:

* Existing sites default to the `'bunny'` provider
* No data loss or breaking changes
* New fields are nullable to support gradual migration
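The migration steps above can be sketched as plain SQL. The statements below are illustrative (the real script would run through the project's migration tooling); SQLite is used here only to demonstrate that existing rows pick up `storage_provider='bunny'` via the column default:

```python
import sqlite3

# Illustrative migration SQL mirroring the acceptance criteria; the real
# script would run through the project's migration tooling.
MIGRATION = """
ALTER TABLE site_deployments ADD COLUMN storage_provider VARCHAR(20) NOT NULL DEFAULT 'bunny';
ALTER TABLE site_deployments ADD COLUMN s3_bucket_name VARCHAR(255);
ALTER TABLE site_deployments ADD COLUMN s3_bucket_region VARCHAR(50);
ALTER TABLE site_deployments ADD COLUMN s3_custom_domain VARCHAR(255);
ALTER TABLE site_deployments ADD COLUMN s3_endpoint_url VARCHAR(500);
CREATE INDEX ix_site_deployments_storage_provider ON site_deployments (storage_provider);
"""

conn = sqlite3.connect(":memory:")
# Stand-in for the pre-migration table (only a couple of columns shown)
conn.execute("CREATE TABLE site_deployments (id INTEGER PRIMARY KEY, storage_zone_name TEXT)")
conn.execute("INSERT INTO site_deployments (storage_zone_name) VALUES ('legacy-zone')")
conn.executescript(MIGRATION)

# Existing records are backfilled to 'bunny'; the new S3 fields stay NULL
row = conn.execute(
    "SELECT storage_provider, s3_bucket_name FROM site_deployments"
).fetchone()
```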
---
### Story 6.4: URL Generation for S3 Providers

**Estimated Effort**: 3 story points

**As a user**, I want public URLs for S3-deployed content to be generated correctly, so that articles are accessible via the expected URLs.

**Acceptance Criteria**:

* Update `generate_public_url()` in `url_generator.py` to handle S3 providers
* Support multiple URL formats:
  - Virtual-hosted style: `https://bucket.s3.region.amazonaws.com/file.html`
  - Path-style: `https://s3.region.amazonaws.com/bucket/file.html` (if needed)
  - Custom domain: `https://custom-domain.com/file.html`
* URL generation logic based on the `storage_provider` field
* Maintain existing behavior for Bunny.net (no changes)
* Handle S3-compatible services with custom endpoints
* Unit tests for all URL generation scenarios

**Technical Notes**:

* Virtual-hosted style is the default for AWS S3
* A custom domain takes precedence if configured
* S3-compatible services may need path-style URLs depending on the endpoint
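The rules above can be condensed into one function. This is a sketch against a minimal stub of `SiteDeployment`; the `cdn_hostname` field name for the existing Bunny.net path is an assumption, while the S3 field names follow Story 6.3's schema:

```python
# Sketch of the updated generate_public_url(); field names are from Story 6.3
# except cdn_hostname, which is an assumed name for the existing Bunny CDN field.
from dataclasses import dataclass
from typing import Optional


@dataclass
class SiteDeployment:
    storage_provider: str = "bunny"
    cdn_hostname: Optional[str] = None      # assumed Bunny.net field name
    s3_bucket_name: Optional[str] = None
    s3_bucket_region: Optional[str] = None
    s3_custom_domain: Optional[str] = None
    s3_endpoint_url: Optional[str] = None


def generate_public_url(site: SiteDeployment, file_path: str) -> str:
    """Build the public URL for a deployed file."""
    path = file_path.lstrip("/")
    if site.storage_provider == "bunny":
        return f"https://{site.cdn_hostname}/{path}"  # existing behavior, unchanged
    if site.s3_custom_domain:
        return f"https://{site.s3_custom_domain}/{path}"  # custom domain wins
    if site.storage_provider == "s3_compatible" and site.s3_endpoint_url:
        # Path-style against the service endpoint; some providers may prefer
        # virtual-hosted URLs instead
        return f"{site.s3_endpoint_url.rstrip('/')}/{site.s3_bucket_name}/{path}"
    # Virtual-hosted style, the AWS S3 default
    return f"https://{site.s3_bucket_name}.s3.{site.s3_bucket_region}.amazonaws.com/{path}"
```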
---
### Story 6.5: S3-Compatible Services Support

**Estimated Effort**: 5 story points

**As a user**, I want to deploy to S3-compatible services (Linode Object Storage, DreamHost Object Storage, DigitalOcean Spaces), so that I can use S3-compatible storage providers the same way I use Bunny.net.

**Acceptance Criteria**:

* Extend `S3StorageClient` to support S3-compatible endpoints
* Support provider-specific configurations:
  - **Linode Object Storage**: Custom endpoint
  - **DreamHost Object Storage**: Custom endpoint
  - **DigitalOcean Spaces**: Custom endpoint (e.g., `https://nyc3.digitaloceanspaces.com`)
* Store `s3_endpoint_url` per site for custom endpoints
* Handle provider-specific authentication differences
* Support provider-specific URL generation
* Configuration examples in the documentation
* Unit tests for each supported service

**Supported Services**:

* AWS S3 (standard)
* Linode Object Storage
* DreamHost Object Storage
* DigitalOcean Spaces
* Backblaze B2
* Cloudflare R2
* (Other S3-compatible services can be added as needed)

**Configuration**:

* Per-service credentials in `.env` (global environment variables):
  - `LINODE_ACCESS_KEY` / `LINODE_SECRET_KEY` (for Linode)
  - `DREAMHOST_ACCESS_KEY` / `DREAMHOST_SECRET_KEY` (for DreamHost)
  - `DO_SPACES_ACCESS_KEY` / `DO_SPACES_SECRET_KEY` (for DigitalOcean)
* Endpoint URLs stored per-site in the `s3_endpoint_url` field
* Provider type stored in `storage_provider` (`'s3_compatible'`)
* Automatic public access configuration (same as AWS S3)

**Technical Notes**:

* Most S3-compatible services work with boto3 using custom endpoints
* Some may require minor authentication adjustments
* URL generation may differ (e.g., DigitalOcean uses a different domain structure)
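The per-service credential lookup can be centralized in one mapping using the variable names listed above. The `resolve_credentials()` helper itself is an assumption (a sketch of where the lookup could live, not existing code), as is the boto3 wiring in the closing comment:

```python
import os

# Provider → env-var pair, using the names from the configuration section.
# resolve_credentials() is a hypothetical helper, not existing code.
CREDENTIAL_ENV_VARS = {
    "aws": ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"),
    "linode": ("LINODE_ACCESS_KEY", "LINODE_SECRET_KEY"),
    "dreamhost": ("DREAMHOST_ACCESS_KEY", "DREAMHOST_SECRET_KEY"),
    "digitalocean": ("DO_SPACES_ACCESS_KEY", "DO_SPACES_SECRET_KEY"),
}


def resolve_credentials(provider: str) -> tuple:
    """Return the (access_key, secret_key) pair for a provider from the environment."""
    if provider not in CREDENTIAL_ENV_VARS:
        raise ValueError(f"No credential mapping for provider: {provider}")
    access_var, secret_var = CREDENTIAL_ENV_VARS[provider]
    access_key = os.environ.get(access_var)
    secret_key = os.environ.get(secret_var)
    if not access_key or not secret_key:
        raise RuntimeError(f"Missing {access_var} / {secret_var} in environment")
    return access_key, secret_key


# The pair then feeds boto3 together with the site's endpoint, roughly:
#   boto3.client("s3", aws_access_key_id=access_key,
#                aws_secret_access_key=secret_key,
#                endpoint_url=site.s3_endpoint_url)
```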
---
### Story 6.6: S3 Bucket Provisioning Script

**Estimated Effort**: 3 story points

**As a user**, I want a script to automatically create and configure S3 buckets with proper public access settings, so that I can quickly set up new storage targets without manual AWS console work.

**Acceptance Criteria**:

* Create a CLI command: `provision-s3-bucket --name <bucket> --region <region> [--provider <s3|linode|dreamhost|do>]`
* Automatically create the bucket if it doesn't exist
* Configure the bucket for public read access only (not write):
  - Apply a bucket policy allowing public read (GET requests only)
  - Disable Block Public Access settings for read access
  - Set appropriate CORS headers if needed
  - **Security**: Never enable public write access - uploads require authentication
* Support multiple providers:
  - AWS S3 (standard regions)
  - Linode Object Storage
  - DreamHost Object Storage
  - DigitalOcean Spaces
* Validate the bucket configuration after creation
* Option to link the bucket to an existing site deployment
* Clear error messages for common issues (bucket name conflicts, permissions, etc.)
* Documentation with examples for each provider

**Usage Examples**:

```bash
# Create an AWS S3 bucket
provision-s3-bucket --name my-site-bucket --region us-east-1

# Create a Linode bucket
provision-s3-bucket --name my-site-bucket --region us-east-1 --provider linode

# Create a bucket and link it to a site
provision-s3-bucket --name my-site-bucket --region us-east-1 --site-id 5
```

**Technical Notes**:

* Uses boto3 for all providers (with custom endpoints for S3-compatible services)
* Bucket naming validation (AWS rules apply)
* Idempotent: safe to run multiple times
* Optional: can be integrated into the `provision-site` command later
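The public-read policy the script applies is the standard anonymous `s3:GetObject` grant. A sketch; the `put_bucket_policy` wiring in the closing comment is indicative of how the script might apply it:

```python
import json


def public_read_policy(bucket_name: str) -> str:
    """Bucket policy granting anonymous read (s3:GetObject) and nothing else.

    No write actions are included: uploads still require authenticated
    credentials, per the security requirement above.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
    return json.dumps(policy)


# The provisioning script would apply it roughly as (indicative wiring):
#   s3.put_bucket_policy(Bucket=bucket_name, Policy=public_read_policy(bucket_name))
```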
---
## Technical Considerations

### Architecture Changes

1. **Unified Method Signature**:

```python
# All storage clients use the same signature
class BunnyStorageClient:
    def upload_file(self, site: SiteDeployment, file_path: str, content: str) -> UploadResult:
        # Extract Bunny-specific fields from site
        zone_name = site.storage_zone_name
        zone_password = site.storage_zone_password
        # ... do upload


class S3StorageClient:
    def upload_file(self, site: SiteDeployment, file_path: str, content: str) -> UploadResult:
        # Extract S3-specific fields from site
        bucket_name = site.s3_bucket_name
        # ... do upload
```

2. **Simple Factory Function**:

```python
def create_storage_client(site: SiteDeployment):
    """Create the appropriate storage client based on the site's provider."""
    if site.storage_provider == 'bunny':
        return BunnyStorageClient()
    elif site.storage_provider == 's3':
        return S3StorageClient()
    elif site.storage_provider == 's3_compatible':
        return S3StorageClient()  # Same client, uses site.s3_endpoint_url
    # Future: elif site.storage_provider == 'cloudflare': ...
    else:
        raise ValueError(f"Unknown provider: {site.storage_provider}")
```

3. **Clean DeploymentService**:

```python
# In deploy_article():
client = create_storage_client(site)          # One line, works for all providers
client.upload_file(site, file_path, content)  # Same call for all
```

4. **Optional Protocol** (recommended for type safety with 8+ providers):

```python
from typing import Protocol


class StorageClient(Protocol):
    def upload_file(self, site: SiteDeployment, file_path: str, content: str) -> UploadResult: ...
```
### Credential Management

**Decision: Global Environment Variables**

- All credentials stored in the `.env` file (global)
- AWS: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`
- Linode: `LINODE_ACCESS_KEY`, `LINODE_SECRET_KEY`
- DreamHost: `DREAMHOST_ACCESS_KEY`, `DREAMHOST_SECRET_KEY`
- DigitalOcean: `DO_SPACES_ACCESS_KEY`, `DO_SPACES_SECRET_KEY`
- Simple, secure, follows cloud provider best practices
- Works well for single-account deployments
- Per-site credentials can be added later if needed for multi-account scenarios
### URL Generation Strategy

- **Bunny.net**: Uses the CDN hostname (custom or bunny.net domain)
- **AWS S3**: Uses the bucket name + region, or a custom domain (manual setup)
- **S3-Compatible**: Uses the service-specific endpoint or a custom domain (manual setup)

Custom domain mapping is supported but requires manual configuration (documented, not automated).
### Backward Compatibility

- All existing Bunny.net sites continue to work
- Default `storage_provider='bunny'` for existing records
- No breaking changes to existing APIs
- No migration tools provided (sites can stay on Bunny.net or be manually reconfigured)

### Testing Strategy

- Unit tests with mocked boto3/requests
- Integration tests with test S3 buckets (optional)
- Backward compatibility tests for Bunny.net
- URL generation tests for all providers
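The mocked-boto3 strategy needs nothing beyond the standard library. A minimal sketch, assuming the boto3 client is injected into `S3StorageClient` for testability (that injection point, and the stand-in client class, are assumptions):

```python
from unittest.mock import MagicMock


class S3StorageClient:
    """Minimal stand-in for the real client, with the boto3 client injected
    so tests never touch the network (the injection point is an assumption)."""

    def __init__(self, s3):
        self._s3 = s3

    def upload_file(self, site, file_path: str, content: str):
        self._s3.put_object(
            Bucket=site.s3_bucket_name,
            Key=file_path,
            Body=content.encode("utf-8"),
            ContentType="text/html",
            ACL="public-read",
        )


def test_upload_calls_put_object():
    fake_s3 = MagicMock()                       # stands in for boto3.client("s3")
    site = MagicMock(s3_bucket_name="my-bucket")
    S3StorageClient(fake_s3).upload_file(site, "index.html", "<h1>hi</h1>")
    kwargs = fake_s3.put_object.call_args.kwargs
    assert kwargs["Bucket"] == "my-bucket"
    assert kwargs["Key"] == "index.html"
    assert kwargs["ACL"] == "public-read"       # read-only public access
```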
## Dependencies

- **boto3** library for AWS S3 operations
- Existing deployment infrastructure (Epic 4)
- Database migration tools
## Decisions Made

1. **Credential Storage**: ✅ Global environment variables (Option A)
   - All credentials in the `.env` file
   - Simple, secure, follows cloud provider best practices

2. **S3-Compatible Services**: ✅ Support Linode, DreamHost, and DigitalOcean
   - All services supported equally - no priority/decision logic in this epic
   - Provider selection happens elsewhere in the codebase
   - This epic just enables S3-compatible services to work the same way as Bunny.net

3. **Custom Domains**: ✅ Manual setup (deferred automation)
   - Custom domains require manual configuration
   - Documented process, no automation in this epic

4. **Bucket Provisioning**: ✅ Manual with optional script (Story 6.6)
   - Primary: manual bucket creation
   - Optional: `provision-s3-bucket` CLI script for automated setup

5. **Public Access**: ✅ Automatic configuration (read-only)
   - The system automatically configures buckets for public READ access only
   - Applies bucket policies for read access, disables Block Public Access, sets public-read ACLs
   - **Security**: Never enables public write access - all uploads require authenticated credentials

6. **Migration Path**: ✅ No migration tools
   - No automated migration from Bunny.net to S3
   - Sites can be manually reconfigured if needed
## Success Metrics

- ✅ Deploy content to AWS S3 successfully
- ✅ Deploy content to S3-compatible services (Linode, DreamHost, DigitalOcean) successfully
- ✅ All existing Bunny.net deployments continue working
- ✅ URL generation works correctly for all providers
- ✅ Buckets automatically configured for public read access (not write)
- ✅ Zero breaking changes to existing functionality
- ✅ Bucket provisioning script works for all supported providers