Update outline generation prompt and remove test file
parent: 04f10d6d26
commit: 4710677614

@@ -0,0 +1,265 @@
# Epic 6: Multi-Cloud Storage Support

## Epic Goal

To extend the deployment system to support AWS S3 and S3-compatible cloud storage providers (DigitalOcean Spaces, Backblaze B2, Linode Object Storage, etc.), providing flexibility beyond Bunny.net while maintaining backward compatibility with existing deployments.

## Rationale

Currently, the system only supports Bunny.net storage, creating vendor lock-in and limiting deployment options. Many users have existing infrastructure on AWS S3 or prefer S3-compatible services for cost, performance, or compliance reasons. This epic will:

- **Increase Flexibility**: Support multiple cloud storage providers
- **Reduce Vendor Lock-in**: Enable migration between providers
- **Leverage Existing Infrastructure**: Use existing S3 buckets and credentials
- **Maintain Compatibility**: Existing Bunny.net deployments continue to work unchanged

## Status

- **Story 6.1**: 🔄 PLANNING (Storage Provider Abstraction)
- **Story 6.2**: 🔄 PLANNING (S3 Client Implementation)
- **Story 6.3**: 🔄 PLANNING (Database Schema Updates)
- **Story 6.4**: 🔄 PLANNING (URL Generation for S3)
- **Story 6.5**: 🔄 PLANNING (S3-Compatible Services Support)

## Stories

### Story 6.1: Storage Provider Abstraction Layer

**Estimated Effort**: 5 story points

**As a developer**, I want a unified storage interface that abstracts provider-specific details, so that the deployment service can work with any storage provider without code changes.

**Acceptance Criteria**:

* Create a `StorageClient` protocol/interface with common methods:
  - `upload_file(file_path: str, content: str, content_type: str) -> UploadResult`
  - `file_exists(file_path: str) -> bool`
  - `list_files(prefix: str = '') -> List[str]`
* Refactor `BunnyStorageClient` to implement the interface
* Create a `StorageClientFactory` that returns the appropriate client based on provider type
* Update `DeploymentService` to use the factory instead of hardcoding `BunnyStorageClient`
* All existing Bunny.net deployments continue to work without changes
* Unit tests verify interface compliance

**Technical Notes**:

* Use Python `Protocol` (typing) or ABC for interface definition
* Factory pattern: `create_storage_client(site: SiteDeployment) -> StorageClient`
* Maintain backward compatibility: default provider is "bunny" if not specified

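
A minimal sketch of the abstraction using `typing.Protocol`, with structural typing checked against a toy implementation. `UploadResult` and the in-memory client are hypothetical illustrations; the real field names and implementations are decided in this story.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class UploadResult:
    # Hypothetical result type; the real fields depend on the implementation.
    file_path: str
    public_url: str
    success: bool


class StorageClient(Protocol):
    """Common interface every storage provider must satisfy."""

    def upload_file(self, file_path: str, content: str, content_type: str) -> UploadResult: ...

    def file_exists(self, file_path: str) -> bool: ...

    def list_files(self, prefix: str = '') -> List[str]: ...


class InMemoryStorageClient:
    """Toy implementation used only to show structural typing against the protocol."""

    def __init__(self) -> None:
        self._files: dict = {}

    def upload_file(self, file_path: str, content: str, content_type: str) -> UploadResult:
        self._files[file_path] = (content, content_type)
        return UploadResult(file_path=file_path,
                            public_url=f"https://example.com/{file_path}",
                            success=True)

    def file_exists(self, file_path: str) -> bool:
        return file_path in self._files

    def list_files(self, prefix: str = '') -> List[str]:
        return [p for p in self._files if p.startswith(prefix)]


# Because Protocol uses structural typing, no explicit inheritance is needed.
client: StorageClient = InMemoryStorageClient()
result = client.upload_file("index.html", "<h1>hi</h1>", "text/html")
```

Interface-compliance unit tests can run this same check against `BunnyStorageClient` and `S3StorageClient` once they exist.
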
---

### Story 6.2: AWS S3 Client Implementation

**Estimated Effort**: 8 story points

**As a user**, I want to deploy content to AWS S3 buckets, so that I can use my existing AWS infrastructure.

**Acceptance Criteria**:

* Create `S3StorageClient` implementing `StorageClient` interface
* Use boto3 library for AWS S3 operations
* Support standard AWS S3 regions
* Authentication via AWS credentials (access key ID, secret access key)
* Handle bucket permissions (public read access required)
* Upload files with correct content-type headers
* Generate public URLs from bucket name and region
* Support custom domain mapping (if configured)
* Error handling for common S3 errors (403, 404, bucket not found, etc.)
* Retry logic with exponential backoff (consistent with `BunnyStorageClient`)
* Unit tests with mocked boto3 calls

**Configuration**:

* AWS credentials from environment variables:
  - `AWS_ACCESS_KEY_ID`
  - `AWS_SECRET_ACCESS_KEY`
  - `AWS_REGION` (default region, can be overridden per-site)
* Per-site configuration stored in database:
  - `bucket_name`: S3 bucket name
  - `bucket_region`: AWS region (optional, uses default if not set)
  - `custom_domain`: Optional custom domain for URL generation

**URL Generation**:

* Default: `https://{bucket_name}.s3.{region}.amazonaws.com/{file_path}`
* With custom domain: `https://{custom_domain}/{file_path}`
* Support for path-style URLs if needed: `https://s3.{region}.amazonaws.com/{bucket_name}/{file_path}`

**Technical Notes**:

* boto3 session management (reuse sessions for performance)
* Content-type detection (text/html for HTML files)
* Public read ACL or bucket policy required for public URLs

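
The content-type detection mentioned above can lean entirely on the standard library; a minimal sketch (the fallback value is an assumption, not a decided default):

```python
import mimetypes


def detect_content_type(file_path: str) -> str:
    """Guess a Content-Type header from the file extension.

    Falls back to a generic binary type when the extension is unknown.
    """
    guessed, _encoding = mimetypes.guess_type(file_path)
    return guessed or "application/octet-stream"
```

The returned value would be passed as the `ContentType` of the S3 upload (boto3's `put_object` accepts a `ContentType` argument), so HTML pages render in the browser instead of downloading.
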
---

### Story 6.3: Database Schema Updates for Multi-Cloud

**Estimated Effort**: 3 story points

**As a developer**, I want to store provider-specific configuration in the database, so that each site can use its preferred storage provider.

**Acceptance Criteria**:

* Add `storage_provider` field to `site_deployments` table:
  - Type: String(20), Not Null, Default: 'bunny'
  - Values: 'bunny', 's3', 's3_compatible'
  - Indexed for query performance
* Add S3-specific fields (nullable, only used when provider is 's3' or 's3_compatible'):
  - `s3_bucket_name`: String(255), Nullable
  - `s3_bucket_region`: String(50), Nullable
  - `s3_custom_domain`: String(255), Nullable
  - `s3_endpoint_url`: String(500), Nullable (for S3-compatible services)
* Create migration script to:
  - Add new fields with appropriate defaults
  - Set `storage_provider='bunny'` for all existing records
  - Preserve all existing Bunny.net fields
* Update `SiteDeployment` model with new fields
* Update repository methods to handle new fields
* Backward compatibility: existing queries continue to work

**Migration Strategy**:

* Existing sites default to 'bunny' provider
* No data loss or breaking changes
* New fields are nullable to support gradual migration

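
The migration shape can be sketched with plain DDL against an in-memory SQLite database; this is illustrative only (the real project may use Alembic or another migration tool, and the pre-existing columns here are invented stand-ins). The column names and types follow the acceptance criteria above.

```python
import sqlite3

MIGRATION = [
    # storage_provider gets a NOT NULL 'bunny' default so existing rows need no backfill.
    "ALTER TABLE site_deployments ADD COLUMN storage_provider VARCHAR(20) NOT NULL DEFAULT 'bunny'",
    "ALTER TABLE site_deployments ADD COLUMN s3_bucket_name VARCHAR(255)",
    "ALTER TABLE site_deployments ADD COLUMN s3_bucket_region VARCHAR(50)",
    "ALTER TABLE site_deployments ADD COLUMN s3_custom_domain VARCHAR(255)",
    "ALTER TABLE site_deployments ADD COLUMN s3_endpoint_url VARCHAR(500)",
    "CREATE INDEX idx_site_deployments_storage_provider ON site_deployments (storage_provider)",
]

conn = sqlite3.connect(":memory:")
# Minimal stand-in for the existing table; real columns will differ.
conn.execute("CREATE TABLE site_deployments (id INTEGER PRIMARY KEY, site_name TEXT)")
conn.execute("INSERT INTO site_deployments (site_name) VALUES ('existing-site')")
for statement in MIGRATION:
    conn.execute(statement)

# Existing rows pick up the 'bunny' default, so old deployments are untouched.
provider = conn.execute(
    "SELECT storage_provider FROM site_deployments WHERE site_name = 'existing-site'"
).fetchone()[0]
```
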
---

### Story 6.4: URL Generation for S3 Providers

**Estimated Effort**: 3 story points

**As a user**, I want public URLs for S3-deployed content to be generated correctly, so that articles are accessible via the expected URLs.

**Acceptance Criteria**:

* Update `generate_public_url()` in `url_generator.py` to handle S3 providers
* Support multiple URL formats:
  - Virtual-hosted style: `https://bucket.s3.region.amazonaws.com/file.html`
  - Path-style: `https://s3.region.amazonaws.com/bucket/file.html` (if needed)
  - Custom domain: `https://custom-domain.com/file.html`
* URL generation logic based on `storage_provider` field
* Maintain existing behavior for Bunny.net (no changes)
* Handle S3-compatible services with custom endpoints
* Unit tests for all URL generation scenarios
* S3-compatible services may need path-style URLs depending on endpoint

**Technical Notes**:

* Virtual-hosted style is default for AWS S3
* Custom domain takes precedence if configured
* S3-compatible services may need path-style URLs depending on endpoint

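
The precedence rules above can be sketched as a small pure function; the function name and parameters are illustrative, not the final `url_generator.py` API:

```python
from typing import Optional


def generate_s3_url(
    file_path: str,
    bucket_name: str,
    region: str = "us-east-1",
    custom_domain: Optional[str] = None,
    path_style: bool = False,
) -> str:
    """Build a public URL: custom domain first, then path-style or
    virtual-hosted style AWS URLs."""
    if custom_domain:
        return f"https://{custom_domain}/{file_path}"
    if path_style:
        return f"https://s3.{region}.amazonaws.com/{bucket_name}/{file_path}"
    return f"https://{bucket_name}.s3.{region}.amazonaws.com/{file_path}"
```

Keeping this a pure function of the site's stored configuration makes the "unit tests for all URL generation scenarios" criterion a set of simple input/output assertions.
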
---

### Story 6.5: S3-Compatible Services Support

**Estimated Effort**: 5 story points

**As a user**, I want to deploy to S3-compatible services (DigitalOcean Spaces, Backblaze B2, Linode Object Storage), so that I can use cost-effective alternatives to AWS.

**Acceptance Criteria**:

* Extend `S3StorageClient` to support S3-compatible endpoints
* Support provider-specific configurations:
  - **DigitalOcean Spaces**: Custom endpoint (e.g., `https://nyc3.digitaloceanspaces.com`)
  - **Backblaze B2**: Custom endpoint and authentication
  - **Linode Object Storage**: Custom endpoint
* Store `s3_endpoint_url` per site for custom endpoints
* Handle provider-specific authentication differences
* Support provider-specific URL generation
* Configuration examples in documentation
* Unit tests for each supported service

**Supported Services** (Initial):

* DigitalOcean Spaces
* Backblaze B2
* Linode Object Storage
* (Others can be added as needed)

**Configuration**:

* Per-service credentials in `.env` or per-site in database
* Endpoint URLs stored per-site in `s3_endpoint_url` field
* Provider type stored in `storage_provider` ('s3_compatible')

**Technical Notes**:

* Most S3-compatible services work with boto3 using custom endpoints
* Some may require minor authentication adjustments
* URL generation may differ (e.g., DigitalOcean uses different domain structure)

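
The "custom endpoint" mechanism boils down to which keyword arguments get passed to `boto3.client("s3", ...)` (`region_name` and `endpoint_url` are real boto3 parameters). A small builder keeps that logic testable without touching the network; the function itself is a sketch, not a decided API:

```python
from typing import Any, Dict, Optional


def s3_client_kwargs(
    storage_provider: str,
    region: Optional[str] = None,
    endpoint_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for boto3.client('s3', **kwargs).

    Plain AWS S3 needs only the region; S3-compatible services add their
    custom endpoint (e.g. a DigitalOcean Spaces regional endpoint).
    """
    kwargs: Dict[str, Any] = {}
    if region:
        kwargs["region_name"] = region
    if storage_provider == "s3_compatible":
        if not endpoint_url:
            raise ValueError("s3_compatible provider requires s3_endpoint_url")
        kwargs["endpoint_url"] = endpoint_url
    return kwargs
```

At deploy time the client would be created as `boto3.client("s3", **s3_client_kwargs(...))`, with the values read from the site's `storage_provider`, `s3_bucket_region`, and `s3_endpoint_url` fields.
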
---

## Technical Considerations

### Architecture Changes

1. **Interface/Protocol Design**:

   ```python
   class StorageClient(Protocol):
       def upload_file(self, file_path: str, content: str, content_type: str) -> UploadResult: ...
       def file_exists(self, file_path: str) -> bool: ...
       def list_files(self, prefix: str = '') -> List[str]: ...
   ```

2. **Factory Pattern**:

   ```python
   def create_storage_client(site: SiteDeployment) -> StorageClient:
       if site.storage_provider == 'bunny':
           return BunnyStorageClient()
       elif site.storage_provider in ('s3', 's3_compatible'):
           return S3StorageClient(site)
       else:
           raise ValueError(f"Unknown provider: {site.storage_provider}")
   ```

3. **Dependency Injection**:
   - `DeploymentService` receives `StorageClient` from factory
   - No hardcoded provider dependencies

### Credential Management

**Option A: Environment Variables (Recommended for AWS)**

- Global AWS credentials in `.env`
- Simple, secure, follows AWS best practices
- Works well for single-account deployments

**Option B: Per-Site Credentials**

- Store credentials in database (encrypted)
- Required for multi-account or S3-compatible services
- More complex but more flexible

**Decision Needed**: Which approach for initial implementation?

### URL Generation Strategy

- **Bunny.net**: Uses CDN hostname (custom or bunny.net domain)
- **AWS S3**: Uses bucket name + region or custom domain
- **S3-Compatible**: Uses service-specific endpoint or custom domain

All providers should support custom domain mapping for consistent URLs.

### Backward Compatibility

- All existing Bunny.net sites continue to work
- Default `storage_provider='bunny'` for existing records
- No breaking changes to existing APIs
- Migration is optional (sites can stay on Bunny.net)

### Testing Strategy

- Unit tests with mocked boto3/requests
- Integration tests with test S3 buckets (optional)
- Backward compatibility tests for Bunny.net
- URL generation tests for all providers

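
Because the deployment service only depends on the `StorageClient` interface, unit tests can swap in a `MagicMock` for any provider. The `deploy_article` function below is a hypothetical stand-in for `DeploymentService` logic, shown only to illustrate the mocking pattern:

```python
from unittest.mock import MagicMock

# Mock standing in for any StorageClient implementation.
storage = MagicMock()
storage.upload_file.return_value = MagicMock(
    success=True, public_url="https://example.com/a.html"
)


def deploy_article(client, file_path: str, html: str) -> str:
    """Simplified stand-in for the real deployment logic."""
    result = client.upload_file(file_path, html, "text/html")
    if not result.success:
        raise RuntimeError("upload failed")
    return result.public_url


url = deploy_article(storage, "a.html", "<h1>hi</h1>")

# The test asserts on the interface call, not on any provider internals.
storage.upload_file.assert_called_once_with("a.html", "<h1>hi</h1>", "text/html")
```
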
## Dependencies

- **boto3** library for AWS S3 operations
- Existing deployment infrastructure (Epic 4)
- Database migration tools

## Open Questions

1. **Credential Storage**: Per-site in DB vs. global env vars? (Recommendation: Start with env vars, add per-site later if needed)
2. **S3-Compatible Priority**: Which services to support first? (Recommendation: DigitalOcean Spaces, then Backblaze B2)
3. **Custom Domains**: How are custom domains configured? Manual setup or automated? (Recommendation: Manual for now, document process)
4. **Bucket Provisioning**: Should we automate S3 bucket creation, or require manual setup? (Recommendation: Manual for now, similar to current Bunny.net approach)
5. **Public Access**: How to ensure buckets are publicly readable? (Recommendation: Document requirements, validate in tests)
6. **Migration Path**: Should we provide tools to migrate existing Bunny.net sites to S3? (Recommendation: Defer to future story)

## Success Metrics

- ✅ Deploy content to AWS S3 successfully
- ✅ Deploy content to at least one S3-compatible service
- ✅ All existing Bunny.net deployments continue working
- ✅ URL generation works correctly for all providers
- ✅ Zero breaking changes to existing functionality

```diff
@@ -1,19 +0,0 @@
-{
-  "jobs": [
-    {
-      "project_id": 1,
-      "tiers": {
-        "tier1": {
-          "count": 1,
-          "min_word_count": 500,
-          "max_word_count": 800,
-          "min_h2_tags": 2,
-          "max_h2_tags": 3,
-          "min_h3_tags": 3,
-          "max_h3_tags": 6
-        }
-      }
-    }
-  ]
-}
```

```diff
@@ -1,5 +1,5 @@
 {
-  "system_message": "You are an expert content outliner who creates well-structured, comprehensive article outlines that cover topics thoroughly and logically.",
+  "system_message": "You are an expert content outliner who creates well-structured, comprehensive article outlines that cover topics thoroughly and logically. You are creative and thorough in your headings and subheadings.",
-  "user_prompt": "Create an article outline for:\nTitle: {title}\nKeyword: {keyword}\n\nConstraints:\n- Between {min_h2} and {max_h2} H2 headings\n- Between {min_h3} and {max_h3} H3 subheadings total (distributed across H2 sections)\n\nEntities to incorporate: {entities}\nRelated searches to address: {related_searches}\n\nReturn ONLY valid JSON in this exact format:\n{{\"outline\": [{{\"h2\": \"Heading text\", \"h3\": [\"Subheading 1\", \"Subheading 2\"]}}, ...]}}\n\nEnsure the outline meets the minimum heading requirements and includes relevant entities and related searches. You can be creative in your headings and subheadings - they just need to be related to the topic {keyword}."
+  "user_prompt": "Create an article outline for:\nTitle: {title}\nKeyword: {keyword}\n\nConstraints:\n- Between {min_h2} and {max_h2} H2 headings\n- Between {min_h3} and {max_h3} H3 subheadings total (distributed across H2 sections)\n\nEntities to incorporate: {entities}\nRelated searches to address: {related_searches}\n\nReturn ONLY valid JSON in this exact format:\n{{\"outline\": [{{\"h2\": \"Heading text\", \"h3\": [\"Subheading 1\", \"Subheading 2\"]}}, ...]}}\n\nEnsure the outline meets the minimum heading requirements and includes relevant entities and related searches. You can be creative in your headings and subheadings - they just need to be related to the topic {keyword}. Do not make the 'definition of {keyword}' a H2 or H3 heading."
 }
```