These tests import classes and functions that no longer exist:
- ContentGenerationService (renamed to ContentGenerator)
- ContentRuleEngine, ContentHTMLParser (rule_engine.py deprecated)
- ContentAugmenter (depends on dead rule_engine imports)
- get_domain_from_site (removed from site_page_generator)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Allows overriding the default random 10-12 tier1 article count
when creating a job file during CORA ingestion.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous commit referenced tier1_count inside
create_job_file_for_project without adding it as a parameter.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Allows overriding the default random 10-12 T1 article count.
Usage: ingest-cora ... --tier1-count 5
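
The override behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `resolve_tier1_count` is a hypothetical helper, and only the random 10-12 default and the `--tier1-count` flag come from the commit messages.

```python
import random
from typing import Optional


def resolve_tier1_count(override: Optional[int] = None) -> int:
    """Return the tier1 article count for a new job file.

    Hypothetical helper: the name and signature are assumptions.
    """
    if override is not None:
        if override < 1:
            raise ValueError("tier1 count must be a positive integer")
        # An explicit value (e.g. from --tier1-count) wins over the default.
        return override
    # No override given: fall back to the random 10-12 default.
    return random.randint(10, 12)
```

Treating `None` as "not specified" lets a single optional parameter be threaded from the CLI down to job file creation without changing behaviour for existing callers.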
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add --tier1-branded-ratio flag (default: 0.75) to ingest-cora command
- Prompt for branded anchor text when ratio is specified
- Generate explicit anchor_text_config in tier1 job with specified ratio
- Update documentation in CLI_COMMAND_REFERENCE.md, job-schema.md, and gui-planning.md
- Add auto-import-all flag to discover_s3_buckets.py for bulk import
- Add bucket exclusion list (s3_bucket_exclusions.txt) to prevent re-importing manually added FQDN sites
- Add helper scripts for S3 site management (list, check, delete)
- Update README.md with comprehensive S3 bucket management documentation
- Add colinkri_processor.py for batch processing
- Various deployment and storage improvements
- Add explicit anchor text mode support in AnchorTextConfig
- Support tier-specific anchor text terms at the job level (tier1, tier2, tier3, tier4_plus)
- Support tier-level explicit anchor text with 'terms' array
- Update content injection to prioritize explicit terms when mode is 'explicit'
- Add validation for explicit mode requiring term lists
- Update JOB_FIELD_REFERENCE.md with explicit mode documentation and examples
- Add comprehensive unit and integration tests for explicit anchor text
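
The explicit-mode validation and term prioritisation listed above might look something like the sketch below. The class here is an illustrative stand-in for `AnchorTextConfig`, not the real one: the `"default"` mode name, the `validate` method, and the `pick_anchor_terms` helper are assumptions. Only the `'explicit'` mode and the `terms` array come from the commit message.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AnchorTextConfig:
    # Illustrative stand-in; the real class likely has more fields.
    mode: str = "default"  # "default" is an assumed name for the non-explicit mode
    terms: List[str] = field(default_factory=list)

    def validate(self) -> None:
        # Explicit mode is only meaningful with at least one non-blank term.
        if self.mode == "explicit" and not any(t.strip() for t in self.terms):
            raise ValueError("explicit mode requires a non-empty 'terms' list")


def pick_anchor_terms(config: AnchorTextConfig, fallback_terms: List[str]) -> List[str]:
    """Prefer explicit terms when mode is 'explicit'; otherwise use the fallback."""
    config.validate()
    if config.mode == "explicit":
        return [t for t in config.terms if t.strip()]
    return list(fallback_terms)
```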
Includes multi-cloud storage migration script and related database changes.
- Add S3StorageClient class implementing StorageClient Protocol
- Support AWS S3 and S3-compatible services with custom endpoints
- Automatic bucket configuration for public read access only
- Content-type detection for uploaded files
- URL generation (default S3 URLs and custom domain support)
- Error handling for common S3 errors (403, 404, NoSuchBucket, etc.)
- Retry logic with exponential backoff (consistent with BunnyStorageClient)
- Update storage_factory to return S3StorageClient for 's3' and 's3_compatible'
- Add comprehensive unit tests with mocked boto3 calls (18 tests, all passing)
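
Two of the behaviours above, content-type detection and retry with exponential backoff, can be sketched with the standard library alone. This is not the `S3StorageClient` implementation: `guess_content_type` and `with_retries` are hypothetical helpers, `ConnectionError` stands in for whatever transient errors the real client retries on, and the attempt count and delays are assumed values.

```python
import mimetypes
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def guess_content_type(file_path: str) -> str:
    """Guess a MIME type from the file name, with a generic binary fallback."""
    content_type, _ = mimetypes.guess_type(file_path)
    return content_type or "application/octet-stream"


def with_retries(
    operation: Callable[[], T],
    max_attempts: int = 3,      # assumed value
    base_delay: float = 1.0,    # assumed value, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying transient failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:  # stand-in for transient storage errors
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The delay doubles on every retry: base, 2*base, 4*base, ...
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real delays, which fits the mocked-call style of unit testing mentioned above.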
Implements Story 6.2 from Epic 6: Multi-Cloud Storage Support
- Move site assignment to occur immediately after article creation
- Generate images after site assignment so they can be uploaded
- Add assign_site_to_single_article() helper function
- Fix issue where images were generated with site_deployment_id=None
- Add Pillow dependency for image processing
- Remove text instructions from fal.ai prompts (generate clean images)
- Add semi-transparent dark background box behind text for readability
- Overlay full title text with white text and black outline
- Add proper line spacing between text lines
- Fix FAL_KEY environment variable setup
- Add image URL logging to console output during batch processing
- Remove unused h2-prompts file
- Implement ImageGenerator class with hero and content image generation
- Add image theme prompt generation and caching
- Integrate with fal.ai flux-1/schnell API
- Add image upload to storage (Bunny CDN)
- Add image injection into HTML content
- Add test script for image generation
- Update database models and repositories for image fields
- Fix API usage: use arguments parameter and image_size object format
- Fix SessionLocal import error by using db_manager.get_session()
- Create thread-local ContentGenerator instances for each worker
- Ensure each thread uses its own database session
- Prevent database session conflicts in concurrent article generation
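
The thread-local pattern described above can be illustrated with a small self-contained sketch. The `ContentGenerator` and `FakeSession` classes here are stand-ins, and `get_generator`, `generate_article`, and `run_batch` are hypothetical names. In the real code the session would presumably come from `db_manager.get_session()` rather than a stub.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeSession:
    """Stand-in for a database session."""


class ContentGenerator:
    """Stand-in for the real ContentGenerator; owns exactly one session."""

    def __init__(self, session):
        self.session = session

    def generate(self, title):
        return f"article for {title}"


# Each worker thread sees its own attributes on this object.
_worker_state = threading.local()


def get_generator():
    """Return this thread's generator, creating it on first use."""
    generator = getattr(_worker_state, "generator", None)
    if generator is None:
        generator = ContentGenerator(FakeSession())
        _worker_state.generator = generator
    return generator


def generate_article(title):
    generator = get_generator()
    # Return thread and session identity so session sharing would be visible.
    return threading.get_ident(), id(generator.session), generator.generate(title)


def run_batch(titles, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(generate_article, titles))
```

Because the generator is created lazily inside the worker, no session object ever crosses a thread boundary, which is the property the fix relies on.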