Story 2.5: Deployment Target Assignment

Status

Draft

Story

As a developer, I want to assign deployment targets (site_deployment_id) to generated content during the content generation process based on job configuration, so that each article knows which site/bucket it will be deployed to.

Note: This story ONLY assigns site_deployment_id. Template selection logic is handled entirely by Story 2.4.

Acceptance Criteria

  • The job configuration file supports an optional deployment_targets array containing site custom_hostnames or site_deployment_ids.
  • The job configuration file supports an optional deployment_overflow strategy ("round_robin", "random_available", or "none").
  • During content generation, each article is assigned a site_deployment_id based on its index in the batch:
    • If deployment_targets is specified, assign targets in list order.
    • If the batch size exceeds the length of the target list, apply the overflow strategy (round-robin by default).
    • If deployment_targets is not specified, site_deployment_id remains null (Story 2.4 then selects a random template).
  • The site_deployment_id is stored in the GeneratedContent record at creation time.
  • Invalid site references in deployment_targets cause graceful errors with clear messages.

Tasks / Subtasks

1. Update Job Configuration Schema

Effort: 2 story points

  • Add deployment_targets field (optional array of strings) to job config schema
  • Add deployment_overflow field (optional string: "round_robin", "random_available", "none")
  • Default deployment_overflow to "round_robin" if not specified
  • Update job config validation to check deployment_targets format
  • Update example job files in jobs/ directory with new fields
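
A minimal sketch of the validation described in this task, assuming the job file has already been loaded into a dict. The function name parse_deployment_config and the error messages are illustrative and not part of the existing codebase.

```python
from typing import Any, Dict

ALLOWED_OVERFLOW = {"round_robin", "random_available", "none"}
DEFAULT_OVERFLOW = "round_robin"


def parse_deployment_config(job_config: Dict[str, Any]) -> Dict[str, Any]:
    """Validate the two optional deployment fields and apply the default.

    Hypothetical helper: only the format is checked here; whether each
    target actually exists is left to the target resolver.
    """
    targets = job_config.get("deployment_targets")
    if targets is not None:
        if not isinstance(targets, list):
            raise ValueError("deployment_targets must be an array of strings")
        bad = [t for t in targets if not isinstance(t, str) or not t.strip()]
        if bad:
            raise ValueError(f"deployment_targets has invalid entries: {bad!r}")

    overflow = job_config.get("deployment_overflow", DEFAULT_OVERFLOW)
    if not isinstance(overflow, str) or overflow not in ALLOWED_OVERFLOW:
        raise ValueError(
            f"deployment_overflow must be one of {sorted(ALLOWED_OVERFLOW)}, "
            f"got {overflow!r}"
        )
    return {"deployment_targets": targets, "deployment_overflow": overflow}
```

Returning a normalised dict keeps the default in one place, so later steps do not need to repeat the "round_robin" fallback.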

2. Implement Target Resolution Service

Effort: 3 story points

  • Create DeploymentTargetResolver class in src/deployment/ or appropriate module
  • Implement resolve_target(identifier: str) -> Optional[int] method
    • Accept custom_hostname or site_deployment_id (as string)
    • Query SiteDeployment table to get site_deployment_id
    • Return None if not found
  • Implement validate_targets(targets: List[str]) method
    • Pre-validate all targets in deployment_targets array
    • Return list of invalid targets if any
    • Fail fast with clear error message
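
The resolver could look roughly like the sketch below. It assumes an in-memory mapping of site_deployment_id to custom_hostname as a stand-in for the SiteDeployment query, and it assumes hostnames are matched case-insensitively; neither detail is specified by this story.

```python
from typing import Dict, List, Optional


class DeploymentTargetResolver:
    """Resolve a hostname or numeric-ID string to a site_deployment_id.

    `sites` (id -> custom_hostname) is a hypothetical stand-in for a
    query against the SiteDeployment table.
    """

    def __init__(self, sites: Dict[int, str]) -> None:
        self._ids = set(sites)
        self._id_by_hostname = {host.lower(): sid for sid, host in sites.items()}

    def resolve_target(self, identifier: str) -> Optional[int]:
        ident = identifier.strip()
        if ident.isdecimal():
            # Numeric strings are treated as site_deployment_ids.
            site_id = int(ident)
            return site_id if site_id in self._ids else None
        # Anything else is treated as a custom_hostname.
        return self._id_by_hostname.get(ident.lower())

    def validate_targets(self, targets: List[str]) -> List[str]:
        """Return the targets that could not be resolved (empty if all are valid)."""
        return [t for t in targets if self.resolve_target(t) is None]
```

A caller can fail fast at job start by raising as soon as validate_targets returns a non-empty list, quoting the unresolved entries in the error message.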

3. Implement Assignment Strategy Logic

Effort: 4 story points

  • Implement assign_site_for_article(article_index: int, job_config: dict, total_articles: int) -> Optional[int]
  • Round-robin strategy:
    • Cycle through deployment_targets using modulo operation
    • Example: 10 articles, 5 targets → target index is article_index % len(targets), so each target receives 2 articles
  • Random available strategy:
    • When article_index >= len(targets) (zero-based index), query for SiteDeployments not in targets list
    • Randomly select from available sites
    • Handle case where no other sites exist (error)
  • None strategy:
    • Raise error if article_index >= len(targets) (zero-based index)
    • Strict mode: deploy at most as many articles as there are targets
  • Handle case where deployment_targets is None/empty (return None for all)
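
The three strategies can be sketched as follows. To stay self-contained, this version takes already-resolved IDs instead of the raw job_config, so its signature differs from the one listed above; target_ids, other_site_ids, rng and DeploymentOverflowError are assumptions made for illustration.

```python
import random
from typing import Optional, Sequence


class DeploymentOverflowError(Exception):
    """Hypothetical error for a batch that outgrows what the strategy allows."""


def assign_site_for_article(
    article_index: int,
    target_ids: Sequence[int],
    overflow: str = "round_robin",
    other_site_ids: Sequence[int] = (),
    rng: Optional[random.Random] = None,
) -> Optional[int]:
    """Pick a site_deployment_id for the article at a zero-based index."""
    if not target_ids:
        return None  # no deployment_targets: leave site_deployment_id null
    if article_index < len(target_ids):
        return target_ids[article_index]  # within the list: assign in order

    # From here on the batch has overflowed the target list.
    if overflow == "round_robin":
        return target_ids[article_index % len(target_ids)]
    if overflow == "random_available":
        if not other_site_ids:
            raise DeploymentOverflowError("no sites available outside deployment_targets")
        return (rng or random).choice(other_site_ids)
    if overflow == "none":
        raise DeploymentOverflowError(
            f"article {article_index} exceeds {len(target_ids)} deployment targets"
        )
    raise ValueError(f"unknown deployment_overflow: {overflow!r}")
```

Passing a seeded random.Random as rng makes the random_available branch reproducible in tests.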

4. Database Integration

Effort: 2 story points

  • Verify site_deployment_id field exists in GeneratedContent model (added in Story 2.4)
  • Update GeneratedContentRepository.create() to accept site_deployment_id parameter
  • Ensure proper foreign key relationship to SiteDeployment table
  • Add database index on site_deployment_id for query performance
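
As an illustration of the nullable foreign key and the index, here is a sketch using the standard-library sqlite3 module. The project's real ORM and table names are not stated in this story, so site_deployments, generated_content, the title column and create_generated_content are all hypothetical.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript(
    """
    CREATE TABLE site_deployments (
        id INTEGER PRIMARY KEY,
        custom_hostname TEXT NOT NULL UNIQUE
    );
    CREATE TABLE generated_content (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        -- nullable: articles without a deployment target keep NULL
        site_deployment_id INTEGER REFERENCES site_deployments (id)
    );
    CREATE INDEX ix_generated_content_site_deployment_id
        ON generated_content (site_deployment_id);
    """
)


def create_generated_content(title: str, site_deployment_id: Optional[int] = None) -> int:
    """Insert one row and return its id; an unknown site id raises IntegrityError."""
    cur = conn.execute(
        "INSERT INTO generated_content (title, site_deployment_id) VALUES (?, ?)",
        (title, site_deployment_id),
    )
    return cur.lastrowid
```

With the constraint in place, a bad site_deployment_id is rejected by the database even if an invalid target slips past validation.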

5. Integration with Content Generation Service

Effort: 3 story points

  • Update src/generation/service.py to parse deployment config from job
  • Call target resolver to validate deployment_targets at job start
  • For each article in batch:
    • Call assignment strategy to get site_deployment_id
    • Pass site_deployment_id to repository when creating GeneratedContent
  • Log assignment decisions (INFO level: "Article X assigned to site Y")
  • Handle assignment errors gracefully without breaking batch
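
One possible shape for the per-article loop is sketched below. The assign and create_record callables stand in for the assignment strategy and the repository call, and the choice to skip a failed article and report its index is an assumption; the story only requires that the batch is not broken.

```python
import logging
from typing import Callable, List, Optional

logger = logging.getLogger("generation")


def assign_batch(
    total_articles: int,
    assign: Callable[[int], Optional[int]],
    create_record: Callable[[int, Optional[int]], None],
) -> List[int]:
    """Assign and persist every article; return the indexes that failed."""
    failed: List[int] = []
    for index in range(total_articles):
        try:
            site_id = assign(index)
        except Exception as exc:
            # Record the failure and keep going so one article cannot stop the batch.
            logger.error("Article %d could not be assigned: %s", index, exc)
            failed.append(index)
            continue
        create_record(index, site_id)
        logger.info("Article %d assigned to site %s", index, site_id)
    return failed
```

Returning the failed indexes lets the caller decide whether to retry, report, or abort after the batch has finished.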

6. Unit Tests

Effort: 3 story points

  • Test target resolution with valid hostnames
  • Test target resolution with valid site_deployment_ids
  • Test target resolution with invalid identifiers
  • Test round-robin strategy with various batch sizes
  • Test random_available strategy
  • Test none strategy with overflow scenarios
  • Test validation of deployment_targets array
  • Achieve >80% code coverage

7. Integration Tests

Effort: 2 story points

  • Test full generation flow with deployment_targets specified
  • Test round-robin assignment across 10 articles with 5 targets
  • Test with deployment_targets = null (all articles get null site_deployment_id)
  • Test error handling for invalid deployment targets
  • Verify site_deployment_id persisted correctly in database

Dev Notes

Example Job Config

{
  "job_name": "Multi-Site T1 Launch",
  "project_id": 2,
  "deployment_targets": [
    "www.domain1.com",
    "www.domain2.com",
    "www.domain3.com"
  ],
  "deployment_overflow": "round_robin",
  "tiers": [
    {
      "tier": 1,
      "article_count": 10
    }
  ]
}

Assignment Example (Round-Robin)

10 articles, 3 targets:

  • Article 0 → www.domain1.com
  • Article 1 → www.domain2.com
  • Article 2 → www.domain3.com
  • Article 3 → www.domain1.com
  • Article 4 → www.domain2.com
  • ... and so on, through Article 9 → www.domain1.com
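
The same distribution can be reproduced in a few lines, using the hostnames from the example job config. Only the modulo step comes from this story; the variable names are illustrative.

```python
from collections import Counter

targets = ["www.domain1.com", "www.domain2.com", "www.domain3.com"]

# Article i goes to the target at position i modulo the number of targets.
assignments = [targets[i % len(targets)] for i in range(10)]
per_site = Counter(assignments)

for i, host in enumerate(assignments):
    print(f"Article {i} -> {host}")
```

Because 10 is not a multiple of 3, the split is uneven: the first target receives 4 articles and the other two receive 3 each.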

Assignment Example (Random Available)

10 articles, 3 targets, 5 total sites in database:

  • Articles 0-2 → Assigned in order to the specified targets
  • Articles 3+ → Random selection from the remaining sites (domain4.com, domain5.com)

Technical Decisions

  1. Target identifier: Support both hostname and numeric ID for flexibility
  2. Validation timing: Validate all targets at job start (fail fast)
  3. Overflow default: Round-robin is the safest default because it never fails on oversized batches and never assigns a site outside deployment_targets
  4. Null handling: No deployment_targets = all articles get null site_deployment_id

Dependencies

  • Story 1.6: SiteDeployment table must exist
  • Story 2.3: Content generation service must be functional
  • Story 2.4: Consumes site_deployment_id for template selection (but that's 2.4's concern, not this story's)

Database Changes Required

No new columns - site_deployment_id field added in Story 2.4 task #5. The index on site_deployment_id from task 4 is the only schema change in this story.

Testing Strategy

  • Unit tests: Test assignment algorithms in isolation
  • Integration tests: Test full job execution with various configs
  • Edge cases: Empty targets, oversized batches, invalid hostnames