Job Configuration Schema
This document defines the complete schema for job configuration files used in the Big-Link-Man content automation platform. All job files are in JSON format and define batch content generation parameters.
Root Structure
{
  "jobs": [
    {
      // Job object (see Job Object section below)
    }
  ]
}
Root Fields
| Field | Type | Required | Description |
|---|---|---|---|
| jobs | Array<Job> | Yes | Array of job definitions to process |
Job Object
Each job object defines a complete content generation batch for a specific project.
Required Fields
| Field | Type | Description |
|---|---|---|
| project_id | integer | The project ID to generate content for |
| tiers | Object | Dictionary of tier configurations (see Tier Configuration section) |
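As an illustration of the root structure and the two required fields, here is a minimal sketch of a loader. The function name `load_jobs` is hypothetical and is not taken from the actual parser. The two "Job missing" messages mirror the ones listed under Common Validation Errors, while the root-level message is an assumption.

```python
import json


def load_jobs(text: str) -> list[dict]:
    """Parse job-file JSON text and check the two required job fields.

    Hypothetical helper: the real parser may be structured differently.
    """
    data = json.loads(text)
    jobs = data.get("jobs") if isinstance(data, dict) else None
    if not isinstance(jobs, list):
        # Assumed message; the source does not list one for this case.
        raise ValueError("Root object must contain a 'jobs' array")
    for job in jobs:
        if "project_id" not in job:
            raise ValueError("Job missing 'project_id'")
        if "tiers" not in job:
            raise ValueError("Job missing 'tiers'")
    return jobs


example = '{"jobs": [{"project_id": 1, "tiers": {"tier1": {"count": 5}}}]}'
jobs = load_jobs(example)
```

A job that omits either required field is rejected before any optional field is looked at.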
Optional Fields
| Field | Type | Default | Description |
|---|---|---|---|
| models | Object | Uses CLI default | AI models to use for each generation stage (Story 2.3 - planned) |
| deployment_targets | Array<string> | null | Array of site custom_hostnames for tier1 deployment assignment (Story 2.5) |
| tier1_preferred_sites | Array<string> | null | Array of hostnames for tier1 site assignment priority (Story 3.1) |
| auto_create_sites | boolean | false | Whether to auto-create sites when pool is insufficient (Story 3.1) |
| create_sites_for_keywords | Array<Object> | null | Array of keyword site creation configs (Story 3.1) |
| tiered_link_count_range | Object | null | Configuration for tiered link counts (Story 3.2) |
Tier Configuration
Each tier in the tiers object defines content generation parameters for that specific tier level.
Tier Keys
- tier1: Premium content (highest quality)
- tier2: Standard content (medium quality)
- tier3: Supporting content (basic quality)
Tier Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| count | integer | Yes | - | Number of articles to generate for this tier |
| min_word_count | integer | No | See defaults | Minimum word count for articles |
| max_word_count | integer | No | See defaults | Maximum word count for articles |
| min_h2_tags | integer | No | See defaults | Minimum number of H2 headings |
| max_h2_tags | integer | No | See defaults | Maximum number of H2 headings |
| min_h3_tags | integer | No | See defaults | Minimum number of H3 subheadings |
| max_h3_tags | integer | No | See defaults | Maximum number of H3 subheadings |
Tier Defaults
Tier 1 Defaults
{
  "min_word_count": 2000,
  "max_word_count": 2500,
  "min_h2_tags": 3,
  "max_h2_tags": 5,
  "min_h3_tags": 5,
  "max_h3_tags": 10
}
Tier 2 Defaults
{
  "min_word_count": 1500,
  "max_word_count": 2000,
  "min_h2_tags": 2,
  "max_h2_tags": 4,
  "min_h3_tags": 3,
  "max_h3_tags": 8
}
Tier 3 Defaults
{
  "min_word_count": 1000,
  "max_word_count": 1500,
  "min_h2_tags": 2,
  "max_h2_tags": 3,
  "min_h3_tags": 2,
  "max_h3_tags": 6
}
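The sketch below shows one way these defaults could be combined with a tier's explicit settings. It assumes a per-field override, meaning any field present in the job file wins and any missing field falls back to the tier default. The names `TIER_DEFAULTS` and `resolve_tier` are hypothetical, and the actual parser may apply defaults differently.

```python
# Default values copied from the three tables above.
TIER_DEFAULTS = {
    "tier1": {"min_word_count": 2000, "max_word_count": 2500,
              "min_h2_tags": 3, "max_h2_tags": 5,
              "min_h3_tags": 5, "max_h3_tags": 10},
    "tier2": {"min_word_count": 1500, "max_word_count": 2000,
              "min_h2_tags": 2, "max_h2_tags": 4,
              "min_h3_tags": 3, "max_h3_tags": 8},
    "tier3": {"min_word_count": 1000, "max_word_count": 1500,
              "min_h2_tags": 2, "max_h2_tags": 3,
              "min_h3_tags": 2, "max_h3_tags": 6},
}


def resolve_tier(tier_name: str, tier_config: dict) -> dict:
    """Return the tier config with missing optional fields filled in."""
    if tier_name not in TIER_DEFAULTS:
        raise ValueError(f"Unknown tier key: {tier_name}")
    # Later keys win, so explicit job-file values override the defaults.
    return {**TIER_DEFAULTS[tier_name], **tier_config}


tier3 = resolve_tier("tier3", {"count": 100})
tier2 = resolve_tier("tier2", {"count": 50, "min_word_count": 1800})
```

Under this assumption a tier that specifies only count, such as `{"count": 100}`, still resolves to a full set of word-count and heading limits.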
Deployment Target Assignment (Story 2.5)
deployment_targets
- Type: Array<string> (optional)
- Purpose: Assigns tier1 articles to specific sites in round-robin fashion
- Behavior:
  - Only affects tier1 articles
  - Articles 0 through N-1 get assigned to N deployment targets
  - Articles N and beyond get site_deployment_id = null
  - If not specified, all articles get site_deployment_id = null
Example
{
  "deployment_targets": [
    "www.domain1.com",
    "www.domain2.com",
    "www.domain3.com"
  ]
}
Assignment Result:
- Article 0 → www.domain1.com
- Article 1 → www.domain2.com
- Article 2 → www.domain3.com
- Articles 3+ → null (no assignment)
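The assignment rule listed above can be sketched as follows. Note that, as described, the list is consumed once and does not wrap around. The function name `assign_deployment_targets` is hypothetical, and for simplicity the sketch returns hostnames where the real system would presumably store a site_deployment_id.

```python
from typing import Optional


def assign_deployment_targets(
    article_count: int, targets: Optional[list[str]]
) -> list[Optional[str]]:
    """Map tier1 article index -> hostname, or None when targets run out."""
    targets = targets or []  # a missing field means no assignment at all
    return [
        targets[index] if index < len(targets) else None
        for index in range(article_count)
    ]


assigned = assign_deployment_targets(
    5, ["www.domain1.com", "www.domain2.com", "www.domain3.com"]
)
unassigned = assign_deployment_targets(2, None)
```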
Site Assignment (Story 3.1)
tier1_preferred_sites
- Type: Array<string> (optional)
- Purpose: Preferred sites for tier1 article assignment
- Behavior: Used in priority order before random selection
- Validation: All hostnames must exist in database

auto_create_sites
- Type: boolean (optional, default: false)
- Purpose: Auto-create sites when available pool is insufficient
- Behavior: Creates generic sites using project keyword as prefix

create_sites_for_keywords
- Type: Array<Object> (optional)
- Purpose: Pre-create sites for specific keywords before assignment
- Structure: Each object must have keyword (string) and count (integer)
Keyword Site Creation Object
| Field | Type | Required | Description |
|---|---|---|---|
| keyword | string | Yes | Keyword to create sites for |
| count | integer | Yes | Number of sites to create for this keyword |
Example
{
  "tier1_preferred_sites": [
    "www.premium-site1.com",
    "site123.b-cdn.net"
  ],
  "auto_create_sites": true,
  "create_sites_for_keywords": [
    {
      "keyword": "engine repair",
      "count": 3
    },
    {
      "keyword": "car maintenance",
      "count": 2
    }
  ]
}
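To illustrate "priority order before random selection", the sketch below orders candidate sites so that preferred hostnames come first and the rest of the pool follows in random order. Everything here is an assumption made for illustration: the function name `order_candidate_sites`, the use of a plain list (`known_hosts`) in place of the database, and the wording of the error message, which is modeled on the deployment target error shown later in this document.

```python
import random
from typing import Optional


def order_candidate_sites(
    preferred: Optional[list[str]],
    known_hosts: list[str],
    rng: Optional[random.Random] = None,
) -> list[str]:
    """Return hostnames with preferred sites first, then the rest shuffled."""
    preferred = list(preferred or [])
    # Stand-in for the "must exist in database" validation rule.
    missing = [host for host in preferred if host not in known_hosts]
    if missing:
        raise ValueError(
            "Preferred sites not found in database: " + ", ".join(missing)
        )
    rest = [host for host in known_hosts if host not in preferred]
    (rng or random.Random()).shuffle(rest)  # random selection for the remainder
    return preferred + rest


hosts = ["a.example", "b.example", "c.example", "d.example"]
ordered = order_candidate_sites(["c.example", "a.example"], hosts, random.Random(0))
```

Passing a seeded `random.Random` keeps the ordering reproducible, which can be handy when testing.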
AI Model Configuration (Story 2.3 - Not Yet Implemented)
models
- Type: Object (optional)
- Purpose: Specifies AI models to use for each generation stage
- Behavior: Allows different models for title, outline, and content generation
- Note: Currently not parsed by job config - uses CLI --model flag instead
Models Object Fields
| Field | Type | Description |
|---|---|---|
| title | string | Model to use for title generation |
| outline | string | Model to use for outline generation |
| content | string | Model to use for content generation |
Available Models (from master.config.json)
- anthropic/claude-sonnet-4.5 (Claude Sonnet 4.5)
- anthropic/claude-3.5-sonnet (Claude 3.5 Sonnet)
- openai/gpt-4o (GPT-4 Optimized)
- openai/gpt-4o-mini (GPT-4 Mini)
- meta-llama/llama-3.1-70b-instruct (Llama 3.1 70B)
- meta-llama/llama-3.1-8b-instruct (Llama 3.1 8B)
- google/gemini-2.5-flash (Gemini 2.5 Flash)
Example
{
  "models": {
    "title": "openai/gpt-4o-mini",
    "outline": "openai/gpt-4o",
    "content": "anthropic/claude-3.5-sonnet"
  }
}
Implementation Status
This field is defined in the JSON schema but not yet implemented in the job config parser (src/generation/job_config.py). Currently, all stages use the same model, specified via the CLI --model flag.
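Since per-stage selection is only planned, the following is purely a sketch of how it might eventually be resolved, not a description of existing behavior. It assumes that a stage missing from models would fall back to the CLI model. The function name `resolve_models` is hypothetical, and the fallback default of gpt-4o-mini is taken from the --model option described under Command Options.

```python
STAGES = ("title", "outline", "content")


def resolve_models(job: dict, cli_model: str = "gpt-4o-mini") -> dict:
    """Pick a model per stage, falling back to the CLI model (assumed rule)."""
    models = job.get("models") or {}
    return {stage: models.get(stage, cli_model) for stage in STAGES}


today = resolve_models({"project_id": 1})
planned = resolve_models(
    {"project_id": 1, "models": {"content": "anthropic/claude-3.5-sonnet"}}
)
```

With no models field the result matches the current behavior: every stage uses the single CLI model.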
Tiered Link Configuration (Story 3.2)
tiered_link_count_range
- Type: Object (optional)
- Purpose: Configures how many tiered links to generate per article
- Default: {"min": 2, "max": 4} if not specified
Tiered Link Range Object
| Field | Type | Required | Description |
|---|---|---|---|
| min | integer | Yes | Minimum number of tiered links (must be >= 1) |
| max | integer | Yes | Maximum number of tiered links (must be >= min) |
Example
{
  "tiered_link_count_range": {
    "min": 3,
    "max": 5
  }
}
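A minimal sketch of how such a range might be validated and used, assuming the per-article link count is drawn uniformly from the inclusive range (the source does not say how the count is chosen). The name `pick_link_count` is hypothetical. The first error message matches the one listed under Common Validation Errors, while the second is an assumed counterpart.

```python
import random
from typing import Optional

DEFAULT_LINK_RANGE = {"min": 2, "max": 4}  # documented default


def pick_link_count(
    link_range: Optional[dict] = None, rng: Optional[random.Random] = None
) -> int:
    """Validate the range, then pick a link count within it (inclusive)."""
    link_range = link_range or DEFAULT_LINK_RANGE
    low, high = link_range["min"], link_range["max"]
    if low < 1:
        raise ValueError("'tiered_link_count_range' min must be >= 1")
    if high < low:
        # Assumed wording; only the min message appears in the source.
        raise ValueError("'tiered_link_count_range' max must be >= min")
    return (rng or random.Random()).randint(low, high)


counts = [pick_link_count({"min": 3, "max": 5}) for _ in range(50)]
default_counts = [pick_link_count() for _ in range(50)]
```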
Complete Example
{
  "jobs": [
    {
      "project_id": 1,
      "models": {
        "title": "anthropic/claude-3.5-sonnet",
        "outline": "anthropic/claude-3.5-sonnet",
        "content": "openai/gpt-4o"
      },
      "deployment_targets": [
        "www.primary-domain.com",
        "www.secondary-domain.com"
      ],
      "tier1_preferred_sites": [
        "www.premium-site1.com",
        "site123.b-cdn.net"
      ],
      "auto_create_sites": true,
      "create_sites_for_keywords": [
        {
          "keyword": "engine repair",
          "count": 3
        },
        {
          "keyword": "car maintenance",
          "count": 2
        }
      ],
      "tiered_link_count_range": {
        "min": 3,
        "max": 5
      },
      "tiers": {
        "tier1": {
          "count": 10,
          "min_word_count": 2000,
          "max_word_count": 2500,
          "min_h2_tags": 3,
          "max_h2_tags": 5,
          "min_h3_tags": 5,
          "max_h3_tags": 10
        },
        "tier2": {
          "count": 50,
          "min_word_count": 1500,
          "max_word_count": 2000
        },
        "tier3": {
          "count": 100
        }
      }
    }
  ]
}
Validation Rules
Job Level Validation
- project_id must be a positive integer
- tiers must be an object with at least one tier
- models must be an object with title, outline, and content fields (if specified) - NOT YET VALIDATED
- deployment_targets must be an array of strings (if specified)
- tier1_preferred_sites must be an array of strings (if specified)
- auto_create_sites must be a boolean (if specified)
- create_sites_for_keywords must be an array of objects with keyword and count fields (if specified)
- tiered_link_count_range must have min >= 1 and max >= min (if specified)
Tier Level Validation
- count must be a positive integer
- min_word_count must be <= max_word_count
- min_h2_tags must be <= max_h2_tags
- min_h3_tags must be <= max_h3_tags
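The tier-level rules can be expressed as a small checker. This is a sketch under assumptions: the function name `validate_tier` and the error wording are hypothetical, and the min/max comparison is applied only when both fields are present in the tier object (the real parser may compare after defaults are filled in).

```python
RANGE_PAIRS = [
    ("min_word_count", "max_word_count"),
    ("min_h2_tags", "max_h2_tags"),
    ("min_h3_tags", "max_h3_tags"),
]


def validate_tier(tier_name: str, tier: dict) -> list[str]:
    """Return a list of validation error messages (empty when valid)."""
    errors = []
    count = tier.get("count")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(count, int) or isinstance(count, bool) or count < 1:
        errors.append(f"{tier_name}: 'count' must be a positive integer")
    for low_key, high_key in RANGE_PAIRS:
        if low_key in tier and high_key in tier and tier[low_key] > tier[high_key]:
            errors.append(f"{tier_name}: '{low_key}' must be <= '{high_key}'")
    return errors


ok = validate_tier("tier1", {"count": 10, "min_word_count": 2000, "max_word_count": 2500})
bad = validate_tier("tier2", {"count": 0, "min_h2_tags": 5, "max_h2_tags": 2})
```

Collecting all messages into a list, instead of raising on the first failure, lets a caller report every problem in a tier at once.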
Site Assignment Validation
- All hostnames in deployment_targets must exist in database
- All hostnames in tier1_preferred_sites must exist in database
- Keywords in create_sites_for_keywords must be non-empty strings
- Count values in create_sites_for_keywords must be positive integers
Usage
CLI Command
uv run python main.py generate-batch --job-file jobs/example.json --username admin --password secret
Command Options
- --job-file, -j: Path to job JSON file (required)
- --username, -u: Username for authentication
- --password, -p: Password for authentication
- --debug: Save AI responses to debug_output/
- --continue-on-error: Continue processing if article generation fails
- --model, -m: AI model to use (default: gpt-4o-mini)
Implementation History
Story 2.2: Basic Content Generation
- Added project_id and tiers fields
- Added tier configuration with word count and heading constraints
- Added tier defaults for common configurations
Story 2.3: AI Content Generation (Partial)
- Implemented: Database fields for tracking models (title_model, outline_model, content_model)
- Not Implemented: Job config models field - currently uses CLI --model flag
- Planned: Per-stage model selection from job configuration
Story 2.5: Deployment Target Assignment
- Added deployment_targets field for tier1 site assignment
- Implemented round-robin assignment logic
- Added validation for deployment target hostnames
Story 3.1: URL Generation and Site Assignment
- Added tier1_preferred_sites for priority-based assignment
- Added auto_create_sites for on-demand site creation
- Added create_sites_for_keywords for pre-creation of keyword sites
- Extended site assignment beyond deployment targets
Story 3.2: Tiered Link Finding
- Added tiered_link_count_range for configurable link counts
- Integrated with tiered link generation system
- Added validation for link count ranges
Future Extensions
The schema is designed to be extensible for future features:
- Story 3.3: Content interlinking injection
- Story 4.x: Cloud deployment and handoff
- Future: Advanced site matching, cost tracking, analytics
Error Handling
Common Validation Errors
- "Job missing 'project_id'" - Required field missing
- "Job missing 'tiers'" - Required field missing
- "'deployment_targets' must be an array" - Wrong data type
- "Deployment targets not found in database: invalid.com" - Invalid hostname
- "'tiered_link_count_range' min must be >= 1" - Invalid range value
Graceful Degradation
- Missing optional fields use sensible defaults
- Invalid hostnames cause clear error messages
- Insufficient sites trigger auto-creation (if enabled) or clear errors
- Failed articles are logged but don't stop batch processing (with --continue-on-error)