Job Configuration Schema
This document defines the complete schema for job configuration files used in the Big-Link-Man content automation platform. All job files are JSON format and define batch content generation parameters.
Root Structure
{
  "jobs": [
    {
      // Job object (see Job Object section below)
    }
  ]
}
Root Fields
| Field | Type | Required | Description |
|---|---|---|---|
| jobs | Array<Job> | Yes | Array of job definitions to process |
Job Object
Each job object defines a complete content generation batch for a specific project.
Required Fields
| Field | Type | Description |
|---|---|---|
| project_id | integer | The project ID to generate content for |
| tiers | Object | Dictionary of tier configurations (see Tier Configuration section) |
Optional Fields
| Field | Type | Default | Description |
|---|---|---|---|
| models | Object | Uses CLI default | AI models to use for each generation stage (title, outline, content) |
| deployment_targets | Array<string> | null | Array of site custom_hostnames for tier1 deployment assignment (Story 2.5) |
| tier1_preferred_sites | Array<string> | null | Array of hostnames for tier1 site assignment priority (Story 3.1) |
| auto_create_sites | boolean | false | Whether to auto-create sites when pool is insufficient (Story 3.1) |
| create_sites_for_keywords | Array<Object> | null | Array of keyword site creation configs (Story 3.1) |
| tiered_link_count_range | Object | null | Configuration for tiered link counts (Story 3.2) |
| image_theme_prompt | string | null | Override image theme prompt for all images in this job (Story 7.1) |
Tier Configuration
Each tier in the tiers object defines content generation parameters for that specific tier level.
Tier Keys
- tier1 - Premium content (highest quality)
- tier2 - Standard content (medium quality)
- tier3 - Supporting content (basic quality)
Tier Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| count | integer | Yes | - | Number of articles to generate for this tier |
| min_word_count | integer | No | See defaults | Minimum word count for articles |
| max_word_count | integer | No | See defaults | Maximum word count for articles |
| min_h2_tags | integer | No | See defaults | Minimum number of H2 headings |
| max_h2_tags | integer | No | See defaults | Maximum number of H2 headings |
| min_h3_tags | integer | No | See defaults | Minimum number of H3 subheadings |
| max_h3_tags | integer | No | See defaults | Maximum number of H3 subheadings |
Tier Defaults
Tier 1 Defaults
{
  "min_word_count": 2000,
  "max_word_count": 2500,
  "min_h2_tags": 3,
  "max_h2_tags": 5,
  "min_h3_tags": 5,
  "max_h3_tags": 10
}
Tier 2 Defaults
{
  "min_word_count": 1500,
  "max_word_count": 2000,
  "min_h2_tags": 2,
  "max_h2_tags": 4,
  "min_h3_tags": 3,
  "max_h3_tags": 8
}
Tier 3 Defaults
{
  "min_word_count": 1000,
  "max_word_count": 1500,
  "min_h2_tags": 2,
  "max_h2_tags": 3,
  "min_h3_tags": 2,
  "max_h3_tags": 6
}
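One plausible way to apply these defaults is a per-field merge in which values from the job file win over the tier defaults. The sketch below assumes that behavior; TIER_DEFAULTS and resolve_tier are hypothetical names used for illustration, not functions from the Big-Link-Man codebase.

```python
from typing import Any

# Default values copied from the Tier Defaults listings above.
TIER_DEFAULTS: dict[str, dict[str, int]] = {
    "tier1": {"min_word_count": 2000, "max_word_count": 2500,
              "min_h2_tags": 3, "max_h2_tags": 5,
              "min_h3_tags": 5, "max_h3_tags": 10},
    "tier2": {"min_word_count": 1500, "max_word_count": 2000,
              "min_h2_tags": 2, "max_h2_tags": 4,
              "min_h3_tags": 3, "max_h3_tags": 8},
    "tier3": {"min_word_count": 1000, "max_word_count": 1500,
              "min_h2_tags": 2, "max_h2_tags": 3,
              "min_h3_tags": 2, "max_h3_tags": 6},
}


def resolve_tier(tier_name: str, tier_config: dict[str, Any]) -> dict[str, Any]:
    """Fill missing optional tier fields from the defaults (hypothetical helper)."""
    if "count" not in tier_config:
        raise ValueError(f"{tier_name}: 'count' is required")
    # Assumption: the merge is per field, so a tier may override only some values.
    return {**TIER_DEFAULTS.get(tier_name, {}), **tier_config}


resolved = resolve_tier("tier2", {"count": 50, "min_word_count": 1600})
print(resolved)
```

Here only min_word_count is overridden; the remaining five fields fall back to the Tier 2 defaults.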
Deployment Target Assignment (Story 2.5)
deployment_targets
- Type: Array<string> (optional)
- Purpose: Assigns tier1 articles to specific sites in list order, one article per target (no wrap-around)
- Behavior:
  - Only affects tier1 articles
  - Articles 0 through N-1 get assigned to N deployment targets
  - Articles N and beyond get site_deployment_id = null
  - If not specified, all articles get site_deployment_id = null
Example
{
  "deployment_targets": [
    "www.domain1.com",
    "www.domain2.com",
    "www.domain3.com"
  ]
}
Assignment Result:
- Article 0 → www.domain1.com
- Article 1 → www.domain2.com
- Article 2 → www.domain3.com
- Articles 3+ → null (no assignment)
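The assignment result above can be reproduced with a short sketch. assign_deployment_targets is a hypothetical helper, and for readability it returns hostnames, whereas the real system stores a site_deployment_id whose lookup is not shown here.

```python
from typing import Optional


def assign_deployment_targets(
    article_count: int,
    deployment_targets: Optional[list[str]],
) -> list[Optional[str]]:
    """Pair tier1 articles with targets by index; extra articles get None."""
    targets = deployment_targets or []
    # Article i takes target i while targets remain; there is no wrap-around.
    return [targets[i] if i < len(targets) else None for i in range(article_count)]


assignments = assign_deployment_targets(
    5, ["www.domain1.com", "www.domain2.com", "www.domain3.com"]
)
for index, hostname in enumerate(assignments):
    print(f"Article {index} -> {hostname}")
```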
Site Assignment (Story 3.1)
tier1_preferred_sites
- Type: Array<string> (optional)
- Purpose: Preferred sites for tier1 article assignment
- Behavior: Used in priority order before random selection
- Validation: All hostnames must exist in database
auto_create_sites
- Type: boolean (optional, default: false)
- Purpose: Auto-create sites when available pool is insufficient
- Behavior: Creates generic sites using project keyword as prefix
- Status: ⚠️ NOT IMPLEMENTED - Parsed but does not function
create_sites_for_keywords
- Type: Array<Object> (optional)
- Purpose: Pre-create sites for specific keywords before assignment
- Structure: Each object must have keyword (string) and count (integer)
- Status: ⚠️ NOT IMPLEMENTED - Parsed but does not function
Keyword Site Creation Object
| Field | Type | Required | Description |
|---|---|---|---|
| keyword | string | Yes | Keyword to create sites for |
| count | integer | Yes | Number of sites to create for this keyword |
Example
{
  "tier1_preferred_sites": [
    "www.premium-site1.com",
    "site123.b-cdn.net"
  ],
  "auto_create_sites": true,
  "create_sites_for_keywords": [
    {
      "keyword": "engine repair",
      "count": 3
    },
    {
      "keyword": "car maintenance",
      "count": 2
    }
  ]
}
AI Model Configuration
models
- Type: Object (optional)
- Purpose: Specifies AI models to use for each generation stage
- Behavior: Allows different models for title, outline, and content generation
- Note: If not specified, all stages use the model from CLI --model flag (default: gpt-4o-mini)
Models Object Fields
| Field | Type | Description |
|---|---|---|
| title | string | Model to use for title generation |
| outline | string | Model to use for outline generation |
| content | string | Model to use for content generation |
Example
{
  "models": {
    "title": "openai/gpt-4o-mini",
    "outline": "openai/gpt-4o",
    "content": "anthropic/claude-3.5-sonnet"
  }
}
Implementation Status
Implemented - The models field is fully functional. Different models can be specified for title, outline, and content generation stages. If a job file contains a models configuration and you also use the --model CLI flag, the system will warn you that the CLI flag is being ignored in favor of the job config.
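The precedence between the job file and the CLI flag can be sketched as follows. resolve_models is a hypothetical helper, and the warning text is illustrative rather than the exact message the CLI prints.

```python
import warnings
from typing import Any, Optional

STAGES = ("title", "outline", "content")
DEFAULT_MODEL = "gpt-4o-mini"  # CLI default stated in this document


def resolve_models(job: dict[str, Any], cli_model: Optional[str] = None) -> dict[str, str]:
    """Choose the model for each generation stage (hypothetical helper)."""
    job_models = job.get("models")
    if job_models:
        if cli_model is not None:
            # The job file wins; the CLI flag is ignored with a warning.
            warnings.warn("--model is ignored because the job file defines 'models'")
        return {stage: job_models[stage] for stage in STAGES}
    # No job-level config: every stage uses the CLI model or its default.
    model = cli_model or DEFAULT_MODEL
    return {stage: model for stage in STAGES}


job = {"models": {"title": "openai/gpt-4o-mini",
                  "outline": "openai/gpt-4o",
                  "content": "anthropic/claude-3.5-sonnet"}}
print(resolve_models(job))
print(resolve_models({}))
```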
Image Theme Configuration (Story 7.1)
image_theme_prompt
- Type: string (optional)
- Purpose: Override the image theme prompt for all images (hero and content) generated in this job
- Behavior:
  - If provided, this string is used directly as the theme prompt for all image generation
  - If not provided, the system checks for a cached theme in the project database
  - If no cached theme exists, a new theme is generated using AI based on the project's keyword, entities, and related searches
- Format: A single string describing visual style, color scheme, lighting, environment, and overall aesthetic
- Note: This is the prompt sent directly to the image generation API (fal.ai FLUX.1 schnell), not split into system/user messages
Example
{
  "image_theme_prompt": "Modern industrial workspace, warm amber lighting, deep burgundy accents, professional photography style, clean minimalist aesthetic"
}
Theme Prompt Priority
- Job override (image_theme_prompt in job.json) - Highest priority
- Database cache (Project.image_theme_prompt) - Used if no override
- AI generation - Generated using image_theme_generation.json template if no cache exists
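The priority order above amounts to a three-step fallback. In the sketch below, resolve_theme_prompt is a hypothetical helper: cached_theme stands in for Project.image_theme_prompt, and generate_theme stands in for the AI generation step, whose real interface is not described in this document.

```python
from typing import Any, Callable, Optional


def resolve_theme_prompt(
    job: dict[str, Any],
    cached_theme: Optional[str],
    generate_theme: Callable[[], str],
) -> tuple[str, str]:
    """Return (prompt, source) following job override > cache > AI generation."""
    override = job.get("image_theme_prompt")
    if override:
        return override, "job"
    if cached_theme:
        return cached_theme, "cache"
    # Only reached when neither an override nor a cached theme exists.
    return generate_theme(), "generated"


prompt, source = resolve_theme_prompt(
    {"image_theme_prompt": "Warm amber lighting, deep burgundy accents"},
    cached_theme="Cool blue studio lighting",
    generate_theme=lambda: "placeholder generated theme",
)
print(source, "->", prompt)
```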
Best Practices
- Use descriptive color schemes to avoid default blue tones
- Include lighting, environment, and style details
- Keep it concise (2-3 sentences recommended)
- Consider the industry/product when choosing colors and aesthetic
Tiered Link Configuration (Story 3.2)
tiered_link_count_range
- Type: Object (optional)
- Purpose: Configures how many tiered links to generate per article
- Default: {"min": 2, "max": 4} if not specified
- Behavior:
  - Tier1: Always 1 link to money site (this setting ignored)
  - Tier2+: Random between min and max links to lower tier
Tiered Link Range Object
| Field | Type | Required | Description |
|---|---|---|---|
| min | integer | Yes | Minimum number of tiered links (must be >= 1) |
| max | integer | Yes | Maximum number of tiered links (must be >= min) |
Example
{
  "tiered_link_count_range": {
    "min": 3,
    "max": 5
  }
}
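A minimal sketch of how a per-article link count might be drawn from this range. It assumes inclusive bounds and a uniform pick, which this document does not specify; tiered_link_count is a hypothetical helper.

```python
import random
from typing import Optional

DEFAULT_LINK_RANGE = {"min": 2, "max": 4}  # default stated in this section


def tiered_link_count(
    tier_number: int,
    link_range: Optional[dict[str, int]] = None,
    rng: Optional[random.Random] = None,
) -> int:
    """Return how many tiered links an article in the given tier receives."""
    if tier_number == 1:
        # Tier1 always gets a single link to the money site; the range is ignored.
        return 1
    bounds = link_range or DEFAULT_LINK_RANGE
    rng = rng or random.Random()
    # Assumption: both bounds are inclusive.
    return rng.randint(bounds["min"], bounds["max"])


print(tiered_link_count(1, {"min": 3, "max": 5}))
print(tiered_link_count(2, {"min": 3, "max": 5}))
```

Passing a seeded random.Random instance makes the draw reproducible in tests.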
Interlinking Configuration (Story 3.3)
interlinking
- Type: Object (optional)
- Purpose: Configures internal linking behavior within articles
- Can be set at: Job level (all tiers) or tier level (specific tier)
- Tier-level override: Tier-level config overrides job-level for that tier
Interlinking Object Fields
| Field | Type | Description |
|---|---|---|
| links_per_article_min | integer | Minimum number of tiered links (same as tiered_link_count_range.min) |
| links_per_article_max | integer | Maximum number of tiered links (same as tiered_link_count_range.max) |
| see_also_min | integer | Minimum number of "See Also" links to same-tier articles (default: 4) |
| see_also_max | integer | Maximum number of "See Also" links to same-tier articles (default: 5) |
Example
{
  "interlinking": {
    "links_per_article_min": 2,
    "links_per_article_max": 4,
    "see_also_min": 4,
    "see_also_max": 5
  }
}
Behavior:
- links_per_article_min/max: Controls how many links to lower tier articles
- see_also_min/max: Controls how many "See Also" links to randomly selected articles from the same tier
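The tier-level override can be sketched as a lookup that prefers the tier's own interlinking object. This assumes the tier-level object replaces the job-level object as a whole rather than merging field by field, which the document does not state; resolve_interlinking is a hypothetical helper.

```python
from typing import Any

# "See Also" defaults listed in the Interlinking Object Fields table.
SEE_ALSO_DEFAULTS = {"see_also_min": 4, "see_also_max": 5}


def resolve_interlinking(job: dict[str, Any], tier_name: str) -> dict[str, Any]:
    """Return the interlinking settings that apply to one tier."""
    tier_config = job.get("tiers", {}).get(tier_name, {})
    # Assumption: a tier-level object fully replaces the job-level one.
    chosen = tier_config.get("interlinking", job.get("interlinking", {}))
    return {**SEE_ALSO_DEFAULTS, **chosen}


job = {
    "interlinking": {"links_per_article_min": 2, "links_per_article_max": 4},
    "tiers": {
        "tier1": {"count": 10},
        "tier2": {"count": 50,
                  "interlinking": {"see_also_min": 2, "see_also_max": 3}},
    },
}
print(resolve_interlinking(job, "tier1"))
print(resolve_interlinking(job, "tier2"))
```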
Anchor Text Configuration (Story 8.1)
anchor_text_config
- Type: Object (optional)
- Purpose: Controls anchor text selection for tiered links
- Can be set at: Job level (all tiers) or tier level (specific tier)
- Tier-level override: Tier-level config overrides job-level for that tier
Anchor Text Config Modes
Explicit mode is well suited to branded anchor text: repeat the company name in the term list as many times as needed to reach the desired percentage of branded anchors.
| Mode | Description |
|---|---|
| default | Use master.config.json tier rules (main_keyword for tier1, related_searches for tier2+) |
| override | Replace tier rules with custom_text array |
| append | Add custom_text array to tier rules |
| explicit | Use only explicitly provided terms (no algorithm-generated terms) |
Anchor Text Config Object (Job Level)
| Field | Type | Description |
|---|---|---|
| mode | string | One of: "default", "override", "append", "explicit" |
| custom_text | Array<string> | Custom anchor text terms (for override/append modes) |
| tier1 | Array<string> | Explicit terms for tier1 (for explicit mode) |
| tier2 | Array<string> | Explicit terms for tier2 (for explicit mode) |
| tier3 | Array<string> | Explicit terms for tier3 (for explicit mode) |
| tier4_plus | Array<string> | Explicit terms for tier4+ (for explicit mode) |
Anchor Text Config Object (Tier Level)
| Field | Type | Description |
|---|---|---|
| mode | string | One of: "default", "override", "append", "explicit" |
| custom_text | Array<string> | Custom anchor text terms (for override/append modes) |
| terms | Array<string> | Explicit terms for this tier (for explicit mode) |
Examples
Default mode (use tier rules):
{
  "anchor_text_config": {
    "mode": "default"
  }
}
Override mode (replace with custom text):
{
  "anchor_text_config": {
    "mode": "override",
    "custom_text": ["custom term 1", "custom term 2"]
  }
}
Explicit mode (job level):
{
  "anchor_text_config": {
    "mode": "explicit",
    "tier1": ["high volume", "precision machining", "custom manufacturing"],
    "tier2": ["high volume production", "bulk manufacturing", "large scale"]
  }
}
Explicit mode (tier level override):
{
  "tiers": {
    "tier1": {
      "count": 12,
      "anchor_text_config": {
        "mode": "explicit",
        "terms": ["high volume", "precision"]
      }
    }
  }
}
Explicit mode with branded anchor text ratio (generated via ingest-cora):
When using ingest-cora with --tier1-branded-ratio, the system automatically generates an explicit anchor text list with the specified ratio of branded terms. For example, with a 75% ratio and branded text "Acme Corp", the generated config might look like:
{
  "tiers": {
    "tier1": {
      "count": 10,
      "anchor_text_config": {
        "mode": "explicit",
        "terms": ["Acme Corp", "Acme Corp", "Acme Corp", "Acme Corp", "Acme Corp", "Acme Corp", "Acme Corp", "Acme Corp", "Acme Corp", "Acme Corp", "Acme Corp", "Acme Corp", "Acme Corp", "Acme Corp", "Acme Corp", "main keyword", "learn about main keyword", "main keyword guide", "best main keyword", "main keyword tips"]
      }
    }
  }
}
This achieves 75% branded (15/20) and 25% keyword-based (5/20) anchor text selection.
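The arithmetic behind such a list can be sketched by solving branded / (branded + keywords) = ratio for the number of branded entries. This is not ingest-cora's actual logic, which is not shown in this document; build_branded_terms is a hypothetical helper.

```python
def build_branded_terms(
    branded_text: str,
    keyword_terms: list[str],
    branded_ratio: float,
) -> list[str]:
    """Repeat branded_text so it makes up branded_ratio of the returned list."""
    if not 0 <= branded_ratio < 1:
        raise ValueError("branded_ratio must be in the range [0, 1)")
    # branded / (branded + keywords) = ratio
    #   =>  branded = ratio * keywords / (1 - ratio)
    branded_count = round(branded_ratio * len(keyword_terms) / (1 - branded_ratio))
    return [branded_text] * branded_count + list(keyword_terms)


keywords = ["main keyword", "learn about main keyword", "main keyword guide",
            "best main keyword", "main keyword tips"]
terms = build_branded_terms("Acme Corp", keywords, 0.75)
print(terms.count("Acme Corp"), "of", len(terms))
```

With five keyword terms and a 0.75 ratio this yields 15 branded entries out of 20, matching the example above.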
Behavior:
- System tries to find provided terms in content first, then inserts if not found
- When using "explicit" mode, only the provided terms are used (no algorithm-generated terms)
- Tier-level explicit config takes precedence over job-level for that tier
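The four modes can be read as rules for building the pool of candidate anchor terms. In the sketch below, select_anchor_terms is a hypothetical helper and tier_rule_terms stands in for whatever the master.config.json tier rules produce. Returning an empty list when explicit mode has no terms for a tier is an assumption, as is the mapping of tiers above 3 to the tier4_plus key.

```python
from typing import Any


def job_level_key(tier_number: int) -> str:
    """Map a tier number to its job-level explicit key (tier4_plus for 4 and up)."""
    return f"tier{tier_number}" if tier_number <= 3 else "tier4_plus"


def select_anchor_terms(
    config: dict[str, Any],
    tier_number: int,
    tier_rule_terms: list[str],
) -> list[str]:
    """Build the candidate anchor text pool for one tier according to the mode."""
    mode = config.get("mode", "default")
    custom = list(config.get("custom_text", []))
    if mode == "default":
        return list(tier_rule_terms)
    if mode == "override":
        return custom
    if mode == "append":
        return list(tier_rule_terms) + custom
    if mode == "explicit":
        # Tier-level configs use "terms"; job-level configs use per-tier keys.
        if "terms" in config:
            return list(config["terms"])
        return list(config.get(job_level_key(tier_number), []))
    raise ValueError(f"Unknown anchor_text_config mode: {mode!r}")


rules = ["main keyword"]
print(select_anchor_terms({"mode": "append", "custom_text": ["custom term 1"]}, 1, rules))
print(select_anchor_terms({"mode": "explicit", "tier1": ["high volume"]}, 1, rules))
```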
Complete Example
{
  "jobs": [
    {
      "project_id": 1,
      "models": {
        "title": "anthropic/claude-3.5-sonnet",
        "outline": "anthropic/claude-3.5-sonnet",
        "content": "openai/gpt-4o"
      },
      "deployment_targets": [
        "www.primary-domain.com",
        "www.secondary-domain.com"
      ],
      "tier1_preferred_sites": [
        "www.premium-site1.com",
        "site123.b-cdn.net"
      ],
      "auto_create_sites": true,
      "create_sites_for_keywords": [
        {
          "keyword": "engine repair",
          "count": 3
        },
        {
          "keyword": "car maintenance",
          "count": 2
        }
      ],
      "tiered_link_count_range": {
        "min": 3,
        "max": 5
      },
      "image_theme_prompt": "Modern industrial workspace, warm amber lighting, deep burgundy accents, professional photography style, clean minimalist aesthetic",
      "tiers": {
        "tier1": {
          "count": 10,
          "min_word_count": 2000,
          "max_word_count": 2500,
          "min_h2_tags": 3,
          "max_h2_tags": 5,
          "min_h3_tags": 5,
          "max_h3_tags": 10
        },
        "tier2": {
          "count": 50,
          "min_word_count": 1500,
          "max_word_count": 2000
        },
        "tier3": {
          "count": 100
        }
      }
    }
  ]
}
Validation Rules
Job Level Validation
- project_id must be a positive integer
- tiers must be an object with at least one tier
- models must be an object with title, outline, and content fields (if specified)
- deployment_targets must be an array of strings (if specified)
- tier1_preferred_sites must be an array of strings (if specified)
- auto_create_sites must be a boolean (if specified)
- create_sites_for_keywords must be an array of objects with keyword and count fields (if specified)
- tiered_link_count_range must have min >= 1 and max >= min (if specified)
- image_theme_prompt must be a non-empty string (if specified)
Tier Level Validation
- count must be a positive integer
- min_word_count must be <= max_word_count
- min_h2_tags must be <= max_h2_tags
- min_h3_tags must be <= max_h3_tags
Site Assignment Validation
- All hostnames in deployment_targets must exist in database
- All hostnames in tier1_preferred_sites must exist in database
- Keywords in create_sites_for_keywords must be non-empty strings
- Count values in create_sites_for_keywords must be positive integers
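A subset of these rules can be sketched as a function that collects error messages instead of raising on the first failure. validate_job is a hypothetical helper. Four of its messages are copied from the Common Validation Errors list in the Error Handling section; the other wordings are illustrative. The database hostname checks are omitted because they need a live connection.

```python
from typing import Any


def _is_positive_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def validate_job(job: dict[str, Any]) -> list[str]:
    """Check structural rules for one job and return a list of error messages."""
    errors: list[str] = []
    if "project_id" not in job:
        errors.append("Job missing 'project_id'")
    elif not _is_positive_int(job["project_id"]):
        errors.append("'project_id' must be a positive integer")

    tiers = job.get("tiers")
    if tiers is None:
        errors.append("Job missing 'tiers'")
    elif not isinstance(tiers, dict) or not tiers:
        errors.append("'tiers' must be an object with at least one tier")
    else:
        for name, tier in tiers.items():
            if not isinstance(tier, dict):
                errors.append(f"{name}: tier config must be an object")
                continue
            if not _is_positive_int(tier.get("count")):
                errors.append(f"{name}: 'count' must be a positive integer")
            for field in ("word_count", "h2_tags", "h3_tags"):
                low, high = tier.get(f"min_{field}"), tier.get(f"max_{field}")
                if low is not None and high is not None and low > high:
                    errors.append(f"{name}: min_{field} must be <= max_{field}")

    targets = job.get("deployment_targets")
    if targets is not None and not isinstance(targets, list):
        errors.append("'deployment_targets' must be an array")

    link_range = job.get("tiered_link_count_range")
    if isinstance(link_range, dict):
        low, high = link_range.get("min", 0), link_range.get("max", 0)
        if low < 1:
            errors.append("'tiered_link_count_range' min must be >= 1")
        elif high < low:
            errors.append("'tiered_link_count_range' max must be >= min")
    return errors


bad_job = {"tiers": {"tier1": {"count": 0, "min_h2_tags": 6, "max_h2_tags": 5}},
           "deployment_targets": "www.domain1.com",
           "tiered_link_count_range": {"min": 0, "max": 4}}
for message in validate_job(bad_job):
    print(message)
```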
Usage
CLI Command
uv run python main.py generate-batch --job-file jobs/example.json --username admin --password secret
Command Options
- --job-file, -j: Path to job JSON file (required)
- --username, -u: Username for authentication
- --password, -p: Password for authentication
- --debug: Save AI responses to debug_output/
- --continue-on-error: Continue processing if article generation fails
- --model, -m: AI model to use (default: gpt-4o-mini). Overridden by job file models config if present.
Implementation History
Story 2.2: Basic Content Generation
- Added project_id and tiers fields
- Added tier configuration with word count and heading constraints
- Added tier defaults for common configurations
Story 2.3: AI Content Generation
- Implemented: Per-stage model selection via job config models field
- Implemented: Dynamic model switching in AIClient with override_model parameter
- Implemented: CLI warning when job contains models but --model flag is used
- Behavior: Job file models config takes precedence over CLI --model flag
Story 2.5: Deployment Target Assignment
- Added deployment_targets field for tier1 site assignment
- Implemented round-robin assignment logic
- Added validation for deployment target hostnames
Story 3.1: URL Generation and Site Assignment
- Added tier1_preferred_sites for priority-based assignment
- Added auto_create_sites for on-demand site creation
- Added create_sites_for_keywords for pre-creation of keyword sites
- Extended site assignment beyond deployment targets
Story 3.2: Tiered Link Finding
- Added tiered_link_count_range for configurable link counts
- Integrated with tiered link generation system
- Added validation for link count ranges
Story 7.1: Image Generation
- Added image_theme_prompt for overriding image theme prompts
- Allows manual control over visual style and color schemes
- Overrides database cache and AI generation when specified
Future Extensions
The schema is designed to be extensible for future features:
- Story 3.3: Content interlinking injection
- Story 4.x: Cloud deployment and handoff
- Future: Advanced site matching, cost tracking, analytics
Error Handling
Common Validation Errors
- "Job missing 'project_id'" - Required field missing
- "Job missing 'tiers'" - Required field missing
- "'deployment_targets' must be an array" - Wrong data type
- "Deployment targets not found in database: invalid.com" - Invalid hostname
- "'tiered_link_count_range' min must be >= 1" - Invalid range value
Graceful Degradation
- Missing optional fields use sensible defaults
- Invalid hostnames cause clear error messages
- Insufficient sites are meant to trigger auto-creation when auto_create_sites is enabled (not yet implemented; see Site Assignment) and otherwise produce a clear error
- Failed articles are logged but don't stop batch processing (with --continue-on-error)
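The --continue-on-error behavior corresponds to a familiar loop pattern: log each failure, then either re-raise or carry on. The sketch below only illustrates that pattern; run_batch and fake_generate are hypothetical names and do not reflect how generate-batch is actually implemented.

```python
import logging
from typing import Callable, Iterable

logger = logging.getLogger("batch")


def run_batch(
    article_ids: Iterable[int],
    generate: Callable[[int], str],
    continue_on_error: bool = False,
) -> tuple[list[str], list[int]]:
    """Generate each article; return (generated articles, failed article ids)."""
    generated: list[str] = []
    failed: list[int] = []
    for article_id in article_ids:
        try:
            generated.append(generate(article_id))
        except Exception as exc:
            logger.error("Article %s failed: %s", article_id, exc)
            if not continue_on_error:
                raise  # Without the flag, the first failure stops the batch.
            failed.append(article_id)
    return generated, failed


def fake_generate(article_id: int) -> str:
    # Stand-in for real generation; article 2 simulates a failure.
    if article_id == 2:
        raise RuntimeError("model timeout")
    return f"article-{article_id}"


generated, failed = run_batch([1, 2, 3], fake_generate, continue_on_error=True)
print(generated, failed)
```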