# Job File Format
A job file is a JSON document with a top-level `jobs` array. Each entry defines the batch content generation parameters for one project.
## Structure
```json
{
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 5,
          "min_word_count": 2000,
          "max_word_count": 2500,
          "min_h2_tags": 3,
          "max_h2_tags": 5,
          "min_h3_tags": 5,
          "max_h3_tags": 10
        }
      }
    }
  ]
}
```
## Fields
### Job Level
- `project_id` (required): The project ID to generate content for
- `tiers` (required): Dictionary of tier configurations
- `deployment_targets` (optional): Array of site custom_hostnames or site_deployment_ids to cycle through
- `deployment_overflow` (optional): Strategy used when the batch contains more articles than there are entries in `deployment_targets`: `"round_robin"`, `"random_available"`, or `"none"`. Default: `"round_robin"`
- `image_theme_prompt` (optional): A single string describing the visual style, color scheme, lighting, and overall aesthetic of generated images. It overrides the image theme for all images in this job. If omitted, the cached theme from the database is used, or a new one is generated with AI.
- `anchor_text_config` (optional): Controls the anchor text used for interlinking. Can be set at the job level (applies to all tiers) or at the tier level (overrides the job-level setting for that tier). See [Anchor Text Configuration](#anchor-text-configuration) below.
### Tier Level
- `count` (required): Number of articles to generate for this tier
- `min_word_count` (optional): Minimum word count (falls back to the tier default if not specified)
- `max_word_count` (optional): Maximum word count (falls back to the tier default if not specified)
- `min_h2_tags` (optional): Minimum number of H2 headings (falls back to the tier default if not specified)
- `max_h2_tags` (optional): Maximum number of H2 headings (falls back to the tier default if not specified)
- `min_h3_tags` (optional): Minimum total number of H3 subheadings (falls back to the tier default if not specified)
- `max_h3_tags` (optional): Maximum total number of H3 subheadings (falls back to the tier default if not specified)
- `anchor_text_config` (optional): Override anchor text for this tier only. See [Anchor Text Configuration](#anchor-text-configuration) section below.
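
The required and optional fields listed above can be checked before a batch is started. The following is a minimal sketch, not the project's own validation code: `validate_job_file` and `ALLOWED_OVERFLOW` are hypothetical names, and the checks cover only the rules stated in this document.

```python
# Hypothetical pre-flight check for a job file that has already been
# parsed into a dict (for example with json.load).
ALLOWED_OVERFLOW = {"round_robin", "random_available", "none"}


def validate_job_file(data):
    """Return a list of problems; an empty list means none were found."""
    jobs = data.get("jobs") if isinstance(data, dict) else None
    if not isinstance(jobs, list):
        return ['top level must be an object with a "jobs" array']

    errors = []
    for i, job in enumerate(jobs):
        where = f"jobs[{i}]"
        if not isinstance(job, dict):
            errors.append(f"{where}: must be an object")
            continue
        if "project_id" not in job:
            errors.append(f"{where}: missing required field project_id")

        # deployment_overflow is optional and defaults to round_robin.
        overflow = job.get("deployment_overflow", "round_robin")
        if overflow not in ALLOWED_OVERFLOW:
            errors.append(f"{where}: unknown deployment_overflow {overflow!r}")

        tiers = job.get("tiers")
        if not isinstance(tiers, dict):
            errors.append(f"{where}: missing required field tiers")
            continue
        for name, tier in tiers.items():
            if not isinstance(tier, dict) or "count" not in tier:
                errors.append(f"{where}.tiers.{name}: missing required field count")
    return errors
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue in a job file in one pass.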
## Tier Defaults
If tier parameters are not specified, these defaults are used:
### tier1
- `min_word_count`: 2000
- `max_word_count`: 2500
- `min_h2_tags`: 3
- `max_h2_tags`: 5
- `min_h3_tags`: 5
- `max_h3_tags`: 10
### tier2
- `min_word_count`: 1500
- `max_word_count`: 2000
- `min_h2_tags`: 2
- `max_h2_tags`: 4
- `min_h3_tags`: 3
- `max_h3_tags`: 8
### tier3
- `min_word_count`: 1000
- `max_word_count`: 1500
- `min_h2_tags`: 2
- `max_h2_tags`: 3
- `min_h3_tags`: 2
- `max_h3_tags`: 6
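
To illustrate how the defaults above combine with a partially specified tier, here is a small sketch. `TIER_DEFAULTS` and `resolve_tier_config` are hypothetical names, and the simple dictionary merge is an assumption about how the fallback behaves, not the generator's actual code.

```python
# Default values copied from the Tier Defaults section.
TIER_DEFAULTS = {
    "tier1": {"min_word_count": 2000, "max_word_count": 2500,
              "min_h2_tags": 3, "max_h2_tags": 5,
              "min_h3_tags": 5, "max_h3_tags": 10},
    "tier2": {"min_word_count": 1500, "max_word_count": 2000,
              "min_h2_tags": 2, "max_h2_tags": 4,
              "min_h3_tags": 3, "max_h3_tags": 8},
    "tier3": {"min_word_count": 1000, "max_word_count": 1500,
              "min_h2_tags": 2, "max_h2_tags": 3,
              "min_h3_tags": 2, "max_h3_tags": 6},
}


def resolve_tier_config(tier_name, tier_config):
    """Fill unspecified tier parameters with that tier's defaults."""
    if tier_name not in TIER_DEFAULTS:
        raise ValueError(f"no defaults known for {tier_name!r}")
    if "count" not in tier_config:
        raise ValueError("count is required for every tier")
    # Values given in the job file win over the defaults.
    return {**TIER_DEFAULTS[tier_name], **tier_config}


resolved = resolve_tier_config(
    "tier1", {"count": 3, "min_word_count": 2500, "max_word_count": 3000}
)
print(resolved["min_word_count"], resolved["min_h2_tags"])
```

In this sketch the word counts come from the job file, while the heading limits fall back to the `tier1` defaults.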
## Examples
### Simple: Single Tier with Defaults
```json
{
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 5
        }
      }
    }
  ]
}
```
### Custom Word Counts
```json
{
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 3,
          "min_word_count": 2500,
          "max_word_count": 3000
        }
      }
    }
  ]
}
```
### Multi-Tier
```json
{
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 5
        },
        "tier2": {
          "count": 10
        },
        "tier3": {
          "count": 15
        }
      }
    }
  ]
}
```
### Multiple Projects
```json
{
  "jobs": [
    {
      "project_id": 1,
      "tiers": {
        "tier1": {
          "count": 5
        }
      }
    },
    {
      "project_id": 2,
      "tiers": {
        "tier1": {
          "count": 3
        },
        "tier2": {
          "count": 8
        }
      }
    }
  ]
}
```
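
When a job file covers many projects, it may be easier to generate it than to write it by hand. The sketch below rebuilds the multi-project example from a plain mapping. `build_job_file` is a hypothetical helper and not part of the tool.

```python
import json


def build_job_file(tier_counts_by_project):
    """Build a job file dict from {project_id: {tier_name: article_count}}."""
    jobs = []
    for project_id, tier_counts in tier_counts_by_project.items():
        tiers = {tier: {"count": count} for tier, count in tier_counts.items()}
        jobs.append({"project_id": project_id, "tiers": tiers})
    return {"jobs": jobs}


job_file = build_job_file({1: {"tier1": 5}, 2: {"tier1": 3, "tier2": 8}})
# Print the JSON text; redirect it to a file to use it with --job-file.
print(json.dumps(job_file, indent=2))
```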
### Custom Image Theme
```json
{
  "jobs": [
    {
      "project_id": 1,
      "image_theme_prompt": "Modern industrial workspace, warm amber lighting, deep burgundy accents, professional photography style, clean minimalist aesthetic",
      "tiers": {
        "tier1": {
          "count": 5
        }
      }
    }
  ]
}
```
The `image_theme_prompt` overrides the default AI-generated theme for all images (hero and content) in this job. Use it to ensure consistent visual styling or to avoid default color schemes. If omitted, the system will use the cached theme from the project database, or generate a new one if none exists.
## Anchor Text Configuration
Control the anchor text used when injecting links between articles. Anchor text configuration can be set at the job level (applies to all tiers) or tier level (overrides for that specific tier).
### Modes
- **`default`**: Use algorithm-generated anchor text based on tier rules (Tier 1: `main_keyword` variations, Tier 2: `related_searches`, etc.)
- **`override`**: Replace the algorithm-generated terms with the `custom_text` array
- **`append`**: Add the `custom_text` array to the algorithm-generated terms
- **`explicit`**: Use only the explicitly provided terms (no algorithm-generated terms). At the job level, provide one array per tier (`tier1`, `tier2`, ...); at the tier level, provide a `terms` array
### Job-Level Configuration
Set anchor text for all tiers at once:
```json
{
  "jobs": [{
    "project_id": 26,
    "anchor_text_config": {
      "mode": "explicit",
      "tier1": ["high volume", "precision machining", "custom manufacturing"],
      "tier2": ["high volume production", "bulk manufacturing", "large scale"]
    },
    "tiers": {
      "tier1": {"count": 12},
      "tier2": {"count": 38}
    }
  }]
}
```
### Tier-Level Configuration
Override job-level config for a specific tier:
```json
{
  "jobs": [{
    "project_id": 26,
    "anchor_text_config": {
      "mode": "override",
      "custom_text": ["custom term 1", "custom term 2"]
    },
    "tiers": {
      "tier1": {
        "count": 12,
        "anchor_text_config": {
          "mode": "explicit",
          "terms": ["high volume", "precision"]
        }
      },
      "tier2": {"count": 38}
    }
  }]
}
```
### Examples
**Override mode (same terms for all tiers):**
```json
{
  "anchor_text_config": {
    "mode": "override",
    "custom_text": ["precision machining", "custom manufacturing"]
  }
}
```
**Explicit mode (different terms per tier):**
```json
{
  "anchor_text_config": {
    "mode": "explicit",
    "tier1": ["high volume", "precision machining"],
    "tier2": ["bulk production", "large scale"],
    "tier3": ["mass production"]
  }
}
```
**Mixed modes (explicit for tier1, default for tier2):**
```json
{
  "tiers": {
    "tier1": {
      "count": 12,
      "anchor_text_config": {
        "mode": "explicit",
        "terms": ["high volume", "precision"]
      }
    },
    "tier2": {
      "count": 38,
      "anchor_text_config": {
        "mode": "default"
      }
    }
  }
}
```
**Note:** In `explicit` mode, terms are randomized across articles, so every provided term gets used rather than only the first one.
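
The precedence and mode rules in this section can be summarized in a short sketch. It is an illustration under assumptions, not the tool's implementation: `effective_config` and `resolve_anchor_terms` are hypothetical names, `generated` stands in for the algorithm-generated terms, and returning an empty list when `explicit` mode has no terms for a tier is a guess.

```python
def effective_config(job, tier_name):
    """Pick the tier-level anchor_text_config if present, else the job-level one."""
    tier = job.get("tiers", {}).get(tier_name, {})
    if "anchor_text_config" in tier:
        return tier["anchor_text_config"]
    return job.get("anchor_text_config", {"mode": "default"})


def resolve_anchor_terms(config, tier_name, generated):
    """Combine configured and algorithm-generated terms according to the mode."""
    mode = config.get("mode", "default")
    custom = list(config.get("custom_text", []))
    if mode == "default":
        return list(generated)
    if mode == "override":
        return custom
    if mode == "append":
        return list(generated) + custom
    if mode == "explicit":
        # Tier-level configs use "terms"; job-level configs use per-tier keys.
        if "terms" in config:
            return list(config["terms"])
        return list(config.get(tier_name, []))
    raise ValueError(f"unknown anchor text mode: {mode!r}")


job = {
    "anchor_text_config": {"mode": "override", "custom_text": ["custom term 1"]},
    "tiers": {
        "tier1": {"count": 12,
                  "anchor_text_config": {"mode": "explicit", "terms": ["precision"]}},
        "tier2": {"count": 38},
    },
}
for name in ("tier1", "tier2"):
    print(name, resolve_anchor_terms(effective_config(job, name), name, ["generated term"]))
```

Here `tier1` uses its own explicit terms, while `tier2` inherits the job-level `override` config.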
## Usage
Run batch generation with:
```bash
python main.py generate-batch --job-file jobs/example_tier1_batch.json --username youruser --password yourpass
```
### Options
- `--job-file, -j`: Path to job JSON file (required)
- `--username, -u`: Username for authentication
- `--password, -p`: Password for authentication
- `--debug`: Save AI responses to `debug_output/`
- `--continue-on-error`: Continue processing if an article fails to generate
- `--model, -m`: AI model to use (default: `gpt-4o-mini`)
### Deployment Target Assignment (Story 2.5)
Optionally specify which sites/buckets to deploy articles to:
```json
{
  "jobs": [
    {
      "project_id": 1,
      "deployment_targets": [
        "www.domain1.com",
        "www.domain2.com",
        "www.domain3.com"
      ],
      "deployment_overflow": "round_robin",
      "tiers": {
        "tier1": {
          "count": 10
        }
      }
    }
  ]
}
```
This generates 10 articles (numbered from 0) distributed across the 3 sites:
- Articles 0, 3, 6, 9 → www.domain1.com
- Articles 1, 4, 7 → www.domain2.com
- Articles 2, 5, 8 → www.domain3.com
**Overflow Strategies:**
- `round_robin` (default): Cycle back through the specified targets
- `random_available`: Assign overflow articles to random sites that are not in the targets list
- `none`: Raise an error if the batch size exceeds the number of targets (strict mode)
If `deployment_targets` is omitted, articles receive random templates (no site assignment).
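
The distribution shown above follows from a plain modulo assignment. The sketch below reproduces it for `round_robin` and `none`. `assign_targets` is a hypothetical function, and `random_available` is left out because it depends on the site inventory, which this document does not describe.

```python
def assign_targets(article_count, targets, overflow="round_robin"):
    """Return one deployment target per article index (0-based)."""
    if not targets:
        raise ValueError("targets must not be empty")
    if overflow not in ("round_robin", "none"):
        raise NotImplementedError(f"strategy {overflow!r} is not covered by this sketch")
    if overflow == "none" and article_count > len(targets):
        raise ValueError("batch size exceeds the number of deployment targets")
    # Article i goes to target i modulo the number of targets.
    return [targets[i % len(targets)] for i in range(article_count)]


targets = ["www.domain1.com", "www.domain2.com", "www.domain3.com"]
assignment = assign_targets(10, targets)
for site in targets:
    indices = [i for i, assigned in enumerate(assignment) if assigned == site]
    print(site, indices)
```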
### Debug Mode
When using `--debug`, AI responses are saved to `debug_output/`:
- `title_project{id}_tier{tier}_article{n}_{timestamp}.txt`
- `outline_project{id}_tier{tier}_article{n}_{timestamp}.json`
- `content_project{id}_tier{tier}_article{n}_{timestamp}.html`
- `augmented_project{id}_tier{tier}_article{n}_{timestamp}.html` (if augmented)
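
To group debug files by project or article, the naming patterns above can be parsed with a regular expression. This is a sketch under assumptions: `parse_debug_filename` is a hypothetical helper, and because the exact format of `{tier}` and `{timestamp}` is not specified here, both are captured as opaque strings. The sample filename is made up for illustration.

```python
import re

# Matches the four filename patterns listed above.
DEBUG_NAME = re.compile(
    r"^(?P<kind>title|outline|content|augmented)"
    r"_project(?P<project_id>\d+)"
    r"_tier(?P<tier>.+?)"
    r"_article(?P<article>\d+)"
    r"_(?P<timestamp>.+)"
    r"\.(?P<ext>txt|json|html)$"
)


def parse_debug_filename(name):
    """Return the filename's parts as a dict, or None if it does not match."""
    match = DEBUG_NAME.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    parts["project_id"] = int(parts["project_id"])
    parts["article"] = int(parts["article"])
    return parts


# Hypothetical filename; the real timestamp format may differ.
print(parse_debug_filename("outline_project1_tier1_article3_20240101_120000.json"))
```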