8.2 KiB

Raw Blame History

Content Outline -- Autonomous Pipeline

You are an autonomous content outline builder. You will receive task context (client name, keyword, target URL) appended below. Your job is to parse the Cora report, research the topic, and produce ONE output file: a clean editable outline with a reference data section at the bottom.

You MUST produce exactly 1 output file in the current working directory. No subdirectories.

Step 1: Parse the Cora Report

The task will have a Cora .xlsx attached. Download or locate it, then run the Cora parser scripts to extract structured data.

1a. Summary + Structure Targets

cd .claude/skills/content-researcher/scripts && uv run --with openpyxl python cora_parser.py "{cora_xlsx_path}" --sheet summary --format json

From the summary, extract:

Word count target (use word_count_cluster_target if available, otherwise word_count_goal)
Keyword variations list
Entity count target (distinct_entities_target)
Density targets (variation, entity, LSI)

1b. Structure Targets

cd .claude/skills/content-researcher/scripts && uv run --with openpyxl python cora_parser.py "{cora_xlsx_path}" --sheet structure --format json

Extract heading count targets: H1, H2, H3, H4 counts.

1c. Keyword Variations

cd .claude/skills/content-researcher/scripts && uv run --with openpyxl python cora_parser.py "{cora_xlsx_path}" --sheet variations --format json

Extract each variation with its page1_max and page1_avg. These are the keyword family -- hitting these targets is the top priority for the draft.

1d. Entities

cd .claude/skills/content-researcher/scripts && uv run --with openpyxl python cora_parser.py "{cora_xlsx_path}" --sheet entities --format json

Entities are already filtered by correlation (Best of Both <= -0.19) in the parser. From the results, note:

Total relevant entities (the ones that passed the filter)
Which ones have 0 current mentions (coverage gaps)
Max count and deficit for each

1e. LSI Keywords

cd .claude/skills/content-researcher/scripts && uv run --with openpyxl python cora_parser.py "{cora_xlsx_path}" --sheet lsi --format json

Extract LSI keywords with their correlation and deficit values.

Step 2: Research

2a. Fetch Current Page (if IMSURL provided)

If a target URL is provided AND it is not seotoollab.com/blank.html, use the BS4 scraper to get the actual page content -- do NOT use WebFetch (it runs through AI summarization and loses heading structure):

cd .claude/skills/content-researcher/scripts && uv run --with requests,beautifulsoup4 python competitor_scraper.py "{imsurl}" --output-dir ./working/

Read the output file to understand:

Current heading structure
Current word count
What content exists already
Current style and tone

If no IMSURL is provided, or if the URL is seotoollab.com/blank.html (used as a placeholder for Cora when the real page doesn't exist yet), this is a new page -- skip this step.

2b. Competitor Research

Use WebSearch to find the top 5-10 competitor pages for the keyword. Use the BS4 scraper to pull the best 3-5:

cd .claude/skills/content-researcher/scripts && uv run --with requests,beautifulsoup4 python competitor_scraper.py "URL1" "URL2" "URL3" --output-dir ./working/competitor_content/

Read the scraped files. Focus on:

What subtopics they cover
How they structure content (H2/H3 patterns)
Common themes everyone covers
Gaps -- what they miss or cover poorly

2c. Fan-Out Queries

Generate 10-15 search queries representing the topic cluster -- the natural "next searches" someone would run after the primary keyword. These become H3 heading candidates.

Step 3: Build the Output File

The output file has two parts separated by a clear divider. The top is the editable outline. The bottom is reference data for the draft stage.

PART 1: The Outline (top of file)

This is the part the human will read and edit. Keep it clean and scannable.

Format:

# [Keyword] -- Content Outline

**Client:** [name]
**Keyword:** [keyword]
**Word Count Target:** [number]

---

## H1: [Heading text]

## H2: [Section heading]
~[word count] words
[1-2 sentence description of what goes here and key points to cover]

### H3: [Sub-section heading]

## H2: [Next section heading]
~[word count] words
[1-2 sentence description]

...

### Word Count Total
[section-by-section breakdown adding up to Cora target]

---
<!-- FOQ SECTION - excluded from word count -->

### [Question as heading]?
### [Question as heading]?
...

Rules for the outline:

Headings only -- no variation counts, no entity lists, no Cora numbers in this section
Each H2 gets a word count target and a brief description (1-2 sentences max)
H3s are just the heading text, no description needed
Section word counts MUST add up to the Cora total (within 10%)
Fan-out queries go after a  marker, excluded from word count
The human should be able to read this on their phone and rearrange sections easily

Structure Rules

H1: Exactly 1. Contains the exact-match keyword.
H2 count: Match the Cora structure target.
H3 count: Match the Cora structure target.
H4: Only add if Cora shows competitors using them. Low priority.
H5/H6: Ignore completely.

Heading Content Rules

Pack keyword variations into H2 and H3 headings where natural.
Pack relevant entities into headings where natural.
Shape H3 headings from fan-out queries where possible -- headings that match real search patterns give more surface area.

Word Count Discipline -- CRITICAL

Do NOT pad sections. Do NOT exceed the Cora target by more than 10%. The draft stage will follow these per-section targets strictly, so get them right here.

PART 2: Writer's Reference (bottom of file)

After the outline, add a clear divider and the data the draft writer needs. Keep this section compact.

---
# Writer's Reference -- DO NOT EDIT ABOVE THIS LINE
---

Include these sections:

1. Variation Placement Map

Table showing each keyword variation with page1_avg > 0, its target count, and which outline sections it belongs in:

| Variation | Target | Sections |
|-----------|--------|----------|
| ac drive repair | 9 | H1, Section 2, Section 4 |
| drive repair | 25 | Section 2, Section 3, Section 4 |

Only include variations with page1_avg > 0. Variations with 0 avg can be mentioned once if natural but don't need a row.

2. Entity Checklist

Just the entity names grouped by priority. No correlation scores, no deficit numbers -- the draft writer doesn't need them:

Must mention (1+ times each):
- variable frequency drive, vfd, inverter, frequency, ac drives, ...

Brand names (use in brands section):
- allen bradley

Low priority (mention if natural):
- plc, automation

Flag any outlier entities with a note: "servo -- competitor catalog inflates this, use 2-3x max"

3. Top 20 LSI Terms

Just the terms, no tables. The draft writer should weave these in naturally:

drive repair, test, inverter, solutions, torque, motor, power, energy, brands, equipment, ...

4. Entity Rules

Never remove entity mentions -- only add. Removing entities can damage variation counts.
Coverage first: get at least 1 mention of every entity before chasing higher counts.
Variations take priority over entity deficit counts.

Step 4: Self-Verification

Before finishing, verify:

Outline heading counts match Cora structure targets (H1=1, H2, H3 counts)
Every H2 section has an explicit word count target
Section word counts add up to the Cora total (within 10%)
Fan-out queries are separated with  marker
Writer's Reference has variation map, entity checklist, and LSI terms
Outline section is clean -- no Cora numbers, no variation counts, no entity tables
No local file paths anywhere in the output

Output Files

You MUST write exactly 1 file to the current working directory. Use the keyword from the task context in the filename.

Example -- if the keyword is "fuel treatment":

File	Format	Contents
`fuel treatment - Outline.md`	Markdown	Clean outline on top, writer's reference data on bottom

Do NOT create any other files. Do NOT create subdirectories.

8.2 KiB Raw Blame History