# CheddahBot Task Pipeline Flows — Complete Reference

## ClickUp Statuses Used

These are the ClickUp task statuses that CheddahBot reads and writes:

| Status | Set By | Meaning |
|--------|--------|---------|
| `to do` | Human (or default) | Task is waiting to be picked up |
| `automation underway` | CheddahBot | Bot is actively working on this task |
| `running cora` | CheddahBot (AutoCora) | Cora report is being generated by external worker |
| `outline review` | CheddahBot (Content) | Phase 1 outline is ready for human review |
| `outline approved` | Human | Human reviewed the outline, ready for Phase 2 |
| `pr needs review` | CheddahBot (Press Release) | Press release pipeline finished, PRs ready for human review |
| `internal review` | CheddahBot (Content/OPT) | Content/OPT pipeline finished, deliverables ready for human review |
| `complete` | CheddahBot (Link Building) | Pipeline fully done |
| `error` | CheddahBot | Something failed, needs attention |
| `in progress` | (configured but not used in automation) | — |

**What CheddahBot polls for:** `["to do", "outline approved"]` (config.yaml line 45)

---

## ClickUp Custom Fields Used

| Field Name | Type | Used By | What It Holds |
|------------|------|---------|---------------|
| `Work Category` | Dropdown | All pipelines | Determines which pipeline runs: "Press Release", "Link Building", "On Page Optimization", "Content Creation" |
| `PR Topic` | Text | Press Release | Press release topic/keyword (e.g., "Peek Plastic") — required |
| `Customer` | Text | Press Release | Client/company name — required |
| `Keyword` | Text | Link Building, Content, OPT | Target SEO keyword |
| `IMSURL` | Text | All pipelines | Target page URL (money site) — required for Press Release |
| `SocialURL` | Text | Press Release | Branded/social URL for the PR |
| `LB Method` | Dropdown | Link Building | "Cora Backlinks" or other methods |
| `CustomAnchors` | Text | Link Building | Custom anchor text overrides |
| `BrandedPlusRatio` | Number | Link Building | Ratio for branded anchors (default 0.7) |
| `CLIFlags` | Text | Link Building, Content, OPT | Extra flags passed to tools (e.g., "service") |
| `CoraFile` | Text | Link Building | Path to Cora xlsx file |

**Tags:** Tasks are tagged with month in `mmmyy` format (e.g., `feb26`, `mar26`).

---

## Background Threads

CheddahBot runs 7 daemon threads. All start at boot and run until shutdown.

| Thread | Interval | What It Does |
|--------|----------|-------------|
| **poll** | 60 seconds | Runs cron-scheduled tasks from the database |
| **heartbeat** | 30 minutes | Reads HEARTBEAT.md checklist, takes action if needed |
| **clickup** | 20 minutes | Polls ClickUp for tasks to auto-execute (only Press Releases currently) |
| **folder_watch** | 40 minutes | Scans `//PennQnap1/SHARE1/cora-inbox` for .xlsx files → triggers Link Building |
| **autocora** | 5 minutes | Submits Cora jobs for today's tasks + polls for results |
| **content_watch** | 40 minutes | Scans `//PennQnap1/SHARE1/content-cora-inbox` for .xlsx files → triggers Content/OPT Phase 1 |
| **cora_distribute** | 40 minutes | Scans `//PennQnap1/SHARE1/Cora-For-Human` for .xlsx files → distributes to pipeline inboxes |

---

## Pipeline 1: PRESS RELEASE

**Work Category:** "Press Release"
**auto_execute:** TRUE — the only pipeline that runs automatically from ClickUp polling
**Tool:** `write_press_releases`

### Flow

```
CLICKUP POLL (every 20 min)
  │
  ├─ Finds task with Work Category = "Press Release", status = "to do", due within 3 weeks
  │
  ▼
CHECK LOCAL DB
  │  Key: clickup:task:{id}:state
  │  If state = "executing" or "completed" or "failed" → SKIP (already handled)
  │
  ▼
SET STATUS → "automation underway"
  │  ClickUp API: PUT /task/{id} status
  │  Local DB: state = "executing"
  │
  ▼
STEP 1: Generate 7 Headlines (chat brain - GPT-4o-mini)
  │  Uses configured chat model
  │  Saves to: data/generated/press_releases/{company}/{slug}_headlines.txt
  │
  ▼
STEP 2: AI Judge Picks Best 2 (chat brain)
  │  Filters out rule-violating headlines (colons, superlatives, etc.)
  │  Falls back to first 2 if judge returns < 2
  │
  ▼
STEP 3: Write 2 Full Press Releases (execution brain - Claude Code CLI)
  │  For each winning headline:
  │    - Claude writes full 575-800 word PR
  │    - Validates anchor phrase
  │    - Saves .txt and .docx
  │    - Uploads .docx to ClickUp as attachment
  │
  ▼
STEP 4: Generate JSON-LD Schemas (execution brain - Sonnet)
  │  For each PR:
  │    - Generates NewsArticle schema
  │    - Saves .json file
  │
  ▼
SET STATUS → "internal review"
  │  ClickUp API: comment with results + PUT status
  │  Local DB: state = "completed"
  │
  ▼
DONE — Human reviews in ClickUp
```

### ClickUp Fields Read

- `PR Topic` → press release topic/keyword (required)
- `Customer` → company name in PR (required)
- `IMSURL` → target URL for anchor link (required)
- `SocialURL` → branded URL (optional)

### What Can Go Wrong

- **BUG: Crash mid-step → stuck forever.** DB says "executing", never retries. Manual reset needed.
- **BUG: DB says "completed" but ClickUp API failed → out of sync.** DB written before API call.
- **BUG: Attachment upload fails silently.** Task marked complete, files missing from ClickUp.
- Headline generation returns empty → tool exits with error, task marked "failed"
- Schema JSON invalid → warning logged but task still completes

---

## Pipeline 2: LINK BUILDING (Cora Backlinks)

**Work Category:** "Link Building"
**auto_execute:** FALSE — triggered by folder watcher, not ClickUp polling
**Tool:** `run_cora_backlinks`

### Full Lifecycle (4 stages)

```
STAGE A: AUTOCORA SUBMITS CORA JOB
══════════════════════════════════
AUTOCORA LOOP (every 5 min)
  │
  ├─ Calls submit_autocora_jobs(target_date = today)
  │    Finds tasks: Work Category in ["Link Building", "On Page Optimization", "Content Creation"]
  │                 status = "to do"
  │                 due date = TODAY (exact 24h window)  ← ★ BUG: misses overdue tasks
  │
  ├─ Groups tasks by Keyword (case-insensitive)
  │    If same keyword across multiple tasks → one job covers all
  │
  ├─ For each keyword group:
  │    Check local DB: autocora:job:{keyword_lower}
  │    If already submitted → SKIP
  │
  ▼
WRITE JOB FILE
  │  Path: //PennQnap1/SHARE1/AutoCora/jobs/{job-id}.json
  │  Content: {"keyword": "...", "url": "IMSURL", "task_ids": ["id1", "id2"]}
  │  Local DB: autocora:job:{keyword} = {status: "submitted", job_id: "..."}
  │
  ▼
SET ALL TASK STATUSES → "automation underway"

STAGE B: EXTERNAL WORKER RUNS CORA (not CheddahBot code)
═════════════════════════════════════════════════════════
Worker on another machine:
  │  Watches //PennQnap1/SHARE1/AutoCora/jobs/
  │  Picks up .json, runs Cora SEO tool
  │  Writes .xlsx report to Z:/cora-inbox/  ← auto-deposited
  │  Writes //PennQnap1/SHARE1/AutoCora/results/{job-id}.result = "SUCCESS" or "FAILURE: reason"

STAGE C: AUTOCORA POLLS FOR RESULTS
════════════════════════════════════
AUTOCORA LOOP (every 5 min)
  │
  ├─ Scans local DB for autocora:job:* with status = "submitted"
  │    For each: checks if results/{job-id}.result exists
  │
  ├─ If SUCCESS:
  │    Local DB: status = "completed"
  │    ClickUp: all task_ids → status = "running cora"
  │    ClickUp: comment "Cora report completed for keyword: ..."
  │
  ├─ If FAILURE:
  │    Local DB: status = "failed"
  │    ClickUp: all task_ids → status = "error"
  │    ClickUp: comment with failure reason
  │
  └─ If no result file yet: skip, check again in 5 min

STAGE D: FOLDER WATCHER TRIGGERS LINK BUILDING
═══════════════════════════════════════════════
FOLDER WATCHER (every 60 min)
  │
  ├─ Scans Z:/cora-inbox/ for .xlsx files
  │    Skips: ~$ temp files, already-completed files (via local DB)
  │
  ├─ For each new .xlsx:
  │    Normalize filename: "anti-vibration-rubber-mounts.xlsx" → "anti vibration rubber mounts"
  │
  ▼
MATCH TO CLICKUP TASK
  │  Queries all tasks in space with Work Category = "Link Building"
  │  Fuzzy matches Keyword field against normalized filename:
  │    - Exact match
  │    - Substring match (either direction)
  │    - >80% word overlap
  │
  ├─ NO MATCH → local DB: status = "unmatched", notification sent, retry next scan
  │
  ├─ MATCH FOUND but IMSURL empty → local DB: status = "blocked", ClickUp → "error"
  │
  ▼
SET STATUS → "automation underway"
  │
  ▼
STEP 1: Ingest CORA Report (Big-Link-Man subprocess)
  │  Runs: E:/dev/Big-Link-Man/.venv/Scripts/python.exe main.py ingest-cora -f {xlsx} -n {keyword} ...
  │  BLM parses xlsx, creates project, writes job file
  │  Timeout: 30 minutes
  │  ClickUp: comment "CORA report ingested. Project ID: ..."
  │
  ▼
STEP 2: Generate Content Batch (Big-Link-Man subprocess)
  │  Runs: python main.py generate-batch -j {job_file} --continue-on-error
  │  BLM generates content for each prospect
  │  Moves job file to jobs/done/
  │
  ▼
SET STATUS → "complete"
  │  ClickUp: comment with results
  │  Move .xlsx to Z:/cora-inbox/processed/
  │  Local DB: linkbuilding:watched:{filename} = {status: "completed"}
  │
  ▼
DONE
```

### ClickUp Fields Read

- `Keyword` → matches against .xlsx filename + used as project name
- `IMSURL` → money site URL (required)
- `LB Method` → must be "Cora Backlinks" or empty
- `CustomAnchors`, `BrandedPlusRatio`, `CLIFlags` → passed to BLM

### What Can Go Wrong

- **BUG: AutoCora only checks today's tasks.** Due date missed = never gets a Cora report.
- **BUG: Crash mid-step → stuck "executing".** Same as PR pipeline.
- No ClickUp task with matching Keyword → file sits unmatched, notification sent
- IMSURL empty → blocked, ClickUp set to "error"
- BLM subprocess timeout (30 min) or crash → task fails
- Network share offline → can't write job file or read results

### Retry Behavior

- "processing", "blocked", "unmatched" .xlsx files → retried on next scan (KV entry deleted)
- "completed", "failed" → never retried

---

## Pipeline 3: CONTENT CREATION

**Work Category:** "Content Creation"
**auto_execute:** FALSE — triggered by content folder watcher
**Tool:** `create_content` (two-phase)

### Flow

```
STAGE A: AUTOCORA SUBMITS CORA JOB (same as Link Building Stage A)
══════════════════════════════════════════════════════════════════
Same AutoCora loop, same BUG with today-only filtering.
Worker generates .xlsx → deposits in Z:/content-cora-inbox/

STAGE B: CONTENT WATCHER TRIGGERS PHASE 1
══════════════════════════════════════════
CONTENT WATCHER (every 60 min)
  │
  ├─ Scans Z:/content-cora-inbox/ for .xlsx files
  │    Same skip/retry logic as link building watcher
  │
  ├─ Normalize filename, fuzzy match to ClickUp task
  │    Matches: Work Category in ["Content Creation", "On Page Optimization"]
  │
  ├─ NO MATCH → "unmatched", notification
  │
  ▼
PHASE 1: Research + Outline (execution brain - Claude Code CLI)
  │
  │  ★ BUG: Does NOT set "automation underway" status (link building watcher does)
  │
  │  Build prompt based on content type:
  │    - If IMSURL present → "optimize existing page" (scrape it, analyze, outline improvements)
  │    - If IMSURL empty → "new content" (competitor research, outline from scratch)
  │    - If Cora .xlsx found → "use this Cora report for keyword targets and entities"
  │    - If CLIFlags contains "service" → includes service page template
  │
  │  Claude Code runs: web searches, scrapes competitors, reads Cora report
  │  Generates outline with entity recommendations
  │
  ▼
SAVE OUTLINE
  │  Path: Z:/content-outlines/{keyword-slug}/outline.md
  │  Local DB: clickup:task:{id}:state = {state: "outline_review", outline_path: "..."}
  │
  ▼
SET STATUS → "outline review"
  │  ClickUp: comment "Outline ready for review"
  │
  │  ★ BUG: .xlsx NOT moved to processed/ (link building watcher moves files)
  │
  ▼
WAITING FOR HUMAN
  │  Human opens outline at Z:/content-outlines/{slug}/outline.md
  │  Human edits/approves
  │  Human moves ClickUp task to "outline approved"

STAGE C: CLICKUP POLL TRIGGERS PHASE 2
═══════════════════════════════════════
CLICKUP POLL (every 20 min)
  │
  ├─ Finds task with status = "outline approved" (in poll_statuses list)
  │
  ├─ Check local DB: clickup:task:{id}:state
  │    Sees state = "outline_review" → this means Phase 2 is ready
  │    ★ BUG: If DB was wiped, no entry → runs Phase 1 AGAIN, overwrites outline
  │
  ▼
PHASE 2: Write Full Content (execution brain - Claude Code CLI)
  │
  │  Reads outline from path stored in local DB (outline_path)
  │  ★ BUG: If outline file was deleted → Phase 2 fails every time, no recovery
  │
  │  Claude Code writes full content using the approved outline
  │  Includes entity optimization, keyword density targets from Cora
  │
  ▼
SAVE FINAL CONTENT
  │  Path: Z:/content-outlines/{keyword-slug}/final-content.md
  │  Local DB: state = "completed"
  │
  ▼
SET STATUS → "internal review"
  │  ClickUp: comment with content path
  │
  ▼
DONE — Human reviews final content
```

### ClickUp Fields Read

- `Keyword` → target keyword, used for Cora matching and content generation
- `IMSURL` → if present = optimization, if empty = new content
- `CLIFlags` → hints like "service" for service page template

### What Can Go Wrong

- **BUG: AutoCora only checks today → Cora report never generated for overdue tasks**
- **BUG: DB wipe → Phase 2 reruns Phase 1, destroys approved outline**
- **BUG: Outline file deleted → Phase 2 permanently fails**
- **BUG: No "automation underway" set during Phase 1 from watcher**
- **BUG: .xlsx not moved to processed/**
- Network share offline → can't save outline or read it back

---

## Pipeline 4: ON PAGE OPTIMIZATION

**Work Category:** "On Page Optimization"
**auto_execute:** FALSE
**Tool:** `create_content` (same as Content Creation)

### Flow

Identical to Content Creation except:

- Phase 1 prompt says "optimize existing page at {IMSURL}" instead of "create new content"
- Phase 1 scrapes the existing page first, then builds optimization outline
- IMSURL is always present (it's the page being optimized)

Same bugs apply.

---

## The Local DB (KV Store) — What It Tracks

| Key Pattern | What It Stores | Read By | Actually Needed? |
|---|---|---|---|
| `clickup:task:{id}:state` | Full task execution state (status, timestamps, outline_path, errors) | ClickUp poll dedup check, Phase 2 detection | **PARTIALLY** — outline_path is needed for Phase 2, but dedup could use ClickUp status instead |
| `autocora:job:{keyword}` | Job submission tracking (job_id, status, task_ids) | AutoCora result poller | **YES** — maps keyword to job_id for result file lookup |
| `linkbuilding:watched:{filename}` | File processing state (processing/completed/failed/unmatched/blocked) | Folder watcher scan | **YES** — prevents re-processing files |
| `content:watched:{filename}` | Same as above for content files | Content watcher scan | **YES** — prevents re-processing |
| `pipeline:status` | Current step text for UI ("Step 2/4: Judging...") | Gradio UI polling | **NO** — just a display string, could be in-memory |
| `linkbuilding:status` | Same for link building UI | Gradio UI polling | **NO** — same |
| `system:loop:*:last_run` (x6) | Timestamp of last loop run | Dashboard API | **NO** — informational only, never used in logic |

---

## Summary of All Bugs

| # | Bug | Severity | Pipelines Affected |
|---|-----|----------|-------------------|
| 1 | AutoCora only submits for today's due date | HIGH | Link Building, Content, OPT |
| 2 | DB wipe → Phase 2 reruns Phase 1 | HIGH | Content, OPT |
| 3 | Stuck "executing" after crash, no recovery | HIGH | All 4 |
| 4 | Content watcher missing "automation underway" | MEDIUM | Content, OPT |
| 5 | Content watcher doesn't move .xlsx to processed/ | MEDIUM | Content, OPT |
| 6 | KV written before ClickUp API → out of sync | MEDIUM | All 4 |
| 7 | Silent attachment upload failures | MEDIUM | Press Release |
| 8 | Phase 2 fails permanently if outline file gone | LOW | Content, OPT |
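
Bug 1 in the summary above comes down to the due-date filter AutoCora applies before submitting jobs. As a rough illustration only, the sketch below contrasts the exact-24h window with one possible widening that also picks up overdue tasks. The helper names `day_window`, `due_today_only`, and `due_today_or_overdue` are hypothetical and not taken from CheddahBot, and due dates are assumed to be already parsed into timezone-aware `datetime` objects. The `to do` status check would still apply separately.

```python
from datetime import datetime, timedelta, timezone


def day_window(now: datetime) -> tuple[datetime, datetime]:
    """Return the [start, end) bounds of the calendar day containing `now`."""
    start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return start, start + timedelta(days=1)


def due_today_only(due: datetime, now: datetime) -> bool:
    """Exact 24h window: a task due yesterday is never selected."""
    start, end = day_window(now)
    return start <= due < end


def due_today_or_overdue(due: datetime, now: datetime) -> bool:
    """Hypothetical widening: anything due before the end of today qualifies."""
    _, end = day_window(now)
    return due < end


# Illustrative data only; these tasks and dates are made up.
now = datetime(2026, 2, 10, 9, 30, tzinfo=timezone.utc)
tasks = {
    "overdue": datetime(2026, 2, 9, 17, 0, tzinfo=timezone.utc),
    "today": datetime(2026, 2, 10, 17, 0, tzinfo=timezone.utc),
    "future": datetime(2026, 2, 12, 17, 0, tzinfo=timezone.utc),
}

today_only = [name for name, due in tasks.items() if due_today_only(due, now)]
with_overdue = [name for name, due in tasks.items() if due_today_or_overdue(due, now)]

print(today_only)    # ['today']
print(with_overdue)  # ['overdue', 'today']
```

With the wider window, the existing `autocora:job:{keyword_lower}` check in the local DB would still be what prevents an overdue task from being resubmitted on every 5-minute loop.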