Link Building Agent Plan
Context
CheddahBot needs a link building agent that orchestrates the external Big-Link-Man CLI tool (E:/dev/Big-Link-Man/). The current workflow is manual: run Cora on another machine → get .xlsx → manually run main.py ingest-cora → manually run main.py generate-batch. This agent automates the two main.py steps (ingest-cora and generate-batch), triggered by folder watching, ClickUp tasks, or chat commands. It must be extensible to future link building methods (MCP server path, ingest-simple, etc.).
Decisions Made
- Watch folder: Z:/cora-inbox (network drive, accessible from the Cora machine)
- File→task matching: Fuzzy match .xlsx filename stem against the ClickUp task's Keyword custom field
- New ClickUp field "LB Method": Dropdown with initial option "Cora Backlinks" (more added later)
- Dashboard: API endpoint + NotificationBus events only (no frontend work — separate project)
- Sidecar files: Not needed — all metadata comes from the matching ClickUp task
- Tool naming: Orchestrator pattern. run_link_building is a thin dispatcher that reads LB Method and routes to the specific pipeline tool (e.g., run_cora_backlinks). Future link building methods get their own tools and slot into the orchestrator.
Files to Create
1. cheddahbot/tools/linkbuilding.py — Main tool module
Four @tool-decorated functions + private helpers:
run_link_building(lb_method="", xlsx_path="", project_name="", money_site_url="", branded_plus_ratio=0.7, custom_anchors="", cli_flags="", ctx=None)
- Orchestrator/dispatcher: reads lb_method (from the ClickUp "LB Method" field or chat) and routes to the correct pipeline tool
- If lb_method is "Cora Backlinks" or empty (default): calls run_cora_backlinks()
- Future: if lb_method is "MCP Link Building": calls run_mcp_link_building() (not yet implemented)
- Passes all other args through to the sub-tool
- This is what the ClickUp skill_map always routes to
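The dispatch rule above can be sketched as a small registry lookup. This is only an illustration under stated assumptions: the PIPELINES registry, the DEFAULT_METHOD constant, and the returned message strings are hypothetical, and run_cora_backlinks is reduced to a stand-in. The skip for an empty xlsx_path follows the rule this plan states for Cora Backlinks tasks discovered by the ClickUp scheduler.

```python
from typing import Callable, Dict


def run_cora_backlinks(xlsx_path: str, **kwargs) -> str:
    """Stand-in for the real pipeline tool; it only echoes its input here."""
    return f"cora pipeline started for {xlsx_path}"


# Hypothetical registry: maps "LB Method" values to pipeline tools.
PIPELINES: Dict[str, Callable[..., str]] = {
    "Cora Backlinks": run_cora_backlinks,
}
DEFAULT_METHOD = "Cora Backlinks"


def run_link_building(lb_method: str = "", xlsx_path: str = "", **kwargs) -> str:
    """Route to the pipeline named by lb_method; empty means the default."""
    method = lb_method.strip() or DEFAULT_METHOD
    pipeline = PIPELINES.get(method)
    if pipeline is None:
        return f"Unknown LB Method: {method!r}"
    if method == "Cora Backlinks" and not xlsx_path:
        # No report yet: leave the task for the folder watcher.
        return "Skipped: Cora Backlinks needs an xlsx_path."
    # All remaining arguments pass straight through to the sub-tool.
    return pipeline(xlsx_path=xlsx_path, **kwargs)
```

Adding a future method would then mean registering one more entry in the registry rather than editing the dispatcher body.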
run_cora_backlinks(xlsx_path, project_name, money_site_url, branded_plus_ratio=0.7, custom_anchors="", cli_flags="", ctx=None)
- The actual Cora pipeline: runs ingest-cora → generate-batch
- Step 1: Build CLI args, call _run_blm_command(["ingest-cora", ...]), parse stdout for job file path
- Step 2: Call _run_blm_command(["generate-batch", "-j", job_file, "--continue-on-error"])
- Updates KV store state and posts ClickUp comments at each step (following press_release.py pattern)
- Returns "## ClickUp Sync" in output to signal scheduler that sync was handled internally
- Can also be called directly from chat for explicit Cora runs
blm_ingest_cora(xlsx_path, project_name, money_site_url, branded_plus_ratio=0.7, custom_anchors="", cli_flags="", ctx=None)
- Standalone ingest — runs ingest-cora only, returns project ID and job file path
- For cases where user wants to ingest but not generate yet
blm_generate_batch(job_file, continue_on_error=True, debug=False, ctx=None)
- Standalone generate — runs generate-batch only on an existing job file
- For re-running generation or running a manually-created job
Private helpers:
- _run_blm_command(args, timeout=1800): subprocess wrapper, runs uv run python main.py <args> from BLM_DIR, injects -u/-p from BLM_USERNAME/BLM_PASSWORD env vars
- _parse_ingest_output(stdout): regex extract project_id + job_file path
- _parse_generate_output(stdout): extract completion stats
- _build_ingest_args(...): construct CLI argument list from tool params
- _set_status(ctx, message): write pipeline status to KV store (for UI polling)
- _sync_clickup(ctx, task_id, step, message): post comment + update state
Critical: always pass -m flag to ingest-cora to prevent interactive stdin prompt from blocking the subprocess.
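A minimal sketch of the subprocess wrapper described above, split so that command construction can be checked without starting a process. The names build_blm_command and run_blm_command are illustrative, and appending -u/-p after the subcommand arguments is an assumption; the real position expected by Big-Link-Man should be confirmed against its CLI reference.

```python
import os
import subprocess
from typing import List, Mapping, Optional

BLM_DIR = "E:/dev/Big-Link-Man"  # the plan's default blm_dir


def build_blm_command(args: List[str],
                      env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the argv list for `uv run python main.py <args>`."""
    env = os.environ if env is None else env
    cmd = ["uv", "run", "python", "main.py", *args]
    # Assumption: credentials go after the subcommand arguments.
    username = env.get("BLM_USERNAME")
    password = env.get("BLM_PASSWORD")
    if username:
        cmd += ["-u", username]
    if password:
        cmd += ["-p", password]
    return cmd


def run_blm_command(args: List[str], timeout: int = 1800,
                    cwd: str = BLM_DIR) -> subprocess.CompletedProcess:
    """Run the CLI from BLM_DIR and capture stdout/stderr as text."""
    return subprocess.run(
        build_blm_command(args),
        cwd=cwd,
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired when exceeded
    )
```

Passing the command as a list (no shell=True) avoids quoting problems with paths and keeps credentials out of a shell command string.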
2. skills/linkbuilding.md — Skill file
YAML frontmatter linking to [run_link_building, run_cora_backlinks, blm_ingest_cora, blm_generate_batch, scan_cora_folder] tools and [link_builder, default] agents. Markdown body describes when to use, default flags, workflow steps.
3. tests/test_linkbuilding.py — Test suite (~40 tests)
All tests mock subprocess.run — never call Big-Link-Man. Categories:
- Output parser unit tests (_parse_ingest_output, _parse_generate_output)
- CLI arg builder tests (all flag combinations, missing required params)
- Full pipeline integration (happy path, ingest failure, generate failure)
- ClickUp state machine (executing → completed, executing → failed)
- Folder watcher scan logic (new files, skip processed, missing ClickUp match)
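The mocking approach for these tests might look like the sketch below. The function run_generate_batch is a hypothetical stand-in for the tool under test, not the planned implementation; the point is only that patching subprocess.run keeps Big-Link-Man from ever being started.

```python
import subprocess
from unittest.mock import patch


def run_generate_batch(job_file: str) -> bool:
    """Hypothetical stand-in for the tool: True when the CLI exits with 0."""
    result = subprocess.run(
        ["uv", "run", "python", "main.py",
         "generate-batch", "-j", job_file, "--continue-on-error"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def test_generate_batch_reports_failure():
    # Simulate a failing CLI run without launching any process.
    fake = subprocess.CompletedProcess(args=[], returncode=1,
                                       stdout="", stderr="boom")
    with patch("subprocess.run", return_value=fake) as mock_run:
        assert run_generate_batch("jobs/example.json") is False
    # Only the mock saw the call; inspect the argv it received.
    called_cmd = mock_run.call_args.args[0]
    assert called_cmd[4:6] == ["generate-batch", "-j"]
    assert called_cmd[6] == "jobs/example.json"
```

Under pytest the test function is collected automatically; it can also be called directly as a plain function.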
Files to Modify
4. cheddahbot/config.py — Add LinkBuildingConfig
    @dataclass
    class LinkBuildingConfig:
        blm_dir: str = "E:/dev/Big-Link-Man"
        watch_folder: str = ""  # empty = disabled
        watch_interval_minutes: int = 60
        default_branded_plus_ratio: float = 0.7
Add link_building: LinkBuildingConfig field to Config dataclass. Add YAML loading block in load_config() (same pattern as memory/scheduler/shell). Add env var override for BLM_DIR.
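As a rough sketch of that loading step, assuming the YAML file has already been parsed into a dict: the function name load_link_building is hypothetical, and the existing load_config() may structure this differently. The dataclass is repeated only to keep the example self-contained.

```python
import os
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional


@dataclass
class LinkBuildingConfig:
    blm_dir: str = "E:/dev/Big-Link-Man"
    watch_folder: str = ""  # empty = disabled
    watch_interval_minutes: int = 60
    default_branded_plus_ratio: float = 0.7


def load_link_building(raw: Mapping[str, Any],
                       env: Optional[Mapping[str, str]] = None) -> LinkBuildingConfig:
    """Build the config from parsed YAML, then apply the BLM_DIR override."""
    env = os.environ if env is None else env
    section = raw.get("link_building") or {}
    # Ignore unknown keys so a typo in config.yaml does not raise TypeError.
    known = {f.name for f in fields(LinkBuildingConfig)}
    cfg = LinkBuildingConfig(**{k: v for k, v in section.items() if k in known})
    if env.get("BLM_DIR"):
        cfg.blm_dir = env["BLM_DIR"]
    return cfg
```

Applying the environment override last means BLM_DIR wins over config.yaml, which is the usual precedence for machine-specific paths.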
5. config.yaml — Three additions
New top-level section:
    link_building:
      blm_dir: "E:/dev/Big-Link-Man"
      watch_folder: "Z:/cora-inbox"
      watch_interval_minutes: 60
      default_branded_plus_ratio: 0.7
New skill_map entry under clickup:
    "Link Building":
      tool: "run_link_building"
      auto_execute: false  # Cora Backlinks triggered by folder watcher, not scheduler
      complete_status: "complete"  # Override: use "complete" instead of "internal review"
      error_status: "internal review"  # On failure, move to internal review
      field_mapping:
        lb_method: "LB Method"
        project_name: "task_name"
        money_site_url: "IMSURL"
        custom_anchors: "CustomAnchors"
        branded_plus_ratio: "BrandedPlusRatio"
        cli_flags: "CLIFlags"
        xlsx_path: "CoraFile"
New agent:
    - name: link_builder
      display_name: Link Builder
      tools: [run_link_building, run_cora_backlinks, blm_ingest_cora, blm_generate_batch, scan_cora_folder, delegate_task, remember, search_memory]
      memory_scope: ""
6. cheddahbot/scheduler.py — Add folder watcher (4th daemon thread)
New thread _folder_watch_loop alongside existing poll, heartbeat, and ClickUp threads:
- Starts if config.link_building.watch_folder is non-empty
- Runs every watch_interval_minutes (default 60)
- _scan_watch_folder() globs *.xlsx in watch folder
- For each file, checks KV store linkbuilding:watched:{filename}; skips if already processed
- Fuzzy-matches filename stem against ClickUp tasks with LB Method = "Cora Backlinks" and status "to do":
  - Queries ClickUp for Link Building tasks
  - Compares normalized filename stem against each task's Keyword custom field
  - If match found: extracts money_site_url from IMSURL field, cli_flags from CLIFlags field, etc.
  - If no match: logs warning, marks as "unmatched" in KV store, sends notification asking user to create/link a ClickUp task
- On match: executes run_link_building tool with args from the ClickUp task fields
- On completion: moves .xlsx to Z:/cora-inbox/processed/ subfolder, updates KV state
- On failure: updates KV state with error, notifies via NotificationBus
File handling after pipeline:
- On success: .xlsx moved from Z:/cora-inbox/ → Z:/cora-inbox/processed/
- On failure: .xlsx stays in Z:/cora-inbox/ (KV store marks it as failed so watcher doesn't retry automatically; user can reset KV entry to retry)
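The scan-and-match step could be sketched as follows, using difflib from the standard library for the fuzzy comparison. Several details are assumptions rather than decisions from this plan: the 0.85 similarity threshold, the use of a plain dict in place of the KV store, and the choice to re-check files previously marked "unmatched" so they are picked up once a ClickUp task exists.

```python
import re
from difflib import SequenceMatcher
from pathlib import Path
from typing import Dict, Iterable, List, Optional


def normalize(text: str) -> str:
    """Lowercase and turn every run of non-alphanumerics into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def best_match(stem: str, keywords: Iterable[str],
               threshold: float = 0.85) -> Optional[str]:
    """Return the most similar keyword, or None if nothing clears the threshold."""
    target = normalize(stem)
    best_score, best_keyword = 0.0, None
    for keyword in keywords:
        score = SequenceMatcher(None, target, normalize(keyword)).ratio()
        if score > best_score:
            best_score, best_keyword = score, keyword
    return best_keyword if best_score >= threshold else None


def scan(filenames: List[str], keywords: List[str],
         kv: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each unprocessed .xlsx filename to its matching keyword (or None)."""
    results: Dict[str, Optional[str]] = {}
    for name in filenames:
        key = f"linkbuilding:watched:{name}"
        # Assumption: any state other than "unmatched" counts as processed.
        if kv.get(key, "unmatched") != "unmatched":
            continue
        match = best_match(Path(name).stem, keywords)
        if match is None:
            kv[key] = "unmatched"
        results[name] = match
    return results
```

Normalizing both sides before comparing is what lets precision-cnc-machining.xlsx match a Keyword field of "precision cnc machining" exactly, leaving the fuzzy threshold to absorb only small spelling differences.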
Also adds scan_cora_folder tool (can live in linkbuilding.py):
- Chat-invocable utility for the agent to check what's in the watch folder
- Returns list of unprocessed .xlsx files with ClickUp match status
- Internal agent tool, not a dashboard concern
7. cheddahbot/clickup.py — Add field creation method
Add create_custom_field(list_id, name, field_type, type_config=None) method that calls POST /list/{list_id}/field. Used by the setup tool to auto-create custom fields across lists.
8. cheddahbot/__main__.py — Add API endpoint
Add before Gradio mount:
    @fastapi_app.get("/api/linkbuilding/status")
    async def linkbuilding_status():
        """Return link building status for dashboard consumption."""
        # Returns:
        # {
        #   "pending_cora_runs": [
        #     {"keyword": "precision cnc machining", "url": "https://...", "client": "Chapter 2", "task_id": "abc123"},
        #     ...
        #   ],
        #   "in_progress": [...],  # Currently executing pipelines
        #   "completed": [...],    # Recently completed (last 7 days)
        #   "failed": [...]        # Failed tasks needing attention
        # }
The pending_cora_runs section is the key dashboard data: queries ClickUp for "to do" tasks with Work Category="Link Building" and LB Method="Cora Backlinks", returns each task's Keyword field and IMSURL (copiable URL) so the user can see exactly which Cora reports need to be run.
Also push link building events to NotificationBus (category="linkbuilding") at each pipeline step for future real-time dashboard support.
No other __main__.py changes needed — agent wiring is automatic from config.yaml.
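One way the endpoint's payload could be assembled is sketched below as a pure function, so it can be tested without FastAPI or ClickUp. The input record shape (status, lb_method, keyword, url, client, task_id keys) is hypothetical, as is deriving every bucket from ClickUp status alone; the plan may instead read in-progress and failed state from the KV store. The 7-day window for completed tasks is left out of this sketch.

```python
from typing import Any, Dict, List

# Assumed mapping from ClickUp status to payload bucket, based on the
# complete_status / error_status values in the skill_map entry.
BUCKET_FOR_STATUS = {
    "to do": "pending_cora_runs",
    "in progress": "in_progress",
    "complete": "completed",
    "internal review": "failed",
}


def build_status(tasks: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group Link Building task records into the dashboard payload."""
    payload: Dict[str, List[Dict[str, Any]]] = {
        "pending_cora_runs": [], "in_progress": [], "completed": [], "failed": [],
    }
    for task in tasks:
        bucket = BUCKET_FOR_STATUS.get(task.get("status", ""))
        if bucket is None:
            continue
        # Only Cora Backlinks tasks are waiting on a manual Cora run.
        if bucket == "pending_cora_runs" and task.get("lb_method") != "Cora Backlinks":
            continue
        payload[bucket].append({
            "keyword": task.get("keyword", ""),
            "url": task.get("url", ""),
            "client": task.get("client", ""),
            "task_id": task.get("task_id", ""),
        })
    return payload
```

The route handler would then only need to fetch the task records and return the result of this function.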
ClickUp Custom Fields (Auto-Created)
New custom fields to be created programmatically:
| Field | Type | Purpose |
|---|---|---|
| LB Method | Dropdown | Link building subtype. Initial option: "Cora Backlinks" |
| Keyword | Short Text | Target keyword (used for file matching) |
| CoraFile | Short Text | Path to .xlsx file (optional, set by agent after file match) |
| CustomAnchors | Short Text | Comma-separated anchor text overrides |
| BrandedPlusRatio | Short Text | Override for -bp flag (e.g., "0.7") |
| CLIFlags | Short Text | Raw additional CLI flags (e.g., "-r 5 -t 0.3") |
Fields that already exist and will be reused: Client, IMSURL, Work Category (add "Link Building" option).
Auto-creation approach
- Add create_custom_field(list_id, name, field_type, type_config=None) method to cheddahbot/clickup.py, which calls POST /list/{list_id}/field
- Add a setup_linkbuilding_fields tool (category="linkbuilding") that:
  - Gets all list IDs in the space
  - For each list, checks if fields already exist (via get_custom_fields)
  - Creates missing fields via the new API method
  - For LB Method dropdown, creates with type_config containing "Cora Backlinks" option
  - For Work Category, adds "Link Building" option if missing
- This tool runs once during initial setup, or can be re-run if new lists are added
- Also add "Link Building" as an option to the existing Work Category dropdown if not present
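The "check what exists, create only what is missing" step is what makes the setup tool safe to re-run. A minimal sketch of that check, assuming the existing field names for a list have already been fetched: the function name fields_to_create is hypothetical, the type labels are the human-readable ones from the table above rather than ClickUp API type identifiers, and the case-insensitive comparison is an assumption.

```python
from typing import Dict, Iterable, List, Tuple

# Field name -> type label, copied from the custom fields table.
REQUIRED_FIELDS: Dict[str, str] = {
    "LB Method": "Dropdown",
    "Keyword": "Short Text",
    "CoraFile": "Short Text",
    "CustomAnchors": "Short Text",
    "BrandedPlusRatio": "Short Text",
    "CLIFlags": "Short Text",
}


def fields_to_create(existing_names: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (name, type label) pairs for required fields a list lacks."""
    existing = {name.strip().lower() for name in existing_names}
    return [
        (name, label)
        for name, label in REQUIRED_FIELDS.items()
        if name.lower() not in existing
    ]
```

Running the check per list and creating only the returned pairs means a second run on an already configured list produces no API calls.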
Data Flow & Status Lifecycle
Primary Trigger: Folder Watcher (Cora Backlinks)
The folder watcher is the main trigger for Cora Backlinks. The ClickUp scheduler does NOT auto-execute these — it can't, because the .xlsx doesn't exist until the user runs Cora.
    1. ClickUp task created:
       Work Category="Link Building", LB Method="Cora Backlinks", status="to do"
       Fields filled: Client, IMSURL, Keyword, CLIFlags, BrandedPlusRatio, etc.
       → Appears on dashboard as "needs Cora run"
    2. User runs Cora manually, drops .xlsx in Z:/cora-inbox
    3. Folder watcher (_scan_watch_folder, runs every 60 min):
       → Finds precision-cnc-machining.xlsx
       → Fuzzy matches "precision cnc machining" against Keyword field on ClickUp "to do" Link Building tasks
       → Match found → extracts metadata from ClickUp task (IMSURL, CLIFlags, etc.)
       → Sets CoraFile field on the ClickUp task to the file path
       → Moves task to "in progress"
       → Posts comment: "Starting Cora Backlinks pipeline..."
    4. Pipeline runs:
       → Step 1: ingest-cora → comment: "CORA report ingested. Job file: jobs/xxx.json"
       → Step 2: generate-batch → comment: "Content generation complete. X articles across Y tiers."
    5. On success:
       → Move task to "complete"
       → Post summary comment with stats
       → Move .xlsx to Z:/cora-inbox/processed/
    6. On failure:
       → Move task to "internal review"
       → Post error comment with details
       → .xlsx stays in Z:/cora-inbox (can retry after resetting the KV entry)
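The status transitions in steps 3 to 6 can be sketched as a small runner that takes the ClickUp operations as callbacks. The names run_pipeline, set_status, and comment are illustrative, and the failure comment wording is an assumption; the real tools would post through the ClickUp client and KV store rather than plain callables.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], str]]


def run_pipeline(steps: List[Step],
                 set_status: Callable[[str], None],
                 comment: Callable[[str], None]) -> bool:
    """Run steps in order; stop and flag the task on the first failure."""
    set_status("in progress")
    comment("Starting Cora Backlinks pipeline...")
    for name, step in steps:
        try:
            message = step()  # each step returns its progress comment
        except Exception as exc:
            set_status("internal review")
            comment(f"{name} failed: {exc}")
            return False
        comment(message)
    set_status("complete")
    return True
```

Because the runner stops at the first exception, generate-batch never runs against a job file that ingest-cora failed to produce.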
Secondary Trigger: Chat
    User: "Run link building for Z:/cora-inbox/precision-cnc-machining.xlsx"
      → Chat brain calls run_cora_backlinks (or run_link_building with explicit lb_method)
      → Tool auto-looks up matching ClickUp task via Keyword field (if exists)
      → Same pipeline + ClickUp sync as above
      → If no ClickUp match: runs pipeline without ClickUp tracking, returns results to chat only
Future Trigger: ClickUp Scheduler (other LB Methods)
Future link building methods (MCP, etc.) that don't need a .xlsx CAN be auto-executed by the ClickUp scheduler. The run_link_building orchestrator checks lb_method:
- "Cora Backlinks" → requires xlsx_path, skips if empty (folder watcher handles these)
- Future methods → can execute directly from ClickUp task data
ClickUp Skill Map Note
The skill_map entry for "Link Building" exists primarily for field mapping reference (so the folder watcher and chat know which ClickUp fields map to which tool params). The ClickUp scheduler will discover these tasks but run_link_building will skip Cora Backlinks that have no xlsx_path — they're waiting for the folder watcher.
Implementation Order
- Config: Add LinkBuildingConfig to config.py, add link_building: section to config.yaml, add link_builder agent to config.yaml
- Core tools: Create cheddahbot/tools/linkbuilding.py with _run_blm_command, parsers, run_link_building orchestrator, and run_cora_backlinks pipeline
- Standalone tools: Add blm_ingest_cora and blm_generate_batch
- Tests: Create tests/test_linkbuilding.py, verify with uv run pytest tests/test_linkbuilding.py -v
- ClickUp field creation: Add create_custom_field to clickup.py, add setup_linkbuilding_fields tool
- ClickUp integration: Add skill_map entry, add ClickUp state tracking to tools
- Folder watcher: Add _folder_watch_loop to scheduler.py, add scan_cora_folder tool
- API endpoint: Add /api/linkbuilding/status to __main__.py
- Skill file: Create skills/linkbuilding.md
- ClickUp setup: Run setup_linkbuilding_fields to auto-create custom fields across all lists
- Full test run: uv run pytest -v --no-cov
Verification
- Unit tests: uv run pytest tests/test_linkbuilding.py -v (all pass with mocked subprocess)
- Full suite: uv run pytest -v --no-cov (no regressions)
- Lint: uv run ruff check . + uv run ruff format .
- Manual e2e: Drop a real .xlsx in Z:/cora-inbox, verify ingest-cora runs, job JSON created, generate-batch runs
- ClickUp e2e: Create a Link Building task in ClickUp with proper fields, wait for scheduler poll, verify execution
- Chat e2e: Ask CheddahBot to "run link building for [keyword]" via chat UI
- API check: Hit http://localhost:7860/api/linkbuilding/status and verify data returned
Key Reference Files
- cheddahbot/tools/press_release.py - Reference pattern for multi-step pipeline tool
- cheddahbot/scheduler.py:55-76 - Where to add 4th daemon thread
- cheddahbot/config.py:108-200 - load_config() pattern for new config sections
- E:/dev/Big-Link-Man/docs/CLI_COMMAND_REFERENCE.md - Full CLI reference
- E:/dev/Big-Link-Man/src/cli/commands.py - Exact output formats to parse