# CNC Swiss Screw Machining: Precision, Process, and When to Use It

CNC Swiss screw machining is a precision turning process for producing small, complex parts at tight tolerances and high volumes. This guide covers how Swiss screw machines work, what makes them different from conventional CNC turning, and how to evaluate a machining partner.

---

## What Is CNC Swiss Screw Machining and How Does It Work?

Swiss screw machining is a CNC turning process that uses a sliding headstock and guide bushing to support bar stock close to the cutting point. The result is reduced deflection, minimal vibration, and tolerances that conventional lathes struggle to achieve.

### Origins and Definition

The Swiss screw machine was developed in Switzerland in the 1800s to produce the tiny screws and pins required for watchmaking. This early form of precision metalworking used cam-driven automatic lathes — mechanical automation that could repeat the same cuts with consistent accuracy. The design became the foundation for precision small-part manufacturing and fabrication worldwide.

Today's CNC Swiss lathes add programmable multi-axis motion, live tooling, and sub-spindle capability. These Swiss lathes handle complex geometries, tight tolerances, and high production volumes that cam-driven machines could not. Modern CNC machining controls allow manufacturers to program intricate tool paths across multiple axes, producing parts that would have been impossible on earlier automatic lathe designs.

The key distinction from a conventional CNC lathe: on a Swiss lathe, the workpiece moves through a guide bushing while the tools remain in a fixed cutting zone. On a conventional lathe, the tools traverse along a stationary workpiece held by the tailstock and headstock. This difference determines how much deflection occurs during cutting.

### The Sliding Headstock and Guide Bushing

Bar stock feeds through a collet in the sliding headstock, which moves along the Z-axis to advance material into the cutting zone. A guide bushing supports the bar just 1–3mm from where the tool contacts the workpiece.

With the material held rigidly near the cutting point, there is almost no leverage for cutting forces to deflect the workpiece. Vibration is dampened and chatter is reduced, delivering tighter tolerances and better surface finish than conventional turning on the same part geometry.
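A quick way to see why support distance dominates: in the standard cantilever-beam approximation, tip deflection is F·L³/(3EI), so deflection grows with the cube of the unsupported length. The sketch below is a rough physics illustration, assuming a 6mm stainless bar and a 50 N cutting load (illustrative numbers, not machining data):

```python
import math

# Rough cantilever-beam illustration of why guide-bushing support matters.
# Tip deflection of a point-loaded cantilever: delta = F * L^3 / (3 * E * I).
# All numbers below are illustrative assumptions, not machining data.

def tip_deflection_mm(force_n, overhang_mm, diameter_mm, modulus_gpa):
    """Tip deflection (mm) of a round cantilevered bar under a point load."""
    e_mpa = modulus_gpa * 1e3              # GPa -> N/mm^2 (MPa)
    i_mm4 = math.pi * diameter_mm**4 / 64  # second moment of area, mm^4
    return force_n * overhang_mm**3 / (3 * e_mpa * i_mm4)

# A 6mm stainless bar (E ~ 193 GPa) under a 50 N cutting load:
bushing = tip_deflection_mm(50, 2, 6, 193)   # supported 2mm from the cut
chucked = tip_deflection_mm(50, 50, 6, 193)  # 50mm unsupported overhang

print(f"2mm support:   {bushing:.6f} mm")
print(f"50mm overhang: {chucked:.4f} mm")
print(f"Ratio: {chucked / bushing:,.0f}x")   # (50 / 2)**3 = 15,625
```

Moving the support point from 50mm to 2mm reduces deflection by a factor of (50/2)³, more than 15,000x. That is the leverage the guide bushing takes away from the cutting force.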
Guide bushings come in two types. Rotary guide bushings rotate with the workpiece and hold tolerances down to roughly ±0.0005". Fixed guide bushings do not rotate and are used when even tighter tolerances are required.

### Multi-Tool Simultaneous Operation

CNC Swiss screw machines can mount up to 20 tools and operate several simultaneously. A main spindle handles turning while a sub-spindle machines the back end — all in one setup. This level of automation eliminates manual handling and keeps cycle times short.

Live tooling adds milling, cross-drilling, threading, and tapping directly on the Swiss lathe. Parts that would require three or four setups across different CNC machines come off a Swiss screw machine complete, with no secondary operations needed.

---

## Benefits of CNC Swiss Screw Machining

### Precision and Production Advantages

- **Tolerances of ±0.0002"** are standard, with tighter tolerances achievable on specific features
- **Spindle speeds up to 10,000 RPM** enable efficient cutting of both metals and engineering plastics
- **Continuous bar-fed operation** — bar stock feeds automatically, parts drop off complete, minimal operator intervention
- **Reduced secondary operations** eliminate the cost of moving parts between machines
- **Automation** — bar feeders and CNC machining controls enable lights-out production, reducing labor costs on long runs

Setup is the largest cost driver. After that, per-part costs drop significantly, making Swiss screw machining cost-effective at medium to high production volumes.
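A simple amortization model makes that volume effect concrete. The setup and unit costs below are hypothetical placeholders, not supplier quotes; only the shape of the curve matters:

```python
# Hypothetical amortization model: setup cost spread across the run.
# Dollar figures are illustrative placeholders, not real quotes.

def per_part_cost(setup_cost, unit_cost, qty):
    """Total cost per part once setup is amortized across the run."""
    return setup_cost / qty + unit_cost

swiss = {"setup_cost": 1500.0, "unit_cost": 2.00}        # high setup, low unit
conventional = {"setup_cost": 300.0, "unit_cost": 4.50}  # low setup, high unit

for qty in (25, 100, 500, 5000):
    s = per_part_cost(qty=qty, **swiss)
    c = per_part_cost(qty=qty, **conventional)
    print(f"{qty:>5} pcs: Swiss ${s:6.2f}/part vs conventional ${c:6.2f}/part")

# Break-even: where the setup premium equals the per-part saving.
break_even = (swiss["setup_cost"] - conventional["setup_cost"]) / (
    conventional["unit_cost"] - swiss["unit_cost"]
)
print(f"Break-even near {break_even:.0f} parts")
```

With these placeholder numbers the curves cross near 480 parts: below that, the conventional lathe wins on cost; above it, Swiss pulls ahead and keeps gaining.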
### Materials for Swiss Screw Machining

Swiss screw machines work with a broad range of materials:

- **Stainless steel** — 303, 304, and 316 grades
- **Aluminum** — lightweight aerospace and electronics parts
- **Brass and copper** — electrical contacts, fittings, and connectors
- **Titanium** — medical implants and aerospace fasteners
- **Nickel alloys** — corrosion-resistant components for harsh environments
- **Bronze** — bushings, bearings, and wear components
- **Engineering plastics** — PEEK, Delrin, and nylon

Bar stock must be centerless-ground to ±0.0002" diametric tolerance to feed smoothly through the guide bushing. Exotic alloys like Inconel are workable but require specialized carbide tooling and experienced programming.

### Industries and Common Applications

- **Medical devices** — bone screws, dental implants, surgical instrument shafts, cannulas, and orthopedic pins. Medical applications often require biocompatible materials like titanium or surgical-grade stainless steel, plus full lot traceability.
- **Aerospace** — fasteners, sensor housings, hydraulic fittings, and electrical connectors. Aerospace machining demands tight tolerances, exotic materials, and documented quality processes.
- **Automotive** — fuel injector components, transmission pins, and valve parts produced in high volumes with consistent quality.
- **Electronics** — connector pins, contact sockets, terminal posts, and micro-components where dimensional precision directly affects electrical performance.
- **Defense** — ITAR-compliant precision components for weapons systems, communication equipment, and guidance systems.

Common machined parts include screws, pins, shafts, bushings, contacts, fittings, and cylindrical components with high length-to-diameter ratios.

---

## CNC Swiss Machining vs. Conventional CNC Turning

### Key Differences

| Factor | CNC Swiss Screw Machining | Conventional CNC Turning |
| ------ | ------------------------- | ------------------------ |
| Part diameter | Up to ~32mm (1.25") | Larger parts, no practical limit |
| Tolerances | ±0.0002" standard | ±0.001" typical |
| Complexity | Multi-axis, live tooling, sub-spindle | Typically 2-axis |
| Best volume | Medium to high | Flexible |
| L/D ratio | Excels at 10:1 or more | Limited by deflection |
| Setup cost | Higher | Lower |
| Per-part cost | Lower for small, complex parts | Lower for larger, simpler parts |
The guide bushing is the fundamental differentiator. It allows Swiss lathes to cut long, thin parts without the deflection that makes the same part impossible to hold tolerance on a conventional CNC lathe.

### When NOT to Use Swiss Screw Machining

Consider conventional CNC turning or milling when:

- **Parts exceed 32mm diameter** — larger parts need a conventional CNC lathe or mill
- **Production runs are very short** — for 10–50 pieces, a conventional CNC lathe is more economical
- **Tolerances are relaxed** — if the spec calls for ±0.005" or wider, Swiss machining is overkill
- **The geometry is not cylindrical** — prismatic parts are better suited to 3-axis or 5-axis CNC milling
- **No features benefit from simultaneous operations** — simple turned profiles cost less on a conventional lathe

---

## Quality, Certification, and Choosing a Partner

### Industry Certifications and Inspection

Quality in Swiss screw machining depends on the quality management systems behind the machines.

Key certifications:

- **ISO 9001:2015** — baseline quality management system standard
- **ISO 13485** — required for medical device component manufacturing
- **ITAR registration** — mandatory for defense-related machining
- **IATF 16949** — automotive quality standard with defect prevention requirements

Inspection methods to ask about:

- **Statistical process control (SPC)** — monitors dimensional trends during production
- **Coordinate measuring machines (CMM)** — 3D dimensional verification of finished parts
- **First article inspection (FAI)** — full dimensional report verifying the setup matches the print

Material traceability is standard in medical and aerospace work and increasingly expected across all industries.

### What to Look for in a Swiss Screw Machining Supplier

- **Machine fleet** — modern CNC Swiss lathes with multi-axis capability, live tooling, and sub-spindles
- **Relevant certifications** — ISO 9001 baseline, plus ISO 13485, ITAR, or IATF 16949 as your industry requires
- **Demonstrated tolerance capability** — sample parts or dimensional reports in the materials you plan to use
- **In-house secondary operations** — deburring, heat treating, plating, and passivation under one roof
- **Engineering support** — a good partner reviews prints and suggests design optimizations for manufacturability

---

## Get Started with CNC Swiss Screw Machining

CNC Swiss screw machining delivers precision, speed, and repeatability for small-diameter parts that demand tight tolerances. Whether you are producing medical implants, aerospace fasteners, or high-volume electronic connectors, Swiss machining is a proven process for turning complex designs into finished components. Contact us to discuss your project and request a quote.

---
<!-- FOQ SECTION START -->

## Frequently Asked Questions About CNC Swiss Screw Machining

### What Is the Difference Between Swiss Screw Machining and CNC Turning?

Swiss screw machining differs from conventional CNC turning in how the workpiece is supported during cutting. A Swiss screw machine uses a guide bushing to hold the bar stock within 1–3mm of the cutting tool, virtually eliminating deflection and enabling tolerances of ±0.0002". Conventional CNC turning clamps the workpiece without a guide bushing, which limits precision on long, slender parts and typically holds tolerances of ±0.001".

### How Tight Are Swiss Screw Machining Tolerances?

Swiss screw machining tolerances are typically ±0.0002" as a standard capability. This precision is possible because the guide bushing supports the workpiece close to the cutting tool, reducing deflection and vibration that would otherwise compromise dimensional accuracy.

### What Materials Can Be Swiss Screw Machined?

Swiss screw machines can process stainless steel, aluminum, brass, copper, titanium, nickel alloys, bronze, and engineering plastics like PEEK, Delrin, and nylon. Bar stock must be centerless-ground to ±0.0002" diametric tolerance to feed properly through the guide bushing.

### Is Swiss Screw Machining Cost-Effective for Small Production Runs?

Swiss screw machining is generally not cost-effective for very small runs due to significant setup time and tooling costs. The process becomes economical at medium to high volumes where setup cost is amortized across many parts. For runs under 50 pieces, conventional CNC turning is often more economical.

### What Industries Use CNC Swiss Screw Machining?

CNC Swiss screw machining is used extensively in medical device, aerospace, automotive, electronics, and defense manufacturing. These industries require small, complex, precision components produced at tight tolerances and in high volumes — exactly the part profile Swiss screw machines are designed to handle.

### How Does a Guide Bushing Work on a Swiss Screw Machine?

A guide bushing on a Swiss screw machine acts as a stationary support that holds the bar stock just 1–3mm from the cutting tool. As the sliding headstock feeds the workpiece through the bushing along the Z-axis, the bushing prevents the material from deflecting, enabling tighter tolerances and smoother surface finishes.

### What Part Sizes Can a Swiss Screw Machine Handle?

Swiss screw machines handle bar stock up to 32mm (1.25") in diameter. They excel at parts with high length-to-diameter ratios — 10:1 or greater — where conventional lathes would struggle with deflection. Larger parts are better suited to conventional CNC turning or milling.

### Does Swiss Screw Machining Require Secondary Operations?

Swiss screw machining often eliminates secondary operations entirely. With live tooling, sub-spindles, and multi-axis capability, a CNC Swiss machine can perform turning, milling, cross-drilling, threading, tapping, and knurling in a single setup. Parts frequently come off the machine complete.

### What Certifications Should a Swiss Screw Machining Supplier Have?

A Swiss screw machining supplier should hold ISO 9001:2015 as a baseline. Medical work requires ISO 13485, defense applications require ITAR registration, and automotive work calls for IATF 16949. Look for documented inspection processes including SPC, CMM measurement, and first article inspection.

### When Should You Choose Conventional CNC Over Swiss Machining?

Choose conventional CNC turning or milling over Swiss machining when parts exceed 32mm in diameter, production volumes are very low, tolerances are wider than ±0.005", or the geometry is primarily non-cylindrical. Conventional CNC is also better for simple turned profiles that don't benefit from simultaneous multi-tool operations.

<!-- FOQ SECTION END -->
---
# Outline: CNC Swiss Screw Machining

**Format:** Comprehensive Guide
**Target word count:** ~1,400 words (cluster target from Cora: 1,342)
**Primary keyword:** cnc swiss screw machining
**Target audience:** Engineers, procurement professionals, and manufacturing decision-makers evaluating Swiss screw machining for their parts
**Heading targets (from Cora Structure):** 1 H1, 4+ H2s, ~10 H3s

---

## H1: CNC Swiss Screw Machining: Precision, Process, and When to Use It

Brief intro: What Swiss screw machining is in one sentence, why it matters for precision small parts, and what the reader will learn.

---

## H2: What Is CNC Swiss Screw Machining and How Does It Work?

Definition + the mechanical process combined into one major section.

### H3: Origins and Definition

- Precision turning process using a sliding headstock and guide bushing
- Developed in Switzerland in the 1800s for watchmaking
- Key distinction from conventional CNC lathes: the workpiece moves, not just the tool
- Modern CNC Swiss machines: programmable, multi-axis, live tooling capable

### H3: The Sliding Headstock and Guide Bushing

- Bar stock feeds through collet in the sliding headstock
- Guide bushing supports material 1-3mm from the cutting tool
- Headstock moves along Z-axis, feeding stock into the tooling zone
- Result: minimal deflection, vibration dampened, tighter tolerances possible
- Guide bushing types: rotary (>±0.0005") vs. fixed (tighter tolerances)

### H3: Multi-Tool Simultaneous Operation

- Up to 20 tools can operate simultaneously
- Main spindle + sub-spindle: machine both ends of a part in one setup
- Live tooling: milling, cross-drilling, threading, tapping without removing the part
- Parts come off the machine complete — minimal secondary operations

~350 words

---

## H2: Benefits of CNC Swiss Screw Machining

### H3: Precision and Production Advantages

- **Precision:** ±0.0002" tolerances, up to 10,000 RPM, micron-level accuracy
- **Reduced secondary operations:** complete parts in one chucking
- **Production speed:** continuous bar-fed operation, minimal downtime
- **Material efficiency:** less waste than conventional machining
- **Cost-effective at volume:** low per-part cost once setup is complete

### H3: Materials for Swiss Screw Machining

- Metals: stainless steel, aluminum, brass, copper, bronze, titanium, nickel alloys
- Plastics: PEEK, Delrin, nylon
- Bar stock requirements: must be centerless-ground to ±0.0002" for optimal results
- Exotic alloys are workable but require specific tooling and speeds

### H3: Industries and Common Applications

- **Medical:** surgical instruments, implants, bone screws, dental components
- **Aerospace:** fasteners, connectors, sensor housings
- **Automotive:** high-volume small precision parts, fuel system components
- **Electronics:** pins, connectors, contacts, micro-components
- **Defense:** ITAR-compliant precision components
- Common part types: screws, pins, shafts, bushings, contacts, fittings

~350 words

---

## H2: CNC Swiss Machining vs. Conventional CNC Turning

### H3: Key Differences

| Factor | Swiss CNC | Conventional CNC |
| ------ | --------- | ---------------- |
| Part diameter | Up to ~32mm (1.25") | Larger parts |
| Tolerances | ±0.0002" standard | ±0.001" typical |
| Complexity | High (multi-axis, live tooling) | Moderate |
| Volume | Best at high volume | Better for short runs |
| Length-to-diameter ratio | Excels at high L/D ratios | Limited by deflection |

### H3: When NOT to Use Swiss Screw Machining

Parts larger than 32mm diameter, very short production runs where setup cost doesn't amortize, parts that don't require tight tolerances, non-cylindrical geometries better suited to 3- or 5-axis milling.

~250 words

---

## H2: Quality, Certification, and Choosing a Partner

### H3: Industry Certifications and Inspection

- ISO 9001:2015 (general quality management)
- ISO 13485 (medical device manufacturing)
- ITAR registration (defense applications)
- IATF 16949 (automotive)
- Inspection methods: SPC, CMM, optical measurement, laser micrometers
- First article inspection, in-process monitoring, material traceability

### H3: What to Look for in a Swiss Screw Machining Supplier

- Machine fleet: modern CNC Swiss machines with multi-axis capability
- Certifications relevant to your industry
- Tolerance capabilities demonstrated with similar materials
- Secondary operations available in-house
- Production volume capacity and lead times

~200 words

---

## Conclusion

Recap + CTA. ~50 words
---

## Structure Summary

| Level | Count | Cora Target (min) |
| ----- | ----- | ----------------- |
| H1 | 1 | 1 |
| H2 | 5 | 4 |
| H3 | 11 | 10 |

## Unique Angles

1. **"When NOT to use Swiss"** — honest guidance that builds trust and captures comparison traffic
2. **Quality/inspection detail** — goes beyond just listing ISO numbers
3. **Supplier selection guidance** — practical buyer help that competitors skip

---

## Fan-Out Query Headings

Separate from main content. Do NOT count against word count or heading targets.
Style as accordions, FAQs, or hidden divs.
Answer format: restate the question in the answer ("How does X work? X works by...").
Each answer: 2-3 sentences max, self-contained.

### H3: What Is the Difference Between Swiss Screw Machining and CNC Turning?

### H3: How Tight Are Swiss Screw Machining Tolerances?

### H3: What Materials Can Be Swiss Screw Machined?

### H3: Is Swiss Screw Machining Cost-Effective for Small Production Runs?

### H3: What Industries Use CNC Swiss Screw Machining?

### H3: How Does a Guide Bushing Work on a Swiss Screw Machine?

### H3: What Part Sizes Can a Swiss Screw Machine Handle?

### H3: Does Swiss Screw Machining Require Secondary Operations?

### H3: What Certifications Should a Swiss Screw Machining Supplier Have?

### H3: When Should You Choose Conventional CNC Over Swiss Machining?
---
# Research Summary: CNC Swiss Screw Machining

## Search Term
cnc swiss screw machining

## Sources Analyzed

| Source | URL | Word Count | Angle |
|--------|-----|------------|-------|
| Kerr Screw | kerrscrew.com/swiss-screw-machining-explained/ | ~1,300 | Historical context, automation evolution, applications |
| Avanti Engineering | avantiengineering.com/swiss-screw-machining-benefits-applications/ | ~900 | Benefits, applications, how it works |
| IQS Directory | iqsdirectory.com/.../swiss-screw-machining.html | ~6,500 | Deep technical guide: process, types, tools, materials, prep |
| Hogge Precision | hoggeprecision.com/benefits-of-cnc-swiss-screw-machining/ | ~800 | CNC vs automatic types, benefits, capabilities |
| Cox Manufacturing | coxmanufacturing.com/blog/what-is-swiss-screw-machining/ | ~250 | Brief intro, guide bushing emphasis |
| Nolte Precise | nolteprecise.com/cnc-swiss-screw-machining/ | ~1,100 | High-volume production focus |
| Hartford Technologies | resources.hartfordtechnologies.com/... | — | Swiss vs traditional machining comparison |
| Impro Precision | improprecision.com/introduction-swiss-screw-machining/ | — | Industry applications deep dive |

---

## Common Themes (what everyone covers)

### 1. Definition & History
Every competitor explains that Swiss screw machining originated in Switzerland in the late 1800s for watchmaking. They define it as a precision turning process using a sliding headstock and guide bushing. This is table stakes — must be covered.

### 2. How It Works (Guide Bushing + Sliding Headstock)
Core technical differentiator from conventional CNC lathes:
- Bar stock feeds through a chucking collet in the sliding headstock
- Guide bushing supports the workpiece 1-3mm from the cutting tool
- Headstock moves along Z-axis (vs. conventional lathes where the tool moves)
- Reduces deflection and vibration, enabling tighter tolerances
- Guide bushing types: synchronous rotary (for >±0.0005") and fixed (for tighter tolerances)

### 3. Precision & Tolerances
Consistently cited numbers:
- ±0.0002" to ±0.0005" tolerances standard
- Up to 10,000 RPM spindle speeds
- Bar stock must be centerless-ground to ±0.0002" diametric tolerance
- Surface finish quality superior to conventional turning

### 4. Benefits Over Conventional CNC
Every competitor lists some version of:
- Tighter tolerances (guide bushing reduces deflection)
- Reduced secondary operations (multi-spindle, live tooling)
- Higher production speed for small parts
- Lower per-part cost at volume
- Less material waste
- Simultaneous multi-tool operation (up to 20 tools at once)

### 5. Materials
Standard list: stainless steel, aluminum, brass, copper, bronze, titanium, nickel alloys, and engineering plastics (PEEK, Delrin, nylon). Exotic alloys also mentioned.

### 6. Industries & Applications
Medical (implants, surgical instruments), aerospace (fasteners, connectors), automotive (high-volume small parts), electronics (connectors, pins), defense, hydraulics, telecommunications.

### 7. CNC vs. Automatic (Cam-Driven)
Most competitors distinguish between:
- Automatic/cam-driven machines: simpler geometry, extremely high volume, lower setup flexibility
- CNC Swiss machines: complex geometry, tighter tolerances, programmable, more flexible
---

## Content Structure Patterns

**Short-form competitors** (~250-800 words): Hogge, Cox
- Definition → Benefits list → Industries → CTA
- Minimal technical depth, service-page style

**Mid-form competitors** (~900-1,400 words): Kerr Screw, Avanti, Nolte, Hartford
- Definition → How it works → Benefits → Applications → Swiss vs. conventional comparison
- Moderate technical depth, educational blog style

**Long-form competitors** (~6,500 words): IQS Directory
- Comprehensive guide with chapters: definition → process → types → tools → materials → components → benefits → preparation
- Deep technical reference, encyclopedia style

**Observation:** Most competitors are in the 800-1,400 word range. IQS is an outlier at 6,500+. There's a gap in the 2,000-3,000 word range — content that's thorough enough to be a real resource but not a textbook chapter.

---

## Gaps (what competitors miss or cover poorly)

### 1. Design for Swiss Machining
Only IQS Directory touches on preparation/design considerations. Nobody provides practical guidance for engineers on how to design parts specifically for Swiss screw machining (feature sizes, wall thickness, corner radii, tolerance callouts that are realistic).

### 2. When NOT to Use Swiss Machining
Competitors focus on benefits but rarely discuss limitations or when conventional CNC is actually better (larger parts, short runs, parts without rotational symmetry).

### 3. Cost Breakdown / Economics
Everyone says "cost-effective" but nobody provides actual cost drivers: setup costs, material costs (centerless-ground bar stock premium), tooling costs, volume thresholds where Swiss becomes economical vs. conventional CNC.

### 4. Quality & Inspection Process
Certifications get mentioned (ISO 9001, ISO 13485, ITAR) but the actual inspection process — SPC, CMM measurement, optical inspection, first article inspection — is barely explained.

### 5. Machine Selection (Brand/Model Landscape)
Brief mentions of Tsugami, Citizen, Star, Tornos — but no meaningful comparison of what machines are used or why. Buyers researching this topic often need to understand what machine capabilities their supplier should have.

### 6. Modern Capabilities Beyond Turning
Swiss machines today can do milling, drilling, cross-drilling, threading, knurling, and even gear cutting — but most competitors undersell these capabilities, making Swiss machining sound like it's only for round turned parts.

---

## Potential Unique Angles

1. **"Design for Swiss" section** — Practical engineering guidance on how to design parts that are optimized for Swiss screw machining. This is genuinely useful and nobody covers it well.
2. **Economics / When to Choose Swiss** — Honest cost analysis: volume thresholds, setup costs, when conventional CNC or multi-spindle screw machines are actually better choices. This builds trust and captures comparison-search traffic.
3. **Modern Swiss capabilities** — Position Swiss machining as more than just turning. Cover live tooling, secondary operations, and complex multi-axis work that today's CNC Swiss machines can handle.

---

## Entity Landscape (from competitor content)

Frequently mentioned entities across sources:
- **Machine components:** guide bushing, sliding headstock, spindle, collet, bar feeder, turret, live tooling
- **Materials:** stainless steel, aluminum, brass, titanium, PEEK, Delrin, copper, bronze, nickel
- **Industries:** medical devices, aerospace, automotive, electronics, defense, telecommunications
- **Processes:** turning, milling, drilling, threading, tapping, knurling, parting
- **Quality:** ISO 9001, ISO 13485, ITAR, SPC, CMM, first article inspection
- **Machine brands:** Tsugami, Citizen, Star, Tornos
- **Specifications:** tolerance (±0.0002"), RPM (10,000), bar stock diameter (up to 32mm or 1.25")
---
# Brand Voice & Tone Guidelines

Reference for maintaining consistent voice across all written content. These are defaults — override with client-specific guidelines when available.

---

## Voice Archetypes

Choose one primary archetype per brand. A secondary archetype can add nuance but should never dominate.

### Expert
- **Sounds like:** A senior practitioner sharing hard-won knowledge.
- **Characteristics:** Precise, evidence-backed, confident without arrogance. Cites data, references real-world experience, and isn't afraid to say "it depends."
- **Typical vocabulary:** "In practice," "the tradeoff is," "based on our benchmarks," "here's why this matters."
- **Risk to avoid:** Coming across as condescending or overly academic.
- **Best for:** Technical audiences, B2B SaaS, engineering blogs, whitepapers.

### Guide
- **Sounds like:** A patient teacher walking you through something step by step.
- **Characteristics:** Clear, encouraging, anticipates confusion. Breaks complex ideas into digestible pieces. Uses analogies.
- **Typical vocabulary:** "Let's start with," "think of it like," "the key thing to remember," "don't worry if this seems complex."
- **Risk to avoid:** Being patronizing or oversimplifying for an advanced audience.
- **Best for:** Tutorials, onboarding content, documentation, beginner-to-intermediate audiences.

### Innovator
- **Sounds like:** Someone who sees around corners and wants to bring you along.
- **Characteristics:** Forward-looking, curious, willing to challenge assumptions. Connects dots across domains. Thinks in systems.
- **Typical vocabulary:** "What if," "the shift we're seeing," "this changes the calculus," "the next wave."
- **Risk to avoid:** Sounding like hype or vaporware. Must ground vision in evidence.
- **Best for:** Thought leadership, industry analysis, product vision content, founder blogs.

### Friend
- **Sounds like:** A sharp colleague sharing advice over coffee.
- **Characteristics:** Warm, direct, conversational. Uses "you" and "we." Comfortable with humor when it's natural. Doesn't hide behind jargon.
- **Typical vocabulary:** "Here's the thing," "honestly," "we've all been there," "the trick is."
- **Risk to avoid:** Being too casual for high-stakes topics or enterprise audiences.
- **Best for:** Community content, newsletters, brand blogs aimed at practitioners.

### Motivator
- **Sounds like:** A coach who believes in your potential and pushes you to act.
- **Characteristics:** Energetic, action-oriented, focused on outcomes. Uses imperatives. Celebrates progress.
- **Typical vocabulary:** "Start today," "you can do this," "here's your edge," "stop waiting for perfect."
- **Risk to avoid:** Empty cheerleading. Must pair motivation with substance.
- **Best for:** Career content, productivity content, entrepreneurship, course marketing.

---

## Core Writing Principles

These apply regardless of archetype.

### 1. Clarity First
- If a sentence can be misread, rewrite it.
- Use the simplest word that conveys the precise meaning. "Use" over "utilize." "Start" over "commence."
- One idea per paragraph. One purpose per section.
- Define jargon on first use, or skip it entirely.

### 2. Customer-Centric
- Frame everything from the reader's perspective, not the company's.
- **Instead of:** "We built a new feature that enables real-time collaboration."
- **Write:** "You can now edit documents with your team in real time."
- Lead with the reader's problem or goal, not the product or solution.

### 3. Active Voice
- Active voice is the default. Passive voice is acceptable only when the actor is unknown or irrelevant.
- **Active:** "The script generates a report every morning."
- **Passive (acceptable):** "The logs are rotated every 24 hours." (The actor doesn't matter.)
- **Passive (avoid):** "A decision was made to deprecate the endpoint." (Who decided?)

### 4. Show, Don't Claim
- Replace vague claims with specific evidence.
- **Claim:** "Our platform is incredibly fast."
- **Show:** "Queries return in under 50ms at the 99th percentile."
- If you can't provide evidence, soften the language or cut the sentence.
---

## Tone Attributes

Tone shifts based on content type and audience. Use these spectrums to calibrate.

### Formality Spectrum

```
Casual -------|-------|-------|-------|------- Formal
   1       2       3       4       5
```

| Level | Description | Use When |
|-------|-------------|----------|
| 1 | Slang OK, sentence fragments, first person | Internal team comms, very informal blogs |
| 2 | Conversational, contractions, direct address | Newsletters, community posts, most blog content |
| 3 | Professional but approachable, minimal contractions | Product announcements, mid-funnel content |
| 4 | Polished, structured, no contractions | Whitepapers, enterprise case studies, executive briefs |
| 5 | Formal, third person, precise terminology | Legal, compliance, academic partnerships |

**Default for most blog/article content: Level 2-3.**

### Technical Depth Spectrum

```
General -------|-------|-------|-------|------- Deep Technical
   1       2       3       4       5
```

| Level | Description | Use When |
|-------|-------------|----------|
| 1 | No jargon, analogy-heavy, conceptual | Non-technical stakeholders, general audience |
| 2 | Light jargon (defined inline), practical focus | Business audience with some domain familiarity |
| 3 | Industry-standard terminology, code snippets OK | Practitioners who do the work daily |
| 4 | Assumes working knowledge, implementation details | Developers, engineers, technical decision-makers |
| 5 | Deep internals, performance analysis, tradeoff math | Senior engineers, architects, researchers |

**Default: Match the audience. When unsure, aim one level below what you think the audience can handle. Accessibility wins.**

---

## Language Preferences

### Use Action Verbs
Lead sentences — especially headings and CTAs — with strong verbs.

| Weak | Strong |
|------|--------|
| There is a way to improve | Improve |
| This section is a discussion of | This section covers |
| You should consider using | Use |
| It is important to note that | Note: |
| We are going to walk through | Let's walk through |

### Be Concrete and Specific
Vague language erodes trust. Replace generalities with specifics.

| Vague | Concrete |
|-------|----------|
| "significantly faster" | "3x faster" or "reduced from 12s to 2s" |
| "a large number of users" | "over 40,000 monthly active users" |
| "best-in-class" | describe the specific advantage |
| "seamless integration" | "connects via a single API call" |
| "in the near future" | "by Q2" or "in the next release" |

### Avoid These Patterns
- **Weasel words:** "very," "really," "extremely," "quite," "somewhat" — cut them or replace with data.
- **Nominalizations:** "implementation" when you mean "implement," "utilization" when you mean "use."
- **Hedge stacking:** "It might potentially be possible to perhaps consider..." — commit to a position or state the uncertainty once, clearly.
- **Buzzword chains:** "AI-powered next-gen synergistic platform" — describe what it actually does.
---

## Pre-Publication Checklist

Run through this before publishing any piece of content.

### Voice Consistency
- [ ] Does the piece sound like one person wrote it, beginning to end?
- [ ] Does it match the target voice archetype?
- [ ] Are there jarring shifts in tone between sections?
- [ ] If multiple authors contributed, has it been edited for a unified voice?

### Clarity
- [ ] Can a reader in the target audience understand every sentence on the first read?
- [ ] Is jargon defined or avoided?
- [ ] Are all acronyms expanded on first use?
- [ ] Do headings accurately describe the content beneath them?
- [ ] Is the article scannable? (subheadings every 2-4 paragraphs, short paragraphs, lists where appropriate)

### Value
- [ ] Does the introduction make clear what the reader will gain?
- [ ] Does every section earn its place? (Cut anything that doesn't serve the reader's goal.)
- [ ] Are claims supported by evidence, examples, or data?
- [ ] Is the advice actionable — can the reader do something with it today?
- [ ] Does the conclusion provide a clear next step?

### Formatting
- [ ] Title is under 70 characters and includes the core keyword or topic.
- [ ] Meta description is 140-160 characters and summarizes the value proposition (the length checks are scriptable; see the sketch after this list).
- [ ] Headings use parallel structure (all questions, all noun phrases, or all verb phrases — not mixed).
- [ ] Code blocks, tables, and images have context (a sentence before them explaining what the reader is looking at).
- [ ] Links use descriptive anchor text, not "click here."
- [ ] No walls of text — maximum 4 sentences per paragraph for web content.
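The two length targets in the Formatting list are mechanical enough to script. A minimal sketch, assuming plain title and meta-description strings (the function and argument names are illustrative, not part of any CMS or tool):

```python
# Minimal sketch: automate the title and meta-description length checks.
# Thresholds mirror the checklist above; all names here are illustrative.

def check_formatting(title, meta_description):
    """Return a list of formatting-checklist violations."""
    problems = []
    if len(title) >= 70:
        problems.append(f"Title is {len(title)} chars (target: under 70)")
    if not 140 <= len(meta_description) <= 160:
        problems.append(
            f"Meta description is {len(meta_description)} chars (target: 140-160)"
        )
    return problems

# The sample description below is deliberately short, so the check fires.
issues = check_formatting(
    title="CNC Swiss Screw Machining: Precision, Process, and When to Use It",
    meta_description=(
        "Learn how CNC Swiss screw machining works, when it beats "
        "conventional turning, and how to evaluate a supplier."
    ),
)
print("\n".join(issues) or "All formatting length checks passed")
```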
---
# Content Frameworks Reference

Quick-reference guide for structuring blog posts and articles. Use these templates as starting points, then adapt to the topic and audience.

---

## Article Templates

### How-To Guide

```
Title: How to [Achieve Specific Outcome] (in [Timeframe/Steps])

Introduction
- State the outcome the reader will achieve
- Briefly explain why this matters or who this is for
- Set expectations: what they need, how long it takes

Prerequisites / What You'll Need (optional)
- Tools, knowledge, or setup required before starting

Step 1: [Action Verb] + [Object]
- What to do and why
- Concrete details, examples, or code snippets
- Common mistake to avoid at this step

Step 2: [Action Verb] + [Object]
- (same pattern)

... (repeat for each step)

Troubleshooting / Common Issues (optional)
- Problem → Cause → Fix, in a quick table or list

Conclusion
- Recap what the reader accomplished
- Suggest a logical next step or related guide
```

**Key principle:** Each step starts with an action verb. One action per step. If a step has sub-steps, break it out.

---

### Listicle

```
Title: [Number] [Adjective] [Things] for [Audience/Goal]
Examples: "9 Underrated Tools for Frontend Performance"
          "5 Strategies That Reduced Our Build Time by 60%"

Introduction (2-3 sentences)
- Who this list is for
- What criteria you used to select items

Item 1: [Name or Short Description]
- What it is (1 sentence)
- Why it matters or when to use it (1-2 sentences)
- Concrete example, stat, or tip

Item 2: ...
(repeat)

Wrap-Up
- Quick summary of top picks or situational recommendations
- CTA: ask readers to share their own picks, or link to a deeper dive
```

**Key principle:** Each item must stand alone. Readers skim listicles — front-load the value in each entry. Order by impact (strongest first or last) or by logical progression.
---

### Comparison / Vs Article

```
Title: [Option A] vs [Option B]: [Decision Context]
Example: "Postgres vs MySQL: Which Database Fits Your SaaS in 2026?"

Introduction
- The decision the reader faces
- Who this comparison is for (skill level, use case)
- Summary verdict (give the answer up front, then prove it)

Quick Comparison Table
| Criteria        | Option A       | Option B       |
|-----------------|----------------|----------------|
| [Criterion 1]   | ...            | ...            |
| [Criterion 2]   | ...            | ...            |
| Pricing         | ...            | ...            |
| Best for        | ...            | ...            |

Section: [Criterion 1] Deep Dive
- How A handles it
- How B handles it
- Verdict for this criterion

(repeat for each major criterion)

When to Choose A
- Bullet list of scenarios, use cases, or team profiles

When to Choose B
- Same structure

Final Recommendation
- Restate the summary verdict with nuance
- Suggest next steps (trial links, related guides)
```

**Key principle:** Be opinionated. Readers come to comparison articles for a recommendation, not a feature dump. State your pick early, then support it.

---

### Case Study

```
Title: How [Company/Person] [Achieved Result] with [Method/Tool]

Snapshot (sidebar or callout box)
- Company/person profile
- Challenge in one line
- Result in one line (with numbers)
- Timeline

The Challenge
- Situation before: pain points, constraints, failed attempts
- Why existing solutions weren't working
- Stakes: what would happen if unsolved

The Approach
- What they decided to do and why
- Implementation details (tools, process, decisions)
- Obstacles encountered during execution

The Results
- Quantified outcomes (before/after metrics)
- Qualitative outcomes (team sentiment, workflow changes)
- Timeline to results

Key Takeaways
- 2-4 lessons the reader can apply to their own situation
- What the subject would do differently next time (if anything)
```

**Key principle:** Specifics beat generalities. Use real numbers, timelines, and named tools. A case study without measurable results is just a testimonial.

---

### Thought Leadership

```
Title: [Contrarian Claim] or [Reframed Problem]
Examples: "Your Microservices Migration Will Fail — Here's Why"
          "We've Been Thinking About Developer Productivity Wrong"

The Hook
- A bold claim, surprising stat, or industry assumption to challenge
- One paragraph max

The Conventional View
- What most people believe or do today
- Why it seems reasonable on the surface

The Shift
- What's changed (new data, your experience, a trend)
- Why the conventional view no longer holds
- Evidence: data, examples, analogies

The New Mental Model
- Your proposed way of thinking about this
- How it changes decisions or priorities
- 1-2 concrete examples of the new model applied

Implications
- What readers should do differently starting now
- What this means for the industry over the next 1-3 years

Close
- Restate the core insight in one sentence
- Invite discussion or point to your deeper work on this topic
```

**Key principle:** Thought leadership requires a genuine point of view. The article should change how the reader thinks, not just inform them.
---

## Persuasion Frameworks

### AIDA (Attention, Interest, Desire, Action)

Use AIDA to structure the emotional arc of an article, especially product-adjacent or tutorial content.

| Stage | Purpose | Tactics |
|-------|---------|---------|
| **Attention** | Stop the scroll. Earn the click. | Surprising stat, bold claim, relatable pain point in the title and opening line. |
| **Interest** | Convince them to keep reading. | Show you understand their situation. Introduce the core concept or framework. Use subheadings that promise value. |
| **Desire** | Make them want the outcome. | Show results: examples, screenshots, before/after. Paint a picture of life after applying the advice. |
| **Action** | Tell them what to do next. | Specific, low-friction CTA. One action, not five. "Clone the repo," "Try this query," "Read part 2." |

---

### PAS (Problem, Agitate, Solution)

Use PAS for introductions, email content, and articles addressing a known pain point.

| Stage | Purpose | Tactics |
|-------|---------|---------|
| **Problem** | Name the pain clearly. | Describe the situation in the reader's own words. Be specific — "your CI pipeline takes 40 minutes" beats "slow builds." |
| **Agitate** | Make the pain feel urgent. | Show the consequences: wasted time, lost revenue, compounding tech debt. Use "what happens if you don't fix this" framing. |
| **Solution** | Present the path forward. | Introduce your approach, tool, or framework. Transition into the body of the article. |

PAS works best in the first 3-5 paragraphs, then hand off to a structural template (How-To, Listicle, etc.) for the body.

---

## Introduction Patterns

Use one of these patterns for the opening 2-4 sentences. Match the pattern to the article type and audience.

**The Stat Drop**
Open with a surprising number, then connect it to the reader's world.
> "73% of API integrations fail in the first year — not because of bad code, but because of bad documentation."

**The Contrarian Hook**
Challenge a common belief head-on.
> "You don't need a content calendar. What you need is a content system."

**The Pain Mirror**
Describe the reader's frustration in their own words.
> "You've rewritten the onboarding flow three times this quarter. Each time, engagement drops again within a month."

**The Outcome Lead**
Start with the result, then explain how to get there.
> "Our deploy frequency went from weekly to 12x per day. Here's the infrastructure change that made it possible."

**The Story Open**
Begin with a brief, relevant anecdote (3 sentences max).
> "Last March, our team pushed a migration that broke checkout for 6 hours. The post-mortem revealed something we didn't expect."

**The Question**
Ask a question the reader is already asking themselves.
> "Why does every database migration guide assume you have zero traffic?"
---

## Conclusion Patterns

Every conclusion should do two things: (1) reinforce the core takeaway, and (2) give the reader a next step.

**The Recap + CTA**
Summarize the 2-3 key points, then give one clear action.
> "To recap: validate early, test with real data, and deploy incrementally. Ready to try it? Start with [specific first step]."

**The Implication Close**
Zoom out. Connect the article's advice to a bigger trend or outcome.
> "This isn't just about faster deploys — it's about building a team that ships with confidence."

**The Next Step Bridge**
Point to a logical follow-up resource or action.
> "Now that your monitoring is in place, the next step is setting up alerting thresholds. We cover that in [linked article]."

**The Challenge Close**
Issue a direct, friendly challenge to the reader.
> "Pick one of these patterns and apply it to your next pull request. See what changes."

**The Open Loop**
Tease upcoming content or unresolved questions to drive return visits.
> "We've covered the read path. In part 2, we'll tackle the write path — where the real complexity lives."
---
# Brand Voice & Tone Guidelines

Reference for maintaining consistent voice across all written content. These are defaults — override with client-specific guidelines when available.

---

## Voice Archetypes

Start with Expert, but also work in Guide when applicable.

### Expert
- **Sounds like:** A senior practitioner sharing hard-won knowledge.
- **Characteristics:** Precise, evidence-backed, confident without arrogance. Cites data, references real-world experience, and isn't afraid to say "it depends."
- **Typical vocabulary:** "In practice," "the tradeoff is," "based on our benchmarks," "here's why this matters."
- **Risk to avoid:** Coming across as condescending or overly academic.
- **Best for:** Technical audiences, B2B SaaS, engineering blogs, whitepapers.

### Guide
- **Sounds like:** A patient teacher walking you through something step by step.
- **Characteristics:** Clear, encouraging, anticipates confusion. Breaks complex ideas into digestible pieces. Uses analogies.
- **Typical vocabulary:** "Let's start with," "think of it like," "the key thing to remember," "don't worry if this seems complex."
- **Risk to avoid:** Being patronizing or oversimplifying for an advanced audience.
- **Best for:** Tutorials, onboarding content, documentation, beginner-to-intermediate audiences.
---
|
|
||||||
|
|
||||||
## Core Writing Principles
|
|
||||||
|
|
||||||
These apply regardless of archetype.
|
|
||||||
|
|
||||||
### 1. Clarity First
|
|
||||||
- If a sentence can be misread, rewrite it.
|
|
||||||
- Use the simplest word that conveys the precise meaning. "Use" over "utilize." "Start" over "commence."
|
|
||||||
- One idea per paragraph. One purpose per section.
|
|
||||||
- Define jargon on first use, or skip it entirely.
|
|
||||||
|
|
||||||
### 2. Customer-Centric
|
|
||||||
- Frame everything from the reader's perspective, not the company's.
|
|
||||||
- **Instead of:** "We built a new feature that enables real-time collaboration."
|
|
||||||
- **Write:** "You can now edit documents with your team in real time."
|
|
||||||
- Lead with the reader's problem or goal, not the product or solution.
|
|
||||||
|
|
||||||
### 3. Active Voice
|
|
||||||
- Active voice is the default. Passive voice is acceptable only when the actor is unknown or irrelevant.
|
|
||||||
- **Active:** "The script generates a report every morning."
|
|
||||||
- **Passive (acceptable):** "The logs are rotated every 24 hours." (The actor doesn't matter.)
|
|
||||||
- **Passive (avoid):** "A decision was made to deprecate the endpoint." (Who decided?)
|
|
||||||
|
|
||||||
### 4. Show, Don't Claim
|
|
||||||
- Replace vague claims with specific evidence.
|
|
||||||
- **Claim:** "Our platform is incredibly fast."
|
|
||||||
- **Show:** "Queries return in under 50ms at the 99th percentile."
|
|
||||||
- If you can't provide evidence, soften the language or cut the sentence.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Tone Attributes
|
|
||||||
|
|
||||||
Tone shifts based on content type and audience. Use these spectrums to calibrate.
|
|
||||||
|
|
||||||
### Formality Spectrum
|
|
||||||
|
|
||||||
```
|
|
||||||
Casual -------|-------|-------|-------|------- Formal
|
|
||||||
1 2 3 4 5
|
|
||||||
```
|
|
||||||
|
|
||||||
| Level | Description | Use When |
|
|
||||||
|-------|-------------|----------|
|
|
||||||
| 1 | Slang OK, sentence fragments, first person | Internal team comms, very informal blogs |
|
|
||||||
| 2 | Conversational, contractions, direct address | Newsletters, community posts, most blog content |
|
|
||||||
| 3 | Professional but approachable, minimal contractions | Product announcements, mid-funnel content |
|
|
||||||
| 4 | Polished, structured, no contractions | Whitepapers, enterprise case studies, executive briefs |
|
|
||||||
| 5 | Formal, third person, precise terminology | Legal, compliance, academic partnerships |
|
|
||||||
|
|
||||||
**Default for most blog/article content: Level 2-3.**
|
|
||||||
|
|
||||||
### Technical Depth Spectrum
|
|
||||||
|
|
||||||
```
|
|
||||||
General -------|-------|-------|-------|------- Deep Technical
|
|
||||||
1 2 3 4 5
|
|
||||||
```
|
|
||||||
|
|
||||||
| Level | Description | Use When |
|
|
||||||
|-------|-------------|----------|
|
|
||||||
| 1 | No jargon, analogy-heavy, conceptual | Non-technical stakeholders, general audience |
|
|
||||||
| 2 | Light jargon (defined inline), practical focus | Business audience with some domain familiarity |
|
|
||||||
| 3 | Industry-standard terminology, code snippets OK | Practitioners who do the work daily |
|
|
||||||
| 4 | Assumes working knowledge, implementation details | Developers, engineers, technical decision-makers |
|
|
||||||
| 5 | Deep internals, performance analysis, tradeoff math | Senior engineers, architects, researchers |
|
|
||||||
|
|
||||||
**Default: Match the audience. When unsure, write at the level you judge the audience can handle. Most of our content is B2B.**

---

## Language Preferences

### Use Action Verbs
Lead sentences — especially headings and CTAs — with strong verbs.

| Weak | Strong |
|------|--------|
| There is a way to improve | Improve |
| This section is a discussion of | This section covers |
| You should consider using | Use |
| It is important to note that | Note: |
| We are going to walk through | Let's walk through |

### Be Concrete and Specific
Vague language erodes trust. Replace generalities with specifics.

| Vague | Concrete |
|-------|----------|
| "significantly faster" | "3x faster" or "reduced from 12s to 2s" |
| "a large number of users" | "over 40,000 monthly active users" |
| "best-in-class" | describe the specific advantage |
| "seamless integration" | "connects via a single API call" |
| "in the near future" | "by Q2" or "in the next release" |

### Avoid These Patterns
- **Weasel words:** "very," "really," "extremely," "quite," "somewhat" — cut them or replace with data.
- **Nominalizations:** "implementation" when you mean "implement," "utilization" when you mean "use."
- **Hedge stacking:** "It might potentially be possible to perhaps consider..." — commit to a position or state the uncertainty once, clearly.
- **Buzzword chains:** "AI-powered next-gen synergistic platform" — describe what it actually does.

---

## Pre-Publication Checklist

Run through this before publishing any piece of content.

### Voice Consistency
- [ ] Does the piece sound like one person wrote it, beginning to end?
- [ ] Does it match the target voice archetype?
- [ ] Is the tone free of jarring shifts between sections?

### Clarity
- [ ] Can a reader in the target audience understand every sentence on the first read?
- [ ] Is jargon defined or avoided?
- [ ] Are all acronyms expanded on first use?
- [ ] Do headings accurately describe the content beneath them?
- [ ] Is the article scannable? (subheadings every 2-4 paragraphs, short paragraphs, lists where appropriate)

### Value
- [ ] Does the introduction make clear what the reader will gain?
- [ ] Does every section earn its place? (Cut anything that doesn't serve the reader's goal.)
- [ ] Are claims supported by evidence, examples, or data?
- [ ] Is the advice actionable — can the reader do something with it today?
- [ ] Does the conclusion provide a clear next step?

### Formatting
- [ ] Title includes the core keyword or topic and at least 2 closely related keywords/topics.
- [ ] Meta description summarizes the value proposition.
- [ ] Code blocks, tables, and images have context (a sentence before them explaining what the reader is looking at).
- [ ] Links use descriptive anchor text, not "click here."
- [ ] No walls of text — 2 to 5 sentences per paragraph for web content.

@ -1,292 +0,0 @@
"""
Competitor Content Scraper

Fetches web pages and extracts clean text content for analysis.
Used as a utility when the user provides a list of URLs to examine.

Usage:
    uv run --with requests,beautifulsoup4 python competitor_scraper.py URL1 URL2 ...
        [--output-dir ./working/competitor_content/]
        [--format json|text]
"""

import argparse
import json
import re
import sys
import time
from pathlib import Path
from urllib.parse import urlparse

try:
    import requests
    from bs4 import BeautifulSoup
except ImportError:
    print(
        "Error: requests and beautifulsoup4 are required.\n"
        "Install with: uv add requests beautifulsoup4",
        file=sys.stderr,
    )
    sys.exit(1)


UNWANTED_TAGS = [
    "nav", "footer", "header", "aside", "script", "style", "noscript",
    "iframe", "form", "button", "svg", "img", "video", "audio",
]

UNWANTED_CLASSES = [
    "nav", "navbar", "navigation", "menu", "sidebar", "footer", "header",
    "breadcrumb", "cookie", "popup", "modal", "advertisement", "ad-",
    "social", "share", "comment", "related-posts",
]

DEFAULT_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
}


class CompetitorScraper:
    """Fetches and cleans web page content for competitor analysis."""

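    # Illustrative usage sketch (the URL and output dir are placeholders):
    #   scraper = CompetitorScraper(timeout=10, delay=2.0)
    #   results = scraper.scrape_urls(["https://example.com/blog/post"])
    #   scraper.save_results(results, "./working/competitor_content")
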
    def __init__(self, timeout: int = 15, delay: float = 1.0):
        """
        Args:
            timeout: Request timeout in seconds.
            delay: Delay between requests in seconds (rate limiting).
        """
        self.timeout = timeout
        self.delay = delay
        self.session = requests.Session()
        self.session.headers.update(DEFAULT_HEADERS)

    def scrape_url(self, url: str) -> dict:
        """Scrape a single URL and extract clean content.

        Returns:
            Dict with: url, host, title, meta_description, headings, text, word_count, error
        """
        result = {
            "url": url,
            "host": urlparse(url).netloc,
            "title": "",
            "meta_description": "",
            "headings": [],
            "text": "",
            "word_count": 0,
            "error": None,
        }

        try:
            response = self.session.get(url, timeout=self.timeout)
            response.raise_for_status()
            response.encoding = response.apparent_encoding or "utf-8"
            html = response.text
        except requests.RequestException as e:
            result["error"] = str(e)
            return result

        soup = BeautifulSoup(html, "html.parser")

        # Extract title
        title_tag = soup.find("title")
        if title_tag:
            result["title"] = title_tag.get_text(strip=True)

        # Extract meta description
        meta_desc = soup.find("meta", attrs={"name": "description"})
        if meta_desc and meta_desc.get("content"):
            result["meta_description"] = meta_desc["content"].strip()

        # Extract headings before cleaning
        result["headings"] = self._extract_headings(soup)

        # Clean the HTML and extract main text
        result["text"] = self._extract_text(soup)
        result["word_count"] = len(result["text"].split())

        return result

    def scrape_urls(self, urls: list[str]) -> list[dict]:
        """Scrape multiple URLs with rate limiting.

        Args:
            urls: List of URLs to scrape.

        Returns:
            List of result dicts from scrape_url.
        """
        results = []
        for i, url in enumerate(urls):
            if i > 0:
                time.sleep(self.delay)

            print(f"  Scraping [{i + 1}/{len(urls)}]: {url}", file=sys.stderr)
            result = self.scrape_url(url)

            if result["error"]:
                print(f"    Error: {result['error']}", file=sys.stderr)
            else:
                print(f"    OK: {result['word_count']} words", file=sys.stderr)

            results.append(result)

        return results

    def save_results(self, results: list[dict], output_dir: str) -> list[str]:
        """Save scraped results as individual text files.

        Args:
            results: List of result dicts from scrape_urls.
            output_dir: Directory to write files to.

        Returns:
            List of file paths written.
        """
        out_path = Path(output_dir)
        out_path.mkdir(parents=True, exist_ok=True)

        saved = []
        for result in results:
            if result["error"] or not result["text"]:
                continue

            # Create filename from host
            host = result["host"].replace("www.", "")
            safe_name = re.sub(r'[^\w\-.]', '_', host)
            filepath = out_path / f"{safe_name}.txt"

            content = self._format_output(result)
            filepath.write_text(content, encoding="utf-8")
            saved.append(str(filepath))

        return saved

    def _extract_headings(self, soup: BeautifulSoup) -> list[dict]:
        """Extract all headings (h1-h6) with their level and text."""
        headings = []
        for tag in soup.find_all(re.compile(r'^h[1-6]$')):
            level = int(tag.name[1])
            text = tag.get_text(strip=True)
            if text:
                headings.append({"level": level, "text": text})
        return headings

    def _extract_text(self, soup: BeautifulSoup) -> str:
        """Extract clean body text from HTML, stripping navigation and boilerplate."""
        # Remove unwanted tags
        for tag_name in UNWANTED_TAGS:
            for tag in soup.find_all(tag_name):
                tag.decompose()

        # Remove elements with unwanted class names
        for element in list(soup.find_all(True)):
            # Skip tags already destroyed by an earlier decompose() call
            if getattr(element, "attrs", None) is None:
                continue
            classes = element.get("class", [])
            if isinstance(classes, list):
                class_str = " ".join(classes).lower()
            else:
                class_str = str(classes).lower()

            el_id = str(element.get("id", "")).lower()

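            # Substring matching is deliberately aggressive: a class like
            # "subheader" or "sharebar" matches too. For rough competitor
            # analysis, losing a little body text beats keeping nav chrome.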
            for pattern in UNWANTED_CLASSES:
                if pattern in class_str or pattern in el_id:
                    element.decompose()
                    break

        # Try to find main content area
        main_content = (
            soup.find("main")
            or soup.find("article")
            or soup.find("div", {"role": "main"})
            or soup.find("div", class_=re.compile(r'content|article|post|entry', re.I))
            or soup.body
            or soup
        )

        # Extract text with some structure preserved
        text = main_content.get_text(separator="\n", strip=True)

        # Clean up excessive whitespace
        lines = []
        for line in text.splitlines():
            line = line.strip()
            if line:
                lines.append(line)

        return "\n".join(lines)

    def _format_output(self, result: dict) -> str:
        """Format a single result as a readable text file."""
        lines = [
            f"URL: {result['url']}",
            f"Title: {result['title']}",
            f"Meta Description: {result['meta_description']}",
            f"Word Count: {result['word_count']}",
            "",
            "--- HEADINGS ---",
        ]

        for h in result["headings"]:
            indent = "  " * (h["level"] - 1)
            lines.append(f"{indent}H{h['level']}: {h['text']}")

        lines.extend(["", "--- CONTENT ---", "", result["text"]])

        return "\n".join(lines)


def main():
    parser = argparse.ArgumentParser(description="Scrape competitor web pages for content analysis")
    parser.add_argument("urls", nargs="+", help="URLs to scrape")
    parser.add_argument(
        "--output-dir",
        default="./working/competitor_content",
        help="Directory to save scraped content (default: ./working/competitor_content/)",
    )
    parser.add_argument(
        "--format",
        choices=["json", "text"],
        default="text",
        help="Output format for stdout (default: text)",
    )
    parser.add_argument(
        "--timeout",
        type=int,
        default=15,
        help="Request timeout in seconds (default: 15)",
    )
    parser.add_argument(
        "--delay",
        type=float,
        default=1.0,
        help="Delay between requests in seconds (default: 1.0)",
    )
    args = parser.parse_args()

    scraper = CompetitorScraper(timeout=args.timeout, delay=args.delay)
    results = scraper.scrape_urls(args.urls)

    # Save files
    saved = scraper.save_results(results, args.output_dir)
    print(f"\nSaved {len(saved)} files to {args.output_dir}", file=sys.stderr)

    # Output to stdout
    successful = [r for r in results if not r["error"]]
    if args.format == "json":
        print(json.dumps(successful, indent=2))
    else:
        for r in successful:
            print(scraper._format_output(r))
            print("\n" + "=" * 80 + "\n")


if __name__ == "__main__":
    main()

@ -1,984 +0,0 @@
"""
Cora SEO Report Parser

Reads a Cora XLSX file and extracts structured data from relevant sheets.
Used as a foundation module by entity_optimizer, lsi_optimizer, and seo_optimizer.

Usage:
    uv run --with openpyxl python cora_parser.py <xlsx_path> [--sheet SHEET] [--format FORMAT]

Options:
    --sheet   Which data to extract: entities, lsi, variations, results, tunings,
              structure, densities, targets, wordcount, summary, all (default: summary)
    --format  Output format: json, text (default: text)
"""

import argparse
import json
import math
import re
import sys
from pathlib import Path

try:
    import openpyxl
except ImportError:
    print("Error: openpyxl is required. Install with: uv add openpyxl", file=sys.stderr)
    sys.exit(1)


# =============================================================================
# Optimization Rules
#
# Hard-wired overrides that apply regardless of what Cora data says.
# These encode expert SEO knowledge and practical constraints.
# =============================================================================

OPTIMIZATION_RULES = {
    # Heading rules
    "h1_max": 1,  # Never more than 1 H1
    "h1_min": 1,  # Always have exactly 1 H1
    "optimize_headings": ["h1", "h2", "h3"],  # Primary optimization targets
    "low_priority_headings": ["h4"],  # Only add if most competitors have them
    "ignore_headings": ["h5", "h6"],  # Skip entirely

    # Keyword density
    "exact_match_density_min": 0.02,  # 2% minimum for exact match keyword
    "no_keyword_stuffing_limit": True,  # Do NOT flag for keyword stuffing
    # Variations capture exact match, so hitting variation density covers it

    # Word count strategy
    "word_count_strategy": "cluster",  # "cluster" = nearest competitive cluster, not raw average
    "word_count_acceptable_max": 1500,  # Up to 1500 is always acceptable even if target is lower

    # Density awareness
    "density_interdependent": True,  # Adding content changes all density calculations

    # Entity / LSI filtering
    "exclude_competitor_entities": True,  # Never use competitor company names as entities or LSI
    "exclude_measurement_entities": True,  # Ignore measurements (dimensions, tolerances) as entities
    "allow_organization_entities": True,  # Organizations like ISO, ANSI, etc. are OK
    "never_mention_competitors": True,  # Never mention competitors by name in content

    # Entity correlation threshold
    # Best of Both = lower of Spearman's or Pearson's correlation.
    # Measures correlation to ranking position (1=top, 100=bottom), so negative = better ranking.
    # Only include entities with Best of Both <= this value.
    # Set to None to disable filtering.
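    # Example: Spearman -0.35 and Pearson -0.12 give Best of Both = -0.35,
    # which passes a -0.19 threshold (more negative = stronger signal).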
"entity_correlation_threshold": -0.19,
|
|
||||||
}
|
|
||||||
|
|
||||||
|
|
||||||
class CoraReport:
|
|
||||||
"""Parses a Cora SEO XLSX report and provides structured access to its data."""
|
|
||||||
|
|
||||||
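    # Illustrative usage sketch (the path is hypothetical):
    #   report = CoraReport("./working/cora_report.xlsx")
    #   summary = report.get_summary()
    #   print(summary["search_term"], summary["word_count_cluster_target"])
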
    def __init__(self, xlsx_path: str):
        self.path = Path(xlsx_path)
        if not self.path.exists():
            raise FileNotFoundError(f"XLSX file not found: {xlsx_path}")
        self.wb = openpyxl.load_workbook(str(self.path), data_only=True)
        self._site_domain = None  # Cached after first detection

    # -------------------------------------------------------------------------
    # Core metadata
    # -------------------------------------------------------------------------

    def get_sheet_names(self) -> list[str]:
        return self.wb.sheetnames

    def get_search_term(self) -> str:
        """Extract the target keyword from the report."""
        for sheet_name in ["Basic Tunings", "Strategic Overview", "Structure"]:
            if sheet_name not in self.wb.sheetnames:
                continue
            ws = self.wb[sheet_name]
            for row in ws.iter_rows(min_row=1, max_row=10, values_only=True):
                if row and row[0] == "Search Terms" and len(row) > 1 and row[1]:
                    return str(row[1])
        return ""

    def get_variations_list(self) -> list[str]:
        """Extract the keyword variations list from Strategic Overview B10.

        These are pipe-delimited inside curly braces:
            {cnc screw|cnc screw machining|cnc swiss|...}
        """
        if "Strategic Overview" not in self.wb.sheetnames:
            return []

        ws = self.wb["Strategic Overview"]
        rows = list(ws.iter_rows(min_row=1, max_row=12, values_only=True))

        for row in rows:
            if row and row[0] == "Keywords" and len(row) > 1 and row[1]:
                raw = str(row[1]).strip()
                # Remove curly braces and split on pipe
                raw = raw.strip("{}")
                return [v.strip() for v in raw.split("|") if v.strip()]
        return []

    def get_site_domain(self) -> str:
        """Detect the user's site domain from the report.

        Looks for the domain in the Entities sheet header (column with a .com/.net etc.
        that isn't a standard Cora column) or the site column in other sheets.
        """
        if self._site_domain:
            return self._site_domain

        # Try Entities sheet first
        if "Entities" in self.wb.sheetnames:
            ws = self.wb["Entities"]
            rows = list(ws.iter_rows(min_row=1, max_row=5, values_only=True))
            for row in rows:
                if row and row[0] == "Entity":
                    for h in row:
                        if h and isinstance(h, str):
                            h = h.strip()
                            if re.match(r'^[a-zA-Z0-9-]+\.[a-zA-Z]{2,}$', h):
                                self._site_domain = h
                                return h

        # Try LSI Keywords sheet — header like "#40.7 hoggeprecision.com"
        if "LSI Keywords" in self.wb.sheetnames:
            ws = self.wb["LSI Keywords"]
            rows = list(ws.iter_rows(min_row=1, max_row=10, values_only=True))
            for row in rows:
                if row and row[0] == "LSI Keyword":
                    for h in row:
                        if h and isinstance(h, str):
                            match = re.search(r'([a-zA-Z0-9-]+\.[a-zA-Z]{2,})', h.strip())
                            if match:
                                self._site_domain = match.group(1)
                                return self._site_domain

        return ""

    # -------------------------------------------------------------------------
    # Entities
    # -------------------------------------------------------------------------

    def get_entities(self) -> list[dict]:
        """Extract entities from the Entities sheet.

        Returns list of dicts with: name, freebase_id, wikidata_id, wiki_link,
        relevance, confidence, type, correlation, current_count, max_count, deficit
        """
        if "Entities" not in self.wb.sheetnames:
            return []

        ws = self.wb["Entities"]
        rows = list(ws.iter_rows(values_only=True))

        # Find header row containing "Entity", "Freebase ID", etc.
        header_idx = None
        for i, row in enumerate(rows):
            if row and row[0] == "Entity" and len(row) > 1 and row[1] == "Freebase ID":
                header_idx = i
                break

        if header_idx is None:
            return []

        headers = rows[header_idx]
        col_map = {str(h).strip(): j for j, h in enumerate(headers) if h}

        # Find the site-specific column (domain name like "hoggeprecision.com")
        site_col_idx = None
        site_domain = self.get_site_domain()
        if site_domain:
            site_col_idx = col_map.get(site_domain)

        entities = []
        for row in rows[header_idx + 1:]:
            if not row or not row[0]:
                continue

            name = str(row[0]).strip()
            if not name:
                continue

            # Skip rows that look like metadata (e.g., "critical values: ...")
            if name.startswith("critical") or name.startswith("http"):
                continue

            correlation = _safe_float(row, col_map.get("Best of Both"))

            # Filter by Best of Both correlation threshold.
            # Lower (more negative) = stronger ranking signal (correlates with
            # position 1 vs 100). Only keep entities at or below the threshold.
            threshold = OPTIMIZATION_RULES.get("entity_correlation_threshold")
            if threshold is not None and (correlation is None or correlation > threshold):
                continue

            entity = {
                "name": name,
                "freebase_id": _safe_str(row, col_map.get("Freebase ID")),
                "wikidata_id": _safe_str(row, col_map.get("Wikidata ID")),
                "wiki_link": _safe_str(row, col_map.get("Wiki Link")),
                "relevance": _safe_float(row, col_map.get("Relevance")),
                "confidence": _safe_float(row, col_map.get("Confidence")),
                "type": _safe_str(row, col_map.get("Type")),
                "correlation": correlation,
                "current_count": _safe_int(row, site_col_idx),
                "max_count": _safe_int(row, col_map.get("Max")),
                "deficit": _safe_int(row, col_map.get("Deficit")),
            }
            entities.append(entity)

        return entities

    # -------------------------------------------------------------------------
    # LSI Keywords
    # -------------------------------------------------------------------------

    def get_lsi_keywords(self) -> list[dict]:
        """Extract LSI keywords from the LSI Keywords sheet.

        Returns list of dicts with: keyword, spearmans, pearsons, best_of_both,
        pages, max, avg, current_count, deficit
        """
        if "LSI Keywords" not in self.wb.sheetnames:
            return []

        ws = self.wb["LSI Keywords"]
        rows = list(ws.iter_rows(values_only=True))

        # Find header row containing "LSI Keyword", "Spearmans", etc.
        header_idx = None
        for i, row in enumerate(rows):
            if row and row[0] == "LSI Keyword":
                header_idx = i
                break

        if header_idx is None:
            return []

        headers = rows[header_idx]
        col_map = {str(h).strip(): j for j, h in enumerate(headers) if h}

        # Find site column — pattern like "#40.7 hoggeprecision.com"
        site_col_idx = None
        site_domain = self.get_site_domain()
        if site_domain:
            for j, h in enumerate(headers):
                if h and isinstance(h, str) and site_domain in h:
                    site_col_idx = j
                    break
        if site_col_idx is None:
            site_col_idx = _find_site_col_idx(headers)

        lsi_keywords = []
        for row in rows[header_idx + 1:]:
            if not row or not row[0]:
                continue

            keyword = str(row[0]).strip()
            if not keyword:
                continue

            lsi = {
                "keyword": keyword,
                "spearmans": _safe_float(row, col_map.get("Spearmans")),
                "pearsons": _safe_float(row, col_map.get("Pearsons")),
                "best_of_both": _safe_float(row, col_map.get("Best of Both")),
                "pages": _safe_int(row, col_map.get("Pages")),
                "max": _safe_int(row, col_map.get("Max")),
                "avg": _safe_float(row, col_map.get("Avg")),
                "current_count": _safe_int(row, site_col_idx),
                "deficit": _safe_float(row, col_map.get("Deficit")),
            }
            lsi_keywords.append(lsi)

        return lsi_keywords

    # -------------------------------------------------------------------------
    # Keyword Variations
    # -------------------------------------------------------------------------

    def get_keyword_variations(self) -> list[dict]:
        """Extract keyword variation counts from the Variations sheet.

        Returns list of dicts with: variation, page1_max, page1_avg
        """
        if "Variations" not in self.wb.sheetnames:
            return []

        ws = self.wb["Variations"]
        rows = list(ws.iter_rows(values_only=True))

        if not rows or len(rows) < 3:
            return []

        header_row = rows[0]

        # Find where variation columns start (after "# used" column)
        var_start = 3  # default
        for j, h in enumerate(header_row):
            if h and str(h).strip() == "# used":
                var_start = j + 1
                break

        max_row = rows[1] if len(rows) > 1 else None
        avg_row = rows[2] if len(rows) > 2 else None

        variations = []
        for j in range(var_start, len(header_row)):
            name = header_row[j]
            if not name:
                continue

            variation = {
                "variation": str(name).strip(),
                "page1_max": _safe_int(max_row, j) if max_row else 0,
                "page1_avg": _safe_int(avg_row, j) if avg_row else 0,
            }
            variations.append(variation)

        return variations

    # -------------------------------------------------------------------------
    # Structure Targets (per-element targets from Structure sheet)
    # -------------------------------------------------------------------------

    def get_structure_targets(self) -> dict:
        """Extract per-element optimization targets from the Structure sheet.

        Returns a dict keyed by element type with sub-targets:
            {
                "title_tag": {"exact_match": 0.2, "variations": 1.3, "entities": 5.8, "lsi_words": 10.7},
                "meta_description": {...},
                "all_h_tags": {"count": 20.7, "exact_match": 0.4, "variations": 5.7, "entities": 45.8, "lsi_words": 77.4},
                "h1": {"count": 1.1, "exact_match": 0.1, "variations": 1, "entities": 3.8, "lsi_words": 7.3},
                "h2": {...},
                "h3": {...},
                "h4": {...},
            }
        Page 1 Average values are in column D (index 3).
        """
        if "Structure" not in self.wb.sheetnames:
            return {}

        ws = self.wb["Structure"]
        rows = list(ws.iter_rows(values_only=True))

        # Find the header row with "Factor Name", "Page 1 Avg" etc.
        header_idx = None
        for i, row in enumerate(rows):
            if row and len(row) > 3:
                if row[2] == "Factor Name" or (row[1] == "Factor ID" and row[2] == "Factor Name"):
                    header_idx = i
                    break
                # Also check for the combined "Best of Both Correlation" header
                if row[0] and "Best of Both" in str(row[0]):
                    header_idx = i
                    break

        if header_idx is None:
            return {}

        # Parse factor rows into sections
        # Section headers: "TITLE TAG", "META DESCRIPTION", "TOTAL FOR ALL H TAGS",
        # "H1 Data", "H2 Data", "H3 Data", "H4 Data", "H5 Data", "H6 Data"
        section_map = {
            "TITLE TAG": "title_tag",
            "META DESCRIPTION": "meta_description",
            "TOTAL FOR ALL H TAGS": "all_h_tags",
            "H1 Data": "h1",
            "H2 Data": "h2",
            "H3 Data": "h3",
            "H4 Data": "h4",
        }

        # Factor name patterns to field names
        factor_patterns = {
            "Number of": "count",
            "Exact Match": "exact_match",
            "Variation": "variations",
            "Entities": "entities",
            "LSI": "lsi_words",
            "Search Term": "search_terms",
            "Keywords": "keywords",
        }

        targets = {}
        current_section = None

        for row in rows[header_idx + 1:]:
            if not row or len(row) < 4:
                continue

            factor_name = _safe_str(row, 2)

            # Check if this is a section header
            if factor_name in section_map:
                current_section = section_map[factor_name]
                targets[current_section] = {}
                continue

            # Skip sections we don't care about (H5, H6)
            if factor_name in ("H5 Data", "H6 Data"):
                current_section = None
                continue

            if current_section is None:
                continue

            # Get the Page 1 Average (column D, index 3)
            avg_val = _safe_float(row, 3)
            if avg_val is None:
                continue

            # Map factor name to field
            field_name = None
            for pattern, field in factor_patterns.items():
                if pattern.lower() in factor_name.lower():
                    field_name = field
                    break

            if field_name and current_section:
                # Also grab correlation from column A
                correlation = _safe_float(row, 0)

                # Outlier detection: check if one of the top 10 results
                # contributes >50% of the sum. If so, exclude it and
                # recompute the average — that outlier is skewing the target.
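                # e.g. top10 = [40, 3, 2, 3, 2, 0, 0, 0, 0, 0]: 40 is >50% of the
                # sum (50), so the average is recomputed over the other values.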
                top10 = [_safe_float(row, j) or 0 for j in range(4, 14)]
                top10_sum = sum(top10)
                adjusted_avg = avg_val
                outlier_detected = False

                if top10_sum > 0:
                    max_val = max(top10)
                    if max_val > top10_sum * 0.5 and avg_val > 1:
                        # One result is >50% of the total — outlier.
                        # Skip adjustment when avg <= 1: a single "1" among
                        # zeros triggers the rule but the target is already
                        # small enough that adjustment would zero it out.
                        # Remove exactly one occurrence of max_val, even when
                        # the same value appears more than once in the top 10.
                        remaining = top10[:]
                        remaining.remove(max_val)
                        if remaining:
                            adjusted_avg = sum(remaining) / len(remaining)
                            outlier_detected = True

                target_val = math.ceil(adjusted_avg)

                entry = {
                    "avg": avg_val,
                    "target": target_val,
                    "correlation": correlation,
                }
                if outlier_detected:
                    entry["outlier_adjusted"] = True
                    entry["original_target"] = math.ceil(avg_val)

                targets[current_section][field_name] = entry

        return targets

    # -------------------------------------------------------------------------
    # Density Targets (from Strategic Overview rows 46-48)
    # -------------------------------------------------------------------------

    def get_density_targets(self) -> dict:
        """Extract density targets from Strategic Overview rows 46-48.

        Row 46: Variation density
        Row 47: Entity density
        Row 48: LSI density

        Column D (index 3) = Page 1 Average.
        Returns per-result values so we can show distribution.
        """
        if "Strategic Overview" not in self.wb.sheetnames:
            return {}

        ws = self.wb["Strategic Overview"]
        rows = list(ws.iter_rows(values_only=True))

        # Find the density rows — they're the last 3 non-empty rows in the data
        # section, near the row 46-48 area, identified by having floats in col D.
        # Approach: find the row with "Relevant Density" and the 3 rows after the gap.
        density_area_start = None
        for i, row in enumerate(rows):
            if row and len(row) > 2 and row[2] == "Relevant Density":
                # Density target rows are a few rows below this
                density_area_start = i
                break

        if density_area_start is None:
            return {}

        # The 3 density rows come after a gap. They have NO values in cols A, B, C —
        # only numeric values from col D onward. Row 44 (which has a correlation in
        # col A) is a count row, not a density row, so we skip it.
        density_rows = []
        for i in range(density_area_start + 1, min(density_area_start + 10, len(rows))):
            row = rows[i]
            if not row:
                continue
            col_a = row[0] if len(row) > 0 else None
            col_b = row[1] if len(row) > 1 else None
            col_c = row[2] if len(row) > 2 else None
            col_d = row[3] if len(row) > 3 else None
            # Density rows have None in A, B, C and a float in D
            if col_a is None and col_b is None and col_c is None and col_d is not None:
                try:
                    float(col_d)
                    density_rows.append(row)
                except (ValueError, TypeError):
                    pass

        # Get result domains from row 22 area for the site column
        result_start_col = 4  # Results start at col E (index 4)

        result = {}
        labels = ["variation_density", "entity_density", "lsi_density"]

        for idx, label in enumerate(labels):
            if idx >= len(density_rows):
                break
            row = density_rows[idx]
            avg = _safe_float(row, 3)
            # Collect per-competitor values
            competitor_vals = []
            for j in range(result_start_col, min(result_start_col + 10, len(row))):
                v = _safe_float(row, j)
                if v is not None:
                    competitor_vals.append(v)

            result[label] = {
                "avg": avg,
                "avg_pct": f"{avg * 100:.2f}%" if avg else "N/A",
                "competitor_values": competitor_vals,
            }

        return result

    # -------------------------------------------------------------------------
    # Content Targets (word count, distinct entities, etc.)
    # -------------------------------------------------------------------------

    def get_content_targets(self) -> dict:
        """Extract key content-level targets from Strategic Overview.

        Includes: word count distribution, distinct entities target, variations in HTML, etc.
        """
        if "Strategic Overview" not in self.wb.sheetnames:
            return {}

        ws = self.wb["Strategic Overview"]
        rows = list(ws.iter_rows(values_only=True))

        targets = {}
        result_start_col = 4

        for i, row in enumerate(rows):
            if not row or len(row) < 4:
                continue

            factor_name = _safe_str(row, 2)
            factor_id = _safe_str(row, 1)
            correlation = _safe_float(row, 0)
            avg = _safe_float(row, 3)

            if not factor_name or avg is None:
                continue

            # Key factors we care about
            if factor_name == "Number of Distinct Entities Used":
                competitor_vals = []
                for j in range(result_start_col, min(result_start_col + 10, len(row))):
                    v = _safe_float(row, j)
                    if v is not None:
                        competitor_vals.append(int(v))
                targets["distinct_entities"] = {
                    "factor_id": factor_id,
                    "avg": avg,
                    "target": math.ceil(avg),
                    "correlation": correlation,
                    "competitor_values": competitor_vals,
                }

            elif factor_name == "Variations in HTML Tags":
                targets["variations_in_html"] = {
                    "factor_id": factor_id,
                    "avg": avg,
                    "target": math.ceil(avg),
                    "correlation": correlation,
                }

            elif factor_name == "Entities in the HTML Tag":
                targets["entities_in_html"] = {
                    "factor_id": factor_id,
                    "avg": avg,
                    "target": math.ceil(avg),
                    "correlation": correlation,
                }

        return targets

    def get_word_count_distribution(self) -> dict:
        """Get word count data for competitive cluster analysis.

        Returns the clean word count for each competitor from the Keywords sheet,
        sorted ascending, plus the Page 1 Average and suggested cluster target.
        """
        if "Keywords" not in self.wb.sheetnames:
            return {}

        ws = self.wb["Keywords"]
        rows = list(ws.iter_rows(values_only=True))

        if not rows:
            return {}

        headers = rows[0]
        col_map = {str(h).strip(): j for j, h in enumerate(headers) if h}

        host_idx = col_map.get("Host")
        clean_wc_idx = col_map.get("Clean Word Count")

        if host_idx is None or clean_wc_idx is None:
            return {}

        # Collect word counts for page 1 results (top 10)
        competitors = []
        for row in rows[1:11]:
            if not row or not row[host_idx]:
                continue
            wc = _safe_int(row, clean_wc_idx)
            if wc and wc > 0:
                competitors.append({
                    "host": str(row[host_idx]),
                    "clean_word_count": wc,
                })

        if not competitors:
            return {}

        # Sort by word count
        competitors.sort(key=lambda x: x["clean_word_count"])
        counts = [c["clean_word_count"] for c in competitors]

        # Calculate cluster target
        avg = sum(counts) / len(counts)
        median = counts[len(counts) // 2]
        cluster_target = _find_cluster_target(counts)

        return {
            "competitors": competitors,
            "counts_sorted": counts,
            "average": round(avg),
            "median": median,
            "cluster_target": cluster_target,
            "min": counts[0],
            "max": counts[-1],
        }

    # -------------------------------------------------------------------------
    # Basic Tunings
    # -------------------------------------------------------------------------

    def get_basic_tunings(self) -> list[dict]:
        """Extract on-page tuning factors from the Basic Tunings sheet."""
        if "Basic Tunings" not in self.wb.sheetnames:
            return []

        ws = self.wb["Basic Tunings"]
        rows = list(ws.iter_rows(values_only=True))

        # Find sub-header row with "Factor ID", "Factor"
        header_idx = None
        for i, row in enumerate(rows):
            if row and len(row) > 2 and row[1] == "Factor ID" and row[2] == "Factor":
                header_idx = i
                break

        if header_idx is None:
            return []

        tunings = []
        for row in rows[header_idx + 1:]:
            if not row:
                continue

            factor_id = row[1] if len(row) > 1 else None
            if not factor_id or not str(factor_id).strip():
                continue

            factor_id_str = str(factor_id).strip()
            if not re.match(r'^[A-Z]{2,}\d+', factor_id_str):
                continue

            tuning = {
                "factor_id": factor_id_str,
                "factor": _safe_str(row, 2),
                "current": _safe_str(row, 3),
                "goal": _safe_str(row, 4),
                "percent": _safe_float(row, 5),
                "recommendation": _safe_str(row, 6),
            }
            tunings.append(tuning)

        return tunings

    # -------------------------------------------------------------------------
    # Competitor URLs (Results sheet)
    # -------------------------------------------------------------------------

    def get_competitor_urls(self) -> list[dict]:
        """Extract competitor URLs from the Results sheet."""
        if "Results" not in self.wb.sheetnames:
            return []

        ws = self.wb["Results"]
        rows = list(ws.iter_rows(values_only=True))

        if not rows:
            return []

        headers = rows[0]
        col_map = {str(h).strip(): j for j, h in enumerate(headers) if h}

        results = []
        for row in rows[1:]:
            if not row or not row[0]:
                continue

            result = {
                "rank": _safe_int(row, col_map.get("Rank")),
                "host": _safe_str(row, col_map.get("Host")),
                "url": _safe_str(row, col_map.get("URL")),
                "title": _safe_str(row, col_map.get("Link Text")),
                "summary": _safe_str(row, col_map.get("Summary")),
            }
            results.append(result)

        return results

    # -------------------------------------------------------------------------
    # Summary
    # -------------------------------------------------------------------------

    def get_summary(self) -> dict:
        """Get a high-level summary of the Cora report with all key targets."""
        entities = self.get_entities()
        lsi = self.get_lsi_keywords()
        variations = self.get_variations_list()
        tunings = self.get_basic_tunings()
        results = self.get_competitor_urls()
        density = self.get_density_targets()
        content = self.get_content_targets()
        wc_dist = self.get_word_count_distribution()

        # Find word count goal from tunings
        word_count_goal = None
        for t in tunings:
            if t["factor"] == "Word Count":
                word_count_goal = t["goal"]
                break

        entities_with_deficit = [e for e in entities if e["deficit"] and e["deficit"] > 0]
        lsi_with_deficit = [l for l in lsi if l["deficit"] and l["deficit"] > 0]

        return {
            "search_term": self.get_search_term(),
            "site_domain": self.get_site_domain(),
            "keyword_variations": variations,
            "total_entities": len(entities),
            "entities_with_deficit": len(entities_with_deficit),
            "total_lsi_keywords": len(lsi),
            "lsi_with_deficit": len(lsi_with_deficit),
            "word_count_goal": word_count_goal,
            "word_count_cluster_target": wc_dist.get("cluster_target"),
            "word_count_distribution": wc_dist.get("counts_sorted", []),
            "variation_density_avg": density.get("variation_density", {}).get("avg_pct"),
            "entity_density_avg": density.get("entity_density", {}).get("avg_pct"),
            "lsi_density_avg": density.get("lsi_density", {}).get("avg_pct"),
            "distinct_entities_target": content.get("distinct_entities", {}).get("target"),
            "competitors_analyzed": len(results),
            "tuning_factors": len(tunings),
            "optimization_rules": OPTIMIZATION_RULES,
        }


# =============================================================================
# Helper functions
# =============================================================================

def _safe_str(row, idx) -> str:
    if idx is None or idx >= len(row) or row[idx] is None:
        return ""
    return str(row[idx]).strip()


def _safe_float(row, idx) -> float | None:
    if idx is None or idx >= len(row) or row[idx] is None:
        return None
    try:
        return float(row[idx])
    except (ValueError, TypeError):
        return None


def _safe_int(row, idx) -> int | None:
    if idx is None or idx >= len(row) or row[idx] is None:
        return None
    try:
        return int(float(row[idx]))
    except (ValueError, TypeError):
        return None


def _find_site_col_idx(headers) -> int | None:
    """Find site column by looking for domain pattern in header values."""
    for j, h in enumerate(headers):
        if h and isinstance(h, str):
            h_str = h.strip()
            if re.search(r'[a-zA-Z0-9-]+\.[a-zA-Z]{2,}', h_str):
                # Skip known non-site headers
                if h_str in ("Best of Both", "LSI Keyword"):
                    continue
                return j
    return None


def _find_cluster_target(counts: list[int]) -> int:
    """Find the nearest competitive cluster target for word count.

    Strategy: Don't use the raw average (skewed by outliers).
    Instead, find the densest run of competitors within 40% of each other
    and target slightly above that cluster's average.
    """
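    # Worked example (input is sorted ascending): [800, 950, 1100, 2400, 5200]
    # -> densest cluster within 1.4x of its start is [800, 950, 1100];
    #    avg = 950, so target = ceil(950 * 1.05) = 998.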
    if not counts:
        return 0

    if len(counts) <= 3:
        return math.ceil(max(counts) * 1.05)

    # Simple clustering: find the densest grouping
    best_cluster = []
    for i in range(len(counts)):
        cluster = [counts[i]]
        for j in range(i + 1, len(counts)):
            # Within 40% range of the cluster start
            if counts[j] <= counts[i] * 1.4:
                cluster.append(counts[j])
            else:
                break
        if len(cluster) >= len(best_cluster):
            best_cluster = cluster

    if best_cluster:
        cluster_avg = sum(best_cluster) / len(best_cluster)
        # Target slightly above the cluster average
        return math.ceil(cluster_avg * 1.05)

    # Fallback: median + 5%
    median = counts[len(counts) // 2]
    return math.ceil(median * 1.05)


# =============================================================================
# Output formatting
# =============================================================================

def format_text(data, label: str = "") -> str:
    """Format data as human-readable text."""
    lines = []
    if label:
        lines.append(f"=== {label} ===")
        lines.append("")

    if isinstance(data, dict):
        for key, value in data.items():
            if isinstance(value, list) and len(value) > 5:
                lines.append(f"  {key}: [{len(value)} items]")
            elif isinstance(value, dict):
                lines.append(f"  {key}:")
                for k2, v2 in value.items():
                    lines.append(f"    {k2}: {v2}")
            else:
                lines.append(f"  {key}: {value}")
    elif isinstance(data, list):
        for i, item in enumerate(data):
            if isinstance(item, dict):
                lines.append(f"  [{i + 1}]")
                for key, value in item.items():
                    lines.append(f"    {key}: {value}")
            else:
                lines.append(f"  [{i + 1}] {item}")
    lines.append("")
    return "\n".join(lines)


# =============================================================================
# CLI
# =============================================================================

def main():
    parser = argparse.ArgumentParser(description="Parse a Cora SEO XLSX report")
    parser.add_argument("xlsx_path", help="Path to the Cora XLSX file")
    parser.add_argument(
        "--sheet",
        choices=[
            "entities", "lsi", "variations", "results", "tunings",
            "structure", "densities", "targets", "wordcount", "summary", "all",
        ],
        default="summary",
        help="Which data to extract (default: summary)",
    )
    parser.add_argument(
        "--format",
        choices=["json", "text"],
        default="text",
        help="Output format (default: text)",
    )
    parser.add_argument(
        "--top-n",
        type=int,
        default=0,
        help="Limit output to top N results (0 = all)",
    )
    args = parser.parse_args()

    report = CoraReport(args.xlsx_path)

    extractors = {
        "entities": ("Entities", report.get_entities),
        "lsi": ("LSI Keywords", report.get_lsi_keywords),
        "variations": ("Keyword Variations", report.get_keyword_variations),
        "results": ("Competitor URLs", report.get_competitor_urls),
        "tunings": ("Basic Tunings", report.get_basic_tunings),
        "structure": ("Structure Targets", report.get_structure_targets),
        "densities": ("Density Targets", report.get_density_targets),
        "targets": ("Content Targets", report.get_content_targets),
        "wordcount": ("Word Count Distribution", report.get_word_count_distribution),
        "summary": ("Summary", report.get_summary),
    }

    if args.sheet == "all":
        sheets_to_show = ["summary", "structure", "densities", "targets", "wordcount"]
    else:
        sheets_to_show = [args.sheet]

    for sheet_key in sheets_to_show:
        label, extractor = extractors[sheet_key]
        data = extractor()

        if args.top_n > 0 and isinstance(data, list):
            data = data[:args.top_n]

        if args.format == "json":
            print(json.dumps(data, indent=2, default=str))
        else:
            print(format_text(data, label))


if __name__ == "__main__":
    main()

@ -1,455 +0,0 @@
#!/usr/bin/env python3
"""
Entity Optimizer — Cora Entity Analysis for Content Drafts

Counts Cora-defined entities in a markdown content draft and recommends
additions based on relevance and deficit data from a Cora XLSX report.

Usage:
    uv run --with openpyxl python entity_optimizer.py <draft_path> <cora_xlsx_path> [--format json|text] [--top-n 30]

Options:
    --format  Output format: json or text (default: text)
    --top-n   Number of top recommendations to show (default: 30)
"""

import argparse
import json
import re
import sys
from pathlib import Path

from cora_parser import CoraReport


class EntityOptimizer:
    """Analyzes a content draft against Cora entity targets and recommends additions."""

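    # Illustrative usage sketch (both paths are hypothetical):
    #   opt = EntityOptimizer("./working/cora_report.xlsx")
    #   analysis = opt.analyze_draft("./working/draft.md")
    #   print(analysis["summary"])
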
def __init__(self, cora_xlsx_path: str):
|
|
||||||
"""Load entity targets from a Cora XLSX report.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
cora_xlsx_path: Path to the Cora SEO XLSX file.
|
|
||||||
"""
|
|
||||||
self.report = CoraReport(cora_xlsx_path)
|
|
||||||
self.entities = self.report.get_entities()
|
|
||||||
self.search_term = self.report.get_search_term()
|
|
||||||
|
|
||||||
# Populated after analyze_draft() is called
|
|
||||||
self.draft_text = ""
|
|
||||||
self.sections = [] # list of {"heading": str, "level": int, "text": str}
|
|
||||||
self.entity_counts = {} # entity name -> {"total": int, "per_section": {heading: count}}
|
|
||||||
|
|
||||||
def analyze_draft(self, draft_path: str) -> dict:
|
|
||||||
"""Run a full analysis of a content draft against Cora entity targets.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
draft_path: Path to a markdown content draft file.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
dict with keys: summary, entity_counts, deficits, recommendations, section_density
|
|
||||||
"""
|
|
||||||
path = Path(draft_path)
|
|
||||||
if not path.exists():
|
|
||||||
raise FileNotFoundError(f"Draft file not found: {draft_path}")
|
|
||||||
|
|
||||||
self.draft_text = path.read_text(encoding="utf-8")
|
|
||||||
self.sections = self._parse_sections(self.draft_text)
|
|
||||||
self.entity_counts = self.count_entities(self.draft_text)
|
|
||||||
deficits = self.calculate_deficits()
|
|
||||||
recommendations = self.recommend_additions()
|
|
||||||
section_density = self._section_density()
|
|
||||||
|
|
||||||
# Build summary stats
|
|
||||||
entities_found = sum(
|
|
||||||
1 for name, counts in self.entity_counts.items() if counts["total"] > 0
|
|
||||||
)
|
|
||||||
entities_with_deficit = sum(1 for d in deficits if d["remaining_deficit"] > 0)
|
|
||||||
|
|
||||||
summary = {
|
|
||||||
"search_term": self.search_term,
|
|
||||||
"total_entities_tracked": len(self.entities),
|
|
||||||
"entities_found_in_draft": entities_found,
|
|
||||||
"entities_with_deficit": entities_with_deficit,
|
|
||||||
"total_sections": len(self.sections),
|
|
||||||
}
|
|
||||||
|
|
||||||
return {
|
|
||||||
"summary": summary,
|
|
||||||
"entity_counts": self.entity_counts,
|
|
||||||
"deficits": deficits,
|
|
||||||
"recommendations": recommendations,
|
|
||||||
"section_density": section_density,
|
|
||||||
}
|
|
||||||
|
|
||||||
def count_entities(self, text: str) -> dict:
|
|
||||||
"""Count occurrences of each Cora entity in the text, total and per section.
|
|
||||||
|
|
||||||
Uses case-insensitive matching with word boundaries so partial matches
|
|
||||||
inside larger words are excluded.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
text: The full draft text.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
dict mapping entity name to {"total": int, "per_section": {heading: int}}
|
|
||||||
"""
|
|
||||||
counts = {}
|
|
||||||
sections = self.sections if self.sections else self._parse_sections(text)
|
|
||||||
|
|
||||||
for entity in self.entities:
|
|
||||||
name = entity["name"]
|
|
||||||
pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
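            # e.g. \bmill\b matches "mill" but not "milling" or "windmills"
            # (illustrative terms, not from the Cora report)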

            total = len(pattern.findall(text))

            per_section = {}
            for section in sections:
                section_count = len(pattern.findall(section["text"]))
                if section_count > 0:
                    per_section[section["heading"]] = section_count

            counts[name] = {
                "total": total,
                "per_section": per_section,
            }

        return counts

    def calculate_deficits(self) -> list[dict]:
        """Calculate which entities are still below their Cora deficit target.

        Compares the count found in the draft against the deficit value from
        the Cora report. An entity with a Cora deficit of 20 and a draft count
        of 5 has a remaining deficit of 15.

        Returns:
            List of dicts with: name, relevance, correlation, cora_deficit,
            draft_count, remaining_deficit — sorted by remaining_deficit descending.
        """
        deficits = []
        for entity in self.entities:
            name = entity["name"]
            cora_deficit = entity.get("deficit") or 0
            draft_count = self.entity_counts.get(name, {}).get("total", 0)
            remaining = max(0, cora_deficit - draft_count)

            deficits.append({
                "name": name,
                "relevance": entity.get("relevance") or 0,
                "correlation": entity.get("correlation") or 0,
                "cora_deficit": cora_deficit,
                "draft_count": draft_count,
                "remaining_deficit": remaining,
            })

        deficits.sort(key=lambda d: d["remaining_deficit"], reverse=True)
        return deficits

    def recommend_additions(self) -> list[dict]:
        """Generate prioritized recommendations for entity additions.

        Priority is calculated as relevance * remaining_deficit, so entities
        that are both highly relevant and far below target rank highest.
        Each recommendation includes suggested sections where the entity
        could naturally be added, based on where related entities already appear.

        Returns:
            List of recommendation dicts sorted by priority descending. Each dict
            has: name, relevance, correlation, cora_deficit, draft_count,
            remaining_deficit, priority, suggested_sections.
        """
        deficits = self.calculate_deficits()
        recommendations = []

        for deficit_entry in deficits:
            if deficit_entry["remaining_deficit"] <= 0:
                continue

            relevance = deficit_entry["relevance"]
            remaining = deficit_entry["remaining_deficit"]
            priority = relevance * remaining
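            # e.g. relevance 0.9 with 10 mentions still needed scores 9.0,
            # outranking relevance 1.0 with only 5 needed (5.0); illustrative numbers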

            suggested = self._suggest_sections(deficit_entry["name"])

            recommendations.append({
                "name": deficit_entry["name"],
                "relevance": relevance,
                "correlation": deficit_entry["correlation"],
                "cora_deficit": deficit_entry["cora_deficit"],
                "draft_count": deficit_entry["draft_count"],
                "remaining_deficit": remaining,
                "priority": round(priority, 4),
                "suggested_sections": suggested,
            })

        recommendations.sort(key=lambda r: r["priority"], reverse=True)
        return recommendations

    # ------------------------------------------------------------------
    # Internal helpers
    # ------------------------------------------------------------------

    def _parse_sections(self, text: str) -> list[dict]:
        """Split markdown text into sections by headings.

        Each section captures the heading text, heading level, and the body
        text under that heading (up to the next heading of equal or higher level).

        A virtual "Introduction" section is created for content before the first heading.

        Returns:
            list of {"heading": str, "level": int, "text": str}
        """
        heading_pattern = re.compile(r"^(#{1,6})\s+(.+)$", re.MULTILINE)
        matches = list(heading_pattern.finditer(text))

        sections = []

        # Content before the first heading becomes the Introduction section
        if matches:
            intro_text = text[:matches[0].start()].strip()
            if intro_text:
                sections.append({
                    "heading": "Introduction",
                    "level": 0,
                    "text": intro_text,
                })
        else:
            # No headings at all — treat the entire text as one section
            return [{
                "heading": "Full Document",
                "level": 0,
                "text": text,
            }]

        for i, match in enumerate(matches):
            level = len(match.group(1))
            heading = match.group(2).strip()
            start = match.end()
            end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
            body = text[start:end].strip()

            sections.append({
                "heading": heading,
                "level": level,
                "text": body,
            })

        return sections

    def _suggest_sections(self, entity_name: str) -> list[str]:
        """Suggest sections where an entity could naturally be added.

        Strategy: find sections that already contain other entities from the
        same Cora report. Sections with higher concentrations of related
        entities are better candidates because the topic is contextually aligned.

        If no sections have related entities, return all non-empty sections
        as general candidates.

        Args:
            entity_name: The entity to find placement for.

        Returns:
            List of section heading strings, ordered by relevance.
        """
        if not self.sections:
            return []

        # Build a score for each section: count how many other entities appear there
        section_scores = []
        for section in self.sections:
            heading = section["heading"]
            other_entity_count = 0
            for name, counts in self.entity_counts.items():
                if name.lower() == entity_name.lower():
                    continue
                if heading in counts.get("per_section", {}):
                    other_entity_count += counts["per_section"][heading]

            if other_entity_count > 0:
                section_scores.append((heading, other_entity_count))

        # Sort by entity richness descending
        section_scores.sort(key=lambda x: x[1], reverse=True)

        if section_scores:
            return [heading for heading, _score in section_scores]

        # Fallback: return all sections with non-trivial content
        return [
            s["heading"]
            for s in self.sections
            if len(s["text"].split()) > 20
        ]

    def _section_density(self) -> list[dict]:
        """Calculate per-section entity density.

        Returns:
            List of dicts with: heading, level, word_count, entities_found,
            entity_mentions, density (mentions per 100 words).
        """
        densities = []
        for section in self.sections:
            heading = section["heading"]
            word_count = len(section["text"].split())
            entities_found = 0
            total_mentions = 0

            for name, counts in self.entity_counts.items():
                section_count = counts.get("per_section", {}).get(heading, 0)
                if section_count > 0:
                    entities_found += 1
                    total_mentions += section_count

            density = round((total_mentions / word_count) * 100, 2) if word_count > 0 else 0.0
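            # e.g. 6 mentions in a 300-word section -> 2.0 mentions per 100 words
            # (illustrative numbers)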

            densities.append({
                "heading": heading,
                "level": section["level"],
                "word_count": word_count,
                "entities_found": entities_found,
                "entity_mentions": total_mentions,
                "density_per_100_words": density,
            })

        return densities


# ------------------------------------------------------------------
# Output formatting
# ------------------------------------------------------------------

def format_text_report(analysis: dict, top_n: int = 30) -> str:
    """Format the analysis result as a human-readable text report."""
    lines = []
    summary = analysis["summary"]

    # --- Header ---
    lines.append("=" * 70)
    lines.append(" ENTITY OPTIMIZATION REPORT")
    if summary.get("search_term"):
        lines.append(f" Target keyword: {summary['search_term']}")
    lines.append("=" * 70)
    lines.append("")

    # --- Summary ---
    lines.append("SUMMARY")
    lines.append("-" * 40)
    lines.append(f" Total entities tracked: {summary['total_entities_tracked']}")
    lines.append(f" Entities found in draft: {summary['entities_found_in_draft']}")
    lines.append(f" Entities with deficit: {summary['entities_with_deficit']}")
    lines.append(f" Total sections in draft: {summary['total_sections']}")
    lines.append("")

    # --- Top Recommendations ---
    recommendations = analysis["recommendations"]
    shown = recommendations[:top_n]

    lines.append(f"TOP {min(top_n, len(recommendations))} RECOMMENDATIONS (sorted by priority)")
    lines.append("-" * 70)

    if not shown:
        lines.append(" No entity deficits found — the draft covers all targets.")
    else:
        for i, rec in enumerate(shown, 1):
            sections_str = ", ".join(rec["suggested_sections"][:3]) if rec["suggested_sections"] else "any section"
            lines.append(
                f" {i:>3}. Entity '{rec['name']}' found {rec['draft_count']} times, "
                f"target deficit is {rec['cora_deficit']}. "
                f"Remaining: {rec['remaining_deficit']}. "
                f"Priority: {rec['priority']}"
            )
            lines.append(
                f" Relevance: {rec['relevance']} | Correlation: {rec['correlation']}"
            )
            lines.append(
                f" Suggested sections: [{sections_str}]"
            )
            lines.append("")

    # --- Per-Section Entity Density ---
    lines.append("PER-SECTION ENTITY DENSITY")
    lines.append("-" * 70)
    lines.append(f" {'Section':<40} {'Words':>6} {'Entities':>9} {'Mentions':>9} {'Density':>8}")
    lines.append(f" {'-' * 40} {'-' * 6} {'-' * 9} {'-' * 9} {'-' * 8}")

    for sd in analysis["section_density"]:
        indent = " " * sd["level"] if sd["level"] > 0 else ""
        heading_display = indent + sd["heading"]
        if len(heading_display) > 38:
            heading_display = heading_display[:35] + "..."
        lines.append(
            f" {heading_display:<40} {sd['word_count']:>6} {sd['entities_found']:>9} "
            f"{sd['entity_mentions']:>9} {sd['density_per_100_words']:>7.2f}%"
        )

    lines.append("")
    lines.append("=" * 70)
    return "\n".join(lines)


def format_json_report(analysis: dict, top_n: int = 30) -> str:
    """Format the analysis result as machine-readable JSON."""
    output = {
        "summary": analysis["summary"],
        "recommendations": analysis["recommendations"][:top_n],
        "section_density": analysis["section_density"],
        "entity_counts": analysis["entity_counts"],
        "deficits": analysis["deficits"],
    }
    return json.dumps(output, indent=2, default=str)


# ------------------------------------------------------------------
# CLI entry point
# ------------------------------------------------------------------

def main():
    parser = argparse.ArgumentParser(
        description="Analyze a content draft against Cora entity targets and recommend additions.",
        usage="uv run --with openpyxl python entity_optimizer.py <draft_path> <cora_xlsx_path> [options]",
    )
    parser.add_argument(
        "draft_path",
        help="Path to the markdown content draft",
    )
    parser.add_argument(
        "cora_xlsx_path",
        help="Path to the Cora SEO XLSX report",
    )
    parser.add_argument(
        "--format",
        choices=["json", "text"],
        default="text",
        help="Output format (default: text)",
    )
    parser.add_argument(
        "--top-n",
        type=int,
        default=30,
        help="Number of top recommendations to display (default: 30)",
    )

    args = parser.parse_args()

    try:
        optimizer = EntityOptimizer(args.cora_xlsx_path)
        analysis = optimizer.analyze_draft(args.draft_path)
    except FileNotFoundError as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)
    except Exception as e:
        print(f"Error analyzing draft: {e}", file=sys.stderr)
        sys.exit(1)

    if args.format == "json":
        print(format_json_report(analysis, top_n=args.top_n))
    else:
        print(format_text_report(analysis, top_n=args.top_n))


if __name__ == "__main__":
    main()
@@ -1,414 +0,0 @@
"""
|
|
||||||
LSI Keyword Optimizer
|
|
||||||
|
|
||||||
Counts Cora-defined LSI keywords in a content draft and recommends additions.
|
|
||||||
Reads LSI targets from a Cora XLSX report via cora_parser.CoraReport, then
|
|
||||||
scans a markdown draft to measure per-keyword usage and calculate deficits.
|
|
||||||
|
|
||||||
Recommendations are prioritized by |correlation| x deficit so the most
|
|
||||||
ranking-impactful gaps surface first.
|
|
||||||
|
|
||||||
Usage:
|
|
||||||
uv run --with openpyxl python lsi_optimizer.py <draft_path> <cora_xlsx_path> \
|
|
||||||
[--format json|text] [--min-correlation 0.2] [--top-n 50]
|
|
||||||
"""
|
|
||||||
|
|
||||||
import argparse
|
|
||||||
import json
|
|
||||||
import re
|
|
||||||
import sys
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
from cora_parser import CoraReport
|
|
||||||
|
|
||||||
|
|
||||||
class LSIOptimizer:
|
|
||||||
"""Analyzes a content draft against Cora LSI keyword targets."""
|
|
||||||
|
|
||||||
def __init__(self, cora_xlsx_path: str):
|
|
||||||
"""Load LSI keyword targets from a Cora XLSX report.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
cora_xlsx_path: Path to the Cora SEO report XLSX file.
|
|
||||||
"""
|
|
||||||
self.report = CoraReport(cora_xlsx_path)
|
|
||||||
self.lsi_keywords = self.report.get_lsi_keywords()
|
|
||||||
self.draft_text = ""
|
|
||||||
self.sections: list[dict] = []
|
|
||||||
self._keyword_counts: dict[str, int] = {}
|
|
||||||
|
|
||||||
# ------------------------------------------------------------------
|
|
||||||
# Public API
|
|
||||||
# ------------------------------------------------------------------
|
|
||||||
|
|
||||||
def analyze_draft(self, draft_path: str) -> dict:
|
|
||||||
"""Run full LSI analysis on a markdown draft.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
draft_path: Path to a markdown content draft.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
Analysis dict with keys: summary, keyword_counts, deficits,
|
|
||||||
recommendations, section_coverage.
|
|
||||||
"""
|
|
||||||
path = Path(draft_path)
|
|
||||||
if not path.exists():
|
|
||||||
raise FileNotFoundError(f"Draft file not found: {draft_path}")
|
|
||||||
|
|
||||||
self.draft_text = path.read_text(encoding="utf-8")
|
|
||||||
self.sections = self._parse_sections(self.draft_text)
|
|
||||||
self._keyword_counts = self.count_lsi_keywords(self.draft_text)
|
|
||||||
|
|
||||||
deficits = self.calculate_deficits()
|
|
||||||
recommendations = self.recommend_additions()
|
|
||||||
section_coverage = self._section_coverage()
|
|
||||||
|
|
||||||
total_tracked = len(self.lsi_keywords)
|
|
||||||
found_in_draft = sum(1 for c in self._keyword_counts.values() if c > 0)
|
|
||||||
with_deficit = len(deficits)
|
|
||||||
|
|
||||||
return {
|
|
||||||
"summary": {
|
|
||||||
"total_lsi_tracked": total_tracked,
|
|
||||||
"found_in_draft": found_in_draft,
|
|
||||||
"with_deficit": with_deficit,
|
|
||||||
"fully_satisfied": total_tracked - with_deficit,
|
|
||||||
},
|
|
||||||
"keyword_counts": self._keyword_counts,
|
|
||||||
"deficits": deficits,
|
|
||||||
"recommendations": recommendations,
|
|
||||||
"section_coverage": section_coverage,
|
|
||||||
}
|
|
||||||
|
|
||||||
def count_lsi_keywords(self, text: str) -> dict[str, int]:
|
|
||||||
"""Count occurrences of each LSI keyword in the given text.
|
|
||||||
|
|
||||||
Uses word-boundary-aware regex matching so multi-word phrases like
|
|
||||||
"part that" are matched correctly and case-insensitively.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
text: The content string to scan.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
Dict mapping keyword string to its occurrence count.
|
|
||||||
"""
|
|
||||||
counts: dict[str, int] = {}
|
|
||||||
for kw_data in self.lsi_keywords:
|
|
||||||
keyword = kw_data["keyword"]
|
|
||||||
pattern = self._keyword_pattern(keyword)
|
|
||||||
matches = pattern.findall(text)
|
|
||||||
counts[keyword] = len(matches)
|
|
||||||
return counts
|
|
||||||
|
|
||||||
def calculate_deficits(self) -> list[dict]:
|
|
||||||
"""Identify LSI keywords whose draft count is below the Cora target.
|
|
||||||
|
|
||||||
A keyword has a deficit when the Cora report indicates a positive
|
|
||||||
deficit value (target minus current usage in the report) AND the
|
|
||||||
draft count has not yet closed that gap.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
List of dicts with: keyword, draft_count, target, deficit,
|
|
||||||
spearmans, pearsons, best_of_both. Only keywords with
|
|
||||||
remaining deficit > 0 are included.
|
|
||||||
"""
|
|
||||||
deficits = []
|
|
||||||
for kw_data in self.lsi_keywords:
|
|
||||||
keyword = kw_data["keyword"]
|
|
||||||
cora_deficit = kw_data.get("deficit") or 0
|
|
||||||
if cora_deficit <= 0:
|
|
||||||
continue
|
|
||||||
|
|
||||||
# The Cora deficit is based on the original page. The draft may
|
|
||||||
# have added some occurrences, so we re-compute: how many more
|
|
||||||
# are still needed?
|
|
||||||
cora_current = kw_data.get("current_count") or 0
|
|
||||||
target = cora_current + cora_deficit
|
|
||||||
draft_count = self._keyword_counts.get(keyword, 0)
|
|
||||||
|
|
||||||
remaining_deficit = target - draft_count
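            # e.g. report shows 3 current uses + deficit 7 -> target 10; a draft
            # with 4 uses still needs 6 more (illustrative numbers)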
            if remaining_deficit <= 0:
                continue

            deficits.append({
                "keyword": keyword,
                "draft_count": draft_count,
                "target": target,
                "deficit": remaining_deficit,
                "spearmans": kw_data.get("spearmans"),
                "pearsons": kw_data.get("pearsons"),
                "best_of_both": kw_data.get("best_of_both"),
            })

        return deficits

    def recommend_additions(
        self,
        min_correlation: float = 0.0,
        top_n: int = 0,
    ) -> list[dict]:
        """Produce a prioritized list of LSI keyword additions.

        Priority score = abs(best_of_both) x deficit. Keywords with higher
        correlation to ranking AND larger deficits sort to the top.

        Args:
            min_correlation: Only include keywords whose
                abs(best_of_both) >= this threshold.
            top_n: Limit to top N results (0 = no limit).

        Returns:
            Sorted list of dicts with: keyword, priority, deficit,
            draft_count, target, best_of_both, spearmans, pearsons.
        """
        deficits = self.calculate_deficits()

        recommendations = []
        for d in deficits:
            correlation = abs(d["best_of_both"]) if d["best_of_both"] else 0.0
            if correlation < min_correlation:
                continue

            priority = correlation * d["deficit"]
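            # e.g. |best_of_both| 0.45 with 8 occurrences still needed gives
            # priority 3.6 (illustrative values)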
            recommendations.append({
                "keyword": d["keyword"],
                "priority": round(priority, 4),
                "deficit": d["deficit"],
                "draft_count": d["draft_count"],
                "target": d["target"],
                "best_of_both": d["best_of_both"],
                "spearmans": d["spearmans"],
                "pearsons": d["pearsons"],
            })

        recommendations.sort(key=lambda r: r["priority"], reverse=True)

        if top_n > 0:
            recommendations = recommendations[:top_n]

        return recommendations

    # ------------------------------------------------------------------
    # Internal helpers
    # ------------------------------------------------------------------

    @staticmethod
    def _keyword_pattern(keyword: str) -> re.Pattern:
        """Build a word-boundary-aware regex for an LSI keyword.

        Handles multi-word phrases by joining escaped tokens with flexible
        whitespace. Case-insensitive.
        """
        tokens = keyword.strip().split()
        escaped = [re.escape(t) for t in tokens]
        # Allow flexible whitespace between tokens in multi-word phrases
        pattern_str = r"\b" + r"\s+".join(escaped) + r"\b"
        return re.compile(pattern_str, re.IGNORECASE)

    @staticmethod
    def _parse_sections(text: str) -> list[dict]:
        """Split markdown text into sections by headings.

        Returns list of dicts with: heading, level, content.
        The content before the first heading gets heading="(intro)".
        """
        heading_re = re.compile(r"^(#{1,6})\s+(.+)$", re.MULTILINE)
        matches = list(heading_re.finditer(text))

        sections: list[dict] = []

        if not matches:
            # No headings — treat entire text as one section
            sections.append({
                "heading": "(intro)",
                "level": 0,
                "content": text,
            })
            return sections

        # Content before first heading
        if matches[0].start() > 0:
            intro = text[: matches[0].start()]
            if intro.strip():
                sections.append({
                    "heading": "(intro)",
                    "level": 0,
                    "content": intro,
                })

        for i, match in enumerate(matches):
            level = len(match.group(1))
            heading = match.group(2).strip()
            start = match.end()
            end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
            content = text[start:end]
            sections.append({
                "heading": heading,
                "level": level,
                "content": content,
            })

        return sections

    def _section_coverage(self) -> list[dict]:
        """Calculate LSI keyword coverage per section.

        Returns list of dicts with: heading, level, total_keywords_found,
        keyword_details (list of keyword/count pairs present in that section).
        """
        coverage = []
        for section in self.sections:
            section_counts = self.count_lsi_keywords(section["content"])
            found = {kw: cnt for kw, cnt in section_counts.items() if cnt > 0}

            coverage.append({
                "heading": section["heading"],
                "level": section["level"],
                "total_keywords_found": len(found),
                "keyword_details": [
                    {"keyword": kw, "count": cnt}
                    for kw, cnt in sorted(found.items(), key=lambda x: x[1], reverse=True)
                ],
            })

        return coverage


# ----------------------------------------------------------------------
# Output formatting
# ----------------------------------------------------------------------

def format_text_report(analysis: dict) -> str:
    """Format the analysis dict as a human-readable text report."""
    lines: list[str] = []
    summary = analysis["summary"]

    # --- Summary ---
    lines.append("=" * 60)
    lines.append(" LSI KEYWORD OPTIMIZATION REPORT")
    lines.append("=" * 60)
    lines.append("")
    lines.append(f" Total LSI keywords tracked : {summary['total_lsi_tracked']}")
    lines.append(f" Found in draft             : {summary['found_in_draft']}")
    lines.append(f" With deficit (need more)   : {summary['with_deficit']}")
    lines.append(f" Fully satisfied            : {summary['fully_satisfied']}")
    lines.append("")

    # --- Top Recommendations ---
    recs = analysis["recommendations"]
    if recs:
        lines.append("-" * 60)
        lines.append(" TOP RECOMMENDATIONS (sorted by priority)")
        lines.append("-" * 60)
        lines.append("")
        lines.append(
            f" {'#':<4} {'Keyword':<30} {'Priority':>9} "
            f"{'Deficit':>8} {'Draft':>6} {'Target':>7} {'Corr':>7}"
        )
        lines.append(f" {'—'*4} {'—'*30} {'—'*9} {'—'*8} {'—'*6} {'—'*7} {'—'*7}")

        for i, rec in enumerate(recs, 1):
            corr = rec["best_of_both"]
            corr_str = f"{corr:.3f}" if corr is not None else "N/A"
            keyword_display = rec["keyword"]
            if len(keyword_display) > 28:
                keyword_display = keyword_display[:25] + "..."

            lines.append(
                f" {i:<4} {keyword_display:<30} {rec['priority']:>9.4f} "
                f"{rec['deficit']:>8} {rec['draft_count']:>6} "
                f"{rec['target']:>7} {corr_str:>7}"
            )
        lines.append("")
    else:
        lines.append(" No recommendations — all LSI targets met or no deficits found.")
        lines.append("")

    # --- Section Coverage ---
    sections = analysis["section_coverage"]
    if sections:
        lines.append("-" * 60)
        lines.append(" PER-SECTION LSI COVERAGE")
        lines.append("-" * 60)
        lines.append("")

        for sec in sections:
            indent = " " * (sec["level"] + 1)
            heading = sec["heading"]
            kw_count = sec["total_keywords_found"]
            lines.append(f"{indent}{heading} ({kw_count} LSI keyword{'s' if kw_count != 1 else ''})")

            if sec["keyword_details"]:
                for detail in sec["keyword_details"][:10]:
                    lines.append(f"{indent} - \"{detail['keyword']}\" x{detail['count']}")
                remaining = len(sec["keyword_details"]) - 10
                if remaining > 0:
                    lines.append(f"{indent} ... and {remaining} more")
            lines.append("")

    lines.append("=" * 60)
    return "\n".join(lines)


# ----------------------------------------------------------------------
# CLI entry point
# ----------------------------------------------------------------------

def main():
    parser = argparse.ArgumentParser(
        description="Analyze a content draft against Cora LSI keyword targets.",
    )
    parser.add_argument(
        "draft_path",
        help="Path to the markdown content draft",
    )
    parser.add_argument(
        "cora_xlsx_path",
        help="Path to the Cora SEO XLSX report",
    )
    parser.add_argument(
        "--format",
        choices=["json", "text"],
        default="text",
        help="Output format (default: text)",
    )
    parser.add_argument(
        "--min-correlation",
        type=float,
        default=0.2,
        help="Minimum |correlation| to include in recommendations (default: 0.2)",
    )
    parser.add_argument(
        "--top-n",
        type=int,
        default=50,
        help="Limit recommendations to top N (default: 50, 0 = unlimited)",
    )
    args = parser.parse_args()

    try:
        optimizer = LSIOptimizer(args.cora_xlsx_path)
    except FileNotFoundError as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)

    try:
        analysis = optimizer.analyze_draft(args.draft_path)
    except FileNotFoundError as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)

    # Apply CLI filters to recommendations
    analysis["recommendations"] = optimizer.recommend_additions(
        min_correlation=args.min_correlation,
        top_n=args.top_n,
    )

    if args.format == "json":
        print(json.dumps(analysis, indent=2, default=str))
    else:
        print(format_text_report(analysis))


if __name__ == "__main__":
    main()
@@ -1,402 +0,0 @@
"""
|
|
||||||
SEO Content Optimizer
|
|
||||||
|
|
||||||
Checks keyword density and content structure of a draft against Cora targets.
|
|
||||||
|
|
||||||
Usage:
|
|
||||||
uv run --with openpyxl python seo_optimizer.py <draft_path>
|
|
||||||
[--keyword <kw>] [--cora-xlsx <path>] [--format json|text]
|
|
||||||
|
|
||||||
Works standalone for basic checks, or with a Cora XLSX report for
|
|
||||||
keyword-specific targets via cora_parser.CoraReport.
|
|
||||||
"""
|
|
||||||
|
|
||||||
import argparse
|
|
||||||
import json
|
|
||||||
import re
|
|
||||||
import sys
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
# Optional Cora integration — script works without it
|
|
||||||
try:
|
|
||||||
from cora_parser import CoraReport
|
|
||||||
except ImportError:
|
|
||||||
CoraReport = None
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# Helpers
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
def _split_words(text: str) -> list[str]:
|
|
||||||
"""Extract words from text (alphabetic sequences)."""
|
|
||||||
return re.findall(r"[a-zA-Z']+", text)
|
|
||||||
|
|
||||||
|
|
||||||
def _strip_markdown_headings(text: str) -> str:
|
|
||||||
"""Remove markdown heading markers from text for word counting."""
|
|
||||||
return re.sub(r"^#{1,6}\s+", "", text, flags=re.MULTILINE)
|
|
||||||
|
|
||||||
|
|
||||||
def _extract_headings(text: str) -> list[dict]:
|
|
||||||
"""Extract markdown-style headings with their levels."""
|
|
||||||
headings = []
|
|
||||||
for match in re.finditer(r"^(#{1,6})\s+(.+)$", text, re.MULTILINE):
|
|
||||||
level = len(match.group(1))
|
|
||||||
headings.append({"level": level, "text": match.group(2).strip()})
|
|
||||||
return headings
|
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
# SEOOptimizer
|
|
||||||
# ---------------------------------------------------------------------------
|
|
||||||
|
|
||||||
class SEOOptimizer:
|
|
||||||
"""Analyze a content draft for keyword density and structure."""
|
|
||||||
|
|
||||||
def __init__(self):
|
|
||||||
self._results = {}
|
|
||||||
|
|
||||||
# -- public entry point -------------------------------------------------
|
|
||||||
|
|
||||||
def analyze(
|
|
||||||
self,
|
|
||||||
draft_path: str,
|
|
||||||
primary_keyword: str | None = None,
|
|
||||||
cora_xlsx_path: str | None = None,
|
|
||||||
) -> dict:
|
|
||||||
"""Run checks on *draft_path* and return an analysis dict."""
|
|
||||||
path = Path(draft_path)
|
|
||||||
if not path.exists():
|
|
||||||
raise FileNotFoundError(f"Draft not found: {draft_path}")
|
|
||||||
|
|
||||||
text = path.read_text(encoding="utf-8")
|
|
||||||
|
|
||||||
# Optionally load Cora data
|
|
||||||
cora = None
|
|
||||||
if cora_xlsx_path:
|
|
||||||
if CoraReport is None:
|
|
||||||
print(
|
|
||||||
"Warning: cora_parser not available. "
|
|
||||||
"Install openpyxl and ensure cora_parser.py is importable.",
|
|
||||||
file=sys.stderr,
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
cora = CoraReport(cora_xlsx_path)
|
|
||||||
|
|
||||||
# Determine keyword list
|
|
||||||
keywords = []
|
|
||||||
if primary_keyword:
|
|
||||||
keywords.append(primary_keyword)
|
|
||||||
if cora:
|
|
||||||
search_term = cora.get_search_term()
|
|
||||||
if search_term and search_term.lower() not in [k.lower() for k in keywords]:
|
|
||||||
keywords.insert(0, search_term)
|
|
||||||
for var in cora.get_keyword_variations():
|
|
||||||
v = var["variation"]
|
|
||||||
if v.lower() not in [k.lower() for k in keywords]:
|
|
||||||
keywords.append(v)
|
|
||||||
|
|
||||||
# If still no keywords but Cora gave a search term, use it
|
|
||||||
if not keywords and cora:
|
|
||||||
st = cora.get_search_term()
|
|
||||||
if st:
|
|
||||||
keywords.append(st)
|
|
||||||
|
|
||||||
# Word-count target from Cora
|
|
||||||
word_count_target = None
|
|
||||||
if cora:
|
|
||||||
for t in cora.get_basic_tunings():
|
|
||||||
if t["factor"] == "Word Count":
|
|
||||||
try:
|
|
||||||
word_count_target = int(float(t["goal"]))
|
|
||||||
except (ValueError, TypeError):
|
|
||||||
pass
|
|
||||||
break
|
|
||||||
|
|
||||||
# Build Cora keyword targets (page1_avg) for comparison
|
|
||||||
cora_keyword_targets = {}
|
|
||||||
if cora:
|
|
||||||
for var in cora.get_keyword_variations():
|
|
||||||
cora_keyword_targets[var["variation"].lower()] = {
|
|
||||||
"page1_avg": var.get("page1_avg", 0),
|
|
||||||
"page1_max": var.get("page1_max", 0),
|
|
||||||
}
|
|
||||||
|
|
||||||
# Run checks
|
|
||||||
self._results["content_length"] = self.check_content_length(text, target=word_count_target)
|
|
||||||
self._results["structure"] = self.check_structure(text)
|
|
||||||
self._results["keyword_density"] = self.check_keyword_density(
|
|
||||||
text, keywords=keywords or None, cora_targets=cora_keyword_targets,
|
|
||||||
)
|
|
||||||
|
|
||||||
return self._results
|
|
||||||
|
|
||||||
# -- individual checks --------------------------------------------------
|
|
||||||
|
|
||||||
def check_keyword_density(
|
|
||||||
self,
|
|
||||||
text: str,
|
|
||||||
keywords: list[str] | None = None,
|
|
||||||
cora_targets: dict | None = None,
|
|
||||||
) -> dict:
|
|
||||||
"""Return per-keyword density information.
|
|
||||||
|
|
||||||
Only reports variations that have page1_avg > 0 (competitors actually
|
|
||||||
use them) when Cora targets are available.
|
|
||||||
"""
|
|
||||||
clean_text = _strip_markdown_headings(text).lower()
|
|
||||||
words = _split_words(clean_text)
|
|
||||||
total_words = len(words)
|
|
||||||
|
|
||||||
if total_words == 0:
|
|
||||||
return {"total_words": 0, "keywords": []}
|
|
||||||
|
|
||||||
results: list[dict] = []
|
|
||||||
|
|
||||||
if keywords:
|
|
||||||
for kw in keywords:
|
|
||||||
kw_lower = kw.lower()
|
|
||||||
|
|
||||||
# Skip zero-avg variations — competitors don't use them
|
|
||||||
if cora_targets and kw_lower in cora_targets:
|
|
||||||
if cora_targets[kw_lower].get("page1_avg", 0) == 0:
|
|
||||||
continue
|
|
||||||
|
|
||||||
kw_words = kw_lower.split()
|
|
||||||
if len(kw_words) > 1:
|
|
||||||
pattern = re.compile(r"\b" + re.escape(kw_lower) + r"\b")
|
|
||||||
count = len(pattern.findall(clean_text))
|
|
||||||
else:
|
|
||||||
count = sum(1 for w in words if w == kw_lower)
|
|
||||||
|
|
||||||
density = (count / total_words) * 100 if total_words else 0
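                # e.g. 12 occurrences in a 1,500-word draft -> 0.8% density
                # (illustrative numbers)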

                entry = {
                    "keyword": kw,
                    "count": count,
                    "density_pct": round(density, 2),
                }

                # Add Cora target if available
                if cora_targets and kw_lower in cora_targets:
                    entry["target_avg"] = cora_targets[kw_lower]["page1_avg"]
                    entry["target_max"] = cora_targets[kw_lower]["page1_max"]

                results.append(entry)
        else:
            # Fallback: top frequent words (>= 4 chars)
            freq: dict[str, int] = {}
            for w in words:
                if len(w) >= 4:
                    freq[w] = freq.get(w, 0) + 1
            top = sorted(freq.items(), key=lambda x: x[1], reverse=True)[:10]
            for w, count in top:
                density = (count / total_words) * 100
                results.append({
                    "keyword": w,
                    "count": count,
                    "density_pct": round(density, 2),
                })

        return {"total_words": total_words, "keywords": results}

    def check_structure(self, text: str) -> dict:
        """Analyze heading hierarchy, paragraph count, and list usage."""
        headings = _extract_headings(text)

        # Count headings per level
        heading_counts = {f"h{i}": 0 for i in range(1, 7)}
        for h in headings:
            heading_counts[f"h{h['level']}"] += 1

        # Check nesting issues
        nesting_issues: list[str] = []
        if heading_counts["h1"] > 1:
            nesting_issues.append(f"Multiple H1 tags found ({heading_counts['h1']}); use exactly one.")

        prev_level = 0
        for h in headings:
            if prev_level > 0 and h["level"] > prev_level + 1:
                label = h["text"][:40] + "..." if len(h["text"]) > 40 else h["text"]
                nesting_issues.append(
                    f"Heading skip: H{prev_level} -> H{h['level']} (at \"{label}\")"
                )
            prev_level = h["level"]

        # Paragraphs
        paragraphs = []
        for block in re.split(r"\n\s*\n", text):
            block = block.strip()
            if not block:
                continue
            if re.match(r"^#{1,6}\s+", block) and "\n" not in block:
                continue
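            # Pure list blocks don't count as paragraphs either, e.g. a block
            # whose every line starts like "- item" or "2. item"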
            if all(re.match(r"^\s*[-*+]\s|^\s*\d+\.\s", line) for line in block.splitlines() if line.strip()):
                continue
            paragraphs.append(block)

        paragraph_count = len(paragraphs)

        # List usage
        unordered_items = len(re.findall(r"^\s*[-*+]\s", text, re.MULTILINE))
        ordered_items = len(re.findall(r"^\s*\d+\.\s", text, re.MULTILINE))

        return {
            "heading_counts": heading_counts,
            "headings": [{"level": h["level"], "text": h["text"]} for h in headings],
            "nesting_issues": nesting_issues,
            "paragraph_count": paragraph_count,
            "unordered_list_items": unordered_items,
            "ordered_list_items": ordered_items,
        }

    def check_content_length(self, text: str, target: int | None = None) -> dict:
        """Compare word count against an optional target."""
        clean = _strip_markdown_headings(text)
        words = _split_words(clean)
        word_count = len(words)

        result: dict = {"word_count": word_count}

        if target is not None:
            result["target"] = target
            result["difference"] = word_count - target
            if word_count >= target:
                result["status"] = "meets_target"
            elif word_count >= target * 0.8:
                result["status"] = "close"
            else:
                result["status"] = "below_target"

        return result


# ---------------------------------------------------------------------------
# Text-mode formatting
# ---------------------------------------------------------------------------

def _format_text_report(results: dict) -> str:
    """Format analysis results as a human-readable text report."""
    lines: list[str] = []
    sep = "-" * 60

    # 1. Content Stats
    cl = results.get("content_length", {})

    lines.append(sep)
    lines.append(" CONTENT STATS")
    lines.append(sep)
    lines.append(f" Word count: {cl.get('word_count', 0)}")
    if cl.get("target"):
        lines.append(f" Target: {cl['target']} ({cl.get('status', '')})")
        diff = cl.get("difference", 0)
        sign = "+" if diff >= 0 else ""
        lines.append(f" Difference: {sign}{diff}")
    lines.append("")

    # 2. Structure
    st = results.get("structure", {})
    lines.append(sep)
    lines.append(" STRUCTURE")
    lines.append(sep)
    hc = st.get("heading_counts", {})
    for lvl in range(1, 7):
        count = hc.get(f"h{lvl}", 0)
        if count > 0:
            lines.append(f" H{lvl}: {count}")
    issues = st.get("nesting_issues", [])
    if issues:
        lines.append(" Nesting issues:")
        for issue in issues:
            lines.append(f" - {issue}")
    else:
        lines.append(" Nesting: OK")
    lines.append("")

    # 3. Keyword Density (only variations with targets)
    kd = results.get("keyword_density", {})
    kw_list = kd.get("keywords", [])
    lines.append(sep)
    lines.append(" KEYWORD DENSITY")
    lines.append(sep)
    if kw_list:
        lines.append(f" {'Variation':<30s} {'Count':>5s} {'Density':>7s} {'Avg':>5s} {'Max':>5s}")
        lines.append(f" {'-'*30} {'-'*5} {'-'*7} {'-'*5} {'-'*5}")
        for kw in kw_list:
            avg_str = str(kw.get("target_avg", "")) if "target_avg" in kw else ""
            max_str = str(kw.get("target_max", "")) if "target_max" in kw else ""
            lines.append(
                f" {kw['keyword']:<30s} "
                f"{kw['count']:>5d} "
                f"{kw['density_pct']:>6.2f}% "
                f"{avg_str:>5s} "
                f"{max_str:>5s}"
            )
    else:
        lines.append(" No keywords specified.")
    lines.append("")
    lines.append(sep)

    return "\n".join(lines)


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def main():
    parser = argparse.ArgumentParser(
        description="Check keyword density and structure of a content draft.",
        epilog="Example: uv run --with openpyxl python seo_optimizer.py draft.md --cora-xlsx report.xlsx",
    )
    parser.add_argument(
        "draft_path",
        help="Path to the content draft (plain text or markdown)",
    )
    parser.add_argument(
        "--keyword",
        dest="keyword",
        default=None,
        help="Primary keyword to evaluate",
    )
    parser.add_argument(
        "--cora-xlsx",
        dest="cora_xlsx",
        default=None,
        help="Path to a Cora XLSX report for keyword-specific targets",
    )
    parser.add_argument(
        "--format",
        choices=["json", "text"],
        default="text",
        help="Output format (default: text)",
    )
    args = parser.parse_args()

    optimizer = SEOOptimizer()

    try:
        results = optimizer.analyze(
            draft_path=args.draft_path,
            primary_keyword=args.keyword,
            cora_xlsx_path=args.cora_xlsx,
        )
    except FileNotFoundError as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)
    except Exception as e:
        print(f"Error during analysis: {e}", file=sys.stderr)
        sys.exit(1)

    if args.format == "json":
        print(json.dumps(results, indent=2, default=str))
    else:
        print(_format_text_report(results))


if __name__ == "__main__":
    main()
@@ -1,469 +0,0 @@
#!/usr/bin/env python3
"""
Test Block Generator — Programmatically Assemble Test Blocks from Templates

Takes LLM-generated sentence templates (with {N} slots for body text) and
pre-written headings, plus an LLM-curated entity list, and assembles a test
block. Tracks aggregate densities in real-time and stops when targets are met.

The LLM handles all intelligence: filtering entities for topical relevance,
writing headings, creating body templates. This script handles all math:
slot filling, density tracking, stop conditions.

Usage:
    uv run --with openpyxl python test_block_generator.py <templates_path> <prep_json_path> <cora_xlsx_path>
        --entities-file <path> [--output-dir ./working/] [--min-sentences 5]
"""

import argparse
import json
import re
import sys
from pathlib import Path

from cora_parser import CoraReport


# ---------------------------------------------------------------------------
# Term selection
# ---------------------------------------------------------------------------

def load_entity_names(entities_file: str) -> list[str]:
    """Load LLM-curated entity names from file (one per line)."""
    path = Path(entities_file)
    if not path.exists():
        print(f"Error: entities file not found: {path}", file=sys.stderr)
        sys.exit(1)

    names = []
    for line in path.read_text(encoding="utf-8").splitlines():
        name = line.strip()
        if name:
            names.append(name)
    return names


def build_term_queue(
    filtered_entity_names: list[str],
    variations: list[str],
) -> list[str]:
    """Build a flat priority-ordered term list.

    Order: filtered entities (LLM-curated, in provided order) -> keyword variations.
    """
    terms = []
    seen = set()

    # 1. Filtered entities from LLM (already curated for topical relevance)
    for name in filtered_entity_names:
        if name.lower() not in seen:
            terms.append(name)
            seen.add(name.lower())

    # 2. Keyword variations
    for v in variations:
        if v.lower() not in seen:
            terms.append(v)
            seen.add(v.lower())

    return terms


# ---------------------------------------------------------------------------
# Generator
# ---------------------------------------------------------------------------

class TestBlockGenerator:
    """Fills body templates with entity/variation terms, inserts pre-written
    headings, and tracks aggregate densities."""

    def __init__(self, cora_xlsx_path: str, prep_data: dict, filtered_entity_names: list[str]):
        self.report = CoraReport(cora_xlsx_path)
        self.prep = prep_data
        self.entities = self.report.get_entities()
        self.variations = self.report.get_variations_list()

        # Compile regex patterns for counting (built once, used per sentence)
        self.entity_patterns = {}
        for e in self.entities:
            name = e["name"]
            self.entity_patterns[name] = re.compile(
                r"\b" + re.escape(name) + r"\b", re.IGNORECASE
            )

        self.variation_patterns = {}
        for v in self.variations:
            self.variation_patterns[v] = re.compile(
                r"\b" + re.escape(v) + r"\b", re.IGNORECASE
            )

        # Build term queue from LLM-curated entity list
        self.term_queue = build_term_queue(filtered_entity_names, self.variations)
        self.term_idx = 0

        # Track which 0->1 entities have been introduced
        # Use the full missing list from prep to track introductions accurately
        missing = prep_data.get("distinct_entities", {}).get("missing_entities", [])
        self.missing_names = {e["name"] for e in missing}
        self.introduced = set()

        # Running totals for new content
        self.new_words = 0
        self.new_entity_mentions = 0
        self.new_variation_mentions = 0
        self.new_h2_count = 0
        self.new_h3_count = 0

        # Baseline from prep
        self.base_words = prep_data["word_count"]["current"]
        self.base_entity_mentions = prep_data["entity_density"]["current_mentions"]
        self.base_variation_mentions = prep_data["variation_density"]["current_mentions"]
        self.target_entity_d = prep_data["entity_density"]["target_decimal"]
        self.target_variation_d = prep_data["variation_density"]["target_decimal"]

    def pick_term(self, used_in_sentence: set) -> str:
        """Pick next term from the queue, skipping duplicates within a sentence."""
        if not self.term_queue:
            return "equipment"

        used_lower = {u.lower() for u in used_in_sentence}
        for _ in range(len(self.term_queue)):
            term = self.term_queue[self.term_idx % len(self.term_queue)]
            self.term_idx = (self.term_idx + 1) % len(self.term_queue)
            if term.lower() not in used_lower:
                return term

        # All exhausted for this sentence, return next anyway
        term = self.term_queue[self.term_idx % len(self.term_queue)]
        self.term_idx = (self.term_idx + 1) % len(self.term_queue)
        return term

    def fill_template(self, template: str) -> str:
        """Fill a template's {N} slots with terms."""
        slots = re.findall(r"\{(\d+)\}", template)
        used = set()
        filled = template

        for slot_num in slots:
            term = self.pick_term(used)
            used.add(term)
            filled = filled.replace(f"{{{slot_num}}}", term, 1)

        return filled
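
    # Template example (illustrative, not from the report): a template like
    # "Most {1} work pairs a {2} with bar stock." could fill to
    # "Most Swiss machining work pairs a guide bushing with bar stock."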

    def count_sentence(self, text: str) -> tuple[int, int, int]:
        """Count words, entity mentions, and variation mentions in text.

        Also tracks which 0->1 entities have been introduced.
        Returns: (word_count, entity_mentions, variation_mentions)
        """
        entity_mentions = 0
        for name, pattern in self.entity_patterns.items():
            count = len(pattern.findall(text))
            entity_mentions += count
            if count > 0 and name in self.missing_names:
                self.introduced.add(name)

        variation_mentions = 0
        for v, pattern in self.variation_patterns.items():
            variation_mentions += len(pattern.findall(text))

        words = len(re.findall(r"[a-zA-Z']+", text))
        return words, entity_mentions, variation_mentions

    def projected_density(self, metric: str) -> float:
        """Calculate projected density after current additions."""
        total_words = self.base_words + self.new_words
        if total_words == 0:
            return 0.0

        if metric == "entity":
            return (self.base_entity_mentions + self.new_entity_mentions) / total_words
        elif metric == "variation":
            return (self.base_variation_mentions + self.new_variation_mentions) / total_words
        return 0.0
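
    # Projection example (illustrative numbers): a 1,000-word page with 40
    # entity mentions, after 200 new words carrying 12 new mentions, projects
    # (40 + 12) / (1000 + 200) = 0.0433, i.e. 4.33% entity density.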
|
|
||||||
|
|
||||||
def targets_met(self, min_reached: bool) -> bool:
|
|
||||||
"""Check if all density targets are met and minimums reached."""
|
|
||||||
if not min_reached:
|
|
||||||
return False
|
|
||||||
|
|
||||||
entity_ok = self.projected_density("entity") >= self.target_entity_d
|
|
||||||
variation_ok = self.projected_density("variation") >= self.target_variation_d
|
|
||||||
|
|
||||||
distinct_deficit = self.prep["distinct_entities"]["deficit"]
|
|
||||||
distinct_ok = len(self.introduced) >= distinct_deficit
|
|
||||||
|
|
||||||
wc_deficit = self.prep["word_count"]["deficit"]
|
|
||||||
wc_ok = self.new_words >= wc_deficit
|
|
||||||
|
|
||||||
return entity_ok and variation_ok and distinct_ok and wc_ok
|
|
||||||
|
|
||||||
def generate(
|
|
||||||
self,
|
|
||||||
templates: list[str],
|
|
||||||
min_sentences: int = 5,
|
|
||||||
) -> dict:
|
|
||||||
"""Generate the test block by filling body templates and inserting
|
|
||||||
pre-written headings.
|
|
||||||
|
|
||||||
Args:
|
|
||||||
templates: List of template strings. Lines starting with "H2:" or
|
|
||||||
"H3:" are pre-written headings (inserted as-is, no slot filling).
|
|
||||||
Everything else is a body template with {N} slots.
|
|
||||||
min_sentences: Minimum sentences before checking stop condition.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
Dict with "sentences" list and "stats" summary.
|
|
||||||
"""
|
|
||||||
h2_headings = []
|
|
||||||
h3_headings = []
|
|
||||||
body_templates = []

        for t in templates:
            t = t.strip()
            if not t:
                continue
            if t.upper().startswith("H2:"):
                h2_headings.append(t[3:].strip())
            elif t.upper().startswith("H3:"):
                h3_headings.append(t[3:].strip())
            else:
                body_templates.append(t)

        if not body_templates:
            return {"error": "No body templates found", "sentences": [], "stats": {}}

        h2_needed = self.prep["headings"]["h2"]["deficit"]
        h3_needed = self.prep["headings"]["h3"]["deficit"]

        sentences = []
        count = 0
        body_idx = 0
        h2_idx = 0
        h3_idx = 0
        max_iter = max(len(body_templates) * 3, 60)

        for _ in range(max_iter):
            # Insert pre-written heading if deficit exists and we're at a paragraph break
            if h2_needed > 0 and h2_headings and count % 5 == 0:
                text = h2_headings[h2_idx % len(h2_headings)]
                w, e, v = self.count_sentence(text)
                self.new_words += w
                self.new_entity_mentions += e
                self.new_variation_mentions += v
                self.new_h2_count += 1
                h2_needed -= 1
                h2_idx += 1
                sentences.append({"text": text, "type": "h2"})
                count += 1
                continue

            if h3_needed > 0 and h3_headings and count > 0 and count % 3 == 0:
                text = h3_headings[h3_idx % len(h3_headings)]
                w, e, v = self.count_sentence(text)
                self.new_words += w
                self.new_entity_mentions += e
                self.new_variation_mentions += v
                self.new_h3_count += 1
                h3_needed -= 1
                h3_idx += 1
                sentences.append({"text": text, "type": "h3"})
                count += 1
                continue

            # Body sentence — fill template slots
            tmpl = body_templates[body_idx % len(body_templates)]
            filled = self.fill_template(tmpl)
            w, e, v = self.count_sentence(filled)
            self.new_words += w
            self.new_entity_mentions += e
            self.new_variation_mentions += v
            body_idx += 1
            sentences.append({"text": filled, "type": "body"})
            count += 1

            if self.targets_met(count >= min_sentences):
                break

        return {
            "sentences": sentences,
            "stats": {
                "total_sentences": count,
                "new_words": self.new_words,
                "new_entity_mentions": self.new_entity_mentions,
                "new_variation_mentions": self.new_variation_mentions,
                "new_distinct_entities_introduced": len(self.introduced),
                "introduced_entities": sorted(self.introduced),
                "new_h2_count": self.new_h2_count,
                "new_h3_count": self.new_h3_count,
                "projected_entity_density_pct": round(
                    self.projected_density("entity") * 100, 2
                ),
                "projected_variation_density_pct": round(
                    self.projected_density("variation") * 100, 2
                ),
                "target_entity_density_pct": round(self.target_entity_d * 100, 2),
                "target_variation_density_pct": round(self.target_variation_d * 100, 2),
            },
        }


# ---------------------------------------------------------------------------
# Output formatting
# ---------------------------------------------------------------------------

def format_markdown(sentences: list[dict]) -> str:
    """Convert sentence list to markdown with test block markers."""
    lines = ["<!-- HIDDEN TEST BLOCK START -->", ""]
    paragraph = []

    for s in sentences:
        if s["type"] in ("h2", "h3"):
            # Flush paragraph before heading
            if paragraph:
                lines.append(" ".join(paragraph))
                lines.append("")
                paragraph = []
            prefix = "##" if s["type"] == "h2" else "###"
            lines.append(f"{prefix} {s['text']}")
            lines.append("")
        else:
            paragraph.append(s["text"])
            if len(paragraph) >= 4:
                lines.append(" ".join(paragraph))
                lines.append("")
                paragraph = []

    if paragraph:
        lines.append(" ".join(paragraph))
        lines.append("")

    lines.append("<!-- HIDDEN TEST BLOCK END -->")
    return "\n".join(lines)
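# Example format_markdown output (hypothetical sentences, illustrative only):
#   <!-- HIDDEN TEST BLOCK START -->
#
#   ## Swiss Machining Capabilities
#
#   First sentence. Second sentence. Third sentence. Fourth sentence.
#
#   <!-- HIDDEN TEST BLOCK END -->
# Body sentences accumulate into paragraphs of up to four; a heading flushes
# the open paragraph first.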


def format_html(sentences: list[dict]) -> str:
    """Convert sentence list to HTML with test block markers."""
    lines = ["<!-- HIDDEN TEST BLOCK START -->", ""]
    paragraph = []

    for s in sentences:
        if s["type"] in ("h2", "h3"):
            if paragraph:
                lines.append("<p>" + " ".join(paragraph) + "</p>")
                lines.append("")
                paragraph = []
            tag = "h2" if s["type"] == "h2" else "h3"
            lines.append(f"<{tag}>{s['text']}</{tag}>")
            lines.append("")
        else:
            paragraph.append(s["text"])
            if len(paragraph) >= 4:
                lines.append("<p>" + " ".join(paragraph) + "</p>")
                lines.append("")
                paragraph = []

    if paragraph:
        lines.append("<p>" + " ".join(paragraph) + "</p>")
        lines.append("")

    lines.append("<!-- HIDDEN TEST BLOCK END -->")
    return "\n".join(lines)


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def main():
    parser = argparse.ArgumentParser(
        description="Generate a test block from templates and deficit data.",
    )
    parser.add_argument("templates_path", help="Path to templates file (one per line)")
    parser.add_argument("prep_json_path", help="Path to prep JSON from test_block_prep.py")
    parser.add_argument("cora_xlsx_path", help="Path to Cora XLSX report")
    parser.add_argument(
        "--entities-file", required=True,
        help="Path to LLM-curated entity list (one name per line)",
    )
    parser.add_argument(
        "--output-dir", default="./working",
        help="Directory for output files (default: ./working)",
    )
    parser.add_argument(
        "--min-sentences", type=int, default=5,
        help="Minimum sentences before checking stop condition (default: 5)",
    )
    args = parser.parse_args()

    # Load inputs
    templates_path = Path(args.templates_path)
    if not templates_path.exists():
        print(f"Error: templates file not found: {templates_path}", file=sys.stderr)
        sys.exit(1)

    templates = [
        line.strip()
        for line in templates_path.read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]

    prep_path = Path(args.prep_json_path)
    if not prep_path.exists():
        print(f"Error: prep JSON not found: {prep_path}", file=sys.stderr)
        sys.exit(1)

    prep_data = json.loads(prep_path.read_text(encoding="utf-8"))

    # Load LLM-curated entity list
    filtered_entity_names = load_entity_names(args.entities_file)

    # Generate
    gen = TestBlockGenerator(args.cora_xlsx_path, prep_data, filtered_entity_names)
    result = gen.generate(templates, min_sentences=args.min_sentences)

    if "error" in result and result["error"]:
        print(f"Error: {result['error']}", file=sys.stderr)
        sys.exit(1)

    # Write outputs
    out_dir = Path(args.output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    md_path = out_dir / "test_block.md"
    html_path = out_dir / "test_block.html"
    txt_path = out_dir / "test_block.txt"
    stats_path = out_dir / "test_block_stats.json"

    md_content = format_markdown(result["sentences"])
    html_content = format_html(result["sentences"])

    md_path.write_text(md_content, encoding="utf-8")
    html_path.write_text(html_content, encoding="utf-8")
    txt_path.write_text(html_content, encoding="utf-8")
    stats_path.write_text(
        json.dumps(result["stats"], indent=2, default=str), encoding="utf-8"
    )

    # Print summary
    stats = result["stats"]
    print("Test block generated:")
    print(f"  Sentences: {stats['total_sentences']}")
    print(f"  Words: {stats['new_words']}")
    print(f"  Entity mentions: {stats['new_entity_mentions']}")
    print(f"  Variation mentions: {stats['new_variation_mentions']}")
    print(f"  New 0->1 entities: {stats['new_distinct_entities_introduced']}")
    print(f"  Projected entity density: {stats['projected_entity_density_pct']}%"
          f" (target: {stats['target_entity_density_pct']}%)")
    print(f"  Projected variation density: {stats['projected_variation_density_pct']}%"
          f" (target: {stats['target_variation_density_pct']}%)")
    print("\nFiles written:")
    print(f"  {md_path}")
    print(f"  {html_path}")
    print(f"  {txt_path}")
    print(f"  {stats_path}")


if __name__ == "__main__":
    main()
@ -1,578 +0,0 @@
#!/usr/bin/env python3
"""
Test Block Prep — Extract Deficit Data for Test Block Generation

Reads existing content (from competitor_scraper.py output or plain text) and a
Cora XLSX report, then calculates all deficit metrics needed to programmatically
generate a test block.

Outputs structured JSON with:
- Word count vs target + deficit
- Distinct entity count vs target + deficit + list of missing entities
- Variation density vs target + deficit (Cora row 46)
- Entity density vs target + deficit (Cora row 47)
- LSI density vs target + deficit (Cora row 48)
- Heading structure deficits
- Template generation instructions (slots per sentence, sentence count, etc.)

Usage:
    uv run --with openpyxl python test_block_prep.py <content_path> <cora_xlsx_path>
        [--format json|text]
"""

import argparse
import json
import math
import re
import sys
from pathlib import Path

from cora_parser import CoraReport
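# Abbreviated shape of the dict returned by run_prep() below (illustrative
# values, not from a real report):
#   {"word_count": {"current": 850, "target": 1400, "deficit": 550, ...},
#    "entity_density": {"current_pct": 1.2, "target_pct": 2.1, ...},
#    "headings": {"h2": {"current": 4, "target": 6, "deficit": 2}, ...},
#    "template_instructions": {"num_templates": 12, "slots_per_sentence": 3, ...}}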


# ---------------------------------------------------------------------------
# Content parsing
# ---------------------------------------------------------------------------

def parse_scraper_content(file_path: str) -> dict:
    """Parse a competitor_scraper.py output file or plain text/markdown.

    Returns dict with: headings, content, word_count, title, meta_description.
    """
    text = Path(file_path).read_text(encoding="utf-8")

    result = {
        "headings": [],
        "content": "",
        "word_count": 0,
        "title": "",
        "meta_description": "",
    }

    if "--- HEADINGS ---" in text and "--- CONTENT ---" in text:
        headings_start = text.index("--- HEADINGS ---")
        content_start = text.index("--- CONTENT ---")

        # Parse metadata
        metadata = text[:headings_start]
        for line in metadata.splitlines():
            if line.startswith("Title: "):
                result["title"] = line[7:].strip()
            elif line.startswith("Meta Description: "):
                result["meta_description"] = line[18:].strip()

        # Parse headings
        headings_text = text[headings_start + len("--- HEADINGS ---"):content_start].strip()
        for line in headings_text.splitlines():
            line = line.strip()
            match = re.match(r"H(\d):\s+(.+)", line)
            if match:
                result["headings"].append({
                    "level": int(match.group(1)),
                    "text": match.group(2).strip(),
                })

        # Parse content
        result["content"] = text[content_start + len("--- CONTENT ---"):].strip()
    else:
        # Plain text/markdown
        result["content"] = text.strip()
        for match in re.finditer(r"^(#{1,6})\s+(.+)$", text, re.MULTILINE):
            result["headings"].append({
                "level": len(match.group(1)),
                "text": match.group(2).strip(),
            })

    words = re.findall(r"[a-zA-Z']+", result["content"])
    result["word_count"] = len(words)
    return result


# ---------------------------------------------------------------------------
# Counting functions
# ---------------------------------------------------------------------------

def count_entity_mentions(text: str, entities: list[dict]) -> dict:
    """Count mentions of each Cora entity in text.

    Returns: per_entity dict, total_mentions, distinct_count.
    """
    per_entity = {}
    total_mentions = 0
    distinct_count = 0

    for entity in entities:
        name = entity["name"]
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        count = len(pattern.findall(text))
        per_entity[name] = count
        total_mentions += count
        if count > 0:
            distinct_count += 1

    return {
        "per_entity": per_entity,
        "total_mentions": total_mentions,
        "distinct_count": distinct_count,
    }


def count_variation_mentions(text: str, variations: list[str]) -> dict:
    """Count mentions of each keyword variation in text.

    Returns: per_variation dict, total_mentions.
    """
    per_variation = {}
    total_mentions = 0

    for var in variations:
        pattern = re.compile(r"\b" + re.escape(var) + r"\b", re.IGNORECASE)
        count = len(pattern.findall(text))
        per_variation[var] = count
        total_mentions += count

    return {
        "per_variation": per_variation,
        "total_mentions": total_mentions,
    }


def count_lsi_mentions(text: str, lsi_keywords: list[dict]) -> dict:
    """Count mentions of each LSI keyword in text.

    Returns: per_keyword dict, total_mentions, distinct_count.
    """
    per_keyword = {}
    total_mentions = 0
    distinct_count = 0

    for kw_data in lsi_keywords:
        keyword = kw_data["keyword"]
        # Multi-word keywords match across any run of whitespace,
        # e.g. "swiss machining" compiles to r"\bswiss\s+machining\b".
        tokens = keyword.strip().split()
        escaped = [re.escape(t) for t in tokens]
        pattern_str = r"\b" + r"\s+".join(escaped) + r"\b"
        pattern = re.compile(pattern_str, re.IGNORECASE)
        count = len(pattern.findall(text))
        per_keyword[keyword] = count
        total_mentions += count
        if count > 0:
            distinct_count += 1

    return {
        "per_keyword": per_keyword,
        "total_mentions": total_mentions,
        "distinct_count": distinct_count,
    }


def count_terms_in_headings(
    headings: list[dict],
    entities: list[dict],
    variations: list[str],
) -> dict:
    """Count entity and variation mentions in heading text.

    Returns total counts and per-level breakdown.
    """
    all_heading_text = " ".join(h["text"] for h in headings)

    entity_mentions = 0
    for entity in entities:
        pattern = re.compile(r"\b" + re.escape(entity["name"]) + r"\b", re.IGNORECASE)
        entity_mentions += len(pattern.findall(all_heading_text))

    variation_mentions = 0
    for var in variations:
        pattern = re.compile(r"\b" + re.escape(var) + r"\b", re.IGNORECASE)
        variation_mentions += len(pattern.findall(all_heading_text))

    per_level = {}
    for level in [2, 3]:
        level_headings = [h for h in headings if h["level"] == level]
        level_text = " ".join(h["text"] for h in level_headings)

        lev_entity = 0
        for entity in entities:
            pattern = re.compile(r"\b" + re.escape(entity["name"]) + r"\b", re.IGNORECASE)
            lev_entity += len(pattern.findall(level_text))

        lev_var = 0
        for var in variations:
            pattern = re.compile(r"\b" + re.escape(var) + r"\b", re.IGNORECASE)
            lev_var += len(pattern.findall(level_text))

        per_level[f"h{level}"] = {
            "count": len(level_headings),
            "entity_mentions": lev_entity,
            "variation_mentions": lev_var,
        }

    return {
        "entity_mentions_total": entity_mentions,
        "variation_mentions_total": variation_mentions,
        "per_level": per_level,
    }


# ---------------------------------------------------------------------------
# Template instruction calculation
# ---------------------------------------------------------------------------

def calculate_template_instructions(
    current_words: int,
    current_entity_mentions: int,
    current_variation_mentions: int,
    target_entity_density: float,
    target_variation_density: float,
    distinct_entity_deficit: int,
    word_count_deficit: int,
) -> dict:
    """Calculate template parameters for the generator script.

    Figures out how many words the test block needs, how many slots per
    sentence, and how many sentences — so the LLM knows what to generate.
    """
    AVG_WORDS_PER_SENTENCE = 15
    MAX_SLOTS = 5
    MIN_SLOTS = 2

    current_entity_density = current_entity_mentions / current_words if current_words > 0 else 0
    current_variation_density = current_variation_mentions / current_words if current_words > 0 else 0

    # Minimum test block size from word count deficit
    min_words = max(word_count_deficit, 150)

    # Calculate minimum words needed to close entity density gap
    entity_deficit_pct = target_entity_density - current_entity_density
    if entity_deficit_pct > 0:
        # At max internal density (MAX_SLOTS / AVG_WORDS), how many words?
        max_internal = MAX_SLOTS / AVG_WORDS_PER_SENTENCE
        if max_internal > target_entity_density:
            needed = (target_entity_density * current_words - current_entity_mentions)
            words_for_entity = math.ceil(needed / (max_internal - target_entity_density))
            min_words = max(min_words, words_for_entity)
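            # Sketch of the algebra behind words_for_entity: we need
            # (m + d*x) / (w + x) >= t, where m/w are current mentions/words,
            # t is the target density, d = MAX_SLOTS / AVG_WORDS_PER_SENTENCE
            # is the densest the new block can be, and x is the new word
            # count. Solving gives x >= (t*w - m) / (d - t), which is the
            # ceil() above; the guard ensures d > t so the division is valid.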

    # Same for variation density gap
    var_deficit_pct = target_variation_density - current_variation_density
    if var_deficit_pct > 0:
        max_internal = MAX_SLOTS / AVG_WORDS_PER_SENTENCE
        if max_internal > target_variation_density:
            needed = (target_variation_density * current_words - current_variation_mentions)
            words_for_var = math.ceil(needed / (max_internal - target_variation_density))
            min_words = max(min_words, words_for_var)

    # If only distinct entities are in deficit (densities met), use a smaller block
    if entity_deficit_pct <= 0 and var_deficit_pct <= 0 and distinct_entity_deficit > 0:
        min_words = max(150, distinct_entity_deficit * AVG_WORDS_PER_SENTENCE)

    # Round up to nearest 50
    target_words = math.ceil(max(min_words, 150) / 50) * 50

    # Required entity mentions in test block
    if target_entity_density > 0:
        total_needed = math.ceil(target_entity_density * (current_words + target_words))
        entity_mentions_needed = max(0, total_needed - current_entity_mentions)
    else:
        entity_mentions_needed = max(distinct_entity_deficit, 0)

    # Required variation mentions in test block
    if target_variation_density > 0:
        total_needed = math.ceil(target_variation_density * (current_words + target_words))
        variation_mentions_needed = max(0, total_needed - current_variation_mentions)
    else:
        variation_mentions_needed = 0

    # Derive slots per sentence
    target_sentences = max(1, math.ceil(target_words / AVG_WORDS_PER_SENTENCE))
    total_slots = entity_mentions_needed + variation_mentions_needed
    # Overlapping terms count toward both, so reduce estimate
    total_slots = max(total_slots, entity_mentions_needed)
    slots_per_sentence = math.ceil(total_slots / target_sentences) if target_sentences > 0 else MIN_SLOTS
    slots_per_sentence = max(MIN_SLOTS, min(MAX_SLOTS, slots_per_sentence))

    # Number of templates is derived from two factors:
    # 1. Word deficit: how many sentences to fill the word gap
    word_driven = math.ceil(target_words / AVG_WORDS_PER_SENTENCE)
    # 2. Entity deficit: how many sentences to introduce all missing entities
    entity_driven = math.ceil(distinct_entity_deficit / slots_per_sentence) if slots_per_sentence > 0 else 0
    num_templates = max(word_driven, entity_driven, 5)

    return {
        "target_word_count": target_words,
        "num_templates": num_templates,
        "num_templates_reason": "word_deficit" if word_driven >= entity_driven else "entity_deficit",
        "slots_per_sentence": slots_per_sentence,
        "avg_words_per_template": AVG_WORDS_PER_SENTENCE,
        "entity_mentions_needed": entity_mentions_needed,
        "variation_mentions_needed": variation_mentions_needed,
        "rationale": (
            f"Need ~{entity_mentions_needed} entity mentions and "
            f"~{variation_mentions_needed} variation mentions "
            f"across ~{target_words} words. "
            f"Templates: {num_templates} (driven by {'word deficit' if word_driven >= entity_driven else 'entity deficit'}), "
            f"{slots_per_sentence} slots each."
        ),
    }


# ---------------------------------------------------------------------------
# Main prep function
# ---------------------------------------------------------------------------

def run_prep(content_path: str, cora_xlsx_path: str) -> dict:
    """Run the full test block prep analysis."""
    report = CoraReport(cora_xlsx_path)
    entities = report.get_entities()
    lsi_keywords = report.get_lsi_keywords()
    variations_list = report.get_variations_list()
    density_targets = report.get_density_targets()
    content_targets = report.get_content_targets()
    structure_targets = report.get_structure_targets()
    word_count_dist = report.get_word_count_distribution()

    # Parse existing content
    parsed = parse_scraper_content(content_path)
    content_text = parsed["content"]
    current_words = parsed["word_count"]
    headings = parsed["headings"]

    # --- Word count ---
    cluster_target = word_count_dist.get("cluster_target", 0)
    wc_target = cluster_target if cluster_target else word_count_dist.get("average", 0)
    wc_deficit = max(0, wc_target - current_words)

    # --- Entity counts ---
    entity_data = count_entity_mentions(content_text, entities)
    distinct_target = content_targets.get("distinct_entities", {}).get("target", 0)
    distinct_deficit = max(0, distinct_target - entity_data["distinct_count"])

    # Missing entities (0 count, sorted by relevance)
    missing_entities = []
    for entity in entities:
        if entity_data["per_entity"].get(entity["name"], 0) == 0:
            missing_entities.append({
                "name": entity["name"],
                "relevance": entity.get("relevance") or 0,
                "type": entity.get("type", ""),
            })
    missing_entities.sort(key=lambda e: e["relevance"], reverse=True)

    # --- Variation counts ---
    variation_data = count_variation_mentions(content_text, variations_list)

    # --- LSI counts ---
    lsi_data = count_lsi_mentions(content_text, lsi_keywords)

    # --- Density calculations ---
    cur_entity_d = entity_data["total_mentions"] / current_words if current_words else 0
    cur_var_d = variation_data["total_mentions"] / current_words if current_words else 0
    cur_lsi_d = lsi_data["total_mentions"] / current_words if current_words else 0

    tgt_entity_d = density_targets.get("entity_density", {}).get("avg") or 0
    tgt_var_d = density_targets.get("variation_density", {}).get("avg") or 0
    tgt_lsi_d = density_targets.get("lsi_density", {}).get("avg") or 0

    # --- Heading analysis ---
    heading_data = count_terms_in_headings(headings, entities, variations_list)
    h2_target = structure_targets.get("h2", {}).get("count", {}).get("target", 0)
    h3_target = structure_targets.get("h3", {}).get("count", {}).get("target", 0)
    h2_current = heading_data["per_level"].get("h2", {}).get("count", 0)
    h3_current = heading_data["per_level"].get("h3", {}).get("count", 0)

    all_h_var_target = structure_targets.get("all_h_tags", {}).get("variations", {}).get("target", 0)
    all_h_ent_target = structure_targets.get("all_h_tags", {}).get("entities", {}).get("target", 0)

    # --- Template instructions ---
    template_inst = calculate_template_instructions(
        current_words=current_words,
        current_entity_mentions=entity_data["total_mentions"],
        current_variation_mentions=variation_data["total_mentions"],
        target_entity_density=tgt_entity_d,
        target_variation_density=tgt_var_d,
        distinct_entity_deficit=distinct_deficit,
        word_count_deficit=wc_deficit,
    )

    return {
        "search_term": report.get_search_term(),
        "content_file": content_path,
        "word_count": {
            "current": current_words,
            "target": wc_target,
            "deficit": wc_deficit,
            "status": "meets_target" if wc_deficit == 0 else "below_target",
        },
        "distinct_entities": {
            "current": entity_data["distinct_count"],
            "target": distinct_target,
            "deficit": distinct_deficit,
            "total_tracked": len(entities),
            "missing_entities": missing_entities,
        },
        "entity_density": {
            "current_pct": round(cur_entity_d * 100, 2),
            "target_pct": round(tgt_entity_d * 100, 2),
            "deficit_pct": round(max(0, tgt_entity_d - cur_entity_d) * 100, 2),
            "current_mentions": entity_data["total_mentions"],
            "target_decimal": tgt_entity_d,
            "current_decimal": cur_entity_d,
            "status": "meets_target" if cur_entity_d >= tgt_entity_d else "below_target",
        },
        "variation_density": {
            "current_pct": round(cur_var_d * 100, 2),
            "target_pct": round(tgt_var_d * 100, 2),
            "deficit_pct": round(max(0, tgt_var_d - cur_var_d) * 100, 2),
            "current_mentions": variation_data["total_mentions"],
            "target_decimal": tgt_var_d,
            "current_decimal": cur_var_d,
            "status": "meets_target" if cur_var_d >= tgt_var_d else "below_target",
        },
        "lsi_density": {
            "current_pct": round(cur_lsi_d * 100, 2),
            "target_pct": round(tgt_lsi_d * 100, 2),
            "deficit_pct": round(max(0, tgt_lsi_d - cur_lsi_d) * 100, 2),
            "current_mentions": lsi_data["total_mentions"],
            "target_decimal": tgt_lsi_d,
            "current_decimal": cur_lsi_d,
            "status": "meets_target" if cur_lsi_d >= tgt_lsi_d else "below_target",
        },
        "headings": {
            "h2": {
                "current": h2_current,
                "target": h2_target,
                "deficit": max(0, h2_target - h2_current),
            },
            "h3": {
                "current": h3_current,
                "target": h3_target,
                "deficit": max(0, h3_target - h3_current),
            },
            "variations_in_headings": {
                "current": heading_data["variation_mentions_total"],
                "target": all_h_var_target,
                "deficit": max(0, all_h_var_target - heading_data["variation_mentions_total"]),
            },
            "entities_in_headings": {
                "current": heading_data["entity_mentions_total"],
                "target": all_h_ent_target,
                "deficit": max(0, all_h_ent_target - heading_data["entity_mentions_total"]),
            },
        },
        "template_instructions": template_inst,
    }


# ---------------------------------------------------------------------------
# Output formatting
# ---------------------------------------------------------------------------

def format_text_report(data: dict) -> str:
    """Format prep data as a human-readable text report."""
    lines = []
    sep = "=" * 65

    lines.append(sep)
    lines.append(f" TEST BLOCK PREP — {data['search_term']}")
    lines.append(sep)
    lines.append("")

    # Word count
    wc = data["word_count"]
    lines.append("WORD COUNT")
    lines.append(f"  Current: {wc['current']} | Target: {wc['target']} | Deficit: {wc['deficit']} [{wc['status']}]")
    lines.append("")

    # Distinct entities
    de = data["distinct_entities"]
    lines.append("DISTINCT ENTITIES")
    lines.append(f"  Current: {de['current']} | Target: {de['target']} | Deficit: {de['deficit']} (of {de['total_tracked']} tracked)")
    if de["missing_entities"]:
        lines.append("  Top missing (0->1):")
        for ent in de["missing_entities"][:15]:
            lines.append(f"    - {ent['name']} (relevance: {ent['relevance']}, type: {ent['type']})")
        remaining = len(de["missing_entities"]) - 15
        if remaining > 0:
            lines.append(f"    ... and {remaining} more")
    lines.append("")

    # Entity density
    ed = data["entity_density"]
    lines.append("ENTITY DENSITY (Cora row 47)")
    lines.append(f"  Current: {ed['current_pct']}% | Target: {ed['target_pct']}% | Deficit: {ed['deficit_pct']}% [{ed['status']}]")
    lines.append(f"  Current mentions: {ed['current_mentions']}")
    lines.append("")

    # Variation density
    vd = data["variation_density"]
    lines.append("VARIATION DENSITY (Cora row 46)")
    lines.append(f"  Current: {vd['current_pct']}% | Target: {vd['target_pct']}% | Deficit: {vd['deficit_pct']}% [{vd['status']}]")
    lines.append(f"  Current mentions: {vd['current_mentions']}")
    lines.append("")

    # LSI density
    ld = data["lsi_density"]
    lines.append("LSI DENSITY (Cora row 48)")
    lines.append(f"  Current: {ld['current_pct']}% | Target: {ld['target_pct']}% | Deficit: {ld['deficit_pct']}% [{ld['status']}]")
    lines.append(f"  Current mentions: {ld['current_mentions']}")
    lines.append("")

    # Headings
    hd = data["headings"]
    lines.append("HEADING DEFICITS")
    lines.append(f"  H2: {hd['h2']['current']} current / {hd['h2']['target']} target -- deficit {hd['h2']['deficit']}")
    lines.append(f"  H3: {hd['h3']['current']} current / {hd['h3']['target']} target -- deficit {hd['h3']['deficit']}")
    lines.append(f"  Variations in headings: {hd['variations_in_headings']['current']} / {hd['variations_in_headings']['target']} -- deficit {hd['variations_in_headings']['deficit']}")
    lines.append(f"  Entities in headings: {hd['entities_in_headings']['current']} / {hd['entities_in_headings']['target']} -- deficit {hd['entities_in_headings']['deficit']}")
    lines.append("")

    # Template instructions
    ti = data["template_instructions"]
    lines.append("TEMPLATE INSTRUCTIONS")
    lines.append(f"  {ti['rationale']}")
    lines.append(f"  >> Generate {ti['num_templates']} templates, ~{ti['avg_words_per_template']} words each, {ti['slots_per_sentence']} slots per template")
    lines.append("")

    lines.append(sep)
    return "\n".join(lines)


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def main():
    parser = argparse.ArgumentParser(
        description="Extract deficit data for test block generation.",
    )
    parser.add_argument("content_path", help="Path to scraper output or content file")
    parser.add_argument("cora_xlsx_path", help="Path to Cora XLSX report")
    parser.add_argument(
        "--format", choices=["json", "text"], default="text",
        help="Output format (default: text)",
    )
    parser.add_argument(
        "--output", "-o", default=None,
        help="Write output to file instead of stdout",
    )
    args = parser.parse_args()

    try:
        data = run_prep(args.content_path, args.cora_xlsx_path)
    except FileNotFoundError as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)

    if args.format == "json":
        output = json.dumps(data, indent=2, default=str)
    else:
        output = format_text_report(data)

    if args.output:
        Path(args.output).write_text(output, encoding="utf-8")
        print(f"Written to {args.output}", file=sys.stderr)
    else:
        print(output)


if __name__ == "__main__":
    main()
@ -1,378 +0,0 @@
#!/usr/bin/env python3
"""
Test Block Validator — Before/After Comparison

Runs the same deficit analysis from test_block_prep.py on:
1. Existing content alone (before)
2. Existing content + test block (after)

Produces a deterministic comparison showing exactly how each metric changed.

Usage:
    uv run --with openpyxl python test_block_validate.py <content_path> <test_block_path> <cora_xlsx_path>
        [--format json|text] [--output PATH]
"""

import argparse
import json
import re
import sys
from pathlib import Path

from cora_parser import CoraReport
from test_block_prep import (
    parse_scraper_content,
    count_entity_mentions,
    count_variation_mentions,
    count_lsi_mentions,
    count_terms_in_headings,
)


def extract_test_block_text(file_path: str) -> str:
    """Read test block file and return the text content.

    Strips HTML tags and test block markers. Returns plain text for counting.
    """
    text = Path(file_path).read_text(encoding="utf-8")

    # Remove test block markers
    text = text.replace("<!-- HIDDEN TEST BLOCK START -->", "")
    text = text.replace("<!-- HIDDEN TEST BLOCK END -->", "")

    # Remove HTML tags
    text = re.sub(r"<[^>]+>", " ", text)

    # Remove markdown heading markers
    text = re.sub(r"^#{1,6}\s+", "", text, flags=re.MULTILINE)

    return text.strip()
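# For a hypothetical block "<h2>Swiss Machining</h2><p>Precision parts.</p>",
# extract_test_block_text returns "Swiss Machining  Precision parts." Tags
# are replaced with spaces so adjacent words don't fuse into a single token.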


def extract_test_block_headings(file_path: str) -> list[dict]:
    """Extract heading structure from test block (HTML or markdown)."""
    text = Path(file_path).read_text(encoding="utf-8")
    headings = []

    # Try HTML headings first
    for match in re.finditer(r"<h(\d)>(.+?)</h\d>", text, re.IGNORECASE):
        headings.append({
            "level": int(match.group(1)),
            "text": match.group(2).strip(),
        })

    # If no HTML headings, try markdown
    if not headings:
        for match in re.finditer(r"^(#{1,6})\s+(.+)$", text, re.MULTILINE):
            headings.append({
                "level": len(match.group(1)),
                "text": match.group(2).strip(),
            })

    return headings
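# On a hypothetical mixed input "<h2>Alpha</h2>\n### Beta", only the HTML pass
# fires, returning [{"level": 2, "text": "Alpha"}]; the markdown pass is a
# fallback used only when the file contains no HTML headings at all.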


def run_validation(
    content_path: str,
    test_block_path: str,
    cora_xlsx_path: str,
) -> dict:
    """Run before/after validation.

    Returns dict with: before, after, delta, targets, status.
    """
    report = CoraReport(cora_xlsx_path)
    entities = report.get_entities()
    lsi_keywords = report.get_lsi_keywords()
    variations_list = report.get_variations_list()
    density_targets = report.get_density_targets()
    content_targets = report.get_content_targets()
    structure_targets = report.get_structure_targets()
    word_count_dist = report.get_word_count_distribution()

    # --- Parse existing content ---
    parsed = parse_scraper_content(content_path)
    existing_text = parsed["content"]
    existing_headings = parsed["headings"]

    # --- Parse test block ---
    block_text = extract_test_block_text(test_block_path)
    block_headings = extract_test_block_headings(test_block_path)

    # --- Combined ---
    combined_text = existing_text + "\n\n" + block_text
    combined_headings = existing_headings + block_headings

    # --- Count words ---
    def count_words(t: str) -> int:
        return len(re.findall(r"[a-zA-Z']+", t))

    before_words = count_words(existing_text)
    block_words = count_words(block_text)
    after_words = count_words(combined_text)

    # --- Count entities ---
    before_ent = count_entity_mentions(existing_text, entities)
    after_ent = count_entity_mentions(combined_text, entities)

    # --- Count variations ---
    before_var = count_variation_mentions(existing_text, variations_list)
    after_var = count_variation_mentions(combined_text, variations_list)

    # --- Count LSI ---
    before_lsi = count_lsi_mentions(existing_text, lsi_keywords)
    after_lsi = count_lsi_mentions(combined_text, lsi_keywords)

    # --- Heading analysis ---
    before_hdg = count_terms_in_headings(existing_headings, entities, variations_list)
    after_hdg = count_terms_in_headings(combined_headings, entities, variations_list)

    # --- Targets ---
    tgt_entity_d = density_targets.get("entity_density", {}).get("avg") or 0
    tgt_var_d = density_targets.get("variation_density", {}).get("avg") or 0
    tgt_lsi_d = density_targets.get("lsi_density", {}).get("avg") or 0
    distinct_target = content_targets.get("distinct_entities", {}).get("target", 0)
    cluster_target = word_count_dist.get("cluster_target", 0)
    wc_target = cluster_target if cluster_target else word_count_dist.get("average", 0)

    h2_target = structure_targets.get("h2", {}).get("count", {}).get("target", 0)
    h3_target = structure_targets.get("h3", {}).get("count", {}).get("target", 0)

    # --- Build comparison ---
    def density(mentions, words):
        return mentions / words if words > 0 else 0

    def pct(d):
        return round(d * 100, 2)

    # Find new 0->1 entities
    new_entities = []
    for name, after_count in after_ent["per_entity"].items():
        before_count = before_ent["per_entity"].get(name, 0)
        if before_count == 0 and after_count > 0:
            new_entities.append(name)

    before_h2 = len([h for h in existing_headings if h["level"] == 2])
    after_h2 = len([h for h in combined_headings if h["level"] == 2])
    before_h3 = len([h for h in existing_headings if h["level"] == 3])
    after_h3 = len([h for h in combined_headings if h["level"] == 3])

    return {
        "search_term": report.get_search_term(),
        "test_block_words": block_words,
        "word_count": {
            "before": before_words,
            "after": after_words,
            "target": wc_target,
            "before_status": "meets" if before_words >= wc_target else "below",
            "after_status": "meets" if after_words >= wc_target else "below",
        },
        "distinct_entities": {
            "before": before_ent["distinct_count"],
            "after": after_ent["distinct_count"],
            "target": distinct_target,
            "new_0_to_1": len(new_entities),
            "new_entity_names": sorted(new_entities),
            "before_status": "meets" if before_ent["distinct_count"] >= distinct_target else "below",
            "after_status": "meets" if after_ent["distinct_count"] >= distinct_target else "below",
        },
        "entity_density": {
            "before_pct": pct(density(before_ent["total_mentions"], before_words)),
            "after_pct": pct(density(after_ent["total_mentions"], after_words)),
            "target_pct": pct(tgt_entity_d),
            "before_mentions": before_ent["total_mentions"],
            "after_mentions": after_ent["total_mentions"],
            "delta_mentions": after_ent["total_mentions"] - before_ent["total_mentions"],
            "before_status": "meets" if density(before_ent["total_mentions"], before_words) >= tgt_entity_d else "below",
            "after_status": "meets" if density(after_ent["total_mentions"], after_words) >= tgt_entity_d else "below",
        },
        "variation_density": {
            "before_pct": pct(density(before_var["total_mentions"], before_words)),
            "after_pct": pct(density(after_var["total_mentions"], after_words)),
            "target_pct": pct(tgt_var_d),
            "before_mentions": before_var["total_mentions"],
            "after_mentions": after_var["total_mentions"],
            "delta_mentions": after_var["total_mentions"] - before_var["total_mentions"],
            "before_status": "meets" if density(before_var["total_mentions"], before_words) >= tgt_var_d else "below",
            "after_status": "meets" if density(after_var["total_mentions"], after_words) >= tgt_var_d else "below",
        },
        "lsi_density": {
            "before_pct": pct(density(before_lsi["total_mentions"], before_words)),
            "after_pct": pct(density(after_lsi["total_mentions"], after_words)),
            "target_pct": pct(tgt_lsi_d),
            "before_mentions": before_lsi["total_mentions"],
            "after_mentions": after_lsi["total_mentions"],
            "delta_mentions": after_lsi["total_mentions"] - before_lsi["total_mentions"],
            "before_status": "meets" if density(before_lsi["total_mentions"], before_words) >= tgt_lsi_d else "below",
            "after_status": "meets" if density(after_lsi["total_mentions"], after_words) >= tgt_lsi_d else "below",
        },
        "headings": {
            "h2": {
                "before": before_h2,
                "after": after_h2,
                "target": h2_target,
            },
            "h3": {
                "before": before_h3,
                "after": after_h3,
                "target": h3_target,
            },
            "entities_in_headings": {
                "before": before_hdg["entity_mentions_total"],
                "after": after_hdg["entity_mentions_total"],
            },
            "variations_in_headings": {
                "before": before_hdg["variation_mentions_total"],
                "after": after_hdg["variation_mentions_total"],
            },
        },
    }


# ---------------------------------------------------------------------------
# Output formatting
# ---------------------------------------------------------------------------

def format_text_report(data: dict) -> str:
    """Format validation as a human-readable before/after comparison."""
    lines = []
    sep = "=" * 70

    lines.append(sep)
    lines.append(f" TEST BLOCK VALIDATION -- {data['search_term']}")
    lines.append(f" Test block added {data['test_block_words']} words")
    lines.append(sep)
    lines.append("")

    # Helper for status indicator
    def status(s):
        return "[OK]" if s == "meets" else "[!!]"

    # Metric table
    lines.append(f"  {'METRIC':<30} {'BEFORE':>10} {'AFTER':>10} {'TARGET':>10} {'STATUS':>8}")
    lines.append(f"  {'-'*30} {'-'*10} {'-'*10} {'-'*10} {'-'*8}")

    # Word count
    wc = data["word_count"]
    lines.append(
        f"  {'Word count':<30} {wc['before']:>10} {wc['after']:>10} "
        f"{wc['target']:>10} {status(wc['after_status']):>8}"
    )

    # Distinct entities
    de = data["distinct_entities"]
    lines.append(
        f"  {'Distinct entities':<30} {de['before']:>10} {de['after']:>10} "
        f"{de['target']:>10} {status(de['after_status']):>8}"
    )

    # Entity density
    ed = data["entity_density"]
    lines.append(
        f"  {'Entity density %':<30} {ed['before_pct']:>9}% {ed['after_pct']:>9}% "
        f"{ed['target_pct']:>9}% {status(ed['after_status']):>8}"
    )

    # Variation density
    vd = data["variation_density"]
    lines.append(
        f"  {'Variation density %':<30} {vd['before_pct']:>9}% {vd['after_pct']:>9}% "
        f"{vd['target_pct']:>9}% {status(vd['after_status']):>8}"
    )

    # LSI density
    ld = data["lsi_density"]
    lines.append(
        f"  {'LSI density %':<30} {ld['before_pct']:>9}% {ld['after_pct']:>9}% "
        f"{ld['target_pct']:>9}% {status(ld['after_status']):>8}"
    )

    lines.append("")

    # Mention counts
    lines.append(f"  {'MENTION COUNTS':<30} {'BEFORE':>10} {'AFTER':>10} {'DELTA':>10}")
    lines.append(f"  {'-'*30} {'-'*10} {'-'*10} {'-'*10}")
    lines.append(
        f"  {'Entity mentions':<30} {ed['before_mentions']:>10} "
        f"{ed['after_mentions']:>10} {'+' + str(ed['delta_mentions']):>10}"
    )
    lines.append(
        f"  {'Variation mentions':<30} {vd['before_mentions']:>10} "
        f"{vd['after_mentions']:>10} {'+' + str(vd['delta_mentions']):>10}"
    )
    lines.append(
        f"  {'LSI mentions':<30} {ld['before_mentions']:>10} "
        f"{ld['after_mentions']:>10} {'+' + str(ld['delta_mentions']):>10}"
    )
    lines.append("")

    # Headings
    hd = data["headings"]
    lines.append(f"  {'HEADINGS':<30} {'BEFORE':>10} {'AFTER':>10} {'TARGET':>10}")
    lines.append(f"  {'-'*30} {'-'*10} {'-'*10} {'-'*10}")
    lines.append(f"  {'H2 count':<30} {hd['h2']['before']:>10} {hd['h2']['after']:>10} {hd['h2']['target']:>10}")
    lines.append(f"  {'H3 count':<30} {hd['h3']['before']:>10} {hd['h3']['after']:>10} {hd['h3']['target']:>10}")
    lines.append(
        f"  {'Entities in headings':<30} {hd['entities_in_headings']['before']:>10} "
        f"{hd['entities_in_headings']['after']:>10}"
    )
    lines.append(
        f"  {'Variations in headings':<30} {hd['variations_in_headings']['before']:>10} "
        f"{hd['variations_in_headings']['after']:>10}"
    )
    lines.append("")

    # New entities
    de = data["distinct_entities"]
    if de["new_entity_names"]:
        lines.append(f"  NEW ENTITIES INTRODUCED (0->1): {de['new_0_to_1']}")
        for name in de["new_entity_names"]:
            lines.append(f"    + {name}")
        lines.append("")
    lines.append(sep)

    return "\n".join(lines)


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def main():
    parser = argparse.ArgumentParser(
        description="Validate a test block with before/after comparison.",
    )
    parser.add_argument("content_path", help="Path to existing content (scraper output)")
    parser.add_argument("test_block_path", help="Path to test block (.md or .html)")
    parser.add_argument("cora_xlsx_path", help="Path to Cora XLSX report")
    parser.add_argument(
        "--format", choices=["json", "text"], default="text",
        help="Output format (default: text)",
    )
    parser.add_argument(
        "--output", "-o", default=None,
        help="Write output to file instead of stdout",
    )
    args = parser.parse_args()

    try:
        data = run_validation(args.content_path, args.test_block_path, args.cora_xlsx_path)
    except FileNotFoundError as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)

    if args.format == "json":
        output = json.dumps(data, indent=2, default=str)
    else:
        output = format_text_report(data)

    if args.output:
        Path(args.output).write_text(output, encoding="utf-8")
        print(f"Written to {args.output}", file=sys.stderr)
    else:
        # Fall back to a raw UTF-8 write if the console encoding
        # (e.g. on Windows) can't represent the output
        try:
            print(output)
        except UnicodeEncodeError:
            sys.stdout.buffer.write(output.encode("utf-8"))


if __name__ == "__main__":
    main()
@ -1,583 +0,0 @@
|
||||||
---
|
|
||||||
name: content-researcher
|
|
||||||
description: Research, outline, draft, and optimize SEO web content (service pages, blog posts, product pages) against Cora SEO reports. Create new content. Entity, LSI, and keyword density optimization. Generate entity test blocks (hidden divs).
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
# Content Research & Creation Skill
|
|
||||||
|
|
||||||
Write and optimize SEO web content — service pages, blog posts, product pages, landing pages. Covers the full pipeline: competitor research, outline, drafting, and quantitative optimization against a Cora SEO report (XLSX).
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Invocation
|
|
||||||
|
|
||||||
Use this skill when the user asks to write, research, outline, draft, or optimize web content. Common triggers:
|
|
||||||
|
|
||||||
- "Write a service page about [topic]"
|
|
||||||
- "Let's work on the [topic] page"
|
|
||||||
- "Create content about [topic] for [company]"
|
|
||||||
- "I have a Cora report for [keyword]"
|
|
||||||
- "Optimize this page against the Cora report"
|
|
||||||
- "Help me build an outline for [topic]"
|
|
||||||
- "Research [topic] and write an article"
|
|
||||||
- Any mention of writing web pages, blog posts, or SEO content for a website
|
|
||||||
|
|
||||||
**Routing logic — ask two questions up front:**
|
|
||||||
|
|
||||||
1. "Do you have a Cora report (XLSX) for this keyword?"
|
|
||||||
2. "Do you have existing content to optimize?" (could be a URL to a live page, pasted text, or a file path)
|
|
||||||
|
|
||||||
| Cora report? | Existing content? | Start at |
|
|
||||||
|--------------|-------------------|----------|
|
|
||||||
| No | No | Phase 1, Step 1 (full research → draft workflow) |
|
|
||||||
| Yes | No | Phase 1, Step 1 (research → outline using Cora targets → draft → optimize) |
|
|
||||||
| Yes | Yes | Phase 2, Step 6 (load Cora, optimize existing content) |
|
|
||||||
| No | Yes | Ask user to generate the Cora report first — optimization without Cora targets is guesswork |
|
|
||||||
|
|
||||||
**Existing content from a URL:** If the user provides a URL to a live page (e.g. their WordPress site), **always use the BS4 competitor scraper** to pull the content — never `web_fetch`. The `web_fetch` tool runs content through an AI summarization layer that loses heading structure, drops sections, and can hallucinate product details. The scraper returns the actual HTML heading hierarchy and verbatim text.
|
|
||||||
|
|
||||||
```bash
|
|
||||||
cd {skill_dir}/scripts && uv run --with requests,beautifulsoup4 python competitor_scraper.py "URL" --output-dir ./working/competitor_content/
|
|
||||||
```
|
|
||||||
|
|
||||||
Read the output file, then use the scraped heading structure and body text to build `./working/draft.md`. Preserve the original text verbatim — do not paraphrase or summarize product descriptions, specifications, or technical details. Only restructure headings and add entity/LSI terms where needed for optimization. The user does NOT need to paste or save the content manually.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Phase 1: Research & First Draft
|
|
||||||
|
|
||||||
### Step 1 — Topic Input
|
|
||||||
|
|
||||||
Collect from the user:
|
|
||||||
- **Required:** Topic or keyword
|
|
||||||
- **Optional:** Competitor URLs to examine, industry context, pasted research they've already done, target audience
|
|
||||||
- **For service pages:** Company name, what services/capabilities they actually offer, what they do NOT offer. This prevents writing claims about capabilities the company doesn't have. Ask explicitly: "Is this a service page? If so, what does the company offer and what should I avoid mentioning?"
|
|
||||||
|
|
||||||
For informational/educational articles, company details are less critical — the content is about the topic, not the company. For service pages, company context is mandatory before drafting.
|
|
||||||
|
|
||||||
If the user provides their own research (pasted text, notes, URLs), use that as the primary input. Do not redo research the user has already done.
|
|
||||||
|
|
||||||
### Step 2 — Competitor Research

Research what competitors are publishing on this topic. Three modes depending on user input:

**Mode A — Claude researches (default):**
Use `web_search` to find the top competitor content for the topic. Use the BS4 competitor scraper (not `web_fetch`) to read the most relevant 5-10 results — this preserves accurate heading structure and verbatim text. Focus on:
- What subtopics they cover
- How they structure their content (H2/H3 breakdown)
- What angles or claims they make
- What they leave out (gaps)

**Mode B — User provides URLs:**
If the user gives specific URLs, use the competitor scraper to bulk-fetch them:

```bash
cd {skill_dir}/scripts && uv run --with requests,beautifulsoup4 python competitor_scraper.py URL1 URL2 URL3 --output-dir ./working/competitor_content/
```

Then read the output files and analyze them.

**Mode C — User provides research:**
If the user pastes in research, notes, or analysis, skip scraping and work from what they gave you.

**Output:** Write a research summary covering:
1. Common themes across competitors (what everyone covers)
2. Content structure patterns (how they organize it)
3. Key entities, terms, and concepts mentioned repeatedly
4. Gaps — what competitors miss or cover poorly
5. Potential unique angles

Save the research summary to `./working/research_summary.md`.
### Step 3 — Build Outline

Using the research summary, build a structured outline:

1. **Generate fan-out queries** — Before structuring the outline, generate 10-15 search queries you would use to thoroughly research this topic. These are the natural "next searches" someone would run after the primary keyword — questions, comparisons, material/process specifics, use-case queries. Examples for "cnc swiss screw machining":
   - "what is swiss screw machining"
   - "swiss screw machining vs cnc turning"
   - "swiss machining tolerances"
   - "what materials can be swiss machined"
   - "swiss screw machining for medical devices"
   - "when to use swiss machining vs conventional lathe"

   These queries represent the search cluster around the topic. The more of them the content answers, the more authoritative it becomes across related searches.

2. **Cover the common ground** — Include the themes that all/most competitors address. Missing these makes content look incomplete.
3. **Identify 1-2 unique angles** — Find something competitors are NOT covering well. This is the content's differentiator.
4. **Shape H3 headings from fan-out queries** — Map the strongest fan-out queries to H3 headings. Headings that match real search patterns give the content more surface area across the query cluster. A heading like "What Materials Can Be Swiss Machined?" is better than "Materials" because it mirrors how people actually search.
5. **Structure for scanning** — Use clear H2 sections with H3 subsections. Each H2 should address one major subtopic.
6. **Include notes on each section** — Brief description of what goes in each section and why.

Consult `references/content_frameworks.md` for structural templates (how-to, listicle, comparison, etc.) and select the best fit for the topic.

**IMPORTANT: YOU NEED A CORA REPORT BEFORE building the outline.** The Cora report provides:
- Heading count targets (H2, H3 counts) that shape the outline structure
- Entity lists that inform heading names (pack entity terms into H2/H3 headings)
- Word count targets that determine section depth
- Structure targets (entities per heading level, variations per heading level) that guide how keyword-rich headings should be

If the user has not yet provided the Cora XLSX, **ask for it before proceeding with the outline.** Research can happen without Cora, but the outline should not be built without it.

Save the outline to `./working/outline.md`.
### Step 4 — HUMAN REVIEW (STOP AND WAIT)

**Present the outline to the user and ask:**

> "Here's the outline based on the research. Review it and let me know:
> 1. Any sections to add, remove, or reorder?
> 2. Are the unique angles worth pursuing?
> 3. Any specific points or data you want included?
> 4. Anything else before I draft?"

**Do NOT proceed until the user responds.** This is a critical gate. Incorporate all feedback before moving on.
### Step 5 — Write First Draft

Write the full content based on the approved outline:

- Follow the structure exactly as approved
- Consult `references/brand_guidelines.md` for voice and tone guidance
- Write in clear, scannable paragraphs (max 4 sentences per paragraph)
- Use subheadings every 2-4 paragraphs
- Include lists, examples, and concrete details where appropriate
- Aim for the word count the user specified

**Fan-Out Query (FOQ) Section:**
After the main content, write a separate FOQ section using the fan-out queries from the outline. This section is **excluded** from word count and heading count targets — it lives outside the core article.

- Each FOQ is an H3 heading phrased as a question
- Answer in 2-3 sentences max, self-contained
- **Restate the question in the answer** — this is the format LLMs and featured snippets prefer for citation: "How does X work? X works by..."
- The user may style these as accordions, FAQ schema, or hidden divs
- Mark the section clearly (e.g. `<!-- FOQ SECTION START -->`) so it's easy to separate from the main content
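For concreteness, here is a minimal sketch of the FOQ section shape in `draft.md`, reusing the expansion joints topic from Phase 3. The questions and answers are hypothetical placeholders, not required copy:

```markdown
<!-- FOQ SECTION START -->

### What Do Expansion Joints Do in Piping Systems?

Expansion joints absorb thermal movement and reduce stress in piping systems. They protect connected equipment from cyclic loads.

### How Often Should Expansion Joints Be Inspected?

Expansion joints should be inspected on the schedule the manufacturer recommends, since service conditions vary widely.

<!-- FOQ SECTION END -->
```

Note how each answer restates its question in the first sentence, which is the citation-friendly format described above.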
Save the draft to `./working/draft.md`.

Tell the user: "First draft is ready. If you have a Cora report for this keyword, provide the XLSX path and I'll optimize against it. Otherwise, let me know what changes you'd like."

---
## Phase 2: Cora Optimization

This phase begins when the user provides a Cora XLSX report. The draft may come from Phase 1, or the user may provide an existing draft to optimize.
### Step 6 — Load Cora Report

Parse the Cora XLSX and display a summary of targets:

```bash
cd {skill_dir}/scripts && uv run --with openpyxl python cora_parser.py "{cora_xlsx_path}" --sheet summary
```

Show the user:
- Search term and keyword variations
- Entity count and deficit count
- LSI keyword count and deficit count
- Word count target (cluster target, not raw average)
- Density targets (variation, entity, LSI)
- Key optimization rules that will be applied
### Step 7 — Entity Optimization

Run the entity optimizer against the draft:

```bash
cd {skill_dir}/scripts && uv run --with openpyxl python entity_optimizer.py "{draft_path}" "{cora_xlsx_path}" --top-n 30
```

Review the output and apply the top recommendations:
- Focus on entities with high relevance AND high remaining deficit
- Add entities naturally — they must fit the context of the section
- Prioritize adding entities to H2 and H3 headings first (these are primary optimization targets)
- Do NOT force entities where they don't make sense — readability always wins
- H1: exactly 1, always. Do not add a second H1.
- H5, H6: ignore completely
- H4: only add if most competitors have them
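To make the heading rule concrete, here is a hypothetical before/after for an H2, reusing the expansion joints example from Phase 3. The entity terms are illustrative, not from a real Cora report:

```markdown
Before: ## Applications
After:  ## Metal and Rubber Expansion Joint Applications in Water Treatment
```

The rewritten heading packs several entity terms while still reading like a heading a subject matter expert would write.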
After applying entity changes, save the updated draft.
### Step 8 — LSI Keyword Optimization

Run the LSI optimizer:

```bash
cd {skill_dir}/scripts && uv run --with openpyxl python lsi_optimizer.py "{draft_path}" "{cora_xlsx_path}" --min-correlation 0.2 --top-n 50
```

Apply LSI keyword recommendations:
- Focus on keywords with the strongest correlation (highest absolute value = most ranking impact)
- Many LSI keywords are common phrases that may already appear naturally
- Add missing keywords in body text, not just headings
- Some LSI keywords overlap with entities — count these once, benefit twice

After applying LSI changes, save the updated draft.
### Step 9 — Structure & Density Check

Check the overall structure against Cora targets:

```bash
cd {skill_dir}/scripts && uv run --with openpyxl python cora_parser.py "{cora_xlsx_path}" --sheet structure --format json
cd {skill_dir}/scripts && uv run --with openpyxl python cora_parser.py "{cora_xlsx_path}" --sheet densities --format json
```

Verify and adjust:
- **Heading counts:** Compare H1, H2, H3, H4 counts against Page 1 Average targets. Add or consolidate headings as needed.
- **Entities per heading level:** Check that each heading level has enough entity mentions vs. the Structure sheet targets.
- **Variations in headings:** Ensure keyword variations appear in H2/H3 headings at target levels.
- **Density targets:** Check variation density, entity density, and LSI density against the Strategic Overview percentages.
- **Word count:** Compare against the cluster target (NOT the raw average). If below target, identify which sections could be expanded.

**Important density note:** Adding content to meet one target changes the denominator for ALL density calculations. After significant word count changes, re-check densities. Usually 1-2 optimization passes are sufficient.
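A quick worked example of the moving denominator, with made-up numbers: a 1,000-word draft with 30 entity mentions sits at 3.0% entity density. Adding a 200-word section that contributes only 2 new mentions gives 32 / 1,200 ≈ 2.7%, so the density dropped even though the mention count went up. That is why every density gets re-checked after any sizable expansion.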
### Step 10 — Keyword Density Check (Optional)

If a quick keyword density check is useful:

```bash
cd {skill_dir}/scripts && uv run --with openpyxl python seo_optimizer.py "{draft_path}" --cora-xlsx "{cora_xlsx_path}"
```

Key rules:
- Exact match keyword density: 2% minimum, no upper limit
- Variations capture exact match — hitting variation density targets covers exact match
- Do NOT flag keyword stuffing. There is no practical upper limit that hurts rankings.
### Step 11 — Meta Title, Meta Description, and URL Slug

Generate meta tags and add them as an HTML comment block at the top of the draft file.

**Meta title format:** Pack keyword variations into a pipe-separated title tag. Google reads far more than the ~60 characters it displays — a long title tag with variations gives the page more surface area across related searches. You can go up to 500 characters, but you do not have to.

Format: `Exact Search Term | Variation 1 | Variation 2 | ... | Company Name`

Use the keyword variations from the Cora report. Only include variations that have a `page1_avg` > 0 (competitors actually use them). Put the highest-value variations first.

**Meta description:** Write a keyword-rich summary (~350-500 characters) that hits the primary keyword, key variations, materials, sizes, and company name. This is not just a copy of the intro paragraph — it should be independently optimized.

**URL slug:** Short and keyword-focused. Example: `/custom-spun-hemispheres`

Add to the top of the draft file:

```html
<!--
META TITLE: Exact Search Term | Variation 1 | Variation 2 | Company Name
META DESCRIPTION: Keyword-rich summary here.
URL SLUG: /url-slug-here
-->
```
### Step 12 — Image & Diagram Placement

Read through the draft and identify where visuals would enhance the content. For each recommendation, specify:
- **Location:** After which heading or paragraph
- **Type:** Photo, diagram, chart, infographic, screenshot, illustration
- **Description:** What the visual should show
- **Rationale:** Why it adds value at that point (breaks up text, illustrates a process, makes data tangible, etc.)

Common placement triggers:
- Sections describing a process or workflow (diagram)
- Sections with comparative data (chart or table)
- Long text-only stretches (break up with a relevant image)
- Technical concepts that benefit from visual explanation (diagram)
- Before/after scenarios (side-by-side images)
### Step 13 — HUMAN REVIEW (STOP AND WAIT)

**Present the final draft, optimization summary, and image suggestions to the user:**

> "Here's the optimized draft. Summary of changes:
> - [X] entities added across [Y] sections
> - [X] LSI keywords incorporated
> - Word count: [current] (target: [target])
> - Variation density: [current]% (target: [target]%)
> - Entity density: [current]% (target: [target]%)
> - [X] image/diagram placements suggested
>
> Review the draft. What needs adjusting?"

**Do NOT finalize until the user approves.**
### Step 14 — HTML Export

After the user approves the draft, convert the markdown to plain HTML for WordPress. Save as `./working/draft.html` (or `draft_normal.html`, `draft_storybrand.html` if multiple versions exist).

Rules:
- **Plain HTML only** — no classes, no divs, no wrappers. Just `<h2>`, `<h3>`, `<p>`, `<ul>/<li>`, and `<strong>` tags.
- **Omit the H1** — WordPress sets the page title separately. Do not include an `<h1>` tag in the HTML.
- **Keep the meta comment block** at the top (META TITLE, META DESCRIPTION, URL SLUG).
- **Keep the FOQ comment markers** (`<!-- FOQ SECTION START -->` / `<!-- FOQ SECTION END -->`) so the user can identify that section for special styling.
- The user pastes this HTML into WordPress Gutenberg's Code Editor view, where it maps directly to blocks.
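A minimal sketch of the expected export shape (illustrative content, not a required template):

```html
<!--
META TITLE: Exact Search Term | Variation 1 | Company Name
META DESCRIPTION: Keyword-rich summary here.
URL SLUG: /url-slug-here
-->

<h2>First Section Heading</h2>
<p>Body copy in plain paragraphs, with <strong>bold</strong> where needed.</p>
<ul>
  <li>Plain list items, no classes or wrappers</li>
</ul>

<!-- FOQ SECTION START -->
<h3>Question-Style Heading?</h3>
<p>Self-contained two-sentence answer.</p>
<!-- FOQ SECTION END -->
```

Note there is no `<h1>` and no wrapper `div`: WordPress supplies the page title, and Gutenberg maps each bare tag to a block.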
---
## Phase 3: Quick Test Block

A standalone workflow for testing whether adding entities, keywords, and headings moves rankings before investing in full content optimization. The output is a minimal text block placed in a hidden div on the page for A/B testing.

**Key principle:** The LLM handles all intelligence — filtering entities for topical relevance, writing headings, creating body templates. Python scripts handle all math — slot filling, density tracking, stop conditions, validation. There are NO per-entity mention targets — only aggregate density percentages and distinct entity counts.

### When to Use

User says "test block," "hidden div," "quick test," "test the entities," or similar. This is NOT part of Phase 2 — it is an independent workflow. Requirements: a Cora report and existing content (URL or file).
### Step T1 — Load Inputs

- Pull existing content via the BS4 scraper if a URL is provided, or read from a file if a path is given.
- Save existing content to `{cwd}/working/existing_content.md` if fetched from a URL.

```bash
cd {skill_dir}/scripts && uv run --with requests,beautifulsoup4 python competitor_scraper.py "{url}" --output-dir {cwd}/working/
```

Then rename the output file to `{cwd}/working/existing_content.md`.
### Step T2 — Run Prep Script (Programmatic)

Run `test_block_prep.py` to extract all deficit data:

```bash
cd {skill_dir}/scripts && uv run --with openpyxl python test_block_prep.py "{content_path}" "{cora_xlsx_path}" --format json -o {cwd}/working/prep_data.json
```

This outputs structured JSON with:
- Word count vs target + deficit
- Distinct entity count vs target + deficit + list of missing 0-count entities
- Variation density % vs target (Cora row 46)
- Entity density % vs target (Cora row 47)
- LSI density % vs target (Cora row 48)
- Heading structure deficits (H2, H3 counts; entities/variations in headings)
- **Template instructions:** how many templates to generate, how many slots per template, target word count

Review the prep output. All numbers come from deterministic script analysis — no estimation.
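For orientation, here is an abbreviated, illustrative sketch of the `prep_data.json` shape. The values are invented, only the keys referenced in Step T4 are shown, and the real script may emit more fields:

```json
{
  "word_count": {"current": 820, "target": 1400, "deficit": 580},
  "missing_entities": ["bellows", "flange", "gasket"],
  "headings": {
    "h2": {"deficit": 2},
    "h3": {"deficit": 1},
    "entities_in_headings": {"deficit": 4}
  },
  "template_instructions": {
    "num_templates": 10,
    "slots_per_sentence": 2,
    "avg_words_per_template": 15
  }
}
```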
### Step T3 — Filter Entities for Topical Relevance (LLM Step)

Read the `missing_entities` list from `{cwd}/working/prep_data.json`. This list contains ALL entities with 0 mentions on the existing page, sorted by Cora relevance score. **Many of these will be noise** — navigation terms, competitor names, unrelated concepts that happen to appear on ranking pages.

Review every entity and keep ONLY those that are topically relevant to the page's subject matter. Ask: "Would a subject matter expert writing about [page topic] naturally mention this term?"

**Remove:**
- Competitor company names and brands
- People (athletes, historical figures, etc.)
- Web furniture (blog, menu, privacy, FAQ, social media platforms)
- Geographic entities unrelated to the topic
- Software, media, organisms, and other off-topic typed entities
- Generic terms that only appear due to page chrome (calculator, glossary, children, etc.)

**Keep:**
- Terms directly related to the product/service/topic
- Materials, processes, components, and industry terms
- Related applications and industries where the product is used
- Technical specifications and engineering concepts

Save the filtered entity names to `{cwd}/working/filtered_entities.txt`, one entity per line, ordered from most to least relevant.
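For example, a filtered list for the expansion joints page used in Step T4 might begin like this (illustrative entities only):

```
bellows
expansion joints
piping systems
water treatment
gasket
flange
```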
### Step T4 — Generate Headings and Body Templates (LLM Creative Step)

This step has two parts. Read the prep JSON for the numbers you need:
- `headings.h2.deficit`: how many H2 headings to generate
- `headings.h3.deficit`: how many H3 headings to generate
- `headings.entities_in_headings.deficit`: how many entity mentions are needed across all headings
- `template_instructions.num_templates`: how many body templates to create
- `template_instructions.slots_per_sentence`: how many `{N}` slots per body template
- `template_instructions.avg_words_per_template`: target words per template (~15)

**Part 1 — Write headings:**

Using the filtered entity list from T3 and your understanding of the page topic, write topically relevant H2 and H3 headings. These are final text — NOT templates, no `{N}` slots. The headings should:
- Read like real section headings a subject matter expert would write
- Naturally incorporate entities from the filtered list (aim to hit the entities_in_headings deficit)
- Be relevant to the page's topic and the types of content that would appear under them

**Part 2 — Write body templates:**

Generate body sentence templates with numbered placeholder slots. Follow the numbers from `template_instructions`:
- Create `num_templates` templates
- Give each template `slots_per_sentence` numbered slots: `{1}`, `{2}`, `{3}`, etc. Slots MUST be numbered — the generator regex matches `{1}`, `{2}`, NOT `{N}`.
- Templates must be topically relevant to the page's subject matter
- Templates should be grammatically coherent, but brevity wins over polish
- Do NOT try to specify which entities go in which slot — the generator script handles that

Save everything to `{cwd}/working/templates.txt`, one per line. Headings are prefixed with `H2:` or `H3:`; body templates are plain text with `{N}` slots.

Example for an expansion joints page:

```
H2: Bellows Expansion Joints for Industrial Piping Systems
H2: Metal and Rubber Expansion Joint Applications in Water Treatment
H3: Gasket and Flange Connections for Expansion Joints
{1} and {2} are critical components used to absorb thermal movement and reduce stress in piping systems.
{1} provide reliable performance in demanding {2} environments where thermal cycling is constant.
```
### Step T5 — Run Generator Script (Programmatic)

Run `test_block_generator.py` to fill body template slots and assemble the test block. The script requires the LLM-curated entity list from T3:

```bash
cd {skill_dir}/scripts && uv run --with openpyxl python test_block_generator.py {cwd}/working/templates.txt {cwd}/working/prep_data.json "{cora_xlsx_path}" --entities-file {cwd}/working/filtered_entities.txt --output-dir {cwd}/working/ --min-sentences 5
```

The script:
1. Loads the LLM-curated entity list — uses ONLY these entities for slot filling (no script-level filtering)
2. Builds a term queue: filtered entities first, then keyword variations
3. Inserts pre-written headings as-is (no slot filling on heading lines)
4. Fills body template slots, rotating through the term queue (no duplicates within a sentence)
5. Tracks projected densities: (baseline_mentions + new_mentions) / (baseline_words + new_words)
6. Stops when: all density targets met, distinct entity deficit closed, word count deficit closed, AND minimum sentence count reached
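For intuition, here is a minimal Python sketch of the queue-rotation idea from steps 2 and 4. It is not the actual `test_block_generator.py`, which also tracks projected densities and stop conditions:

```python
import itertools
import re

# Minimal sketch of slot filling: rotate through the curated term queue
# and never reuse a term within one sentence. NOT the real script.
def fill_template(template: str, queue) -> str:
    used = set()

    def next_term(_match):
        term = next(queue)
        while term in used:  # skip duplicates within this sentence (assumes enough distinct terms)
            term = next(queue)
        used.add(term)
        return term

    # replace each numbered slot {1}, {2}, ... with the next unused term
    return re.sub(r"\{\d+\}", next_term, template)

terms = ["bellows", "gasket", "flange"]  # e.g. the first lines of filtered_entities.txt
queue = itertools.cycle(terms)
print(fill_template("{1} and {2} are critical components in piping systems.", queue))
# bellows and gasket are critical components in piping systems.
```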
Output files:
- `{cwd}/working/test_block.md` — Markdown version
- `{cwd}/working/test_block.html` — Plain HTML version
- `{cwd}/working/test_block_stats.json` — Generation stats (mentions added, entities introduced, projected densities)
### Step T6 — Rewrite Body Sentences for Readability (LLM Step — use Haiku)

The generator produces grammatically rough sentences because entities get slotted into positions where they don't naturally fit. This step rewrites each body sentence to read naturally while preserving entity strings exactly.

**Use Haiku for this step** — it's fast and cheap enough to handle sentence-by-sentence rewrites.

Read `{cwd}/working/test_block.md`. For each body sentence (NOT headings — leave all H2/H3 lines exactly as they are):

1. Identify which entity terms from `{cwd}/working/filtered_entities.txt` appear in the sentence
2. Rewrite the sentence so it is grammatically correct and reads naturally
3. **Preserve every entity string exactly** — same spelling, same case. Do not paraphrase, hyphenate, abbreviate, or pluralize entity terms. "stainless steel" must remain "stainless steel", not "stainless-steel" or "SS".
4. Keep the sentence under 20 words
5. Keep the rewrite topically relevant to the page subject
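A hypothetical before/after pair shows the intent: the entity strings survive untouched while the grammar is repaired.

```
Before: bellows provide reliable performance in demanding gasket environments where thermal cycling is constant.
After:  bellows paired with a gasket perform reliably where thermal cycling is constant.
```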
Reassemble the test block with:
- Same `<!-- HIDDEN TEST BLOCK START -->` / `<!-- HIDDEN TEST BLOCK END -->` markers
- Same headings in the same positions
- Rewritten body sentences grouped into paragraphs (4 sentences per paragraph)

Overwrite both files:
- `{cwd}/working/test_block.md` (markdown format)
- `{cwd}/working/test_block.html` (HTML format with `<h2>`, `<p>` tags)
### Step T7 — Run Validation Script (Programmatic)

Run `test_block_validate.py` for a deterministic before/after comparison:

```bash
cd {skill_dir}/scripts && uv run --with openpyxl python test_block_validate.py "{content_path}" {cwd}/working/test_block.md "{cora_xlsx_path}" --format json -o {cwd}/working/validation_report.json
```

This produces a report showing every metric before and after, with targets and status:
- Word count, distinct entities, entity density %, variation density %, LSI density %
- Heading counts (H2, H3), entities/variations in headings
- List of all new 0->1 entities introduced
- All numbers are from the same counting code — no mixing of data sources

Present the validation report to the user. Flag any metric that dropped below target after the test block was added.

---
## Optimization Rules

These override any data from the Cora report:

| Rule | Detail |
|------|--------|
| H1 count | Exactly 1, always |
| H2, H3 | Primary optimization targets — focus entity/variation additions here |
| H4 | Low priority — only add if most competitors have them |
| H5, H6 | Ignore completely |
| Word count | Target the nearest competitive cluster, not the raw average. Up to ~1,500 words is always acceptable even if the target is lower. |
| Exact match density | 2% minimum, no upper limit |
| Keyword stuffing | Do NOT flag or warn about keyword stuffing |
| Variations include exact match | Optimizing variation density inherently covers exact match |
| Density is interdependent | Adding content changes ALL density calculations — re-check after big changes |
| Optimization passes | 1-2 passes is typically sufficient |
| Competitor names | NEVER use competitor company names as entities or LSI keywords. Do not mention competitors by name in content. |
| Measurement entities | Ignore measurements (dimensions, tolerances, etc.) as entities — skip these in entity optimization |
| Organization entities | Organizations like ISO, ANSI, ASTM are fine — keep these as entities |
| Entity correlation filter | Only entities with Best of Both <= -0.19 are included. Best of Both is the lower of Spearman's or Pearson's correlation to ranking position (1=top, 100=bottom), so more negative = stronger ranking signal. This filter is applied in `cora_parser.py` and affects all downstream consumers. To disable, set `entity_correlation_threshold` to `None` in `OPTIMIZATION_RULES`. Added 2026-03-20 — revert if entity coverage feels too thin. |
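For clarity, here is a minimal Python sketch of what the entity correlation filter computes. The function name and signature are illustrative; the real logic lives in `cora_parser.py` and may differ:

```python
# Sketch only: mirrors the "Entity correlation filter" rule above, not the real code.
def passes_correlation_filter(
    spearman: float,
    pearson: float,
    threshold: float | None = -0.19,
) -> bool:
    """Keep an entity only if its Best of Both correlation is at or below the threshold.

    Correlations are against ranking position (1 = top, 100 = bottom), so a more
    negative value means the entity tracks strong rankings more tightly.
    """
    if threshold is None:  # filter disabled via OPTIMIZATION_RULES
        return True
    best_of_both = min(spearman, pearson)  # the lower (more negative) of the two
    return best_of_both <= threshold

assert passes_correlation_filter(-0.30, -0.10)      # best of both = -0.30, kept
assert not passes_correlation_filter(-0.05, 0.12)   # best of both = -0.05, dropped
```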
---
## Scripts Reference

All scripts are in `{skill_dir}/scripts/`. Run them with `uv run --with openpyxl python` (or `--with requests,beautifulsoup4` for the scraper).

### cora_parser.py

Foundation module. Reads a Cora XLSX and extracts structured data.

```
uv run --with openpyxl python cora_parser.py <xlsx_path> [--sheet SHEET] [--format json|text]
```

Sheets: `summary`, `entities`, `lsi`, `variations`, `structure`, `densities`, `targets`, `wordcount`, `results`, `tunings`, `all`

### entity_optimizer.py

Counts entities in a draft against Cora targets, recommends additions sorted by (relevance x deficit).

```
uv run --with openpyxl python entity_optimizer.py <draft_path> <cora_xlsx_path> [--format json|text] [--top-n 30]
```

### lsi_optimizer.py

Counts LSI keywords in a draft against Cora targets, recommends additions sorted by (|correlation| x deficit).

```
uv run --with openpyxl python lsi_optimizer.py <draft_path> <cora_xlsx_path> [--format json|text] [--min-correlation 0.2] [--top-n 50]
```

### seo_optimizer.py

Keyword density, structure, and readability checks. Optional Cora integration.

```
uv run --with openpyxl python seo_optimizer.py <draft_path> [--keyword <kw>] [--cora-xlsx <path>] [--format json|text]
```

### competitor_scraper.py

Utility for bulk-fetching URLs when the user provides a list.

```
uv run --with requests,beautifulsoup4 python competitor_scraper.py <url1> <url2> ... [--output-dir ./working/competitor_content/]
```

### test_block_prep.py

Extracts all deficit data from existing content + Cora XLSX. Outputs structured JSON with word count, entity/variation/LSI density deficits, heading deficits, missing entities list, and calculated template instructions (num_templates, slots_per_sentence).

```
uv run --with openpyxl python test_block_prep.py <content_path> <cora_xlsx_path> [--format json|text] [-o PATH]
```

### test_block_generator.py

Fills body template slots with entities from an LLM-curated entity list. Inserts pre-written headings as-is (no slot filling). Tracks aggregate densities in real time and stops when all targets are met. Outputs test_block.md, test_block.html, and test_block_stats.json.

```
uv run --with openpyxl python test_block_generator.py <templates_path> <prep_json_path> <cora_xlsx_path> --entities-file <path> [--output-dir DIR] [--min-sentences N]
```

### test_block_validate.py

Deterministic before/after comparison. Runs the same counting logic on existing content alone vs existing content + test block. Shows every metric with before, after, target, and status.

```
uv run --with openpyxl python test_block_validate.py <content_path> <test_block_path> <cora_xlsx_path> [--format json|text] [-o PATH]
```
---

## Reference Files

- `references/content_frameworks.md` — Article templates (how-to, listicle, comparison, case study, thought leadership), persuasion frameworks (AIDA, PAS), introduction and conclusion patterns.
- `references/brand_guidelines.md` — Voice archetypes, writing principles, tone spectrums, language preferences, pre-publication checklist.
---

## Working Directory

**CRITICAL: All output files MUST be written to `{cwd}/working/` — the `working/` subfolder inside the user's current project directory (where Claude Code was launched). NEVER write files to the skill directory, scripts directory, or any location outside the project folder. When running scripts, always use absolute paths for output flags (`-o`, `--output-dir`) pointing to `{cwd}/working/`.**

All intermediate files go in `{cwd}/working/` (the user's project directory):
- `working/research_summary.md` — Research output from Step 2
- `working/outline.md` — Outline from Step 3
- `working/draft.md` — Content draft (updated in place during optimization)
- `working/competitor_content/` — Scraped competitor text files (if URLs were fetched)
- `working/existing_content.md` — BS4-scraped existing page content (Phase 3)
- `working/prep_data.json` — Deficit analysis output from test_block_prep.py (Phase 3)
- `working/filtered_entities.txt` — LLM-curated entity list, one per line (Phase 3, Step T3)
- `working/templates.txt` — Pre-written headings + body templates with numbered slots (Phase 3, Step T4)
- `working/test_block.md` — Quick test block in markdown (Phase 3)
- `working/test_block.html` — Quick test block in plain HTML (Phase 3)
- `working/test_block_stats.json` — Generation stats: mentions added, entities introduced, projected densities (Phase 3)
- `working/validation_report.json` — Before/after comparison from test_block_validate.py (Phase 3)
@ -93,8 +93,6 @@ uv add --group test <package>
 | `identity/SOUL.md` | Agent personality |
 | `identity/USER.md` | User profile |
 | `skills/` | Markdown skill files with YAML frontmatter |
-| `scripts/create_clickup_task.py` | CLI script to create ClickUp tasks |
-| `docs/clickup-task-creation.md` | Task creation conventions, per-type fields, and defaults |
 
 ## Conventions
@ -176,11 +174,9 @@ skill_map:
   "Press Release":
     tool: "write_press_releases"
     auto_execute: true
-    required_fields: [topic, company_name, target_url]
     field_mapping:
-      topic: "PR Topic" # ClickUp custom field for PR topic/keyword
+      topic: "task_name" # uses ClickUp task name
       company_name: "Customer" # looks up "Customer" custom field
-      target_url: "IMSURL" # target money-site URL (required)
 ```
 
 Task lifecycle: `to do` → discovered → approved/awaiting_approval → executing → completed/failed (+ attachments uploaded)
@ -1,7 +1,6 @@
|
||||||
"""Entry point: python -m cheddahbot"""
|
"""Entry point: python -m cheddahbot"""
|
||||||
|
|
||||||
import logging
|
import logging
|
||||||
from logging.handlers import RotatingFileHandler
|
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
from .agent import Agent
|
from .agent import Agent
|
||||||
|
|
@ -16,22 +15,6 @@ logging.basicConfig(
     format="%(asctime)s [%(name)s] %(levelname)s: %(message)s",
     datefmt="%H:%M:%S",
 )
 
-# All levels to rotating log file (DEBUG+)
-_log_dir = Path(__file__).resolve().parent.parent / "logs"
-_log_dir.mkdir(exist_ok=True)
-_file_handler = RotatingFileHandler(
-    _log_dir / "cheddahbot.log", maxBytes=5 * 1024 * 1024, backupCount=5
-)
-_file_handler.setLevel(logging.DEBUG)
-_file_handler.setFormatter(
-    logging.Formatter("%(asctime)s [%(name)s] %(levelname)s: %(message)s")
-)
-logging.getLogger().addHandler(_file_handler)
-
-logging.getLogger("httpx").setLevel(logging.WARNING)
-logging.getLogger("httpcore").setLevel(logging.WARNING)
-
 log = logging.getLogger("cheddahbot")
@ -138,41 +121,6 @@ def main():
     except Exception as e:
         log.warning("Notification bus not available: %s", e)
 
-    # ntfy.sh push notifications
-    if notification_bus and config.ntfy.enabled:
-        try:
-            import os
-
-            from .ntfy import NtfyChannel, NtfyNotifier
-
-            ntfy_channels = []
-            for ch_cfg in config.ntfy.channels:
-                topic = os.getenv(ch_cfg.topic_env_var, "")
-                if topic:
-                    ntfy_channels.append(
-                        NtfyChannel(
-                            name=ch_cfg.name,
-                            server=ch_cfg.server,
-                            topic=topic,
-                            categories=ch_cfg.categories,
-                            include_patterns=ch_cfg.include_patterns,
-                            exclude_patterns=ch_cfg.exclude_patterns,
-                            priority=ch_cfg.priority,
-                            tags=ch_cfg.tags,
-                        )
-                    )
-                else:
-                    log.warning(
-                        "ntfy channel '%s' skipped — env var %s not set",
-                        ch_cfg.name, ch_cfg.topic_env_var,
-                    )
-            notifier = NtfyNotifier(ntfy_channels)
-            if notifier.enabled:
-                notification_bus.subscribe("ntfy", notifier.notify)
-                log.info("ntfy notifier subscribed to notification bus")
-        except Exception as e:
-            log.warning("ntfy notifier not available: %s", e)
-
     # Scheduler (uses default agent)
     scheduler = None
     try:
@ -181,14 +129,22 @@ def main():
         log.info("Starting scheduler...")
         scheduler = Scheduler(config, db, default_agent, notification_bus=notification_bus)
         scheduler.start()
-        # Inject scheduler into tool context so get_active_tasks can read it
-        if tools:
-            tools.scheduler = scheduler
     except Exception as e:
         log.warning("Scheduler not available: %s", e)
 
+    log.info("Launching Gradio UI on %s:%s...", config.host, config.port)
+    blocks = create_ui(
+        registry, config, default_llm, notification_bus=notification_bus, scheduler=scheduler
+    )
+
+    # Build a parent FastAPI app so we can mount the dashboard alongside Gradio.
+    # Inserting routes into blocks.app before launch() doesn't work because
+    # launch()/mount_gradio_app() replaces the internal App instance.
+    import gradio as gr
     import uvicorn
     from fastapi import FastAPI
+    from fastapi.responses import RedirectResponse
+    from starlette.staticfiles import StaticFiles
 
     fastapi_app = FastAPI()
@ -199,33 +155,24 @@ def main():
     fastapi_app.include_router(api_router)
     log.info("API router mounted at /api/")
 
-    # Mount new HTMX web UI (chat at /, dashboard at /dashboard)
-    from .web import mount_web_app
-
-    mount_web_app(
-        fastapi_app,
-        registry,
-        config,
-        default_llm,
-        notification_bus=notification_bus,
-        scheduler=scheduler,
-        db=db,
-    )
-
-    # Mount Gradio at /old for transition period
-    try:
-        import gradio as gr
-
-        log.info("Mounting Gradio UI at /old...")
-        blocks = create_ui(
-            registry, config, default_llm, notification_bus=notification_bus, scheduler=scheduler
-        )
-        gr.mount_gradio_app(fastapi_app, blocks, path="/old", pwa=False, show_error=True)
-        log.info("Gradio UI available at /old")
-    except Exception as e:
-        log.warning("Gradio UI not available: %s", e)
-
-    log.info("Launching web UI on %s:%s...", config.host, config.port)
+    # Mount the dashboard as static files (must come before Gradio's catch-all)
+    dashboard_dir = Path(__file__).resolve().parent.parent / "dashboard"
+    if dashboard_dir.is_dir():
+        # Redirect /dashboard (no trailing slash) → /dashboard/
+        @fastapi_app.get("/dashboard")
+        async def _dashboard_redirect():
+            return RedirectResponse(url="/dashboard/")
+
+        fastapi_app.mount(
+            "/dashboard",
+            StaticFiles(directory=str(dashboard_dir), html=True),
+            name="dashboard",
+        )
+        log.info("Dashboard mounted at /dashboard/ (serving %s)", dashboard_dir)
+
+    # Mount Gradio at the root
+    gr.mount_gradio_app(fastapi_app, blocks, path="/", pwa=True, show_error=True)
+
     uvicorn.run(fastapi_app, host=config.host, port=config.port)
@ -369,7 +369,6 @@ class Agent:
         system_context: str = "",
         tools: str = "",
         model: str = "",
-        skip_permissions: bool = False,
     ) -> str:
         """Execute a task using the execution brain (Claude Code CLI).
@ -379,25 +378,19 @@ class Agent:
         Args:
             tools: Override Claude Code tool list (e.g. "Bash,Read,WebSearch").
             model: Override the CLI model (e.g. "claude-sonnet-4.5").
-            skip_permissions: If True, run CLI with --dangerously-skip-permissions.
         """
         log.info("Execution brain task: %s", prompt[:100])
-        kwargs: dict = {
-            "system_prompt": system_context,
-            "timeout": self.config.timeouts.execution_brain,
-        }
+        kwargs: dict = {"system_prompt": system_context}
         if tools:
             kwargs["tools"] = tools
         if model:
             kwargs["model"] = model
-        if skip_permissions:
-            kwargs["skip_permissions"] = True
         result = self.llm.execute(prompt, **kwargs)
 
         # Log to daily memory
         if self._memory:
             try:
-                self._memory.log_daily(f"[Execution] {prompt[:200]}\n-> {result[:500]}")
+                self._memory.log_daily(f"[Execution] {prompt[:200]}\n→ {result[:500]}")
             except Exception as e:
                 log.warning("Failed to log execution to memory: %s", e)
@ -125,7 +125,7 @@ async def get_tasks_by_company():
     data = await get_tasks()
     by_company: dict[str, list] = {}
     for task in data.get("tasks", []):
-        company = task["custom_fields"].get("Client") or "Unassigned"
+        company = task["custom_fields"].get("Customer") or "Unassigned"
         by_company.setdefault(company, []).append(task)
 
     # Sort companies by task count descending
@ -238,7 +238,7 @@ async def get_link_building_tasks():
             in_progress_not_started.append(t)
     by_company: dict[str, list] = {}
     for task in active_lb:
-        company = task["custom_fields"].get("Client") or "Unassigned"
+        company = task["custom_fields"].get("Customer") or "Unassigned"
         by_company.setdefault(company, []).append(task)
 
     result = {
@ -320,7 +320,7 @@ async def get_need_cora_tasks():
         if kw_lower not in by_keyword:
             by_keyword[kw_lower] = {
                 "keyword": kw,
-                "company": t["custom_fields"].get("Client") or "Unassigned",
+                "company": t["custom_fields"].get("Customer") or "Unassigned",
                 "due_date": t.get("due_date"),
                 "tasks": [],
             }
@ -367,7 +367,7 @@ async def get_press_release_tasks():
 
     by_company: dict[str, list] = {}
     for task in pr_tasks:
-        company = task["custom_fields"].get("Client") or "Unassigned"
+        company = task["custom_fields"].get("Customer") or "Unassigned"
         by_company.setdefault(company, []).append(task)
 
     return {
@ -562,15 +562,6 @@ async def force_loop_run():
     return {"status": "ok", "message": "Force pulse sent to heartbeat and poll loops"}
 
 
-@router.post("/system/briefing/force")
-async def force_briefing():
-    """Force the morning briefing to send now (won't block tomorrow's)."""
-    if not _scheduler:
-        return {"status": "error", "message": "Scheduler not available"}
-    _scheduler.force_briefing()
-    return {"status": "ok", "message": "Briefing force-triggered"}
-
-
 @router.post("/cache/clear")
 async def clear_cache():
     """Clear the ClickUp data cache."""
@ -31,7 +31,6 @@ class ClickUpTask:
     list_name: str = ""
     tags: list[str] = field(default_factory=list)
     date_done: str = ""
-    date_updated: str = ""
 
     @classmethod
     def from_api(cls, data: dict, task_type_field_name: str = "Task Type") -> ClickUpTask:
@ -68,9 +67,6 @@ class ClickUpTask:
         raw_done = data.get("date_done") or data.get("date_closed")
         date_done = str(raw_done) if raw_done else ""
 
-        raw_updated = data.get("date_updated")
-        date_updated = str(raw_updated) if raw_updated else ""
-
         return cls(
             id=data["id"],
             name=data.get("name", ""),
@ -84,7 +80,6 @@ class ClickUpTask:
             list_name=data.get("list", {}).get("name", ""),
             tags=tags,
             date_done=date_done,
-            date_updated=date_updated,
         )
@ -419,176 +414,6 @@ class ClickUpClient:
         log.info("Created custom field '%s' (%s) on list %s", name, field_type, list_id)
         return result
 
-    def get_task(self, task_id: str) -> ClickUpTask:
-        """Fetch a single task by ID."""
-        resp = self._client.get(f"/task/{task_id}")
-        resp.raise_for_status()
-        return ClickUpTask.from_api(resp.json(), self._task_type_field_name)
-
-    def set_custom_field_by_name(
-        self, task_id: str, field_name: str, value: Any
-    ) -> bool:
-        """Set a custom field by its human-readable name.
-
-        Looks up the field ID from the task's list, then sets the value.
-        Falls back gracefully if the field doesn't exist.
-        """
-        try:
-            task_data = self._client.get(f"/task/{task_id}").json()
-            list_id = task_data.get("list", {}).get("id", "")
-            if not list_id:
-                log.warning("Could not determine list_id for task %s", task_id)
-                return False
-
-            fields = self.get_custom_fields(list_id)
-            field_id = None
-            for f in fields:
-                if f.get("name") == field_name:
-                    field_id = f["id"]
-                    break
-
-            if not field_id:
-                log.warning("Field '%s' not found in list %s", field_name, list_id)
-                return False
-
-            return self.set_custom_field_value(task_id, field_id, value)
-        except Exception as e:
-            log.error("Failed to set field '%s' on task %s: %s", field_name, task_id, e)
-            return False
-
-    def set_custom_field_smart(
-        self, task_id: str, list_id: str, field_name: str, value: str
-    ) -> bool:
-        """Set a custom field by name, auto-resolving dropdown option UUIDs.
-
-        For dropdown fields, *value* is matched against option names
-        (case-insensitive). For all other field types, *value* is passed through.
-        """
-        try:
-            fields = self.get_custom_fields(list_id)
-            target = None
-            for f in fields:
-                if f.get("name") == field_name:
-                    target = f
-                    break
-
-            if not target:
-                log.warning("Field '%s' not found in list %s", field_name, list_id)
-                return False
-
-            field_id = target["id"]
-            resolved = value
-
-            if target.get("type") == "drop_down":
-                options = target.get("type_config", {}).get("options", [])
-                for opt in options:
-                    if opt.get("name", "").lower() == value.lower():
-                        resolved = opt["id"]
-                        break
-                else:
-                    log.warning(
-                        "Dropdown option '%s' not found for field '%s'",
-                        value,
-                        field_name,
-                    )
-                    return False
-
-            return self.set_custom_field_value(task_id, field_id, resolved)
-        except Exception as e:
-            log.error(
-                "Failed to set field '%s' on task %s: %s", field_name, task_id, e
-            )
-            return False
-
-    def get_custom_field_by_name(self, task_id: str, field_name: str) -> Any:
-        """Read a custom field value from a task by field name.
-
-        Fetches the task and looks up the field value from custom_fields.
-        Returns None if not found.
-        """
-        try:
-            task = self.get_task(task_id)
-            return task.custom_fields.get(field_name)
-        except Exception as e:
-            log.warning("Failed to read field '%s' from task %s: %s", field_name, task_id, e)
-            return None
-
-    def create_task(
-        self,
-        list_id: str,
-        name: str,
-        description: str = "",
-        status: str = "to do",
-        due_date: int | None = None,
-        tags: list[str] | None = None,
-        custom_fields: list[dict] | None = None,
-        priority: int | None = None,
-        assignees: list[int] | None = None,
-        time_estimate: int | None = None,
-    ) -> dict:
-        """Create a new task in a ClickUp list.
-
-        Args:
-            list_id: The list to create the task in.
-            name: Task name.
-            description: Task description (markdown supported).
-            status: Initial status (default "to do").
-            due_date: Due date as Unix timestamp in milliseconds.
-            tags: List of tag names to apply.
-            custom_fields: List of custom field dicts ({"id": ..., "value": ...}).
-            priority: 1=Urgent, 2=High, 3=Normal, 4=Low.
-            assignees: List of ClickUp user IDs.
-            time_estimate: Time estimate in milliseconds.
-
-        Returns:
-            API response dict containing task id, url, etc.
-        """
-        payload: dict[str, Any] = {"name": name, "status": status}
-        if description:
-            payload["description"] = description
-        if due_date is not None:
-            payload["due_date"] = due_date
-        if tags:
-            payload["tags"] = tags
-        if custom_fields:
-            payload["custom_fields"] = custom_fields
-        if priority is not None:
-            payload["priority"] = priority
-        if assignees:
-            payload["assignees"] = assignees
-        if time_estimate is not None:
-            payload["time_estimate"] = time_estimate
-
-        def _call():
-            resp = self._client.post(f"/list/{list_id}/task", json=payload)
-            resp.raise_for_status()
-            return resp.json()
-
-        result = self._retry(_call)
-        log.info("Created task '%s' in list %s (id: %s)", name, list_id, result.get("id"))
-        return result
-
-    def find_list_in_folder(
-        self, space_id: str, folder_name: str, list_name: str = "Overall"
-    ) -> str | None:
-        """Find a list within a named folder in a space.
-
-        Args:
-            space_id: ClickUp space ID.
-            folder_name: Folder name to match (case-insensitive).
-            list_name: List name within the folder (default "Overall").
-
-        Returns:
-            The list_id if found, or None.
-        """
-        folders = self.get_folders(space_id)
-        for folder in folders:
-            if folder["name"].lower() == folder_name.lower():
-                for lst in folder["lists"]:
-                    if lst["name"].lower() == list_name.lower():
-                        return lst["id"]
-        return None
-
     def discover_field_filter(self, list_id: str, field_name: str) -> dict[str, Any] | None:
         """Discover a custom field's UUID and dropdown option map.
@ -43,13 +43,11 @@ class ClickUpConfig:
     poll_interval_minutes: int = 20
     poll_statuses: list[str] = field(default_factory=lambda: ["to do"])
     review_status: str = "internal review"
-    pr_review_status: str = "pr needs review"
     in_progress_status: str = "in progress"
     automation_status: str = "automation underway"
     error_status: str = "error"
     task_type_field_name: str = "Work Category"
     default_auto_execute: bool = False
-    poll_task_types: list[str] = field(default_factory=list)
     skill_map: dict = field(default_factory=dict)
     enabled: bool = False
@ -89,7 +87,6 @@ class AutoCoraConfig:
     cora_categories: list[str] = field(
         default_factory=lambda: ["Content Creation", "On Page Optimization", "Link Building"]
    )
-    cora_human_inbox: str = ""  # e.g. "Z:/Cora-For-Human"
 
 
 @dataclass
|
@ -98,39 +95,6 @@ class ApiBudgetConfig:
|
||||||
alert_threshold: float = 0.8 # alert at 80% of limit
|
alert_threshold: float = 0.8 # alert at 80% of limit
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
|
||||||
class TimeoutConfig:
|
|
||||||
execution_brain: int = 2700 # 45 minutes
|
|
||||||
blm: int = 1800 # 30 minutes
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
|
||||||
class ContentConfig:
|
|
||||||
cora_inbox: str = "" # e.g. "Z:/content-cora-inbox"
|
|
||||||
outline_dir: str = "" # e.g. "Z:/content-outlines"
|
|
||||||
company_capabilities_default: str = (
|
|
||||||
"All certifications and licenses need to be verified on the company's website."
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
|
||||||
class NtfyChannelConfig:
|
|
||||||
name: str = ""
|
|
||||||
topic_env_var: str = "" # env var name holding the topic string
|
|
||||||
server: str = "https://ntfy.sh"
|
|
||||||
categories: list[str] = field(default_factory=list)
|
|
||||||
include_patterns: list[str] = field(default_factory=list)
|
|
||||||
exclude_patterns: list[str] = field(default_factory=list)
|
|
||||||
priority: str = "high" # min / low / default / high / urgent
|
|
||||||
tags: str = "" # comma-separated emoji shortcodes
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
|
||||||
class NtfyConfig:
|
|
||||||
enabled: bool = False
|
|
||||||
channels: list[NtfyChannelConfig] = field(default_factory=list)
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
@dataclass
|
||||||
class AgentConfig:
|
class AgentConfig:
|
||||||
"""Per-agent configuration for multi-agent support."""
|
"""Per-agent configuration for multi-agent support."""
|
||||||
|
|
@@ -162,9 +126,6 @@ class Config:
     link_building: LinkBuildingConfig = field(default_factory=LinkBuildingConfig)
     autocora: AutoCoraConfig = field(default_factory=AutoCoraConfig)
     api_budget: ApiBudgetConfig = field(default_factory=ApiBudgetConfig)
-    content: ContentConfig = field(default_factory=ContentConfig)
-    timeouts: TimeoutConfig = field(default_factory=TimeoutConfig)
-    ntfy: NtfyConfig = field(default_factory=NtfyConfig)
     agents: list[AgentConfig] = field(default_factory=lambda: [AgentConfig()])
 
     # Derived paths
@@ -224,28 +185,6 @@ def load_config() -> Config:
         for k, v in data["api_budget"].items():
             if hasattr(cfg.api_budget, k):
                 setattr(cfg.api_budget, k, v)
-    if "content" in data and isinstance(data["content"], dict):
-        for k, v in data["content"].items():
-            if hasattr(cfg.content, k):
-                setattr(cfg.content, k, v)
-    if "timeouts" in data and isinstance(data["timeouts"], dict):
-        for k, v in data["timeouts"].items():
-            if hasattr(cfg.timeouts, k):
-                setattr(cfg.timeouts, k, int(v))
-
-    # ntfy push notifications
-    if "ntfy" in data and isinstance(data["ntfy"], dict):
-        ntfy_data = data["ntfy"]
-        cfg.ntfy.enabled = ntfy_data.get("enabled", False)
-        if "channels" in ntfy_data and isinstance(ntfy_data["channels"], list):
-            cfg.ntfy.channels = []
-            for ch_data in ntfy_data["channels"]:
-                if isinstance(ch_data, dict):
-                    ch = NtfyChannelConfig()
-                    for k, v in ch_data.items():
-                        if hasattr(ch, k):
-                            setattr(ch, k, v)
-                    cfg.ntfy.channels.append(ch)
 
     # Multi-agent configs
     if "agents" in data and isinstance(data["agents"], list):
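The removed branches copied nested dicts from the parsed config file onto the matching dataclasses. A sketch of the input shape they appear to expect; the key names come from the code above, while the concrete values are illustrative:

```python
# Assumed parsed-config shape for the removed content/timeouts/ntfy handling.
# Keys mirror the dataclass fields; values here are placeholders.
data = {
    "content": {"cora_inbox": "Z:/content-cora-inbox"},
    "timeouts": {"execution_brain": 2700, "blm": 1800},
    "ntfy": {
        "enabled": True,
        "channels": [
            {"name": "ops", "topic_env_var": "NTFY_OPS_TOPIC", "categories": ["error"]},
        ],
    },
}
```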
@@ -299,12 +238,6 @@ def load_config() -> Config:
     if blm_dir := os.getenv("BLM_DIR"):
         cfg.link_building.blm_dir = blm_dir
 
-    # Timeout env var overrides (seconds)
-    if t := os.getenv("CHEDDAH_TIMEOUT_EXECUTION_BRAIN"):
-        cfg.timeouts.execution_brain = int(t)
-    if t := os.getenv("CHEDDAH_TIMEOUT_BLM"):
-        cfg.timeouts.blm = int(t)
-
     # Ensure data directories exist
     cfg.data_dir.mkdir(parents=True, exist_ok=True)
     (cfg.data_dir / "uploads").mkdir(exist_ok=True)
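Before this removal, two environment variables could override the timeout defaults at load time. A quick sketch of the intended effect, assuming the master-side load_config; the value is illustrative:

```python
import os

# Illustrative: with the removed override block in place, setting this
# before load_config() shrinks the execution-brain timeout to 15 minutes.
os.environ["CHEDDAH_TIMEOUT_EXECUTION_BRAIN"] = "900"
cfg = load_config()
assert cfg.timeouts.execution_brain == 900
```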
@@ -156,8 +156,6 @@ class LLMAdapter:
         working_dir: str | None = None,
         tools: str = "Bash,Read,Edit,Write,Glob,Grep",
         model: str | None = None,
-        skip_permissions: bool = False,
-        timeout: int = 2700,
     ) -> str:
         """Execution brain: calls Claude Code CLI with full tool access.
 
@@ -167,9 +165,6 @@ class LLMAdapter:
         Args:
             tools: Comma-separated Claude Code tool names (default: standard set).
             model: Override the CLI model (e.g. "claude-sonnet-4.5").
-            skip_permissions: If True, append --dangerously-skip-permissions to
-                the CLI invocation (used for automated pipelines).
-            timeout: Max seconds to wait for CLI completion (default: 2700 / 45 min).
         """
         claude_bin = shutil.which("claude")
         if not claude_bin:
@@ -193,8 +188,6 @@ class LLMAdapter:
             cmd.extend(["--model", model])
         if system_prompt:
             cmd.extend(["--system-prompt", system_prompt])
-        if skip_permissions:
-            cmd.append("--dangerously-skip-permissions")
 
         log.debug("Execution brain cmd: %s", " ".join(cmd[:6]) + "...")
 
@@ -220,11 +213,10 @@ class LLMAdapter:
         )
 
         try:
-            stdout, stderr = proc.communicate(input=prompt, timeout=timeout)
+            stdout, stderr = proc.communicate(input=prompt, timeout=300)
         except subprocess.TimeoutExpired:
             proc.kill()
-            minutes = timeout // 60
-            return f"Error: Claude Code execution timed out after {minutes} minutes."
+            return "Error: Claude Code execution timed out after 5 minutes."
 
         if proc.returncode != 0:
             return f"Execution error: {stderr or 'unknown error'}"
@@ -363,14 +355,9 @@ class LLMAdapter:
 
             except Exception as e:
                 if not has_yielded and attempt < max_retries and _is_retryable_error(e):
-                    wait = 2**attempt
-                    log.warning(
-                        "Retryable LLM error (attempt %d/%d), retrying in %ds: %s",
-                        attempt + 1,
-                        max_retries + 1,
-                        wait,
-                        e,
-                    )
+                    wait = 2 ** attempt
+                    log.warning("Retryable LLM error (attempt %d/%d), retrying in %ds: %s",
+                        attempt + 1, max_retries + 1, wait, e)
                     time.sleep(wait)
                     continue
                 yield {"type": "text", "content": _friendly_error(e, self.provider)}
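The retry loop above sleeps 2**attempt seconds between retryable failures. A standalone sketch of that schedule; the retry limit shown is illustrative, since the real value lives elsewhere in the adapter:

```python
# Exponential backoff schedule used by the retry loop:
# attempt 0 -> 1s, 1 -> 2s, 2 -> 4s, 3 -> 8s.
max_retries = 3  # illustrative placeholder
for attempt in range(max_retries + 1):
    wait = 2 ** attempt
    print(f"attempt {attempt + 1}/{max_retries + 1}: wait {wait}s on failure")
```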
@@ -1,175 +0,0 @@
"""ntfy.sh push notification sender.

Subscribes to the NotificationBus and routes notifications to ntfy.sh
topics based on category and message-pattern matching.
"""

from __future__ import annotations

import hashlib
import logging
import re
import threading
from dataclasses import dataclass, field
from datetime import date

import httpx

log = logging.getLogger(__name__)


@dataclass
class NtfyChannel:
    """One ntfy topic with routing rules."""

    name: str
    server: str
    topic: str
    categories: list[str]
    include_patterns: list[str] = field(default_factory=list)
    exclude_patterns: list[str] = field(default_factory=list)
    priority: str = "high"
    tags: str = ""

    def accepts(self, message: str, category: str) -> bool:
        """Return True if this channel should receive the notification."""
        if category not in self.categories:
            return False
        if self.exclude_patterns:
            for pat in self.exclude_patterns:
                if re.search(pat, message, re.IGNORECASE):
                    return False
        if self.include_patterns:
            return any(
                re.search(pat, message, re.IGNORECASE)
                for pat in self.include_patterns
            )
        return True  # no include_patterns = accept all matching categories


class NtfyNotifier:
    """Posts notifications to ntfy.sh topics."""

    def __init__(
        self,
        channels: list[NtfyChannel],
        *,
        daily_cap: int = 200,
    ):
        self._channels = [ch for ch in channels if ch.topic]
        self._daily_cap = daily_cap
        self._lock = threading.Lock()
        # dedup: set of hash(channel.name + message) — persists for process lifetime
        self._sent: set[str] = set()
        # daily cap tracking
        self._daily_count = 0
        self._daily_date = ""
        # 429 backoff: date string when rate-limited
        self._rate_limited_until = ""
        if self._channels:
            log.info(
                "ntfy notifier initialized with %d channel(s): %s",
                len(self._channels),
                ", ".join(ch.name for ch in self._channels),
            )

    @property
    def enabled(self) -> bool:
        return bool(self._channels)

    def _today(self) -> str:
        return date.today().isoformat()

    def _check_and_track(self, channel_name: str, message: str) -> bool:
        """Return True if this message should be sent. Updates internal state."""
        today = self._today()

        with self._lock:
            # 429 backoff: skip all sends for rest of day
            if self._rate_limited_until == today:
                return False

            # Reset daily counter on date rollover (but keep dedup memory)
            if self._daily_date != today:
                self._daily_date = today
                self._daily_count = 0
                self._rate_limited_until = ""

            # Daily cap check
            if self._daily_count >= self._daily_cap:
                return False

            # Dedup check — once sent, never send the same message again
            # (until process restart)
            key = hashlib.md5(
                (channel_name + "\0" + message).encode()
            ).hexdigest()
            if key in self._sent:
                log.info(
                    "ntfy dedup: suppressed duplicate to '%s'", channel_name,
                )
                return False

            # All checks passed — record send
            self._sent.add(key)
            self._daily_count += 1

            if self._daily_count == self._daily_cap:
                log.warning(
                    "ntfy daily cap reached (%d). No more sends today.",
                    self._daily_cap,
                )

            return True

    def _mark_rate_limited(self) -> None:
        """Flag that we got a 429 — suppress all sends for rest of day."""
        with self._lock:
            self._rate_limited_until = self._today()
        log.warning("ntfy 429 received. Suppressing all sends for rest of day.")

    def notify(self, message: str, category: str) -> None:
        """Route a notification to matching ntfy channels.

        This is the callback signature expected by NotificationBus.subscribe().
        Each matching channel posts in a daemon thread so the notification
        pipeline is never blocked.
        """
        for channel in self._channels:
            if channel.accepts(message, category):
                if not self._check_and_track(channel.name, message):
                    continue
                t = threading.Thread(
                    target=self._post,
                    args=(channel, message, category),
                    daemon=True,
                )
                t.start()

    def _post(self, channel: NtfyChannel, message: str, category: str) -> None:
        """Send a notification to an ntfy topic. Fire-and-forget."""
        url = f"{channel.server.rstrip('/')}/{channel.topic}"
        headers: dict[str, str] = {
            "Title": f"CheddahBot [{category}]",
            "Priority": channel.priority,
        }
        if channel.tags:
            headers["Tags"] = channel.tags
        try:
            resp = httpx.post(
                url,
                content=message.encode("utf-8"),
                headers=headers,
                timeout=10.0,
            )
            if resp.status_code == 429:
                self._mark_rate_limited()
            elif resp.status_code >= 400:
                log.warning(
                    "ntfy '%s' returned %d: %s",
                    channel.name, resp.status_code, resp.text[:200],
                )
            else:
                log.debug("ntfy notification sent to '%s'", channel.name)
        except httpx.HTTPError as e:
            log.warning("ntfy '%s' failed: %s", channel.name, e)
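A minimal usage sketch for the removed notifier module, assuming the topic string has already been resolved from its environment variable; the channel values are placeholders:

```python
# Placeholder channel config; the topic normally comes from topic_env_var.
channel = NtfyChannel(
    name="ops",
    server="https://ntfy.sh",
    topic="example-topic",  # placeholder value
    categories=["error", "clickup"],
    exclude_patterns=[r"heartbeat"],  # drop noisy messages
)
notifier = NtfyNotifier([channel], daily_cap=50)
notifier.notify("BLM run failed: exit 1", category="error")  # posts in a daemon thread
```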
File diff suppressed because it is too large
@@ -1,619 +0,0 @@
/* CheddahBot Dark Theme */

:root {
    --bg-primary: #0d1117;
    --bg-surface: #161b22;
    --bg-surface-hover: #1c2129;
    --bg-input: #0d1117;
    --text-primary: #e6edf3;
    --text-secondary: #8b949e;
    --text-muted: #484f58;
    --accent: #2dd4bf;
    --accent-dim: #134e4a;
    --border: #30363d;
    --success: #3fb950;
    --error: #f85149;
    --warning: #d29922;
    --font-sans: -apple-system, BlinkMacSystemFont, 'Segoe UI', Helvetica, Arial, sans-serif;
    --font-mono: 'JetBrains Mono', 'Fira Code', 'Cascadia Code', monospace;
    --radius: 8px;
    --sidebar-width: 280px;
}

* { margin: 0; padding: 0; box-sizing: border-box; }

html, body {
    height: 100%;
    font-family: var(--font-sans);
    font-size: 15px;
    line-height: 1.5;
    color: var(--text-primary);
    background: var(--bg-primary);
    overflow: hidden;
}

/* Top Navigation */
.top-nav {
    display: flex;
    align-items: center;
    gap: 24px;
    padding: 0 20px;
    height: 48px;
    background: var(--bg-surface);
    border-bottom: 1px solid var(--border);
    flex-shrink: 0;
}

.nav-brand {
    font-weight: 700;
    font-size: 1.1em;
    color: var(--accent);
}

.nav-links { display: flex; gap: 4px; }

.nav-link {
    color: var(--text-secondary);
    text-decoration: none;
    padding: 6px 14px;
    border-radius: var(--radius);
    font-size: 0.9em;
    transition: background 0.15s, color 0.15s;
}
.nav-link:hover { background: var(--bg-surface-hover); color: var(--text-primary); }
.nav-link.active { color: var(--accent); background: var(--accent-dim); }

/* Main content area */
.main-content {
    height: calc(100vh - 48px);
    overflow: hidden;
}

/* ─── Chat Layout ─── */
.chat-layout {
    display: flex;
    height: 100%;
}

/* Sidebar */
.chat-sidebar {
    width: var(--sidebar-width);
    min-width: var(--sidebar-width);
    background: var(--bg-surface);
    border-right: 1px solid var(--border);
    display: flex;
    flex-direction: column;
    padding: 12px;
    gap: 8px;
    overflow-y: auto;
    flex-shrink: 0;
}

.sidebar-header {
    display: flex;
    justify-content: space-between;
    align-items: center;
}

.sidebar-header h3 { font-size: 0.85em; color: var(--text-secondary); text-transform: uppercase; letter-spacing: 0.05em; }

.sidebar-toggle {
    display: none;
    background: none;
    border: none;
    color: var(--text-secondary);
    font-size: 1.2em;
    cursor: pointer;
}

.sidebar-open-btn {
    display: none;
    position: fixed;
    top: 56px;
    left: 8px;
    z-index: 20;
    background: var(--bg-surface);
    border: 1px solid var(--border);
    color: var(--text-primary);
    padding: 6px 10px;
    border-radius: var(--radius);
    cursor: pointer;
    font-size: 1.2em;
}

.sidebar-divider {
    height: 1px;
    background: var(--border);
    margin: 4px 0;
}

.agent-selector { display: flex; flex-direction: column; gap: 4px; }

.agent-btn {
    padding: 8px 12px;
    background: transparent;
    border: 1px solid var(--border);
    border-radius: var(--radius);
    color: var(--text-primary);
    cursor: pointer;
    text-align: left;
    font-size: 0.9em;
    transition: border-color 0.15s, background 0.15s;
}
.agent-btn:hover { background: var(--bg-surface-hover); }
.agent-btn.active { border-color: var(--accent); background: var(--accent-dim); }

.btn-new-chat {
    width: 100%;
    padding: 8px;
    background: var(--accent-dim);
    border: 1px solid var(--accent);
    border-radius: var(--radius);
    color: var(--accent);
    cursor: pointer;
    font-size: 0.9em;
    transition: background 0.15s;
}
.btn-new-chat:hover { background: var(--accent); color: var(--bg-primary); }

.chat-sidebar h3 {
    font-size: 0.8em;
    color: var(--text-secondary);
    text-transform: uppercase;
    letter-spacing: 0.05em;
    margin-top: 8px;
}

.conv-btn {
    display: block;
    width: 100%;
    padding: 8px 10px;
    background: transparent;
    border: 1px solid transparent;
    border-radius: var(--radius);
    color: var(--text-primary);
    cursor: pointer;
    text-align: left;
    font-size: 0.85em;
    white-space: nowrap;
    overflow: hidden;
    text-overflow: ellipsis;
    transition: background 0.15s;
}
.conv-btn:hover { background: var(--bg-surface-hover); }
.conv-btn.active { border-color: var(--accent); background: var(--accent-dim); }

/* Chat main area */
.chat-main {
    flex: 1;
    display: flex;
    flex-direction: column;
    min-width: 0;
}

/* Status bar */
.status-bar {
    display: flex;
    gap: 16px;
    padding: 8px 20px;
    font-size: 0.8em;
    color: var(--text-secondary);
    border-bottom: 1px solid var(--border);
    background: var(--bg-surface);
    flex-shrink: 0;
}
.status-item strong { color: var(--text-primary); }
.text-ok { color: var(--success) !important; }
.text-err { color: var(--error) !important; }

/* Notification banner */
.notification-banner {
    margin: 8px 20px 0;
    padding: 10px 16px;
    background: var(--bg-surface);
    border: 1px solid var(--accent-dim);
    border-radius: var(--radius);
    font-size: 0.9em;
    color: var(--accent);
}

/* Messages area */
.chat-messages {
    flex: 1;
    overflow-y: auto;
    padding: 16px 20px;
    display: flex;
    flex-direction: column;
    gap: 12px;
}

.message {
    display: flex;
    gap: 10px;
    max-width: 85%;
    animation: fadeIn 0.2s ease-out;
}

@keyframes fadeIn {
    from { opacity: 0; transform: translateY(4px); }
    to { opacity: 1; transform: translateY(0); }
}

.message.user { align-self: flex-end; flex-direction: row-reverse; }
.message.assistant { align-self: flex-start; }

.message-avatar {
    width: 32px;
    height: 32px;
    border-radius: 50%;
    display: flex;
    align-items: center;
    justify-content: center;
    font-size: 0.7em;
    font-weight: 700;
    flex-shrink: 0;
}
.message.user .message-avatar { background: var(--accent-dim); color: var(--accent); }
.message.assistant .message-avatar { background: #1c2129; color: var(--text-secondary); }

.message-body {
    background: var(--bg-surface);
    border: 1px solid var(--border);
    border-radius: var(--radius);
    padding: 10px 14px;
    min-width: 0;
}
.message.user .message-body { background: var(--accent-dim); border-color: var(--accent); }

.message-content {
    word-wrap: break-word;
    overflow-wrap: break-word;
}

/* Markdown rendering in messages */
.message-content p { margin: 0.4em 0; }
.message-content p:first-child { margin-top: 0; }
.message-content p:last-child { margin-bottom: 0; }
.message-content pre {
    background: var(--bg-primary);
    border: 1px solid var(--border);
    border-radius: 4px;
    padding: 10px;
    overflow-x: auto;
    font-family: var(--font-mono);
    font-size: 0.9em;
    margin: 0.5em 0;
}
.message-content code {
    font-family: var(--font-mono);
    font-size: 0.9em;
    background: var(--bg-primary);
    padding: 2px 5px;
    border-radius: 3px;
}
.message-content pre code { background: none; padding: 0; }
.message-content ul, .message-content ol { margin: 0.4em 0; padding-left: 1.5em; }
.message-content a { color: var(--accent); }
.message-content blockquote {
    border-left: 3px solid var(--accent);
    padding-left: 12px;
    color: var(--text-secondary);
    margin: 0.5em 0;
}
.message-content table { border-collapse: collapse; margin: 0.5em 0; }
.message-content th, .message-content td {
    border: 1px solid var(--border);
    padding: 6px 10px;
    text-align: left;
}
.message-content th { background: var(--bg-surface-hover); }

/* Chat input area */
.chat-input-area {
    padding: 12px 20px;
    border-top: 1px solid var(--border);
    background: var(--bg-surface);
    flex-shrink: 0;
}

.input-row {
    display: flex;
    align-items: flex-end;
    gap: 8px;
}

#chat-input {
    flex: 1;
    background: var(--bg-input);
    border: 1px solid var(--border);
    border-radius: var(--radius);
    padding: 10px 14px;
    color: var(--text-primary);
    font-family: var(--font-sans);
    font-size: 15px;
    resize: none;
    max-height: 200px;
    line-height: 1.4;
}
#chat-input:focus { outline: none; border-color: var(--accent); }
#chat-input::placeholder { color: var(--text-muted); }

.file-upload-btn {
    padding: 8px 10px;
    cursor: pointer;
    font-size: 1.2em;
    color: var(--text-secondary);
    transition: color 0.15s;
    flex-shrink: 0;
}
.file-upload-btn:hover { color: var(--accent); }

.send-btn {
    padding: 8px 14px;
    background: var(--accent);
    border: none;
    border-radius: var(--radius);
    color: var(--bg-primary);
    font-size: 1.1em;
    cursor: pointer;
    flex-shrink: 0;
    transition: opacity 0.15s;
}
.send-btn:hover { opacity: 0.85; }

.file-preview {
    margin-top: 6px;
    font-size: 0.85em;
    color: var(--text-secondary);
}
.file-preview .file-tag {
    display: inline-block;
    background: var(--bg-primary);
    border: 1px solid var(--border);
    border-radius: 4px;
    padding: 2px 8px;
    margin-right: 6px;
}

/* ─── Dashboard Layout ─── */
.dashboard-layout {
    display: flex;
    flex-direction: column;
    gap: 16px;
    padding: 16px 20px;
    height: 100%;
    overflow-y: auto;
}

.panel {
    background: var(--bg-surface);
    border: 1px solid var(--border);
    border-radius: var(--radius);
    padding: 16px;
}

.panel-title {
    font-size: 1.1em;
    font-weight: 600;
    margin-bottom: 12px;
    color: var(--accent);
}

.panel-section {
    margin-bottom: 16px;
}

.panel-section h3 {
    font-size: 0.85em;
    color: var(--text-secondary);
    text-transform: uppercase;
    letter-spacing: 0.05em;
    margin-bottom: 8px;
}

/* Loop health grid */
.loop-grid {
    display: flex;
    flex-wrap: wrap;
    gap: 8px;
}

.loop-badge {
    display: flex;
    flex-direction: column;
    align-items: center;
    padding: 8px 12px;
    border-radius: var(--radius);
    font-size: 0.8em;
    min-width: 90px;
    border: 1px solid var(--border);
}
.loop-name { font-weight: 600; }
.loop-ago { color: var(--text-secondary); font-size: 0.85em; }

.badge-ok { border-color: var(--success); background: rgba(63, 185, 80, 0.1); }
.badge-ok .loop-name { color: var(--success); }
.badge-warn { border-color: var(--warning); background: rgba(210, 153, 34, 0.1); }
.badge-warn .loop-name { color: var(--warning); }
.badge-err { border-color: var(--error); background: rgba(248, 81, 73, 0.1); }
.badge-err .loop-name { color: var(--error); }
.badge-muted { border-color: var(--text-muted); }
.badge-muted .loop-name { color: var(--text-muted); }

/* Active executions */
.exec-list { display: flex; flex-direction: column; gap: 6px; }
.exec-item {
    display: flex;
    gap: 12px;
    padding: 6px 10px;
    background: var(--bg-primary);
    border-radius: 4px;
    font-size: 0.85em;
}
.exec-name { flex: 1; font-weight: 500; }
.exec-tool { color: var(--text-secondary); }
.exec-dur { color: var(--accent); font-family: var(--font-mono); }

/* Action buttons */
.action-buttons { display: flex; gap: 8px; flex-wrap: wrap; }

.btn {
    padding: 8px 16px;
    background: var(--bg-surface-hover);
    border: 1px solid var(--border);
    border-radius: var(--radius);
    color: var(--text-primary);
    cursor: pointer;
    font-size: 0.9em;
    transition: border-color 0.15s, background 0.15s;
}
.btn:hover { border-color: var(--accent); }

.btn-sm { padding: 6px 12px; font-size: 0.8em; }

/* Notification feed */
.notif-feed { display: flex; flex-direction: column; gap: 4px; max-height: 300px; overflow-y: auto; }
.notif-item {
    padding: 6px 10px;
    font-size: 0.85em;
    border-left: 3px solid var(--border);
    background: var(--bg-primary);
    border-radius: 0 4px 4px 0;
}
.notif-clickup { border-left-color: var(--accent); }
.notif-info { border-left-color: var(--text-secondary); }
.notif-error { border-left-color: var(--error); }
.notif-cat {
    font-weight: 600;
    font-size: 0.8em;
    text-transform: uppercase;
    color: var(--text-secondary);
}

/* Task table */
.task-table { width: 100%; border-collapse: collapse; font-size: 0.85em; }
.task-table th, .task-table td { padding: 8px 12px; border-bottom: 1px solid var(--border); text-align: left; }
.task-table th { color: var(--text-secondary); font-weight: 600; text-transform: uppercase; font-size: 0.85em; }
.task-table a { color: var(--accent); text-decoration: none; }
.task-table a:hover { text-decoration: underline; }

.status-badge {
    display: inline-block;
    padding: 2px 8px;
    border-radius: 4px;
    font-size: 0.85em;
    font-weight: 500;
}
.status-to-do { background: rgba(139, 148, 158, 0.2); color: var(--text-secondary); }
.status-in-progress, .status-automation-underway { background: rgba(45, 212, 191, 0.15); color: var(--accent); }
.status-error { background: rgba(248, 81, 73, 0.15); color: var(--error); }
.status-complete, .status-closed { background: rgba(63, 185, 80, 0.15); color: var(--success); }
.status-internal-review, .status-outline-review { background: rgba(210, 153, 34, 0.15); color: var(--warning); }

/* Pipeline groups */
.pipeline-group { margin-bottom: 16px; }
.pipeline-group h4 {
    font-size: 0.9em;
    margin-bottom: 8px;
    padding-bottom: 4px;
    border-bottom: 1px solid var(--border);
}
.pipeline-stats {
    display: flex;
    gap: 12px;
    margin-bottom: 12px;
    flex-wrap: wrap;
}
.pipeline-stat {
    padding: 8px 14px;
    background: var(--bg-primary);
    border: 1px solid var(--border);
    border-radius: var(--radius);
    text-align: center;
}
.pipeline-stat .stat-count { font-size: 1.5em; font-weight: 700; color: var(--accent); }
.pipeline-stat .stat-label { font-size: 0.75em; color: var(--text-secondary); }

/* Flash messages */
.flash-msg {
    position: fixed;
    bottom: 20px;
    right: 20px;
    background: var(--accent);
    color: var(--bg-primary);
    padding: 10px 20px;
    border-radius: var(--radius);
    font-weight: 600;
    font-size: 0.9em;
    z-index: 100;
    animation: fadeIn 0.2s ease-out, fadeOut 0.5s 2.5s ease-out forwards;
}
@keyframes fadeOut { to { opacity: 0; transform: translateY(10px); } }

/* Utility */
.text-muted { color: var(--text-muted); }

/* Typing indicator */
.typing-indicator span {
    display: inline-block;
    width: 6px;
    height: 6px;
    background: var(--text-secondary);
    border-radius: 50%;
    margin: 0 2px;
    animation: bounce 1.2s infinite;
}
.typing-indicator span:nth-child(2) { animation-delay: 0.2s; }
.typing-indicator span:nth-child(3) { animation-delay: 0.4s; }
@keyframes bounce {
    0%, 60%, 100% { transform: translateY(0); }
    30% { transform: translateY(-6px); }
}

/* ─── Mobile ─── */
@media (max-width: 768px) {
    .chat-sidebar {
        position: fixed;
        top: 48px;
        left: 0;
        bottom: 0;
        z-index: 30;
        transform: translateX(-100%);
        transition: transform 0.2s ease;
        width: 280px;
    }
    .chat-sidebar.open { transform: translateX(0); }
    .sidebar-toggle { display: block; }
    .sidebar-open-btn { display: block; }

    .status-bar { flex-wrap: wrap; gap: 8px; padding: 6px 12px; font-size: 0.75em; }

    .chat-messages { padding: 12px; }
    .message { max-width: 95%; }
    .chat-input-area { padding: 8px 12px; }

    #chat-input { font-size: 16px; } /* Prevent iOS zoom */

    .dashboard-layout { padding: 12px; }
    .loop-grid { gap: 6px; }
    .loop-badge { min-width: 70px; padding: 6px 8px; font-size: 0.75em; }
}

/* Overlay for mobile sidebar */
.sidebar-overlay {
    display: none;
    position: fixed;
    top: 48px;
    left: 0;
    right: 0;
    bottom: 0;
    background: rgba(0, 0, 0, 0.5);
    z-index: 25;
}
.sidebar-overlay.visible { display: block; }

/* Scrollbar styling */
::-webkit-scrollbar { width: 6px; }
::-webkit-scrollbar-track { background: transparent; }
::-webkit-scrollbar-thumb { background: var(--border); border-radius: 3px; }
::-webkit-scrollbar-thumb:hover { background: var(--text-muted); }
@@ -1,284 +0,0 @@
/* CheddahBot Frontend JS */

// ── Session Management ──
const SESSION_KEY = 'cheddahbot_session';

function getSession() {
    try { return JSON.parse(localStorage.getItem(SESSION_KEY) || '{}'); }
    catch { return {}; }
}

function saveSession(data) {
    const s = getSession();
    Object.assign(s, data);
    localStorage.setItem(SESSION_KEY, JSON.stringify(s));
}

function getActiveAgent() {
    return getSession().agent_name || document.getElementById('input-agent-name')?.value || 'default';
}

// ── Agent Switching ──
function switchAgent(name) {
    // Update UI
    document.querySelectorAll('.agent-btn').forEach(b => {
        b.classList.toggle('active', b.dataset.agent === name);
    });
    document.getElementById('input-agent-name').value = name;
    document.getElementById('input-conv-id').value = '';
    saveSession({ agent_name: name, conv_id: null });

    // Clear chat and load new sidebar
    document.getElementById('chat-messages').innerHTML = '';
    refreshSidebar();
}

function setActiveAgent(name) {
    document.querySelectorAll('.agent-btn').forEach(b => {
        b.classList.toggle('active', b.dataset.agent === name);
    });
    const agentInput = document.getElementById('input-agent-name');
    if (agentInput) agentInput.value = name;
}

// ── Sidebar ──
function refreshSidebar() {
    const agent = getActiveAgent();
    htmx.ajax('GET', '/chat/conversations?agent_name=' + agent, {
        target: '#sidebar-conversations',
        swap: 'innerHTML'
    });
}

// ── Conversation Loading ──
function loadConversation(convId) {
    const agent = getActiveAgent();
    document.getElementById('input-conv-id').value = convId;
    saveSession({ conv_id: convId });

    htmx.ajax('GET', '/chat/load/' + convId + '?agent_name=' + agent, {
        target: '#chat-messages',
        swap: 'innerHTML'
    }).then(() => {
        scrollChat();
        renderAllMarkdown();
    });
}

// ── Chat Input ──
function handleKeydown(e) {
    if (e.key === 'Enter' && !e.shiftKey) {
        e.preventDefault();
        document.getElementById('chat-form').requestSubmit();
    }
}

function autoResize(el) {
    el.style.height = 'auto';
    el.style.height = Math.min(el.scrollHeight, 200) + 'px';
}

function afterSend(event) {
    const input = document.getElementById('chat-input');
    input.value = '';
    input.style.height = 'auto';

    // Clear file input and preview
    const fileInput = document.querySelector('input[type="file"]');
    if (fileInput) fileInput.value = '';
    const preview = document.getElementById('file-preview');
    if (preview) { preview.style.display = 'none'; preview.innerHTML = ''; }

    scrollChat();
}

function scrollChat() {
    const el = document.getElementById('chat-messages');
    if (el) {
        requestAnimationFrame(() => {
            el.scrollTop = el.scrollHeight;
        });
    }
}

// ── File Upload Preview ──
function showFileNames(input) {
    const preview = document.getElementById('file-preview');
    if (!input.files.length) {
        preview.style.display = 'none';
        return;
    }
    let html = '';
    for (const f of input.files) {
        html += '<span class="file-tag">' + f.name + '</span>';
    }
    preview.innerHTML = html;
    preview.style.display = 'block';
}

// Drag and drop
document.addEventListener('DOMContentLoaded', () => {
    const chatMain = document.querySelector('.chat-main');
    if (!chatMain) return;

    chatMain.addEventListener('dragover', e => {
        e.preventDefault();
        chatMain.style.outline = '2px dashed var(--accent)';
    });
    chatMain.addEventListener('dragleave', () => {
        chatMain.style.outline = '';
    });
    chatMain.addEventListener('drop', e => {
        e.preventDefault();
        chatMain.style.outline = '';
        const fileInput = document.querySelector('input[type="file"]');
        if (fileInput && e.dataTransfer.files.length) {
            fileInput.files = e.dataTransfer.files;
            showFileNames(fileInput);
        }
    });
});

// ── SSE Streaming ──
// Handle SSE chunks for chat streaming
let streamBuffer = '';
let activeSSE = null;

document.addEventListener('htmx:sseBeforeMessage', function(e) {
    // This fires for each SSE event received by htmx
});

// Watch for SSE trigger divs being added to the DOM
const observer = new MutationObserver(mutations => {
    for (const m of mutations) {
        for (const node of m.addedNodes) {
            if (node.id === 'sse-trigger') {
                setupStream(node);
            }
        }
    }
});

document.addEventListener('DOMContentLoaded', () => {
    const chatMessages = document.getElementById('chat-messages');
    if (chatMessages) {
        observer.observe(chatMessages, { childList: true, subtree: true });
    }
});

function setupStream(triggerDiv) {
    const sseUrl = triggerDiv.getAttribute('sse-connect');
    if (!sseUrl) return;

    // Remove the htmx SSE to manage manually
    triggerDiv.remove();

    const responseDiv = document.getElementById('assistant-response');
    if (!responseDiv) return;

    streamBuffer = '';

    // Show typing indicator
    responseDiv.innerHTML = '<div class="typing-indicator"><span></span><span></span><span></span></div>';

    const source = new EventSource(sseUrl);
    activeSSE = source;

    source.addEventListener('chunk', function(e) {
        if (streamBuffer === '') {
            // Remove typing indicator on first chunk
            responseDiv.innerHTML = '';
        }
        streamBuffer += e.data;
        // Render markdown
        try {
            responseDiv.innerHTML = marked.parse(streamBuffer);
        } catch {
            responseDiv.textContent = streamBuffer;
        }
        scrollChat();
    });

    source.addEventListener('done', function(e) {
        source.close();
        activeSSE = null;
        // Final markdown render
        if (streamBuffer) {
            try {
                responseDiv.innerHTML = marked.parse(streamBuffer);
            } catch {
                responseDiv.textContent = streamBuffer;
            }
        }
        streamBuffer = '';

        // Update conv_id from done event data
        const convId = e.data;
        if (convId) {
            document.getElementById('input-conv-id').value = convId;
            saveSession({ conv_id: convId });
        }

        // Refresh sidebar
        refreshSidebar();
        scrollChat();
    });

    source.onerror = function() {
        source.close();
        activeSSE = null;
        if (!streamBuffer) {
            responseDiv.innerHTML = '<span class="text-err">Connection lost</span>';
        }
    };
}

// ── Markdown Rendering ──
function renderAllMarkdown() {
    document.querySelectorAll('.message-content').forEach(el => {
        const raw = el.textContent;
        if (raw && typeof marked !== 'undefined') {
            try {
                el.innerHTML = marked.parse(raw);
            } catch { /* keep raw text */ }
        }
    });
}

// ── Mobile Sidebar ──
function toggleSidebar() {
    const sidebar = document.getElementById('chat-sidebar');
    const overlay = document.getElementById('sidebar-overlay');
    if (sidebar) {
        sidebar.classList.toggle('open');
    }
    if (overlay) {
        overlay.classList.toggle('visible');
    }
}

// ── Notification Banner (chat page) ──
function setupChatNotifications() {
    const banner = document.getElementById('notification-banner');
    if (!banner) return;

    const source = new EventSource('/sse/notifications');
    source.addEventListener('notification', function(e) {
        const notif = JSON.parse(e.data);
        banner.textContent = notif.message;
        banner.style.display = 'block';
        // Auto-hide after 15s
        setTimeout(() => { banner.style.display = 'none'; }, 15000);
    });
}

document.addEventListener('DOMContentLoaded', setupChatNotifications);

// ── HTMX Events ──
document.addEventListener('scrollChat', scrollChat);
document.addEventListener('htmx:afterSwap', function(e) {
    if (e.target.id === 'chat-messages') {
        renderAllMarkdown();
        scrollChat();
    }
});
@@ -1,27 +0,0 @@
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>{% block title %}CheddahBot{% endblock %}</title>
    <link rel="stylesheet" href="/static/app.css">
    <script src="https://unpkg.com/htmx.org@2.0.4"></script>
    <script src="https://unpkg.com/htmx-ext-sse@2.3.0/sse.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
    {% block head %}{% endblock %}
</head>
<body>
    <nav class="top-nav">
        <div class="nav-brand">CheddahBot</div>
        <div class="nav-links">
            <a href="/" class="nav-link {% block nav_chat_active %}{% endblock %}">Chat</a>
            <a href="/dashboard" class="nav-link {% block nav_dash_active %}{% endblock %}">Dashboard</a>
        </div>
    </nav>
    <main class="main-content">
        {% block content %}{% endblock %}
    </main>
    <script src="/static/app.js"></script>
    {% block scripts %}{% endblock %}
</body>
</html>
@@ -1,111 +0,0 @@
{% extends "base.html" %}

{% block title %}Chat - CheddahBot{% endblock %}
{% block nav_chat_active %}active{% endblock %}

{% block content %}
<div class="chat-layout">
    <!-- Sidebar -->
    <aside class="chat-sidebar" id="chat-sidebar">
        <div class="sidebar-header">
            <h3>Agents</h3>
            <button class="sidebar-toggle" onclick="toggleSidebar()" aria-label="Close sidebar">✕</button>
        </div>
        <div class="agent-selector" id="agent-selector">
            {% for agent in agents %}
            <button
                class="agent-btn {% if agent.name == default_agent %}active{% endif %}"
                data-agent="{{ agent.name }}"
                onclick="switchAgent('{{ agent.name }}')"
            >{{ agent.display_name }}</button>
            {% endfor %}
        </div>

        <div class="sidebar-divider"></div>

        <button
            class="btn btn-new-chat"
            hx-post="/chat/new"
            hx-vals='{"agent_name": "{{ default_agent }}"}'
            hx-target="#chat-messages"
            hx-swap="innerHTML"
            onclick="this.setAttribute('hx-vals', JSON.stringify({agent_name: getActiveAgent()}))"
        >+ New Chat</button>

        <h3>History</h3>
        <div id="sidebar-conversations"
             hx-get="/chat/conversations?agent_name={{ default_agent }}"
             hx-trigger="load"
             hx-swap="innerHTML">
        </div>
    </aside>

    <!-- Mobile sidebar toggle + overlay -->
    <button class="sidebar-open-btn" onclick="toggleSidebar()" aria-label="Open sidebar">☰</button>
    <div id="sidebar-overlay" class="sidebar-overlay" onclick="toggleSidebar()"></div>

    <!-- Chat area -->
    <div class="chat-main">
        <!-- Status bar -->
        <div class="status-bar">
            <span class="status-item">Model: <strong>{{ chat_model }}</strong></span>
            <span class="status-item">Exec: <strong class="{% if exec_available %}text-ok{% else %}text-err{% endif %}">{{ "OK" if exec_available else "N/A" }}</strong></span>
            <span class="status-item">ClickUp: <strong class="{% if clickup_enabled %}text-ok{% else %}text-err{% endif %}">{{ "ON" if clickup_enabled else "OFF" }}</strong></span>
        </div>

        <!-- Notification banner (populated by SSE) -->
        <div id="notification-banner" class="notification-banner" style="display:none;"></div>

        <!-- Messages -->
        <div class="chat-messages" id="chat-messages">
            <!-- Messages loaded here -->
        </div>

        <!-- Input area -->
        <form id="chat-form" class="chat-input-area"
              hx-post="/chat/send"
              hx-target="#chat-messages"
              hx-swap="beforeend"
              hx-encoding="multipart/form-data"
              hx-on::after-request="afterSend(event)">
            <input type="hidden" name="agent_name" id="input-agent-name" value="{{ default_agent }}">
            <input type="hidden" name="conv_id" id="input-conv-id" value="">
            <div class="input-row">
                <label class="file-upload-btn" title="Attach files">
                    📎
                    <input type="file" name="files" multiple style="display:none;" onchange="showFileNames(this)">
                </label>
                <textarea name="text" id="chat-input" rows="1" placeholder="Type a message..."
                          onkeydown="handleKeydown(event)" oninput="autoResize(this)"></textarea>
                <button type="submit" class="send-btn" title="Send">➤</button>
            </div>
            <div id="file-preview" class="file-preview" style="display:none;"></div>
        </form>
    </div>
</div>
{% endblock %}

{% block scripts %}
<script>
    // Initialize session state
    const SESSION_KEY = 'cheddahbot_session';
    let session = JSON.parse(localStorage.getItem(SESSION_KEY) || '{}');
    if (!session.agent_name) session.agent_name = '{{ default_agent }}';

    // Restore session on load
    document.addEventListener('DOMContentLoaded', function() {
        if (session.agent_name) {
            setActiveAgent(session.agent_name);
        }
        if (session.conv_id) {
            loadConversation(session.conv_id);
        }
        // Load conversations for sidebar
        refreshSidebar();
    });

    function saveSession() {
        localStorage.setItem(SESSION_KEY, JSON.stringify(session));
    }
</script>
{% endblock %}
@ -1,174 +0,0 @@
|
||||||
{% extends "base.html" %}
|
|
||||||
|
|
||||||
{% block title %}Dashboard - CheddahBot{% endblock %}
|
|
||||||
{% block nav_dash_active %}active{% endblock %}
|
|
||||||
|
|
||||||
{% block content %}
|
|
||||||
<div class="dashboard-layout">
|
|
||||||
|
|
||||||
<!-- Ops Panel -->
|
|
||||||
<section class="panel" id="ops-panel">
|
|
||||||
<h2 class="panel-title">Operations</h2>
|
|
||||||
|
|
||||||
<!-- Active Executions -->
|
|
||||||
<div class="panel-section">
|
|
||||||
<h3>Active Executions</h3>
|
|
||||||
<div id="active-executions" class="exec-list">
|
|
||||||
<span class="text-muted">Loading...</span>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
<!-- Loop Health -->
|
|
||||||
<div class="panel-section">
|
|
||||||
<h3>Loop Health</h3>
|
|
||||||
<div id="loop-health" class="loop-grid">
|
|
||||||
<span class="text-muted">Loading...</span>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
<!-- Actions -->
|
|
||||||
<div class="panel-section">
|
|
||||||
<h3>Actions</h3>
|
|
||||||
<div class="action-buttons">
|
|
||||||
<button class="btn btn-sm"
|
|
||||||
hx-post="/api/system/loops/force"
|
|
||||||
hx-swap="none"
|
|
||||||
hx-on::after-request="showFlash('Force pulse sent')">
|
|
||||||
Force Pulse
|
|
||||||
</button>
|
|
||||||
<button class="btn btn-sm"
|
|
||||||
hx-post="/api/system/briefing/force"
|
|
||||||
hx-swap="none"
|
|
||||||
hx-on::after-request="showFlash('Briefing triggered')">
|
|
||||||
Force Briefing
|
|
||||||
</button>
|
|
||||||
<button class="btn btn-sm"
|
|
||||||
hx-post="/api/cache/clear"
|
|
||||||
hx-swap="none"
|
|
||||||
hx-on::after-request="showFlash('Cache cleared')">
|
|
||||||
Clear Cache
|
|
||||||
</button>
|
|
||||||
</div>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
<!-- Notification Feed -->
|
|
||||||
<div class="panel-section">
|
|
||||||
<h3>Notifications</h3>
|
|
||||||
<div id="notification-feed" class="notif-feed">
|
|
||||||
<span class="text-muted">Waiting for notifications...</span>
|
|
||||||
</div>
|
|
||||||
-    </div>
-  </section>
-
-  <!-- Pipeline Panel -->
-  <section class="panel" id="pipeline-panel">
-    <h2 class="panel-title">Pipeline</h2>
-    <div id="pipeline-content"
-         hx-get="/dashboard/pipeline"
-         hx-trigger="load, every 120s"
-         hx-swap="innerHTML">
-      <span class="text-muted">Loading pipeline data...</span>
-    </div>
-  </section>
-
-</div>
-{% endblock %}
-
-{% block scripts %}
-<script>
-// Connect to SSE for live loop updates
-const loopSource = new EventSource('/sse/loops');
-loopSource.addEventListener('loops', function(e) {
-  const data = JSON.parse(e.data);
-  renderLoopHealth(data.loops);
-  renderActiveExecutions(data.executions);
-});
-
-// Connect to SSE for notifications
-const notifSource = new EventSource('/sse/notifications');
-notifSource.addEventListener('notification', function(e) {
-  const notif = JSON.parse(e.data);
-  addNotification(notif.message, notif.category);
-});
-
-function renderLoopHealth(loops) {
-  const container = document.getElementById('loop-health');
-  if (!loops || Object.keys(loops).length === 0) {
-    container.innerHTML = '<span class="text-muted">No loop data</span>';
-    return;
-  }
-  let html = '';
-  const now = new Date();
-  for (const [name, ts] of Object.entries(loops)) {
-    let statusClass = 'badge-muted';
-    let agoText = 'never';
-    if (ts) {
-      const dt = new Date(ts);
-      const secs = Math.floor((now - dt) / 1000);
-      if (secs < 120) {
-        statusClass = 'badge-ok';
-        agoText = secs + 's ago';
-      } else if (secs < 600) {
-        statusClass = 'badge-warn';
-        agoText = Math.floor(secs / 60) + 'm ago';
-      } else {
-        statusClass = 'badge-err';
-        agoText = Math.floor(secs / 60) + 'm ago';
-      }
-    }
-    html += '<div class="loop-badge ' + statusClass + '">' +
-      '<span class="loop-name">' + name + '</span>' +
-      '<span class="loop-ago">' + agoText + '</span>' +
-      '</div>';
-  }
-  container.innerHTML = html;
-}
-
-function renderActiveExecutions(execs) {
-  const container = document.getElementById('active-executions');
-  if (!execs || Object.keys(execs).length === 0) {
-    container.innerHTML = '<span class="text-muted">No active executions</span>';
-    return;
-  }
-  let html = '';
-  const now = new Date();
-  for (const [id, info] of Object.entries(execs)) {
-    const started = new Date(info.started_at);
-    const durSecs = Math.floor((now - started) / 1000);
-    let dur = durSecs + 's';
-    if (durSecs >= 60) dur = Math.floor(durSecs / 60) + 'm ' + (durSecs % 60) + 's';
-    html += '<div class="exec-item">' +
-      '<span class="exec-name">' + info.name + '</span>' +
-      '<span class="exec-tool">' + info.tool + '</span>' +
-      '<span class="exec-dur">' + dur + '</span>' +
-      '</div>';
-  }
-  container.innerHTML = html;
-}
-
-let notifCount = 0;
-function addNotification(message, category) {
-  const container = document.getElementById('notification-feed');
-  if (notifCount === 0) container.innerHTML = '';
-  notifCount++;
-
-  const div = document.createElement('div');
-  div.className = 'notif-item notif-' + (category || 'info');
-  div.innerHTML = '<span class="notif-cat">' + (category || 'info') + '</span> ' + message;
-  container.insertBefore(div, container.firstChild);
-
-  // Keep max 30
-  while (container.children.length > 30) {
-    container.removeChild(container.lastChild);
-  }
-}
-
-function showFlash(msg) {
-  const el = document.createElement('div');
-  el.className = 'flash-msg';
-  el.textContent = msg;
-  document.body.appendChild(el);
-  setTimeout(() => el.remove(), 3000);
-}
-</script>
-{% endblock %}
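The template above listens on two EventSource streams, `/sse/loops` and `/sse/notifications`, and re-renders badges on every `loops` event. The serving framework is not visible in this diff, so purely as a hedged sketch, assuming a Flask-style app and that the `scheduler.get_loop_timestamps()` / `get_active_executions()` helpers seen later in this diff return JSON-serializable dicts, the producing endpoint could look like this:

# Hedged sketch only: route name, cadence, and framework are illustrative.
import json
import time

from flask import Flask, Response

app = Flask(__name__)


@app.route("/sse/loops")
def sse_loops():
    def stream():
        while True:
            # `scheduler` is assumed to be injected from the app context
            payload = {
                "loops": scheduler.get_loop_timestamps(),         # {name: iso timestamp}
                "executions": scheduler.get_active_executions(),  # {id: {name, tool, started_at}}
            }
            # One SSE frame: named event + data line, terminated by a blank line
            yield f"event: loops\ndata: {json.dumps(payload)}\n\n"
            time.sleep(5)

    return Response(stream(), mimetype="text/event-stream")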
@@ -1,6 +0,0 @@
-<div class="message {{ role }}">
-  <div class="message-avatar">{% if role == 'user' %}You{% else %}CB{% endif %}</div>
-  <div class="message-body">
-    <div class="message-content">{{ content }}</div>
-  </div>
-</div>
@@ -1,11 +0,0 @@
-{% if conversations %}
-  {% for conv in conversations %}
-  <button class="conv-btn"
-          onclick="loadConversation('{{ conv.id }}')"
-          title="{{ conv.title or 'New Chat' }}">
-    {{ conv.title or 'New Chat' }}
-  </button>
-  {% endfor %}
-{% else %}
-  <p class="text-muted">No conversations yet</p>
-{% endif %}
@@ -1,6 +0,0 @@
-{% for name, info in loops.items() %}
-<div class="loop-badge {{ info.class }}">
-  <span class="loop-name">{{ name }}</span>
-  <span class="loop-ago">{{ info.ago }}</span>
-</div>
-{% endfor %}
@@ -1,6 +0,0 @@
-{% for notif in notifications %}
-<div class="notif-item notif-{{ notif.category or 'info' }}">
-  <span class="notif-cat">{{ notif.category }}</span>
-  {{ notif.message }}
-</div>
-{% endfor %}
@@ -1,27 +0,0 @@
-{% if tasks %}
-<table class="task-table">
-  <thead>
-    <tr>
-      <th>Task</th>
-      <th>Customer</th>
-      <th>Status</th>
-      <th>Due</th>
-    </tr>
-  </thead>
-  <tbody>
-    {% for task in tasks %}
-    <tr>
-      <td>
-        {% if task.url %}<a href="{{ task.url }}" target="_blank" rel="noopener">{{ task.name }}</a>
-        {% else %}{{ task.name }}{% endif %}
-      </td>
-      <td>{{ task.custom_fields.get('Client', 'N/A') if task.custom_fields else 'N/A' }}</td>
-      <td><span class="status-badge status-{{ task.status|replace(' ', '-') }}">{{ task.status }}</span></td>
-      <td>{{ task.due_display or '-' }}</td>
-    </tr>
-    {% endfor %}
-  </tbody>
-</table>
-{% else %}
-<p class="text-muted">No tasks</p>
-{% endif %}
@@ -105,7 +105,6 @@ class ToolRegistry:
         self.db = db
         self.agent = agent
         self.agent_registry = None  # set after multi-agent setup
-        self.scheduler = None  # set after scheduler creation
         self._discover_tools()

     def _discover_tools(self):
@@ -159,13 +158,10 @@ class ToolRegistry:
             "agent": self.agent,
             "memory": self.agent._memory,
             "agent_registry": self.agent_registry,
-            "scheduler": self.scheduler,
         }
         # Pass scheduler-injected metadata through ctx (not LLM-visible)
         if "clickup_task_id" in args:
            ctx["clickup_task_id"] = args.pop("clickup_task_id")
-        if "clickup_task_status" in args:
-            ctx["clickup_task_status"] = args.pop("clickup_task_status")
        args["ctx"] = ctx

        # Filter args to only params the function accepts (plus **kwargs)
@@ -52,15 +52,15 @@ def _get_clickup_client(ctx: dict):


 def _find_qualifying_tasks(client, config, target_date: str, categories: list[str]):
-    """Find 'to do' tasks in cora_categories due on target_date (single day).
+    """Find 'to do' tasks in cora_categories due on target_date.

-    Used when target_date is explicitly provided.
     Returns list of ClickUpTask objects.
     """
     space_id = config.clickup.space_id
     if not space_id:
         return []

+    # Parse target date to filter by due_date range (full day)
     try:
         dt = datetime.strptime(target_date, "%Y-%m-%d").replace(tzinfo=UTC)
     except ValueError:
@@ -78,8 +78,10 @@ def _find_qualifying_tasks(client, config, target_date: str, categories: list[str]):

     qualifying = []
     for task in tasks:
+        # Must be in one of the cora categories
         if task.task_type not in categories:
             continue
+        # Must have a due_date within the target day
         if not task.due_date:
             continue
         try:
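Due dates here are compared as Unix-millisecond timestamps, so the explicit `target_date` path boils down to a half-open day window. As a standalone illustration (the helper name is ours, not the codebase's):

from datetime import UTC, datetime


def day_window_ms(target_date: str) -> tuple[int, int]:
    """Map 'YYYY-MM-DD' to a half-open [start_ms, end_ms) UTC day window."""
    dt = datetime.strptime(target_date, "%Y-%m-%d").replace(tzinfo=UTC)
    start_ms = int(dt.timestamp() * 1000)
    end_ms = start_ms + 24 * 60 * 60 * 1000
    return start_ms, end_ms


# A task qualifies for the day when: start_ms <= int(task.due_date) < end_ms
start_ms, end_ms = day_window_ms("2026-02-12")
assert end_ms - start_ms == 86_400_000  # exactly one day of milliseconds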
@@ -93,129 +95,17 @@ def _find_qualifying_tasks(client, config, target_date: str, categories: list[str]):
     return qualifying


-def _find_qualifying_tasks_sweep(client, config, categories: list[str]):
-    """Multi-pass sweep for qualifying tasks when no explicit date is given.
-
-    Pass 1: Tasks due today
-    Pass 2: Overdue tasks tagged with current month (e.g. "feb26")
-    Pass 3: Tasks tagged with last month (e.g. "jan26"), still "to do"
-    Pass 4: Tasks due in next 2 days (look-ahead)
-
-    Deduplicates across passes by task ID.
-    Returns list of ClickUpTask objects.
-    """
+def _find_all_todo_tasks(client, config, categories: list[str]):
+    """Find ALL 'to do' tasks in cora_categories (no date filter).
+
+    Used to find sibling tasks sharing the same keyword.
+    """
     space_id = config.clickup.space_id
     if not space_id:
         return []

-    now = datetime.now(UTC)
-    today_start_ms = int(
-        now.replace(hour=0, minute=0, second=0, microsecond=0).timestamp() * 1000
-    )
-    today_end_ms = today_start_ms + 24 * 60 * 60 * 1000
-    lookahead_end_ms = today_start_ms + 3 * 24 * 60 * 60 * 1000  # +2 days
-
-    # Current and last month tags (e.g. "feb26", "jan26")
-    current_month_tag = now.strftime("%b%y").lower()
-    # Go back one month
-    if now.month == 1:
-        last_month = now.replace(year=now.year - 1, month=12)
-    else:
-        last_month = now.replace(month=now.month - 1)
-    last_month_tag = last_month.strftime("%b%y").lower()
-
-    # Fetch all "to do" tasks with due dates up to lookahead
-    all_tasks = client.get_tasks_from_space(
-        space_id,
-        statuses=["to do"],
-        due_date_lt=lookahead_end_ms,
-    )
-
-    # Filter to cora categories
-    cora_tasks = [t for t in all_tasks if t.task_type in categories]
-
-    seen_ids: set[str] = set()
-    qualifying: list = []
-
-    def _add(task):
-        if task.id not in seen_ids:
-            seen_ids.add(task.id)
-            qualifying.append(task)
-
-    # Pass 1: Due today
-    for task in cora_tasks:
-        if not task.due_date:
-            continue
-        try:
-            due_ms = int(task.due_date)
-        except (ValueError, TypeError):
-            continue
-        if today_start_ms <= due_ms < today_end_ms:
-            _add(task)
-
-    # Pass 2: Overdue + tagged with current month
-    for task in cora_tasks:
-        if not task.due_date:
-            continue
-        try:
-            due_ms = int(task.due_date)
-        except (ValueError, TypeError):
-            continue
-        if due_ms < today_start_ms and current_month_tag in task.tags:
-            _add(task)
-
-    # Pass 3: Tagged with last month, still "to do"
-    for task in cora_tasks:
-        if last_month_tag in task.tags:
-            _add(task)
-
-    # Pass 4: Look-ahead (due in next 2 days, excluding today which was pass 1)
-    for task in cora_tasks:
-        if not task.due_date:
-            continue
-        try:
-            due_ms = int(task.due_date)
-        except (ValueError, TypeError):
-            continue
-        if today_end_ms <= due_ms < lookahead_end_ms:
-            _add(task)
-
-    log.info(
-        "AutoCora sweep: %d qualifying tasks "
-        "(today=%d, overdue+month=%d, last_month=%d, lookahead=%d)",
-        len(qualifying),
-        sum(1 for t in qualifying if _is_due_today(t, today_start_ms, today_end_ms)),
-        sum(1 for t in qualifying if _is_overdue_with_tag(t, today_start_ms, current_month_tag)),
-        sum(1 for t in qualifying if last_month_tag in t.tags),
-        sum(1 for t in qualifying if _is_lookahead(t, today_end_ms, lookahead_end_ms)),
-    )
-
-    return qualifying
-
-
-def _is_due_today(task, start_ms, end_ms) -> bool:
-    try:
-        due = int(task.due_date)
-        return start_ms <= due < end_ms
-    except (ValueError, TypeError):
-        return False
-
-
-def _is_overdue_with_tag(task, today_start_ms, tag) -> bool:
-    try:
-        due = int(task.due_date)
-        return due < today_start_ms and tag in task.tags
-    except (ValueError, TypeError):
-        return False
-
-
-def _is_lookahead(task, today_end_ms, lookahead_end_ms) -> bool:
-    try:
-        due = int(task.due_date)
-        return today_end_ms <= due < lookahead_end_ms
-    except (ValueError, TypeError):
-        return False
+    tasks = client.get_tasks_from_space(space_id, statuses=["to do"])
+    return [t for t in tasks if t.task_type in categories]


 def _group_by_keyword(tasks, all_tasks):
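The sweep deleted above keys passes 2 and 3 on lowercase `%b%y` month tags ("feb26", "jan26"), rolling back manually because there is no month 0 to pass to `datetime.replace()` in January. Isolated for clarity (the helper name is ours):

from datetime import UTC, datetime


def month_tags(now: datetime | None = None) -> tuple[str, str]:
    """Return (current_month_tag, last_month_tag), e.g. ('feb26', 'jan26')."""
    now = now or datetime.now(UTC)
    current = now.strftime("%b%y").lower()
    if now.month == 1:
        # No month 0: roll back to December of the previous year
        last = now.replace(year=now.year - 1, month=12)
    else:
        # Note: replace() raises ValueError on day overflow (e.g. Mar 31 -> Feb 31)
        last = now.replace(month=now.month - 1)
    return current, last.strftime("%b%y").lower()


assert month_tags(datetime(2026, 1, 15, tzinfo=UTC)) == ("jan26", "dec25")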
@@ -245,7 +135,8 @@ def _group_by_keyword(tasks, all_tasks):
         url = task.custom_fields.get("IMSURL", "") or ""
         url = str(url).strip()
         if not url:
-            url = "https://seotoollab.com/blank.html"
+            alerts.append(f"Task '{task.name}' (id={task.id}) missing IMSURL field")
+            continue

         kw_lower = keyword.lower()
         if kw_lower not in groups:
@@ -275,8 +166,7 @@ def _group_by_keyword(tasks, all_tasks):

 @tool(
     "submit_autocora_jobs",
-    "Submit Cora SEO report jobs for ClickUp tasks. Uses a multi-pass sweep "
-    "(today, overdue, last month, look-ahead) unless a specific date is given. "
+    "Submit Cora SEO report jobs for ClickUp tasks due on a given date. "
     "Writes job JSON files to the AutoCora shared folder queue.",
     category="autocora",
 )
@@ -284,36 +174,38 @@ def submit_autocora_jobs(target_date: str = "", ctx: dict | None = None) -> str:
     """Submit AutoCora jobs for qualifying ClickUp tasks.

     Args:
-        target_date: Date to check (YYYY-MM-DD). Empty = multi-pass sweep.
+        target_date: Date to check (YYYY-MM-DD). Defaults to today.
         ctx: Injected context with config, db, etc.
     """
     if not ctx:
         return "Error: context not available"

     config = ctx["config"]
+    db = ctx["db"]
     autocora = config.autocora

     if not autocora.enabled:
         return "AutoCora is disabled in config."

+    if not target_date:
+        target_date = datetime.now(UTC).strftime("%Y-%m-%d")
+
     if not config.clickup.api_token:
         return "Error: ClickUp API token not configured"

     client = _get_clickup_client(ctx)

-    # Find qualifying tasks — sweep or single-day
-    if target_date:
-        qualifying = _find_qualifying_tasks(client, config, target_date, autocora.cora_categories)
-        label = target_date
-    else:
-        qualifying = _find_qualifying_tasks_sweep(client, config, autocora.cora_categories)
-        label = "sweep"
+    # Find qualifying tasks (due on target_date, in cora_categories, status "to do")
+    qualifying = _find_qualifying_tasks(client, config, target_date, autocora.cora_categories)

     if not qualifying:
-        return f"No qualifying tasks found ({label})."
+        return f"No qualifying tasks found for {target_date}."

-    # Group by keyword — only siblings that also passed the sweep qualify
-    groups, alerts = _group_by_keyword(qualifying, qualifying)
+    # Find ALL to-do tasks in cora categories for sibling keyword matching
+    all_todo = _find_all_todo_tasks(client, config, autocora.cora_categories)
+
+    # Group by keyword
+    groups, alerts = _group_by_keyword(qualifying, all_todo)

     if not groups and alerts:
         return "No jobs submitted.\n\n" + "\n".join(f"- {a}" for a in alerts)
@@ -326,13 +218,19 @@ def submit_autocora_jobs(target_date: str = "", ctx: dict | None = None) -> str:
     skipped = []

     for kw_lower, group in groups.items():
-        # Check if a job file already exists for this keyword (dedup by file)
-        existing_jobs = list(jobs_dir.glob(f"job-*-{_slugify(group['keyword'])}*.json"))
-        if existing_jobs:
-            skipped.append(group["keyword"])
-            continue
+        # Check KV for existing submission
+        kv_key = f"autocora:job:{kw_lower}"
+        existing = db.kv_get(kv_key)
+        if existing:
+            try:
+                state = json.loads(existing)
+                if state.get("status") == "submitted":
+                    skipped.append(group["keyword"])
+                    continue
+            except json.JSONDecodeError:
+                pass

-        # Write job file (contains task_ids for the result poller)
+        # Write job file
         job_id = _make_job_id(group["keyword"])
         job_data = {
             "keyword": group["keyword"],
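The `autocora:job:{keyword}` record replaces the old glob-for-existing-job-file check with a KV idempotency guard: resubmission is skipped only while `status` is "submitted", which the poller later advances to "completed" or "failed". Condensed into two helpers (the names are ours; the `db.kv_get`/`kv_set` interface is the one used throughout this diff):

import json
from datetime import UTC, datetime


def already_submitted(db, keyword: str) -> bool:
    """True while a pending job record exists for this keyword."""
    raw = db.kv_get(f"autocora:job:{keyword.lower()}")
    if not raw:
        return False
    try:
        return json.loads(raw).get("status") == "submitted"
    except json.JSONDecodeError:
        return False  # corrupt record: allow resubmission


def mark_submitted(db, keyword: str, job_id: str, url: str, task_ids: list) -> None:
    """Write the idempotency record mirroring the kv_state dict in the hunk below."""
    db.kv_set(
        f"autocora:job:{keyword.lower()}",
        json.dumps({
            "status": "submitted",
            "job_id": job_id,
            "keyword": keyword,
            "url": url,
            "task_ids": task_ids,
            "submitted_at": datetime.now(UTC).isoformat(),
        }),
    )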
@@ -342,21 +240,28 @@ def submit_autocora_jobs(target_date: str = "", ctx: dict | None = None) -> str:
         job_path = jobs_dir / f"{job_id}.json"
         job_path.write_text(json.dumps(job_data, indent=2), encoding="utf-8")

-        # Move ClickUp tasks to "automation underway"
-        for tid in group["task_ids"]:
-            client.update_task_status(tid, "automation underway")
+        # Track in KV
+        kv_state = {
+            "status": "submitted",
+            "job_id": job_id,
+            "keyword": group["keyword"],
+            "url": group["url"],
+            "task_ids": group["task_ids"],
+            "submitted_at": datetime.now(UTC).isoformat(),
+        }
+        db.kv_set(kv_key, json.dumps(kv_state))

         submitted.append(group["keyword"])
-        log.info("Submitted AutoCora job: %s -> %s", group["keyword"], job_id)
+        log.info("Submitted AutoCora job: %s → %s", group["keyword"], job_id)

     # Build response
-    lines = [f"AutoCora submission ({label}):"]
+    lines = [f"AutoCora submission for {target_date}:"]
     if submitted:
         lines.append(f"\nSubmitted {len(submitted)} job(s):")
         for kw in submitted:
             lines.append(f" - {kw}")
     if skipped:
-        lines.append(f"\nSkipped {len(skipped)} (job file already exists):")
+        lines.append(f"\nSkipped {len(skipped)} (already submitted):")
         for kw in skipped:
             lines.append(f" - {kw}")
     if alerts:
@@ -370,61 +275,93 @@ def submit_autocora_jobs(target_date: str = "", ctx: dict | None = None) -> str:
 @tool(
     "poll_autocora_results",
     "Poll the AutoCora results folder for completed Cora SEO report jobs. "
-    "Scans for .result files, reads task_ids from the JSON, updates ClickUp, "
-    "then moves the result file to a processed/ subfolder.",
+    "Updates ClickUp task statuses based on results.",
     category="autocora",
 )
 def poll_autocora_results(ctx: dict | None = None) -> str:
     """Poll for AutoCora results and update ClickUp tasks.

-    Scans the results folder for .result files. Each result file is JSON
-    containing {status, task_ids, keyword, ...}. After processing, the
-    result file is moved to results/processed/ to avoid re-processing.
+    Args:
+        ctx: Injected context with config, db, etc.
     """
     if not ctx:
         return "Error: context not available"

     config = ctx["config"]
+    db = ctx["db"]
     autocora = config.autocora

     if not autocora.enabled:
         return "AutoCora is disabled in config."

+    # Find all submitted jobs in KV
+    kv_entries = db.kv_scan("autocora:job:")
+    submitted = []
+    for key, value in kv_entries:
+        try:
+            state = json.loads(value)
+            if state.get("status") == "submitted":
+                submitted.append((key, state))
+        except json.JSONDecodeError:
+            continue
+
+    if not submitted:
+        return "No pending AutoCora jobs to check."
+
     results_dir = Path(autocora.results_dir)
     if not results_dir.exists():
         return f"Results directory does not exist: {results_dir}"

-    # Scan for .result files
-    result_files = list(results_dir.glob("*.result"))
-    if not result_files:
-        return "No result files found in results folder."
-
     client = None
     if config.clickup.api_token:
         client = _get_clickup_client(ctx)

-    processed_dir = results_dir / "processed"
     processed = []
+    still_pending = []

-    for result_path in result_files:
+    for kv_key, state in submitted:
+        job_id = state.get("job_id", "")
+        if not job_id:
+            continue
+
+        result_path = results_dir / f"{job_id}.result"
+        if not result_path.exists():
+            still_pending.append(state.get("keyword", job_id))
+            continue
+
+        # Read and parse result
         raw = result_path.read_text(encoding="utf-8").strip()
         result_data = _parse_result(raw)

-        task_ids = result_data.get("task_ids", [])
+        # Get task_ids: prefer result file, fall back to KV
+        task_ids = result_data.get("task_ids") or state.get("task_ids", [])
+
         status = result_data.get("status", "UNKNOWN")
-        keyword = result_data.get("keyword", result_path.stem)
+        keyword = state.get("keyword", "")

         if status == "SUCCESS":
+            # Update KV
+            state["status"] = "completed"
+            state["completed_at"] = datetime.now(UTC).isoformat()
+            db.kv_set(kv_key, json.dumps(state))
+
+            # Update ClickUp tasks
             if client and task_ids:
                 for tid in task_ids:
                     client.update_task_status(tid, autocora.success_status)
-                    client.add_comment(tid, f"Cora report generated for \"{keyword}\" — ready for you to look at it.")
+                    client.add_comment(tid, f"Cora report completed for keyword: {keyword}")

             processed.append(f"SUCCESS: {keyword}")
             log.info("AutoCora SUCCESS: %s", keyword)

         elif status == "FAILURE":
             reason = result_data.get("reason", "unknown error")
+            state["status"] = "failed"
+            state["error"] = reason
+            state["completed_at"] = datetime.now(UTC).isoformat()
+            db.kv_set(kv_key, json.dumps(state))
+
+            # Update ClickUp tasks
             if client and task_ids:
                 for tid in task_ids:
                     client.update_task_status(tid, autocora.error_status)
@@ -438,19 +375,16 @@ def poll_autocora_results(ctx: dict | None = None) -> str:
         else:
             processed.append(f"UNKNOWN: {keyword} (status={status})")

-        # Move result file to processed/ so it's not re-processed
-        processed_dir.mkdir(exist_ok=True)
-        try:
-            result_path.rename(processed_dir / result_path.name)
-        except OSError as e:
-            log.warning("Could not move result file %s: %s", result_path.name, e)

     # Build response
     lines = ["AutoCora poll results:"]
     if processed:
         lines.append(f"\nProcessed {len(processed)} result(s):")
         for p in processed:
             lines.append(f" - {p}")
+    if still_pending:
+        lines.append(f"\nStill pending ({len(still_pending)}):")
+        for kw in still_pending:
+            lines.append(f" - {kw}")

     return "\n".join(lines)
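`_parse_result` itself is outside this diff, but from the fields the poller reads off `result_data`, a `.result` payload is JSON shaped roughly like the examples below; treat the exact schema as inferred, not confirmed:

# Inferred shape of {job_id}.result for a successful job
success_result = {
    "status": "SUCCESS",
    "keyword": "example keyword",
    "task_ids": ["abc123", "def456"],  # optional; the poller falls back to the KV record
}

# Inferred shape for a failed job
failure_result = {
    "status": "FAILURE",
    "reason": "Cora run did not produce a report",
}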
@@ -1,9 +1,9 @@
-"""ClickUp chat-facing tools for listing, querying, and resetting tasks."""
+"""ClickUp chat-facing tools for listing, approving, and declining tasks."""

 from __future__ import annotations

+import json
 import logging
-from datetime import UTC, datetime

 from . import tool
@@ -24,6 +24,22 @@ def _get_clickup_client(ctx: dict):
     )


+def _get_clickup_states(db) -> dict[str, dict]:
+    """Load all tracked ClickUp task states from kv_store."""
+    pairs = db.kv_scan("clickup:task:")
+    states = {}
+    for key, value in pairs:
+        # keys look like clickup:task:{id}:state
+        parts = key.split(":")
+        if len(parts) == 4 and parts[3] == "state":
+            task_id = parts[2]
+            try:  # noqa: SIM105
+                states[task_id] = json.loads(value)
+            except json.JSONDecodeError:
+                pass
+    return states
+
+
 @tool(
     "clickup_query_tasks",
     "Query ClickUp live for tasks. Optionally filter by status (e.g. 'to do', 'in progress') "
|
@ -78,286 +94,112 @@ def clickup_query_tasks(status: str = "", task_type: str = "", ctx: dict | None
|
||||||
|
|
||||||
@tool(
|
@tool(
|
||||||
"clickup_list_tasks",
|
"clickup_list_tasks",
|
||||||
"List ClickUp tasks in automation-related statuses (automation underway, "
|
"List ClickUp tasks that Cheddah is tracking. Optionally filter by internal state "
|
||||||
"outline review, internal review, error). Shows tasks currently being processed.",
|
"(executing, completed, failed).",
|
||||||
category="clickup",
|
category="clickup",
|
||||||
)
|
)
|
||||||
def clickup_list_tasks(status: str = "", ctx: dict | None = None) -> str:
|
def clickup_list_tasks(status: str = "", ctx: dict | None = None) -> str:
|
||||||
"""List ClickUp tasks in automation-related statuses."""
|
"""List tracked ClickUp tasks, optionally filtered by state."""
|
||||||
client = _get_clickup_client(ctx)
|
db = ctx["db"]
|
||||||
if not client:
|
states = _get_clickup_states(db)
|
||||||
return "Error: ClickUp API token not configured."
|
|
||||||
|
|
||||||
cfg = ctx["config"].clickup
|
if not states:
|
||||||
if not cfg.space_id:
|
return "No ClickUp tasks are currently being tracked."
|
||||||
return "Error: ClickUp space_id not configured."
|
|
||||||
|
|
||||||
# Query tasks in automation-related statuses
|
|
||||||
automation_statuses = [
|
|
||||||
cfg.automation_status,
|
|
||||||
"outline review",
|
|
||||||
cfg.review_status,
|
|
||||||
cfg.error_status,
|
|
||||||
]
|
|
||||||
if status:
|
if status:
|
||||||
automation_statuses = [status]
|
states = {tid: s for tid, s in states.items() if s.get("state") == status}
|
||||||
|
if not states:
|
||||||
try:
|
return f"No ClickUp tasks with state '{status}'."
|
||||||
tasks = client.get_tasks_from_space(cfg.space_id, statuses=automation_statuses)
|
|
||||||
except Exception as e:
|
|
||||||
return f"Error querying ClickUp: {e}"
|
|
||||||
finally:
|
|
||||||
client.close()
|
|
||||||
|
|
||||||
if not tasks:
|
|
||||||
filter_note = f" with status '{status}'" if status else " in automation statuses"
|
|
||||||
return f"No tasks found{filter_note}."
|
|
||||||
|
|
||||||
lines = []
|
lines = []
|
||||||
for t in tasks:
|
for task_id, state in sorted(states.items(), key=lambda x: x[1].get("discovered_at", "")):
|
||||||
parts = [f"**{t.name}** (ID: {t.id})"]
|
name = state.get("clickup_task_name", "Unknown")
|
||||||
parts.append(f" Status: {t.status} | Type: {t.task_type or '—'}")
|
task_type = state.get("task_type", "—")
|
||||||
fields = {k: v for k, v in t.custom_fields.items() if v}
|
task_state = state.get("state", "unknown")
|
||||||
if fields:
|
skill = state.get("skill_name", "—")
|
||||||
field_strs = [f"{k}: {v}" for k, v in fields.items()]
|
lines.append(
|
||||||
parts.append(f" Fields: {', '.join(field_strs)}")
|
f"• **{name}** (ID: {task_id})\n"
|
||||||
lines.append("\n".join(parts))
|
f" Type: {task_type} | State: {task_state} | Skill: {skill}"
|
||||||
|
)
|
||||||
|
|
||||||
return f"**Automation Tasks ({len(lines)}):**\n\n" + "\n\n".join(lines)
|
return f"**Tracked ClickUp Tasks ({len(lines)}):**\n\n" + "\n\n".join(lines)
|
||||||
|
|
||||||
|
|
||||||
@tool(
|
@tool(
|
||||||
"clickup_task_status",
|
"clickup_task_status",
|
||||||
"Check the current status and details of a ClickUp task by its ID.",
|
"Check the detailed internal processing state of a ClickUp task by its ID.",
|
||||||
category="clickup",
|
category="clickup",
|
||||||
)
|
)
|
||||||
def clickup_task_status(task_id: str, ctx: dict | None = None) -> str:
|
def clickup_task_status(task_id: str, ctx: dict | None = None) -> str:
|
||||||
"""Get current status for a specific ClickUp task from the API."""
|
"""Get detailed state for a specific tracked task."""
|
||||||
client = _get_clickup_client(ctx)
|
db = ctx["db"]
|
||||||
if not client:
|
raw = db.kv_get(f"clickup:task:{task_id}:state")
|
||||||
return "Error: ClickUp API token not configured."
|
if not raw:
|
||||||
|
return f"No tracked state found for task ID '{task_id}'."
|
||||||
|
|
||||||
try:
|
try:
|
||||||
task = client.get_task(task_id)
|
state = json.loads(raw)
|
||||||
except Exception as e:
|
except json.JSONDecodeError:
|
||||||
return f"Error fetching task '{task_id}': {e}"
|
return f"Corrupted state data for task '{task_id}'."
|
||||||
finally:
|
|
||||||
client.close()
|
|
||||||
|
|
||||||
lines = [f"**Task: {task.name}** (ID: {task.id})"]
|
lines = [f"**Task: {state.get('clickup_task_name', 'Unknown')}** (ID: {task_id})"]
|
||||||
lines.append(f"Status: {task.status}")
|
lines.append(f"State: {state.get('state', 'unknown')}")
|
||||||
lines.append(f"Type: {task.task_type or '—'}")
|
lines.append(f"Task Type: {state.get('task_type', '—')}")
|
||||||
if task.url:
|
lines.append(f"Mapped Skill: {state.get('skill_name', '—')}")
|
||||||
lines.append(f"URL: {task.url}")
|
lines.append(f"Discovered: {state.get('discovered_at', '—')}")
|
||||||
if task.due_date:
|
if state.get("started_at"):
|
||||||
lines.append(f"Due: {task.due_date}")
|
lines.append(f"Started: {state['started_at']}")
|
||||||
if task.date_updated:
|
if state.get("completed_at"):
|
||||||
lines.append(f"Updated: {task.date_updated}")
|
lines.append(f"Completed: {state['completed_at']}")
|
||||||
fields = {k: v for k, v in task.custom_fields.items() if v}
|
if state.get("error"):
|
||||||
if fields:
|
lines.append(f"Error: {state['error']}")
|
||||||
field_strs = [f"{k}: {v}" for k, v in fields.items()]
|
if state.get("deliverable_paths"):
|
||||||
lines.append(f"Fields: {', '.join(field_strs)}")
|
lines.append(f"Deliverables: {', '.join(state['deliverable_paths'])}")
|
||||||
|
if state.get("custom_fields"):
|
||||||
|
fields_str = ", ".join(f"{k}: {v}" for k, v in state["custom_fields"].items() if v)
|
||||||
|
if fields_str:
|
||||||
|
lines.append(f"Custom Fields: {fields_str}")
|
||||||
|
|
||||||
return "\n".join(lines)
|
return "\n".join(lines)
|
||||||
|
|
||||||
|
|
||||||
@tool(
|
|
||||||
"clickup_create_task",
|
|
||||||
"Create a new ClickUp task for a client. Requires task name and client name. "
|
|
||||||
"Optionally set work category, description, status, due_date (Unix ms), "
|
|
||||||
"tags (comma-separated), and arbitrary custom fields via custom_fields_json "
|
|
||||||
'(JSON object like {"Keyword":"value","CLIFlags":"--tier1-count 5"}). '
|
|
||||||
"The task is created in the 'Overall' list within the client's folder.",
|
|
||||||
category="clickup",
|
|
||||||
)
|
|
||||||
def clickup_create_task(
|
|
||||||
name: str,
|
|
||||||
client: str,
|
|
||||||
work_category: str = "",
|
|
||||||
description: str = "",
|
|
||||||
status: str = "to do",
|
|
||||||
due_date: str = "",
|
|
||||||
tags: str = "",
|
|
||||||
custom_fields_json: str = "",
|
|
||||||
priority: int = 2,
|
|
||||||
assignee: int = 10765627,
|
|
||||||
time_estimate_ms: int = 0,
|
|
||||||
ctx: dict | None = None,
|
|
||||||
) -> str:
|
|
||||||
"""Create a new ClickUp task in the client's Overall list."""
|
|
||||||
import json as _json
|
|
||||||
|
|
||||||
client_obj = _get_clickup_client(ctx)
|
|
||||||
if not client_obj:
|
|
||||||
return "Error: ClickUp API token not configured."
|
|
||||||
|
|
||||||
cfg = ctx["config"].clickup
|
|
||||||
if not cfg.space_id:
|
|
||||||
return "Error: ClickUp space_id not configured."
|
|
||||||
|
|
||||||
try:
|
|
||||||
# Find the client's Overall list
|
|
||||||
list_id = client_obj.find_list_in_folder(cfg.space_id, client)
|
|
||||||
if not list_id:
|
|
||||||
return (
|
|
||||||
f"Error: Could not find folder '{client}' "
|
|
||||||
f"with an 'Overall' list in space."
|
|
||||||
)
|
|
||||||
|
|
||||||
# Build create kwargs
|
|
||||||
create_kwargs: dict = {
|
|
||||||
"list_id": list_id,
|
|
||||||
"name": name,
|
|
||||||
"description": description,
|
|
||||||
"status": status,
|
|
||||||
"priority": priority,
|
|
||||||
"assignees": [assignee],
|
|
||||||
}
|
|
||||||
if due_date:
|
|
||||||
create_kwargs["due_date"] = int(due_date)
|
|
||||||
if tags:
|
|
||||||
create_kwargs["tags"] = [t.strip() for t in tags.split(",")]
|
|
||||||
if time_estimate_ms:
|
|
||||||
create_kwargs["time_estimate"] = time_estimate_ms
|
|
||||||
|
|
||||||
# Create the task
|
|
||||||
result = client_obj.create_task(**create_kwargs)
|
|
||||||
task_id = result.get("id", "")
|
|
||||||
task_url = result.get("url", "")
|
|
||||||
|
|
||||||
# Set Client dropdown field
|
|
||||||
client_obj.set_custom_field_smart(task_id, list_id, "Client", client)
|
|
||||||
|
|
||||||
# Set Work Category if provided
|
|
||||||
if work_category:
|
|
||||||
client_obj.set_custom_field_smart(
|
|
||||||
task_id, list_id, "Work Category", work_category
|
|
||||||
)
|
|
||||||
|
|
||||||
# Set any additional custom fields
|
|
||||||
if custom_fields_json:
|
|
||||||
extra_fields = _json.loads(custom_fields_json)
|
|
||||||
for field_name, field_value in extra_fields.items():
|
|
||||||
client_obj.set_custom_field_smart(
|
|
||||||
task_id, list_id, field_name, str(field_value)
|
|
||||||
)
|
|
||||||
|
|
||||||
return (
|
|
||||||
f"Task created successfully!\n"
|
|
||||||
f" Name: {name}\n"
|
|
||||||
f" Client: {client}\n"
|
|
||||||
f" ID: {task_id}\n"
|
|
||||||
f" URL: {task_url}"
|
|
||||||
)
|
|
||||||
except Exception as e:
|
|
||||||
return f"Error creating task: {e}"
|
|
||||||
finally:
|
|
||||||
client_obj.close()
|
|
||||||
|
|
||||||
|
|
||||||
@tool(
|
@tool(
|
||||||
"clickup_reset_task",
|
"clickup_reset_task",
|
||||||
"Reset a ClickUp task to 'to do' status so it can be retried on the next poll. "
|
"Reset a ClickUp task's internal tracking state so it can be retried on the next poll. "
|
||||||
"Use this when a task is stuck in an error or automation state.",
|
"Use this when a task has failed or completed and you want to re-run it.",
|
||||||
category="clickup",
|
category="clickup",
|
||||||
)
|
)
|
||||||
def clickup_reset_task(task_id: str, ctx: dict | None = None) -> str:
|
def clickup_reset_task(task_id: str, ctx: dict | None = None) -> str:
|
||||||
"""Reset a ClickUp task status to 'to do' for retry."""
|
"""Delete the kv_store state for a single task so it can be retried."""
|
||||||
client = _get_clickup_client(ctx)
|
db = ctx["db"]
|
||||||
if not client:
|
key = f"clickup:task:{task_id}:state"
|
||||||
return "Error: ClickUp API token not configured."
|
raw = db.kv_get(key)
|
||||||
|
if not raw:
|
||||||
|
return f"No tracked state found for task ID '{task_id}'. Nothing to reset."
|
||||||
|
|
||||||
cfg = ctx["config"].clickup
|
db.kv_delete(key)
|
||||||
reset_status = cfg.poll_statuses[0] if cfg.poll_statuses else "to do"
|
return f"Task '{task_id}' state cleared. It will be picked up on the next scheduler poll."
|
||||||
|
|
||||||
try:
|
|
||||||
client.update_task_status(task_id, reset_status)
|
|
||||||
client.add_comment(
|
|
||||||
task_id,
|
|
||||||
f"Task reset to '{reset_status}' via chat command.",
|
|
||||||
)
|
|
||||||
except Exception as e:
|
|
||||||
return f"Error resetting task '{task_id}': {e}"
|
|
||||||
finally:
|
|
||||||
client.close()
|
|
||||||
|
|
||||||
return (
|
|
||||||
f"Task '{task_id}' reset to '{reset_status}'. "
|
|
||||||
f"It will be picked up on the next scheduler poll."
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def _format_duration(delta) -> str:
|
|
||||||
"""Format a timedelta as a human-readable duration string."""
|
|
||||||
total_seconds = int(delta.total_seconds())
|
|
||||||
hours, remainder = divmod(total_seconds, 3600)
|
|
||||||
minutes, seconds = divmod(remainder, 60)
|
|
||||||
if hours:
|
|
||||||
return f"{hours}h {minutes}m {seconds}s"
|
|
||||||
if minutes:
|
|
||||||
return f"{minutes}m {seconds}s"
|
|
||||||
return f"{seconds}s"
|
|
||||||
|
|
||||||
|
|
||||||
def _format_ago(iso_str: str | None) -> str:
|
|
||||||
"""Format an ISO timestamp as 'Xm ago' relative to now."""
|
|
||||||
if not iso_str:
|
|
||||||
return "never"
|
|
||||||
try:
|
|
||||||
ts = datetime.fromisoformat(iso_str)
|
|
||||||
delta = datetime.now(UTC) - ts
|
|
||||||
total_seconds = int(delta.total_seconds())
|
|
||||||
if total_seconds < 60:
|
|
||||||
return f"{total_seconds}s ago"
|
|
||||||
minutes = total_seconds // 60
|
|
||||||
if minutes < 60:
|
|
||||||
return f"{minutes}m ago"
|
|
||||||
hours = minutes // 60
|
|
||||||
return f"{hours}h {minutes % 60}m ago"
|
|
||||||
except (ValueError, TypeError):
|
|
||||||
return "unknown"
|
|
||||||
|
|
||||||
|
|
||||||
@tool(
|
@tool(
|
||||||
"get_active_tasks",
|
"clickup_reset_all",
|
||||||
"Show what CheddahBot is actively executing right now. "
|
"Clear ALL internal ClickUp task tracking state. Use this to wipe the slate clean "
|
||||||
"Reports running tasks, loop health, and whether it's safe to restart.",
|
"so all eligible tasks can be retried on the next poll cycle.",
|
||||||
category="clickup",
|
category="clickup",
|
||||||
)
|
)
|
||||||
def get_active_tasks(ctx: dict | None = None) -> str:
|
def clickup_reset_all(ctx: dict | None = None) -> str:
|
||||||
"""Show actively running scheduler tasks and loop health."""
|
"""Delete all clickup task states and legacy active_ids from kv_store."""
|
||||||
scheduler = ctx.get("scheduler") if ctx else None
|
db = ctx["db"]
|
||||||
if not scheduler:
|
states = _get_clickup_states(db)
|
||||||
return "Scheduler not available — cannot check active executions."
|
count = 0
|
||||||
|
for task_id in states:
|
||||||
|
db.kv_delete(f"clickup:task:{task_id}:state")
|
||||||
|
count += 1
|
||||||
|
|
||||||
now = datetime.now(UTC)
|
# Also clean up legacy active_ids key
|
||||||
lines = []
|
if db.kv_get("clickup:active_task_ids"):
|
||||||
|
db.kv_delete("clickup:active_task_ids")
|
||||||
|
|
||||||
# Active executions
|
return (
|
||||||
active = scheduler.get_active_executions()
|
f"Cleared {count} task state(s) from tracking. Next poll will re-discover eligible tasks."
|
||||||
if active:
|
)
|
||||||
lines.append(f"**Active Executions ({len(active)}):**")
|
|
||||||
for task_id, info in active.items():
|
|
||||||
duration = _format_duration(now - info["started_at"])
|
|
||||||
lines.append(
|
|
||||||
f"- **{info['name']}** — `{info['tool']}` — "
|
|
||||||
f"running {duration} ({info['thread']} thread)"
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
lines.append("**No tasks actively executing.**")
|
|
||||||
|
|
||||||
# Loop health
|
|
||||||
timestamps = scheduler.get_loop_timestamps()
|
|
||||||
lines.append("")
|
|
||||||
lines.append("**Loop Health:**")
|
|
||||||
for loop_name, ts in timestamps.items():
|
|
||||||
lines.append(f"- {loop_name}: last ran {_format_ago(ts)}")
|
|
||||||
|
|
||||||
# Safe to restart?
|
|
||||||
lines.append("")
|
|
||||||
if active:
|
|
||||||
lines.append(f"**Safe to restart: No** ({len(active)} task(s) actively running)")
|
|
||||||
else:
|
|
||||||
lines.append("**Safe to restart: Yes**")
|
|
||||||
|
|
||||||
return "\n".join(lines)
|
|
||||||
|
|
|
||||||
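Every tool above converges on one convention: each tracked task lives at `clickup:task:{task_id}:state` as a JSON blob. A self-contained round trip against that convention, with an in-memory stand-in for whatever actually backs `db.kv_*`:

import json


class DictKV:
    """In-memory stand-in for the kv_store interface used in this diff."""

    def __init__(self):
        self._data: dict[str, str] = {}

    def kv_get(self, key):
        return self._data.get(key)

    def kv_set(self, key, value):
        self._data[key] = value

    def kv_delete(self, key):
        self._data.pop(key, None)

    def kv_scan(self, prefix):
        return [(k, v) for k, v in self._data.items() if k.startswith(prefix)]


db = DictKV()
db.kv_set("clickup:task:abc123:state", json.dumps({"state": "executing"}))

# Prefix scan + key parsing, mirroring _get_clickup_states() above
for key, value in db.kv_scan("clickup:task:"):
    parts = key.split(":")
    if len(parts) == 4 and parts[3] == "state":
        print(parts[2], json.loads(value)["state"])  # -> abc123 executing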
File diff suppressed because it is too large
@@ -6,11 +6,12 @@ Primary workflow: ingest CORA .xlsx → generate content batch.

 from __future__ import annotations

+import json
 import logging
 import os
 import re
 import subprocess
-from collections.abc import Callable
+from datetime import UTC, datetime
 from pathlib import Path

 from . import tool
@@ -30,13 +31,6 @@ def _get_blm_dir(ctx: dict | None) -> str:
     return os.getenv("BLM_DIR", "E:/dev/Big-Link-Man")


-def _get_blm_timeout(ctx: dict | None) -> int:
-    """Get BLM subprocess timeout from config or default (1800s / 30 min)."""
-    if ctx and "config" in ctx:
-        return ctx["config"].timeouts.blm
-    return 1800
-
-
 def _run_blm_command(
     args: list[str], blm_dir: str, timeout: int = 1800
 ) -> subprocess.CompletedProcess:
@@ -169,9 +163,9 @@ def _parse_generate_output(stdout: str) -> dict:


 def _set_status(ctx: dict | None, message: str) -> None:
-    """Log pipeline progress. Previously wrote to KV; now just logs."""
-    if message:
-        log.info("[LB Pipeline] %s", message)
+    """Write pipeline progress to KV store for UI polling."""
+    if ctx and "db" in ctx:
+        ctx["db"].kv_set("linkbuilding:status", message)


 def _get_clickup_client(ctx: dict | None):
@@ -193,10 +187,25 @@ def _get_clickup_client(ctx: dict | None):


 def _sync_clickup(ctx: dict | None, task_id: str, step: str, message: str) -> None:
-    """Post a progress comment to ClickUp."""
+    """Post a comment to ClickUp and update KV state."""
     if not task_id or not ctx:
         return

+    # Update KV store
+    db = ctx.get("db")
+    if db:
+        kv_key = f"clickup:task:{task_id}:state"
+        raw = db.kv_get(kv_key)
+        if raw:
+            try:
+                state = json.loads(raw)
+                state["last_step"] = step
+                state["last_message"] = message
+                db.kv_set(kv_key, json.dumps(state))
+            except json.JSONDecodeError:
+                pass
+
+    # Post comment to ClickUp
     cu_client = _get_clickup_client(ctx)
     if cu_client:
         try:
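The read-JSON, mutate, write-JSON block above recurs nearly verbatim in `_complete_clickup_task` and `_fail_clickup_task` below. A hypothetical helper (present in neither branch) that would factor out the pattern:

import json


def _update_task_state(db, task_id: str, **changes) -> None:
    """Merge `changes` into the task's KV state; corrupt records are left alone."""
    kv_key = f"clickup:task:{task_id}:state"
    raw = db.kv_get(kv_key)
    if not raw:
        return
    try:
        state = json.loads(raw)
    except json.JSONDecodeError:
        return
    state.update(changes)
    db.kv_set(kv_key, json.dumps(state))


# Equivalent to the inline block above:
#   _update_task_state(db, task_id, last_step=step, last_message=message)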
@@ -245,8 +254,26 @@ def _find_clickup_task(ctx: dict, keyword: str) -> str:
             continue

         if _fuzzy_keyword_match(keyword_norm, _normalize_for_match(str(task_keyword))):
-            # Found a match — move to "automation underway"
+            # Found a match — create executing state
             task_id = task.id
+            now = datetime.now(UTC).isoformat()
+            state = {
+                "state": "executing",
+                "clickup_task_id": task_id,
+                "clickup_task_name": task.name,
+                "task_type": task.task_type,
+                "skill_name": "run_link_building",
+                "discovered_at": now,
+                "started_at": now,
+                "completed_at": None,
+                "error": None,
+                "deliverable_paths": [],
+                "custom_fields": task.custom_fields,
+            }
+
+            db = ctx.get("db")
+            if db:
+                db.kv_set(f"clickup:task:{task_id}:state", json.dumps(state))
+
             # Move to "automation underway"
             cu_client2 = _get_clickup_client(ctx)
@@ -272,24 +299,30 @@ def _normalize_for_match(text: str) -> str:
     return text


-def _fuzzy_keyword_match(a: str, b: str, llm_check: Callable[[str, str], bool] | None = None) -> bool:
-    """Check if two normalized strings match, allowing singular/plural differences.
+def _fuzzy_keyword_match(a: str, b: str) -> bool:
+    """Check if two normalized strings are a fuzzy match.

-    Fast path: exact match after normalization.
-    Slow path: ask an LLM if the two keywords are the same aside from plural form.
-    Falls back to False if no llm_check is provided and strings differ.
+    Matches if: exact, substring in either direction, or >80% word overlap.
     """
     if not a or not b:
         return False
     if a == b:
         return True
-    if llm_check is None:
+    if a in b or b in a:
+        return True
+
+    # Word overlap check
+    words_a = set(a.split())
+    words_b = set(b.split())
+    if not words_a or not words_b:
         return False
-    return llm_check(a, b)
+    overlap = len(words_a & words_b)
+    min_len = min(len(words_a), len(words_b))
+    return overlap / min_len >= 0.8 if min_len > 0 else False


 def _complete_clickup_task(ctx: dict | None, task_id: str, message: str, status: str = "") -> None:
-    """Mark a ClickUp task as completed."""
+    """Mark a ClickUp task as completed and update KV state."""
     if not task_id or not ctx:
         return
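Concretely, under the new rule "running shoes" vs "best running shoes" matches via the substring branch, while "blue widget pricing" vs "widget pricing guide" shares only 2 of a minimum 3 words (about 0.67) and is rejected. Note the code accepts exactly 80% even though the docstring says ">80%". A quick standalone check of the same logic:

def fuzzy(a: str, b: str) -> bool:
    if not a or not b:
        return False
    if a == b or a in b or b in a:
        return True
    wa, wb = set(a.split()), set(b.split())
    if not wa or not wb:
        return False
    return len(wa & wb) / min(len(wa), len(wb)) >= 0.8


assert fuzzy("running shoes", "best running shoes")              # substring match
assert not fuzzy("blue widget pricing", "widget pricing guide")  # 2/3 overlap, rejected
assert fuzzy("big widget pricing guide tips", "widget pricing guide tips faq")  # 4/5 overlap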
@@ -298,6 +331,19 @@ def _complete_clickup_task(ctx: dict | None, task_id: str, message: str, status: str = "") -> None:
     lb_map = skill_map.get("Link Building", {})
     complete_status = status or lb_map.get("complete_status", "complete")

+    db = ctx.get("db")
+    if db:
+        kv_key = f"clickup:task:{task_id}:state"
+        raw = db.kv_get(kv_key)
+        if raw:
+            try:
+                state = json.loads(raw)
+                state["state"] = "completed"
+                state["completed_at"] = datetime.now(UTC).isoformat()
+                db.kv_set(kv_key, json.dumps(state))
+            except json.JSONDecodeError:
+                pass
+
     cu_client = _get_clickup_client(ctx)
     if cu_client:
         try:
@@ -310,19 +356,33 @@ def _complete_clickup_task(ctx: dict | None, task_id: str, message: str, status: str = "") -> None:


 def _fail_clickup_task(ctx: dict | None, task_id: str, error_msg: str) -> None:
-    """Mark a ClickUp task as failed."""
+    """Mark a ClickUp task as failed and update KV state."""
     if not task_id or not ctx:
         return

     config = ctx.get("config")
     error_status = config.clickup.error_status if config else "error"

+    db = ctx.get("db")
+    if db:
+        kv_key = f"clickup:task:{task_id}:state"
+        raw = db.kv_get(kv_key)
+        if raw:
+            try:
+                state = json.loads(raw)
+                state["state"] = "failed"
+                state["error"] = error_msg
+                state["completed_at"] = datetime.now(UTC).isoformat()
+                db.kv_set(kv_key, json.dumps(state))
+            except json.JSONDecodeError:
+                pass
+
     cu_client = _get_clickup_client(ctx)
     if cu_client:
         try:
             cu_client.add_comment(
                 task_id,
-                f"[FAILED]Link building pipeline failed.\n\nError: {error_msg[:2000]}",
+                f"❌ Link building pipeline failed.\n\nError: {error_msg[:2000]}",
             )
             cu_client.update_task_status(task_id, error_status)
         except Exception as e:
@@ -436,7 +496,7 @@ def run_cora_backlinks(
     # ── Step 1: ingest-cora ──
     _set_status(ctx, f"Step 1/2: Ingesting CORA report for {project_name}...")
     if clickup_task_id:
-        _sync_clickup(ctx, clickup_task_id, "ingest", "[STARTED]Starting Cora Backlinks pipeline...")
+        _sync_clickup(ctx, clickup_task_id, "ingest", "🔄 Starting Cora Backlinks pipeline...")

     # Convert branded_plus_ratio from string if needed
     try:
@@ -453,11 +513,10 @@ def run_cora_backlinks(
         cli_flags=cli_flags,
     )

-    blm_timeout = _get_blm_timeout(ctx)
     try:
-        ingest_result = _run_blm_command(ingest_args, blm_dir, timeout=blm_timeout)
+        ingest_result = _run_blm_command(ingest_args, blm_dir)
     except subprocess.TimeoutExpired:
-        error = f"ingest-cora timed out after {blm_timeout // 60} minutes"
+        error = "ingest-cora timed out after 30 minutes"
         _set_status(ctx, "")
         if clickup_task_id:
             _fail_clickup_task(ctx, clickup_task_id, error)
@@ -490,7 +549,7 @@ def run_cora_backlinks(
             ctx,
             clickup_task_id,
             "ingest_done",
-            f"[DONE]CORA report ingested. Project ID: {project_id}. Job file: {job_file}",
+            f"✅ CORA report ingested. Project ID: {project_id}. Job file: {job_file}",
         )

     # ── Step 2: generate-batch ──
@@ -502,9 +561,9 @@ def run_cora_backlinks(
     gen_args = ["generate-batch", "-j", str(job_path), "--continue-on-error"]

     try:
-        gen_result = _run_blm_command(gen_args, blm_dir, timeout=blm_timeout)
+        gen_result = _run_blm_command(gen_args, blm_dir)
     except subprocess.TimeoutExpired:
-        error = f"generate-batch timed out after {blm_timeout // 60} minutes"
+        error = "generate-batch timed out after 30 minutes"
         _set_status(ctx, "")
         if clickup_task_id:
             _fail_clickup_task(ctx, clickup_task_id, error)
@@ -534,7 +593,7 @@ def run_cora_backlinks(

     if clickup_task_id:
         summary = (
-            f"[DONE]Cora Backlinks pipeline completed for {project_name}.\n\n"
+            f"✅ Cora Backlinks pipeline completed for {project_name}.\n\n"
             f"Project ID: {project_id}\n"
             f"Keyword: {ingest_parsed['main_keyword']}\n"
             f"Job file: {gen_parsed['job_moved_to'] or job_file}"
@@ -592,11 +651,10 @@ def blm_ingest_cora(
         cli_flags=cli_flags,
     )

-    blm_timeout = _get_blm_timeout(ctx)
     try:
-        result = _run_blm_command(ingest_args, blm_dir, timeout=blm_timeout)
+        result = _run_blm_command(ingest_args, blm_dir)
     except subprocess.TimeoutExpired:
-        return f"Error: ingest-cora timed out after {blm_timeout // 60} minutes."
+        return "Error: ingest-cora timed out after 30 minutes."

     parsed = _parse_ingest_output(result.stdout)
@@ -647,11 +705,10 @@ def blm_generate_batch(
     if debug:
         args.append("--debug")

-    blm_timeout = _get_blm_timeout(ctx)
     try:
-        result = _run_blm_command(args, blm_dir, timeout=blm_timeout)
+        result = _run_blm_command(args, blm_dir)
     except subprocess.TimeoutExpired:
-        return f"Error: generate-batch timed out after {blm_timeout // 60} minutes."
+        return "Error: generate-batch timed out after 30 minutes."

     parsed = _parse_generate_output(result.stdout)
@@ -692,6 +749,7 @@ def scan_cora_folder(ctx: dict | None = None) -> str:
     if not watch_path.exists():
         return f"Watch folder does not exist: {watch_folder}"

+    db = ctx.get("db")
     xlsx_files = sorted(watch_path.glob("*.xlsx"))

     if not xlsx_files:
@@ -699,16 +757,18 @@ def scan_cora_folder(ctx: dict | None = None) -> str:

     lines = [f"## Cora Inbox: {watch_folder}\n"]

-    processed_dir = watch_path / "processed"
-    processed_names = set()
-    if processed_dir.exists():
-        processed_names = {f.name for f in processed_dir.glob("*.xlsx")}
-
     for f in xlsx_files:
         filename = f.name
-        if filename.startswith("~$"):
-            continue
-        status = "processed" if filename in processed_names else "new"
+        status = "new"
+        if db:
+            kv_val = db.kv_get(f"linkbuilding:watched:{filename}")
+            if kv_val:
+                try:
+                    watched = json.loads(kv_val)
+                    status = watched.get("status", "unknown")
+                except json.JSONDecodeError:
+                    status = "tracked"

         lines.append(f"- **{filename}** — status: {status}")

     # Check processed subfolder
@@ -14,7 +14,7 @@ import json
 import logging
 import re
 import time
-from datetime import datetime
+from datetime import UTC, datetime
 from pathlib import Path

 from ..docx_export import text_to_docx
@ -38,9 +38,9 @@ SONNET_CLI_MODEL = "sonnet"
|
||||||
|
|
||||||
|
|
||||||
def _set_status(ctx: dict | None, message: str) -> None:
|
def _set_status(ctx: dict | None, message: str) -> None:
|
||||||
"""Log pipeline progress. Previously wrote to KV; now just logs."""
|
"""Write pipeline progress to the DB so the UI can poll it."""
|
||||||
if message:
|
if ctx and "db" in ctx:
|
||||||
log.info("[PR Pipeline] %s", message)
|
ctx["db"].kv_set("pipeline:status", message)
|
||||||
|
|
||||||
|
|
||||||
def _fuzzy_company_match(name: str, candidate: str) -> bool:
|
def _fuzzy_company_match(name: str, candidate: str) -> bool:
|
||||||
|
|
@@ -88,15 +88,33 @@ def _find_clickup_task(ctx: dict, company_name: str) -> str:
        if task.task_type != "Press Release":
            continue

-        client_field = task.custom_fields.get("Client", "")
+        client_field = task.custom_fields.get("Customer", "")
        if not (
            _fuzzy_company_match(company_name, task.name)
            or _fuzzy_company_match(company_name, client_field)
        ):
            continue

-        # Found a match — move to "automation underway" on ClickUp
+        # Found a match — create kv_store entry and move to "in progress"
        task_id = task.id
+        now = datetime.now(UTC).isoformat()
+        state = {
+            "state": "executing",
+            "clickup_task_id": task_id,
+            "clickup_task_name": task.name,
+            "task_type": task.task_type,
+            "skill_name": "write_press_releases",
+            "discovered_at": now,
+            "started_at": now,
+            "completed_at": None,
+            "error": None,
+            "deliverable_paths": [],
+            "custom_fields": task.custom_fields,
+        }
+
+        db = ctx.get("db")
+        if db:
+            db.kv_set(f"clickup:task:{task_id}:state", json.dumps(state))

        # Move to "automation underway" on ClickUp
        cu_client2 = _get_clickup_client(ctx)

@@ -236,28 +254,10 @@ def _clean_pr_output(raw: str, headline: str) -> str:
# ---------------------------------------------------------------------------


-def _is_actual_news(topic: str) -> bool:
-    """Detect whether the topic signals genuinely new news.
-
-    Returns True if the topic contains explicit markers like 'actual news',
-    'new product', 'launch', 'acquisition', 'partnership', 'certification',
-    or 'award'. The user is expected to signal this in the PR Topic field.
-    """
-    signals = [
-        "actual news", "new product", "launch", "launches",
-        "acquisition", "partnership", "certification", "award",
-        "unveil", "unveils", "introduce", "introduces",
-    ]
-    topic_lower = topic.lower()
-    return any(s in topic_lower for s in signals)
-
-
def _build_headline_prompt(
    topic: str, company_name: str, url: str, lsi_terms: str, headlines_ref: str
) -> str:
    """Build the prompt for Step 1: generate 7 headlines."""
-    is_news = _is_actual_news(topic)
-
    prompt = (
        f"Generate exactly 7 unique press release headline options for the following.\n\n"
        f"Topic: {topic}\n"

@@ -272,34 +272,14 @@ def _build_headline_prompt(
        "\nRules for EVERY headline:\n"
        "- Maximum 70 characters (including spaces)\n"
        "- Title case\n"
+        "- News-focused, not promotional\n"
        "- NO location/geographic keywords\n"
        "- NO superlatives (best, top, leading, #1)\n"
        "- NO questions\n"
        "- NO colons — colons are considered lower quality\n"
+        "- Must contain an actual news announcement\n"
    )

-    if is_news:
-        prompt += (
-            "\nThis topic is ACTUAL NEWS — a real new event, product, partnership, "
-            "or achievement. You may use announcement verbs like 'Announces', "
-            "'Launches', 'Introduces', 'Unveils'.\n"
-        )
-    else:
-        prompt += (
-            "\nIMPORTANT — AWARENESS FRAMING:\n"
-            "The company ALREADY offers this product/service/capability. Nothing is "
-            "new, nothing was just launched, expanded, or achieved. You are writing "
-            "an awareness piece about existing capabilities framed in news-wire style.\n\n"
-            "REQUIRED verbs — use these: 'Highlights', 'Reinforces', 'Delivers', "
-            "'Strengthens', 'Showcases', 'Details', 'Offers', 'Provides'\n\n"
-            "BANNED — do NOT use any of these:\n"
-            "- 'Announces', 'Launches', 'Introduces', 'Unveils', 'Expands', "
-            "'Reveals', 'Announces New'\n"
-            "- 'Significant expansion', 'major milestone', 'growing demand', "
-            "'new capabilities', 'celebrates X years'\n"
-            "- Any language that implies something CHANGED or is NEW when it is not\n"
-        )
-
    if headlines_ref:
        prompt += (
            "\nHere are examples of high-quality headlines to use as reference "

@@ -314,10 +294,8 @@ def _build_headline_prompt(
    return prompt


-def _build_judge_prompt(headlines: str, headlines_ref: str, topic: str = "") -> str:
+def _build_judge_prompt(headlines: str, headlines_ref: str) -> str:
    """Build the prompt for Step 2: pick the 2 best headlines."""
-    is_news = _is_actual_news(topic)
-
    prompt = (
        "You are judging press release headlines for Press Advantage distribution. "
        "Pick the 2 best headlines from the candidates below.\n\n"

@ -327,25 +305,12 @@ def _build_judge_prompt(headlines: str, headlines_ref: str, topic: str = "") ->
|
||||||
"- Contains superlatives (best, top, leading, #1)\n"
|
"- Contains superlatives (best, top, leading, #1)\n"
|
||||||
"- Is a question\n"
|
"- Is a question\n"
|
||||||
"- Exceeds 70 characters\n"
|
"- Exceeds 70 characters\n"
|
||||||
)
|
"- Implies a NEW product launch when none exists (avoid 'launches', "
|
||||||
|
"'introduces', 'unveils', 'announces new' unless the topic is genuinely new)\n\n"
|
||||||
if is_news:
|
|
||||||
prompt += (
|
|
||||||
"- (This topic IS actual news — announcement verbs are acceptable)\n\n"
|
|
||||||
)
|
|
||||||
else:
|
|
||||||
prompt += (
|
|
||||||
"- Uses 'Announces', 'Launches', 'Introduces', 'Unveils', 'Expands', "
|
|
||||||
"'Reveals', or 'Announces New' (this is NOT actual news)\n"
|
|
||||||
"- Implies something CHANGED, is NEW, or was just achieved when it was not "
|
|
||||||
"(e.g. 'significant expansion', 'major milestone', 'growing demand')\n\n"
|
|
||||||
)
|
|
||||||
|
|
||||||
prompt += (
|
|
||||||
"PREFER headlines that:\n"
|
"PREFER headlines that:\n"
|
||||||
"- Match the tone and structure of the reference examples below\n"
|
"- Match the tone and structure of the reference examples below\n"
|
||||||
"- Use awareness verbs like 'Highlights', 'Strengthens', "
|
"- Use action verbs like 'Highlights', 'Expands', 'Strengthens', "
|
||||||
"'Reinforces', 'Delivers', 'Showcases', 'Details'\n"
|
"'Reinforces', 'Delivers', 'Adds'\n"
|
||||||
"- Describe what the company DOES or OFFERS, not what it just invented\n"
|
"- Describe what the company DOES or OFFERS, not what it just invented\n"
|
||||||
"- Read like a real news wire headline, not a product announcement\n\n"
|
"- Read like a real news wire headline, not a product announcement\n\n"
|
||||||
f"Candidates:\n{headlines}\n\n"
|
f"Candidates:\n{headlines}\n\n"
|
||||||
|
|
@ -364,14 +329,16 @@ def _build_judge_prompt(headlines: str, headlines_ref: str, topic: str = "") ->
|
||||||
return prompt
|
return prompt
|
||||||
|
|
||||||
|
|
||||||
def _derive_anchor_phrase(company_name: str, keyword: str) -> str:
|
def _derive_anchor_phrase(company_name: str, topic: str) -> str:
|
||||||
"""Derive a 'brand + keyword' anchor phrase from company name and keyword.
|
"""Derive a 'brand + keyword' anchor phrase from company name and topic.
|
||||||
|
|
||||||
Examples:
|
Examples:
|
||||||
("Advanced Industrial", "PEEK machining") -> "Advanced Industrial PEEK machining"
|
("Advanced Industrial", "PEEK machining") -> "Advanced Industrial PEEK machining"
|
||||||
("Metal Craft", "custom metal fabrication") -> "Metal Craft custom metal fabrication"
|
("Metal Craft", "custom metal fabrication") -> "Metal Craft custom metal fabrication"
|
||||||
"""
|
"""
|
||||||
return f"{company_name} {keyword.strip()}"
|
# Clean up topic: strip leading articles, lowercase
|
||||||
|
keyword = topic.strip()
|
||||||
|
return f"{company_name} {keyword}"
|
||||||
|
|
||||||
|
|
||||||
def _find_anchor_in_text(text: str, anchor: str) -> bool:
|
def _find_anchor_in_text(text: str, anchor: str) -> bool:
|
||||||
|
|
@@ -439,8 +406,6 @@ def _build_pr_prompt(
    anchor_phrase: str = "",
) -> str:
    """Build the prompt for Step 3: write one full press release."""
-    is_news = _is_actual_news(topic)
-
    prompt = (
        f"{skill_text}\n\n"
        "---\n\n"

@@ -450,25 +415,6 @@ def _build_pr_prompt(
        f"Topic: {topic}\n"
        f"Company: {company_name}\n"
    )
-
-    if is_news:
-        prompt += (
-            "\nThis is ACTUAL NEWS — a real new event, product, or achievement. "
-            "You may use announcement language (announced, launched, introduced).\n"
-        )
-    else:
-        prompt += (
-            "\nAWARENESS FRAMING — CRITICAL:\n"
-            "The company ALREADY offers this product/service/capability. Nothing new "
-            "happened. Do NOT write that the company 'announced', 'expanded', 'launched', "
-            "'achieved a milestone', or 'saw growing demand'. These are LIES if nothing "
-            "actually changed.\n"
-            "Instead write about what the company DOES, what it OFFERS, what it PROVIDES. "
-            "Frame it as drawing attention to existing capabilities — highlighting, "
-            "reinforcing, detailing, showcasing.\n"
-            "The first paragraph should describe what the company offers, NOT announce "
-            "a fictional event.\n"
-        )
    if url:
        prompt += f"Reference URL (fetch for context): {url}\n"
    if lsi_terms:
@@ -544,7 +490,6 @@ def write_press_releases(
    topic: str,
    company_name: str,
    url: str = "",
-    keyword: str = "",
    lsi_terms: str = "",
    required_phrase: str = "",
    ctx: dict | None = None,
@@ -574,7 +519,7 @@ def write_press_releases(
        cu_client.update_task_status(clickup_task_id, config.clickup.automation_status)
        cu_client.add_comment(
            clickup_task_id,
-            f"[STARTED]CheddahBot starting press release creation.\n\n"
+            f"🔄 CheddahBot starting press release creation.\n\n"
            f"Topic: {topic}\nCompany: {company_name}",
        )
        log.info("ClickUp task %s set to automation-underway", clickup_task_id)

@@ -630,7 +575,7 @@ def write_press_releases(
    log.info("[PR Pipeline] Step 2/4: AI judge selecting best 2 headlines...")
    _set_status(ctx, "Step 2/4: AI judge selecting best 2 headlines...")
    step_start = time.time()
-    judge_prompt = _build_judge_prompt(headlines_raw, headlines_ref, topic)
+    judge_prompt = _build_judge_prompt(headlines_raw, headlines_ref)
    messages = [
        {"role": "system", "content": "You are a senior PR editor."},
        {"role": "user", "content": judge_prompt},

@@ -667,7 +612,7 @@ def write_press_releases(

    # ── Step 3: Write 2 press releases (execution brain x 2) ─────────────
    log.info("[PR Pipeline] Step 3/4: Writing 2 press releases...")
-    anchor_phrase = _derive_anchor_phrase(company_name, keyword) if keyword else ""
+    anchor_phrase = _derive_anchor_phrase(company_name, topic)
    pr_texts: list[str] = []
    pr_files: list[str] = []
    docx_files: list[str] = []

@@ -707,11 +652,11 @@ def write_press_releases(
        if wc < 575 or wc > 800:
            log.warning("PR %d word count %d outside 575-800 range", i + 1, wc)

-        # Validate anchor phrase (only when keyword provided)
-        if anchor_phrase and _find_anchor_in_text(clean_result, anchor_phrase):
+        # Validate anchor phrase
+        if _find_anchor_in_text(clean_result, anchor_phrase):
            log.info("PR %d contains anchor phrase '%s'", i + 1, anchor_phrase)
-        elif anchor_phrase:
-            fuzzy = _fuzzy_find_anchor(clean_result, company_name, keyword)
+        else:
+            fuzzy = _fuzzy_find_anchor(clean_result, company_name, topic)
            if fuzzy:
                log.info("PR %d: exact anchor not found, fuzzy match: '%s'", i + 1, fuzzy)
                anchor_warnings.append(

@@ -739,27 +684,18 @@ def write_press_releases(

    # ── ClickUp: upload docx attachments + comment ─────────────────────
    uploaded_count = 0
-    failed_uploads: list[str] = []
    if clickup_task_id and cu_client:
        try:
            for path in docx_files:
                if cu_client.upload_attachment(clickup_task_id, path):
                    uploaded_count += 1
                else:
-                    failed_uploads.append(path)
                    log.warning("ClickUp: failed to upload %s for task %s", path, clickup_task_id)
-            upload_warning = ""
-            if failed_uploads:
-                paths_list = "\n".join(f" - {p}" for p in failed_uploads)
-                upload_warning = (
-                    f"\n[WARNING]Warning: {len(failed_uploads)} attachment(s) failed to upload. "
-                    f"Files saved locally at:\n{paths_list}"
-                )
            cu_client.add_comment(
                clickup_task_id,
                f"📎 Saved {len(docx_files)} press release(s). "
                f"{uploaded_count} file(s) attached.\n"
-                f"Generating JSON-LD schemas next...{upload_warning}",
+                f"Generating JSON-LD schemas next...",
            )
            log.info(
                "ClickUp: uploaded %d attachments for task %s", uploaded_count, clickup_task_id

@@ -855,19 +791,31 @@ def write_press_releases(
            attach_note = f"\n📎 {uploaded_count} file(s) attached." if uploaded_count else ""
            result_text = "\n".join(output_parts)[:3000]
            comment = (
-                f"[DONE]CheddahBot completed this task.\n\n"
+                f"✅ CheddahBot completed this task.\n\n"
                f"Skill: write_press_releases\n"
                f"Result:\n{result_text}{attach_note}"
            )
            cu_client.add_comment(clickup_task_id, comment)

-            # Set status to pr needs review
-            cu_client.update_task_status(clickup_task_id, config.clickup.pr_review_status)
+            # Set status to internal review
+            cu_client.update_task_status(clickup_task_id, config.clickup.review_status)
+
+            # Update kv_store state if one exists
+            db = ctx.get("db")
+            if db:
+                kv_key = f"clickup:task:{clickup_task_id}:state"
+                existing = db.kv_get(kv_key)
+                if existing:
+                    state = json.loads(existing)
+                    state["state"] = "completed"
+                    state["completed_at"] = datetime.now(UTC).isoformat()
+                    state["deliverable_paths"] = docx_files
+                    db.kv_set(kv_key, json.dumps(state))

            output_parts.append("\n## ClickUp Sync\n")
            output_parts.append(f"- Task `{clickup_task_id}` updated")
            output_parts.append(f"- {uploaded_count} file(s) uploaded")
-            output_parts.append(f"- Status set to '{config.clickup.pr_review_status}'")
+            output_parts.append(f"- Status set to '{config.clickup.review_status}'")

            log.info("ClickUp sync complete for task %s", clickup_task_id)
        except Exception as e:

@@ -1077,7 +1025,7 @@ def _resolve_branded_url(branded_url: str, company_data: dict | None) -> str:
def _build_links(
    pr_text: str,
    company_name: str,
-    keyword: str,
+    topic: str,
    target_url: str,
    branded_url_resolved: str,
) -> tuple[list[dict], list[str]]:

@@ -1090,13 +1038,13 @@ def _build_links(
    warnings: list[str] = []

    # Link 1: brand+keyword → target_url
-    if target_url and keyword:
-        anchor_phrase = _derive_anchor_phrase(company_name, keyword)
+    if target_url:
+        anchor_phrase = _derive_anchor_phrase(company_name, topic)
        if _find_anchor_in_text(pr_text, anchor_phrase):
            links.append({"url": target_url, "anchor": anchor_phrase})
        else:
            # Try fuzzy match
-            fuzzy = _fuzzy_find_anchor(pr_text, company_name, keyword)
+            fuzzy = _fuzzy_find_anchor(pr_text, company_name, topic)
            if fuzzy:
                links.append({"url": target_url, "anchor": fuzzy})
                warnings.append(

@@ -1140,7 +1088,6 @@ def submit_press_release(
    company_name: str,
    target_url: str = "",
    branded_url: str = "",
-    keyword: str = "",
    topic: str = "",
    pr_text: str = "",
    file_path: str = "",
@@ -1178,6 +1125,13 @@ def submit_press_release(
            f"Press Advantage requires at least 550 words. Please expand the content."
        )

+    # --- Derive topic from headline if not provided ---
+    if not topic:
+        topic = headline
+        for part in [company_name, "Inc.", "LLC", "Corp.", "Ltd.", "Limited", "Inc"]:
+            topic = topic.replace(part, "").strip()
+        topic = re.sub(r"\s+", " ", topic).strip(" -\u2013\u2014,")
+
    # --- Load company data ---
    companies_text = _load_file_if_exists(_COMPANIES_FILE)
    company_all = _parse_company_data(companies_text)

@@ -1220,7 +1174,7 @@ def submit_press_release(
    link_list, link_warnings = _build_links(
        pr_text,
        company_name,
-        keyword,
+        topic,
        target_url,
        branded_url_resolved,
    )

@@ -1270,7 +1224,7 @@ def submit_press_release(
    if link_list:
        output_parts.append("\n**Links:**")
        for link in link_list:
-            output_parts.append(f'  - "{link["anchor"]}" -> {link["url"]}')
+            output_parts.append(f'  - "{link["anchor"]}" → {link["url"]}')

    if link_warnings:
        output_parts.append("\n**Link warnings:**")

@@ -454,7 +454,13 @@ def create_ui(
        return agent_name, agent_name, chatbot_msgs, convs, new_browser

    def poll_pipeline_status(agent_name):
-        """Pipeline status indicator (no longer used — kept for UI timer)."""
+        """Poll the DB for pipeline progress updates."""
+        agent = _get_agent(agent_name)
+        if not agent:
+            return gr.update(value="", visible=False)
+        status = agent.db.kv_get("pipeline:status")
+        if status:
+            return gr.update(value=f"⏳ {status}", visible=True)
        return gr.update(value="", visible=False)

    def poll_notifications():

@@ -1,57 +0,0 @@
"""HTMX + FastAPI web frontend for CheddahBot."""

from __future__ import annotations

import logging
from pathlib import Path
from typing import TYPE_CHECKING

from fastapi import FastAPI
from fastapi.templating import Jinja2Templates
from starlette.staticfiles import StaticFiles

if TYPE_CHECKING:
    from ..agent_registry import AgentRegistry
    from ..config import Config
    from ..db import Database
    from ..llm import LLMAdapter
    from ..notifications import NotificationBus
    from ..scheduler import Scheduler

log = logging.getLogger(__name__)

_TEMPLATE_DIR = Path(__file__).resolve().parent.parent / "templates"
_STATIC_DIR = Path(__file__).resolve().parent.parent / "static"

templates = Jinja2Templates(directory=str(_TEMPLATE_DIR))


def mount_web_app(
    app: FastAPI,
    registry: AgentRegistry,
    config: Config,
    llm: LLMAdapter,
    notification_bus: NotificationBus | None = None,
    scheduler: Scheduler | None = None,
    db: Database | None = None,
):
    """Mount all web routes and static files onto the FastAPI app."""
    # Wire dependencies into route modules
    from . import routes_chat, routes_pages, routes_sse
    from .routes_chat import router as chat_router
    from .routes_pages import router as pages_router
    from .routes_sse import router as sse_router

    routes_pages.setup(registry, config, llm, templates, db=db, scheduler=scheduler)
    routes_chat.setup(registry, config, llm, db, templates)
    routes_sse.setup(notification_bus, scheduler, db)

    app.include_router(chat_router)
    app.include_router(sse_router)
    # Pages router last (it has catch-all GET /)
    app.include_router(pages_router)

    # Static files
    app.mount("/static", StaticFiles(directory=str(_STATIC_DIR)), name="static")

    log.info("Web UI mounted (templates: %s, static: %s)", _TEMPLATE_DIR, _STATIC_DIR)

@@ -1,270 +0,0 @@
"""Chat routes: send messages, stream responses, manage conversations."""

from __future__ import annotations

import asyncio
import logging
import tempfile
import time
from pathlib import Path
from typing import TYPE_CHECKING

from fastapi import APIRouter, Form, Request, UploadFile
from fastapi.responses import HTMLResponse
from fastapi.templating import Jinja2Templates
from sse_starlette.sse import EventSourceResponse

if TYPE_CHECKING:
    from ..agent_registry import AgentRegistry
    from ..config import Config
    from ..db import Database
    from ..llm import LLMAdapter

log = logging.getLogger(__name__)

router = APIRouter(prefix="/chat")

_registry: AgentRegistry | None = None
_config: Config | None = None
_llm: LLMAdapter | None = None
_db: Database | None = None
_templates: Jinja2Templates | None = None

# Pending responses: conv_id -> {text, files, timestamp}
_pending: dict[str, dict] = {}


def setup(registry, config, llm, db, templates):
    global _registry, _config, _llm, _db, _templates
    _registry = registry
    _config = config
    _llm = llm
    _db = db
    _templates = templates


def _get_agent(name: str):
    if _registry:
        return _registry.get(name) or _registry.default
    return None


def _cleanup_pending():
    """Remove pending entries older than 60s."""
    now = time.time()
    expired = [k for k, v in _pending.items() if now - v["timestamp"] > 60]
    for k in expired:
        del _pending[k]


@router.post("/send")
async def send_message(
    request: Request,
    text: str = Form(""),
    agent_name: str = Form("default"),
    conv_id: str = Form(""),
    files: list[UploadFile] | None = None,
):
    """Accept user message, return user bubble HTML + trigger SSE stream."""
    _cleanup_pending()

    agent = _get_agent(agent_name)
    if not agent:
        return HTMLResponse("<div class='error'>Agent not found</div>", status_code=400)

    # Handle file uploads
    saved_files = []
    for f in (files or []):
        if f.filename and f.size and f.size > 0:
            tmp = Path(tempfile.mkdtemp()) / f.filename
            content = await f.read()
            tmp.write_bytes(content)
            saved_files.append(str(tmp))

    if not text.strip() and not saved_files:
        return HTMLResponse("")

    # Ensure conversation exists
    if not conv_id:
        agent.new_conversation()
        conv_id = agent.ensure_conversation()
    else:
        agent.conv_id = conv_id

    # Build display text
    display_text = text
    if saved_files:
        file_names = [Path(f).name for f in saved_files]
        display_text += f"\n[Attached: {', '.join(file_names)}]"

    # Stash for SSE stream
    _pending[conv_id] = {
        "text": text,
        "files": saved_files,
        "timestamp": time.time(),
        "agent_name": agent_name,
    }

    # Render user bubble + SSE trigger div
    user_html = _templates.get_template("partials/chat_message.html").render(
        role="user", content=display_text
    )
    # The SSE trigger div connects to the stream endpoint
    sse_div = (
        f'<div id="sse-trigger" '
        f'hx-ext="sse" '
        f'sse-connect="/chat/stream/{conv_id}" '
        f'sse-swap="chunk" '
        f'hx-target="#assistant-response" '
        f'hx-swap="beforeend">'
        f'</div>'
        f'<div id="assistant-bubble" class="message assistant">'
        f'<div class="message-avatar">CB</div>'
        f'<div class="message-body">'
        f'<div id="assistant-response" class="message-content"></div>'
        f'</div></div>'
    )

    headers = {
        "HX-Trigger-After-Swap": "scrollChat",
        "HX-Push-Url": f"/?conv={conv_id}",
    }

    return HTMLResponse(user_html + sse_div, headers=headers)


@router.get("/stream/{conv_id}")
async def stream_response(conv_id: str):
    """SSE endpoint: stream assistant response chunks."""
    pending = _pending.pop(conv_id, None)
    if not pending:
        async def empty():
            yield {"event": "done", "data": ""}
        return EventSourceResponse(empty())

    agent = _get_agent(pending["agent_name"])
    if not agent:
        async def error():
            yield {"event": "chunk", "data": "Agent not found"}
            yield {"event": "done", "data": ""}
        return EventSourceResponse(error())

    agent.conv_id = conv_id

    async def generate():
        loop = asyncio.get_event_loop()
        queue: asyncio.Queue = asyncio.Queue()

        def run_agent():
            try:
                for chunk in agent.respond(pending["text"], files=pending.get("files")):
                    loop.call_soon_threadsafe(queue.put_nowait, ("chunk", chunk))
            except Exception as e:
                log.error("Stream error: %s", e, exc_info=True)
                loop.call_soon_threadsafe(
                    queue.put_nowait, ("chunk", f"\n\nError: {e}")
                )
            finally:
                loop.call_soon_threadsafe(queue.put_nowait, ("done", ""))

        # Run agent.respond() in a thread
        import threading
        t = threading.Thread(target=run_agent, daemon=True)
        t.start()

        while True:
            event, data = await queue.get()
            if event == "done":
                yield {"event": "done", "data": conv_id}
                break
            yield {"event": "chunk", "data": data}

    return EventSourceResponse(generate())


@router.get("/conversations")
async def list_conversations(agent_name: str = "default"):
    """Return sidebar conversation list as HTML partial."""
    agent = _get_agent(agent_name)
    if not agent:
        return HTMLResponse("")

    convs = agent.db.list_conversations(limit=50, agent_name=agent_name)
    html = _templates.get_template("partials/chat_sidebar.html").render(
        conversations=convs
    )
    return HTMLResponse(html)


@router.post("/new")
async def new_conversation(agent_name: str = Form("default")):
    """Create a new conversation, return empty chat + updated sidebar."""
    agent = _get_agent(agent_name)
    if not agent:
        return HTMLResponse("")

    agent.new_conversation()
    conv_id = agent.ensure_conversation()

    convs = agent.db.list_conversations(limit=50, agent_name=agent_name)
    sidebar_html = _templates.get_template("partials/chat_sidebar.html").render(
        conversations=convs
    )

    # Return empty chat area + sidebar update via OOB swap
    html = (
        f'<div id="chat-messages"></div>'
        f'<div id="sidebar-conversations" hx-swap-oob="innerHTML">'
        f'{sidebar_html}</div>'
    )

    headers = {"HX-Push-Url": f"/?conv={conv_id}"}
    return HTMLResponse(html, headers=headers)


@router.get("/load/{conv_id}")
async def load_conversation(conv_id: str, agent_name: str = "default"):
    """Load conversation history as HTML."""
    agent = _get_agent(agent_name)
    if not agent:
        return HTMLResponse("")

    messages = agent.load_conversation(conv_id)
    parts = []
    for msg in messages:
        role = msg.get("role", "")
        content = msg.get("content", "")
        if role in ("user", "assistant") and content:
            parts.append(
                _templates.get_template("partials/chat_message.html").render(
                    role=role, content=content
                )
            )

    headers = {"HX-Push-Url": f"/?conv={conv_id}"}
    return HTMLResponse("\n".join(parts), headers=headers)


@router.post("/agent/{name}")
async def switch_agent(name: str):
    """Switch active agent. Returns updated sidebar via OOB."""
    agent = _get_agent(name)
    if not agent:
        return HTMLResponse("<div class='error'>Agent not found</div>", status_code=400)

    agent.new_conversation()
    conv_id = agent.ensure_conversation()

    convs = agent.db.list_conversations(limit=50, agent_name=name)
    sidebar_html = _templates.get_template("partials/chat_sidebar.html").render(
        conversations=convs
    )

    html = (
        f'<div id="chat-messages"></div>'
        f'<div id="sidebar-conversations" hx-swap-oob="innerHTML">'
        f'{sidebar_html}</div>'
    )

    headers = {"HX-Push-Url": f"/?conv={conv_id}"}
    return HTMLResponse(html, headers=headers)

@@ -1,172 +0,0 @@
"""Page routes: GET / (chat), GET /dashboard, dashboard partials."""

from __future__ import annotations

import logging
from datetime import UTC, datetime
from typing import TYPE_CHECKING

from fastapi import APIRouter, Request
from fastapi.responses import HTMLResponse
from fastapi.templating import Jinja2Templates

if TYPE_CHECKING:
    from ..agent_registry import AgentRegistry
    from ..config import Config
    from ..db import Database
    from ..llm import LLMAdapter
    from ..scheduler import Scheduler

log = logging.getLogger(__name__)

router = APIRouter()

_registry: AgentRegistry | None = None
_config: Config | None = None
_llm: LLMAdapter | None = None
_db: Database | None = None
_scheduler: Scheduler | None = None
_templates: Jinja2Templates | None = None


def setup(registry, config, llm, templates, db=None, scheduler=None):
    global _registry, _config, _llm, _templates, _db, _scheduler
    _registry = registry
    _config = config
    _llm = llm
    _templates = templates
    _db = db
    _scheduler = scheduler


@router.get("/")
async def chat_page(request: Request):
    agent_names = _registry.list_agents() if _registry else []
    agents = []
    for name in agent_names:
        agent = _registry.get(name)
        display = agent.agent_config.display_name if agent else name
        agents.append({"name": name, "display_name": display})

    default_agent = _registry.default_name if _registry else "default"
    chat_model = _config.chat_model if _config else "unknown"
    exec_available = _llm.is_execution_brain_available() if _llm else False
    clickup_enabled = _config.clickup.enabled if _config else False

    return _templates.TemplateResponse("chat.html", {
        "request": request,
        "agents": agents,
        "default_agent": default_agent,
        "chat_model": chat_model,
        "exec_available": exec_available,
        "clickup_enabled": clickup_enabled,
    })


@router.get("/dashboard")
async def dashboard_page(request: Request):
    return _templates.TemplateResponse("dashboard.html", {
        "request": request,
    })


@router.get("/dashboard/pipeline")
async def dashboard_pipeline():
    """Return pipeline panel HTML partial with task data."""
    if not _config or not _config.clickup.enabled:
        return HTMLResponse('<p class="text-muted">ClickUp not configured</p>')

    try:
        from ..api import get_tasks
        data = await get_tasks()
        all_tasks = data.get("tasks", [])
    except Exception as e:
        log.error("Pipeline data fetch failed: %s", e)
        return HTMLResponse(f'<p class="text-err">Error: {e}</p>')

    # Group by work category, then by status
    pipeline_statuses = [
        "to do", "automation underway", "outline review", "internal review", "error",
    ]
    categories = {}  # category -> {status -> [tasks]}
    for t in all_tasks:
        cat = t.get("task_type") or "Other"
        status = t.get("status", "unknown")

        # Only show tasks in pipeline-relevant statuses
        if status not in pipeline_statuses:
            continue

        if cat not in categories:
            categories[cat] = {}
        categories[cat].setdefault(status, []).append(t)

    # Build HTML
    html_parts = []

    # Status summary counts
    total_counts = {}
    for cat_data in categories.values():
        for status, tasks in cat_data.items():
            total_counts[status] = total_counts.get(status, 0) + len(tasks)

    if total_counts:
        html_parts.append('<div class="pipeline-stats">')
        for status in pipeline_statuses:
            count = total_counts.get(status, 0)
            html_parts.append(
                f'<div class="pipeline-stat">'
                f'<div class="stat-count">{count}</div>'
                f'<div class="stat-label">{status}</div>'
                f'</div>'
            )
        html_parts.append('</div>')

    # Per-category tables
    for cat_name in sorted(categories.keys()):
        cat_data = categories[cat_name]
        all_cat_tasks = []
        for status in pipeline_statuses:
            all_cat_tasks.extend(cat_data.get(status, []))

        if not all_cat_tasks:
            continue

        html_parts.append(f'<div class="pipeline-group"><h4>{cat_name} ({len(all_cat_tasks)})</h4>')
        html_parts.append('<table class="task-table"><thead><tr>'
                          '<th>Task</th><th>Customer</th><th>Status</th><th>Due</th>'
                          '</tr></thead><tbody>')

        for task in all_cat_tasks:
            name = task.get("name", "")
            url = task.get("url", "")
            customer = (task.get("custom_fields") or {}).get("Client", "N/A")
            status = task.get("status", "")
            status_class = "status-" + status.replace(" ", "-")

            # Format due date
            due_display = "-"
            due_raw = task.get("due_date")
            if due_raw:
                try:
                    due_dt = datetime.fromtimestamp(int(due_raw) / 1000, tz=UTC)
                    due_display = due_dt.strftime("%b %d")
                except (ValueError, TypeError, OSError):
                    pass

            name_cell = (
                f'<a href="{url}" target="_blank">{name}</a>' if url else name
            )

            html_parts.append(
                f'<tr><td>{name_cell}</td><td>{customer}</td>'
                f'<td><span class="status-badge {status_class}">{status}</span></td>'
                f'<td>{due_display}</td></tr>'
            )

        html_parts.append('</tbody></table></div>')

    if not html_parts:
        return HTMLResponse('<p class="text-muted">No active pipeline tasks</p>')

    return HTMLResponse('\n'.join(html_parts))

@@ -1,94 +0,0 @@
"""SSE routes for live dashboard updates."""

from __future__ import annotations

import asyncio
import json
import logging
from datetime import datetime
from typing import TYPE_CHECKING

from fastapi import APIRouter
from sse_starlette.sse import EventSourceResponse

if TYPE_CHECKING:
    from ..db import Database
    from ..notifications import NotificationBus
    from ..scheduler import Scheduler

log = logging.getLogger(__name__)

router = APIRouter(prefix="/sse")

_notification_bus: NotificationBus | None = None
_scheduler: Scheduler | None = None
_db: Database | None = None


def setup(notification_bus, scheduler, db):
    global _notification_bus, _scheduler, _db
    _notification_bus = notification_bus
    _scheduler = scheduler
    _db = db


@router.get("/notifications")
async def sse_notifications():
    """Stream new notifications as they arrive."""
    listener_id = f"sse-notif-{id(asyncio.current_task())}"

    # Subscribe to notification bus
    queue: asyncio.Queue = asyncio.Queue()
    loop = asyncio.get_event_loop()

    if _notification_bus:
        def on_notify(msg, cat):
            loop.call_soon_threadsafe(
                queue.put_nowait, {"message": msg, "category": cat}
            )
        _notification_bus.subscribe(listener_id, on_notify)

    async def generate():
        try:
            while True:
                try:
                    notif = await asyncio.wait_for(queue.get(), timeout=30)
                    yield {
                        "event": "notification",
                        "data": json.dumps(notif),
                    }
                except TimeoutError:
                    yield {"event": "heartbeat", "data": ""}
        finally:
            if _notification_bus:
                _notification_bus.unsubscribe(listener_id)

    return EventSourceResponse(generate())


@router.get("/loops")
async def sse_loops():
    """Push loop timestamps + active executions every 15s."""
    async def generate():
        while True:
            data = {"loops": {}, "executions": {}}
            if _scheduler:
                ts = _scheduler.get_loop_timestamps()
                data["loops"] = ts
                # Serialize active executions (datetime -> str)
                raw_exec = _scheduler.get_active_executions()
                execs = {}
                for tid, info in raw_exec.items():
                    execs[tid] = {
                        "name": info.get("name", ""),
                        "tool": info.get("tool", ""),
                        "started_at": info["started_at"].isoformat()
                        if isinstance(info.get("started_at"), datetime)
                        else str(info.get("started_at", "")),
                        "thread": info.get("thread", ""),
                    }
                data["executions"] = execs
            yield {"event": "loops", "data": json.dumps(data)}
            await asyncio.sleep(15)

    return EventSourceResponse(generate())

config.yaml (68 changed lines)

@@ -42,10 +42,8 @@ email:
# ClickUp integration
clickup:
  poll_interval_minutes: 20  # 3x per hour
-  poll_statuses: ["to do", "outline approved"]
-  poll_task_types: ["Press Release", "On Page Optimization", "Content Creation", "Link Building"]
+  poll_statuses: ["to do"]
  review_status: "internal review"
-  pr_review_status: "pr needs review"
  in_progress_status: "in progress"
  automation_status: "automation underway"
  error_status: "error"

@@ -55,35 +53,14 @@ clickup:
    "Press Release":
      tool: "write_press_releases"
      auto_execute: true
-      required_fields: [topic, company_name, target_url]
      field_mapping:
-        topic: "PR Topic"
-        keyword: "Keyword"
-        company_name: "Client"
+        topic: "task_name"
+        company_name: "Customer"
        target_url: "IMSURL"
        branded_url: "SocialURL"
-    "On Page Optimization":
-      tool: "create_content"
-      auto_execute: false
-      trigger_hint: "content-cora-inbox file watcher"
-      required_fields: [keyword, url]
-      field_mapping:
-        url: "IMSURL"
-        keyword: "Keyword"
-        cli_flags: "CLIFlags"
-    "Content Creation":
-      tool: "create_content"
-      auto_execute: false
-      auto_execute_on_status: ["outline approved"]
-      trigger_hint: "content-cora-inbox file watcher (Phase 1), outline approved status (Phase 2)"
-      field_mapping:
-        url: "IMSURL"
-        keyword: "Keyword"
-        cli_flags: "CLIFlags"
    "Link Building":
      tool: "run_link_building"
      auto_execute: false
-      trigger_hint: "cora-inbox file watcher"
      complete_status: "complete"
      error_status: "error"
      field_mapping:

@@ -98,46 +75,18 @@ clickup:
# Link Building settings
link_building:
  blm_dir: "E:/dev/Big-Link-Man"
-  watch_folder: "//PennQnap1/SHARE1/cora-inbox"
-  watch_interval_minutes: 10
+  watch_folder: "Z:/cora-inbox"
+  watch_interval_minutes: 60
  default_branded_plus_ratio: 0.7

# AutoCora job submission
autocora:
  jobs_dir: "//PennQnap1/SHARE1/AutoCora/jobs"
  results_dir: "//PennQnap1/SHARE1/AutoCora/results"
-  poll_interval_minutes: 20
+  poll_interval_minutes: 5
  success_status: "running cora"
  error_status: "error"
  enabled: true
-  cora_human_inbox: "//PennQnap1/SHARE1/Cora-For-Human"
-
-# Content creation settings
-content:
-  cora_inbox: "//PennQnap1/SHARE1/content-cora-inbox"
-  outline_dir: "//PennQnap1/SHARE1/content-outlines"
-
-# ntfy.sh push notifications
-ntfy:
-  enabled: true
-  channels:
-    - name: human_action
-      topic_env_var: NTFY_TOPIC_HUMAN_ACTION
-      categories: [clickup, autocora, linkbuilding, content]
-      include_patterns: ["completed", "SUCCESS", "copied to"]
-      priority: high
-      tags: white_check_mark
-    - name: errors
-      topic_env_var: NTFY_TOPIC_ERRORS
-      categories: [clickup, autocora, linkbuilding, content]
-      include_patterns: ["failed", "FAILURE", "skipped", "no ClickUp match", "copy failed", "IMSURL is empty"]
-      priority: urgent
-      tags: rotating_light
-    - name: daily_briefing
-      topic_env_var: NTFY_TOPIC_DAILY_BRIEFING
-      categories: [briefing]
-      priority: high
-      tags: clipboard

# Multi-agent configuration
# Each agent gets its own personality, tool whitelist, and memory scope.

@@ -173,11 +122,6 @@ agents:
    tools: [run_link_building, run_cora_backlinks, blm_ingest_cora, blm_generate_batch, scan_cora_folder, submit_autocora_jobs, poll_autocora_results, delegate_task, remember, search_memory]
    memory_scope: ""

-  - name: content_creator
-    display_name: Content Creator
-    tools: [create_content, continue_content, delegate_task, remember, search_memory, web_search, web_fetch]
-    memory_scope: ""
-
  - name: planner
    display_name: Planner
    model: "x-ai/grok-4.1-fast"

cora-link.md (287 changed lines)

@@ -1,287 +0,0 @@
# Link Building Agent Plan

## Context

CheddahBot needs a link building agent that orchestrates the external Big-Link-Man CLI tool (`E:/dev/Big-Link-Man/`). The current workflow is manual: run Cora on another machine → get .xlsx → manually run `main.py ingest-cora` → manually run `main.py generate-batch`. This agent automates steps 2 and 3, triggered by folder watching, ClickUp tasks, or chat commands. It must be expandable for future link building methods (MCP server path, ingest-simple, etc.).

## Decisions Made

- **Watch folder**: `Z:/cora-inbox` (network drive, Cora machine accessible)
- **File→task matching**: Fuzzy match .xlsx filename stem against ClickUp task's `Keyword` custom field
- **New ClickUp field "LB Method"**: Dropdown with initial option "Cora Backlinks" (more added later)
- **Dashboard**: API endpoint + NotificationBus events only (no frontend work — separate project)
- **Sidecar files**: Not needed — all metadata comes from the matching ClickUp task
- **Tool naming**: Orchestrator pattern — `run_link_building` is a thin dispatcher that reads `LB Method` and routes to the specific pipeline tool (e.g., `run_cora_backlinks`). Future link building methods get their own tools and slot into the orchestrator.

## Files to Create

### 1. `cheddahbot/tools/linkbuilding.py` — Main tool module

Four `@tool`-decorated functions + private helpers:

**`run_link_building(lb_method="", xlsx_path="", project_name="", money_site_url="", branded_plus_ratio=0.7, custom_anchors="", cli_flags="", ctx=None)`**
- **Orchestrator/dispatcher** — reads `lb_method` (from ClickUp "LB Method" field or chat) and routes to the correct pipeline tool
- If `lb_method` is "Cora Backlinks" or empty (default): calls `run_cora_backlinks()`
- Future: if `lb_method` is "MCP Link Building": calls `run_mcp_link_building()` (not yet implemented)
- Passes all other args through to the sub-tool
- This is what the ClickUp skill_map always routes to
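A minimal sketch of the dispatch shape (illustrative only; the real `@tool` decorator, full argument list, and error handling belong in `linkbuilding.py`):

```python
def run_link_building(lb_method: str = "", **kwargs) -> str:
    """Thin dispatcher: route on the ClickUp 'LB Method' dropdown value."""
    method = (lb_method or "Cora Backlinks").strip().lower()
    if method == "cora backlinks":
        return run_cora_backlinks(**kwargs)
    # Future methods ("MCP Link Building", "ingest-simple", ...) each get
    # their own branch here and their own pipeline tool.
    return f"Error: unknown LB Method '{lb_method}'. No pipeline registered."
```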

**`run_cora_backlinks(xlsx_path, project_name, money_site_url, branded_plus_ratio=0.7, custom_anchors="", cli_flags="", ctx=None)`**
- The actual Cora pipeline — runs ingest-cora → generate-batch
- Step 1: Build CLI args, call `_run_blm_command(["ingest-cora", ...])`, parse stdout for the job file path
- Step 2: Call `_run_blm_command(["generate-batch", "-j", job_file, "--continue-on-error"])`
- Updates KV store state and posts ClickUp comments at each step (following the press_release.py pattern)
- Returns `## ClickUp Sync` in its output to signal the scheduler that sync was handled internally
- Can also be called directly from chat for explicit Cora runs

**`blm_ingest_cora(xlsx_path, project_name, money_site_url, branded_plus_ratio=0.7, custom_anchors="", cli_flags="", ctx=None)`**
- Standalone ingest — runs ingest-cora only, returns the project ID and job file path
- For cases where the user wants to ingest but not generate yet

**`blm_generate_batch(job_file, continue_on_error=True, debug=False, ctx=None)`**
- Standalone generate — runs generate-batch only, on an existing job file
- For re-running generation or running a manually-created job

**Private helpers:**
- `_run_blm_command(args, timeout=1800)` — subprocess wrapper, runs `uv run python main.py <args>` from BLM_DIR, injects `-u`/`-p` from the `BLM_USERNAME`/`BLM_PASSWORD` env vars
- `_parse_ingest_output(stdout)` — regex-extract project_id + job_file path
- `_parse_generate_output(stdout)` — extract completion stats
- `_build_ingest_args(...)` — construct the CLI argument list from tool params
- `_set_status(ctx, message)` — write pipeline status to the KV store (for UI polling)
- `_sync_clickup(ctx, task_id, step, message)` — post comment + update state

**Critical: always pass the `-m` flag** to ingest-cora to prevent an interactive stdin prompt from blocking the subprocess.
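A rough sketch of the two core helpers, assuming Big-Link-Man is driven exactly as described above (`uv run python main.py <args>` from the BLM directory). The stdout labels that `_parse_ingest_output` matches are assumptions and would need to follow whatever ingest-cora actually prints:

```python
import os
import re
import subprocess


def _run_blm_command(args: list[str], blm_dir: str, timeout: int = 1800) -> subprocess.CompletedProcess:
    """Run `uv run python main.py <args>` inside the Big-Link-Man checkout."""
    cmd = ["uv", "run", "python", "main.py", *args]
    user = os.environ.get("BLM_USERNAME")
    password = os.environ.get("BLM_PASSWORD")
    if user and password:
        # Credentials come from env vars, never from config or chat.
        cmd += ["-u", user, "-p", password]
    return subprocess.run(cmd, cwd=blm_dir, capture_output=True, text=True, timeout=timeout)


def _parse_ingest_output(stdout: str) -> dict:
    # Hypothetical labels; align these patterns with ingest-cora's real output.
    project = re.search(r"Project ID:\s*(\S+)", stdout)
    job = re.search(r"Job file:\s*(\S+)", stdout)
    return {
        "project_id": project.group(1) if project else None,
        "job_file": job.group(1) if job else None,
    }
```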

### 2. `skills/linkbuilding.md` — Skill file

YAML frontmatter linking to the `[run_link_building, run_cora_backlinks, blm_ingest_cora, blm_generate_batch, scan_cora_folder]` tools and the `[link_builder, default]` agents. The markdown body describes when to use it, default flags, and workflow steps.

### 3. `tests/test_linkbuilding.py` — Test suite (~40 tests)

All tests mock `subprocess.run` — never call Big-Link-Man. Categories:
- Output parser unit tests (`_parse_ingest_output`, `_parse_generate_output`)
- CLI arg builder tests (all flag combinations, missing required params)
- Full pipeline integration (happy path, ingest failure, generate failure)
- ClickUp state machine (executing → completed, executing → failed)
- Folder watcher scan logic (new files, skip processed, missing ClickUp match)
|
|
||||||
|
|
||||||
## Files to Modify

### 4. `cheddahbot/config.py` — Add LinkBuildingConfig

```python
@dataclass
class LinkBuildingConfig:
    blm_dir: str = "E:/dev/Big-Link-Man"
    watch_folder: str = ""  # empty = disabled
    watch_interval_minutes: int = 60
    default_branded_plus_ratio: float = 0.7
```

Add a `link_building: LinkBuildingConfig` field to the `Config` dataclass. Add a YAML loading block in `load_config()` (same pattern as memory/scheduler/shell; sketched below). Add an env var override for `BLM_DIR`.
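
A sketch of that loading block, assuming `load_config()` holds the parsed config.yaml contents in a `raw` dict as the other sections do (the helper name is an assumption):

```python
import os

def load_link_building(raw: dict) -> LinkBuildingConfig:
    """Build LinkBuildingConfig from the parsed config.yaml dict, with env override."""
    lb = raw.get("link_building") or {}
    return LinkBuildingConfig(
        blm_dir=os.environ.get("BLM_DIR", lb.get("blm_dir", "E:/dev/Big-Link-Man")),
        watch_folder=lb.get("watch_folder", ""),
        watch_interval_minutes=int(lb.get("watch_interval_minutes", 60)),
        default_branded_plus_ratio=float(lb.get("default_branded_plus_ratio", 0.7)),
    )
```
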
### 5. `config.yaml` — Three additions

**New top-level section:**

```yaml
link_building:
  blm_dir: "E:/dev/Big-Link-Man"
  watch_folder: "Z:/cora-inbox"
  watch_interval_minutes: 60
  default_branded_plus_ratio: 0.7
```

**New skill_map entry under clickup:**

```yaml
"Link Building":
  tool: "run_link_building"
  auto_execute: false          # Cora Backlinks triggered by folder watcher, not scheduler
  complete_status: "complete"  # Override: use "complete" instead of "internal review"
  error_status: "internal review"  # On failure, move to internal review
  field_mapping:
    lb_method: "LB Method"
    project_name: "task_name"
    money_site_url: "IMSURL"
    custom_anchors: "CustomAnchors"
    branded_plus_ratio: "BrandedPlusRatio"
    cli_flags: "CLIFlags"
    xlsx_path: "CoraFile"
```

**New agent:**

```yaml
- name: link_builder
  display_name: Link Builder
  tools: [run_link_building, run_cora_backlinks, blm_ingest_cora, blm_generate_batch, scan_cora_folder, delegate_task, remember, search_memory]
  memory_scope: ""
```
### 6. `cheddahbot/scheduler.py` — Add folder watcher (4th daemon thread)

**New thread `_folder_watch_loop`** alongside the existing poll, heartbeat, and ClickUp threads (see the sketch after these lists):

- Starts if `config.link_building.watch_folder` is non-empty
- Runs every `watch_interval_minutes` (default 60)
- `_scan_watch_folder()` globs `*.xlsx` in the watch folder
- For each file, checks the KV store key `linkbuilding:watched:{filename}` — skips if already processed
- **Fuzzy-matches the filename stem against ClickUp tasks** with `LB Method = "Cora Backlinks"` and status "to do":
  - Queries ClickUp for Link Building tasks
  - Compares the normalized filename stem against each task's `Keyword` custom field
  - If a match is found: extracts money_site_url from the IMSURL field, cli_flags from the CLIFlags field, etc.
  - If no match: logs a warning, marks the file as "unmatched" in the KV store, and sends a notification asking the user to create/link a ClickUp task
- On match: executes the `run_link_building` tool with args from the ClickUp task fields
- On completion: moves the .xlsx to the `Z:/cora-inbox/processed/` subfolder, updates KV state
- On failure: updates KV state with the error, notifies via NotificationBus

**File handling after the pipeline:**

- On success: the .xlsx moves from `Z:/cora-inbox/` → `Z:/cora-inbox/processed/`
- On failure: the .xlsx stays in `Z:/cora-inbox/` (the KV store marks it as failed so the watcher doesn't retry automatically; the user can reset the KV entry to retry)
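
A minimal sketch of the watcher loop, assuming the helper names above; `_match_clickup_task` and `_run_linkbuilding_for` are hypothetical stand-ins for the fuzzy-match and dispatch steps, which depend on the ClickUp client:

```python
from pathlib import Path

def _folder_watch_loop(self):
    """Daemon loop: scan the watch folder for new Cora .xlsx files."""
    interval = self.config.link_building.watch_interval_minutes * 60
    while not self._stop_event.is_set():
        try:
            self._scan_watch_folder()
        except Exception as exc:
            self.db.kv_set("linkbuilding:watcher:last_error", str(exc))
        self._stop_event.wait(interval)

def _scan_watch_folder(self):
    folder = Path(self.config.link_building.watch_folder)
    for xlsx in sorted(folder.glob("*.xlsx")):
        if self.db.kv_get(f"linkbuilding:watched:{xlsx.name}"):
            continue  # already processed (or marked failed/unmatched)
        task = self._match_clickup_task(xlsx.stem)  # fuzzy match on Keyword, elided here
        if task is None:
            self.db.kv_set(f"linkbuilding:watched:{xlsx.name}", "unmatched")
            continue  # notification to the user goes here
        self._run_linkbuilding_for(task, xlsx)  # dispatch; move to processed/ on success
```
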
**Also adds a `scan_cora_folder` tool** (it can live in linkbuilding.py):

- Chat-invocable utility for the agent to check what's in the watch folder
- Returns a list of unprocessed .xlsx files with their ClickUp match status
- Internal agent tool, not a dashboard concern

### 7. `cheddahbot/clickup.py` — Add field creation method

Add a `create_custom_field(list_id, name, field_type, type_config=None)` method that calls `POST /list/{list_id}/field`. Used by the setup tool to auto-create custom fields across lists; a sketch follows.
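
A sketch, assuming the ClickUp client already holds an authenticated `requests` session and base URL (the endpoint path is the one named above; the `_session`/`base_url` attribute names are assumptions):

```python
def create_custom_field(self, list_id: str, name: str, field_type: str,
                        type_config: dict | None = None) -> dict:
    """Create a custom field on a ClickUp list (POST /list/{list_id}/field)."""
    payload = {"name": name, "type": field_type}
    if type_config:
        payload["type_config"] = type_config  # e.g. dropdown options for "LB Method"
    resp = self._session.post(f"{self.base_url}/list/{list_id}/field", json=payload)
    resp.raise_for_status()
    return resp.json()
```
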
### 8. `cheddahbot/__main__.py` — Add API endpoint

Add before the Gradio mount:

```python
@fastapi_app.get("/api/linkbuilding/status")
async def linkbuilding_status():
    """Return link building status for dashboard consumption."""
    # Returns:
    # {
    #   "pending_cora_runs": [
    #     {"keyword": "precision cnc machining", "url": "https://...", "client": "Chapter 2", "task_id": "abc123"},
    #     ...
    #   ],
    #   "in_progress": [...],  # Currently executing pipelines
    #   "completed": [...],    # Recently completed (last 7 days)
    #   "failed": [...]        # Failed tasks needing attention
    # }
```

The `pending_cora_runs` section is the key dashboard data: it queries ClickUp for "to do" tasks with Work Category="Link Building" and LB Method="Cora Backlinks", and returns each task's `Keyword` field and `IMSURL` (a copiable URL) so the user can see exactly which Cora reports need to be run.

Also push link building events to NotificationBus (category="linkbuilding") at each pipeline step, for future real-time dashboard support.

No other `__main__.py` changes are needed — agent wiring is automatic from config.yaml.
## ClickUp Custom Fields (Auto-Created)

New custom fields to be created programmatically:

| Field | Type | Purpose |
|-------|------|---------|
| `LB Method` | Dropdown | Link building subtype. Initial option: "Cora Backlinks" |
| `Keyword` | Short Text | Target keyword (used for file matching) |
| `CoraFile` | Short Text | Path to the .xlsx file (optional, set by the agent after a file match) |
| `CustomAnchors` | Short Text | Comma-separated anchor text overrides |
| `BrandedPlusRatio` | Short Text | Override for the `-bp` flag (e.g., "0.7") |
| `CLIFlags` | Short Text | Raw additional CLI flags (e.g., "-r 5 -t 0.3") |

Fields that already exist and will be reused: `Client`, `IMSURL`, `Work Category` (add a "Link Building" option).

### Auto-creation approach

- Use the `create_custom_field(list_id, name, type, type_config=None)` method added to `cheddahbot/clickup.py` in item 7
- Add a `setup_linkbuilding_fields` tool (category="linkbuilding") that:
  1. Gets all list IDs in the space
  2. For each list, checks whether the fields already exist (via `get_custom_fields`)
  3. Creates missing fields via the new API method
  4. For the `LB Method` dropdown, creates it with a `type_config` containing the "Cora Backlinks" option
  5. For `Work Category`, adds the "Link Building" option if missing
- This tool runs once during initial setup, or can be re-run if new lists are added
## Data Flow & Status Lifecycle

### Primary Trigger: Folder Watcher (Cora Backlinks)

The folder watcher is the main trigger for Cora Backlinks. The ClickUp scheduler does NOT auto-execute these — it can't, because the .xlsx doesn't exist until the user runs Cora.

```
1. ClickUp task created:
     Work Category="Link Building", LB Method="Cora Backlinks", status="to do"
     Fields filled: Client, IMSURL, Keyword, CLIFlags, BrandedPlusRatio, etc.
     → Appears on dashboard as "needs Cora run"

2. User runs Cora manually, drops .xlsx in Z:/cora-inbox

3. Folder watcher (_scan_watch_folder, runs every 60 min):
     → Finds precision-cnc-machining.xlsx
     → Fuzzy matches "precision cnc machining" against the Keyword field on ClickUp "to do" Link Building tasks
     → Match found → extracts metadata from the ClickUp task (IMSURL, CLIFlags, etc.)
     → Sets the CoraFile field on the ClickUp task to the file path
     → Moves the task to "in progress"
     → Posts comment: "Starting Cora Backlinks pipeline..."

4. Pipeline runs:
     → Step 1: ingest-cora → comment: "CORA report ingested. Job file: jobs/xxx.json"
     → Step 2: generate-batch → comment: "Content generation complete. X articles across Y tiers."

5. On success:
     → Move task to "complete"
     → Post summary comment with stats
     → Move .xlsx to Z:/cora-inbox/processed/

6. On failure:
     → Move task to "internal review"
     → Post error comment with details
     → .xlsx stays in Z:/cora-inbox (can retry)
```
### Secondary Trigger: Chat

```
User: "Run link building for Z:/cora-inbox/precision-cnc-machining.xlsx"
  → Chat brain calls run_cora_backlinks (or run_link_building with an explicit lb_method)
  → Tool auto-looks up the matching ClickUp task via the Keyword field (if one exists)
  → Same pipeline + ClickUp sync as above
  → If no ClickUp match: runs the pipeline without ClickUp tracking, returns results to chat only
```

### Future Trigger: ClickUp Scheduler (other LB Methods)

Future link building methods (MCP, etc.) that don't need a .xlsx CAN be auto-executed by the ClickUp scheduler. The `run_link_building` orchestrator checks `lb_method`:

- "Cora Backlinks" → requires xlsx_path, skips if empty (the folder watcher handles these)
- Future methods → can execute directly from ClickUp task data

### ClickUp Skill Map Note

The skill_map entry for "Link Building" exists primarily for **field mapping reference** (so the folder watcher and chat know which ClickUp fields map to which tool params). The ClickUp scheduler will discover these tasks, but `run_link_building` will skip Cora Backlinks tasks that have no xlsx_path — they're waiting for the folder watcher.
## Implementation Order

1. **Config** — Add `LinkBuildingConfig` to config.py, add the `link_building:` section to config.yaml, add the `link_builder` agent to config.yaml
2. **Core tools** — Create `cheddahbot/tools/linkbuilding.py` with `_run_blm_command`, the parsers, the `run_link_building` orchestrator, and the `run_cora_backlinks` pipeline
3. **Standalone tools** — Add `blm_ingest_cora` and `blm_generate_batch`
4. **Tests** — Create `tests/test_linkbuilding.py`, verify with `uv run pytest tests/test_linkbuilding.py -v`
5. **ClickUp field creation** — Add `create_custom_field` to clickup.py, add the `setup_linkbuilding_fields` tool
6. **ClickUp integration** — Add the skill_map entry, add ClickUp state tracking to the tools
7. **Folder watcher** — Add `_folder_watch_loop` to scheduler.py, add the `scan_cora_folder` tool
8. **API endpoint** — Add `/api/linkbuilding/status` to `__main__.py`
9. **Skill file** — Create `skills/linkbuilding.md`
10. **ClickUp setup** — Run `setup_linkbuilding_fields` to auto-create custom fields across all lists
11. **Full test run** — `uv run pytest -v --no-cov`

## Verification

1. **Unit tests**: `uv run pytest tests/test_linkbuilding.py -v` — all pass with mocked subprocess
2. **Full suite**: `uv run pytest -v --no-cov` — no regressions
3. **Lint**: `uv run ruff check .` + `uv run ruff format .`
4. **Manual e2e**: Drop a real .xlsx in Z:/cora-inbox, verify ingest-cora runs, the job JSON is created, and generate-batch runs
5. **ClickUp e2e**: Create a Link Building task in ClickUp with the proper fields, wait for a scheduler poll, verify execution
6. **Chat e2e**: Ask CheddahBot to "run link building for [keyword]" via the chat UI
7. **API check**: Hit `http://localhost:7860/api/linkbuilding/status` and verify the data returned

## Key Reference Files

- `cheddahbot/tools/press_release.py` — Reference pattern for a multi-step pipeline tool
- `cheddahbot/scheduler.py:55-76` — Where to add the 4th daemon thread
- `cheddahbot/config.py:108-200` — The load_config() pattern for new config sections
- `E:/dev/Big-Link-Man/docs/CLI_COMMAND_REFERENCE.md` — Full CLI reference
- `E:/dev/Big-Link-Man/src/cli/commands.py` — Exact output formats to parse

---

# CheddahBot Architecture

## System Overview

CheddahBot is a personal AI assistant built in Python. It exposes a Gradio-based web UI, routes user messages through an agent loop backed by a model-agnostic LLM adapter, persists conversations in SQLite, maintains a 4-layer memory system with optional semantic search, and provides an extensible tool registry that the LLM can invoke mid-conversation. A background scheduler handles cron-based tasks and periodic heartbeat checks.
### Data Flow Diagram

```
User (browser)
      |
      v
+-----------+      +------------+      +--------------+
| Gradio UI | ---> |   Agent    | ---> | LLM Adapter  |
| (ui.py)   |      | (agent.py) |      |  (llm.py)    |
+-----------+      +-----+------+      +------+-------+
                         |                    |
            +------------+-------+    +-------+--------+
            |            |       |    | Claude CLI     |
            v            v       v    | OpenRouter     |
       +---------+  +---------+ +---+ | Ollama         |
       | Router  |  |  Tools  | | DB| | LM Studio      |
       |(router) |  |(tools/) | |(db| +----------------+
       +----+----+  +----+----+ +---+
            |            |
       +----+-----+ +----+----+
       | Identity | | Memory  |
       | SOUL.md  | | System  |
       | USER.md  | |(memory) |
       +----------+ +---------+
```
1. The user submits text (or voice / files) through the Gradio interface.
2. `ui.py` hands the message to `Agent.respond()`.
3. The agent stores the user message in SQLite, builds a system prompt via `router.py` (loading identity files and memory context), and formats the conversation history.
4. The agent sends messages to `LLMAdapter.chat()`, which dispatches to the correct provider backend.
5. The LLM response streams back. If it contains tool-call requests, the agent executes them through `ToolRegistry.execute()`, appends the results, and loops back to step 4 (up to 10 iterations).
6. The final assistant response is stored in the database and streamed to the UI.
7. After responding, the agent checks whether the conversation has exceeded the flush threshold; if so, the memory system summarizes older messages into the daily log.

---
## Module-by-Module Breakdown

### `__main__.py` -- Entry Point

**File:** `cheddahbot/__main__.py`

Orchestrates startup in this order:

1. `load_config()` -- loads configuration from env vars / YAML / defaults.
2. `Database(config.db_path)` -- opens (or creates) the SQLite database.
3. `LLMAdapter(...)` -- initializes the model-agnostic LLM client.
4. `Agent(config, db, llm)` -- creates the core agent.
5. `MemorySystem(config, db)` -- initializes the memory system and injects it into the agent via `agent.set_memory()`.
6. `ToolRegistry(config, db, agent)` -- auto-discovers and loads all tool modules, then injects via `agent.set_tools()`.
7. `Scheduler(config, db, agent)` -- starts two daemon threads (task poller and heartbeat).
8. `create_ui(agent, config, llm)` -- builds the Gradio Blocks app and launches it on the configured host/port.

Each subsystem (memory, tools, scheduler) is wrapped in a try/except so the application degrades gracefully if optional dependencies are missing.

---
### `config.py` -- Configuration

**File:** `cheddahbot/config.py`

Defines four dataclasses:

| Dataclass | Key Fields |
|-------------------|-----------------------------------------------------------------|
| `Config` | `default_model`, `host`, `port`, `ollama_url`, `lmstudio_url`, `openrouter_api_key`, plus derived paths (`root_dir`, `data_dir`, `identity_dir`, `memory_dir`, `skills_dir`, `db_path`) |
| `MemoryConfig` | `max_context_messages` (50), `flush_threshold` (40), `embedding_model` ("all-MiniLM-L6-v2"), `search_top_k` (5) |
| `SchedulerConfig` | `heartbeat_interval_minutes` (30), `poll_interval_seconds` (60) |
| `ShellConfig` | `blocked_commands`, `require_approval` (False) |

`load_config()` applies three layers of configuration in priority order:

1. Dataclass defaults (lowest priority).
2. `config.yaml` at the project root (middle priority).
3. Environment variables with the `CHEDDAH_` prefix, plus `OPENROUTER_API_KEY` (highest priority).

The function also ensures required data directories exist on disk.

---
### `db.py` -- Database Layer

**File:** `cheddahbot/db.py`

A thin wrapper around SQLite using thread-local connections (one connection per thread), WAL journal mode, and foreign keys. A sketch of the connection handling appears below.
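
A minimal sketch of the thread-local connection pattern just described (attribute names are assumptions):

```python
import sqlite3
import threading

class Database:
    def __init__(self, db_path: str):
        self.db_path = db_path
        self._local = threading.local()  # one connection per thread

    @property
    def conn(self) -> sqlite3.Connection:
        if not hasattr(self._local, "conn"):
            conn = sqlite3.connect(self.db_path)
            conn.execute("PRAGMA journal_mode=WAL")  # readers don't block the writer
            conn.execute("PRAGMA foreign_keys=ON")
            conn.row_factory = sqlite3.Row
            self._local.conn = conn
        return self._local.conn
```
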
**Key methods:**

- `create_conversation(conv_id, title)` -- insert a new conversation row.
- `list_conversations(limit)` -- return recent conversations ordered by `updated_at`.
- `add_message(conv_id, role, content, ...)` -- insert a message and touch the conversation's `updated_at`.
- `get_messages(conv_id, limit)` -- return messages in chronological order.
- `count_messages(conv_id)` -- count messages for flush-threshold checks.
- `add_scheduled_task(name, prompt, schedule)` -- persist a scheduled task.
- `get_due_tasks()` -- return tasks whose `next_run` is in the past or NULL.
- `update_task_next_run(task_id, next_run)` -- update the next execution time.
- `log_task_run(task_id, result, error)` -- record the outcome of a task run.
- `kv_set(key, value)` / `kv_get(key)` -- generic key-value store.

---
### `agent.py` -- Core Agent Loop

**File:** `cheddahbot/agent.py`

Contains the `Agent` class, the central coordinator.

**Key members:**

- `conv_id` -- current conversation ID (a 12-character hex string).
- `_memory` -- optional `MemorySystem` reference.
- `_tools` -- optional `ToolRegistry` reference.

**Primary method: `respond(user_input, files)`**

This is a Python generator that yields text chunks for streaming. The detailed flow is described in the next section.

**Helper: `respond_to_prompt(prompt)`**

Non-streaming wrapper that collects all chunks and returns a single string. Used by the scheduler and heartbeat for internal prompts.

---
### `router.py` -- System Prompt Builder

**File:** `cheddahbot/router.py`

Two functions:

1. `build_system_prompt(identity_dir, memory_context, tools_description)` -- assembles the full system prompt by concatenating these sections, separated by horizontal rules:
   - Contents of `identity/SOUL.md`
   - Contents of `identity/USER.md`
   - Memory context string (from the memory system)
   - Tools description listing (from the tool registry)
   - A fixed "Instructions" section with core behavioral directives.

2. `format_messages_for_llm(system_prompt, history, max_messages)` -- converts raw database rows into the `[{role, content}]` format expected by the LLM. The system prompt becomes the first message. Tool results are converted to user messages prefixed with `[Tool Result]`. History is trimmed to the most recent `max_messages` entries (sketched below).
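
A sketch of that conversion, assuming rows shaped like the `messages` table described under Database Schema:

```python
def format_messages_for_llm(system_prompt: str, history: list[dict], max_messages: int) -> list[dict]:
    messages = [{"role": "system", "content": system_prompt}]
    for row in history[-max_messages:]:  # keep only the most recent entries
        if row["role"] == "tool":
            # Tool output re-enters the conversation as a prefixed user message
            messages.append({"role": "user", "content": f"[Tool Result] {row['content']}"})
        else:
            messages.append({"role": row["role"], "content": row["content"]})
    return messages
```
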

---

### `llm.py` -- LLM Adapter

**File:** `cheddahbot/llm.py`

Described in detail in a dedicated section below.

---
### `memory.py` -- Memory System

**File:** `cheddahbot/memory.py`

Described in detail in a dedicated section below.

---

### `media.py` -- Audio/Video Processing

**File:** `cheddahbot/media.py`

Three utility functions:

- `transcribe_audio(path)` -- Speech-to-text. Tries local Whisper first, then falls back to the OpenAI Whisper API.
- `text_to_speech(text, output_path, voice)` -- Text-to-speech via `edge-tts` (free, no API key). Defaults to the `en-US-AriaNeural` voice.
- `extract_video_frames(video_path, max_frames)` -- Extracts key frames from video using `ffprobe` (to get the duration) and `ffmpeg` (to extract JPEG frames).

---
### `scheduler.py` -- Scheduler and Heartbeat

**File:** `cheddahbot/scheduler.py`

Described in detail in a dedicated section below.

---
### `ui.py` -- Gradio Web Interface

**File:** `cheddahbot/ui.py`

Builds a Gradio Blocks application with:

- A model dropdown (populated from `llm.list_available_models()`) with a refresh button and a "New Chat" button.
- A `gr.Chatbot` widget for the conversation (500px height, copy buttons).
- A `gr.MultimodalTextbox` supporting text, file upload, and microphone input.
- A "Voice Chat" accordion for record-and-respond audio interaction.
- A "Conversation History" accordion showing past conversations from the database.
- A "Settings" accordion with guidance on editing identity and config files.

**Event wiring:**

- Model dropdown change calls `llm.switch_model()`.
- Refresh button re-discovers local models.
- Message submit calls `agent.respond()` in streaming mode, updating the chatbot widget with each chunk.
- Audio files attached to messages are transcribed via `media.transcribe_audio()` before being sent to the agent.
- Voice Chat records audio, transcribes it, gets a text response from the agent, converts it to speech via `media.text_to_speech()`, and plays it back.

---
### `tools/__init__.py` -- Tool Registry

**File:** `cheddahbot/tools/__init__.py`

Described in detail in a dedicated section below.

---

### `skills/__init__.py` -- Skill Registry

**File:** `cheddahbot/skills/__init__.py`

Defines a parallel registry for "skills" (multi-step operations). Key pieces:

- `SkillDef` -- dataclass holding `name`, `description`, `func`.
- `@skill(name, description)` -- decorator that registers a skill in the global `_SKILLS` dict.
- `load_skill(path)` -- dynamically loads a `.py` file as a module (triggering any `@skill` decorators inside it).
- `discover_skills(skills_dir)` -- loads all `.py` files from the skills directory.
- `list_skills()` / `run_skill(name, **kwargs)` -- query and execute skills.

---
### `providers/__init__.py` -- Provider Extensions

**File:** `cheddahbot/providers/__init__.py`

Reserved for future custom provider implementations. Currently empty.

---

## The Agent Loop in Detail

When `Agent.respond(user_input)` is called, the following sequence occurs:
```
1. ensure_conversation()
   |-- Creates a new conversation in the DB if one doesn't exist
   |
2. db.add_message(conv_id, "user", user_input)
   |-- Persists the user's message
   |
3. Build system prompt
   |-- memory.get_context(user_input)      --> memory context string
   |-- tools.get_tools_schema()            --> OpenAI-format JSON schemas
   |-- tools.get_tools_description()       --> human-readable tool list
   |-- router.build_system_prompt(identity_dir, memory_context, tools_description)
   |
4. Load conversation history from DB
   |-- db.get_messages(conv_id, limit=max_context_messages)
   |-- router.format_messages_for_llm(system_prompt, history, max_messages)
   |
5. AGENT LOOP (up to MAX_TOOL_ITERATIONS = 10):
   |
   |-- llm.chat(messages, tools=tools_schema, stream=True)
   |     |-- Yields {"type":"text","content":"..."} chunks --> streamed to user
   |     |-- Yields {"type":"tool_use","name":"...","input":{...}} chunks
   |
   |-- If no tool_calls: store assistant message, BREAK
   |
   |-- If tool_calls present:
   |     |-- Store assistant message with tool_calls metadata
   |     |-- For each tool call:
   |     |     |-- yield "Using tool: <name>" indicator
   |     |     |-- tools.execute(name, input) --> result string
   |     |     |-- yield tool result (truncated to 2000 chars)
   |     |     |-- db.add_message(conv_id, "tool", result)
   |     |     |-- Append result to messages as user message
   |     |-- Continue loop (LLM sees tool results and can respond or call more tools)
   |
6. After loop: check if memory flush is needed
   |-- If message count > flush_threshold:
   |     |-- memory.auto_flush(conv_id)
```

The loop allows the LLM to chain up to 10 consecutive tool calls before being cut off. Each tool result is injected back into the conversation as a user message so the LLM can reason about it in the next iteration.

---
## LLM Adapter Design

**File:** `cheddahbot/llm.py`

### Provider Routing

The `LLMAdapter` supports four provider paths. The active provider is determined by examining the current model ID:

| Model ID Pattern | Provider | Backend |
|-----------------------------|---------------|------------------------------------|
| `claude-*` | `claude` | Claude Code CLI (subprocess) |
| `local/ollama/<model>` | `ollama` | Ollama HTTP API (OpenAI-compat) |
| `local/lmstudio/<model>` | `lmstudio` | LM Studio HTTP API (OpenAI-compat) |
| Anything else | `openrouter` | OpenRouter API (OpenAI-compat) |
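
The table implies a routing rule like the following sketch (the `provider` property is named in Model Switching below; `current_model` is an assumption):

```python
class LLMAdapter:
    @property
    def provider(self) -> str:
        # Re-evaluated on every access, so switching models switches providers too.
        model = self.current_model
        if model.startswith("claude-"):
            return "claude"
        if model.startswith("local/ollama/"):
            return "ollama"
        if model.startswith("local/lmstudio/"):
            return "lmstudio"
        return "openrouter"
```
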
### The `chat()` Method

This is the single entry point. It accepts a list of messages, an optional tools schema, and a stream flag. It returns a generator yielding dictionaries:

- `{"type": "text", "content": "..."}` -- a text chunk to display.
- `{"type": "tool_use", "id": "...", "name": "...", "input": {...}}` -- a tool invocation request.

### Claude Code CLI Path (`_chat_claude_sdk`)

For Claude models, CheddahBot shells out to the `claude` CLI binary (the Claude Code SDK):

1. Separates the system prompt, conversation history, and the latest user message from the messages list.
2. Builds a full system prompt by appending the conversation history under a "Conversation So Far" heading.
3. Invokes `claude -p <prompt> --model <model> --output-format json --system-prompt <system>`.
4. The `CLAUDECODE` environment variable is stripped from the subprocess environment to avoid nested-session errors.
5. Parses the JSON output and yields the `result` field as a text chunk.
6. On Windows, `shell=True` is used for compatibility with npm-installed binaries.
### OpenAI-Compatible Path (`_chat_openai_sdk`)

For OpenRouter, Ollama, and LM Studio, the adapter uses the `openai` Python SDK:

1. `_resolve_endpoint(provider)` returns the base URL and API key:
   - OpenRouter: `https://openrouter.ai/api/v1` with the configured API key.
   - Ollama: `http://localhost:11434/v1` with the dummy key `"ollama"`.
   - LM Studio: `http://localhost:1234/v1` with the dummy key `"lm-studio"`.
2. `_resolve_model_id(provider)` strips the `local/ollama/` or `local/lmstudio/` prefix from the model ID.
3. Creates an `openai.OpenAI` client with the resolved base URL and API key.
4. In streaming mode: iterates over `client.chat.completions.create(stream=True)`, accumulates tool-call arguments across chunks (indexed by `tc.index`), yields text deltas immediately, and yields completed tool calls at the end of the stream (see the sketch after this list).
5. In non-streaming mode: makes a single call and yields text and tool calls from the response.
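
Step 4 is the subtle part: the streaming API delivers tool-call arguments as fragments. A sketch of the accumulation, assuming the chunk shapes the `openai` SDK emits:

```python
import json

def _stream_with_tools(stream):
    """Yield text deltas immediately; assemble tool calls across chunks."""
    pending = {}  # tc.index -> {"id", "name", "arguments"}
    for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        if delta.content:
            yield {"type": "text", "content": delta.content}
        for tc in delta.tool_calls or []:
            slot = pending.setdefault(tc.index, {"id": "", "name": "", "arguments": ""})
            if tc.id:
                slot["id"] = tc.id
            if tc.function.name:
                slot["name"] = tc.function.name
            slot["arguments"] += tc.function.arguments or ""
    # Completed tool calls only exist once the stream has ended.
    for slot in pending.values():
        yield {"type": "tool_use", "id": slot["id"], "name": slot["name"],
               "input": json.loads(slot["arguments"] or "{}")}
```
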
### Model Discovery

- `discover_local_models()` -- probes the Ollama tags endpoint and the LM Studio models endpoint (3-second timeout each) and returns `ModelInfo` objects.
- `list_available_models()` -- returns a combined list of hardcoded Claude models, hardcoded OpenRouter models (if an API key is configured), and dynamically discovered local models.

### Model Switching

`switch_model(model_id)` updates `current_model`. The `provider` property re-evaluates on every access, so switching models also implicitly switches providers.

---
## Memory System

**File:** `cheddahbot/memory.py`

### The 4 Layers

```
Layer 1: Identity   -- identity/SOUL.md, identity/USER.md
                       (loaded by router.py into the system prompt)

Layer 2: Long-term  -- memory/MEMORY.md
                       (persisted facts and instructions, appended over time)

Layer 3: Daily logs -- memory/YYYY-MM-DD.md
                       (timestamped entries per day, including auto-flush summaries)

Layer 4: Semantic   -- memory/embeddings.db
                       (SQLite with vector embeddings for similarity search)
```

### How Memory Context is Built

`MemorySystem.get_context(query)` is called once per agent turn. It assembles a string from:

1. **Long-term memory** -- the last 2000 characters of `MEMORY.md`.
2. **Today's log** -- the last 1500 characters of today's date file.
3. **Semantic search results** -- the top-k entries most similar to the user's query, formatted as a bulleted list.

This string is injected into the system prompt by `router.py` under the heading "Relevant Memory".
### Embedding and Search

- The embedding model is `all-MiniLM-L6-v2` from `sentence-transformers` (lazy-loaded, thread-safe via a lock).
- `_index_text(text, doc_id)` -- encodes the text into a vector and stores it in `memory/embeddings.db` (table: `embeddings` with columns `id TEXT`, `text TEXT`, `vector BLOB`).
- `search(query, top_k)` -- encodes the query, loads all vectors from the database, computes cosine similarity against each one, sorts by score, and returns the top-k results (sketched below).
- If `sentence-transformers` is not installed, `_fallback_search()` performs simple case-insensitive substring matching across all `.md` files in the memory directory.
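
A sketch of the brute-force search, assuming float32 vectors stored as raw bytes per the schema above:

```python
import sqlite3
import numpy as np
from sentence_transformers import SentenceTransformer

def search(db_path: str, model: SentenceTransformer, query: str, top_k: int = 5) -> list[tuple[str, float]]:
    q = model.encode(query)
    q = q / np.linalg.norm(q)  # normalize once; dot product then gives cosine
    results = []
    with sqlite3.connect(db_path) as conn:
        for text, blob in conn.execute("SELECT text, vector FROM embeddings"):
            v = np.frombuffer(blob, dtype=np.float32)
            score = float(np.dot(q, v) / np.linalg.norm(v))  # cosine similarity
            results.append((text, score))
    results.sort(key=lambda r: r[1], reverse=True)
    return results[:top_k]
```
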
### Writing to Memory

- `remember(text)` -- appends a timestamped entry to `memory/MEMORY.md` and indexes it for semantic search. Exposed to the LLM via the `remember_this` tool.
- `log_daily(text)` -- appends a timestamped entry to today's daily log file and indexes it. Exposed via the `log_note` tool.

### Auto-Flush

When `Agent.respond()` finishes, it checks `db.count_messages(conv_id)`. If the count exceeds `config.memory.flush_threshold` (default 40):

1. `auto_flush(conv_id)` loads up to 200 messages.
2. All but the last 10 are selected for summarization.
3. A summary string is built from the selected messages (truncated to 1000 chars).
4. The summary is appended to the daily log via `log_daily()`.

This prevents conversations from growing unbounded while preserving context in the daily log for future semantic search.

### Reindexing

`reindex_all()` clears all embeddings and re-indexes every line (longer than 10 characters) from every `.md` file in the memory directory. This can be called to rebuild the search index from scratch.

---
## Tool System

**File:** `cheddahbot/tools/__init__.py` (registry) and `cheddahbot/tools/*.py` (tool modules)

### The `@tool` Decorator

```python
from cheddahbot.tools import tool

@tool("my_tool_name", "Description of what this tool does", category="general")
def my_tool_name(param1: str, param2: int = 10) -> str:
    return f"Result: {param1}, {param2}"
```

The decorator:

1. Creates a `ToolDef` object containing the function, name, description, category, and an auto-extracted parameter schema.
2. Registers it in the global `_TOOLS` dictionary keyed by name.
3. Attaches the `ToolDef` as `func._tool_def` on the original function.
### Parameter Schema Generation

`_extract_params(func)` inspects the function signature using `inspect` (sketched below):

- Skips parameters named `self` or `ctx`.
- Maps type annotations to JSON Schema types: `str` -> `"string"`, `int` -> `"integer"`, `float` -> `"number"`, `bool` -> `"boolean"`, `list` -> `"array"`. Unannotated parameters default to `"string"`.
- Parameters without defaults are marked as required.
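
A minimal sketch of the extraction, assuming only the mapping rules above:

```python
import inspect

_TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean", list: "array"}

def _extract_params(func) -> dict:
    props, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        if name in ("self", "ctx"):
            continue  # ctx is injected by the registry, never by the LLM
        props[name] = {"type": _TYPE_MAP.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": props, "required": required}
```
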
### Schema Output

`ToolDef.to_openai_schema()` returns the tool definition in OpenAI function-calling format:

```json
{
  "type": "function",
  "function": {
    "name": "tool_name",
    "description": "...",
    "parameters": {
      "type": "object",
      "properties": { ... },
      "required": [ ... ]
    }
  }
}
```
### Auto-Discovery

When `ToolRegistry.__init__()` is called, `_discover_tools()` uses `pkgutil.iter_modules` to find every `.py` file in `cheddahbot/tools/` (skipping files starting with `_`). Each module is imported via `importlib.import_module`, which triggers the `@tool` decorators and populates the global registry.

### Tool Execution

`ToolRegistry.execute(name, args)`:

1. Looks up the `ToolDef` in the global `_TOOLS` dict.
2. Inspects the function signature for a `ctx` parameter. If present, injects a context dictionary containing `config`, `db`, `agent`, and `memory`.
3. Calls the function with the provided arguments.
4. Returns the result as a string (or `"Done."` if the function returns `None`).
5. Catches all exceptions and returns `"Tool error: ..."`.
### Meta-Tools

Two special tools enable runtime extensibility:

**`build_tool`** (in `cheddahbot/tools/build_tool.py`):

- Accepts `name`, `description`, and `code` (Python source using the `@tool` decorator).
- Writes a new `.py` file into `cheddahbot/tools/`.
- Hot-imports the module via `importlib.import_module`, which triggers the `@tool` decorator and registers the new tool immediately.
- If the import fails, the file is deleted.

**`build_skill`** (in `cheddahbot/tools/build_skill.py`):

- Accepts `name`, `description`, and `steps` (Python source using the `@skill` decorator).
- Writes a new `.py` file into the configured `skills/` directory.
- Calls `skills.load_skill()` to dynamically import it.

---
## Scheduler and Heartbeat Design

**File:** `cheddahbot/scheduler.py`

The `Scheduler` class starts two daemon threads at application boot.

### Task Poller Thread

- Runs in `_poll_loop()`, sleeping for `poll_interval_seconds` (default 60) between iterations.
- Each iteration calls `_run_due_tasks()`:
  1. Queries `db.get_due_tasks()` for tasks where `next_run` is NULL or in the past.
  2. For each due task, calls `agent.respond_to_prompt(task["prompt"])` to generate a response.
  3. Logs the result via `db.log_task_run()`.
  4. If the schedule is `"once:<datetime>"`, the task is disabled.
  5. Otherwise, the schedule is treated as a cron expression: `croniter` is used to calculate the next run time, which is saved via `db.update_task_next_run()` (see the sketch after this list).
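
A sketch of the next-run calculation, assuming the documented schedule formats (the helper name is an assumption):

```python
from datetime import datetime, timezone
from croniter import croniter

def _next_run(schedule: str, now: datetime | None = None) -> str | None:
    """Return the next run time as ISO 8601 UTC, or None for one-shot tasks."""
    now = now or datetime.now(timezone.utc)
    if schedule.startswith("once:"):
        return None  # one-shot: the caller disables the task instead
    return croniter(schedule, now).get_next(datetime).isoformat()
```
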
### Heartbeat Thread

- Runs in `_heartbeat_loop()`, sleeping for `heartbeat_interval_minutes` (default 30) between iterations.
- Waits 60 seconds before the first heartbeat to let the system initialize.
- Each iteration calls `_run_heartbeat()`:
  1. Reads `identity/HEARTBEAT.md`.
  2. Sends the checklist to the agent as a prompt: "HEARTBEAT CHECK. Review this checklist and take action if needed."
  3. If the response contains `"HEARTBEAT_OK"`, no action is logged.
  4. Otherwise, the response is logged to the daily log via `memory.log_daily()`.

### Thread Safety

Both threads are daemon threads (they die when the main process exits). The `_stop_event` threading event can be set to gracefully shut down both loops. The database layer uses thread-local connections, so concurrent access from the scheduler threads and the Gradio request threads is safe.

---
## Database Schema

The SQLite database (`data/cheddahbot.db`) contains five tables:

### `conversations`

| Column | Type | Notes |
|--------------|------|--------------------|
| `id` | TEXT | Primary key (hex) |
| `title` | TEXT | Display title |
| `created_at` | TEXT | ISO 8601 UTC |
| `updated_at` | TEXT | ISO 8601 UTC |

### `messages`

| Column | Type | Notes |
|---------------|---------|--------------------------------------------|
| `id` | INTEGER | Autoincrement primary key |
| `conv_id` | TEXT | Foreign key to `conversations.id` |
| `role` | TEXT | `"user"`, `"assistant"`, or `"tool"` |
| `content` | TEXT | Message body |
| `tool_calls` | TEXT | JSON array of `{name, input}` (nullable) |
| `tool_result` | TEXT | Name of the tool that produced this result (nullable) |
| `model` | TEXT | Model ID used for this response (nullable) |
| `created_at` | TEXT | ISO 8601 UTC |

Index: `idx_messages_conv` on `(conv_id, created_at)`.

### `scheduled_tasks`

| Column | Type | Notes |
|--------------|---------|---------------------------------------|
| `id` | INTEGER | Autoincrement primary key |
| `name` | TEXT | Human-readable task name |
| `prompt` | TEXT | The prompt to send to the agent |
| `schedule` | TEXT | Cron expression or `"once:<datetime>"`|
| `enabled` | INTEGER | 1 = active, 0 = disabled |
| `next_run` | TEXT | ISO 8601 UTC (nullable) |
| `created_at` | TEXT | ISO 8601 UTC |

### `task_run_logs`

| Column | Type | Notes |
|---------------|---------|------------------------------------|
| `id` | INTEGER | Autoincrement primary key |
| `task_id` | INTEGER | Foreign key to `scheduled_tasks.id`|
| `started_at` | TEXT | ISO 8601 UTC |
| `finished_at` | TEXT | ISO 8601 UTC (nullable) |
| `result` | TEXT | Agent response (nullable) |
| `error` | TEXT | Error message if failed (nullable) |

### `kv_store`

| Column | Type | Notes |
|---------|------|-----------------|
| `key` | TEXT | Primary key |
| `value` | TEXT | Arbitrary value |

### Embeddings Database

A separate SQLite file at `memory/embeddings.db` holds one table:

### `embeddings`

| Column | Type | Notes |
|----------|------|--------------------------------------|
| `id` | TEXT | Primary key (e.g. `"daily:2026-02-14:08:30"`) |
| `text` | TEXT | The original text that was embedded |
| `vector` | BLOB | Raw float32 bytes of the embedding vector |

---
## Identity Files

Three Markdown files in the `identity/` directory define the agent's personality, user context, and background behavior.

### `identity/SOUL.md`

Defines the agent's personality, communication style, boundaries, and quirks. This is loaded first into the system prompt, making it the most prominent identity influence on every response.

Contents are read by `router.build_system_prompt()` at the beginning of each agent turn.

### `identity/USER.md`

Contains a user profile template: name, technical level, primary language, current projects, and communication preferences. The user edits this file to customize how the agent addresses them and what context it assumes.

Loaded by `router.build_system_prompt()` immediately after SOUL.md.

### `identity/HEARTBEAT.md`

A checklist of items to review on each heartbeat cycle. The scheduler reads this file and sends it to the agent as a prompt every `heartbeat_interval_minutes` (default 30 minutes). The agent processes the checklist and either confirms "HEARTBEAT_OK" or takes action and logs it.

### Loading Order in the System Prompt

The system prompt assembled by `router.build_system_prompt()` concatenates these sections, separated by `\n\n---\n\n`:

1. SOUL.md contents
2. USER.md contents
3. Memory context (long-term + daily log + semantic search results)
4. Tools description (categorized list of available tools)
5. Core instructions (hardcoded behavioral directives)

---

# ClickUp Task Creation

## CLI Script

```bash
uv run python scripts/create_clickup_task.py --name "LINKS - keyword" --client "Client Name" \
  --category "Link Building" --due-date 2026-03-18 --tag mar26 --time-estimate 2h \
  --field "Keyword=keyword" --field "IMSURL=https://example.com" --field "LB Method=Cora Backlinks"
```

## Defaults

- Priority: High (2)
- Assignee: Bryan (10765627)
- Status: "to do"
- Due date format: YYYY-MM-DD
- Tag format: mmmYY (e.g. feb26, mar26)

## Custom Fields

Any field can be set via `--field "Name=Value"`. Dropdown values are auto-resolved by option name (case-insensitive); a sketch of that resolution follows.
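
A sketch of the dropdown resolution, assuming the field metadata shape ClickUp returns for dropdown fields (a `type_config.options` list with `id` and `name` entries):

```python
def resolve_dropdown_value(field: dict, value: str) -> str:
    """Map a human-readable option name to the ClickUp option id, case-insensitively."""
    for option in field.get("type_config", {}).get("options", []):
        if option["name"].lower() == value.lower():
            return option["id"]
    raise ValueError(f"No option named {value!r} on field {field['name']!r}")
```
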
## Task Types

### Link Building

- **Prefix**: `LINKS - {keyword}`
- **Work Category**: "Link Building"
- **Required fields**: Keyword, IMSURL
- **LB Method**: default "Cora Backlinks"
- **CLIFlags**: only add `--tier1-count N` when a count is specified
- **BrandedPlusRatio**: default 0.7
- **CustomAnchors**: only if given a list of custom anchors
- **Time estimate**: 2.5h

### On Page Optimization

- **Prefix**: `OPT - {keyword}`
- **Work Category**: "On Page Optimization"
- **Required fields**: Keyword, IMSURL
- **Time estimate**: 3h

### Content Creation

- **Prefix**: `CREATE - {keyword}`
- **Work Category**: "Content Creation"
- **Required fields**: Keyword
- **Time estimate**: 4h

### Press Release

- **Prefix**: `PR - {keyword}`
- **Required fields**: Keyword, IMSURL
- **Work Category**: "Press Release"
- **PR Topic**: if not provided, ask whether there is a topic; it can be blank if the user responds with "none"
- **Time estimate**: 1.5h

## Chat Tool

The `clickup_create_task` tool provides the same capabilities via the CheddahBot UI. Arbitrary custom fields are passed as JSON via `custom_fields_json`.

## Client Folder Lookup

Tasks are created in the "Overall" list inside the client's folder. The folder name is matched case-insensitively.

---

# ntfy.sh Push Notifications Setup

CheddahBot sends push notifications to your phone and desktop via [ntfy.sh](https://ntfy.sh) when tasks complete, reports are ready, or errors occur.

## 1. Install the ntfy App

- **Android:** [Play Store](https://play.google.com/store/apps/details?id=io.heckel.ntfy)
- **iOS:** [App Store](https://apps.apple.com/us/app/ntfy/id1625396347)
- **Desktop:** Open [ntfy.sh](https://ntfy.sh) in your browser and enable browser notifications when prompted

## 2. Pick Topic Names

Topics are like channels. Anyone who knows the topic name can subscribe, so use random strings:

```
cheddahbot-a8f3k9x2m7
cheddahbot-errors-p4w2j6n8
```

Generate your own — any random string works. No account or registration needed.
## 3. Subscribe to Your Topics

**Phone app:**

1. Open the ntfy app
2. Tap the + button
3. Enter your topic name (e.g. `cheddahbot-a8f3k9x2m7`)
4. Server: `https://ntfy.sh` (default)
5. Repeat for your errors topic

**Browser:**

1. Go to [ntfy.sh](https://ntfy.sh)
2. Click "Subscribe to topic"
3. Enter the same topic names
4. Allow browser notifications when prompted

## 4. Add Topics to .env

Add these lines to your `.env` file in the CheddahBot root:

```
NTFY_TOPIC_HUMAN_ACTION=cheddahbot-a8f3k9x2m7
NTFY_TOPIC_ERRORS=cheddahbot-errors-p4w2j6n8
```

Replace with your actual topic names.

## 5. Restart CheddahBot

Kill the running instance and restart:

```bash
uv run python -m cheddahbot
```

You should see in the startup logs:

```
ntfy notifier initialized with 2 channel(s): human_action, errors
ntfy notifier subscribed to notification bus
```
## What Gets Notified

### human_action channel (high priority)

Notifications where you need to do something:

- Cora report finished and ready
- Press release completed
- Content outline ready for review
- Content optimization completed
- Link building pipeline finished
- Cora report distributed to inbox

### errors channel (urgent priority)

Notifications when something went wrong:

- ClickUp task failed or was skipped
- AutoCora job failed
- Link building pipeline error
- Content pipeline error
- Missing ClickUp field matches
- File copy failures
## Configuration

Channel routing is configured in `config.yaml` under the `ntfy:` section. Each channel has the following keys; a sketch of how they reach the ntfy API follows the list:

- `topic_env_var` — which env var holds the topic name
- `categories` — notification categories to listen to (`clickup`, `autocora`, `linkbuilding`, `content`)
- `include_patterns` — regex patterns the message must match (at least one)
- `exclude_patterns` — regex patterns that reject the message (takes priority over include)
- `priority` — ntfy priority level: `min`, `low`, `default`, `high`, `urgent`
- `tags` — emoji shortcodes shown on the notification (e.g. `white_check_mark`, `rotating_light`)
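
Publishing to ntfy is a single HTTP request. A minimal sketch using the public ntfy.sh HTTP API, assuming a channel dict with the keys listed above (the function name is an assumption):

```python
import os
import requests

def publish(channel: dict, message: str, title: str = "CheddahBot") -> None:
    """Send one notification through ntfy.sh using a channel's settings."""
    topic = os.environ[channel["topic_env_var"]]
    requests.post(
        f"https://ntfy.sh/{topic}",
        data=message.encode("utf-8"),
        headers={
            "Title": title,
            "Priority": channel.get("priority", "default"),
            "Tags": ",".join(channel.get("tags", [])),
        },
        timeout=10,
    )
```
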
### Adding a New Channel

1. Add a new entry under `ntfy.channels` in `config.yaml`
2. Add the topic env var to `.env`
3. Subscribe to the topic in your ntfy app
4. Restart CheddahBot

### Privacy

The public ntfy.sh server has no authentication by default. Your topic name is the only security — use a long random string to make it unguessable. Alternatively:

- Create a free ntfy.sh account and set read/write ACLs on your topics
- Self-host ntfy (a single binary) and set `server: http://localhost:8080` in config.yaml

### Disabling

Set `enabled: false` in the `ntfy:` section of `config.yaml`, or remove the env vars from `.env`.

---

# Scheduler Refactor Notes

## Issue: AutoCora Single-Day Window (found 2026-02-27)

**Symptom:** Task `86b8grf16` ("LINKS - anti vibration rubber mounts", due Feb 18) has been sitting in "to do" forever with no Cora report generated.

**Root cause:** `_find_qualifying_tasks()` in `tools/autocora.py` filters tasks to **exactly one calendar day** (the `target_date`, which defaults to today). The scheduler calls this daily with `today`:

```python
today = datetime.now(UTC).strftime("%Y-%m-%d")
result = submit_autocora_jobs(target_date=today, ctx=ctx)
```

If CheddahBot isn't running on the task's due date (or the DB is empty/wiped), the task is **permanently orphaned** — no catch-up, no retry, no visibility.

**Affected task types:** All three `cora_categories` — Link Building, On Page Optimization, Content Creation.

**What needs to change:** Auto-submit should also pick up overdue tasks (due date in the past, still "to do", no existing AutoCora job in the KV store). A sketch of the widened filter follows.
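
A sketch of the widened filter, assuming task dicts carrying a `due_date` string; `autocora:job:{task_id}` is a hypothetical KV key name standing in for however the real tool records submissions:

```python
def _find_qualifying_tasks(tasks: list[dict], target_date: str, db) -> list[dict]:
    """Qualify tasks due on target_date OR overdue, skipping already-submitted jobs."""
    qualifying = []
    for task in tasks:
        if task["status"] != "to do":
            continue
        if db.kv_get(f"autocora:job:{task['id']}"):
            continue  # an AutoCora job was already submitted for this task
        due = task.get("due_date")  # "YYYY-MM-DD"; lexicographic compare works
        if due and due <= target_date:  # today OR any past day -> catch-up
            qualifying.append(task)
    return qualifying
```
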

---

## Empty Database State (found 2026-02-27)

`cheddahbot.db` has zero rows in all tables (kv_store, notifications, scheduled_tasks, etc.) — either a fresh DB or a wiped one. This means:

- No task state tracking is happening
- No AutoCora job submissions are recorded
- The folder watcher has no history
- All loops show no `last_run` timestamps

---

## Context: Claude Scheduled Tasks

Claude released scheduled tasks (2026-02-26). Need to evaluate whether parts of CheddahBot's scheduler (heartbeat, poll loop, ClickUp polling, folder watchers, AutoCora) could be replaced or augmented by Claude's native scheduling.

---

## Additional Issues to Investigate

- [ ] `auto_execute: false` on Link Building — is this intentional given the folder-watcher pipeline?
- [ ] Folder watcher at `Z:/cora-inbox` — does this path stay accessible?
- [ ] No dashboard/UI surfacing of "tasks waiting for action" — stuck tasks are invisible
- [ ] AutoCora loop waits 30s before its first poll, then runs every 5 min — but auto-submit only checks today's tasks each cycle (redundant repeated calls)

---

# CheddahBot Task Pipeline Flows — Complete Reference

## ClickUp Statuses Used

These are the ClickUp task statuses that CheddahBot reads and writes:

| Status | Set By | Meaning |
|--------|--------|---------|
| `to do` | Human (or default) | Task is waiting to be picked up |
| `automation underway` | CheddahBot | Bot is actively working on this task |
| `running cora` | CheddahBot (AutoCora) | Cora report is being generated by an external worker |
| `outline review` | CheddahBot (Content) | Phase 1 outline is ready for human review |
| `outline approved` | Human | Human reviewed the outline, ready for Phase 2 |
| `pr needs review` | CheddahBot (Press Release) | Press release pipeline finished, PRs ready for human review |
| `internal review` | CheddahBot (Content/OPT) | Content/OPT pipeline finished, deliverables ready for human review |
| `complete` | CheddahBot (Link Building) | Pipeline fully done |
| `error` | CheddahBot | Something failed, needs attention |
| `in progress` | (configured but not used in automation) | — |

**What CheddahBot polls for:** `["to do", "outline approved"]` (config.yaml line 45)

---
## ClickUp Custom Fields Used
|
|
||||||
|
|
||||||
| Field Name | Type | Used By | What It Holds |
|
|
||||||
|------------|------|---------|---------------|
|
|
||||||
| `Work Category` | Dropdown | All pipelines | Determines which pipeline runs: "Press Release", "Link Building", "On Page Optimization", "Content Creation" |
|
|
||||||
| `PR Topic` | Text | Press Release | Press release topic/keyword (e.g. "Peek Plastic") — required |
|
|
||||||
| `Customer` | Text | Press Release | Client/company name — required |
|
|
||||||
| `Keyword` | Text | Link Building, Content, OPT | Target SEO keyword |
|
|
||||||
| `IMSURL` | Text | All pipelines | Target page URL (money site) — required for Press Release |
|
|
||||||
| `SocialURL` | Text | Press Release | Branded/social URL for the PR |
|
|
||||||
| `LB Method` | Dropdown | Link Building | "Cora Backlinks" or other methods |
|
|
||||||
| `CustomAnchors` | Text | Link Building | Custom anchor text overrides |
|
|
||||||
| `BrandedPlusRatio` | Number | Link Building | Ratio for branded anchors (default 0.7) |
|
|
||||||
| `CLIFlags` | Text | Link Building, Content, OPT | Extra flags passed to tools (e.g., "service") |
|
|
||||||
| `CoraFile` | Text | Link Building | Path to Cora xlsx file |
|
|
||||||
|
|
||||||
**Tags:** Tasks are tagged with month in `mmmyy` format (e.g., `feb26`, `mar26`).
|
|
||||||
|
|
||||||
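A one-line sketch of producing the tag for the current month with standard `strftime` (assumes an English locale so `%b` yields "Feb"):

```python
from datetime import UTC, datetime

month_tag = datetime.now(UTC).strftime("%b%y").lower()  # e.g. "feb26"
```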
---

## Background Threads

CheddahBot runs 7 daemon threads. All start at boot and run until shutdown (a wiring sketch follows the table).

| Thread | Interval | What It Does |
|--------|----------|-------------|
| **poll** | 60 seconds | Runs cron-scheduled tasks from the database |
| **heartbeat** | 30 minutes | Reads HEARTBEAT.md checklist, takes action if needed |
| **clickup** | 20 minutes | Polls ClickUp for tasks to auto-execute (only Press Releases currently) |
| **folder_watch** | 40 minutes | Scans `//PennQnap1/SHARE1/cora-inbox` for .xlsx files → triggers Link Building |
| **autocora** | 5 minutes | Submits Cora jobs for today's tasks + polls for results |
| **content_watch** | 40 minutes | Scans `//PennQnap1/SHARE1/content-cora-inbox` for .xlsx files → triggers Content/OPT Phase 1 |
| **cora_distribute** | 40 minutes | Scans `//PennQnap1/SHARE1/Cora-For-Human` for .xlsx files → distributes to pipeline inboxes |
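A minimal sketch of how loops like these are typically wired as daemon threads. The function and thread names are assumptions for illustration, not CheddahBot's actual scheduler code:

```python
# Sketch: start a named daemon loop at boot. `do_work` stands in for the
# poll/heartbeat/watcher callables; error handling is deliberately minimal.
import threading
import time

def start_loop(name: str, interval_s: float, do_work) -> threading.Thread:
    def run():
        while True:
            try:
                do_work()
            except Exception as exc:  # keep the loop alive on failure
                print(f"[{name}] loop error: {exc}")
            time.sleep(interval_s)
    t = threading.Thread(target=run, name=name, daemon=True)
    t.start()
    return t

# e.g. start_loop("autocora", 5 * 60, run_autocora_cycle)
```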
---

## Pipeline 1: PRESS RELEASE

**Work Category:** "Press Release"
**auto_execute:** TRUE — the only pipeline that runs automatically from ClickUp polling
**Tool:** `write_press_releases`

### Flow

```
CLICKUP POLL (every 20 min)
│
├─ Finds task with Work Category = "Press Release", status = "to do", due within 3 weeks
│
▼
CHECK LOCAL DB
│ Key: clickup:task:{id}:state
│ If state = "executing" or "completed" or "failed" → SKIP (already handled)
│
▼
SET STATUS → "automation underway"
│ ClickUp API: PUT /task/{id} status
│ Local DB: state = "executing"
│
▼
STEP 1: Generate 7 Headlines (chat brain - GPT-4o-mini)
│ Uses configured chat model
│ Saves to: data/generated/press_releases/{company}/{slug}_headlines.txt
│
▼
STEP 2: AI Judge Picks Best 2 (chat brain)
│ Filters out rule-violating headlines (colons, superlatives, etc.)
│ Falls back to first 2 if judge returns < 2
│
▼
STEP 3: Write 2 Full Press Releases (execution brain - Claude Code CLI)
│ For each winning headline:
│   - Claude writes full 575-800 word PR
│   - Validates anchor phrase
│   - Saves .txt and .docx
│   - Uploads .docx to ClickUp as attachment
│
▼
STEP 4: Generate JSON-LD Schemas (execution brain - Sonnet)
│ For each PR:
│   - Generates NewsArticle schema
│   - Saves .json file
│
▼
SET STATUS → "internal review"
│ ClickUp API: comment with results + PUT status
│ Local DB: state = "completed"
│
▼
DONE — Human reviews in ClickUp
```

### ClickUp Fields Read
- `PR Topic` → press release topic/keyword (required)
- `Customer` → company name in PR (required)
- `IMSURL` → target URL for anchor link (required)
- `SocialURL` → branded URL (optional)

### What Can Go Wrong
- **BUG: Crash mid-step → stuck forever.** DB says "executing", never retries. Manual reset needed.
- **BUG: DB says "completed" but ClickUp API failed → out of sync.** DB written before API call (an ordering fix is sketched below).
- **BUG: Attachment upload fails silently.** Task marked complete, files missing from ClickUp.
- Headline generation returns empty → tool exits with error, task marked "failed"
- Schema JSON invalid → warning logged but task still completes
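A minimal sketch of the ordering fix for that out-of-sync bug: persist the local "completed" state only after the ClickUp calls succeed, so a failed API call leaves the task retryable. The `clickup` and `kv` method names are assumptions:

```python
# Sketch: write the KV state only after the ClickUp API calls succeed.
def finish_task(clickup, kv, task_id: str) -> None:
    clickup.set_status(task_id, "internal review")  # may raise on failure
    clickup.add_comment(task_id, "Press releases attached for review.")
    # Reached only if both API calls succeeded:
    kv.set(f"clickup:task:{task_id}:state", {"state": "completed"})
```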
---

## Pipeline 2: LINK BUILDING (Cora Backlinks)

**Work Category:** "Link Building"
**auto_execute:** FALSE — triggered by folder watcher, not ClickUp polling
**Tool:** `run_cora_backlinks`

### Full Lifecycle (4 stages)

```
STAGE A: AUTOCORA SUBMITS CORA JOB
══════════════════════════════════

AUTOCORA LOOP (every 5 min)
│
├─ Calls submit_autocora_jobs(target_date = today)
│    Finds tasks: Work Category in ["Link Building", "On Page Optimization", "Content Creation"]
│                 status = "to do"
│                 due date = TODAY (exact 24h window)  ← ★ BUG: misses overdue tasks
│
├─ Groups tasks by Keyword (case-insensitive)
│    If same keyword across multiple tasks → one job covers all
│
├─ For each keyword group:
│    Check local DB: autocora:job:{keyword_lower}
│    If already submitted → SKIP
│
▼
WRITE JOB FILE
│ Path: //PennQnap1/SHARE1/AutoCora/jobs/{job-id}.json
│ Content: {"keyword": "...", "url": "IMSURL", "task_ids": ["id1", "id2"]}
│ Local DB: autocora:job:{keyword} = {status: "submitted", job_id: "..."}
│
▼
SET ALL TASK STATUSES → "automation underway"


STAGE B: EXTERNAL WORKER RUNS CORA (not CheddahBot code)
═════════════════════════════════════════════════════════

Worker on another machine:
│ Watches //PennQnap1/SHARE1/AutoCora/jobs/
│ Picks up .json, runs Cora SEO tool
│ Writes .xlsx report to Z:/cora-inbox/  ← auto-deposited
│ Writes //PennQnap1/SHARE1/AutoCora/results/{job-id}.result = "SUCCESS" or "FAILURE: reason"


STAGE C: AUTOCORA POLLS FOR RESULTS
════════════════════════════════════

AUTOCORA LOOP (every 5 min)
│
├─ Scans local DB for autocora:job:* with status = "submitted"
│    For each: checks if results/{job-id}.result exists
│
├─ If SUCCESS:
│    Local DB: status = "completed"
│    ClickUp: all task_ids → status = "running cora"
│    ClickUp: comment "Cora report completed for keyword: ..."
│
├─ If FAILURE:
│    Local DB: status = "failed"
│    ClickUp: all task_ids → status = "error"
│    ClickUp: comment with failure reason
│
└─ If no result file yet: skip, check again in 5 min


STAGE D: FOLDER WATCHER TRIGGERS LINK BUILDING
═══════════════════════════════════════════════

FOLDER WATCHER (every 60 min)
│
├─ Scans Z:/cora-inbox/ for .xlsx files
│    Skips: ~$ temp files, already-completed files (via local DB)
│
├─ For each new .xlsx:
│    Normalize filename: "anti-vibration-rubber-mounts.xlsx" → "anti vibration rubber mounts"
│
▼
MATCH TO CLICKUP TASK
│ Queries all tasks in space with Work Category = "Link Building"
│ Fuzzy matches Keyword field against normalized filename:
│   - Exact match
│   - Substring match (either direction)
│   - >80% word overlap
│   (sketched in code after this diagram)
│
├─ NO MATCH → local DB: status = "unmatched", notification sent, retry next scan
│
├─ MATCH FOUND but IMSURL empty → local DB: status = "blocked", ClickUp → "error"
│
▼
SET STATUS → "automation underway"
│
▼
STEP 1: Ingest CORA Report (Big-Link-Man subprocess)
│ Runs: E:/dev/Big-Link-Man/.venv/Scripts/python.exe main.py ingest-cora -f {xlsx} -n {keyword} ...
│ BLM parses xlsx, creates project, writes job file
│ Timeout: 30 minutes
│ ClickUp: comment "CORA report ingested. Project ID: ..."
│
▼
STEP 2: Generate Content Batch (Big-Link-Man subprocess)
│ Runs: python main.py generate-batch -j {job_file} --continue-on-error
│ BLM generates content for each prospect
│ Moves job file to jobs/done/
│
▼
SET STATUS → "complete"
│ ClickUp: comment with results
│ Move .xlsx to Z:/cora-inbox/processed/
│ Local DB: linkbuilding:watched:{filename} = {status: "completed"}
│
▼
DONE
```
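A minimal sketch of the three matching rules from the diagram above (exact, substring in either direction, >80% word overlap). The function is illustrative, not the watcher's actual code; in particular, which side the overlap ratio is measured against is an assumption:

```python
# Sketch of the watcher's keyword-to-filename fuzzy match rules.
def keyword_matches(keyword: str, normalized_filename: str) -> bool:
    kw = keyword.lower().strip()
    fn = normalized_filename.lower().strip()
    if kw == fn:                  # exact match
        return True
    if kw in fn or fn in kw:      # substring match (either direction)
        return True
    kw_words, fn_words = set(kw.split()), set(fn.split())
    if not kw_words:
        return False
    overlap = len(kw_words & fn_words) / len(kw_words)
    return overlap > 0.8          # >80% word overlap (denominator assumed)
```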
### ClickUp Fields Read
- `Keyword` → matches against .xlsx filename + used as project name
- `IMSURL` → money site URL (required)
- `LB Method` → must be "Cora Backlinks" or empty
- `CustomAnchors`, `BrandedPlusRatio`, `CLIFlags` → passed to BLM

### What Can Go Wrong
- **BUG: AutoCora only checks today's tasks.** Due date missed = never gets a Cora report.
- **BUG: Crash mid-step → stuck "executing".** Same as PR pipeline.
- No ClickUp task with matching Keyword → file sits unmatched, notification sent
- IMSURL empty → blocked, ClickUp set to "error"
- BLM subprocess timeout (30 min) or crash → task fails
- Network share offline → can't write job file or read results

### Retry Behavior
- "processing", "blocked", "unmatched" .xlsx files → retried on next scan (KV entry deleted; see the sketch after this list)
- "completed", "failed" → never retried
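A minimal sketch of that retry rule: transient states are cleared from the KV store so the next scan re-processes the file. The `kv` methods and key pattern follow the KV table later in this doc, but the exact signatures are assumptions:

```python
# Sketch: clear transient file states so the next watcher scan retries them.
RETRIABLE = {"processing", "blocked", "unmatched"}

def maybe_reset(kv, filename: str) -> None:
    entry = kv.get(f"linkbuilding:watched:{filename}")
    if entry and entry.get("status") in RETRIABLE:
        kv.delete(f"linkbuilding:watched:{filename}")  # retried on next scan
```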
---

## Pipeline 3: CONTENT CREATION

**Work Category:** "Content Creation"
**auto_execute:** FALSE — triggered by content folder watcher
**Tool:** `create_content` (two-phase)

### Flow

```
STAGE A: AUTOCORA SUBMITS CORA JOB (same as Link Building Stage A)
══════════════════════════════════════════════════════════════════
Same AutoCora loop, same BUG with today-only filtering.
Worker generates .xlsx → deposits in Z:/content-cora-inbox/


STAGE B: CONTENT WATCHER TRIGGERS PHASE 1
══════════════════════════════════════════

CONTENT WATCHER (every 60 min)
│
├─ Scans Z:/content-cora-inbox/ for .xlsx files
│    Same skip/retry logic as link building watcher
│
├─ Normalize filename, fuzzy match to ClickUp task
│    Matches: Work Category in ["Content Creation", "On Page Optimization"]
│
├─ NO MATCH → "unmatched", notification
│
▼
PHASE 1: Research + Outline (execution brain - Claude Code CLI)
│
│ ★ BUG: Does NOT set "automation underway" status (link building watcher does)
│
│ Build prompt based on content type:
│   - If IMSURL present → "optimize existing page" (scrape it, analyze, outline improvements)
│   - If IMSURL empty → "new content" (competitor research, outline from scratch)
│   - If Cora .xlsx found → "use this Cora report for keyword targets and entities"
│   - If CLIFlags contains "service" → includes service page template
│
│ Claude Code runs: web searches, scrapes competitors, reads Cora report
│ Generates outline with entity recommendations
│
▼
SAVE OUTLINE
│ Path: Z:/content-outlines/{keyword-slug}/outline.md
│ Local DB: clickup:task:{id}:state = {state: "outline_review", outline_path: "..."}
│
▼
SET STATUS → "outline review"
│ ClickUp: comment "Outline ready for review"
│
│ ★ BUG: .xlsx NOT moved to processed/ (link building watcher moves files)
│
▼
WAITING FOR HUMAN
│ Human opens outline at Z:/content-outlines/{slug}/outline.md
│ Human edits/approves
│ Human moves ClickUp task to "outline approved"


STAGE C: CLICKUP POLL TRIGGERS PHASE 2
═══════════════════════════════════════

CLICKUP POLL (every 20 min)
│
├─ Finds task with status = "outline approved" (in poll_statuses list)
│
├─ Check local DB: clickup:task:{id}:state
│    Sees state = "outline_review" → this means Phase 2 is ready
│    ★ BUG: If DB was wiped, no entry → runs Phase 1 AGAIN, overwrites outline
│
▼
PHASE 2: Write Full Content (execution brain - Claude Code CLI)
│
│ Reads outline from path stored in local DB (outline_path)
│ ★ BUG: If outline file was deleted → Phase 2 fails every time, no recovery
│
│ Claude Code writes full content using the approved outline
│ Includes entity optimization, keyword density targets from Cora
│
▼
SAVE FINAL CONTENT
│ Path: Z:/content-outlines/{keyword-slug}/final-content.md
│ Local DB: state = "completed"
│
▼
SET STATUS → "internal review"
│ ClickUp: comment with content path
│
▼
DONE — Human reviews final content
```

### ClickUp Fields Read
- `Keyword` → target keyword, used for Cora matching and content generation
- `IMSURL` → if present = optimization, if empty = new content
- `CLIFlags` → hints like "service" for service page template

### What Can Go Wrong
- **BUG: AutoCora only checks today → Cora report never generated for overdue tasks**
- **BUG: DB wipe → Phase 2 reruns Phase 1, destroys approved outline** (a guard is sketched after this list)
- **BUG: Outline file deleted → Phase 2 permanently fails**
- **BUG: No "automation underway" set during Phase 1 from watcher**
- **BUG: .xlsx not moved to processed/**
- Network share offline → can't save outline or read it back
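A minimal sketch of a guard against the DB-wipe rerun: before treating an "outline approved" task as needing Phase 1, check whether an outline already exists on disk and rehydrate the KV entry instead. The path follows the diagram above; the helper names and KV API are assumptions:

```python
# Sketch: rehydrate missing KV state from the outline file instead of
# re-running Phase 1 and overwriting an approved outline.
from pathlib import Path

def phase_for_task(kv, task_id: str, keyword_slug: str) -> str:
    state = kv.get(f"clickup:task:{task_id}:state")
    outline = Path(f"Z:/content-outlines/{keyword_slug}/outline.md")
    if state is None and outline.exists():
        kv.set(f"clickup:task:{task_id}:state",
               {"state": "outline_review", "outline_path": str(outline)})
        return "phase2"
    if state and state.get("state") == "outline_review":
        return "phase2"
    return "phase1"
```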
---

## Pipeline 4: ON PAGE OPTIMIZATION

**Work Category:** "On Page Optimization"
**auto_execute:** FALSE
**Tool:** `create_content` (same as Content Creation)

### Flow

Identical to Content Creation except:
- Phase 1 prompt says "optimize existing page at {IMSURL}" instead of "create new content"
- Phase 1 scrapes the existing page first, then builds optimization outline
- IMSURL is always present (it's the page being optimized)

Same bugs apply.

---

## The Local DB (KV Store) — What It Tracks

| Key Pattern | What It Stores | Read By | Actually Needed? |
|---|---|---|---|
| `clickup:task:{id}:state` | Full task execution state (status, timestamps, outline_path, errors) | ClickUp poll dedup check, Phase 2 detection | **PARTIALLY** — outline_path is needed for Phase 2, but dedup could use ClickUp status instead (sketched below) |
| `autocora:job:{keyword}` | Job submission tracking (job_id, status, task_ids) | AutoCora result poller | **YES** — maps keyword to job_id for result file lookup |
| `linkbuilding:watched:{filename}` | File processing state (processing/completed/failed/unmatched/blocked) | Folder watcher scan | **YES** — prevents re-processing files |
| `content:watched:{filename}` | Same as above for content files | Content watcher scan | **YES** — prevents re-processing |
| `pipeline:status` | Current step text for UI ("Step 2/4: Judging...") | Gradio UI polling | **NO** — just a display string, could be in-memory |
| `linkbuilding:status` | Same for link building UI | Gradio UI polling | **NO** — same |
| `system:loop:*:last_run` (x6) | Timestamp of last loop run | Dashboard API | **NO** — informational only, never used in logic |
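A minimal sketch of the status-based dedup alternative mentioned in the first row: treat the ClickUp status itself as the source of truth, so a wiped KV store cannot cause double execution. The client method names are assumptions:

```python
# Sketch: dedup against ClickUp status instead of the local KV store.
# Any status already moved past "to do" means another run claimed the task.
ACTIVE = {"automation underway", "running cora", "outline review",
          "internal review", "pr needs review", "complete"}

def should_execute(clickup, task_id: str) -> bool:
    status = clickup.get_task(task_id).status.lower()
    return status not in ACTIVE
```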
---

## Summary of All Bugs

| # | Bug | Severity | Pipelines Affected |
|---|-----|----------|-------------------|
| 1 | AutoCora only submits for today's due date | HIGH | Link Building, Content, OPT |
| 2 | DB wipe → Phase 2 reruns Phase 1 | HIGH | Content, OPT |
| 3 | Stuck "executing" after crash, no recovery | HIGH | All 4 |
| 4 | Content watcher missing "automation underway" | MEDIUM | Content, OPT |
| 5 | Content watcher doesn't move .xlsx to processed/ | MEDIUM | Content, OPT |
| 6 | KV written before ClickUp API → out of sync | MEDIUM | All 4 |
| 7 | Silent attachment upload failures | MEDIUM | Press Release |
| 8 | Phase 2 fails permanently if outline file gone | LOW | Content, OPT |

@ -15,10 +15,6 @@ dependencies = [
     "croniter>=2.0",
     "edge-tts>=6.1",
     "python-docx>=1.2.0",
-    "openpyxl>=3.1.5",
-    "jinja2>=3.1.6",
-    "python-multipart>=0.0.22",
-    "sse-starlette>=3.3.3",
 ]

 [build-system]

@ -1,94 +0,0 @@
"""Query ClickUp 'to do' tasks tagged feb26 in OPT/LINKS/Content categories."""
|
|
||||||
|
|
||||||
import sys
|
|
||||||
from datetime import datetime, timezone
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
sys.stdout.reconfigure(line_buffering=True)
|
|
||||||
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
|
|
||||||
|
|
||||||
from cheddahbot.config import load_config
|
|
||||||
from cheddahbot.clickup import ClickUpClient
|
|
||||||
|
|
||||||
CATEGORY_PREFIXES = ("opt", "link", "content", "ai content")
|
|
||||||
TAG_FILTER = "feb26"
|
|
||||||
|
|
||||||
|
|
||||||
def ms_to_date(ms_str: str) -> str:
|
|
||||||
if not ms_str:
|
|
||||||
return "—"
|
|
||||||
try:
|
|
||||||
ts = int(ms_str) / 1000
|
|
||||||
return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%m/%d")
|
|
||||||
except (ValueError, OSError):
|
|
||||||
return "—"
|
|
||||||
|
|
||||||
|
|
||||||
def main():
|
|
||||||
cfg = load_config()
|
|
||||||
if not cfg.clickup.api_token or not cfg.clickup.space_id:
|
|
||||||
print("ERROR: CLICKUP_API_TOKEN or CLICKUP_SPACE_ID not set.")
|
|
||||||
return
|
|
||||||
|
|
||||||
client = ClickUpClient(
|
|
||||||
api_token=cfg.clickup.api_token,
|
|
||||||
workspace_id=cfg.clickup.workspace_id,
|
|
||||||
task_type_field_name=cfg.clickup.task_type_field_name,
|
|
||||||
)
|
|
||||||
|
|
||||||
try:
|
|
||||||
# Fetch all 'to do' tasks across the space
|
|
||||||
tasks = client.get_tasks_from_space(cfg.clickup.space_id, statuses=["to do"])
|
|
||||||
|
|
||||||
# Filter by feb26 tag
|
|
||||||
tagged = [t for t in tasks if TAG_FILTER in [tag.lower() for tag in t.tags]]
|
|
||||||
|
|
||||||
if not tagged:
|
|
||||||
all_tags = set()
|
|
||||||
for t in tasks:
|
|
||||||
all_tags.update(t.tags)
|
|
||||||
print(f"No tasks with tag '{TAG_FILTER}'. Tags seen: {sorted(all_tags)}")
|
|
||||||
print(f"Total 'to do' tasks found: {len(tasks)}")
|
|
||||||
return
|
|
||||||
|
|
||||||
# Filter to OPT/LINKS/Content categories (by task name, Work Category, or list name)
|
|
||||||
def is_target_category(t):
|
|
||||||
name_lower = t.name.lower().strip()
|
|
||||||
wc = (t.custom_fields.get("Work Category") or "").lower()
|
|
||||||
ln = (t.list_name or "").lower()
|
|
||||||
for prefix in CATEGORY_PREFIXES:
|
|
||||||
if name_lower.startswith(prefix) or prefix in wc or prefix in ln:
|
|
||||||
return True
|
|
||||||
return False
|
|
||||||
|
|
||||||
filtered = [t for t in tagged if is_target_category(t)]
|
|
||||||
skipped = [t for t in tagged if not is_target_category(t)]
|
|
||||||
|
|
||||||
# Sort by due date (oldest first), tasks with no due date go last
|
|
||||||
filtered.sort(key=lambda t: int(t.due_date) if t.due_date else float("inf"))
|
|
||||||
|
|
||||||
top = filtered[:10]
|
|
||||||
|
|
||||||
# Build table
|
|
||||||
print(f"feb26-tagged 'to do' tasks — OPT / LINKS / Content (top 10, oldest first)")
|
|
||||||
print(f"\n{'#':>2} | {'ID':<11} | {'Keyword/Name':<50} | {'Due':<6} | {'Customer':<25} | Tags")
|
|
||||||
print("-" * 120)
|
|
||||||
for i, t in enumerate(top, 1):
|
|
||||||
customer = t.custom_fields.get("Customer", "") or "—"
|
|
||||||
due = ms_to_date(t.due_date)
|
|
||||||
tags = ", ".join(t.tags)
|
|
||||||
name = t.name[:50]
|
|
||||||
print(f"{i:>2} | {t.id:<11} | {name:<50} | {due:<6} | {customer:<25} | {tags}")
|
|
||||||
|
|
||||||
print(f"\nShowing {len(top)} of {len(filtered)} OPT/LINKS/Content tasks ({len(tagged)} total feb26-tagged).")
|
|
||||||
if skipped:
|
|
||||||
print(f"\nSkipped {len(skipped)} non-OPT/LINKS/Content tasks:")
|
|
||||||
for t in skipped:
|
|
||||||
print(f" - {t.name} ({t.id})")
|
|
||||||
|
|
||||||
finally:
|
|
||||||
client.close()
|
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
|
||||||
main()
|
|
||||||
|
|
@ -1,120 +0,0 @@
"""Query ClickUp 'to do' tasks tagged feb26 in OPT/LINKS/Content categories."""

import sys
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path

# Add project root to path
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

from cheddahbot.config import load_config
from cheddahbot.clickup import ClickUpClient


def ms_to_date(ms_str: str) -> str:
    """Convert Unix-ms timestamp string to YYYY-MM-DD."""
    if not ms_str:
        return "—"
    try:
        ts = int(ms_str) / 1000
        return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
    except (ValueError, OSError):
        return "—"


def main():
    cfg = load_config()
    if not cfg.clickup.api_token or not cfg.clickup.space_id:
        print("ERROR: CLICKUP_API_TOKEN or CLICKUP_SPACE_ID not set.")
        return

    client = ClickUpClient(
        api_token=cfg.clickup.api_token,
        workspace_id=cfg.clickup.workspace_id,
        task_type_field_name=cfg.clickup.task_type_field_name,
    )

    # Step 1: Get folders, find OPT/LINKS/Content
    target_folders = {"opt", "links", "content"}
    try:
        folders = client.get_folders(cfg.clickup.space_id)
    except Exception as e:
        print(f"ERROR fetching folders: {e}")
        client.close()
        return

    print(f"All folders: {[f['name'] for f in folders]}")

    matched_lists = []  # (list_id, list_name, folder_name)
    for folder in folders:
        if folder["name"].lower() in target_folders:
            for lst in folder["lists"]:
                matched_lists.append((lst["id"], lst["name"], folder["name"]))

    if not matched_lists:
        print(f"No folders matching {target_folders}. Falling back to full space scan.")
        try:
            tasks = client.get_tasks_from_space(cfg.clickup.space_id, statuses=["to do"])
        finally:
            client.close()
    else:
        print(f"Querying lists: {[(ln, fn) for _, ln, fn in matched_lists]}")
        tasks = []
        for list_id, list_name, folder_name in matched_lists:
            try:
                batch = client.get_tasks(list_id, statuses=["to do"])
                # Stash folder name on each task for display
                for t in batch:
                    t._folder = folder_name
                tasks.extend(batch)
            except Exception as e:
                print(f"  Error fetching {list_name}: {e}")
        client.close()

    print(f"Total 'to do' tasks from target folders: {len(tasks)}")

    # Filter by "feb26" tag (case-insensitive)
    tagged = [t for t in tasks if any(tag.lower() == "feb26" for tag in t.tags)]

    if not tagged:
        print("No 'to do' tasks with 'feb26' tag found.")
        all_tags = set()
        for t in tasks:
            all_tags.update(t.tags)
        print(f"Tags found across all to-do tasks: {sorted(all_tags)}")
        return

    filtered = tagged

    # Sort by due date (oldest first), tasks without due date go last
    def sort_key(t):
        if t.due_date:
            return (0, int(t.due_date))
        return (1, 0)

    filtered.sort(key=sort_key)

    # Take top 10
    top10 = filtered[:10]

    # Build table
    print(f"\n## ClickUp 'to do' — feb26 tag — OPT/LINKS/Content ({len(filtered)} total, showing top 10)\n")
    print(f"{'#':<3} | {'ID':<12} | {'Keyword/Name':<40} | {'Due':<12} | {'Customer':<20} | Tags")
    print(f"{'—'*3} | {'—'*12} | {'—'*40} | {'—'*12} | {'—'*20} | {'—'*15}")

    for i, t in enumerate(top10, 1):
        customer = t.custom_fields.get("Customer", "") or "—"
        due = ms_to_date(t.due_date)
        tags = ", ".join(t.tags) if t.tags else "—"
        name = t.name[:38] + ".." if len(t.name) > 40 else t.name
        print(f"{i:<3} | {t.id:<12} | {name:<40} | {due:<12} | {customer:<20} | {tags}")

    print("\nCategory breakdown:")
    cats = Counter(t.task_type for t in filtered)
    for cat, count in cats.most_common():
        print(f"  {cat or '(none)'}: {count}")


if __name__ == "__main__":
    main()

@ -1,184 +0,0 @@
"""CLI script to create a ClickUp task in a client's Overall list.

Usage:
    uv run python scripts/create_clickup_task.py --name "Task" --client "Client"
    uv run python scripts/create_clickup_task.py --name "LB" --client "Acme" \\
        --category "Link Building" --due-date 2026-03-11 --tag feb26 \\
        --field "Keyword=some keyword" --field "CLIFlags=--tier1-count 5"
"""

from __future__ import annotations

import argparse
import json
import os
import re
import sys
from datetime import UTC, datetime
from pathlib import Path

# Add project root to path so we can import cheddahbot
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

from dotenv import load_dotenv

from cheddahbot.clickup import ClickUpClient

DEFAULT_ASSIGNEE = 10765627  # Bryan Bigari


def _date_to_unix_ms(date_str: str) -> int:
    """Convert YYYY-MM-DD to Unix milliseconds (noon UTC).

    Noon UTC ensures the date displays correctly in US timezones.
    """
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(hour=12, tzinfo=UTC)
    return int(dt.timestamp() * 1000)


def _parse_time_estimate(s: str) -> int:
    """Parse a human time string like '2h', '30m', '1h30m' to ms."""
    total_min = 0
    match = re.match(r"(?:(\d+)h)?(?:(\d+)m)?$", s.strip())
    if not match or not any(match.groups()):
        raise ValueError(f"Invalid time estimate: '{s}' (use e.g. '2h', '30m', '1h30m')")
    if match.group(1):
        total_min += int(match.group(1)) * 60
    if match.group(2):
        total_min += int(match.group(2))
    return total_min * 60 * 1000


def main():
    load_dotenv()

    parser = argparse.ArgumentParser(description="Create a ClickUp task")
    parser.add_argument("--name", required=True, help="Task name")
    parser.add_argument("--client", required=True, help="Client folder name")
    parser.add_argument("--category", default="", help="Work Category dropdown value")
    parser.add_argument("--description", default="", help="Task description")
    parser.add_argument("--status", default="to do", help="Initial status (default: 'to do')")
    parser.add_argument("--due-date", default="", help="Due date as YYYY-MM-DD")
    parser.add_argument("--tag", action="append", default=[], help="Tag (mmmYY, repeatable)")
    parser.add_argument(
        "--field",
        action="append",
        default=[],
        help="Custom field as Name=Value (repeatable)",
    )
    parser.add_argument(
        "--priority",
        type=int,
        default=2,
        help="Priority: 1=Urgent, 2=High, 3=Normal, 4=Low (default: 2)",
    )
    parser.add_argument(
        "--assignee",
        type=int,
        action="append",
        default=[],
        help="ClickUp user ID (default: Bryan 10765627)",
    )
    parser.add_argument(
        "--time-estimate",
        default="",
        help="Time estimate (e.g. '2h', '30m', '1h30m')",
    )
    args = parser.parse_args()

    api_token = os.environ.get("CLICKUP_API_TOKEN", "")
    space_id = os.environ.get("CLICKUP_SPACE_ID", "")

    if not api_token:
        print("Error: CLICKUP_API_TOKEN not set", file=sys.stderr)
        sys.exit(1)
    if not space_id:
        print("Error: CLICKUP_SPACE_ID not set", file=sys.stderr)
        sys.exit(1)

    # Parse custom fields
    custom_fields: dict[str, str] = {}
    for f in args.field:
        if "=" not in f:
            print(f"Error: --field must be Name=Value, got: {f}", file=sys.stderr)
            sys.exit(1)
        name, value = f.split("=", 1)
        custom_fields[name] = value

    client = ClickUpClient(api_token=api_token)
    try:
        # Find the client's Overall list
        list_id = client.find_list_in_folder(space_id, args.client)
        if not list_id:
            msg = f"Error: No folder '{args.client}' with 'Overall' list"
            print(msg, file=sys.stderr)
            sys.exit(1)

        # Build create_task kwargs
        create_kwargs: dict = {
            "list_id": list_id,
            "name": args.name,
            "description": args.description,
            "status": args.status,
        }
        if args.due_date:
            create_kwargs["due_date"] = _date_to_unix_ms(args.due_date)
        if args.tag:
            create_kwargs["tags"] = args.tag
        create_kwargs["priority"] = args.priority
        create_kwargs["assignees"] = args.assignee or [DEFAULT_ASSIGNEE]
        if args.time_estimate:
            create_kwargs["time_estimate"] = _parse_time_estimate(args.time_estimate)

        # Create the task
        result = client.create_task(**create_kwargs)
        task_id = result.get("id", "")

        # Set Client dropdown field
        client.set_custom_field_smart(task_id, list_id, "Client", args.client)

        # Set Work Category if provided
        if args.category:
            client.set_custom_field_smart(task_id, list_id, "Work Category", args.category)

        # Set any additional custom fields
        for field_name, field_value in custom_fields.items():
            ok = client.set_custom_field_smart(task_id, list_id, field_name, field_value)
            if not ok:
                print(f"Warning: Failed to set '{field_name}'", file=sys.stderr)

        print(json.dumps({
            "id": task_id,
            "name": args.name,
            "client": args.client,
            "url": result.get("url", ""),
            "status": args.status,
        }, indent=2))
    finally:
        client.close()


if __name__ == "__main__":
    main()

@ -1,97 +0,0 @@
"""Query ClickUp for feb26-tagged to-do tasks in OPT/LINKS/Content categories."""

from datetime import datetime, UTC

from cheddahbot.config import load_config
from cheddahbot.clickup import ClickUpClient

cfg = load_config()
client = ClickUpClient(
    api_token=cfg.clickup.api_token,
    workspace_id=cfg.clickup.workspace_id,
    task_type_field_name=cfg.clickup.task_type_field_name,
)

tasks = client.get_tasks_from_overall_lists(cfg.clickup.space_id, statuses=["to do"])
client.close()

# Filter: tagged feb26
feb26 = [t for t in tasks if "feb26" in t.tags]

# Filter: OPT / LINKS / Content categories (by Work Category or name prefix)
def is_target(t):
    cat = (t.task_type or "").lower()
    name = t.name.upper()
    if cat in ("on page optimization", "link building", "content creation"):
        return True
    if name.startswith("OPT") or name.startswith("LINKS") or name.startswith("NEW -"):
        return True
    return False

filtered = [t for t in feb26 if is_target(t)]

# Sort by due date ascending (no due date = sort last)
def sort_key(t):
    if t.due_date:
        return int(t.due_date)
    return float("inf")

filtered.sort(key=sort_key)
top10 = filtered[:10]

def fmt_due(ms_str):
    if not ms_str:
        return "No due"
    ts = int(ms_str) / 1000
    return datetime.fromtimestamp(ts, tz=UTC).strftime("%b %d")

def fmt_customer(t):
    c = t.custom_fields.get("Customer", "")
    if c and str(c) != "None":
        return str(c)
    return t.list_name

def fmt_cat(t):
    cat = t.task_type
    name = t.name.upper()
    if not cat or cat.strip() == "":
        if name.startswith("LINKS"):
            return "LINKS"
        elif name.startswith("OPT"):
            return "OPT"
        elif name.startswith("NEW"):
            return "Content"
        return "?"
    mapping = {
        "On Page Optimization": "OPT",
        "Link Building": "LINKS",
        "Content Creation": "Content",
    }
    return mapping.get(cat, cat)

def fmt_tags(t):
    return ", ".join(t.tags) if t.tags else ""

print(f"## feb26 To-Do: OPT / LINKS / Content ({len(filtered)} total, showing top 10 oldest)")
print()
print("| # | ID | Keyword/Name | Due | Customer | Tags |")
print("|---|-----|-------------|-----|----------|------|")
for i, t in enumerate(top10, 1):
    name = t.name[:55]
    tid = t.id
    due = fmt_due(t.due_date)
    cust = fmt_customer(t)
    tags = fmt_tags(t)
    print(f"| {i} | {tid} | {name} | {due} | {cust} | {tags} |")

if len(filtered) > 10:
    print()
    remaining = filtered[10:]
    print(f"### Remaining {len(remaining)} tasks:")
    print("| # | ID | Keyword/Name | Due | Customer | Tags |")
    print("|---|-----|-------------|-----|----------|------|")
    for i, t in enumerate(remaining, 11):
        name = t.name[:55]
        print(f"| {i} | {t.id} | {name} | {fmt_due(t.due_date)} | {fmt_customer(t)} | {fmt_tags(t)} |")

print()
print(f"*{len(filtered)} matching tasks, {len(feb26)} total feb26 tasks, {len(tasks)} total to-do*")

@ -1,87 +0,0 @@
"""Query ClickUp 'to do' tasks tagged feb26 in OPT/LINKS/Content categories."""

import os
import sys
from datetime import datetime, timezone

sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))

from dotenv import load_dotenv

load_dotenv(os.path.join(os.path.dirname(__file__), "..", ".env"))

from cheddahbot.clickup import ClickUpClient

TOKEN = os.getenv("CLICKUP_API_TOKEN", "")
SPACE_ID = os.getenv("CLICKUP_SPACE_ID", "")

if not TOKEN or not SPACE_ID:
    print("ERROR: CLICKUP_API_TOKEN and CLICKUP_SPACE_ID must be set in .env")
    sys.exit(1)

CATEGORIES = {"On Page Optimization", "Content Creation", "Link Building"}
TAG_FILTER = "feb26"

client = ClickUpClient(api_token=TOKEN, workspace_id="", task_type_field_name="Work Category")

print(f"Querying ClickUp space {SPACE_ID} for 'to do' tasks...")
tasks = client.get_tasks_from_space(SPACE_ID, statuses=["to do"])
client.close()

print(f"Total 'to do' tasks found: {len(tasks)}")

# Filter by feb26 tag
tagged = [t for t in tasks if TAG_FILTER in [tag.lower() for tag in t.tags]]
print(f"Tasks with '{TAG_FILTER}' tag: {len(tagged)}")

# Filter by Work Category (OPT / LINKS / Content)
filtered = []
for t in tagged:
    cat = (t.custom_fields.get("Work Category") or t.task_type or "").strip()
    if cat in CATEGORIES:
        filtered.append(t)

if not filtered and tagged:
    # Show what categories exist so we can refine
    cats_found = set()
    for t in tagged:
        cats_found.add(t.custom_fields.get("Work Category") or t.task_type or "(none)")
    print(f"\nNo tasks matched categories {CATEGORIES}.")
    print(f"Work Categories found on feb26-tagged tasks: {cats_found}")
    print("\nShowing ALL feb26-tagged tasks instead:\n")
    filtered = tagged

# Sort by due date (oldest first), tasks without due date go last
def sort_key(t):
    if t.due_date:
        return int(t.due_date)
    return float("inf")

filtered.sort(key=sort_key)

# Take top 10
top = filtered[:10]

# Format table
def fmt_due(raw_due: str) -> str:
    if not raw_due:
        return "—"
    try:
        ts = int(raw_due) / 1000
        return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%m/%d")
    except (ValueError, OSError):
        return raw_due

def fmt_customer(t) -> str:
    return t.custom_fields.get("Customer", "") or "—"

print(f"\n{'#':<3} | {'ID':<12} | {'Keyword/Name':<45} | {'Cat':<15} | {'Due':<6} | {'Customer':<20} | Tags")
print("-" * 120)

for i, t in enumerate(top, 1):
    tags_str = ", ".join(t.tags)
    name = t.name[:45]
    cat = t.custom_fields.get("Work Category") or t.task_type or "—"
    print(f"{i:<3} | {t.id:<12} | {name:<45} | {cat:<15} | {fmt_due(t.due_date):<6} | {fmt_customer(t):<20} | {tags_str}")

print(f"\nTotal shown: {len(top)} of {len(filtered)} matching tasks")

@ -1,64 +0,0 @@
"""Find all Press Release tasks due in February 2026, any status."""

import json
import logging
from datetime import UTC, datetime

logging.basicConfig(level=logging.WARNING)

from cheddahbot.config import load_config
from cheddahbot.clickup import ClickUpClient

config = load_config()
client = ClickUpClient(
    api_token=config.clickup.api_token,
    workspace_id=config.clickup.workspace_id,
    task_type_field_name=config.clickup.task_type_field_name,
)

space_id = config.clickup.space_id
list_ids = client.get_list_ids_from_space(space_id)
field_filter = client.discover_field_filter(
    next(iter(list_ids)), config.clickup.task_type_field_name
)

pr_opt_id = field_filter["options"]["Press Release"]
custom_fields_filter = json.dumps(
    [{"field_id": field_filter["field_id"], "operator": "ANY", "value": [pr_opt_id]}]
)

# February 2026 window
feb_start = int(datetime(2026, 2, 1, tzinfo=UTC).timestamp() * 1000)
feb_end = int(datetime(2026, 3, 1, tzinfo=UTC).timestamp() * 1000)

# Query with no status filter, restricted to Press Release via custom field
tasks = client.get_tasks_from_space(
    space_id,
    custom_fields=custom_fields_filter,
)
client.close()

# Filter for due in February 2026
feb_prs = []
for t in tasks:
    if t.task_type != "Press Release":
        continue
    if not t.due_date:
        continue
    try:
        due_ms = int(t.due_date)
        if feb_start <= due_ms < feb_end:
            feb_prs.append(t)
    except (ValueError, TypeError):
        continue

print(f"\nPress Release tasks due in February 2026: {len(feb_prs)}\n")
for t in feb_prs:
    due_dt = datetime.fromtimestamp(int(t.due_date) / 1000, tz=UTC)
    due = due_dt.strftime("%Y-%m-%d")
    tags_str = ", ".join(t.tags) if t.tags else "(none)"
    customer = t.custom_fields.get("Customer", "?")
    imsurl = t.custom_fields.get("IMSURL", "")
    print(f"  [{t.status:20s}] {t.name}")
    print(f"    id={t.id}  due={due}  tags={tags_str}")
    print(f"    customer={customer}  imsurl={imsurl or '(none)'}")
    print()

@ -1,61 +0,0 @@
"""Find all feb26-tagged Press Release tasks in the active statuses, any due date."""

import json
import logging
from datetime import UTC, datetime

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s", datefmt="%H:%M:%S")

from cheddahbot.config import load_config
from cheddahbot.clickup import ClickUpClient

config = load_config()
client = ClickUpClient(
    api_token=config.clickup.api_token,
    workspace_id=config.clickup.workspace_id,
    task_type_field_name=config.clickup.task_type_field_name,
)

space_id = config.clickup.space_id

# Build a custom-field filter restricting to Work Category = "Press Release"
list_ids = client.get_list_ids_from_space(space_id)
field_filter = client.discover_field_filter(
    next(iter(list_ids)), config.clickup.task_type_field_name
)

pr_opt_id = field_filter["options"]["Press Release"]
custom_fields_filter = json.dumps(
    [{"field_id": field_filter["field_id"], "operator": "ANY", "value": [pr_opt_id]}]
)

# Get Press Release tasks in the active statuses, with no due date filter
tasks = client.get_tasks_from_space(
    space_id,
    statuses=["to do", "outline approved", "in progress", "automation underway"],
    custom_fields=custom_fields_filter,
)
client.close()

# Filter for feb26 tag
feb26_tasks = [t for t in tasks if "feb26" in t.tags]
all_pr = [t for t in tasks if t.task_type == "Press Release"]

print(f"\n{'='*70}")
print(f"Total tasks returned:  {len(tasks)}")
print(f"Press Release tasks:   {len(all_pr)}")
print(f"feb26-tagged PR tasks: {len(feb26_tasks)}")
print(f"{'='*70}\n")

for t in all_pr:
    due = ""
    if t.due_date:
        try:
            due_dt = datetime.fromtimestamp(int(t.due_date) / 1000, tz=UTC)
            due = due_dt.strftime("%Y-%m-%d")
        except (ValueError, TypeError):
            due = t.due_date
    tags_str = ", ".join(t.tags) if t.tags else "(no tags)"
    customer = t.custom_fields.get("Customer", "?")
    print(f"  [{t.status:20s}] {t.name}")
    print(f"    id={t.id}  due={due or '(none)'}  tags={tags_str}  customer={customer}")
    print()

@ -1,161 +0,0 @@
"""Migrate ClickUp 'Customer' (list-level) → 'Client' (space-level) field.

This script does NOT create the field — you must create the space-level "Client"
dropdown manually in ClickUp UI first, using the company names this script prints.

Steps:
1. Fetch all folders, filter to those with an 'Overall' list
2. Print sorted company names for dropdown creation
3. Pause for you to create the field in ClickUp UI
4. Discover the new 'Client' field's UUID + option IDs
5. Set 'Client' on every active task (folder name as value)
6. Report results

Usage:
    DRY_RUN=1 uv run python scripts/migrate_client_field.py   # preview only
    uv run python scripts/migrate_client_field.py             # live run
"""

from __future__ import annotations

import os
import sys
import time
from pathlib import Path

# Allow running from repo root
_root = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(_root))

from dotenv import load_dotenv

load_dotenv(_root / ".env")

from cheddahbot.clickup import ClickUpClient

# ── Config ──────────────────────────────────────────────────────────────────
DRY_RUN = os.environ.get("DRY_RUN", "0") not in ("0", "false", "")
NEW_FIELD_NAME = "Client"

API_TOKEN = os.environ.get("CLICKUP_API_TOKEN", "")
SPACE_ID = os.environ.get("CLICKUP_SPACE_ID", "")

if not API_TOKEN:
    sys.exit("ERROR: CLICKUP_API_TOKEN env var is required")
if not SPACE_ID:
    sys.exit("ERROR: CLICKUP_SPACE_ID env var is required")


def main() -> None:
    client = ClickUpClient(api_token=API_TOKEN)

    # 1. Get folders, filter to those with an Overall list
    print(f"\n{'=' * 60}")
    print(f"  Migrate to '{NEW_FIELD_NAME}' field -- Space {SPACE_ID}")
    print(f"  Mode: {'DRY RUN' if DRY_RUN else 'LIVE'}")
    print(f"{'=' * 60}\n")

    folders = client.get_folders(SPACE_ID)
    print(f"Found {len(folders)} folders:\n")

    client_folders = []
    for f in folders:
        overall = next(
            (lst for lst in f["lists"] if lst["name"].lower() == "overall"), None
        )
        if overall:
            print(f"  {f['name']:35s}  Overall list: {overall['id']}")
            client_folders.append({"name": f["name"], "overall_id": overall["id"]})
        else:
            print(f"  {f['name']:35s}  [SKIP - no Overall list]")

    if not client_folders:
        sys.exit("\nNo client folders with Overall lists found.")

    # 2. Print company names for dropdown creation
    option_names = sorted(cf["name"] for cf in client_folders)
    print(f"\n--- Dropdown options for '{NEW_FIELD_NAME}' ({len(option_names)}) ---")
    for name in option_names:
        print(f"  {name}")

    # 3. Build plan: fetch active tasks per folder
    print("\nFetching active tasks from Overall lists ...")
    plan: list[dict] = []
    for cf in client_folders:
        tasks = client.get_tasks(cf["overall_id"], include_closed=False)
        plan.append({
            "folder_name": cf["name"],
            "list_id": cf["overall_id"],
            "tasks": tasks,
        })

    total_tasks = sum(len(p["tasks"]) for p in plan)
    print("\n--- Update Plan (active tasks only) ---")
    for p in plan:
        print(f"  {p['folder_name']:35s}  {len(p['tasks']):3d} tasks")
    print(f"  {'TOTAL':35s}  {total_tasks:3d} tasks")

    if DRY_RUN:
        print("\n** DRY RUN -- no changes made. Unset DRY_RUN to execute. **\n")
        return

    # 4. Field should already be created in ClickUp UI
    print(f"\nLooking for space-level '{NEW_FIELD_NAME}' field ...")

    # 5. Discover the field UUID + option IDs
    first_list_id = client_folders[0]["overall_id"]
    print(f"\nDiscovering '{NEW_FIELD_NAME}' field UUID and option IDs ...")
    field_info = client.discover_field_filter(first_list_id, NEW_FIELD_NAME)
    if field_info is None:
        sys.exit(
            f"\nERROR: Could not find '{NEW_FIELD_NAME}' field on list {first_list_id}.\n"
            f"Make sure you created it as a SPACE-level field (visible to all lists)."
        )

    field_id = field_info["field_id"]
    option_map = field_info["options"]  # {name: uuid}
    print(f"  Field ID: {field_id}")
    print(f"  Options found: {len(option_map)}")

    # Verify all folder names have matching options
    missing = [cf["name"] for cf in client_folders if cf["name"] not in option_map]
    if missing:
        print("\n  WARNING: These folder names have no matching dropdown option:")
        for name in missing:
            print(f"    - {name}")
        print("  Tasks in these folders will be SKIPPED.")

    # 6. Set Client field on each task
    updated = 0
    skipped = 0
    failed = 0
    for p in plan:
        folder_name = p["folder_name"]
        opt_id = option_map.get(folder_name)
        if not opt_id:
            skipped += len(p["tasks"])
            print(f"\n  SKIP: '{folder_name}' -- no matching option")
            continue

        print(f"\nUpdating {len(p['tasks'])} tasks in '{folder_name}' ...")
        for task in p["tasks"]:
            ok = client.set_custom_field_value(task.id, field_id, opt_id)
            if ok:
                updated += 1
            else:
                failed += 1
                print(f"  FAILED: task {task.id} ({task.name})")
            time.sleep(0.15)

    print(f"\n{'=' * 60}")
    print(f"  Done!  Updated: {updated} | Skipped: {skipped} | Failed: {failed}")
    print(f"{'=' * 60}")
    print("\n  Next steps:")
    print(f"  1. Verify tasks in ClickUp have the '{NEW_FIELD_NAME}' field set correctly")
    print(f"  2. Update config.yaml: change 'Customer' → '{NEW_FIELD_NAME}' in field_mapping")
    print("  3. Test CheddahBot with the new field")
    print("  4. Delete the old list-level 'Customer' fields from ClickUp\n")


if __name__ == "__main__":
    main()

@@ -1,102 +0,0 @@
"""Query ClickUp 'to do' tasks tagged 'feb26' in OPT/LINKS/Content categories."""

from __future__ import annotations

import os
import sys
from datetime import datetime, timezone
from pathlib import Path

_root = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(_root))

from dotenv import load_dotenv
load_dotenv(_root / ".env")

from cheddahbot.clickup import ClickUpClient

API_TOKEN = os.environ.get("CLICKUP_API_TOKEN", "")
SPACE_ID = os.environ.get("CLICKUP_SPACE_ID", "")

if not API_TOKEN:
    sys.exit("ERROR: CLICKUP_API_TOKEN env var is required")
if not SPACE_ID:
    sys.exit("ERROR: CLICKUP_SPACE_ID env var is required")

# Work Category values to include (case-insensitive partial match)
CATEGORY_FILTERS = ["opt", "link", "content"]
TAG_FILTER = "feb26"


def ms_to_date(ms_str: str) -> str:
    """Convert Unix-ms timestamp string to YYYY-MM-DD."""
    if not ms_str:
        return "—"
    try:
        ts = int(ms_str) / 1000
        return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
    except (ValueError, OSError):
        return ms_str


def main() -> None:
    client = ClickUpClient(api_token=API_TOKEN, task_type_field_name="Work Category")

    print(f"Fetching 'to do' tasks from space {SPACE_ID} ...")
    tasks = client.get_tasks_from_overall_lists(SPACE_ID, statuses=["to do"])
    print(f"Total 'to do' tasks: {len(tasks)}")

    # Filter by feb26 tag
    tagged = [t for t in tasks if TAG_FILTER in [tag.lower() for tag in t.tags]]
    print(f"Tasks with '{TAG_FILTER}' tag: {len(tagged)}")

    # Show all Work Category values for debugging
    categories = set()
    for t in tagged:
        wc = t.custom_fields.get("Work Category", "") or ""
        categories.add(wc)
    print(f"Work Categories found: {categories}")

    # Filter by OPT/LINKS/Content categories
    filtered = []
    for t in tagged:
        wc = str(t.custom_fields.get("Work Category", "") or "").lower()
        if any(cat in wc for cat in CATEGORY_FILTERS):
            filtered.append(t)

    print(f"After category filter (OPT/LINKS/Content): {len(filtered)}")

    # Sort by due date (oldest first), tasks with no due date go last
    def sort_key(t):
        if t.due_date:
            try:
                return (0, int(t.due_date))
            except ValueError:
                return (1, 0)
        return (2, 0)

    filtered.sort(key=sort_key)

    # Top 10
    top10 = filtered[:10]

    # Print table
    print(f"\n{'#':>3} | {'ID':>11} | {'Keyword/Name':<45} | {'Due':>10} | {'Customer':<20} | Tags")
    print("-" * 120)

    for i, t in enumerate(top10, 1):
        customer = t.custom_fields.get("Customer", "") or "—"
        due = ms_to_date(t.due_date)
        wc = t.custom_fields.get("Work Category", "") or ""
        tags_str = ", ".join(t.tags)
        name_display = t.name[:45] if len(t.name) > 45 else t.name
        print(f"{i:>3} | {t.id:>11} | {name_display:<45} | {due:>10} | {customer:<20} | {tags_str}")

    if not top10:
        print(" (no matching tasks found)")

    print(f"\n--- {len(filtered)} total matching tasks, showing top {len(top10)} (oldest first) ---")


if __name__ == "__main__":
    main()
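A note on the `sort_key` helper above: returning a tuple sorts tasks into three buckets (parseable due date, unparseable due date, no due date) while still ordering the first bucket numerically. A minimal standalone sketch of that behavior, using made-up timestamps rather than real ClickUp data:

```python
# Illustrative only: mirrors the (bucket, value) tuple ordering used above.
rows = [("a", "1700000000000"), ("b", ""), ("c", "oops"), ("d", "1600000000000")]

def sort_key(row):
    _, due = row
    if due:
        try:
            return (0, int(due))  # bucket 0: parseable due dates, oldest first
        except ValueError:
            return (1, 0)         # bucket 1: due date set but not a number
    return (2, 0)                 # bucket 2: no due date, always last

print([name for name, _ in sorted(rows, key=sort_key)])
# ['d', 'a', 'c', 'b']
```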
@@ -1,149 +0,0 @@
"""One-time script: rebuild the 'Customer' dropdown custom field in ClickUp.

Steps:
1. Fetch all folders from the PII-Agency-SEO space
2. Filter out non-client folders
3. Create a 'Customer' dropdown field with folder names as options
4. For each client folder, find the 'Overall' list and set Customer on all tasks

Usage:
    DRY_RUN=1 uv run python scripts/rebuild_customer_field.py   # preview only
    uv run python scripts/rebuild_customer_field.py             # live run
"""

from __future__ import annotations

import os
import sys
import time
from pathlib import Path

# Allow running from repo root
_root = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(_root))

from dotenv import load_dotenv

load_dotenv(_root / ".env")

from cheddahbot.clickup import ClickUpClient

# ── Config ──────────────────────────────────────────────────────────────────
DRY_RUN = os.environ.get("DRY_RUN", "0") not in ("0", "false", "")
EXCLUDED_FOLDERS = {"SEO Audits", "SEO Projects", "Business Related"}
FIELD_NAME = "Customer"

API_TOKEN = os.environ.get("CLICKUP_API_TOKEN", "")
SPACE_ID = os.environ.get("CLICKUP_SPACE_ID", "")

if not API_TOKEN:
    sys.exit("ERROR: CLICKUP_API_TOKEN env var is required")
if not SPACE_ID:
    sys.exit("ERROR: CLICKUP_SPACE_ID env var is required")


def main() -> None:
    client = ClickUpClient(api_token=API_TOKEN)

    # 1. Get folders
    print(f"\n{'=' * 60}")
    print(f" Rebuild '{FIELD_NAME}' field -- Space {SPACE_ID}")
    print(f" Mode: {'DRY RUN' if DRY_RUN else 'LIVE'}")
    print(f"{'=' * 60}\n")

    folders = client.get_folders(SPACE_ID)
    print(f"Found {len(folders)} folders:\n")

    client_folders = []
    for f in folders:
        excluded = f["name"] in EXCLUDED_FOLDERS
        marker = " [SKIP]" if excluded else ""
        list_names = [lst["name"] for lst in f["lists"]]
        print(f" {f['name']}{marker} (lists: {', '.join(list_names) or 'none'})")
        if not excluded:
            client_folders.append(f)

    if not client_folders:
        sys.exit("\nNo client folders found -- nothing to do.")

    option_names = sorted(f["name"] for f in client_folders)
    print(f"\nDropdown options ({len(option_names)}): {', '.join(option_names)}")

    # 2. Build a plan: folder → Overall list → tasks
    plan: list[dict] = []  # {folder_name, list_id, tasks: [ClickUpTask]}
    first_list_id = None

    for f in client_folders:
        overall = next((lst for lst in f["lists"] if lst["name"] == "Overall"), None)
        if overall is None:
            print(f"\n WARNING: '{f['name']}' has no 'Overall' list -- skipping task update")
            continue
        if first_list_id is None:
            first_list_id = overall["id"]
        tasks = client.get_tasks(overall["id"])
        plan.append({"folder_name": f["name"], "list_id": overall["id"], "tasks": tasks})

    # 3. Print summary
    total_tasks = sum(len(p["tasks"]) for p in plan)
    print("\n--- Update Plan ---")
    for p in plan:
        print(f" {p['folder_name']:30s} -> {len(p['tasks']):3d} tasks in list {p['list_id']}")
    print(f" {'TOTAL':30s} -> {total_tasks:3d} tasks")

    if DRY_RUN:
        print("\n** DRY RUN -- no changes made. Unset DRY_RUN to execute. **\n")
        return

    if first_list_id is None:
        sys.exit("\nNo 'Overall' list found in any client folder -- cannot create field.")

    # 4. Create the dropdown field
    print(f"\nCreating '{FIELD_NAME}' dropdown on list {first_list_id} ...")
    type_config = {
        "options": [{"name": name, "color": None} for name in option_names],
    }
    client.create_custom_field(first_list_id, FIELD_NAME, "drop_down", type_config)
    print(" Field created.")

    # Brief pause for ClickUp to propagate
    time.sleep(2)

    # 5. Discover the field UUID + option IDs
    print("Discovering field UUID and option IDs ...")
    field_info = client.discover_field_filter(first_list_id, FIELD_NAME)
    if field_info is None:
        sys.exit(f"\nERROR: Could not find '{FIELD_NAME}' field after creation!")

    field_id = field_info["field_id"]
    option_map = field_info["options"]  # {name: uuid}
    print(f" Field ID: {field_id}")
    print(f" Options: {option_map}")

    # 6. Set Customer field on each task
    updated = 0
    failed = 0
    for p in plan:
        folder_name = p["folder_name"]
        opt_id = option_map.get(folder_name)
        if not opt_id:
            print(f"\n WARNING: No option ID for '{folder_name}' -- skipping")
            continue

        print(f"\nUpdating {len(p['tasks'])} tasks in '{folder_name}' ...")
        for task in p["tasks"]:
            ok = client.set_custom_field_value(task.id, field_id, opt_id)
            if ok:
                updated += 1
            else:
                failed += 1
                print(f" FAILED: task {task.id} ({task.name})")
            # Light rate-limit courtesy
            time.sleep(0.15)

    print(f"\n{'=' * 60}")
    print(f" Done! Updated: {updated} | Failed: {failed}")
    print(f"{'=' * 60}\n")


if __name__ == "__main__":
    main()
@@ -1,144 +0,0 @@
"""Re-run press release pipeline for specific tasks that are missing attachments."""

import logging
import sys
import io

sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding="utf-8")

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(name)s] %(levelname)s: %(message)s",
    datefmt="%H:%M:%S",
    handlers=[logging.StreamHandler(stream=io.TextIOWrapper(sys.stderr.buffer, encoding="utf-8"))],
)
log = logging.getLogger("pr_rerun")

from cheddahbot.config import load_config
from cheddahbot.db import Database
from cheddahbot.llm import LLMAdapter
from cheddahbot.agent import Agent
from cheddahbot.clickup import ClickUpClient


TASKS_TO_RERUN = [
    ("86b8ebfk9", "Advanced Industrial highlights medical grade plastic expertise", "Advanced Industrial"),
]


def bootstrap():
    config = load_config()
    db = Database(config.db_path)
    llm = LLMAdapter(
        default_model=config.chat_model,
        openrouter_key=config.openrouter_api_key,
        ollama_url=config.ollama_url,
        lmstudio_url=config.lmstudio_url,
    )

    agent_cfg = config.agents[0] if config.agents else None
    agent = Agent(config, db, llm, agent_config=agent_cfg)

    try:
        from cheddahbot.memory import MemorySystem
        scope = agent_cfg.memory_scope if agent_cfg else ""
        memory = MemorySystem(config, db, scope=scope)
        agent.set_memory(memory)
    except Exception as e:
        log.warning("Memory not available: %s", e)

    from cheddahbot.tools import ToolRegistry
    tools = ToolRegistry(config, db, agent)
    agent.set_tools(tools)

    try:
        from cheddahbot.skills import SkillRegistry
        skills = SkillRegistry(config.skills_dir)
        agent.set_skills_registry(skills)
    except Exception as e:
        log.warning("Skills not available: %s", e)

    return config, db, agent, tools


def run_task(agent, tools, config, client, task_id, task_name, customer):
    """Execute write_press_releases for a specific task."""
    # Build args matching the field_mapping from config
    args = {
        "topic": task_name,
        "company_name": customer,
        "clickup_task_id": task_id,
    }

    # Also fetch IMSURL from the task
    import httpx as _httpx
    resp = _httpx.get(
        f"https://api.clickup.com/api/v2/task/{task_id}",
        headers={"Authorization": config.clickup.api_token},
        timeout=30.0,
    )
    task_data = resp.json()
    for cf in task_data.get("custom_fields", []):
        if cf["name"] == "IMSURL":
            val = cf.get("value")
            if val:
                args["url"] = val
        elif cf["name"] == "SocialURL":
            val = cf.get("value")
            if val:
                args["branded_url"] = val

    log.info("=" * 70)
    log.info("EXECUTING: %s", task_name)
    log.info(" Task ID: %s", task_id)
    log.info(" Customer: %s", customer)
    log.info(" Args: %s", {k: v for k, v in args.items() if k != "clickup_task_id"})
    log.info("=" * 70)

    try:
        result = tools.execute("write_press_releases", args)

        if result.startswith("Skipped:") or result.startswith("Error:"):
            log.error("Task skipped/errored: %s", result[:500])
            return False

        log.info("Task completed!")
        # Print first 1000 chars of result
        print(f"\n--- Result for {task_name} ---")
        print(result[:1000])
        print("--- End ---\n")
        return True

    except Exception as e:
        log.error("Task failed: %s", e, exc_info=True)
        return False


def main():
    log.info("Bootstrapping CheddahBot...")
    config, db, agent, tools = bootstrap()

    client = ClickUpClient(
        api_token=config.clickup.api_token,
        workspace_id=config.clickup.workspace_id,
        task_type_field_name=config.clickup.task_type_field_name,
    )

    log.info("Will re-run %d tasks", len(TASKS_TO_RERUN))

    results = []
    for i, (task_id, name, customer) in enumerate(TASKS_TO_RERUN):
        log.info("\n>>> Task %d/%d <<<", i + 1, len(TASKS_TO_RERUN))
        success = run_task(agent, tools, config, client, task_id, name, customer)
        results.append((name, success))

    print(f"\n{'=' * 70}")
    print("RESULTS SUMMARY")
    print(f"{'=' * 70}")
    for name, success in results:
        status = "OK" if success else "FAILED"
        print(f" [{status}] {name}")


if __name__ == "__main__":
    main()
@@ -1,241 +0,0 @@
"""Run the press-release pipeline for up to N ClickUp tasks.

Usage:
    uv run python scripts/run_pr_pipeline.py            # discover + execute up to 3
    uv run python scripts/run_pr_pipeline.py --dry-run  # discover only, don't execute
    uv run python scripts/run_pr_pipeline.py --max 1    # execute only 1 task
"""

import argparse
import logging
import sys
from datetime import UTC, datetime

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(name)s] %(levelname)s: %(message)s",
    datefmt="%H:%M:%S",
)
log = logging.getLogger("pr_pipeline")

# ── Bootstrap CheddahBot (config, db, agent, tools) ─────────────────────

from cheddahbot.config import load_config
from cheddahbot.db import Database
from cheddahbot.llm import LLMAdapter
from cheddahbot.agent import Agent
from cheddahbot.clickup import ClickUpClient


def bootstrap():
    """Set up config, db, agent, and tool registry — same as __main__.py."""
    config = load_config()
    db = Database(config.db_path)
    llm = LLMAdapter(
        default_model=config.chat_model,
        openrouter_key=config.openrouter_api_key,
        ollama_url=config.ollama_url,
        lmstudio_url=config.lmstudio_url,
    )

    agent_cfg = config.agents[0] if config.agents else None
    agent = Agent(config, db, llm, agent_config=agent_cfg)

    # Memory
    try:
        from cheddahbot.memory import MemorySystem
        scope = agent_cfg.memory_scope if agent_cfg else ""
        memory = MemorySystem(config, db, scope=scope)
        agent.set_memory(memory)
    except Exception as e:
        log.warning("Memory not available: %s", e)

    # Tools
    from cheddahbot.tools import ToolRegistry
    tools = ToolRegistry(config, db, agent)
    agent.set_tools(tools)

    # Skills
    try:
        from cheddahbot.skills import SkillRegistry
        skills = SkillRegistry(config.skills_dir)
        agent.set_skills_registry(skills)
    except Exception as e:
        log.warning("Skills not available: %s", e)

    return config, db, agent, tools


def discover_pr_tasks(config):
    """Poll ClickUp for Press Release tasks — same logic as scheduler._poll_clickup()."""
    client = ClickUpClient(
        api_token=config.clickup.api_token,
        workspace_id=config.clickup.workspace_id,
        task_type_field_name=config.clickup.task_type_field_name,
    )
    space_id = config.clickup.space_id
    skill_map = config.clickup.skill_map

    if not space_id:
        log.error("No space_id configured")
        return [], client

    # Discover field filter (Work Category UUID + options)
    list_ids = client.get_list_ids_from_space(space_id)
    if not list_ids:
        log.error("No lists found in space %s", space_id)
        return [], client

    first_list = next(iter(list_ids))
    field_filter = client.discover_field_filter(
        first_list, config.clickup.task_type_field_name
    )

    # Build custom fields filter for API query
    custom_fields_filter = None
    if field_filter and field_filter.get("options"):
        import json
        field_id = field_filter["field_id"]
        options = field_filter["options"]
        # Only Press Release
        pr_opt_id = options.get("Press Release")
        if pr_opt_id:
            custom_fields_filter = json.dumps(
                [{"field_id": field_id, "operator": "ANY", "value": [pr_opt_id]}]
            )
            log.info("Filtering for Press Release option ID: %s", pr_opt_id)
        else:
            log.warning("'Press Release' not found in Work Category options: %s", list(options.keys()))
            return [], client

    # Due date window (3 weeks)
    now_ms = int(datetime.now(UTC).timestamp() * 1000)
    due_date_lt = now_ms + (3 * 7 * 24 * 60 * 60 * 1000)

    tasks = client.get_tasks_from_space(
        space_id,
        statuses=config.clickup.poll_statuses,
        due_date_lt=due_date_lt,
        custom_fields=custom_fields_filter,
    )

    # Client-side filter: must be Press Release + have due date in window
    pr_tasks = []
    for task in tasks:
        if task.task_type != "Press Release":
            continue
        if not task.due_date:
            continue
        try:
            if int(task.due_date) > due_date_lt:
                continue
        except (ValueError, TypeError):
            continue
        pr_tasks.append(task)

    return pr_tasks, client


def execute_task(agent, tools, config, client, task):
    """Execute a single PR task — same logic as scheduler._execute_task()."""
    skill_map = config.clickup.skill_map
    mapping = skill_map.get("Press Release", {})
    tool_name = mapping.get("tool", "write_press_releases")

    task_id = task.id

    # Build tool args from field mapping
    field_mapping = mapping.get("field_mapping", {})
    args = {}
    for tool_param, source in field_mapping.items():
        if source == "task_name":
            args[tool_param] = task.name
        elif source == "task_description":
            args[tool_param] = task.custom_fields.get("description", "")
        else:
            args[tool_param] = task.custom_fields.get(source, "")

    args["clickup_task_id"] = task_id

    log.info("=" * 70)
    log.info("EXECUTING: %s", task.name)
    log.info(" Task ID: %s", task_id)
    log.info(" Tool: %s", tool_name)
    log.info(" Args: %s", {k: v for k, v in args.items() if k != "clickup_task_id"})
    log.info("=" * 70)

    # Move to "automation underway"
    client.update_task_status(task_id, config.clickup.automation_status)

    try:
        result = tools.execute(tool_name, args)

        if result.startswith("Skipped:") or result.startswith("Error:"):
            log.error("Task skipped/errored: %s", result[:500])
            client.add_comment(
                task_id,
                f"⚠️ CheddahBot could not execute this task.\n\n{result[:2000]}",
            )
            client.update_task_status(task_id, config.clickup.error_status)
            return False

        log.info("Task completed successfully!")
        log.info("Result preview:\n%s", result[:1000])
        return True

    except Exception as e:
        log.error("Task failed with exception: %s", e, exc_info=True)
        client.add_comment(
            task_id,
            f"❌ CheddahBot failed to complete this task.\n\nError: {str(e)[:2000]}",
        )
        client.update_task_status(task_id, config.clickup.error_status)
        return False


def main():
    parser = argparse.ArgumentParser(description="Run PR pipeline from ClickUp")
    parser.add_argument("--dry-run", action="store_true", help="Discover only, don't execute")
    parser.add_argument("--max", type=int, default=3, help="Max tasks to execute (default: 3)")
    args = parser.parse_args()

    log.info("Bootstrapping CheddahBot...")
    config, db, agent, tools = bootstrap()

    log.info("Polling ClickUp for Press Release tasks...")
    pr_tasks, client = discover_pr_tasks(config)

    if not pr_tasks:
        log.info("No Press Release tasks found in statuses %s", config.clickup.poll_statuses)
        return

    log.info("Found %d Press Release task(s):", len(pr_tasks))
    for i, task in enumerate(pr_tasks):
        status_str = f"status={task.status}" if hasattr(task, "status") else ""
        log.info(" %d. %s (id=%s) %s", i + 1, task.name, task.id, status_str)
        log.info(" Custom fields: %s", task.custom_fields)

    if args.dry_run:
        log.info("Dry run — not executing. Use without --dry-run to execute.")
        return

    # Execute up to --max tasks
    to_run = pr_tasks[: args.max]
    log.info("Will execute %d task(s) (max=%d)", len(to_run), args.max)

    results = []
    for i, task in enumerate(to_run):
        log.info("\n>>> Task %d/%d <<<", i + 1, len(to_run))
        success = execute_task(agent, tools, config, client, task)
        results.append((task.name, success))

    log.info("\n" + "=" * 70)
    log.info("RESULTS SUMMARY")
    log.info("=" * 70)
    for name, success in results:
        status = "OK" if success else "FAILED"
        log.info(" [%s] %s", status, name)


if __name__ == "__main__":
    main()
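For reference, the `custom_fields` filter that `discover_pr_tasks` builds above is just a one-element JSON array handed to the ClickUp tasks endpoint. A sketch with hypothetical UUIDs (the real ones come from `discover_field_filter` at runtime):

```python
import json

# Hypothetical IDs for illustration; the real UUIDs are discovered at runtime.
field_id = "field-uuid-1234"    # the 'Work Category' dropdown field
pr_opt_id = "option-uuid-5678"  # its 'Press Release' option

custom_fields_filter = json.dumps(
    [{"field_id": field_id, "operator": "ANY", "value": [pr_opt_id]}]
)
print(custom_fields_filter)
# [{"field_id": "field-uuid-1234", "operator": "ANY", "value": ["option-uuid-5678"]}]
```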
@@ -1,29 +0,0 @@
---
name: create-task
description: Create new ClickUp tasks for clients. Use when the user asks to create, add, or make a new task.
tools: [clickup_create_task]
agents: [default]
---

# Create ClickUp Task

Creates a new task in the client's "Overall" list in ClickUp.

## Required Information

- **name**: The task name (e.g., "Write blog post about AI trends")
- **client**: The client/folder name in ClickUp (e.g., "Acme Corp")

## Optional Information

- **work_category**: The work category dropdown value (e.g., "Press Release", "Link Building", "Content Creation", "On Page Optimization")
- **description**: Task description (supports markdown)
- **status**: Initial status (default: "to do")

## Examples

"Create a press release task for Acme Corp about their new product launch"
-> name: "Press Release - New Product Launch", client: "Acme Corp", work_category: "Press Release"

"Add a link building task for Widget Co"
-> name: "Link Building Campaign", client: "Widget Co", work_category: "Link Building"
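To make the skill's contract concrete, here is a hedged sketch of the tool call the examples above imply. The argument names mirror the Required/Optional sections; the `tools.execute` entry point is borrowed from the pipeline scripts earlier in this diff and is an assumption for this skill:

```python
# Hypothetical invocation; argument names follow the skill doc above.
args = {
    "name": "Press Release - New Product Launch",  # required
    "client": "Acme Corp",                         # required: ClickUp folder name
    "work_category": "Press Release",              # optional dropdown value
    "description": "Announce the spring launch.",  # optional, markdown supported
    "status": "to do",                             # optional; this is the default
}
result = tools.execute("clickup_create_task", args)  # assumed ToolRegistry call
```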
@@ -18,10 +18,9 @@ When the user provides a press release topic, follow this workflow:
 - Each headline must be:
   - Maximum 70 characters
   - Title case
-  - News-wire style (not promotional)
+  - News-focused (not promotional)
   - Free of location keywords, superlatives (best/top/leading/#1), and questions
-  - MUST NOT fabricate events, expansions, milestones, or demand claims
-  - Unless the topic explicitly signals actual news (e.g. "Actual News", "New Product", "Launch"), assume the company ALREADY offers this — use awareness verbs like "Highlights", "Reinforces", "Delivers", "Showcases", NOT announcement verbs like "Announces", "Launches", "Expands"
+  - Not make up information that isn't true.
 - Present all 7 titles to an AI agent to judge which is best. This can be decided by looking at titles on Press Advantage for other businesses, and seeing how closely the headline follows the instructions.

 ** EXAMPLE GREAT HEADLINES: **

@@ -76,10 +75,8 @@ When generating the 7 headline options:

 ### Content Type
 - This is a PRESS RELEASE, not an advertorial, blog post, or promotional content
-- Must be written in objective, journalistic style
-- By default this is an AWARENESS piece — the company already offers this capability. Frame it as highlighting/reinforcing existing offerings, NOT as announcing something new
-- Only use announcement language (announces, launches, introduces) when the topic explicitly signals actual news (e.g. topic contains "Actual News", "New Product", "Launch")
-- Do NOT fabricate events, expansions, milestones, or demand claims. If nothing new happened, do not pretend it did.
+- Must be objective news announcement written in journalistic style
+- Must announce actual NEWS (about products/services, milestones, awards, reactions to current events)
 - Must read like it could appear verbatim in a newspaper

 ### Writing Style - MANDATORY

@@ -206,7 +203,7 @@ Before finalizing, verify:
 3. Include 1-2 executive quotes for human perspective
 4. Provide context about the company/organization
 5. Explain significance and impact
-6. Do NOT include an "About" section or company boilerplate — Press Advantage adds this automatically
+6. End with company boilerplate and contact information
 7. Write in inverted pyramid style - can be cut from bottom up

 ## Tone Guidelines
start.sh
@@ -1,3 +0,0 @@
#!/usr/bin/env bash
cd "$(dirname "$0")"
exec uv run python -m cheddahbot
@@ -12,7 +12,6 @@ import pytest

 from cheddahbot.config import AutoCoraConfig, ClickUpConfig, Config
 from cheddahbot.tools.autocora import (
-    _find_qualifying_tasks_sweep,
     _group_by_keyword,
     _make_job_id,
     _parse_result,

@@ -37,7 +36,6 @@ class FakeTask:
     task_type: str = "Content Creation"
     due_date: str = ""
     custom_fields: dict[str, Any] = field(default_factory=dict)
-    tags: list[str] = field(default_factory=list)


 @pytest.fixture()

@@ -149,12 +147,11 @@ class TestGroupByKeyword:
         assert len(groups) == 0
         assert any("missing Keyword" in a for a in alerts)

-    def test_missing_imsurl_uses_fallback(self):
-        """Missing IMSURL gets a fallback blank URL."""
+    def test_missing_imsurl(self):
         tasks = [FakeTask(id="t1", name="No URL", custom_fields={"Keyword": "test"})]
         groups, alerts = _group_by_keyword(tasks, tasks)
-        assert len(groups) == 1
-        assert groups["test"]["url"] == "https://seotoollab.com/blank.html"
+        assert len(groups) == 0
+        assert any("missing IMSURL" in a for a in alerts)

     def test_sibling_tasks(self):
         """Tasks sharing a keyword from all_tasks should be included."""

@@ -204,7 +201,9 @@ class TestSubmitAutocoraJobs:
         monkeypatch.setattr(
             "cheddahbot.tools.autocora._find_qualifying_tasks", lambda *a, **kw: [task]
         )
+        monkeypatch.setattr(
+            "cheddahbot.tools.autocora._find_all_todo_tasks", lambda *a, **kw: [task]
+        )

         result = submit_autocora_jobs(target_date="2025-01-01", ctx=ctx)
         assert "Submitted 1 job" in result

@@ -220,8 +219,8 @@ class TestSubmitAutocoraJobs:
         assert job_data["url"] == "http://example.com"
         assert job_data["task_ids"] == ["t1"]

-    def test_submit_writes_job_with_task_ids(self, ctx, monkeypatch):
-        """Job file contains task_ids for the result poller."""
+    def test_submit_tracks_kv(self, ctx, monkeypatch):
+        """KV store tracks submitted jobs."""
         task = FakeTask(
             id="t1",
             name="Test",

@@ -231,18 +230,20 @@ class TestSubmitAutocoraJobs:
         monkeypatch.setattr(
             "cheddahbot.tools.autocora._find_qualifying_tasks", lambda *a, **kw: [task]
         )
+        monkeypatch.setattr(
+            "cheddahbot.tools.autocora._find_all_todo_tasks", lambda *a, **kw: [task]
+        )

         submit_autocora_jobs(target_date="2025-01-01", ctx=ctx)

-        jobs_dir = Path(ctx["config"].autocora.jobs_dir)
-        job_files = list(jobs_dir.glob("job-*.json"))
-        assert len(job_files) == 1
-        data = json.loads(job_files[0].read_text())
-        assert "t1" in data["task_ids"]
+        raw = ctx["db"].kv_get("autocora:job:test keyword")
+        assert raw is not None
+        state = json.loads(raw)
+        assert state["status"] == "submitted"
+        assert "t1" in state["task_ids"]

     def test_duplicate_prevention(self, ctx, monkeypatch):
-        """Already-submitted keywords are skipped (job file exists)."""
+        """Already-submitted keywords are skipped."""
         task = FakeTask(
             id="t1",
             name="Test",

@@ -252,12 +253,14 @@ class TestSubmitAutocoraJobs:
         monkeypatch.setattr(
             "cheddahbot.tools.autocora._find_qualifying_tasks", lambda *a, **kw: [task]
         )
+        monkeypatch.setattr(
+            "cheddahbot.tools.autocora._find_all_todo_tasks", lambda *a, **kw: [task]
+        )

         # First submit
         submit_autocora_jobs(target_date="2025-01-01", ctx=ctx)

-        # Second submit — should skip (job file already exists)
+        # Second submit — should skip
         result = submit_autocora_jobs(target_date="2025-01-01", ctx=ctx)
         assert "Skipped 1" in result

@@ -272,13 +275,15 @@ class TestSubmitAutocoraJobs:
         monkeypatch.setattr(
             "cheddahbot.tools.autocora._find_qualifying_tasks", lambda *a, **kw: [task]
         )
+        monkeypatch.setattr(
+            "cheddahbot.tools.autocora._find_all_todo_tasks", lambda *a, **kw: [task]
+        )

         result = submit_autocora_jobs(target_date="2025-01-01", ctx=ctx)
         assert "missing Keyword" in result

-    def test_missing_imsurl_uses_fallback(self, ctx, monkeypatch):
-        """Tasks without IMSURL use fallback URL and still submit."""
+    def test_missing_imsurl_alert(self, ctx, monkeypatch):
+        """Tasks without IMSURL field produce alerts."""
         task = FakeTask(
             id="t1",
             name="No URL Task",

@@ -288,10 +293,12 @@ class TestSubmitAutocoraJobs:
         monkeypatch.setattr(
             "cheddahbot.tools.autocora._find_qualifying_tasks", lambda *a, **kw: [task]
         )
+        monkeypatch.setattr(
+            "cheddahbot.tools.autocora._find_all_todo_tasks", lambda *a, **kw: [task]
+        )

         result = submit_autocora_jobs(target_date="2025-01-01", ctx=ctx)
-        assert "Submitted 1 job" in result
+        assert "missing IMSURL" in result


 # ---------------------------------------------------------------------------

@@ -305,18 +312,33 @@ class TestPollAutocoraResults:
         result = poll_autocora_results(ctx=ctx)
         assert "disabled" in result.lower()

-    def test_no_result_files(self, ctx):
+    def test_no_pending(self, ctx):
         result = poll_autocora_results(ctx=ctx)
-        assert "No result files" in result
+        assert "No pending" in result

     def test_success_json(self, ctx, monkeypatch):
-        """JSON SUCCESS result updates ClickUp and moves result file."""
+        """JSON SUCCESS result updates KV and ClickUp."""
+        db = ctx["db"]
         results_dir = Path(ctx["config"].autocora.results_dir)

-        # Write result file directly (no KV needed)
-        result_data = {"status": "SUCCESS", "task_ids": ["t1", "t2"], "keyword": "test keyword"}
-        (results_dir / "job-123-test.result").write_text(json.dumps(result_data))
+        # Set up submitted job in KV
+        job_id = "job-123-test"
+        kv_key = "autocora:job:test keyword"
+        db.kv_set(
+            kv_key,
+            json.dumps({
+                "status": "submitted",
+                "job_id": job_id,
+                "keyword": "test keyword",
+                "task_ids": ["t1", "t2"],
+            }),
+        )
+
+        # Write result file
+        result_data = {"status": "SUCCESS", "task_ids": ["t1", "t2"]}
+        (results_dir / f"{job_id}.result").write_text(json.dumps(result_data))
+
+        # Mock ClickUp client
         mock_client = MagicMock()
         monkeypatch.setattr(
             "cheddahbot.tools.autocora._get_clickup_client", lambda ctx: mock_client

@@ -325,27 +347,39 @@ class TestPollAutocoraResults:
         result = poll_autocora_results(ctx=ctx)
         assert "SUCCESS: test keyword" in result

+        # Verify KV updated
+        state = json.loads(db.kv_get(kv_key))
+        assert state["status"] == "completed"
+
         # Verify ClickUp calls
         assert mock_client.update_task_status.call_count == 2
         mock_client.update_task_status.assert_any_call("t1", "running cora")
         mock_client.update_task_status.assert_any_call("t2", "running cora")
         assert mock_client.add_comment.call_count == 2

-        # Verify result file moved to processed/
-        assert not (results_dir / "job-123-test.result").exists()
-        assert (results_dir / "processed" / "job-123-test.result").exists()
-
     def test_failure_json(self, ctx, monkeypatch):
-        """JSON FAILURE result updates ClickUp with error."""
+        """JSON FAILURE result updates KV and ClickUp with error."""
+        db = ctx["db"]
         results_dir = Path(ctx["config"].autocora.results_dir)

+        job_id = "job-456-fail"
+        kv_key = "autocora:job:fail keyword"
+        db.kv_set(
+            kv_key,
+            json.dumps({
+                "status": "submitted",
+                "job_id": job_id,
+                "keyword": "fail keyword",
+                "task_ids": ["t3"],
+            }),
+        )
+
         result_data = {
             "status": "FAILURE",
             "reason": "Cora not running",
             "task_ids": ["t3"],
-            "keyword": "fail keyword",
         }
-        (results_dir / "job-456-fail.result").write_text(json.dumps(result_data))
+        (results_dir / f"{job_id}.result").write_text(json.dumps(result_data))

         mock_client = MagicMock()
         monkeypatch.setattr(

@@ -356,14 +390,31 @@ class TestPollAutocoraResults:
         assert "FAILURE: fail keyword" in result
         assert "Cora not running" in result

+        state = json.loads(db.kv_get(kv_key))
+        assert state["status"] == "failed"
+        assert state["error"] == "Cora not running"
+
         mock_client.update_task_status.assert_called_once_with("t3", "error")

     def test_legacy_plain_text(self, ctx, monkeypatch):
-        """Legacy plain-text SUCCESS result still works (keyword from filename)."""
+        """Legacy plain-text SUCCESS result still works."""
+        db = ctx["db"]
         results_dir = Path(ctx["config"].autocora.results_dir)

+        job_id = "job-789-legacy"
+        kv_key = "autocora:job:legacy kw"
+        db.kv_set(
+            kv_key,
+            json.dumps({
+                "status": "submitted",
+                "job_id": job_id,
+                "keyword": "legacy kw",
+                "task_ids": ["t5"],
+            }),
+        )
+
         # Legacy format — plain text, no JSON
-        (results_dir / "job-789-legacy-kw.result").write_text("SUCCESS")
+        (results_dir / f"{job_id}.result").write_text("SUCCESS")

         mock_client = MagicMock()
         monkeypatch.setattr(

@@ -371,17 +422,31 @@ class TestPollAutocoraResults:
         )

         result = poll_autocora_results(ctx=ctx)
-        assert "SUCCESS:" in result
+        assert "SUCCESS: legacy kw" in result

-        # No task_ids in legacy format, so no ClickUp calls
-        mock_client.update_task_status.assert_not_called()
+        # task_ids come from KV fallback
+        mock_client.update_task_status.assert_called_once_with("t5", "running cora")

-    def test_task_ids_from_result_file(self, ctx, monkeypatch):
-        """task_ids from result file drive ClickUp updates."""
+    def test_task_ids_from_result_preferred(self, ctx, monkeypatch):
+        """task_ids from result file take precedence over KV."""
+        db = ctx["db"]
         results_dir = Path(ctx["config"].autocora.results_dir)

-        result_data = {"status": "SUCCESS", "task_ids": ["new_t1", "new_t2"], "keyword": "pref kw"}
-        (results_dir / "job-100-pref.result").write_text(json.dumps(result_data))
+        job_id = "job-100-pref"
+        kv_key = "autocora:job:pref kw"
+        db.kv_set(
+            kv_key,
+            json.dumps({
+                "status": "submitted",
+                "job_id": job_id,
+                "keyword": "pref kw",
+                "task_ids": ["old_t1"],  # KV has old IDs
+            }),
+        )
+
+        # Result has updated task_ids
+        result_data = {"status": "SUCCESS", "task_ids": ["new_t1", "new_t2"]}
+        (results_dir / f"{job_id}.result").write_text(json.dumps(result_data))

         mock_client = MagicMock()
         monkeypatch.setattr(

@@ -390,107 +455,25 @@ class TestPollAutocoraResults:

         poll_autocora_results(ctx=ctx)

+        # Should use result file task_ids, not KV
         calls = [c.args for c in mock_client.update_task_status.call_args_list]
         assert ("new_t1", "running cora") in calls
         assert ("new_t2", "running cora") in calls
+        assert ("old_t1", "running cora") not in calls

-
-# ---------------------------------------------------------------------------
-# Sweep tests
-# ---------------------------------------------------------------------------
-
-
-class TestFindQualifyingTasksSweep:
-    """Test the multi-pass sweep logic."""
-
-    def _make_client(self, tasks):
-        client = MagicMock()
-        client.get_tasks_from_space.return_value = tasks
-        return client
-
-    def _make_config(self):
-        config = MagicMock()
-        config.clickup.space_id = "sp1"
-        return config
-
-    def test_finds_tasks_due_today(self):
-        from datetime import UTC, datetime
-
-        now = datetime.now(UTC)
-        today_ms = int(now.replace(hour=12).timestamp() * 1000)
-        task = FakeTask(id="t1", name="Today", due_date=str(today_ms))
-        client = self._make_client([task])
-        config = self._make_config()
-
-        result = _find_qualifying_tasks_sweep(client, config, ["Content Creation"])
-        assert any(t.id == "t1" for t in result)
-
-    def test_finds_overdue_with_month_tag(self):
-        from datetime import UTC, datetime
-
-        now = datetime.now(UTC)
-        month_tag = now.strftime("%b%y").lower()
-        # Due 3 days ago
-        overdue_ms = int((now.timestamp() - 3 * 86400) * 1000)
-        task = FakeTask(
-            id="t2", name="Overdue", due_date=str(overdue_ms), tags=[month_tag]
-        )
-        client = self._make_client([task])
-        config = self._make_config()
-
-        result = _find_qualifying_tasks_sweep(client, config, ["Content Creation"])
-        assert any(t.id == "t2" for t in result)
-
-    def test_finds_last_month_tagged(self):
-        from datetime import UTC, datetime
-
-        now = datetime.now(UTC)
-        if now.month == 1:
-            last = now.replace(year=now.year - 1, month=12)
-        else:
-            last = now.replace(month=now.month - 1)
-        last_tag = last.strftime("%b%y").lower()
-        # No due date needed for month-tag pass
-        task = FakeTask(id="t3", name="Last Month", tags=[last_tag])
-        client = self._make_client([task])
-        config = self._make_config()
-
-        result = _find_qualifying_tasks_sweep(client, config, ["Content Creation"])
-        assert any(t.id == "t3" for t in result)
-
-    def test_finds_lookahead(self):
-        from datetime import UTC, datetime
-
-        now = datetime.now(UTC)
-        tomorrow_ms = int((now.timestamp() + 36 * 3600) * 1000)
-        task = FakeTask(id="t4", name="Tomorrow", due_date=str(tomorrow_ms))
-        client = self._make_client([task])
-        config = self._make_config()
-
-        result = _find_qualifying_tasks_sweep(client, config, ["Content Creation"])
-        assert any(t.id == "t4" for t in result)
-
-    def test_deduplicates_across_passes(self):
-        from datetime import UTC, datetime
-
-        now = datetime.now(UTC)
-        month_tag = now.strftime("%b%y").lower()
-        today_ms = int(now.replace(hour=12).timestamp() * 1000)
-        # Task is due today AND has month tag — should only appear once
-        task = FakeTask(
-            id="t5", name="Multi", due_date=str(today_ms), tags=[month_tag]
-        )
-        client = self._make_client([task])
-        config = self._make_config()
-
-        result = _find_qualifying_tasks_sweep(client, config, ["Content Creation"])
-        ids = [t.id for t in result]
-        assert ids.count("t5") == 1
-
-    def test_empty_space_id(self):
-        config = self._make_config()
-        config.clickup.space_id = ""
-        client = self._make_client([])
-
-        result = _find_qualifying_tasks_sweep(client, config, ["Content Creation"])
-        assert result == []
+    def test_still_pending(self, ctx):
+        """Jobs without result files show as still pending."""
+        db = ctx["db"]
+        db.kv_set(
+            "autocora:job:waiting",
+            json.dumps({
+                "status": "submitted",
+                "job_id": "job-999-wait",
+                "keyword": "waiting",
+                "task_ids": ["t99"],
+            }),
+        )
+
+        result = poll_autocora_results(ctx=ctx)
+        assert "Still pending" in result
+        assert "waiting" in result
|
||||||
|
|
||||||
assert result is None
|
assert result is None
|
||||||
client.close()
|
client.close()
|
||||||
|
|
||||||
@respx.mock
|
|
||||||
def test_create_task(self):
|
|
||||||
respx.post(f"{BASE_URL}/list/list_1/task").mock(
|
|
||||||
return_value=httpx.Response(
|
|
||||||
200,
|
|
||||||
json={
|
|
||||||
"id": "new_task_1",
|
|
||||||
"name": "Test Task",
|
|
||||||
"url": "https://app.clickup.com/t/new_task_1",
|
|
||||||
},
|
|
||||||
)
|
|
||||||
)
|
|
||||||
|
|
||||||
client = ClickUpClient(api_token="pk_test")
|
|
||||||
result = client.create_task(
|
|
||||||
list_id="list_1",
|
|
||||||
name="Test Task",
|
|
||||||
description="A test description",
|
|
||||||
status="to do",
|
|
||||||
)
|
|
||||||
|
|
||||||
assert result["id"] == "new_task_1"
|
|
||||||
assert result["url"] == "https://app.clickup.com/t/new_task_1"
|
|
||||||
request = respx.calls.last.request
|
|
||||||
import json
|
|
||||||
|
|
||||||
body = json.loads(request.content)
|
|
||||||
assert body["name"] == "Test Task"
|
|
||||||
assert body["description"] == "A test description"
|
|
||||||
assert body["status"] == "to do"
|
|
||||||
client.close()
|
|
||||||
|
|
||||||
@respx.mock
|
|
||||||
def test_create_task_with_optional_fields(self):
|
|
||||||
respx.post(f"{BASE_URL}/list/list_1/task").mock(
|
|
||||||
return_value=httpx.Response(
|
|
||||||
200,
|
|
||||||
json={"id": "new_task_2", "name": "Tagged Task", "url": ""},
|
|
||||||
)
|
|
||||||
)
|
|
||||||
|
|
||||||
client = ClickUpClient(api_token="pk_test")
|
|
||||||
result = client.create_task(
|
|
||||||
list_id="list_1",
|
|
||||||
name="Tagged Task",
|
|
||||||
due_date=1740000000000,
|
|
||||||
tags=["urgent", "mar26"],
|
|
||||||
custom_fields=[{"id": "cf_1", "value": "opt_1"}],
|
|
||||||
)
|
|
||||||
|
|
||||||
assert result["id"] == "new_task_2"
|
|
||||||
import json
|
|
||||||
|
|
||||||
body = json.loads(respx.calls.last.request.content)
|
|
||||||
assert body["due_date"] == 1740000000000
|
|
||||||
assert body["tags"] == ["urgent", "mar26"]
|
|
||||||
assert body["custom_fields"] == [{"id": "cf_1", "value": "opt_1"}]
|
|
||||||
client.close()
|
|
||||||
|
|
||||||
@respx.mock
|
|
||||||
def test_find_list_in_folder_found(self):
|
|
||||||
respx.get(f"{BASE_URL}/space/space_1/folder").mock(
|
|
||||||
return_value=httpx.Response(
|
|
||||||
200,
|
|
||||||
json={
|
|
||||||
"folders": [
|
|
||||||
{
|
|
||||||
"id": "f1",
|
|
||||||
"name": "Acme Corp",
|
|
||||||
"lists": [
|
|
||||||
{"id": "list_overall", "name": "Overall"},
|
|
||||||
{"id": "list_archive", "name": "Archive"},
|
|
||||||
],
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"id": "f2",
|
|
||||||
"name": "Widget Co",
|
|
||||||
"lists": [
|
|
||||||
{"id": "list_w_overall", "name": "Overall"},
|
|
||||||
],
|
|
||||||
},
|
|
||||||
]
|
|
||||||
},
|
|
||||||
)
|
|
||||||
)
|
|
||||||
|
|
||||||
client = ClickUpClient(api_token="pk_test")
|
|
||||||
result = client.find_list_in_folder("space_1", "Acme Corp")
|
|
||||||
assert result == "list_overall"
|
|
||||||
client.close()
|
|
||||||
|
|
||||||
@respx.mock
|
|
||||||
def test_find_list_in_folder_case_insensitive(self):
|
|
||||||
respx.get(f"{BASE_URL}/space/space_1/folder").mock(
|
|
||||||
return_value=httpx.Response(
|
|
||||||
200,
|
|
||||||
json={
|
|
||||||
"folders": [
|
|
||||||
{
|
|
||||||
"id": "f1",
|
|
||||||
"name": "Acme Corp",
|
|
||||||
"lists": [{"id": "list_overall", "name": "Overall"}],
|
|
||||||
},
|
|
||||||
]
|
|
||||||
},
|
|
||||||
)
|
|
||||||
)
|
|
||||||
|
|
||||||
client = ClickUpClient(api_token="pk_test")
|
|
||||||
result = client.find_list_in_folder("space_1", "acme corp")
|
|
||||||
assert result == "list_overall"
|
|
||||||
client.close()
|
|
||||||
|
|
||||||
@respx.mock
|
|
||||||
def test_find_list_in_folder_not_found(self):
|
|
||||||
respx.get(f"{BASE_URL}/space/space_1/folder").mock(
|
|
||||||
return_value=httpx.Response(
|
|
||||||
200,
|
|
||||||
json={
|
|
||||||
"folders": [
|
|
||||||
{
|
|
||||||
"id": "f1",
|
|
||||||
"name": "Acme Corp",
|
|
||||||
"lists": [{"id": "list_1", "name": "Overall"}],
|
|
||||||
},
|
|
||||||
]
|
|
||||||
},
|
|
||||||
)
|
|
||||||
)
|
|
||||||
|
|
||||||
client = ClickUpClient(api_token="pk_test")
|
|
||||||
result = client.find_list_in_folder("space_1", "NonExistent Client")
|
|
||||||
assert result is None
|
|
||||||
client.close()
|
|
||||||
|
|
||||||
@respx.mock
|
|
||||||
def test_set_custom_field_smart_dropdown(self):
|
|
||||||
"""Resolves dropdown option name to UUID automatically."""
|
|
||||||
respx.get(f"{BASE_URL}/list/list_1/field").mock(
|
|
||||||
return_value=httpx.Response(
|
|
||||||
200,
|
|
||||||
json={
|
|
||||||
"fields": [
|
|
||||||
{
|
|
||||||
"id": "cf_lb",
|
|
||||||
"name": "LB Method",
|
|
||||||
"type": "drop_down",
|
|
||||||
"type_config": {
|
|
||||||
"options": [
|
|
||||||
{"id": "opt_cora", "name": "Cora Backlinks"},
|
|
||||||
{"id": "opt_manual", "name": "Manual"},
|
|
||||||
]
|
|
||||||
},
|
|
||||||
},
|
|
||||||
]
|
|
||||||
},
|
|
||||||
)
|
|
||||||
)
|
|
||||||
respx.post(f"{BASE_URL}/task/t1/field/cf_lb").mock(
|
|
||||||
return_value=httpx.Response(200, json={})
|
|
||||||
)
|
|
||||||
|
|
||||||
client = ClickUpClient(api_token="pk_test")
|
|
||||||
result = client.set_custom_field_smart(
|
|
||||||
"t1", "list_1", "LB Method", "Cora Backlinks"
|
|
||||||
)
|
|
||||||
assert result is True
|
|
||||||
import json
|
|
||||||
|
|
||||||
body = json.loads(respx.calls.last.request.content)
|
|
||||||
assert body["value"] == "opt_cora"
|
|
||||||
client.close()
|
|
||||||
|
|
||||||
@respx.mock
|
|
||||||
def test_set_custom_field_smart_text(self):
|
|
||||||
"""Passes text field values through without resolution."""
|
|
||||||
respx.get(f"{BASE_URL}/list/list_1/field").mock(
|
|
||||||
return_value=httpx.Response(
|
|
||||||
200,
|
|
||||||
json={
|
|
||||||
"fields": [
|
|
||||||
{
|
|
||||||
"id": "cf_kw",
|
|
||||||
"name": "Keyword",
|
|
||||||
"type": "short_text",
|
|
||||||
},
|
|
||||||
]
|
|
||||||
},
|
|
||||||
)
|
|
||||||
)
|
|
||||||
respx.post(f"{BASE_URL}/task/t1/field/cf_kw").mock(
|
|
||||||
return_value=httpx.Response(200, json={})
|
|
||||||
)
|
|
||||||
|
|
||||||
client = ClickUpClient(api_token="pk_test")
|
|
||||||
result = client.set_custom_field_smart(
|
|
||||||
"t1", "list_1", "Keyword", "shaft manufacturing"
|
|
||||||
)
|
|
||||||
assert result is True
|
|
||||||
import json
|
|
||||||
|
|
||||||
body = json.loads(respx.calls.last.request.content)
|
|
||||||
assert body["value"] == "shaft manufacturing"
|
|
||||||
client.close()
|
|
||||||
|
|
|
||||||
|
|
@@ -1,180 +1,147 @@
-"""Tests for the ClickUp chat tools (API-backed, no KV store)."""
+"""Tests for the ClickUp chat tools."""
 
 from __future__ import annotations
 
-from dataclasses import dataclass, field
-from unittest.mock import MagicMock, patch
+import json
 
 from cheddahbot.tools.clickup_tool import (
     clickup_list_tasks,
-    clickup_query_tasks,
+    clickup_reset_all,
     clickup_reset_task,
     clickup_task_status,
-    get_active_tasks,
 )
 
 
-@dataclass
-class FakeTask:
-    id: str = "t1"
-    name: str = "Test Task"
-    status: str = "to do"
-    task_type: str = "Press Release"
-    url: str = "https://app.clickup.com/t/t1"
-    due_date: str = ""
-    date_updated: str = ""
-    tags: list = field(default_factory=list)
-    custom_fields: dict = field(default_factory=dict)
-
-
-def _make_ctx():
-    config = MagicMock()
-    config.clickup.api_token = "test-token"
-    config.clickup.workspace_id = "ws1"
-    config.clickup.space_id = "sp1"
-    config.clickup.task_type_field_name = "Work Category"
-    config.clickup.automation_status = "automation underway"
-    config.clickup.review_status = "internal review"
-    config.clickup.error_status = "error"
-    config.clickup.poll_statuses = ["to do"]
-    return {"config": config, "db": MagicMock()}
-
-
-class TestClickupQueryTasks:
-    @patch("cheddahbot.tools.clickup_tool._get_clickup_client")
-    def test_returns_tasks(self, mock_client_fn):
-        mock_client = MagicMock()
-        mock_client.get_tasks_from_space.return_value = [
-            FakeTask(id="t1", name="PR Task", task_type="Press Release"),
-        ]
-        mock_client_fn.return_value = mock_client
-
-        result = clickup_query_tasks(ctx=_make_ctx())
-        assert "PR Task" in result
-        assert "t1" in result
-
-    @patch("cheddahbot.tools.clickup_tool._get_clickup_client")
-    def test_no_tasks_found(self, mock_client_fn):
-        mock_client = MagicMock()
-        mock_client.get_tasks_from_space.return_value = []
-        mock_client_fn.return_value = mock_client
-
-        result = clickup_query_tasks(ctx=_make_ctx())
-        assert "No tasks found" in result
+def _make_ctx(db):
+    return {"db": db}
+
+
+def _seed_task(db, task_id, state, **overrides):
+    """Insert a task state into kv_store."""
+    data = {
+        "state": state,
+        "clickup_task_id": task_id,
+        "clickup_task_name": f"Task {task_id}",
+        "task_type": "Press Release",
+        "skill_name": "write_press_releases",
+        "discovered_at": "2026-01-01T00:00:00",
+        "started_at": None,
+        "completed_at": None,
+        "error": None,
+        "deliverable_paths": [],
+        "custom_fields": {},
+    }
+    data.update(overrides)
+    db.kv_set(f"clickup:task:{task_id}:state", json.dumps(data))
 
 
 class TestClickupListTasks:
-    @patch("cheddahbot.tools.clickup_tool._get_clickup_client")
-    def test_lists_automation_tasks(self, mock_client_fn):
-        mock_client = MagicMock()
-        mock_client.get_tasks_from_space.return_value = [
-            FakeTask(id="t1", name="Active Task", status="automation underway"),
-        ]
-        mock_client_fn.return_value = mock_client
-
-        result = clickup_list_tasks(ctx=_make_ctx())
-        assert "Active Task" in result
-        assert "t1" in result
-
-    @patch("cheddahbot.tools.clickup_tool._get_clickup_client")
-    def test_no_automation_tasks(self, mock_client_fn):
-        mock_client = MagicMock()
-        mock_client.get_tasks_from_space.return_value = []
-        mock_client_fn.return_value = mock_client
-
-        result = clickup_list_tasks(ctx=_make_ctx())
-        assert "No tasks found" in result
-
-    @patch("cheddahbot.tools.clickup_tool._get_clickup_client")
-    def test_filter_by_status(self, mock_client_fn):
-        mock_client = MagicMock()
-        mock_client.get_tasks_from_space.return_value = [
-            FakeTask(id="t1", name="Error Task", status="error"),
-        ]
-        mock_client_fn.return_value = mock_client
-
-        result = clickup_list_tasks(status="error", ctx=_make_ctx())
-        assert "Error Task" in result
+    def test_empty_when_no_tasks(self, tmp_db):
+        result = clickup_list_tasks(ctx=_make_ctx(tmp_db))
+        assert "No ClickUp tasks" in result
+
+    def test_lists_all_tracked_tasks(self, tmp_db):
+        _seed_task(tmp_db, "a1", "discovered")
+        _seed_task(tmp_db, "a2", "approved")
+
+        result = clickup_list_tasks(ctx=_make_ctx(tmp_db))
+
+        assert "a1" in result
+        assert "a2" in result
+        assert "2" in result  # count
+
+    def test_filter_by_status(self, tmp_db):
+        _seed_task(tmp_db, "a1", "discovered")
+        _seed_task(tmp_db, "a2", "approved")
+        _seed_task(tmp_db, "a3", "completed")
+
+        result = clickup_list_tasks(status="approved", ctx=_make_ctx(tmp_db))
+
+        assert "a2" in result
+        assert "a1" not in result
+        assert "a3" not in result
+
+    def test_filter_returns_empty_message(self, tmp_db):
+        _seed_task(tmp_db, "a1", "discovered")
+
+        result = clickup_list_tasks(status="completed", ctx=_make_ctx(tmp_db))
+
+        assert "No ClickUp tasks with state" in result
 
 
 class TestClickupTaskStatus:
-    @patch("cheddahbot.tools.clickup_tool._get_clickup_client")
-    def test_shows_details(self, mock_client_fn):
-        mock_client = MagicMock()
-        mock_client.get_task.return_value = FakeTask(
-            id="t1",
-            name="My Task",
-            status="automation underway",
-            task_type="Press Release",
-        )
-        mock_client_fn.return_value = mock_client
-
-        result = clickup_task_status(task_id="t1", ctx=_make_ctx())
-        assert "My Task" in result
-        assert "automation underway" in result
-        assert "Press Release" in result
-
-    @patch("cheddahbot.tools.clickup_tool._get_clickup_client")
-    def test_api_error(self, mock_client_fn):
-        mock_client = MagicMock()
-        mock_client.get_task.side_effect = Exception("Not found")
-        mock_client_fn.return_value = mock_client
-
-        result = clickup_task_status(task_id="bad", ctx=_make_ctx())
-        assert "Error" in result
+    def test_shows_details(self, tmp_db):
+        _seed_task(tmp_db, "a1", "executing", started_at="2026-01-01T12:00:00")
+
+        result = clickup_task_status(task_id="a1", ctx=_make_ctx(tmp_db))
+
+        assert "Task a1" in result
+        assert "executing" in result
+        assert "Press Release" in result
+        assert "2026-01-01T12:00:00" in result
+
+    def test_unknown_task(self, tmp_db):
+        result = clickup_task_status(task_id="nonexistent", ctx=_make_ctx(tmp_db))
+
+        assert "No tracked state" in result
+
+    def test_shows_error_when_failed(self, tmp_db):
+        _seed_task(tmp_db, "f1", "failed", error="API timeout")
+
+        result = clickup_task_status(task_id="f1", ctx=_make_ctx(tmp_db))
+
+        assert "API timeout" in result
+
+    def test_shows_deliverables(self, tmp_db):
+        _seed_task(tmp_db, "c1", "completed", deliverable_paths=["/data/pr1.txt", "/data/pr2.txt"])
+
+        result = clickup_task_status(task_id="c1", ctx=_make_ctx(tmp_db))
+
+        assert "/data/pr1.txt" in result
 
 
 class TestClickupResetTask:
-    @patch("cheddahbot.tools.clickup_tool._get_clickup_client")
-    def test_resets_task(self, mock_client_fn):
-        mock_client = MagicMock()
-        mock_client_fn.return_value = mock_client
-
-        result = clickup_reset_task(task_id="t1", ctx=_make_ctx())
-        assert "reset" in result.lower()
-        mock_client.update_task_status.assert_called_once_with("t1", "to do")
-        mock_client.add_comment.assert_called_once()
-
-    @patch("cheddahbot.tools.clickup_tool._get_clickup_client")
-    def test_api_error(self, mock_client_fn):
-        mock_client = MagicMock()
-        mock_client.update_task_status.side_effect = Exception("API error")
-        mock_client_fn.return_value = mock_client
-
-        result = clickup_reset_task(task_id="t1", ctx=_make_ctx())
-        assert "Error" in result
+    def test_resets_failed_task(self, tmp_db):
+        _seed_task(tmp_db, "f1", "failed")
+
+        result = clickup_reset_task(task_id="f1", ctx=_make_ctx(tmp_db))
+
+        assert "cleared" in result.lower()
+        assert tmp_db.kv_get("clickup:task:f1:state") is None
+
+    def test_resets_completed_task(self, tmp_db):
+        _seed_task(tmp_db, "c1", "completed")
+
+        result = clickup_reset_task(task_id="c1", ctx=_make_ctx(tmp_db))
+
+        assert "cleared" in result.lower()
+        assert tmp_db.kv_get("clickup:task:c1:state") is None
+
+    def test_unknown_task(self, tmp_db):
+        result = clickup_reset_task(task_id="nope", ctx=_make_ctx(tmp_db))
+        assert "Nothing to reset" in result
 
 
-class TestGetActiveTasks:
-    def test_no_scheduler(self):
-        result = get_active_tasks(ctx={"config": MagicMock()})
-        assert "not available" in result.lower()
-
-    def test_nothing_running(self):
-        scheduler = MagicMock()
-        scheduler.get_active_executions.return_value = {}
-        scheduler.get_loop_timestamps.return_value = {"clickup": None, "folder_watch": None}
-
-        result = get_active_tasks(ctx={"scheduler": scheduler})
-        assert "No tasks actively executing" in result
-        assert "Safe to restart: Yes" in result
-
-    def test_tasks_running(self):
-        from datetime import UTC, datetime, timedelta
-
-        scheduler = MagicMock()
-        scheduler.get_active_executions.return_value = {
-            "t1": {
-                "name": "Press Release for Acme",
-                "tool": "write_press_releases",
-                "started_at": datetime.now(UTC) - timedelta(minutes=5),
-                "thread": "clickup_thread",
-            }
-        }
-        scheduler.get_loop_timestamps.return_value = {"clickup": datetime.now(UTC).isoformat()}
-
-        result = get_active_tasks(ctx={"scheduler": scheduler})
-        assert "Active Executions (1)" in result
-        assert "Press Release for Acme" in result
-        assert "write_press_releases" in result
-        assert "Safe to restart: No" in result
+class TestClickupResetAll:
+    def test_clears_all_states(self, tmp_db):
+        _seed_task(tmp_db, "a1", "completed")
+        _seed_task(tmp_db, "a2", "failed")
+        _seed_task(tmp_db, "a3", "executing")
+
+        result = clickup_reset_all(ctx=_make_ctx(tmp_db))
+
+        assert "3" in result
+        assert tmp_db.kv_get("clickup:task:a1:state") is None
+        assert tmp_db.kv_get("clickup:task:a2:state") is None
+        assert tmp_db.kv_get("clickup:task:a3:state") is None
+
+    def test_clears_legacy_active_ids(self, tmp_db):
+        tmp_db.kv_set("clickup:active_task_ids", json.dumps(["a1", "a2"]))
+
+        clickup_reset_all(ctx=_make_ctx(tmp_db))
+
+        assert tmp_db.kv_get("clickup:active_task_ids") is None
+
+    def test_empty_returns_zero(self, tmp_db):
+        result = clickup_reset_all(ctx=_make_ctx(tmp_db))
+        assert "0" in result
@@ -1,994 +0,0 @@
"""Tests for the content creation pipeline tool."""

from __future__ import annotations

import json
from pathlib import Path
from unittest.mock import MagicMock, patch

from cheddahbot.config import Config, ContentConfig
from cheddahbot.tools.content_creation import (
    _build_optimization_prompt,
    _build_phase1_prompt,
    _build_phase2_prompt,
    _finalize_optimization,
    _find_cora_report,
    _run_optimization,
    _save_content,
    _slugify,
    _sync_clickup_optimization_complete,
    continue_content,
    create_content,
)

# ---------------------------------------------------------------------------
# _slugify
# ---------------------------------------------------------------------------


def test_slugify_basic():
    assert _slugify("Plumbing Services") == "plumbing-services"


def test_slugify_special_chars():
    assert _slugify("AC Repair & Maintenance!") == "ac-repair-maintenance"


def test_slugify_truncates():
    long = "a" * 200
    assert len(_slugify(long)) <= 80


# ---------------------------------------------------------------------------
# _build_phase1_prompt
# ---------------------------------------------------------------------------


class TestBuildPhase1Prompt:
    def test_contains_trigger_keywords(self):
        prompt = _build_phase1_prompt(
            "https://example.com/plumbing",
            "plumbing services",
            "service page",
            "",
            "",
        )
        assert "on-page optimization" in prompt
        assert "plumbing services" in prompt
        assert "https://example.com/plumbing" in prompt

    def test_includes_cora_path(self):
        prompt = _build_phase1_prompt(
            "https://example.com",
            "keyword",
            "blog post",
            "Z:/cora/report.xlsx",
            "",
        )
        assert "Z:/cora/report.xlsx" in prompt
        assert "Cora SEO report" in prompt

    def test_includes_capabilities_default(self):
        default = "Verify on website."
        prompt = _build_phase1_prompt(
            "https://example.com",
            "keyword",
            "service page",
            "",
            default,
        )
        assert default in prompt
        assert "company capabilities" in prompt

    def test_no_cora_no_capabilities(self):
        prompt = _build_phase1_prompt(
            "https://example.com",
            "keyword",
            "service page",
            "",
            "",
        )
        assert "Cora SEO report" not in prompt
        assert "company capabilities" not in prompt


# ---------------------------------------------------------------------------
# _build_phase2_prompt
# ---------------------------------------------------------------------------


class TestBuildPhase2Prompt:
    def test_contains_outline(self):
        outline = "## Section 1\nContent here."
        prompt = _build_phase2_prompt(
            "https://example.com",
            "plumbing",
            outline,
            "",
        )
        assert outline in prompt
        assert "writing phase" in prompt
        assert "plumbing" in prompt

    def test_includes_cora_path(self):
        prompt = _build_phase2_prompt(
            "https://example.com",
            "keyword",
            "outline text",
            "Z:/cora/report.xlsx",
        )
        assert "Z:/cora/report.xlsx" in prompt

    def test_no_cora(self):
        prompt = _build_phase2_prompt(
            "https://example.com",
            "keyword",
            "outline text",
            "",
        )
        assert "Cora SEO report" not in prompt


# ---------------------------------------------------------------------------
# _find_cora_report
# ---------------------------------------------------------------------------


class TestFindCoraReport:
    def test_empty_inbox(self, tmp_path):
        assert _find_cora_report("keyword", str(tmp_path)) == ""

    def test_nonexistent_path(self):
        assert _find_cora_report("keyword", "/nonexistent/path") == ""

    def test_empty_keyword(self, tmp_path):
        assert _find_cora_report("", str(tmp_path)) == ""

    def test_exact_match(self, tmp_path):
        report = tmp_path / "plumbing services.xlsx"
        report.touch()
        result = _find_cora_report("plumbing services", str(tmp_path))
        assert result == str(report)

    def test_substring_match(self, tmp_path):
        report = tmp_path / "plumbing-services-city.xlsx"
        report.touch()
        result = _find_cora_report("plumbing services", str(tmp_path))
        # "plumbing services" is a substring of "plumbing-services-city"
        assert result == str(report)

    def test_word_overlap(self, tmp_path):
        report = tmp_path / "residential-plumbing-repair.xlsx"
        report.touch()
        result = _find_cora_report("plumbing repair", str(tmp_path))
        assert result == str(report)

    def test_skips_temp_files(self, tmp_path):
        (tmp_path / "~$report.xlsx").touch()
        (tmp_path / "actual-report.xlsx").touch()
        result = _find_cora_report("actual report", str(tmp_path))
        assert "~$" not in result
        assert "actual-report" in result

    def test_no_match(self, tmp_path):
        (tmp_path / "completely-unrelated.xlsx").touch()
        result = _find_cora_report("plumbing services", str(tmp_path))
        assert result == ""


# ---------------------------------------------------------------------------
# _save_content
# ---------------------------------------------------------------------------


class TestSaveContent:
    def _make_config(self, outline_dir: str = "") -> Config:
        cfg = Config()
        cfg.content = ContentConfig(outline_dir=outline_dir)
        return cfg

    def test_saves_to_primary_path(self, tmp_path):
        cfg = self._make_config(str(tmp_path / "outlines"))
        path = _save_content("# Outline", "plumbing services", "outline.md", cfg)
        assert "outlines" in path
        assert Path(path).read_text(encoding="utf-8") == "# Outline"

    def test_falls_back_to_local(self, tmp_path):
        # Point to an invalid network path
        cfg = self._make_config("\\\\nonexistent\\share\\outlines")
        with patch(
            "cheddahbot.tools.content_creation._LOCAL_CONTENT_DIR",
            tmp_path / "local",
        ):
            path = _save_content("# Outline", "plumbing", "outline.md", cfg)
        assert str(tmp_path / "local") in path
        assert Path(path).read_text(encoding="utf-8") == "# Outline"

    def test_empty_outline_dir_uses_local(self, tmp_path):
        cfg = self._make_config("")
        with patch(
            "cheddahbot.tools.content_creation._LOCAL_CONTENT_DIR",
            tmp_path / "local",
        ):
            path = _save_content("content", "keyword", "outline.md", cfg)
        assert str(tmp_path / "local") in path


# ---------------------------------------------------------------------------
# create_content — Phase 1
# ---------------------------------------------------------------------------


class TestCreateContentPhase1:
    def _make_ctx(self, tmp_db, tmp_path):
        cfg = Config()
        cfg.content = ContentConfig(outline_dir=str(tmp_path / "outlines"))
        agent = MagicMock()
        agent.execute_task.return_value = "## Generated Outline\nSection 1..."
        return {
            "agent": agent,
            "config": cfg,
            "db": tmp_db,
            "clickup_task_id": "task123",
        }

    def test_requires_keyword(self, tmp_db):
        ctx = {"agent": MagicMock(), "config": Config(), "db": tmp_db}
        assert create_content(keyword="", ctx=ctx).startswith("Error:")

    def test_requires_context(self):
        assert create_content(keyword="kw", url="http://x", ctx=None).startswith("Error:")

    def test_phase1_runs_for_new_content(self, tmp_db, tmp_path):
        ctx = self._make_ctx(tmp_db, tmp_path)
        result = create_content(
            keyword="plumbing services",
            ctx=ctx,
        )
        assert "Phase 1 Complete" in result
        assert "outline" in result.lower()
        ctx["agent"].execute_task.assert_called_once()
        call_kwargs = ctx["agent"].execute_task.call_args
        assert call_kwargs.kwargs.get("skip_permissions") is True

    def test_phase1_saves_outline_file(self, tmp_db, tmp_path):
        ctx = self._make_ctx(tmp_db, tmp_path)
        create_content(
            keyword="plumbing services",
            ctx=ctx,
        )
        # The outline should have been saved
        outline_dir = tmp_path / "outlines" / "plumbing-services"
        assert outline_dir.exists()
        saved = (outline_dir / "outline.md").read_text(encoding="utf-8")
        assert saved == "## Generated Outline\nSection 1..."

    @patch("cheddahbot.tools.content_creation._get_clickup_client")
    def test_phase1_syncs_clickup(self, mock_get_client, tmp_db, tmp_path):
        mock_client = MagicMock()
        mock_get_client.return_value = mock_client
        ctx = self._make_ctx(tmp_db, tmp_path)
        create_content(
            keyword="plumbing services",
            ctx=ctx,
        )
        # Verify outline review status was set and OutlinePath was stored
        mock_client.update_task_status.assert_any_call("task123", "outline review")
        mock_client.set_custom_field_by_name.assert_called_once()
        call_args = mock_client.set_custom_field_by_name.call_args
        assert call_args[0][0] == "task123"
        assert call_args[0][1] == "OutlinePath"

    def test_phase1_includes_clickup_sync_marker(self, tmp_db, tmp_path):
        ctx = self._make_ctx(tmp_db, tmp_path)
        result = create_content(
            keyword="test keyword",
            ctx=ctx,
        )
        assert "## ClickUp Sync" in result


# ---------------------------------------------------------------------------
# create_content — Phase 2
# ---------------------------------------------------------------------------


class TestCreateContentPhase2:
    def _setup_phase2(self, tmp_db, tmp_path):
        """Set up outline file and return (ctx, outline_path)."""
        cfg = Config()
        cfg.content = ContentConfig(outline_dir=str(tmp_path / "outlines"))

        # Create the outline file
        outline_dir = tmp_path / "outlines" / "plumbing-services"
        outline_dir.mkdir(parents=True)
        outline_file = outline_dir / "outline.md"
        outline_file.write_text("## Approved Outline\nSection content here.", encoding="utf-8")

        agent = MagicMock()
        agent.execute_task.return_value = "# Full Content\nParagraph..."
        ctx = {
            "agent": agent,
            "config": cfg,
            "db": tmp_db,
            "clickup_task_id": "task456",
        }
        return ctx, str(outline_file)

    def _make_phase2_client(self, outline_path):
        """Create a mock ClickUp client that triggers Phase 2 detection."""
        mock_client = MagicMock()
        mock_task = MagicMock()
        mock_task.status = "outline approved"
        mock_client.get_task.return_value = mock_task
        mock_client.get_custom_field_by_name.return_value = outline_path
        return mock_client

    @patch("cheddahbot.tools.content_creation._get_clickup_client")
    def test_phase2_detects_outline_approved_status(self, mock_get_client, tmp_db, tmp_path):
        ctx, outline_path = self._setup_phase2(tmp_db, tmp_path)
        mock_get_client.return_value = self._make_phase2_client(outline_path)

        result = create_content(
            keyword="plumbing services",
            ctx=ctx,
        )
        assert "Phase 2 Complete" in result

    @patch("cheddahbot.tools.content_creation._get_clickup_client")
    def test_phase2_reads_outline(self, mock_get_client, tmp_db, tmp_path):
        ctx, outline_path = self._setup_phase2(tmp_db, tmp_path)
        mock_get_client.return_value = self._make_phase2_client(outline_path)

        create_content(
            keyword="plumbing services",
            ctx=ctx,
        )
        call_args = ctx["agent"].execute_task.call_args
        prompt = call_args.args[0] if call_args.args else call_args.kwargs.get("prompt", "")
        assert "Approved Outline" in prompt

    @patch("cheddahbot.tools.content_creation._get_clickup_client")
    def test_phase2_saves_content_file(self, mock_get_client, tmp_db, tmp_path):
        ctx, outline_path = self._setup_phase2(tmp_db, tmp_path)
        mock_get_client.return_value = self._make_phase2_client(outline_path)

        create_content(
            keyword="plumbing services",
            ctx=ctx,
        )
        content_file = tmp_path / "outlines" / "plumbing-services" / "final-content.md"
        assert content_file.exists()
        assert content_file.read_text(encoding="utf-8") == "# Full Content\nParagraph..."

    @patch("cheddahbot.tools.content_creation._get_clickup_client")
    def test_phase2_syncs_clickup_complete(self, mock_get_client, tmp_db, tmp_path):
        ctx, outline_path = self._setup_phase2(tmp_db, tmp_path)
        mock_client = self._make_phase2_client(outline_path)
        mock_get_client.return_value = mock_client

        create_content(
            keyword="plumbing services",
            ctx=ctx,
        )
        # Verify ClickUp was synced to internal review
        mock_client.update_task_status.assert_any_call("task456", "internal review")
        mock_client.add_comment.assert_called()

    @patch("cheddahbot.tools.content_creation._get_clickup_client")
    def test_phase2_includes_clickup_sync_marker(self, mock_get_client, tmp_db, tmp_path):
        ctx, outline_path = self._setup_phase2(tmp_db, tmp_path)
        mock_get_client.return_value = self._make_phase2_client(outline_path)

        result = create_content(
            keyword="plumbing services",
            ctx=ctx,
        )
        assert "## ClickUp Sync" in result


# ---------------------------------------------------------------------------
# continue_content
# ---------------------------------------------------------------------------


class TestContinueContent:
    def test_requires_keyword(self, tmp_db):
        ctx = {"agent": MagicMock(), "db": tmp_db, "config": Config()}
        assert continue_content(keyword="", ctx=ctx).startswith("Error:")

    def test_no_matching_entry(self, tmp_db):
        ctx = {"agent": MagicMock(), "db": tmp_db, "config": Config()}
        result = continue_content(keyword="nonexistent", ctx=ctx)
        assert "No outline awaiting review" in result

    @patch("cheddahbot.tools.content_creation._get_clickup_client")
    def test_finds_and_runs_phase2(self, mock_get_client, tmp_db, tmp_path):
        cfg = Config()
        cfg.content = ContentConfig(outline_dir=str(tmp_path / "outlines"))
        cfg.clickup.space_id = "sp1"

        # Create outline file
        outline_dir = tmp_path / "outlines" / "plumbing-services"
        outline_dir.mkdir(parents=True)
        outline_file = outline_dir / "outline.md"
        outline_file.write_text("## Outline", encoding="utf-8")

        # Mock ClickUp client — returns a task matching the keyword
        mock_client = MagicMock()
        mock_task = MagicMock()
        mock_task.id = "task789"
        mock_task.custom_fields = {
            "Keyword": "plumbing services",
            "IMSURL": "https://example.com",
        }
        mock_client.get_tasks_from_space.return_value = [mock_task]
        mock_client.get_custom_field_by_name.return_value = str(outline_file)
        mock_get_client.return_value = mock_client

        agent = MagicMock()
        agent.execute_task.return_value = "# Full content"
        ctx = {"agent": agent, "db": tmp_db, "config": cfg}
        result = continue_content(keyword="plumbing services", ctx=ctx)
        assert "Phase 2 Complete" in result


# ---------------------------------------------------------------------------
# Error propagation
# ---------------------------------------------------------------------------


class TestErrorPropagation:
    @patch("cheddahbot.tools.content_creation._get_clickup_client")
    def test_phase1_execution_error_syncs_clickup(self, mock_get_client, tmp_db, tmp_path):
        mock_client = MagicMock()
        mock_get_client.return_value = mock_client

        cfg = Config()
        cfg.content = ContentConfig(outline_dir=str(tmp_path / "outlines"))
        agent = MagicMock()
        agent.execute_task.side_effect = RuntimeError("CLI crashed")
        ctx = {
            "agent": agent,
            "config": cfg,
            "db": tmp_db,
            "clickup_task_id": "task_err",
        }
        result = create_content(
            keyword="test",
            ctx=ctx,
        )
        assert "Error:" in result
        # Verify ClickUp was notified of the failure
        mock_client.update_task_status.assert_any_call("task_err", "error")

    @patch("cheddahbot.tools.content_creation._get_clickup_client")
    def test_phase1_error_return_syncs_clickup(self, mock_get_client, tmp_db, tmp_path):
        mock_client = MagicMock()
        mock_get_client.return_value = mock_client

        cfg = Config()
        cfg.content = ContentConfig(outline_dir=str(tmp_path / "outlines"))
        agent = MagicMock()
        agent.execute_task.return_value = "Error: something went wrong"
        ctx = {
            "agent": agent,
            "config": cfg,
            "db": tmp_db,
            "clickup_task_id": "task_err2",
        }
        result = create_content(
            keyword="test",
            ctx=ctx,
        )
        assert result.startswith("Error:")
        # Verify ClickUp was notified of the failure
        mock_client.update_task_status.assert_any_call("task_err2", "error")


# ---------------------------------------------------------------------------
# _build_optimization_prompt
# ---------------------------------------------------------------------------


class TestBuildOptimizationPrompt:
    def test_contains_url_and_keyword(self):
        prompt = _build_optimization_prompt(
            url="https://example.com/plumbing",
            keyword="plumbing services",
            cora_path="Z:/cora/report.xlsx",
            work_dir="/tmp/work",
            scripts_dir="/scripts",
        )
        assert "https://example.com/plumbing" in prompt
        assert "plumbing services" in prompt

    def test_contains_cora_path(self):
        prompt = _build_optimization_prompt(
            url="https://example.com",
            keyword="kw",
            cora_path="Z:/cora/report.xlsx",
            work_dir="/tmp/work",
            scripts_dir="/scripts",
        )
        assert "Z:/cora/report.xlsx" in prompt

    def test_contains_all_script_commands(self):
        prompt = _build_optimization_prompt(
            url="https://example.com",
            keyword="kw",
            cora_path="Z:/cora/report.xlsx",
            work_dir="/tmp/work",
            scripts_dir="/scripts",
        )
        assert "competitor_scraper.py" in prompt
        assert "test_block_prep.py" in prompt
        assert "test_block_generator.py" in prompt
        assert "test_block_validate.py" in prompt

    def test_contains_step8_instructions(self):
        prompt = _build_optimization_prompt(
            url="https://example.com",
            keyword="kw",
            cora_path="Z:/cora/report.xlsx",
            work_dir="/tmp/work",
            scripts_dir="/scripts",
        )
        assert "optimization_instructions.md" in prompt
        assert "Heading Changes" in prompt
        assert "Entity Integration Points" in prompt
        assert "Meta Tag Updates" in prompt
        assert "Priority Ranking" in prompt

    def test_service_page_note(self):
        prompt = _build_optimization_prompt(
            url="https://example.com",
            keyword="kw",
            cora_path="Z:/cora/report.xlsx",
            work_dir="/tmp/work",
            scripts_dir="/scripts",
            is_service_page=True,
            capabilities_default="Check website.",
        )
        assert "service page" in prompt
        assert "Check website." in prompt

    def test_no_service_page_note_by_default(self):
        prompt = _build_optimization_prompt(
            url="https://example.com",
            keyword="kw",
            cora_path="Z:/cora/report.xlsx",
            work_dir="/tmp/work",
            scripts_dir="/scripts",
        )
        assert "service page" not in prompt.lower().split("step")[0]

    def test_all_eight_steps_present(self):
        prompt = _build_optimization_prompt(
            url="https://example.com",
            keyword="kw",
            cora_path="Z:/cora/report.xlsx",
            work_dir="/tmp/work",
            scripts_dir="/scripts",
        )
        for step_num in range(1, 9):
            assert f"Step {step_num}" in prompt


# ---------------------------------------------------------------------------
# _run_optimization
# ---------------------------------------------------------------------------


class TestRunOptimization:
    def _make_ctx(self, tmp_db, tmp_path):
        cfg = Config()
        cfg.content = ContentConfig(outline_dir=str(tmp_path / "outlines"))
        agent = MagicMock()
        agent.execute_task.return_value = "Optimization complete"
        return {
            "agent": agent,
            "config": cfg,
            "db": tmp_db,
            "clickup_task_id": "opt_task_1",
        }

    def test_fails_without_cora_report(self, tmp_db, tmp_path):
        ctx = self._make_ctx(tmp_db, tmp_path)
        result = _run_optimization(
            agent=ctx["agent"],
            config=ctx["config"],
            ctx=ctx,
            task_id="opt_task_1",
            url="https://example.com",
            keyword="plumbing services",
            cora_path="",
        )
        assert "Error:" in result
        assert "Cora report" in result

    @patch("cheddahbot.tools.content_creation._sync_clickup_fail")
    def test_syncs_clickup_on_missing_cora(self, mock_fail, tmp_db, tmp_path):
        ctx = self._make_ctx(tmp_db, tmp_path)
        _run_optimization(
            agent=ctx["agent"],
            config=ctx["config"],
            ctx=ctx,
            task_id="opt_task_1",
            url="https://example.com",
            keyword="plumbing services",
            cora_path="",
        )
        mock_fail.assert_called_once()
        assert mock_fail.call_args[0][1] == "opt_task_1"

    @patch("cheddahbot.tools.content_creation._finalize_optimization")
    @patch("cheddahbot.tools.content_creation._sync_clickup_start")
    def test_creates_work_dir_and_calls_execute(
        self, mock_start, mock_finalize, tmp_db, tmp_path
    ):
        ctx = self._make_ctx(tmp_db, tmp_path)
        mock_finalize.return_value = "finalized"
        with patch(
            "cheddahbot.tools.content_creation._LOCAL_CONTENT_DIR",
            tmp_path / "content",
        ):
            result = _run_optimization(
                agent=ctx["agent"],
                config=ctx["config"],
                ctx=ctx,
                task_id="opt_task_1",
                url="https://example.com/plumbing",
                keyword="plumbing services",
                cora_path="Z:/cora/report.xlsx",
            )
        ctx["agent"].execute_task.assert_called_once()
        mock_start.assert_called_once_with(ctx, "opt_task_1")
        mock_finalize.assert_called_once()
        assert result == "finalized"

    @patch("cheddahbot.tools.content_creation._sync_clickup_fail")
    @patch("cheddahbot.tools.content_creation._sync_clickup_start")
    def test_syncs_clickup_on_execution_error(
        self, mock_start, mock_fail, tmp_db, tmp_path
    ):
        ctx = self._make_ctx(tmp_db, tmp_path)
        ctx["agent"].execute_task.side_effect = RuntimeError("CLI crashed")
        with patch(
            "cheddahbot.tools.content_creation._LOCAL_CONTENT_DIR",
            tmp_path / "content",
        ):
            result = _run_optimization(
                agent=ctx["agent"],
                config=ctx["config"],
                ctx=ctx,
                task_id="opt_task_1",
                url="https://example.com",
                keyword="plumbing services",
                cora_path="Z:/cora/report.xlsx",
            )
        assert "Error:" in result
        mock_fail.assert_called_once()


# ---------------------------------------------------------------------------
# _finalize_optimization
# ---------------------------------------------------------------------------


class TestFinalizeOptimization:
    def _make_config(self, outline_dir: str = "") -> Config:
        cfg = Config()
        cfg.content = ContentConfig(outline_dir=outline_dir)
        return cfg

    def test_errors_on_missing_test_block(self, tmp_path):
        work_dir = tmp_path / "work"
        work_dir.mkdir()
        # Only create instructions, not test_block.html
        (work_dir / "optimization_instructions.md").write_text("instructions")
        cfg = self._make_config()
        result = _finalize_optimization(
            ctx=None,
            config=cfg,
            task_id="",
            keyword="kw",
            url="https://example.com",
            work_dir=work_dir,
            exec_result="done",
        )
        assert "Error:" in result
        assert "test_block.html" in result

    def test_errors_on_missing_instructions(self, tmp_path):
        work_dir = tmp_path / "work"
        work_dir.mkdir()
        # Only create test_block, not instructions
        (work_dir / "test_block.html").write_text("<div>block</div>")
        cfg = self._make_config()
        result = _finalize_optimization(
            ctx=None,
            config=cfg,
            task_id="",
            keyword="kw",
            url="https://example.com",
            work_dir=work_dir,
            exec_result="done",
        )
        assert "Error:" in result
        assert "optimization_instructions.md" in result

    def test_succeeds_with_required_files(self, tmp_path):
        work_dir = tmp_path / "work"
        work_dir.mkdir()
        (work_dir / "test_block.html").write_text("<div>block</div>")
        (work_dir / "optimization_instructions.md").write_text("# Instructions")
        cfg = self._make_config()
        result = _finalize_optimization(
            ctx=None,
            config=cfg,
            task_id="",
            keyword="plumbing services",
            url="https://example.com",
            work_dir=work_dir,
            exec_result="all done",
        )
        assert "Optimization Complete" in result
        assert "plumbing services" in result
        assert "test_block.html" in result

    def test_copies_to_network_path(self, tmp_path):
        work_dir = tmp_path / "work"
        work_dir.mkdir()
        (work_dir / "test_block.html").write_text("<div>block</div>")
        (work_dir / "optimization_instructions.md").write_text("# Instructions")
        net_dir = tmp_path / "network"
        cfg = self._make_config(str(net_dir))
        _finalize_optimization(
            ctx=None,
            config=cfg,
            task_id="",
            keyword="plumbing services",
            url="https://example.com",
            work_dir=work_dir,
            exec_result="done",
        )
        assert (net_dir / "plumbing-services" / "test_block.html").exists()
        assert (net_dir / "plumbing-services" / "optimization_instructions.md").exists()

    @patch("cheddahbot.tools.content_creation._sync_clickup_optimization_complete")
    def test_syncs_clickup_when_task_id_present(self, mock_sync, tmp_path, tmp_db):
        work_dir = tmp_path / "work"
        work_dir.mkdir()
        (work_dir / "test_block.html").write_text("<div>block</div>")
        (work_dir / "optimization_instructions.md").write_text("# Instructions")
        cfg = self._make_config()
        ctx = {"config": cfg, "db": tmp_db}
        _finalize_optimization(
            ctx=ctx,
            config=cfg,
            task_id="task_fin",
            keyword="kw",
            url="https://example.com",
            work_dir=work_dir,
            exec_result="done",
        )
        mock_sync.assert_called_once()
        call_kwargs = mock_sync.call_args.kwargs
        assert call_kwargs["task_id"] == "task_fin"
        assert "test_block.html" in call_kwargs["found_files"]
        assert "optimization_instructions.md" in call_kwargs["found_files"]


# ---------------------------------------------------------------------------
# _sync_clickup_optimization_complete
# ---------------------------------------------------------------------------


class TestSyncClickupOptimizationComplete:
    @patch("cheddahbot.tools.content_creation._get_clickup_client")
    def test_uploads_files_and_posts_comment(self, mock_get_client, tmp_path):
        mock_client = MagicMock()
        mock_get_client.return_value = mock_client

        work_dir = tmp_path / "work"
        work_dir.mkdir()
        tb_path = work_dir / "test_block.html"
        tb_path.write_text("<div>block</div>")
        inst_path = work_dir / "optimization_instructions.md"
        inst_path.write_text("# Instructions")
        val_path = work_dir / "validation_report.json"
        val_path.write_text(json.dumps({"summary": "All metrics improved."}))

        cfg = Config()
        ctx = {"config": cfg}
        found_files = {
            "test_block.html": tb_path,
            "optimization_instructions.md": inst_path,
            "validation_report.json": val_path,
        }
        _sync_clickup_optimization_complete(
            ctx=ctx,
            config=cfg,
            task_id="task_sync",
            keyword="plumbing",
            url="https://example.com",
            found_files=found_files,
            work_dir=work_dir,
        )
        # 3 file uploads
        assert mock_client.upload_attachment.call_count == 3
        # Comment posted
        mock_client.add_comment.assert_called_once()
        comment = mock_client.add_comment.call_args[0][1]
        assert "plumbing" in comment
        assert "All metrics improved." in comment
        assert "Next Steps" in comment
        # Status set to internal review
        mock_client.update_task_status.assert_called_once_with(
            "task_sync", cfg.clickup.review_status
        )

    @patch("cheddahbot.tools.content_creation._get_clickup_client")
    def test_handles_no_validation_report(self, mock_get_client, tmp_path):
        mock_client = MagicMock()
        mock_get_client.return_value = mock_client

        work_dir = tmp_path / "work"
        work_dir.mkdir()
        tb_path = work_dir / "test_block.html"
        tb_path.write_text("<div>block</div>")
        inst_path = work_dir / "optimization_instructions.md"
        inst_path.write_text("# Instructions")

        cfg = Config()
        ctx = {"config": cfg}
        found_files = {
            "test_block.html": tb_path,
            "optimization_instructions.md": inst_path,
        }
        _sync_clickup_optimization_complete(
            ctx=ctx,
            config=cfg,
            task_id="task_sync2",
            keyword="kw",
            url="https://example.com",
            found_files=found_files,
            work_dir=work_dir,
        )
        # 2 uploads (no validation_report.json)
        assert mock_client.upload_attachment.call_count == 2
        mock_client.add_comment.assert_called_once()

    def test_noop_without_task_id(self, tmp_path):
        """No ClickUp sync when task_id is empty."""
        work_dir = tmp_path / "work"
        work_dir.mkdir()
        cfg = Config()
        # Should not raise
        _sync_clickup_optimization_complete(
            ctx={"config": cfg},
            config=cfg,
            task_id="",
            keyword="kw",
            url="https://example.com",
            found_files={},
            work_dir=work_dir,
        )


# ---------------------------------------------------------------------------
# create_content — Routing (URL → optimization vs new content → phases)
# ---------------------------------------------------------------------------


class TestCreateContentRouting:
    @patch("cheddahbot.tools.content_creation._run_optimization")
    def test_explicit_optimization_routes_correctly(self, mock_opt, tmp_db, tmp_path):
        """When content_type='on page optimization', routes to _run_optimization."""
        mock_opt.return_value = "## Optimization Complete"
        cfg = Config()
        cfg.content = ContentConfig(outline_dir=str(tmp_path / "outlines"))
        ctx = {
            "agent": MagicMock(),
            "config": cfg,
            "db": tmp_db,
            "clickup_task_id": "routing_test",
        }
        result = create_content(
            keyword="plumbing services",
            url="https://example.com/plumbing",
            content_type="on page optimization",
            ctx=ctx,
        )
        mock_opt.assert_called_once()
        assert result == "## Optimization Complete"

    @patch("cheddahbot.tools.content_creation._run_optimization")
    def test_explicit_new_content_with_url_routes_to_phase1(self, mock_opt, tmp_db, tmp_path):
        """Content Creation with URL should go to Phase 1, NOT optimization."""
        cfg = Config()
        cfg.content = ContentConfig(outline_dir=str(tmp_path / "outlines"))
        agent = MagicMock()
        agent.execute_task.return_value = "## Outline"
        ctx = {
            "agent": agent,
            "config": cfg,
            "db": tmp_db,
            "clickup_task_id": "",
        }
        result = create_content(
            keyword="new keyword",
            url="https://example.com/future-page",
            content_type="new content",
            ctx=ctx,
        )
        mock_opt.assert_not_called()
        assert "Phase 1 Complete" in result

    @patch("cheddahbot.tools.content_creation._run_optimization")
    def test_optimization_without_url_returns_error(self, mock_opt, tmp_db, tmp_path):
        """On Page Optimization without URL should return an error."""
        cfg = Config()
        cfg.content = ContentConfig(outline_dir=str(tmp_path / "outlines"))
        ctx = {
            "agent": MagicMock(),
            "config": cfg,
            "db": tmp_db,
            "clickup_task_id": "",
        }
        result = create_content(
            keyword="plumbing services",
            url="",
            content_type="on page optimization",
            ctx=ctx,
        )
        mock_opt.assert_not_called()
        assert "Error" in result
        assert "URL" in result

    @patch("cheddahbot.tools.content_creation._run_optimization")
    def test_fallback_url_routes_to_optimization(self, mock_opt, tmp_db, tmp_path):
        """When content_type is empty and URL present, falls back to optimization."""
        mock_opt.return_value = "## Optimization Complete"
        cfg = Config()
        cfg.content = ContentConfig(outline_dir=str(tmp_path / "outlines"))
        ctx = {
            "agent": MagicMock(),
            "config": cfg,
            "db": tmp_db,
            "clickup_task_id": "routing_test",
        }
        result = create_content(
            keyword="plumbing services",
            url="https://example.com/plumbing",
            content_type="",
            ctx=ctx,
        )
        mock_opt.assert_called_once()
        assert result == "## Optimization Complete"

    @patch("cheddahbot.tools.content_creation._run_optimization")
    def test_new_content_still_calls_phase1(self, mock_opt, tmp_db, tmp_path):
        """Regression: new content (no URL, no content_type) still goes through _run_phase1."""
        cfg = Config()
        cfg.content = ContentConfig(outline_dir=str(tmp_path / "outlines"))
        agent = MagicMock()
        agent.execute_task.return_value = "## Generated Outline\nContent..."
        ctx = {
            "agent": agent,
            "config": cfg,
            "db": tmp_db,
            "clickup_task_id": "",
        }
        create_content(
            keyword="new topic",
            url="",
            ctx=ctx,
        )
        mock_opt.assert_not_called()
        agent.execute_task.assert_called_once()
        # Verify it's the phase 1 prompt (new content path)
        call_args = agent.execute_task.call_args
        prompt = call_args.args[0] if call_args.args else call_args.kwargs.get("prompt", "")
        assert "new content creation project" in prompt
@ -1,233 +0,0 @@
|
||||||
"""Tests for the Cora distribution watcher (scheduler._distribute_cora_file)."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from dataclasses import dataclass, field
|
|
||||||
from pathlib import Path
|
|
||||||
from unittest.mock import MagicMock, patch
|
|
||||||
|
|
||||||
from cheddahbot.config import AutoCoraConfig, Config, ContentConfig, LinkBuildingConfig
|
|
||||||
|
|
||||||
|
|
||||||
@dataclass
|
|
||||||
class FakeTask:
|
|
||||||
"""Minimal ClickUp task stub for distribution tests."""
|
|
||||||
|
|
||||||
id: str = "fake_id"
|
|
||||||
name: str = ""
|
|
||||||
task_type: str = ""
|
|
||||||
status: str = "running cora"
|
|
||||||
custom_fields: dict = field(default_factory=dict)
|
|
||||||
|
|
||||||
|
|
||||||
def _make_scheduler(tmp_path, *, lb_folder="", content_inbox="", human_inbox=""):
|
|
||||||
"""Build a Scheduler with temp paths and mocked dependencies."""
|
|
||||||
from cheddahbot.scheduler import Scheduler
|
|
||||||
|
|
||||||
config = Config()
|
|
||||||
config.link_building = LinkBuildingConfig(watch_folder=lb_folder)
|
|
||||||
config.content = ContentConfig(cora_inbox=content_inbox)
|
|
||||||
config.autocora = AutoCoraConfig(cora_human_inbox=human_inbox, enabled=True)
|
|
||||||
config.clickup.enabled = True
|
|
||||||
config.clickup.space_id = "sp1"
|
|
||||||
config.clickup.api_token = "tok"
|
|
||||||
|
|
||||||
db = MagicMock()
|
|
||||||
agent = MagicMock()
|
|
||||||
sched = Scheduler(config=config, db=db, agent=agent)
|
|
||||||
return sched
|
|
||||||
|
|
||||||
|
|
||||||
KW_FIELDS = {"Keyword": "ac drive repair"}
|
|
||||||
|
|
||||||
|
|
||||||
def _drop_xlsx(folder: Path, name: str = "ac-drive-repair.xlsx") -> Path:
|
|
||||||
"""Create a dummy xlsx file in the given folder."""
|
|
||||||
folder.mkdir(parents=True, exist_ok=True)
|
|
||||||
p = folder / name
|
|
||||||
p.write_bytes(b"fake-xlsx-data")
|
|
||||||
return p
|
|
||||||
|
|
||||||
|
|
||||||
# ── Distribution logic tests ──
|
|
||||||
|
|
||||||
|
|
||||||
def test_distribute_lb_only(tmp_path):
|
|
||||||
"""LB task matched → copies to cora-inbox only."""
|
|
||||||
human = tmp_path / "human"
|
|
||||||
lb = tmp_path / "lb"
|
|
||||||
content = tmp_path / "content"
|
|
||||||
xlsx = _drop_xlsx(human)
|
|
||||||
|
|
||||||
sched = _make_scheduler(
|
|
||||||
tmp_path, lb_folder=str(lb), content_inbox=str(content), human_inbox=str(human)
|
|
||||||
)
|
|
||||||
|
|
||||||
tasks = [FakeTask(name="LB task", task_type="Link Building", custom_fields=KW_FIELDS)]
|
|
||||||
|
|
||||||
with patch.object(sched, "_get_clickup_client") as mock_client:
|
|
||||||
mock_client.return_value.get_tasks_from_overall_lists.return_value = tasks
|
|
||||||
sched._distribute_cora_file(xlsx)
|
|
||||||
|
|
||||||
assert (lb / xlsx.name).exists()
|
|
||||||
assert not (content / xlsx.name).exists()
|
|
||||||
assert (human / "processed" / xlsx.name).exists()
|
|
||||||
assert not xlsx.exists()
|
|
||||||
|
|
||||||
|
|
||||||
def test_distribute_content_only(tmp_path):
|
|
||||||
"""Content task matched → copies to content-cora-inbox only."""
|
|
||||||
human = tmp_path / "human"
|
|
||||||
lb = tmp_path / "lb"
|
|
||||||
content = tmp_path / "content"
|
|
||||||
xlsx = _drop_xlsx(human)
|
|
||||||
|
    sched = _make_scheduler(
        tmp_path, lb_folder=str(lb), content_inbox=str(content), human_inbox=str(human)
    )

    tasks = [FakeTask(name="CC task", task_type="Content Creation", custom_fields=KW_FIELDS)]

    with patch.object(sched, "_get_clickup_client") as mock_client:
        mock_client.return_value.get_tasks_from_overall_lists.return_value = tasks
        sched._distribute_cora_file(xlsx)

    assert not (lb / xlsx.name).exists()
    assert (content / xlsx.name).exists()
    assert (human / "processed" / xlsx.name).exists()


def test_distribute_mixed(tmp_path):
    """Both LB and Content tasks matched → copies to both inboxes."""
    human = tmp_path / "human"
    lb = tmp_path / "lb"
    content = tmp_path / "content"
    xlsx = _drop_xlsx(human)

    sched = _make_scheduler(
        tmp_path, lb_folder=str(lb), content_inbox=str(content), human_inbox=str(human)
    )

    tasks = [
        FakeTask(name="LB task", task_type="Link Building", custom_fields=KW_FIELDS),
        FakeTask(name="CC task", task_type="Content Creation", custom_fields=KW_FIELDS),
    ]

    with patch.object(sched, "_get_clickup_client") as mock_client:
        mock_client.return_value.get_tasks_from_overall_lists.return_value = tasks
        sched._distribute_cora_file(xlsx)

    assert (lb / xlsx.name).exists()
    assert (content / xlsx.name).exists()
    assert (human / "processed" / xlsx.name).exists()


def test_distribute_no_match(tmp_path):
    """No matching tasks → file stays in inbox, not moved to processed."""
    human = tmp_path / "human"
    lb = tmp_path / "lb"
    content = tmp_path / "content"
    xlsx = _drop_xlsx(human)

    sched = _make_scheduler(
        tmp_path, lb_folder=str(lb), content_inbox=str(content), human_inbox=str(human)
    )

    with patch.object(sched, "_get_clickup_client") as mock_client:
        mock_client.return_value.get_tasks_from_overall_lists.return_value = []
        sched._distribute_cora_file(xlsx)

    assert xlsx.exists()  # Still in inbox
    assert not (human / "processed" / xlsx.name).exists()


def test_distribute_opo_task(tmp_path):
    """On Page Optimization task → copies to content inbox."""
    human = tmp_path / "human"
    lb = tmp_path / "lb"
    content = tmp_path / "content"
    xlsx = _drop_xlsx(human)

    sched = _make_scheduler(
        tmp_path, lb_folder=str(lb), content_inbox=str(content), human_inbox=str(human)
    )

    tasks = [FakeTask(name="OPO task", task_type="On Page Optimization", custom_fields=KW_FIELDS)]

    with patch.object(sched, "_get_clickup_client") as mock_client:
        mock_client.return_value.get_tasks_from_overall_lists.return_value = tasks
        sched._distribute_cora_file(xlsx)

    assert not (lb / xlsx.name).exists()
    assert (content / xlsx.name).exists()


# ── Scan tests ──


def test_scan_skips_processed(tmp_path):
    """Files already in processed/ are skipped."""
    human = tmp_path / "human"
    lb = tmp_path / "lb"
    content = tmp_path / "content"

    # File in both top-level and processed/
    _drop_xlsx(human)
    _drop_xlsx(human / "processed")

    sched = _make_scheduler(
        tmp_path, lb_folder=str(lb), content_inbox=str(content), human_inbox=str(human)
    )

    with patch.object(sched, "_distribute_cora_file") as mock_dist:
        sched._scan_cora_human_inbox()
        mock_dist.assert_not_called()


def test_scan_skips_temp_files(tmp_path):
    """Office temp files (~$...) are skipped."""
    human = tmp_path / "human"
    lb = tmp_path / "lb"
    content = tmp_path / "content"

    _drop_xlsx(human, name="~$ac-drive-repair.xlsx")

    sched = _make_scheduler(
        tmp_path, lb_folder=str(lb), content_inbox=str(content), human_inbox=str(human)
    )

    with patch.object(sched, "_distribute_cora_file") as mock_dist:
        sched._scan_cora_human_inbox()
        mock_dist.assert_not_called()


def test_scan_empty_inbox(tmp_path):
    """Empty inbox → no-op."""
    human = tmp_path / "human"
    human.mkdir()

    sched = _make_scheduler(tmp_path, human_inbox=str(human))

    with patch.object(sched, "_distribute_cora_file") as mock_dist:
        sched._scan_cora_human_inbox()
        mock_dist.assert_not_called()


def test_distribute_copy_failure_no_move(tmp_path):
    """If copy fails, original is NOT moved to processed."""
    human = tmp_path / "human"
    xlsx = _drop_xlsx(human)

    sched = _make_scheduler(tmp_path, lb_folder="/nonexistent/network/path", human_inbox=str(human))

    tasks = [FakeTask(name="LB task", task_type="Link Building", custom_fields=KW_FIELDS)]

    with (
        patch.object(sched, "_get_clickup_client") as mock_client,
        patch("cheddahbot.scheduler.shutil.copy2", side_effect=OSError("network down")),
    ):
        mock_client.return_value.get_tasks_from_overall_lists.return_value = tasks
        sched._distribute_cora_file(xlsx)

    assert xlsx.exists()  # Original untouched
    assert not (human / "processed" / xlsx.name).exists()
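Taken together, the distribute tests pin down a copy-then-commit contract: the workbook is copied to every matched inbox first, and only moved to processed/ once every copy succeeds; no match, or any failed copy, leaves the original untouched. A minimal sketch of that contract; the function name and signature here are illustrative, not the project's actual `_distribute_cora_file`:

```python
# Sketch of the copy-then-commit behavior the tests above assert.
# `distribute` is a hypothetical name, not the project's code.
import shutil
from pathlib import Path


def distribute(xlsx: Path, targets: list[Path], processed_dir: Path) -> None:
    """Copy the workbook to each matched inbox; commit only if all copies succeed."""
    if not targets:
        return  # no matching task: leave the file in the inbox
    try:
        for folder in targets:
            folder.mkdir(parents=True, exist_ok=True)
            shutil.copy2(xlsx, folder / xlsx.name)
    except OSError:
        return  # a failed copy must not move the original
    processed_dir.mkdir(parents=True, exist_ok=True)
    xlsx.rename(processed_dir / xlsx.name)  # commit: move to processed/
```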
@@ -10,27 +10,21 @@ from cheddahbot.db import Database
 class TestConversationsAgentName:
     """Conversations are tagged by agent_name for per-agent history filtering."""
 
-    def _add_msg(self, db, conv_id):
-        """Add a dummy message so list_conversations() includes this conv."""
-        db.add_message(conv_id, "user", "hello")
-
     def test_create_with_default_agent_name(self, tmp_db):
         tmp_db.create_conversation("conv1")
-        self._add_msg(tmp_db, "conv1")
         convs = tmp_db.list_conversations()
         assert len(convs) == 1
         assert convs[0]["agent_name"] == "default"
 
     def test_create_with_custom_agent_name(self, tmp_db):
         tmp_db.create_conversation("conv1", agent_name="writer")
-        self._add_msg(tmp_db, "conv1")
         convs = tmp_db.list_conversations()
         assert convs[0]["agent_name"] == "writer"
 
     def test_list_filters_by_agent_name(self, tmp_db):
-        for cid, agent in [("c1", "default"), ("c2", "writer"), ("c3", "default")]:
-            tmp_db.create_conversation(cid, agent_name=agent)
-            self._add_msg(tmp_db, cid)
+        tmp_db.create_conversation("c1", agent_name="default")
+        tmp_db.create_conversation("c2", agent_name="writer")
+        tmp_db.create_conversation("c3", agent_name="default")
 
         default_convs = tmp_db.list_conversations(agent_name="default")
         writer_convs = tmp_db.list_conversations(agent_name="writer")

@@ -41,16 +35,14 @@ class TestConversationsAgentName:
         assert len(all_convs) == 3
 
     def test_list_without_filter_returns_all(self, tmp_db):
-        for cid, agent in [("c1", "a"), ("c2", "b")]:
-            tmp_db.create_conversation(cid, agent_name=agent)
-            self._add_msg(tmp_db, cid)
+        tmp_db.create_conversation("c1", agent_name="a")
+        tmp_db.create_conversation("c2", agent_name="b")
 
         convs = tmp_db.list_conversations()
         assert len(convs) == 2
 
     def test_list_returns_agent_name_in_results(self, tmp_db):
         tmp_db.create_conversation("c1", agent_name="researcher")
-        self._add_msg(tmp_db, "c1")
         convs = tmp_db.list_conversations()
         assert "agent_name" in convs[0]
         assert convs[0]["agent_name"] == "researcher"

@@ -60,7 +52,6 @@ class TestConversationsAgentName:
         db_path = tmp_path / "test_migrate.db"
         db1 = Database(db_path)
         db1.create_conversation("c1", agent_name="ops")
-        db1.add_message("c1", "user", "hello")
        # Re-init on same DB file triggers migration again
         db2 = Database(db_path)
         convs = db2.list_conversations()
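The master-side tests imply two properties of `list_conversations()`: it only returns conversations that have at least one message (hence the `_add_msg` helper), and the `agent_name` filter is optional. A hedged sketch of a query with both properties; the real schema in `cheddahbot.db` is not shown in this diff and may differ:

```python
# Sketch only: table and column names are assumptions.
import sqlite3


def list_conversations(conn: sqlite3.Connection, agent_name: str | None = None) -> list[dict]:
    # Only conversations that actually contain messages are listed.
    sql = (
        "SELECT c.id, c.agent_name FROM conversations c "
        "WHERE EXISTS (SELECT 1 FROM messages m WHERE m.conversation_id = c.id)"
    )
    params: tuple = ()
    if agent_name is not None:
        sql += " AND c.agent_name = ?"  # optional per-agent filter
        params = (agent_name,)
    return [{"id": r[0], "agent_name": r[1]} for r in conn.execute(sql, params)]
```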
@@ -2,6 +2,7 @@
 
 from __future__ import annotations
 
+import json
 import subprocess
 from unittest.mock import MagicMock, patch
 
@@ -227,36 +228,23 @@ class TestFuzzyKeywordMatch:
     def test_exact_match(self):
         assert _fuzzy_keyword_match("precision cnc", "precision cnc") is True
 
-    def test_no_match_without_llm(self):
-        """Without an llm_check, non-exact strings return False."""
-        assert _fuzzy_keyword_match("shaft", "shafts") is False
-        assert _fuzzy_keyword_match("shaft manufacturing", "custom shaft manufacturing") is False
+    def test_substring_match_a_in_b(self):
+        assert _fuzzy_keyword_match("cnc machining", "precision cnc machining services") is True
+
+    def test_substring_match_b_in_a(self):
+        assert _fuzzy_keyword_match("precision cnc machining services", "cnc machining") is True
+
+    def test_word_overlap(self):
+        assert _fuzzy_keyword_match("precision cnc machining", "cnc machining precision") is True
+
+    def test_no_match(self):
+        assert _fuzzy_keyword_match("precision cnc", "web design agency") is False
 
     def test_empty_strings(self):
         assert _fuzzy_keyword_match("", "test") is False
         assert _fuzzy_keyword_match("test", "") is False
         assert _fuzzy_keyword_match("", "") is False
 
-    def test_llm_check_called_on_mismatch(self):
-        """When strings differ, llm_check is called and its result is returned."""
-        llm_yes = lambda a, b: True
-        llm_no = lambda a, b: False
-
-        assert _fuzzy_keyword_match("shaft", "shafts", llm_check=llm_yes) is True
-        assert _fuzzy_keyword_match("shaft", "shafts", llm_check=llm_no) is False
-
-    def test_llm_check_not_called_on_exact(self):
-        """Exact match should not call llm_check."""
-        def boom(a, b):
-            raise AssertionError("should not be called")
-
-        assert _fuzzy_keyword_match("shaft", "shaft", llm_check=boom) is True
-
-    def test_no_substring_match_without_llm(self):
-        """Substring matching is gone — different keywords must not match."""
-        assert _fuzzy_keyword_match("shaft manufacturing", "custom shaft manufacturing") is False
-        assert _fuzzy_keyword_match("cnc machining", "precision cnc machining services") is False
-
 
 class TestNormalizeForMatch:
     def test_lowercase_and_strip(self):
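The master-side tests fully specify the new matching contract: normalize, require an exact match, and otherwise defer to an optional `llm_check` callback, which must never be consulted on exact matches. A sketch consistent with every assertion above; the normalization step is assumed from `TestNormalizeForMatch`:

```python
# Sketch of the matching contract the master-side tests assert.
from typing import Callable


def _fuzzy_keyword_match(a: str, b: str, llm_check: Callable[[str, str], bool] | None = None) -> bool:
    a, b = a.strip().lower(), b.strip().lower()  # assumed _normalize_for_match behavior
    if not a or not b:
        return False  # empty strings never match
    if a == b:
        return True  # exact match: llm_check must NOT be called
    # No substring or word-overlap fallback anymore: defer to the LLM, if any.
    return llm_check(a, b) if llm_check else False
```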
@@ -560,6 +548,16 @@ class TestScanCoraFolder:
         assert "Processed" in result
         assert "old.xlsx" in result
 
+    def test_shows_kv_status(self, mock_ctx, tmp_path):
+        mock_ctx["config"].link_building.watch_folder = str(tmp_path)
+        (tmp_path / "tracked.xlsx").write_text("fake")
+
+        db = mock_ctx["db"]
+        db.kv_set("linkbuilding:watched:tracked.xlsx", json.dumps({"status": "completed"}))
+
+        result = scan_cora_folder(ctx=mock_ctx)
+        assert "completed" in result
+
 
 # ---------------------------------------------------------------------------
 # ClickUp state machine tests
@@ -584,6 +582,12 @@ class TestClickUpStateMachine:
         mock_ctx["clickup_task_id"] = "task_abc"
         mock_ctx["config"].clickup.enabled = True
 
+        # Pre-set executing state
+        mock_ctx["db"].kv_set(
+            "clickup:task:task_abc:state",
+            json.dumps({"state": "executing"}),
+        )
+
         ingest_proc = subprocess.CompletedProcess(
             args=[], returncode=0, stdout=ingest_success_stdout, stderr=""
         )
@@ -599,9 +603,10 @@ class TestClickUpStateMachine:
 
         assert "ClickUp Sync" in result
 
-        # Verify ClickUp API was called for completion
-        cu.add_comment.assert_called()
-        cu.update_task_status.assert_called()
+        # Verify KV state was updated
+        raw = mock_ctx["db"].kv_get("clickup:task:task_abc:state")
+        state = json.loads(raw)
+        assert state["state"] == "completed"
 
     @patch("cheddahbot.tools.linkbuilding._run_blm_command")
     @patch("cheddahbot.tools.linkbuilding._get_clickup_client")
@@ -614,6 +619,14 @@ class TestClickUpStateMachine:
 
         mock_ctx["clickup_task_id"] = "task_fail"
         mock_ctx["config"].clickup.enabled = True
+        mock_ctx["config"].clickup.skill_map = {
+            "Link Building": {"error_status": "internal review"}
+        }
+
+        mock_ctx["db"].kv_set(
+            "clickup:task:task_fail:state",
+            json.dumps({"state": "executing"}),
+        )
 
         mock_cmd.return_value = subprocess.CompletedProcess(
             args=[], returncode=1, stdout="Error", stderr="crash"
@@ -625,6 +638,6 @@ class TestClickUpStateMachine:
         )
         assert "Error" in result
 
-        # Verify ClickUp API was called for failure
-        cu.add_comment.assert_called()
-        cu.update_task_status.assert_called()
+        raw = mock_ctx["db"].kv_get("clickup:task:task_fail:state")
+        state = json.loads(raw)
+        assert state["state"] == "failed"
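The cora-start assertions above all read and write one per-task JSON blob under `clickup:task:<id>:state`. A minimal sketch of that convention; the key format comes straight from the tests, while the helper names are assumptions:

```python
# Illustrative helpers for the KV convention the tests exercise.
# `db` is any object exposing kv_set/kv_get, as in these tests.
import json


def set_task_state(db, task_id: str, state: str, **extra) -> None:
    db.kv_set(f"clickup:task:{task_id}:state", json.dumps({"state": state, **extra}))


def get_task_state(db, task_id: str) -> dict | None:
    raw = db.kv_get(f"clickup:task:{task_id}:state")
    return json.loads(raw) if raw else None
```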
@@ -1,410 +0,0 @@
"""Tests for the ntfy.sh push notification sender."""

from __future__ import annotations

from unittest.mock import MagicMock, patch

import httpx

from cheddahbot.ntfy import NtfyChannel, NtfyNotifier

# ---------------------------------------------------------------------------
# NtfyChannel routing
# ---------------------------------------------------------------------------


class TestNtfyChannel:
    def test_accepts_matching_category_and_pattern(self):
        ch = NtfyChannel(
            name="human_action",
            server="https://ntfy.sh",
            topic="test-topic",
            categories=["clickup", "autocora"],
            include_patterns=["completed", "SUCCESS"],
        )
        assert ch.accepts("ClickUp task completed: **Acme PR**", "clickup") is True
        assert ch.accepts("AutoCora SUCCESS: **keyword**", "autocora") is True

    def test_rejects_wrong_category(self):
        ch = NtfyChannel(
            name="human_action",
            server="https://ntfy.sh",
            topic="test-topic",
            categories=["clickup"],
            include_patterns=["completed"],
        )
        assert ch.accepts("Some autocora message completed", "autocora") is False

    def test_rejects_non_matching_pattern(self):
        ch = NtfyChannel(
            name="human_action",
            server="https://ntfy.sh",
            topic="test-topic",
            categories=["clickup"],
            include_patterns=["completed"],
        )
        assert ch.accepts("Executing ClickUp task: **Acme PR**", "clickup") is False

    def test_no_include_patterns_accepts_all_in_category(self):
        ch = NtfyChannel(
            name="all_clickup",
            server="https://ntfy.sh",
            topic="test-topic",
            categories=["clickup"],
        )
        assert ch.accepts("Any message at all", "clickup") is True

    def test_exclude_patterns_take_priority(self):
        ch = NtfyChannel(
            name="test",
            server="https://ntfy.sh",
            topic="test-topic",
            categories=["clickup"],
            include_patterns=["task"],
            exclude_patterns=["Executing"],
        )
        assert ch.accepts("Executing ClickUp task", "clickup") is False
        assert ch.accepts("ClickUp task completed", "clickup") is True

    def test_case_insensitive_patterns(self):
        ch = NtfyChannel(
            name="test",
            server="https://ntfy.sh",
            topic="test-topic",
            categories=["autocora"],
            include_patterns=["success"],
        )
        assert ch.accepts("AutoCora SUCCESS: **kw**", "autocora") is True

    def test_empty_topic_filtered_by_notifier(self):
        ch = NtfyChannel(
            name="empty", server="https://ntfy.sh", topic="",
            categories=["clickup"],
        )
        notifier = NtfyNotifier([ch])
        assert notifier.enabled is False


# ---------------------------------------------------------------------------
# NtfyNotifier
# ---------------------------------------------------------------------------


class TestNtfyNotifier:
    @patch("cheddahbot.ntfy.httpx.post")
    def test_notify_posts_to_matching_channel(self, mock_post):
        mock_post.return_value = MagicMock(status_code=200)
        ch = NtfyChannel(
            name="human_action",
            server="https://ntfy.sh",
            topic="my-topic",
            categories=["clickup"],
            include_patterns=["completed"],
        )
        notifier = NtfyNotifier([ch])
        notifier.notify("ClickUp task completed: **Acme PR**", "clickup")

        # Wait for daemon thread
        import threading
        for t in threading.enumerate():
            if t.daemon and t.is_alive():
                t.join(timeout=2)

        mock_post.assert_called_once()
        call_args = mock_post.call_args
        assert call_args[0][0] == "https://ntfy.sh/my-topic"
        assert call_args[1]["headers"]["Title"] == "CheddahBot [clickup]"
        assert call_args[1]["headers"]["Priority"] == "high"

    @patch("cheddahbot.ntfy.httpx.post")
    def test_notify_skips_non_matching_channel(self, mock_post):
        ch = NtfyChannel(
            name="errors",
            server="https://ntfy.sh",
            topic="err-topic",
            categories=["clickup"],
            include_patterns=["failed"],
        )
        notifier = NtfyNotifier([ch])
        notifier.notify("ClickUp task completed: **Acme PR**", "clickup")

        import threading
        for t in threading.enumerate():
            if t.daemon and t.is_alive():
                t.join(timeout=2)

        mock_post.assert_not_called()

    @patch("cheddahbot.ntfy.httpx.post")
    def test_notify_routes_to_multiple_channels(self, mock_post):
        mock_post.return_value = MagicMock(status_code=200)
        ch1 = NtfyChannel(
            name="all", server="https://ntfy.sh", topic="all-topic",
            categories=["clickup"],
        )
        ch2 = NtfyChannel(
            name="errors", server="https://ntfy.sh", topic="err-topic",
            categories=["clickup"], include_patterns=["failed"],
        )
        notifier = NtfyNotifier([ch1, ch2])
        notifier.notify("ClickUp task failed: **Acme**", "clickup")

        import threading
        for t in threading.enumerate():
            if t.daemon and t.is_alive():
                t.join(timeout=2)

        assert mock_post.call_count == 2

    @patch("cheddahbot.ntfy.httpx.post")
    def test_webhook_error_is_swallowed(self, mock_post):
        mock_post.side_effect = httpx.ConnectError("connection refused")
        ch = NtfyChannel(
            name="test", server="https://ntfy.sh", topic="topic",
            categories=["clickup"],
        )
        notifier = NtfyNotifier([ch])
        # Should not raise
        notifier.notify("ClickUp task completed: **test**", "clickup")

        import threading
        for t in threading.enumerate():
            if t.daemon and t.is_alive():
                t.join(timeout=2)

    @patch("cheddahbot.ntfy.httpx.post")
    def test_4xx_is_logged_not_raised(self, mock_post):
        mock_post.return_value = MagicMock(status_code=400, text="Bad Request")
        ch = NtfyChannel(
            name="test", server="https://ntfy.sh", topic="topic",
            categories=["clickup"],
        )
        notifier = NtfyNotifier([ch])
        notifier.notify("ClickUp task completed: **test**", "clickup")

        import threading
        for t in threading.enumerate():
            if t.daemon and t.is_alive():
                t.join(timeout=2)

    def test_enabled_property(self):
        ch = NtfyChannel(
            name="test", server="https://ntfy.sh", topic="topic",
            categories=["clickup"],
        )
        assert NtfyNotifier([ch]).enabled is True
        assert NtfyNotifier([]).enabled is False


# ---------------------------------------------------------------------------
# Post format
# ---------------------------------------------------------------------------


class TestPostFormat:
    @patch("cheddahbot.ntfy.httpx.post")
    def test_includes_tags_header(self, mock_post):
        mock_post.return_value = MagicMock(status_code=200)
        ch = NtfyChannel(
            name="test", server="https://ntfy.sh", topic="topic",
            categories=["clickup"], tags="white_check_mark",
        )
        notifier = NtfyNotifier([ch])
        notifier.notify("task completed", "clickup")

        import threading
        for t in threading.enumerate():
            if t.daemon and t.is_alive():
                t.join(timeout=2)

        headers = mock_post.call_args[1]["headers"]
        assert headers["Tags"] == "white_check_mark"

    @patch("cheddahbot.ntfy.httpx.post")
    def test_omits_tags_header_when_empty(self, mock_post):
        mock_post.return_value = MagicMock(status_code=200)
        ch = NtfyChannel(
            name="test", server="https://ntfy.sh", topic="topic",
            categories=["clickup"], tags="",
        )
        notifier = NtfyNotifier([ch])
        notifier.notify("task completed", "clickup")

        import threading
        for t in threading.enumerate():
            if t.daemon and t.is_alive():
                t.join(timeout=2)

        headers = mock_post.call_args[1]["headers"]
        assert "Tags" not in headers

    @patch("cheddahbot.ntfy.httpx.post")
    def test_custom_server_url(self, mock_post):
        mock_post.return_value = MagicMock(status_code=200)
        ch = NtfyChannel(
            name="test", server="https://my-ntfy.example.com",
            topic="topic", categories=["clickup"],
        )
        notifier = NtfyNotifier([ch])
        notifier.notify("task completed", "clickup")

        import threading
        for t in threading.enumerate():
            if t.daemon and t.is_alive():
                t.join(timeout=2)

        assert mock_post.call_args[0][0] == "https://my-ntfy.example.com/topic"

    @patch("cheddahbot.ntfy.httpx.post")
    def test_message_sent_as_body(self, mock_post):
        mock_post.return_value = MagicMock(status_code=200)
        ch = NtfyChannel(
            name="test", server="https://ntfy.sh", topic="topic",
            categories=["clickup"],
        )
        notifier = NtfyNotifier([ch])
        notifier.notify("Hello **world**", "clickup")

        import threading
        for t in threading.enumerate():
            if t.daemon and t.is_alive():
                t.join(timeout=2)

        assert mock_post.call_args[1]["content"] == b"Hello **world**"

    @patch("cheddahbot.ntfy.httpx.post")
    def test_priority_header(self, mock_post):
        mock_post.return_value = MagicMock(status_code=200)
        ch = NtfyChannel(
            name="test", server="https://ntfy.sh", topic="topic",
            categories=["clickup"], priority="urgent",
        )
        notifier = NtfyNotifier([ch])
        notifier.notify("task completed", "clickup")

        import threading
        for t in threading.enumerate():
            if t.daemon and t.is_alive():
                t.join(timeout=2)

        assert mock_post.call_args[1]["headers"]["Priority"] == "urgent"


# ---------------------------------------------------------------------------
# Dedup window
# ---------------------------------------------------------------------------


def _make_channel(**overrides) -> NtfyChannel:
    defaults = dict(
        name="errors",
        server="https://ntfy.sh",
        topic="test-topic",
        categories=["alert"],
    )
    defaults.update(overrides)
    return NtfyChannel(**defaults)


class TestDedup:
    def test_first_message_goes_through(self):
        notifier = NtfyNotifier([_make_channel()])
        assert notifier._check_and_track("errors", "task X skipped") is True

    def test_duplicate_permanently_suppressed(self):
        notifier = NtfyNotifier([_make_channel()])
        assert notifier._check_and_track("errors", "task X skipped") is True
        assert notifier._check_and_track("errors", "task X skipped") is False

    def test_duplicate_still_suppressed_after_day_rollover(self):
        notifier = NtfyNotifier([_make_channel()])
        assert notifier._check_and_track("errors", "task X skipped") is True
        # Dedup memory persists even across date rollover
        with patch.object(notifier, "_today", return_value="2099-01-01"):
            assert notifier._check_and_track("errors", "task X skipped") is False

    def test_different_messages_not_deduped(self):
        notifier = NtfyNotifier([_make_channel()])
        assert notifier._check_and_track("errors", "task A skipped") is True
        assert notifier._check_and_track("errors", "task B skipped") is True

    def test_same_message_different_channel_not_deduped(self):
        notifier = NtfyNotifier([_make_channel()])
        assert notifier._check_and_track("errors", "task X skipped") is True
        assert notifier._check_and_track("alerts", "task X skipped") is True


# ---------------------------------------------------------------------------
# Daily cap
# ---------------------------------------------------------------------------


class TestDailyCap:
    def test_sends_up_to_cap(self):
        notifier = NtfyNotifier([_make_channel()], daily_cap=3)
        for i in range(3):
            assert notifier._check_and_track("errors", f"msg {i}") is True
        assert notifier._check_and_track("errors", "msg 3") is False

    def test_cap_resets_on_new_day(self):
        notifier = NtfyNotifier([_make_channel()], daily_cap=2)
        assert notifier._check_and_track("errors", "msg 0") is True
        assert notifier._check_and_track("errors", "msg 1") is True
        assert notifier._check_and_track("errors", "msg 2") is False

        with patch.object(notifier, "_today", return_value="2099-01-01"):
            assert notifier._check_and_track("errors", "msg 2") is True


# ---------------------------------------------------------------------------
# 429 backoff
# ---------------------------------------------------------------------------


class TestRateLimitBackoff:
    def test_429_suppresses_rest_of_day(self):
        notifier = NtfyNotifier([_make_channel()])
        notifier._mark_rate_limited()
        assert notifier._check_and_track("errors", "new message") is False

    def test_429_resets_next_day(self):
        notifier = NtfyNotifier([_make_channel()])
        notifier._mark_rate_limited()
        assert notifier._check_and_track("errors", "blocked") is False

        with patch.object(notifier, "_today", return_value="2099-01-01"):
            assert notifier._check_and_track("errors", "unblocked") is True

    def test_post_sets_rate_limit_on_429(self):
        channel = _make_channel()
        notifier = NtfyNotifier([channel])

        mock_resp = MagicMock(status_code=429, text="Rate limited")
        with patch("cheddahbot.ntfy.httpx.post", return_value=mock_resp):
            notifier._post(channel, "test msg", "alert")

        assert notifier._rate_limited_until == notifier._today()


# ---------------------------------------------------------------------------
# Notify integration with dedup
# ---------------------------------------------------------------------------


class TestNotifyDedup:
    @patch("cheddahbot.ntfy.httpx.post")
    def test_notify_skips_deduped_messages(self, mock_post):
        mock_post.return_value = MagicMock(status_code=200)
        channel = _make_channel()
        notifier = NtfyNotifier([channel])

        notifier.notify("same msg", "alert")
        notifier.notify("same msg", "alert")

        import threading
        for t in threading.enumerate():
            if t.daemon and t.is_alive():
                t.join(timeout=2)

        # Only one post — second was deduped
        mock_post.assert_called_once()
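The deleted routing tests encode a clear precedence: category gate first, exclude patterns beat include patterns, matching is case-insensitive substring search, and an empty include list accepts everything in the category. A sketch of `accepts` under that reading; the real `NtfyChannel` carries more fields (name, server, topic, tags, priority) that do not affect routing:

```python
# Sketch of the routing precedence the deleted tests pin down.
# `Channel` is a stand-in, not the project's NtfyChannel.
from dataclasses import dataclass, field


@dataclass
class Channel:
    categories: list[str]
    include_patterns: list[str] = field(default_factory=list)
    exclude_patterns: list[str] = field(default_factory=list)

    def accepts(self, message: str, category: str) -> bool:
        if category not in self.categories:
            return False  # category gate comes first
        msg = message.lower()
        if any(p.lower() in msg for p in self.exclude_patterns):
            return False  # exclude beats include
        if not self.include_patterns:
            return True  # no includes: accept everything in the category
        return any(p.lower() in msg for p in self.include_patterns)
```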
@@ -552,7 +552,7 @@ class TestSubmitPressRelease:
             headline="Advanced Industrial Expands PEEK Machining",
             company_name="Advanced Industrial",
             pr_text=REALISTIC_PR_TEXT,
-            keyword="PEEK machining",
+            topic="PEEK machining",
             target_url="https://advancedindustrial.com/peek",
             ctx=submit_ctx,
         )

@@ -575,7 +575,7 @@ class TestSubmitPressRelease:
             headline="Advanced Industrial Expands PEEK Machining",
             company_name="Advanced Industrial",
             pr_text=REALISTIC_PR_TEXT,
-            keyword="PEEK machining",
+            topic="PEEK machining",
             branded_url="https://linkedin.com/company/advanced-industrial",
             ctx=submit_ctx,
         )

@@ -598,7 +598,7 @@ class TestSubmitPressRelease:
             headline="Advanced Industrial Expands PEEK Machining",
             company_name="Advanced Industrial",
             pr_text=REALISTIC_PR_TEXT,
-            keyword="PEEK machining",
+            topic="PEEK machining",
             branded_url="GBP",
             ctx=submit_ctx,
         )

@@ -694,7 +694,7 @@ class TestSubmitPressRelease:
             headline="Advanced Industrial Expands PEEK Machining",
             company_name="Advanced Industrial",
             pr_text=LONG_PR_TEXT,
-            keyword="PEEK machining",
+            topic="PEEK machining",
             target_url="https://example.com/peek",
             ctx=submit_ctx,
         )
@@ -2,6 +2,7 @@
 
 from __future__ import annotations
 
+import json
 from dataclasses import dataclass, field
 from datetime import UTC, datetime
 from unittest.mock import MagicMock

@@ -17,7 +18,7 @@ _PR_MAPPING = {
     "auto_execute": True,
     "field_mapping": {
         "topic": "task_name",
-        "company_name": "Client",
+        "company_name": "Customer",
     },
 }

@@ -35,7 +36,6 @@ class _FakeClickUpConfig:
     error_status: str = "error"
     task_type_field_name: str = "Work Category"
     default_auto_execute: bool = True
-    poll_task_types: list[str] = field(default_factory=lambda: ["Press Release"])
     skill_map: dict = field(default_factory=lambda: {"Press Release": _PR_MAPPING})
     enabled: bool = True

@@ -69,7 +69,7 @@ def _now_ms():
     return int(datetime.now(UTC).timestamp() * 1000)
 
 
-_FIELDS = {"Client": "Acme"}
+_FIELDS = {"Customer": "Acme"}
 
 
 # ── Tests ──
@@ -104,6 +104,55 @@ class TestPollClickup:
         mock_client.discover_field_filter.return_value = field_filter
         return mock_client
 
+    def test_skips_task_already_completed(self, tmp_db):
+        """Tasks with completed state should be skipped."""
+        config = _FakeConfig()
+        agent = MagicMock()
+        scheduler = Scheduler(config, tmp_db, agent)
+
+        state = {"state": "completed", "clickup_task_id": "t1"}
+        tmp_db.kv_set("clickup:task:t1:state", json.dumps(state))
+
+        due = str(_now_ms() + 86400000)
+        task = _make_task(
+            "t1",
+            "PR for Acme",
+            "Press Release",
+            due_date=due,
+            custom_fields=_FIELDS,
+        )
+
+        scheduler._clickup_client = self._make_mock_client(
+            tasks=[task],
+        )
+        scheduler._poll_clickup()
+
+        scheduler._clickup_client.update_task_status.assert_not_called()
+
+    def test_skips_task_already_failed(self, tmp_db):
+        """Tasks with failed state should be skipped."""
+        config = _FakeConfig()
+        agent = MagicMock()
+        scheduler = Scheduler(config, tmp_db, agent)
+
+        state = {"state": "failed", "clickup_task_id": "t1"}
+        tmp_db.kv_set("clickup:task:t1:state", json.dumps(state))
+
+        due = str(_now_ms() + 86400000)
+        task = _make_task(
+            "t1",
+            "PR for Acme",
+            "Press Release",
+            due_date=due,
+        )
+
+        scheduler._clickup_client = self._make_mock_client(
+            tasks=[task],
+        )
+        scheduler._poll_clickup()
+
+        scheduler._clickup_client.update_task_status.assert_not_called()
+
     def test_skips_task_with_no_due_date(self, tmp_db):
         """Tasks with no due date should be skipped."""
         config = _FakeConfig()
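The two skip tests added here imply a terminal-state guard in `_poll_clickup`: a task whose stored state is already `completed` or `failed` must not be dispatched again. Sketched below with an assumed helper name:

```python
# Sketch of the terminal-state guard the skip tests exercise.
# `should_dispatch` is an illustrative name; `db` exposes kv_get as in the tests.
import json

TERMINAL_STATES = {"completed", "failed"}


def should_dispatch(db, task_id: str) -> bool:
    raw = db.kv_get(f"clickup:task:{task_id}:state")
    if raw and json.loads(raw).get("state") in TERMINAL_STATES:
        return False  # already completed or failed: never touch the task again
    return True
```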
@@ -150,11 +199,11 @@ class TestExecuteTask:
     """Test the simplified _execute_task method."""
 
     def test_success_flow(self, tmp_db):
-        """Successful execution: tool called, automation underway set."""
+        """Successful execution: state=completed."""
         config = _FakeConfig()
         agent = MagicMock()
         agent._tools = MagicMock()
-        agent._tools.execute.return_value = "Pipeline completed successfully"
+        agent._tools.execute.return_value = "## ClickUp Sync\nDone"
         scheduler = Scheduler(config, tmp_db, agent)
 
         mock_client = MagicMock()
@@ -175,10 +224,51 @@ class TestExecuteTask:
             "t1",
             "automation underway",
         )
-        agent._tools.execute.assert_called_once()
+
+        raw = tmp_db.kv_get("clickup:task:t1:state")
+        state = json.loads(raw)
+        assert state["state"] == "completed"
+
+    def test_success_fallback_path(self, tmp_db):
+        """Scheduler uploads docx and sets review status."""
+        config = _FakeConfig()
+        agent = MagicMock()
+        agent._tools = MagicMock()
+        agent._tools.execute.return_value = "Press releases done.\n**Docx:** `output/pr.docx`"
+        scheduler = Scheduler(config, tmp_db, agent)
+
+        mock_client = MagicMock()
+        mock_client.update_task_status.return_value = True
+        mock_client.upload_attachment.return_value = True
+        mock_client.add_comment.return_value = True
+        scheduler._clickup_client = mock_client
+
+        due = str(_now_ms() + 86400000)
+        task = _make_task(
+            "t1",
+            "PR for Acme",
+            "Press Release",
+            due_date=due,
+            custom_fields=_FIELDS,
+        )
+        scheduler._execute_task(task)
+
+        mock_client.update_task_status.assert_any_call(
+            "t1",
+            "internal review",
+        )
+        mock_client.upload_attachment.assert_called_once_with(
+            "t1",
+            "output/pr.docx",
+        )
+
+        raw = tmp_db.kv_get("clickup:task:t1:state")
+        state = json.loads(raw)
+        assert state["state"] == "completed"
+        assert "output/pr.docx" in state["deliverable_paths"]
 
     def test_failure_flow(self, tmp_db):
-        """Failed: error comment posted, status set to 'error'."""
+        """Failed: state=failed, error comment, status set to 'error'."""
         config = _FakeConfig()
         agent = MagicMock()
         agent._tools = MagicMock()
@@ -205,6 +295,11 @@ class TestExecuteTask:
         comment_text = mock_client.add_comment.call_args[0][1]
         assert "failed" in comment_text.lower()
 
+        raw = tmp_db.kv_get("clickup:task:t1:state")
+        state = json.loads(raw)
+        assert state["state"] == "failed"
+        assert "API timeout" in state["error"]
+
 
 class TestFieldFilterDiscovery:
     """Test _discover_field_filter caching."""

@@ -232,62 +327,3 @@ class TestFieldFilterDiscovery:
         mock_client.discover_field_filter.reset_mock()
         scheduler._poll_clickup()
         mock_client.discover_field_filter.assert_not_called()
-
-
-class TestActiveExecutions:
-    """Test the active execution registry."""
-
-    def test_register_and_get(self, tmp_db):
-        config = _FakeConfig()
-        scheduler = Scheduler(config, tmp_db, MagicMock())
-
-        scheduler._register_execution("t1", "Task One", "write_press_releases")
-        active = scheduler.get_active_executions()
-
-        assert "t1" in active
-        assert active["t1"]["name"] == "Task One"
-        assert active["t1"]["tool"] == "write_press_releases"
-        assert "started_at" in active["t1"]
-        assert "thread" in active["t1"]
-
-    def test_unregister(self, tmp_db):
-        config = _FakeConfig()
-        scheduler = Scheduler(config, tmp_db, MagicMock())
-
-        scheduler._register_execution("t1", "Task One", "write_press_releases")
-        scheduler._unregister_execution("t1")
-        assert scheduler.get_active_executions() == {}
-
-    def test_unregister_nonexistent_is_noop(self, tmp_db):
-        config = _FakeConfig()
-        scheduler = Scheduler(config, tmp_db, MagicMock())
-
-        # Should not raise
-        scheduler._unregister_execution("nonexistent")
-        assert scheduler.get_active_executions() == {}
-
-    def test_multiple_executions(self, tmp_db):
-        config = _FakeConfig()
-        scheduler = Scheduler(config, tmp_db, MagicMock())
-
-        scheduler._register_execution("t1", "Task One", "write_press_releases")
-        scheduler._register_execution("t2", "Task Two", "run_cora_backlinks")
-        active = scheduler.get_active_executions()
-
-        assert len(active) == 2
-        assert "t1" in active
-        assert "t2" in active
-
-    def test_get_returns_snapshot(self, tmp_db):
-        """get_active_executions returns a copy, not a reference."""
-        config = _FakeConfig()
-        scheduler = Scheduler(config, tmp_db, MagicMock())
-
-        scheduler._register_execution("t1", "Task One", "tool_a")
-        snapshot = scheduler.get_active_executions()
-        scheduler._unregister_execution("t1")
-
-        # Snapshot should still have t1
-        assert "t1" in snapshot
-        # But live state should be empty
-        assert scheduler.get_active_executions() == {}
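The removed `TestActiveExecutions` class pins down a registry contract: entries carry `name`, `tool`, `started_at`, and `thread`; unregistering a missing id is a no-op; and `get_active_executions()` returns a snapshot, not a live reference. The obvious dict-plus-lock sketch of that contract (the real `Scheduler` methods may differ):

```python
# Sketch of the active-execution registry contract; names other than the
# dict keys asserted in the tests are illustrative.
import threading
from datetime import UTC, datetime


class ExecutionRegistry:
    def __init__(self) -> None:
        self._active: dict[str, dict] = {}
        self._lock = threading.Lock()

    def register(self, task_id: str, name: str, tool: str) -> None:
        with self._lock:
            self._active[task_id] = {
                "name": name,
                "tool": tool,
                "started_at": datetime.now(UTC).isoformat(),
                "thread": threading.current_thread().name,
            }

    def unregister(self, task_id: str) -> None:
        with self._lock:
            self._active.pop(task_id, None)  # no-op if the id is absent

    def snapshot(self) -> dict[str, dict]:
        with self._lock:
            return dict(self._active)  # a copy: later unregisters don't mutate it
```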
@@ -1,42 +1,32 @@
-"""Tests for scheduler helper functions.
-
-Note: _extract_docx_paths was removed as part of KV store elimination.
-The scheduler no longer handles docx extraction — tools own their own sync.
-"""
+"""Tests for scheduler helper functions."""
 
 from __future__ import annotations
 
+from cheddahbot.scheduler import _extract_docx_paths
 
-class TestLoopTimestamps:
-    """Test that loop timestamps use in-memory storage."""
-
-    def test_initial_timestamps_are_none(self):
-        from unittest.mock import MagicMock
-
-        from cheddahbot.scheduler import Scheduler
-
-        config = MagicMock()
-        db = MagicMock()
-        agent = MagicMock()
-        sched = Scheduler(config, db, agent)
-
-        timestamps = sched.get_loop_timestamps()
-        assert timestamps["heartbeat"] is None
-        assert timestamps["poll"] is None
-        assert timestamps["clickup"] is None
-
-    def test_timestamps_update_in_memory(self):
-        from unittest.mock import MagicMock
-
-        from cheddahbot.scheduler import Scheduler
-
-        config = MagicMock()
-        db = MagicMock()
-        agent = MagicMock()
-        sched = Scheduler(config, db, agent)
-
-        sched._loop_timestamps["heartbeat"] = "2026-02-27T12:00:00+00:00"
-        timestamps = sched.get_loop_timestamps()
-        assert timestamps["heartbeat"] == "2026-02-27T12:00:00+00:00"
-        # Ensure db.kv_set was never called
-        db.kv_set.assert_not_called()
+
+class TestExtractDocxPaths:
+    def test_extracts_paths_from_realistic_output(self):
+        result = (
+            "Press releases generated successfully!\n\n"
+            "**Docx:** `output/press_releases/acme-corp-launch.docx`\n"
+            "**Docx:** `output/press_releases/acme-corp-expansion.docx`\n"
+            "Files saved to output/press_releases/"
+        )
+        paths = _extract_docx_paths(result)
+
+        assert len(paths) == 2
+        assert paths[0] == "output/press_releases/acme-corp-launch.docx"
+        assert paths[1] == "output/press_releases/acme-corp-expansion.docx"
+
+    def test_returns_empty_list_when_no_paths(self):
+        result = "Task completed successfully. No files generated."
+        paths = _extract_docx_paths(result)
+
+        assert paths == []
+
+    def test_only_matches_docx_extension(self):
+        result = "**Docx:** `report.docx`\n**PDF:** `report.pdf`\n**Docx:** `summary.txt`\n"
+        paths = _extract_docx_paths(result)
+
+        assert paths == ["report.docx"]
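All three `TestExtractDocxPaths` cases are satisfied by a single regex over the ``**Docx:** `path` `` markers; whether the real `cheddahbot.scheduler._extract_docx_paths` was written exactly this way is an assumption:

```python
# A regex that passes all three test cases above; sketch, not the project's code.
import re

_DOCX_RE = re.compile(r"\*\*Docx:\*\* `([^`]+\.docx)`")


def _extract_docx_paths(result: str) -> list[str]:
    # Only backticked paths ending in .docx behind a **Docx:** label match,
    # so report.pdf and summary.txt are skipped.
    return _DOCX_RE.findall(result)
```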
uv.lock
@@ -325,16 +325,12 @@ dependencies = [
     { name = "edge-tts" },
     { name = "gradio" },
     { name = "httpx" },
-    { name = "jinja2" },
     { name = "numpy" },
     { name = "openai" },
-    { name = "openpyxl" },
     { name = "python-docx" },
     { name = "python-dotenv" },
-    { name = "python-multipart" },
     { name = "pyyaml" },
     { name = "sentence-transformers" },
-    { name = "sse-starlette" },
 ]
 
 [package.dev-dependencies]

@@ -360,16 +356,12 @@ requires-dist = [
     { name = "edge-tts", specifier = ">=6.1" },
     { name = "gradio", specifier = ">=5.0" },
     { name = "httpx", specifier = ">=0.27" },
-    { name = "jinja2", specifier = ">=3.1.6" },
     { name = "numpy", specifier = ">=1.24" },
     { name = "openai", specifier = ">=1.30" },
-    { name = "openpyxl", specifier = ">=3.1.5" },
     { name = "python-docx", specifier = ">=1.2.0" },
     { name = "python-dotenv", specifier = ">=1.0" },
-    { name = "python-multipart", specifier = ">=0.0.22" },
     { name = "pyyaml", specifier = ">=6.0" },
     { name = "sentence-transformers", specifier = ">=3.0" },
-    { name = "sse-starlette", specifier = ">=3.3.3" },
 ]
 
 [package.metadata.requires-dev]

@@ -572,15 +564,6 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/bf/89/92ac6b154ab87d236c15e5e0c73cb99be58efb1ea3eb9318c266bf9a36bf/edge_tts-7.2.7-py3-none-any.whl", hash = "sha256:ac11d9e834347e5ee62cbe72e8a56ffd65d3c4e795be14b1e593b72cf6480dd9", size = 30556, upload-time = "2025-12-12T20:54:26.956Z" },
 ]
 
-[[package]]
-name = "et-xmlfile"
-version = "2.0.0"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/d3/38/af70d7ab1ae9d4da450eeec1fa3918940a5fafb9055e934af8d6eb0c2313/et_xmlfile-2.0.0.tar.gz", hash = "sha256:dab3f4764309081ce75662649be815c4c9081e88f0837825f90fd28317d4da54", size = 17234, upload-time = "2024-10-25T17:25:40.039Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/c1/8b/5fe2cc11fee489817272089c4203e679c63b570a5aaeb18d852ae3cbba6a/et_xmlfile-2.0.0-py3-none-any.whl", hash = "sha256:7a91720bc756843502c3b7504c77b8fe44217c85c537d85037f0f536151b2caa", size = 18059, upload-time = "2024-10-25T17:25:39.051Z" },
-]
-
 [[package]]
 name = "fastapi"
 version = "0.129.0"

@@ -1569,18 +1552,6 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/cc/56/0a89092a453bb2c676d66abee44f863e742b2110d4dbb1dbcca3f7e5fc33/openai-2.21.0-py3-none-any.whl", hash = "sha256:0bc1c775e5b1536c294eded39ee08f8407656537ccc71b1004104fe1602e267c", size = 1103065, upload-time = "2026-02-14T00:11:59.603Z" },
 ]
 
-[[package]]
-name = "openpyxl"
-version = "3.1.5"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "et-xmlfile" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/3d/f9/88d94a75de065ea32619465d2f77b29a0469500e99012523b91cc4141cd1/openpyxl-3.1.5.tar.gz", hash = "sha256:cf0e3cf56142039133628b5acffe8ef0c12bc902d2aadd3e0fe5878dc08d1050", size = 186464, upload-time = "2024-06-28T14:03:44.161Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/c0/da/977ded879c29cbd04de313843e76868e6e13408a94ed6b987245dc7c8506/openpyxl-3.1.5-py2.py3-none-any.whl", hash = "sha256:5282c12b107bffeef825f4617dc029afaf41d0ea60823bbb665ef3079dc79de2", size = 250910, upload-time = "2024-06-28T14:03:41.161Z" },
-]
-
 [[package]]
 name = "orjson"
 version = "3.11.7"

@@ -2562,19 +2533,6 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/46/2c/1462b1d0a634697ae9e55b3cecdcb64788e8b7d63f54d923fcd0bb140aed/soupsieve-2.8.3-py3-none-any.whl", hash = "sha256:ed64f2ba4eebeab06cc4962affce381647455978ffc1e36bb79a545b91f45a95", size = 37016, upload-time = "2026-01-20T04:27:01.012Z" },
 ]
 
-[[package]]
-name = "sse-starlette"
-version = "3.3.3"
-source = { registry = "https://pypi.org/simple" }
-dependencies = [
-    { name = "anyio" },
-    { name = "starlette" },
-]
-sdist = { url = "https://files.pythonhosted.org/packages/14/2f/9223c24f568bb7a0c03d751e609844dce0968f13b39a3f73fbb3a96cd27a/sse_starlette-3.3.3.tar.gz", hash = "sha256:72a95d7575fd5129bd0ae15275ac6432bb35ac542fdebb82889c24bb9f3f4049", size = 32420, upload-time = "2026-03-17T20:05:55.529Z" }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/78/e2/b8cff57a67dddf9a464d7e943218e031617fb3ddc133aeeb0602ff5f6c85/sse_starlette-3.3.3-py3-none-any.whl", hash = "sha256:c5abb5082a1cc1c6294d89c5290c46b5f67808cfdb612b7ec27e8ba061c22e8d", size = 14329, upload-time = "2026-03-17T20:05:54.35Z" },
-]
-
 [[package]]
 name = "starlette"
 version = "0.52.1"

@@ -2741,12 +2699,6 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/0f/8b/4b61d6e13f7108f36910df9ab4b58fd389cc2520d54d81b88660804aad99/torch-2.10.0-2-cp311-none-macosx_11_0_arm64.whl", hash = "sha256:418997cb02d0a0f1497cf6a09f63166f9f5df9f3e16c8a716ab76a72127c714f", size = 79423467, upload-time = "2026-02-10T21:44:48.711Z" },
     { url = "https://files.pythonhosted.org/packages/d3/54/a2ba279afcca44bbd320d4e73675b282fcee3d81400ea1b53934efca6462/torch-2.10.0-2-cp312-none-macosx_11_0_arm64.whl", hash = "sha256:13ec4add8c3faaed8d13e0574f5cd4a323c11655546f91fbe6afa77b57423574", size = 79498202, upload-time = "2026-02-10T21:44:52.603Z" },
     { url = "https://files.pythonhosted.org/packages/ec/23/2c9fe0c9c27f7f6cb865abcea8a4568f29f00acaeadfc6a37f6801f84cb4/torch-2.10.0-2-cp313-none-macosx_11_0_arm64.whl", hash = "sha256:e521c9f030a3774ed770a9c011751fb47c4d12029a3d6522116e48431f2ff89e", size = 79498254, upload-time = "2026-02-10T21:44:44.095Z" },
-    { url = "https://files.pythonhosted.org/packages/36/ab/7b562f1808d3f65414cd80a4f7d4bb00979d9355616c034c171249e1a303/torch-2.10.0-3-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:ac5bdcbb074384c66fa160c15b1ead77839e3fe7ed117d667249afce0acabfac", size = 915518691, upload-time = "2026-03-11T14:15:43.147Z" },
-    { url = "https://files.pythonhosted.org/packages/b3/7a/abada41517ce0011775f0f4eacc79659bc9bc6c361e6bfe6f7052a6b9363/torch-2.10.0-3-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:98c01b8bb5e3240426dcde1446eed6f40c778091c8544767ef1168fc663a05a6", size = 915622781, upload-time = "2026-03-11T14:17:11.354Z" },
-    { url = "https://files.pythonhosted.org/packages/ab/c6/4dfe238342ffdcec5aef1c96c457548762d33c40b45a1ab7033bb26d2ff2/torch-2.10.0-3-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:80b1b5bfe38eb0e9f5ff09f206dcac0a87aadd084230d4a36eea5ec5232c115b", size = 915627275, upload-time = "2026-03-11T14:16:11.325Z" },
-    { url = "https://files.pythonhosted.org/packages/d8/f0/72bf18847f58f877a6a8acf60614b14935e2f156d942483af1ffc081aea0/torch-2.10.0-3-cp313-cp313t-manylinux_2_28_x86_64.whl", hash = "sha256:46b3574d93a2a8134b3f5475cfb98e2eb46771794c57015f6ad1fb795ec25e49", size = 915523474, upload-time = "2026-03-11T14:17:44.422Z" },
-    { url = "https://files.pythonhosted.org/packages/f4/39/590742415c3030551944edc2ddc273ea1fdfe8ffb2780992e824f1ebee98/torch-2.10.0-3-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:b1d5e2aba4eb7f8e87fbe04f86442887f9167a35f092afe4c237dfcaaef6e328", size = 915632474, upload-time = "2026-03-11T14:15:13.666Z" },
-    { url = "https://files.pythonhosted.org/packages/b6/8e/34949484f764dde5b222b7fe3fede43e4a6f0da9d7f8c370bb617d629ee2/torch-2.10.0-3-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:0228d20b06701c05a8f978357f657817a4a63984b0c90745def81c18aedfa591", size = 915523882, upload-time = "2026-03-11T14:14:46.311Z" },
     { url = "https://files.pythonhosted.org/packages/78/89/f5554b13ebd71e05c0b002f95148033e730d3f7067f67423026cc9c69410/torch-2.10.0-cp311-cp311-manylinux_2_28_aarch64.whl", hash = "sha256:3282d9febd1e4e476630a099692b44fdc214ee9bf8ee5377732d9d9dfe5712e4", size = 145992610, upload-time = "2026-01-21T16:25:26.327Z" },
     { url = "https://files.pythonhosted.org/packages/ae/30/a3a2120621bf9c17779b169fc17e3dc29b230c29d0f8222f499f5e159aa8/torch-2.10.0-cp311-cp311-manylinux_2_28_x86_64.whl", hash = "sha256:a2f9edd8dbc99f62bc4dfb78af7bf89499bca3d753423ac1b4e06592e467b763", size = 915607863, upload-time = "2026-01-21T16:25:06.696Z" },
     { url = "https://files.pythonhosted.org/packages/6f/3d/c87b33c5f260a2a8ad68da7147e105f05868c281c63d65ed85aa4da98c66/torch-2.10.0-cp311-cp311-win_amd64.whl", hash = "sha256:29b7009dba4b7a1c960260fc8ac85022c784250af43af9fb0ebafc9883782ebd", size = 113723116, upload-time = "2026-01-21T16:25:21.916Z" },