# CheddahBot Tools Reference

## Overview

CheddahBot uses an extensible tool system that allows the LLM to invoke Python functions during a conversation. Tools are registered via the `@tool` decorator and auto-discovered at startup. The LLM receives tool schemas in OpenAI function-calling format and can request tool invocations, which the agent executes and feeds back into the conversation.

---

## Registered Tools

### Category: files

#### `read_file`

Read the contents of a file.

| Parameter | Type   | Required | Description      |
|-----------|--------|----------|------------------|
| `path`    | string | Yes      | Path to the file |

Returns the file contents as a string. Files larger than 50,000 characters are truncated. Returns an error message if the file is not found or is not a regular file.

---

#### `write_file`

Write content to a file (creates or overwrites).

| Parameter | Type   | Required | Description                  |
|-----------|--------|----------|------------------------------|
| `path`    | string | Yes      | Path to the file             |
| `content` | string | Yes      | Content to write to the file |

Creates parent directories automatically if they do not exist.

---

#### `edit_file`

Replace text in a file (first occurrence).

| Parameter  | Type   | Required | Description              |
|------------|--------|----------|--------------------------|
| `path`     | string | Yes      | Path to the file         |
| `old_text` | string | Yes      | Text to find and replace |
| `new_text` | string | Yes      | Replacement text         |

Replaces only the first occurrence of `old_text`. Returns an error if the file does not exist or the text is not found.

---

#### `list_directory`

List files and folders in a directory.

| Parameter | Type   | Required | Description                        |
|-----------|--------|----------|------------------------------------|
| `path`    | string | No       | Directory path (defaults to `"."`) |

Returns up to 200 entries, sorted with directories first. Each entry shows the name and file size.

---

#### `search_files`

Search for files matching a glob pattern.

| Parameter   | Type   | Required | Description                        |
|-------------|--------|----------|------------------------------------|
| `pattern`   | string | Yes      | Glob pattern (e.g. `"**/*.py"`)    |
| `directory` | string | No       | Root directory (defaults to `"."`) |

Returns up to 100 matching file paths.

---

#### `search_in_files`

Search for text content across files.

| Parameter   | Type   | Required | Description                           |
|-------------|--------|----------|---------------------------------------|
| `query`     | string | Yes      | Text to search for (case-insensitive) |
| `directory` | string | No       | Root directory (defaults to `"."`)    |
| `extension` | string | No       | File extension filter (e.g. `".py"`)  |

Returns up to 50 matches in `file:line: content` format. Skips files larger than 1 MB.

---

### Category: shell

#### `run_command`

Execute a shell command and return output.

| Parameter | Type    | Required | Description                              |
|-----------|---------|----------|------------------------------------------|
| `command` | string  | Yes      | Shell command to execute                 |
| `timeout` | integer | No       | Timeout in seconds (default 30, max 120) |

Includes safety checks that block dangerous patterns:

- `rm -rf /`
- `format c:`
- `:(){:|:&};:` (fork bomb)
- `dd if=/dev/zero`
- `mkfs.`
- `> /dev/sda`

Output is truncated to 10,000 characters. Returns stdout, stderr, and exit code.

---

### Category: web

#### `web_search`

Search the web using DuckDuckGo.

| Parameter     | Type    | Required | Description                   |
|---------------|---------|----------|-------------------------------|
| `query`       | string  | Yes      | Search query                  |
| `max_results` | integer | No       | Number of results (default 5) |

Uses DuckDuckGo HTML search (no API key required). Returns formatted results with title, URL, and snippet.

---

#### `fetch_url`

Fetch and extract text content from a URL.

| Parameter | Type   | Required | Description  |
|-----------|--------|----------|--------------|
| `url`     | string | Yes      | URL to fetch |

For HTML pages: strips script, style, nav, footer, and header elements, then extracts text (truncated to 15,000 characters). For JSON responses: returns raw JSON (truncated to 15,000 characters). For other content types: returns raw text (truncated to 5,000 characters).

---

### Category: code

#### `run_python`

Execute Python code and return the output.

| Parameter | Type    | Required | Description                             |
|-----------|---------|----------|-----------------------------------------|
| `code`    | string  | Yes      | Python code to execute                  |
| `timeout` | integer | No       | Timeout in seconds (default 30, max 60) |

Writes the code to a temporary file and runs it as a subprocess using the same Python interpreter that CheddahBot is running on. The temp file is deleted after execution. Output is truncated to 10,000 characters.

---

### Category: memory

#### `remember_this`

Save an important fact or instruction to long-term memory.

| Parameter | Type   | Required | Description                     |
|-----------|--------|----------|---------------------------------|
| `text`    | string | Yes      | The fact or instruction to save |

Appends a timestamped entry to `memory/MEMORY.md` and indexes it in the embedding database for future semantic search.

---

#### `search_memory`

Search through saved memories.

| Parameter | Type   | Required | Description       |
|-----------|--------|----------|-------------------|
| `query`   | string | Yes      | Search query text |

Performs semantic search (or keyword fallback) over all indexed memory entries. Returns results with similarity scores.

---

#### `log_note`

Add a timestamped note to today's daily log.

| Parameter | Type   | Required | Description      |
|-----------|--------|----------|------------------|
| `text`    | string | Yes      | Note text to log |

Appends to `memory/YYYY-MM-DD.md` (today's date) and indexes the text for semantic search.

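
The daily-log behaviour described for `log_note` can be pictured roughly as follows. This is a minimal sketch, not CheddahBot's code: the helper name `append_daily_note`, the `memory_dir` argument, and the `- [HH:MM] text` entry format are all assumptions made for illustration, and the semantic-indexing step is left out.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def append_daily_note(text: str, memory_dir: Path, now: Optional[datetime] = None) -> Path:
    """Append a timestamped note to <memory_dir>/YYYY-MM-DD.md (hypothetical helper)."""
    now = now or datetime.now()
    memory_dir.mkdir(parents=True, exist_ok=True)

    # One file per calendar day, named after the date.
    log_path = memory_dir / f"{now:%Y-%m-%d}.md"

    # Open in append mode so earlier notes from the same day are preserved.
    # The entry format below is an assumption, not the documented format.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(f"- [{now:%H:%M}] {text}\n")
    return log_path
```
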

---

### Category: scheduling

#### `schedule_task`

Schedule a recurring or one-time task.

| Parameter  | Type   | Required | Description                                        |
|------------|--------|----------|----------------------------------------------------|
| `name`     | string | Yes      | Human-readable task name                           |
| `prompt`   | string | Yes      | The prompt to send to the agent when the task runs |
| `schedule` | string | Yes      | Cron expression or `"once:YYYY-MM-DDTHH:MM"`       |

Examples:

- `schedule="0 9 * * *"` -- every day at 9:00 AM UTC
- `schedule="once:2026-03-01T14:00"` -- one-time execution

---

#### `list_tasks`

List all scheduled tasks.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| (none)    |      |          |             |

Returns all tasks with their ID, name, schedule, enabled status, and next run time.

---

### Category: media

#### `analyze_image`

Describe or analyze an image file.

| Parameter  | Type   | Required | Description                                                          |
|------------|--------|----------|----------------------------------------------------------------------|
| `path`     | string | Yes      | Path to the image file                                               |
| `question` | string | No       | Question about the image (default: "Describe this image in detail.") |

Reads the image, base64-encodes it, and sends it to the current LLM as a multimodal message. Supports PNG, JPEG, GIF, WebP, and BMP formats. Requires a vision-capable model.

---

### Category: data

#### `read_csv`

Read a CSV file and return summary or specific rows.

| Parameter  | Type    | Required | Description                          |
|------------|---------|----------|--------------------------------------|
| `path`     | string  | Yes      | Path to the CSV file                 |
| `max_rows` | integer | No       | Maximum rows to display (default 20) |

Returns the data formatted as a Markdown table, with a count of total rows if the file is larger than `max_rows`.

---

#### `read_json`

Read and pretty-print a JSON file.

| Parameter | Type   | Required | Description           |
|-----------|--------|----------|-----------------------|
| `path`    | string | Yes      | Path to the JSON file |

Returns the JSON content pretty-printed with 2-space indentation. Truncated to 15,000 characters.

---

#### `query_json`

Extract data from a JSON file using a dot-notation path.

| Parameter   | Type   | Required | Description                                    |
|-------------|--------|----------|------------------------------------------------|
| `path`      | string | Yes      | Path to the JSON file                          |
| `json_path` | string | Yes      | Dot-notation path (e.g. `"data.users.0.name"`) |

Supports `*` as a wildcard for arrays. For example, `"results.*.id"` returns the full array at `results`.

---

### Category: delivery

#### `email_file`

Email a file as an attachment via Gmail SMTP.

| Parameter   | Type   | Required | Description                                        |
|-------------|--------|----------|----------------------------------------------------|
| `file_path` | string | Yes      | Path to the file to send                           |
| `to`        | string | No       | Recipient address (defaults to `EMAIL_DEFAULT_TO`) |
| `subject`   | string | No       | Email subject (defaults to filename)               |

If the file is `.txt`, it is automatically converted to `.docx` before sending. Requires `GMAIL_USERNAME` and `GMAIL_APP_PASSWORD` in `.env`.

---

### Category: content

#### `write_press_releases`

Full autonomous press-release pipeline.

| Parameter         | Type   | Required | Description                  |
|-------------------|--------|----------|------------------------------|
| `topic`           | string | Yes      | Press release topic          |
| `company_name`    | string | Yes      | Company name                 |
| `url`             | string | No       | Reference URL for context    |
| `lsi_terms`       | string | No       | LSI keywords to integrate    |
| `required_phrase` | string | No       | Exact phrase to include once |

Generates 7 headlines, AI-picks the best 2, writes 2 full press releases (600-750 words each), generates JSON-LD schema for each, and saves all files. Output includes `.txt`, `.docx` (Google Docs-ready), and `.json` schema files in `data/generated/press_releases/{company}/`.

---

### Category: meta

#### `build_tool`

Create a new tool from a description. The agent writes Python code with the `@tool` decorator.

| Parameter     | Type   | Required | Description                        |
|---------------|--------|----------|------------------------------------|
| `name`        | string | Yes      | Tool name in snake_case            |
| `description` | string | Yes      | What the tool does                 |
| `code`        | string | Yes      | Python code with `@tool` decorator |

See the dedicated section below for details.

---

#### `build_skill`

Create a new multi-step skill from a description.

| Parameter     | Type   | Required | Description                         |
|---------------|--------|----------|-------------------------------------|
| `name`        | string | Yes      | Skill name in snake_case            |
| `description` | string | Yes      | What the skill does                 |
| `steps`       | string | Yes      | Python code with `@skill` decorator |

See the dedicated section below for details.

---

## How to Create Custom Tools Using `@tool`

### Step 1: Create a Python file in `cheddahbot/tools/`

The file name does not matter (as long as it does not start with `_`), but by convention it should describe the category of tools it contains.

### Step 2: Import the decorator and define your function

```python
"""My custom tools."""
from __future__ import annotations

from . import tool


@tool("greet_user", "Greet a user by name", category="social")
def greet_user(name: str, enthusiasm: int = 1) -> str:
    exclamation = "!" * enthusiasm
    return f"Hello, {name}{exclamation}"
```

### Decorator Signature

```python
@tool(name: str, description: str, category: str = "general")
```

- `name` -- The tool name the LLM will use to invoke it. Must be unique across all registered tools.
- `description` -- A short description shown to the LLM in the system prompt and in the tool schema.
- `category` -- A grouping label for organizing tools in the system prompt.

### Function Requirements

- The return type should be `str`. If the function returns a non-string value, it is converted via `str()`. If it returns `None`, the result is `"Done."`.
- Type annotations on parameters are used to generate the JSON Schema:
  - `str` -> `"string"`
  - `int` -> `"integer"`
  - `float` -> `"number"`
  - `bool` -> `"boolean"`
  - `list` -> `"array"` (of strings)
  - No annotation -> `"string"`
- Parameters with default values are optional in the schema. Parameters without defaults are required.
- To access the agent's runtime context (config, database, memory system, agent instance), add a `ctx: dict = None` parameter. The tool registry will automatically inject a dictionary with keys `"config"`, `"db"`, `"agent"`, and `"memory"`.

### Step 3: Restart CheddahBot

The tool module is auto-discovered on startup. No additional registration code is needed.

### Full Example with Context Access

```python
"""Tools that interact with the database."""
from __future__ import annotations

from . import tool


@tool("count_conversations", "Count total conversations in the database", category="stats")
def count_conversations(ctx: dict = None) -> str:
    if not ctx or not ctx.get("db"):
        return "Database not available."
    row = ctx["db"]._conn.execute("SELECT COUNT(*) as cnt FROM conversations").fetchone()
    return f"Total conversations: {row['cnt']}"


@tool("get_setting", "Retrieve a value from the key-value store", category="config")
def get_setting(key: str, ctx: dict = None) -> str:
    if not ctx or not ctx.get("db"):
        return "Database not available."
    value = ctx["db"].kv_get(key)
    if value is None:
        return f"No value found for key: {key}"
    return f"{key} = {value}"
```

---

## How `build_tool` (Meta-Tool) Works

The `build_tool` tool allows the LLM to create new tools at runtime without restarting the application. This is the mechanism by which you can ask the agent "create a tool that does X" and it will write, save, and hot-load the tool.

### Internal Process

1. **Validation** -- The tool name must be a valid Python identifier.
2. **Code wrapping** -- The provided `code` parameter is wrapped in a module template that adds the necessary `from . import tool` import.
3. **File creation** -- The module is written to `cheddahbot/tools/.py`. If a file with that name already exists, the operation is rejected.
4. **Hot-loading** -- `importlib.import_module()` imports the new module. This triggers the `@tool` decorator inside the code, which registers the tool in the global `_TOOLS` dictionary.
5. **Cleanup on failure** -- If the import fails (syntax error, import error, etc.), the file is deleted to avoid leaving broken modules.

### What the LLM Generates

When the LLM calls `build_tool`, it provides:

- `name`: e.g. `"word_count"`
- `description`: e.g. `"Count words in a text string"`
- `code`: The body of the tool function, including the `@tool` decorator:

```python
@tool("word_count", "Count words in a text string", category="text")
def word_count(text: str) -> str:
    count = len(text.split())
    return f"Word count: {count}"
```

The `build_tool` function wraps this in the necessary imports and writes it to disk.

### Persistence

Because the tool is saved as a `.py` file in the tools directory, it survives application restarts. On the next startup, auto-discovery will find and load it like any other built-in tool.

---

## How `build_skill` Works

The `build_skill` tool creates multi-step skills -- higher-level operations that can orchestrate multiple actions.

### Internal Process

1. **Validation** -- The skill name must be a valid Python identifier.
2. **Code wrapping** -- The provided `steps` parameter is wrapped in a module template that adds `from cheddahbot.skills import skill`.
3. **File creation** -- The module is written to `skills/.py` (the project-level skills directory, not inside the package).
4. **Dynamic loading** -- `skills.load_skill()` uses `importlib.util.spec_from_file_location` to load the module from the file path, triggering the `@skill` decorator.

### The `@skill` Decorator

```python
from cheddahbot.skills import skill


@skill("my_skill", "Description of what this skill does")
def my_skill(**kwargs) -> str:
    # Multi-step logic here
    return "Skill completed."
```

Skills are registered in the global `_SKILLS` dictionary and can be listed with `skills.list_skills()` and executed with `skills.run_skill(name, **kwargs)`.

### Difference Between Tools and Skills

| Aspect     | Tools                                    | Skills                              |
|------------|------------------------------------------|-------------------------------------|
| Invoked by | The LLM (via function calling)           | Code or agent internally            |
| Schema     | OpenAI function-calling JSON schema      | No schema; free-form kwargs         |
| Location   | `cheddahbot/tools/` (inside the package) | `skills/` (project-level directory) |
| Purpose    | Single focused operations                | Multi-step workflows                |

---

## Example: Creating a Custom Tool Manually

Suppose you want a tool that converts temperatures between Fahrenheit and Celsius.

### 1. Create the file

Create `cheddahbot/tools/temperature.py`:

```python
"""Temperature conversion tools."""
from __future__ import annotations

from . import tool


@tool("convert_temperature", "Convert temperature between Fahrenheit and Celsius", category="utility")
def convert_temperature(value: float, from_unit: str = "F") -> str:
    """Convert a temperature value.

    Args:
        value: The temperature value to convert
        from_unit: Source unit - 'F' for Fahrenheit, 'C' for Celsius
    """
    from_unit = from_unit.upper()
    if from_unit == "F":
        celsius = (value - 32) * 5 / 9
        return f"{value} F = {celsius:.1f} C"
    elif from_unit == "C":
        fahrenheit = value * 9 / 5 + 32
        return f"{value} C = {fahrenheit:.1f} F"
    else:
        return f"Unknown unit: {from_unit}. Use 'F' or 'C'."
```

### 2. Restart CheddahBot

The tool is auto-discovered. The LLM will see it in the system prompt:

```
### Utility
- **convert_temperature**(value, from_unit): Convert temperature between Fahrenheit and Celsius
```

### 3. Use it in conversation

> User: "What is 72 degrees Fahrenheit in Celsius?"
>
> The LLM calls `convert_temperature(value=72.0, from_unit="F")` and receives `"72.0 F = 22.2 C"`, which it incorporates into its response.

---

## Example: Asking the Agent to Create a Tool via `build_tool`

You can ask the agent to create a tool for you in natural language. The agent will use the `build_tool` meta-tool to generate and register it.

### Conversation

> **User:** Create a tool that counts the number of lines in a file.
>
> **Agent:** I'll create that tool for you.
>
> *[Uses tool: build_tool]*
> ```
> name: "count_lines"
> description: "Count the number of lines in a file"
> code:
>   @tool("count_lines", "Count the number of lines in a file", category="files")
>   def count_lines(path: str) -> str:
>       from pathlib import Path
>       p = Path(path).resolve()
>       if not p.exists():
>           return f"File not found: {path}"
>       if not p.is_file():
>           return f"Not a file: {path}"
>       try:
>           content = p.read_text(encoding="utf-8", errors="replace")
>           line_count = len(content.splitlines())
>           return f"{path}: {line_count} lines"
>       except Exception as e:
>           return f"Error: {e}"
> ```
> *Tool 'count_lines' created and loaded successfully.*
>
> Done. The `count_lines` tool is now available. You can ask me to count lines in any file, or I'll use it automatically when relevant.

### What Happened Behind the Scenes

1. The agent decided `build_tool` was the right tool for the request.
2. It generated valid Python code including the `@tool` decorator.
3. `build_tool` wrote `cheddahbot/tools/count_lines.py` with the wrapped code.
4. `importlib.import_module` loaded the module, triggering registration.
5. The tool is now available for the LLM to use in the current session and all future sessions.

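
The write-then-import sequence in steps 3 and 4, together with the validation and cleanup described under Internal Process, can be sketched as follows. This is an illustration under stated assumptions, not the actual `build_tool` source: it works against a throwaway package (`sketch_tools`, created by the hypothetical helper `make_sketch_package`) instead of `cheddahbot/tools/`, and the module template, return messages, and function names are invented for the example.

```python
import importlib
import sys
from pathlib import Path

# Hypothetical stand-in for the tools package __init__.py: a registry and decorator.
PACKAGE_INIT = '''
_TOOLS = {}

def tool(name, description, category="general"):
    def decorator(func):
        _TOOLS[name] = {"func": func, "description": description, "category": category}
        return func
    return decorator
'''

# Assumed module template: only adds the relative import the generated code needs.
MODULE_TEMPLATE = "from . import tool\n\n{code}\n"


def make_sketch_package(root: Path, package_name: str = "sketch_tools") -> Path:
    """Create an importable throwaway package under `root` and return its directory."""
    package_dir = root / package_name
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text(PACKAGE_INIT, encoding="utf-8")
    sys.path.insert(0, str(root))
    return package_dir


def build_tool_sketch(package_dir: Path, package_name: str, name: str, code: str) -> str:
    # Validation: the name doubles as a module name, so it must be an identifier.
    if not name.isidentifier():
        return f"Invalid tool name: {name}"

    # File creation: refuse to overwrite an existing module.
    target = package_dir / f"{name}.py"
    if target.exists():
        return f"Tool module already exists: {name}"

    # Code wrapping: prepend the import, then write the module to disk.
    target.write_text(MODULE_TEMPLATE.format(code=code), encoding="utf-8")

    # Hot-loading: importing the module runs the @tool decorator inside it.
    importlib.invalidate_caches()  # make the import system notice the new file
    module_name = f"{package_name}.{name}"
    try:
        importlib.import_module(module_name)
    except Exception as exc:
        # Cleanup on failure: remove the broken file and any cached module entry.
        target.unlink(missing_ok=True)
        sys.modules.pop(module_name, None)
        return f"Failed to load tool '{name}': {exc}"
    return f"Tool '{name}' created and loaded successfully."
```

Calling `importlib.invalidate_caches()` before the import matters here because the file is created while the interpreter is already running; without it, the import system's directory cache can miss a freshly written module.
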
### Verifying the Tool Exists

After creation, the tool appears in:

- The system prompt's tools section (on the next turn).
- The output of `ToolRegistry.get_tools_schema()`.
- The file system at `cheddahbot/tools/count_lines.py`.

---

## Tool Registry API Summary

### `ToolRegistry(config, db, agent)`

Constructor. Auto-discovers and imports all tool modules.

### `ToolRegistry.get_tools_schema() -> list[dict]`

Returns all tools as OpenAI function-calling JSON schema objects.

### `ToolRegistry.get_tools_description() -> str`

Returns a human-readable Markdown string listing all tools organized by category. This is injected into the system prompt.

### `ToolRegistry.execute(name, args) -> str`

Executes a tool by name with the given arguments. Returns the result as a string. Automatically injects the `ctx` context dict if the tool function accepts one.

### `ToolRegistry.register_external(tool_def)`

Manually registers a `ToolDef` object. Used for programmatic tool registration outside the `@tool` decorator.
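
As a rough picture of what `get_tools_schema()` and `execute()` do for a single tool, the sketch below derives an OpenAI function-calling schema from a function's annotations and runs the function with optional `ctx` injection. It follows the type mapping and return-value rules listed under Function Requirements, but the helper names `schema_for` and `execute_sketch` are hypothetical, and hiding `ctx` from the generated schema is an assumption rather than documented behaviour.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Type mapping as listed under "Function Requirements".
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_for(name: str, description: str, func: Callable[..., Any]) -> dict:
    """Build one OpenAI function-calling schema entry from a function signature."""
    hints = get_type_hints(func)  # also resolves string annotations
    properties = {}
    required = []
    for param_name, param in inspect.signature(func).parameters.items():
        if param_name == "ctx":
            continue  # assumption: ctx is injected by the registry, not shown to the LLM
        annotation = hints.get(param_name)
        if annotation is list:
            properties[param_name] = {"type": "array", "items": {"type": "string"}}
        else:
            # Unannotated or unrecognized types fall back to "string".
            properties[param_name] = {"type": _JSON_TYPES.get(annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(param_name)  # no default means the parameter is required
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {"type": "object", "properties": properties, "required": required},
        },
    }


def execute_sketch(func: Callable[..., Any], args: dict, ctx: dict) -> str:
    """Call a tool function, injecting ctx only if its signature accepts it."""
    call_args = dict(args)
    if "ctx" in inspect.signature(func).parameters:
        call_args["ctx"] = ctx
    result = func(**call_args)
    # None becomes "Done."; any other non-string result goes through str().
    return "Done." if result is None else str(result)
```

Applied to the `greet_user` example from earlier in this document, `schema_for` would mark `name` as a required `"string"` and `enthusiasm` as an optional `"integer"`.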