CheddahBot Architecture
System Overview
CheddahBot is a personal AI assistant built in Python. It exposes a Gradio-based web UI, routes user messages through an agent loop backed by a model-agnostic LLM adapter, persists conversations in SQLite, maintains a 4-layer memory system with optional semantic search, and provides an extensible tool registry that the LLM can invoke mid-conversation. A background scheduler handles cron-based tasks and periodic heartbeat checks.
Data Flow Diagram
User (browser)
|
v
+-----------+ +------------+ +--------------+
| Gradio UI | ---> | Agent | ---> | LLM Adapter |
| (ui.py) | | (agent.py) | | (llm.py) |
+-----------+ +-----+------+ +------+-------+
| |
+------------+-------+ +-------+--------+
| | | | Claude CLI |
v v v | OpenRouter |
+---------+ +---------+ +---+ | Ollama |
| Router | | Tools | | DB| | LM Studio |
|(router) | |(tools/) | |(db| +----------------+
+----+----+ +----+----+ +---+
| |
+-------+--+ +----+----+
| Identity | | Memory |
| SOUL.md | | System |
| USER.md | |(memory) |
+----------+ +---------+
1. The user submits text (or voice / files) through the Gradio interface.
2. `ui.py` hands the message to `Agent.respond()`.
3. The agent stores the user message in SQLite, builds a system prompt via `router.py` (loading identity files and memory context), and formats the conversation history.
4. The agent sends messages to `LLMAdapter.chat()`, which dispatches to the correct provider backend.
5. The LLM response streams back. If it contains tool-call requests, the agent executes them through `ToolRegistry.execute()`, appends the results, and loops back to step 4 (up to 10 iterations).
6. The final assistant response is stored in the database and streamed to the UI.
7. After responding, the agent checks whether the conversation has exceeded the flush threshold; if so, the memory system summarizes older messages into the daily log.
Module-by-Module Breakdown
__main__.py -- Entry Point
File: cheddahbot/__main__.py
Orchestrates startup in this order:
1. `load_config()` -- loads configuration from env vars / YAML / defaults.
2. `Database(config.db_path)` -- opens (or creates) the SQLite database.
3. `LLMAdapter(...)` -- initializes the model-agnostic LLM client.
4. `Agent(config, db, llm)` -- creates the core agent.
5. `MemorySystem(config, db)` -- initializes the memory system and injects it into the agent via `agent.set_memory()`.
6. `ToolRegistry(config, db, agent)` -- auto-discovers and loads all tool modules, then injects via `agent.set_tools()`.
7. `Scheduler(config, db, agent)` -- starts two daemon threads (task poller and heartbeat).
8. `create_ui(agent, config, llm)` -- builds the Gradio Blocks app and launches it on the configured host/port.
Each subsystem (memory, tools, scheduler) is wrapped in a try/except so the application degrades gracefully if optional dependencies are missing.
config.py -- Configuration
File: cheddahbot/config.py
Defines four dataclasses:
| Dataclass | Key Fields |
|---|---|
| `Config` | `default_model`, `host`, `port`, `ollama_url`, `lmstudio_url`, `openrouter_api_key`, plus derived paths (`root_dir`, `data_dir`, `identity_dir`, `memory_dir`, `skills_dir`, `db_path`) |
| `MemoryConfig` | `max_context_messages` (50), `flush_threshold` (40), `embedding_model` (`"all-MiniLM-L6-v2"`), `search_top_k` (5) |
| `SchedulerConfig` | `heartbeat_interval_minutes` (30), `poll_interval_seconds` (60) |
| `ShellConfig` | `blocked_commands`, `require_approval` (False) |
load_config() applies three layers of configuration in priority order:
1. Dataclass defaults (lowest priority).
2. `config.yaml` at the project root (middle priority).
3. Environment variables with the `CHEDDAH_` prefix, plus `OPENROUTER_API_KEY` (highest priority).
The function also ensures required data directories exist on disk.
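The three-layer precedence can be sketched with a small stand-alone merge. The function and argument names below are illustrative, not CheddahBot's actual API; they just demonstrate the "later layers win" rule.

```python
def layered_config(defaults: dict, yaml_values: dict, env: dict) -> dict:
    """Merge the three configuration layers; later layers win.

    `yaml_values` stands in for the parsed config.yaml and `env` for
    os.environ. Both names are illustrative, not CheddahBot's API.
    """
    merged = dict(defaults)
    # YAML overrides defaults; skip keys the file left unset.
    merged.update({k: v for k, v in yaml_values.items() if v is not None})
    # Env vars use the CHEDDAH_ prefix, e.g. CHEDDAH_PORT -> port.
    for key, value in env.items():
        if key.startswith("CHEDDAH_"):
            merged[key[len("CHEDDAH_"):].lower()] = value
    return merged
```

Note that env values arrive as strings, so a real implementation would also coerce them to the dataclass field types.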
db.py -- Database Layer
File: cheddahbot/db.py
A thin wrapper around SQLite using thread-local connections (one connection per thread), WAL journal mode, and foreign keys.
Key methods:
- `create_conversation(conv_id, title)` -- insert a new conversation row.
- `list_conversations(limit)` -- return recent conversations ordered by `updated_at`.
- `add_message(conv_id, role, content, ...)` -- insert a message and touch the conversation's `updated_at`.
- `get_messages(conv_id, limit)` -- return messages in chronological order.
- `count_messages(conv_id)` -- count messages for flush-threshold checks.
- `add_scheduled_task(name, prompt, schedule)` -- persist a scheduled task.
- `get_due_tasks()` -- return tasks whose `next_run` is in the past or NULL.
- `update_task_next_run(task_id, next_run)` -- update the next execution time.
- `log_task_run(task_id, result, error)` -- record the outcome of a task run.
- `kv_set(key, value)` / `kv_get(key)` -- generic key-value store.
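The thread-local connection pattern can be sketched as follows. Class and attribute names here are illustrative, not the actual db.py API; the point is that each thread lazily opens its own connection with the documented pragmas.

```python
import sqlite3
import threading

class ThreadLocalDB:
    """One SQLite connection per thread, WAL mode, foreign keys on.

    A minimal sketch of the pattern; not CheddahBot's actual class.
    """
    def __init__(self, path: str):
        self.path = path
        self._local = threading.local()

    @property
    def conn(self) -> sqlite3.Connection:
        # The first access from any given thread opens a fresh connection.
        if getattr(self._local, "conn", None) is None:
            conn = sqlite3.connect(self.path)
            conn.execute("PRAGMA journal_mode=WAL")
            conn.execute("PRAGMA foreign_keys=ON")
            self._local.conn = conn
        return self._local.conn
```

Because each thread owns its connection, the Gradio request threads and the scheduler threads never share a cursor.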
agent.py -- Core Agent Loop
File: cheddahbot/agent.py
Contains the Agent class, the central coordinator.
Key members:
- `conv_id` -- current conversation ID (a 12-character hex string).
- `_memory` -- optional `MemorySystem` reference.
- `_tools` -- optional `ToolRegistry` reference.
Primary method: respond(user_input, files)
This is a Python generator that yields text chunks for streaming. The detailed flow is described in the next section.
Helper: respond_to_prompt(prompt)
Non-streaming wrapper that collects all chunks and returns a single string. Used by the scheduler and heartbeat for internal prompts.
router.py -- System Prompt Builder
File: cheddahbot/router.py
Two functions:
- `build_system_prompt(identity_dir, memory_context, tools_description)` -- assembles the full system prompt by concatenating these sections, separated by horizontal rules:
  - Contents of `identity/SOUL.md`
  - Contents of `identity/USER.md`
  - Memory context string (from the memory system)
  - Tools description listing (from the tool registry)
  - A fixed "Instructions" section with core behavioral directives.
- `format_messages_for_llm(system_prompt, history, max_messages)` -- converts raw database rows into the `[{role, content}]` format expected by the LLM. The system prompt becomes the first message. Tool results are converted to user messages prefixed with `[Tool Result]`. History is trimmed to the most recent `max_messages` entries.
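The history-to-messages transformation can be sketched as follows. This is a simplified stand-in for the real function, assuming history rows behave like dicts with `role` and `content` keys.

```python
def format_messages_for_llm(system_prompt, history, max_messages=50):
    """Convert DB rows to [{role, content}] chat messages (sketch).

    System prompt first, tool rows rewritten as user messages with a
    [Tool Result] prefix, history trimmed to the newest entries.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for row in history[-max_messages:]:
        role, content = row["role"], row["content"]
        if role == "tool":
            # Most chat APIs reject a bare "tool" role without call IDs,
            # so tool output is re-framed as user-visible context.
            role, content = "user", f"[Tool Result] {content}"
        messages.append({"role": role, "content": content})
    return messages
```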
llm.py -- LLM Adapter
File: cheddahbot/llm.py
Described in detail in a dedicated section below.
memory.py -- Memory System
File: cheddahbot/memory.py
Described in detail in a dedicated section below.
media.py -- Audio/Video Processing
File: cheddahbot/media.py
Three utility functions:
- `transcribe_audio(path)` -- speech-to-text. Tries local Whisper first, then falls back to the OpenAI Whisper API.
- `text_to_speech(text, output_path, voice)` -- text-to-speech via `edge-tts` (free, no API key). Defaults to the `en-US-AriaNeural` voice.
- `extract_video_frames(video_path, max_frames)` -- extracts key frames from video using `ffprobe` (to get duration) and `ffmpeg` (to extract JPEG frames).
scheduler.py -- Scheduler and Heartbeat
File: cheddahbot/scheduler.py
Described in detail in a dedicated section below.
ui.py -- Gradio Web Interface
File: cheddahbot/ui.py
Builds a Gradio Blocks application with:
- A model dropdown (populated from `llm.list_available_models()`) with a refresh button and a "New Chat" button.
- A `gr.Chatbot` widget for the conversation (500px height, copy buttons).
- A `gr.MultimodalTextbox` supporting text, file upload, and microphone input.
- A "Voice Chat" accordion for record-and-respond audio interaction.
- A "Conversation History" accordion showing past conversations from the database.
- A "Settings" accordion with guidance on editing identity and config files.
Event wiring:
- Model dropdown change calls `llm.switch_model()`.
- Refresh button re-discovers local models.
- Message submit calls `agent.respond()` in streaming mode, updating the chatbot widget with each chunk.
- Audio files attached to messages are transcribed via `media.transcribe_audio()` before being sent to the agent.
- Voice Chat records audio, transcribes it, gets a text response from the agent, converts it to speech via `media.text_to_speech()`, and plays it back.
tools/__init__.py -- Tool Registry
File: cheddahbot/tools/__init__.py
Described in detail in a dedicated section below.
skills/__init__.py -- Skill Registry
File: cheddahbot/skills/__init__.py
Defines a parallel registry for "skills" (multi-step operations). Key pieces:
- `SkillDef` -- dataclass holding `name`, `description`, `func`.
- `@skill(name, description)` -- decorator that registers a skill in the global `_SKILLS` dict.
- `load_skill(path)` -- dynamically loads a `.py` file as a module (triggering any `@skill` decorators inside it).
- `discover_skills(skills_dir)` -- loads all `.py` files from the skills directory.
- `list_skills()` / `run_skill(name, **kwargs)` -- query and execute skills.
providers/__init__.py -- Provider Extensions
File: cheddahbot/providers/__init__.py
Reserved for future custom provider implementations. Currently empty.
The Agent Loop in Detail
When Agent.respond(user_input) is called, the following sequence occurs:
1. ensure_conversation()
|-- Creates a new conversation in the DB if one doesn't exist
|
2. db.add_message(conv_id, "user", user_input)
|-- Persists the user's message
|
3. Build system prompt
|-- memory.get_context(user_input) --> memory context string
|-- tools.get_tools_schema() --> OpenAI-format JSON schemas
|-- tools.get_tools_description() --> human-readable tool list
|-- router.build_system_prompt(identity_dir, memory_context, tools_description)
|
4. Load conversation history from DB
|-- db.get_messages(conv_id, limit=max_context_messages)
|-- router.format_messages_for_llm(system_prompt, history, max_messages)
|
5. AGENT LOOP (up to MAX_TOOL_ITERATIONS = 10):
|
|-- llm.chat(messages, tools=tools_schema, stream=True)
| |-- Yields {"type":"text","content":"..."} chunks --> streamed to user
| |-- Yields {"type":"tool_use","name":"...","input":{...}} chunks
|
|-- If no tool_calls: store assistant message, BREAK
|
|-- If tool_calls present:
| |-- Store assistant message with tool_calls metadata
| |-- For each tool call:
| | |-- yield "Using tool: <name>" indicator
| | |-- tools.execute(name, input) --> result string
| | |-- yield tool result (truncated to 2000 chars)
| | |-- db.add_message(conv_id, "tool", result)
| | |-- Append result to messages as user message
| |-- Continue loop (LLM sees tool results and can respond or call more tools)
|
6. After loop: check if memory flush is needed
|-- If message count > flush_threshold:
| |-- memory.auto_flush(conv_id)
The loop allows the LLM to chain up to 10 consecutive tool calls before being cut off. Each tool result is injected back into the conversation as a user message so the LLM can reason about it in the next iteration.
LLM Adapter Design
File: cheddahbot/llm.py
Provider Routing
The LLMAdapter supports four provider paths. The active provider is determined
by examining the current model ID:
| Model ID Pattern | Provider | Backend |
|---|---|---|
| `claude-*` | `claude` | Claude Code CLI (subprocess) |
| `local/ollama/<model>` | `ollama` | Ollama HTTP API (OpenAI-compat) |
| `local/lmstudio/<model>` | `lmstudio` | LM Studio HTTP API (OpenAI-compat) |
| Anything else | `openrouter` | OpenRouter API (OpenAI-compat) |
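The routing table reduces to a few prefix checks. A minimal sketch (the function name is illustrative; the real adapter exposes this as a `provider` property):

```python
def resolve_provider(model_id: str) -> str:
    """Map a model ID to a provider name, mirroring the routing table."""
    if model_id.startswith("claude-"):
        return "claude"
    if model_id.startswith("local/ollama/"):
        return "ollama"
    if model_id.startswith("local/lmstudio/"):
        return "lmstudio"
    # Any other ID is assumed to be an OpenRouter model slug.
    return "openrouter"
```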
The chat() Method
This is the single entry point. It accepts a list of messages, an optional tools schema, and a stream flag. It returns a generator yielding dictionaries:
- `{"type": "text", "content": "..."}` -- a text chunk to display.
- `{"type": "tool_use", "id": "...", "name": "...", "input": {...}}` -- a tool invocation request.
Claude Code CLI Path (_chat_claude_sdk)
For Claude models, CheddahBot shells out to the claude CLI binary (the Claude
Code SDK):
- Separates system prompt, conversation history, and the latest user message from the messages list.
- Builds a full system prompt by appending conversation history under a "Conversation So Far" heading.
- Invokes `claude -p <prompt> --model <model> --output-format json --system-prompt <system>`.
- The `CLAUDECODE` environment variable is stripped from the subprocess environment to avoid nested-session errors.
- Parses the JSON output and yields the `result` field as a text chunk.
- On Windows, `shell=True` is used for compatibility with npm-installed binaries.
OpenAI-Compatible Path (_chat_openai_sdk)
For OpenRouter, Ollama, and LM Studio, the adapter uses the openai Python SDK:
- `_resolve_endpoint(provider)` returns the base URL and API key:
  - OpenRouter: `https://openrouter.ai/api/v1` with the configured API key.
  - Ollama: `http://localhost:11434/v1` with dummy key `"ollama"`.
  - LM Studio: `http://localhost:1234/v1` with dummy key `"lm-studio"`.
- `_resolve_model_id(provider)` strips the `local/ollama/` or `local/lmstudio/` prefix from the model ID.
- Creates an `openai.OpenAI` client with the resolved base URL and API key.
- In streaming mode: iterates over `client.chat.completions.create(stream=True)`, accumulates tool-call arguments across chunks (indexed by `tc.index`), yields text deltas immediately, and yields completed tool calls at the end of the stream.
- In non-streaming mode: makes a single call and yields text and tool calls from the response.
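The per-index accumulation of streamed tool-call fragments can be sketched like this. The delta shapes are simplified tuples rather than the SDK's actual objects; OpenAI-style streams deliver each call's JSON `arguments` string in pieces that must be concatenated per tool-call index.

```python
def accumulate_tool_calls(deltas):
    """Merge streamed tool-call fragments into complete calls (sketch).

    Each delta is (index, id, name, arguments_fragment); id and name
    arrive once, arguments arrive as string fragments to concatenate.
    """
    calls = {}
    for index, call_id, name, args in deltas:
        slot = calls.setdefault(index, {"id": None, "name": None, "arguments": ""})
        if call_id:
            slot["id"] = call_id
        if name:
            slot["name"] = name
        if args:
            slot["arguments"] += args
    # Emit calls in index order once the stream has ended.
    return [calls[i] for i in sorted(calls)]
```

Only after the stream closes is each accumulated `arguments` string valid JSON that can be parsed into the tool's input.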
Model Discovery
- `discover_local_models()` -- probes the Ollama tags endpoint and the LM Studio models endpoint (3-second timeout each) and returns `ModelInfo` objects.
- `list_available_models()` -- returns a combined list of hardcoded Claude models, hardcoded OpenRouter models (if an API key is configured), and dynamically discovered local models.
Model Switching
switch_model(model_id) updates current_model. The provider property
re-evaluates on every access, so switching models also implicitly switches
providers.
Memory System
File: cheddahbot/memory.py
The 4 Layers
Layer 1: Identity -- identity/SOUL.md, identity/USER.md
(loaded by router.py into the system prompt)
Layer 2: Long-term -- memory/MEMORY.md
(persisted facts and instructions, appended over time)
Layer 3: Daily logs -- memory/YYYY-MM-DD.md
(timestamped entries per day, including auto-flush summaries)
Layer 4: Semantic -- memory/embeddings.db
(SQLite with vector embeddings for similarity search)
How Memory Context is Built
MemorySystem.get_context(query) is called once per agent turn. It assembles a
string from:
- Long-term memory -- the last 2000 characters of `MEMORY.md`.
- Today's log -- the last 1500 characters of today's date file.
- Semantic search results -- the top-k most similar entries to the user's query, formatted as a bulleted list.
This string is injected into the system prompt by router.py under the heading
"Relevant Memory".
Embedding and Search
- The embedding model is `all-MiniLM-L6-v2` from `sentence-transformers` (lazy-loaded, thread-safe via a lock).
- `_index_text(text, doc_id)` -- encodes the text into a vector and stores it in `memory/embeddings.db` (table: `embeddings` with columns `id TEXT`, `text TEXT`, `vector BLOB`).
- `search(query, top_k)` -- encodes the query, loads all vectors from the database, computes cosine similarity against each one, sorts by score, and returns the top-k results.
- If `sentence-transformers` is not installed, `_fallback_search()` performs simple case-insensitive substring matching across all `.md` files in the memory directory.
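The brute-force similarity scan can be sketched in pure Python. This is a stand-in: the real code encodes with sentence-transformers and reads vectors out of SQLite, but the scoring and ranking work the same way.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, rows, top_k=5):
    """Linear scan over (doc_id, text, vector) rows, highest score first."""
    scored = [(cosine(query_vec, vec), doc_id, text) for doc_id, text, vec in rows]
    scored.sort(reverse=True)
    return scored[:top_k]
```

A linear scan is fine at personal-assistant scale (thousands of entries); a vector index only becomes worthwhile far beyond that.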
Writing to Memory
- `remember(text)` -- appends a timestamped entry to `memory/MEMORY.md` and indexes it for semantic search. Exposed to the LLM via the `remember_this` tool.
- `log_daily(text)` -- appends a timestamped entry to today's daily log file and indexes it. Exposed via the `log_note` tool.
Auto-Flush
When Agent.respond() finishes, it checks db.count_messages(conv_id). If the
count exceeds config.memory.flush_threshold (default 40):
- `auto_flush(conv_id)` loads up to 200 messages.
- All but the last 10 are selected for summarization.
- A summary string is built from the selected messages (truncated to 1000 chars).
- The summary is appended to the daily log via `log_daily()`.
This prevents conversations from growing unbounded while preserving context in the daily log for future semantic search.
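The selection step can be sketched as a single slice, assuming a chronologically ordered message list (a sketch of the documented behavior, not the actual code):

```python
def select_for_flush(messages, keep_last=10, max_load=200):
    """Split messages into (to_summarize, to_keep).

    Load at most max_load of the newest messages, then keep the
    final keep_last verbatim and hand the rest to the summarizer.
    """
    window = messages[-max_load:]
    return window[:-keep_last], window[-keep_last:]
```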
Reindexing
reindex_all() clears all embeddings and re-indexes every line (longer than 10
characters) from every .md file in the memory directory. This can be called
to rebuild the search index from scratch.
Tool System
File: cheddahbot/tools/__init__.py (registry) and cheddahbot/tools/*.py
(tool modules)
The @tool Decorator
```python
from cheddahbot.tools import tool

@tool("my_tool_name", "Description of what this tool does", category="general")
def my_tool_name(param1: str, param2: int = 10) -> str:
    return f"Result: {param1}, {param2}"
```
The decorator:
- Creates a `ToolDef` object containing the function, name, description, category, and auto-extracted parameter schema.
- Registers it in the global `_TOOLS` dictionary keyed by name.
- Attaches the `ToolDef` as `func._tool_def` on the original function.
Parameter Schema Generation
_extract_params(func) inspects the function signature using inspect:
- Skips parameters named `self` or `ctx`.
- Maps type annotations to JSON Schema types: `str` -> `"string"`, `int` -> `"integer"`, `float` -> `"number"`, `bool` -> `"boolean"`, `list` -> `"array"`. Unannotated parameters default to `"string"`.
- Parameters without defaults are marked as required.
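A sketch of this extraction using `inspect`, following the same skip and mapping rules (a stand-in for the real `_extract_params`, not its actual source):

```python
import inspect

_TYPE_MAP = {str: "string", int: "integer", float: "number",
             bool: "boolean", list: "array"}

def extract_params(func):
    """Build a JSON-Schema-style parameter block from a signature."""
    props, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        if name in ("self", "ctx"):
            continue  # framework-injected parameters are hidden from the LLM
        props[name] = {"type": _TYPE_MAP.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": props, "required": required}
```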
Schema Output
ToolDef.to_openai_schema() returns the tool definition in OpenAI
function-calling format:
```json
{
  "type": "function",
  "function": {
    "name": "tool_name",
    "description": "...",
    "parameters": {
      "type": "object",
      "properties": { ... },
      "required": [ ... ]
    }
  }
}
```
Auto-Discovery
When ToolRegistry.__init__() is called, _discover_tools() uses
pkgutil.iter_modules to find every .py file in cheddahbot/tools/ (skipping
files starting with _). Each module is imported via importlib.import_module,
which triggers the @tool decorators and populates the global registry.
Tool Execution
ToolRegistry.execute(name, args):
1. Looks up the `ToolDef` in the global `_TOOLS` dict.
2. Inspects the function signature for a `ctx` parameter. If present, injects a context dictionary containing `config`, `db`, `agent`, and `memory`.
3. Calls the function with the provided arguments.
4. Returns the result as a string (or `"Done."` if the function returns `None`).
5. Catches all exceptions and returns `"Tool error: ..."`.
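The execution path can be sketched as follows, with a plain dict standing in for the `_TOOLS` registry (names and shapes are illustrative):

```python
import inspect

def execute(registry, name, args, ctx):
    """Run a registered tool, injecting ctx only when asked for (sketch)."""
    func = registry[name]
    try:
        if "ctx" in inspect.signature(func).parameters:
            # The tool opted in to framework context (config, db, agent...).
            args = {"ctx": ctx, **args}
        result = func(**args)
        return "Done." if result is None else str(result)
    except Exception as exc:
        # Errors become strings the LLM can read and react to.
        return f"Tool error: {exc}"
```

Returning errors as ordinary strings (rather than raising) matters here: the agent loop feeds the result back to the LLM, which can then retry or explain the failure.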
Meta-Tools
Two special tools enable runtime extensibility:
build_tool (in cheddahbot/tools/build_tool.py):
- Accepts `name`, `description`, and `code` (Python source using the `@tool` decorator).
- Writes a new `.py` file into `cheddahbot/tools/`.
- Hot-imports the module via `importlib.import_module`, which triggers the `@tool` decorator and registers the new tool immediately.
- If the import fails, the file is deleted.
build_skill (in cheddahbot/tools/build_skill.py):
- Accepts `name`, `description`, and `steps` (Python source using the `@skill` decorator).
- Writes a new `.py` file into the configured `skills/` directory.
- Calls `skills.load_skill()` to dynamically import it.
Scheduler and Heartbeat Design
File: cheddahbot/scheduler.py
The Scheduler class starts two daemon threads at application boot.
Task Poller Thread
- Runs in `_poll_loop()`, sleeping for `poll_interval_seconds` (default 60) between iterations.
- Each iteration calls `_run_due_tasks()`:
  - Queries `db.get_due_tasks()` for tasks where `next_run` is NULL or in the past.
  - For each due task, calls `agent.respond_to_prompt(task["prompt"])` to generate a response.
  - Logs the result via `db.log_task_run()`.
  - If the schedule is `"once:<datetime>"`, the task is disabled.
  - Otherwise, the schedule is treated as a cron expression: `croniter` is used to calculate the next run time, which is saved via `db.update_task_next_run()`.
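The due-task filter can be sketched with the standard library alone (a stand-in for `get_due_tasks()`, assuming the ISO 8601 timestamps documented in the schema below; the real query runs in SQL):

```python
from datetime import datetime, timezone

def due_tasks(tasks, now=None):
    """Return enabled tasks whose next_run is NULL or not in the future."""
    now = now or datetime.now(timezone.utc)
    due = []
    for task in tasks:
        if not task["enabled"]:
            continue
        next_run = task["next_run"]
        if next_run is None or datetime.fromisoformat(next_run) <= now:
            due.append(task)
    return due
```

Treating a NULL `next_run` as due means a freshly created task runs on the next poll, after which the cron calculation fills the column in.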
Heartbeat Thread
- Runs in `_heartbeat_loop()`, sleeping for `heartbeat_interval_minutes` (default 30) between iterations.
- Waits 60 seconds before the first heartbeat to let the system initialize.
- Each iteration calls `_run_heartbeat()`:
  - Reads `identity/HEARTBEAT.md`.
  - Sends the checklist to the agent as a prompt: "HEARTBEAT CHECK. Review this checklist and take action if needed."
  - If the response contains `"HEARTBEAT_OK"`, no action is logged.
  - Otherwise, the response is logged to the daily log via `memory.log_daily()`.
Thread Safety
Both threads are daemon threads (they die when the main process exits). The
_stop_event threading event can be set to gracefully shut down both loops. The
database layer uses thread-local connections, so concurrent access from the
scheduler threads and the Gradio request threads is safe.
Database Schema
The SQLite database (data/cheddahbot.db) contains five tables:
conversations
| Column | Type | Notes |
|---|---|---|
| `id` | TEXT | Primary key (hex) |
| `title` | TEXT | Display title |
| `created_at` | TEXT | ISO 8601 UTC |
| `updated_at` | TEXT | ISO 8601 UTC |
messages
| Column | Type | Notes |
|---|---|---|
| `id` | INTEGER | Autoincrement primary key |
| `conv_id` | TEXT | Foreign key to `conversations.id` |
| `role` | TEXT | `"user"`, `"assistant"`, or `"tool"` |
| `content` | TEXT | Message body |
| `tool_calls` | TEXT | JSON array of `{name, input}` (nullable) |
| `tool_result` | TEXT | Name of the tool that produced this result (nullable) |
| `model` | TEXT | Model ID used for this response (nullable) |
| `created_at` | TEXT | ISO 8601 UTC |
Index: idx_messages_conv on (conv_id, created_at).
scheduled_tasks
| Column | Type | Notes |
|---|---|---|
| `id` | INTEGER | Autoincrement primary key |
| `name` | TEXT | Human-readable task name |
| `prompt` | TEXT | The prompt to send to the agent |
| `schedule` | TEXT | Cron expression or `"once:<datetime>"` |
| `enabled` | INTEGER | 1 = active, 0 = disabled |
| `next_run` | TEXT | ISO 8601 UTC (nullable) |
| `created_at` | TEXT | ISO 8601 UTC |
task_run_logs
| Column | Type | Notes |
|---|---|---|
| `id` | INTEGER | Autoincrement primary key |
| `task_id` | INTEGER | Foreign key to `scheduled_tasks.id` |
| `started_at` | TEXT | ISO 8601 UTC |
| `finished_at` | TEXT | ISO 8601 UTC (nullable) |
| `result` | TEXT | Agent response (nullable) |
| `error` | TEXT | Error message if failed (nullable) |
kv_store
| Column | Type | Notes |
|---|---|---|
| `key` | TEXT | Primary key |
| `value` | TEXT | Arbitrary value |
Embeddings Database
A separate SQLite file at memory/embeddings.db holds one table:
embeddings
| Column | Type | Notes |
|---|---|---|
| `id` | TEXT | Primary key (e.g. `"daily:2026-02-14:08:30"`) |
| `text` | TEXT | The original text that was embedded |
| `vector` | BLOB | Raw float32 bytes of the embedding vector |
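The float32 round-trip for the `vector` column can be sketched with `struct` (a sketch of the storage format, not the actual code; function names are illustrative):

```python
import struct

def vector_to_blob(vec):
    """Pack a list of floats as little-endian raw float32 bytes."""
    return struct.pack(f"<{len(vec)}f", *vec)

def blob_to_vector(blob):
    """Unpack a float32 blob back into a list; 4 bytes per component."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))
```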
Identity Files
Three Markdown files in the identity/ directory define the agent's personality,
user context, and background behavior.
identity/SOUL.md
Defines the agent's personality, communication style, boundaries, and quirks. This is loaded first into the system prompt, making it the most prominent identity influence on every response.
Contents are read by router.build_system_prompt() at the beginning of each
agent turn.
identity/USER.md
Contains a user profile template: name, technical level, primary language, current projects, and communication preferences. The user edits this file to customize how the agent addresses them and what context it assumes.
Loaded by router.build_system_prompt() immediately after SOUL.md.
identity/HEARTBEAT.md
A checklist of items to review on each heartbeat cycle. The scheduler reads this
file and sends it to the agent as a prompt every heartbeat_interval_minutes
(default 30 minutes). The agent processes the checklist and either confirms
"HEARTBEAT_OK" or takes action and logs it.
Loading Order in the System Prompt
The system prompt assembled by router.build_system_prompt() concatenates these
sections, separated by \n\n---\n\n:
- SOUL.md contents
- USER.md contents
- Memory context (long-term + daily log + semantic search results)
- Tools description (categorized list of available tools)
- Core instructions (hardcoded behavioral directives)
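The concatenation itself can be sketched in one line, assuming empty sections (e.g. no memory context yet) are skipped rather than producing doubled separators:

```python
def build_system_prompt(sections):
    """Join non-empty prompt sections with the documented separator."""
    return "\n\n---\n\n".join(s.strip() for s in sections if s and s.strip())
```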