Configuration Guide
CleverBee uses configuration files to manage API keys, agent behavior, models, tools, and other settings. This guide explains the main configuration files and their key options.
Configuration Files
.env: Stores sensitive API keys.
config.yaml: Controls core agent behavior, model selection, memory, tool enablement, and processing settings.
mcp.json: Defines configurations for external MCP (Model Context Protocol) servers providing additional tools.
config/prompts.py: Defines the prompts used by different models in the system.
Note
The setup.sh script automates many configuration steps, including model selection, tool enablement, and GPU optimization settings in config.yaml. You typically only need to edit .env for API keys if prompted during setup.
.env File
This file stores your secret API keys. It's created from .env.example during setup. You need to manually add your keys here.
Key Variables
GEMINI_API_KEY
Required if using Gemini models (either primary or summarizer). Obtain from Google AI Studio.
ANTHROPIC_API_KEY
Required if using Claude models. Obtain from the Anthropic Console.
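A minimal sketch of the resulting .env (the key values are placeholders; substitute your own):

```dotenv
# .env — copied from .env.example during setup; fill in real keys.
# Required if using Gemini models (primary or summarizer):
GEMINI_API_KEY=your-gemini-api-key
# Required if using Claude models:
ANTHROPIC_API_KEY=your-anthropic-api-key
```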
config.yaml
This is the main configuration file for CleverBee's behavior. It's structured into sections:
SECTION 1: CORE MODEL CONFIGURATION
Defines the primary models used for reasoning and decision-making.
PRIMARY_MODEL_TYPE
Sets the primary model type (gemini, claude, or local). Determines which model handles initial planning and final report generation.
CLAUDE_MODEL_NAME / GEMINI_MODEL_NAME
Specifies the exact cloud model name for the chosen provider (e.g., claude-3-7-sonnet-20250219, gemini-2.5-pro-preview-03-25).
LOCAL_MODEL_NAME
Filename of the local GGUF model used for primary reasoning if PRIMARY_MODEL_TYPE is local. Set during setup.sh.
LOCAL_MODEL_QUANT_LEVEL
Specifies the quantization level (e.g., Q4_K_M) for the local primary model.
NEXT_STEP_MODEL_TYPE / NEXT_STEP_MODEL_NAME, etc.
(New) Defines the model responsible for analyzing research progress and deciding the next action. If not explicitly set, this may default to the primary LLM settings. The specific keys (e.g., NEXT_STEP_GEMINI_MODEL_NAME, USE_LOCAL_NEXT_STEP_MODEL) depend on the implementation; see Section 2 for the recommended settings.
USE_LOCAL_SUMMARIZER_MODEL
Boolean (true/false). If true, use the local SUMMARIZER_MODEL; if false, use the cloud summarizer (e.g., Gemini Flash).
LOCAL_MODELS_DIR
Directory where local GGUF models are stored (default: models/).
N_GPU_LAYERS
Number of layers to offload to the GPU for local models (0 for CPU, -1 for all, or a positive number for a specific layer count). Optimized during setup based on hardware.
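Taken together, a hedged sketch of this section in config.yaml (the flat key layout and the GGUF filename are assumptions for illustration; setup.sh normally writes these values for you):

```yaml
# SECTION 1: CORE MODEL CONFIGURATION — illustrative values only
PRIMARY_MODEL_TYPE: gemini                      # gemini, claude, or local
GEMINI_MODEL_NAME: gemini-2.5-pro-preview-03-25
CLAUDE_MODEL_NAME: claude-3-7-sonnet-20250219
LOCAL_MODEL_NAME: example-model.Q4_K_M.gguf     # hypothetical filename; used when local
LOCAL_MODEL_QUANT_LEVEL: Q4_K_M
USE_LOCAL_SUMMARIZER_MODEL: false               # false = cloud summarizer (e.g., Gemini Flash)
LOCAL_MODELS_DIR: models/
N_GPU_LAYERS: -1                                # 0 = CPU, -1 = all layers on GPU
```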
SECTION 2: SUMMARIZER AND NEXT STEP CONFIGURATION
SUMMARIZER_MODEL
Filename of the local GGUF model or cloud model name (e.g., gemini-2.0-flash) used specifically for summarizing web content. Set during setup.sh.
SUMMARY_MAX_TOKENS
Maximum tokens for each generated summary.
NEXT_STEP_MODEL_TYPE
Sets the model type for determining the next steps in the research flow (gemini is recommended).
NEXT_STEP_GEMINI_MODEL_NAME
Specifies the exact Gemini model to use for next-step decisions (e.g., gemini-2.5-flash).
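A hedged sketch of this section (key layout assumed; the SUMMARY_MAX_TOKENS value is illustrative, not a documented default):

```yaml
# SECTION 2: SUMMARIZER AND NEXT STEP CONFIGURATION — illustrative values
SUMMARIZER_MODEL: gemini-2.0-flash       # or a local GGUF filename
SUMMARY_MAX_TOKENS: 1024                 # illustrative cap per generated summary
NEXT_STEP_MODEL_TYPE: gemini             # gemini is recommended
NEXT_STEP_GEMINI_MODEL_NAME: gemini-2.5-flash
```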
SECTION 3: CONTENT PROCESSING & MEMORY
CHUNK_SIZE
Target size (in characters) for splitting large documents when using local models. Set automatically by setup.sh based on the summarizer model's context window (0 disables chunking for cloud models).
CHUNK_OVERLAP
Number of characters overlapping between consecutive chunks.
USE_PROGRESSIVE_LOADING
Boolean. If true, the agent may use summaries first before loading full content.
ENABLE_THINKING
Boolean. If true, attempts to give the primary model more 'thinking time' (e.g., using Claude's extended thinking feature, if supported).
THINKING_BUDGET
Additional token budget for thinking steps if ENABLE_THINKING is true.
CONVERSATION_MEMORY_MAX_TOKENS
Maximum tokens to keep in the agent's short-term memory buffer. Set to ~90% of the primary model's context window.
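A hedged sketch of this section (all numbers are illustrative; setup.sh derives CHUNK_SIZE from the summarizer's context window):

```yaml
# SECTION 3: CONTENT PROCESSING & MEMORY — illustrative values
CHUNK_SIZE: 8000                         # characters; 0 disables chunking (cloud models)
CHUNK_OVERLAP: 200                       # characters shared between adjacent chunks
USE_PROGRESSIVE_LOADING: true            # summaries first, full content on demand
ENABLE_THINKING: false
THINKING_BUDGET: 2048                    # extra tokens, used only if ENABLE_THINKING is true
CONVERSATION_MEMORY_MAX_TOKENS: 180000   # ~90% of the primary model's context window
```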
SECTION 4: SEARCH & BROWSER SETTINGS
MIN_REGULAR_WEB_PAGES / MAX_REGULAR_WEB_PAGES
Minimum and maximum number of web pages to process from standard web searches.
MIN_POSTS_PER_SEARCH / MAX_POSTS_PER_SEARCH
Minimum and maximum number of Reddit posts to fetch per search.
BROWSER_NAVIGATION_TIMEOUT
Timeout (in milliseconds) for Playwright page navigation.
USE_CAPTCHA_SOLVER
Boolean. Enables integrated CAPTCHA solving attempts.
CAPTCHA_SOLVER_TIMEOUT
Timeout (in milliseconds) for the CAPTCHA solver.
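For example (illustrative values, assuming flat top-level keys as in the other sections):

```yaml
# SECTION 4: SEARCH & BROWSER SETTINGS — illustrative values
MIN_REGULAR_WEB_PAGES: 3
MAX_REGULAR_WEB_PAGES: 10
MIN_POSTS_PER_SEARCH: 2
MAX_POSTS_PER_SEARCH: 5
BROWSER_NAVIGATION_TIMEOUT: 30000        # milliseconds
USE_CAPTCHA_SOLVER: true
CAPTCHA_SOLVER_TIMEOUT: 60000            # milliseconds
```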
SECTION 5: TOOL CONFIGURATION
This section, nested under the tools: key in config.yaml, enables/disables specific tools available to the agent.
```yaml
tools:
  web_browser:
    enabled: true
  reddit_search:
    enabled: true
  reddit_extract_post:
    enabled: true
```
Set enabled to false to disable a tool. For MCP tools, you can add an mcp_tool_name if the key differs from the actual callable tool name provided by the MCP server.
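For example, a hedged sketch of an MCP-backed tool entry (the tool key and callable name below are hypothetical, chosen only to illustrate the mapping):

```yaml
tools:
  youtube_transcript:
    enabled: true
    # Hypothetical mapping: the config key above differs from the
    # callable tool name the MCP server actually exposes.
    mcp_tool_name: get_transcript
```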
SECTION 6: USAGE TRACKING & PRICING
TRACK_TOKEN_USAGE
Boolean. Enables token counting for LLM calls.
LOG_COST_SUMMARY
Boolean. If true and tracking is enabled, prints estimated costs.
Per-model pricing keys (e.g., GEMINI_COST_PER_1K_INPUT_TOKENS)
Define the cost per 1,000 input/output tokens for different models. Used only if tracking is enabled.
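A hedged sketch (the pricing figure is a placeholder, not a published rate):

```yaml
# SECTION 6: USAGE TRACKING & PRICING — illustrative values
TRACK_TOKEN_USAGE: true
LOG_COST_SUMMARY: true                     # prints estimated costs when tracking is on
GEMINI_COST_PER_1K_INPUT_TOKENS: 0.00125   # placeholder; check current provider pricing
```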
SECTION 7: LOGGING CONFIGURATION
LOG_LEVEL
Sets the minimum log severity level (DEBUG, INFO, WARNING, ERROR, or CRITICAL).
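For example:

```yaml
# SECTION 7: LOGGING CONFIGURATION
LOG_LEVEL: INFO   # one of DEBUG, INFO, WARNING, ERROR, CRITICAL
```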
mcp.json
This file defines external MCP (Model Context Protocol) servers that provide additional tools beyond the built-in ones.
```json
{
  "mcpServers": {
    "youtube-transcript": {
      "command": "npx",
      "args": [
        "-y",
        "@sinco-lab/mcp-youtube-transcript"
      ],
      "minToolCalls": 0,
      "maxToolCalls": 3,
      "description": "YouTube video transcripts."
    },
    "pubmedmcp": {
      "command": "uvx",
      "args": ["pubmedmcp@latest"],
      "minToolCalls": 0,
      "maxToolCalls": 5,
      "description": "Search PubMed papers and retrieve metadata/abstracts using the PubMed E-utilities API.",
      "timeout": 20,
      "env": {
        "UV_PRERELEASE": "allow",
        "UV_PYTHON": "3.12"
      }
    }
  }
}
```
Server Definition Keys
<server_name> (e.g., youtube-transcript, pubmedmcp)
The unique name identifying the MCP server. This name is used in config.yaml to link a tool to its provider.
command (String)
The command needed to start the MCP server process.
args (Array of strings)
Arguments for the command.
minToolCalls (Integer)
Minimum number of times this tool can be called.
maxToolCalls (Integer)
Maximum number of times this tool can be called.
description (String)
Description of the tool; overrides the default MCP tool description.
env (Object, optional)
Environment variables to set for the server process.
timeout (Integer, optional)
Client-side timeout (in seconds) when waiting for responses from this server's tools.
config/prompts.py
This file contains the prompts used by different models in the system. You can customize these prompts to change how the models behave.
Next Steps
With configuration understood, explore the Usage Guide to learn how to run CleverBee and start your research.