Configuration Guide
CleverBee uses configuration files to manage API keys, agent behavior, models, tools, and other settings. This guide explains the main configuration files and their key options.
Configuration Files
.env: Stores sensitive API keys.
config.yaml: Controls core agent behavior, model selection, memory, tool enablement, and processing settings.
mcp.json: Defines configurations for external MCP (Model Context Protocol) servers providing additional tools.
config/prompts.py: Defines the prompts used by different models in the system.
Note
The setup.sh script automates many configuration steps, including model selection, tool enablement, and GPU optimization settings in config.yaml. You typically only need to edit .env for API keys if prompted during setup.
.env File
This file stores your secret API keys. It's created from .env.example during setup. You need to manually add your keys here.
Key Variables
GEMINI_API_KEY
Required if using Gemini models (either primary or summarizer). Obtain from Google AI Studio.
ANTHROPIC_API_KEY
Required if using Claude models. Obtain from the Anthropic Console.
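A minimal sketch of the resulting .env (the key values are placeholders; substitute your own):

```dotenv
# .env — copied from .env.example during setup; fill in real keys.
# Required if using Gemini models (primary or summarizer):
GEMINI_API_KEY=your-gemini-api-key
# Required if using Claude models:
ANTHROPIC_API_KEY=your-anthropic-api-key
```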
config.yaml
This is the main configuration file for CleverBee's behavior. It's structured into sections:
SECTION 1: CORE MODEL CONFIGURATION
Defines the primary models used for reasoning and decision-making.
PRIMARY_MODEL_TYPE
Sets the primary model type (gemini, claude, or local). Determines which model handles initial planning and final report generation.
CLAUDE_MODEL_NAME / GEMINI_MODEL_NAME
Specifies the exact cloud model name for the chosen provider (e.g., claude-3-7-sonnet-20250219, gemini-2.5-pro-preview-03-25).
LOCAL_MODEL_NAME
Filename of the local GGUF model used for primary reasoning if PRIMARY_MODEL_TYPE is local. Set during setup.sh.
LOCAL_MODEL_QUANT_LEVEL
Specifies the quantization level (e.g., Q4_K_M) for the local primary model.
NEXT_STEP_MODEL_TYPE / NEXT_STEP_MODEL_NAME, etc.
(New) Defines the model responsible for analyzing research progress and deciding the next action. If not explicitly set, this may default to the primary LLM settings. The specific keys (e.g., NEXT_STEP_GEMINI_MODEL_NAME, USE_LOCAL_NEXT_STEP_MODEL) depend on the implementation; see Section 2 for the recommended settings.
USE_LOCAL_SUMMARIZER_MODEL
Boolean (true/false). If true, use the local SUMMARIZER_MODEL; if false, use the cloud summarizer (e.g., Gemini Flash).
LOCAL_MODELS_DIR
Directory where local GGUF models are stored (default: models/).
N_GPU_LAYERS
Number of layers to offload to the GPU for local models (0 for CPU, -1 for all, or a positive number for a specific layer count). Optimized during setup based on hardware.
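Taken together, a hedged sketch of this section in config.yaml (the flat key layout and the GGUF filename are assumptions for illustration; setup.sh normally writes these values for you):

```yaml
# SECTION 1: CORE MODEL CONFIGURATION — illustrative values only
PRIMARY_MODEL_TYPE: gemini                      # gemini, claude, or local
GEMINI_MODEL_NAME: gemini-2.5-pro-preview-03-25
CLAUDE_MODEL_NAME: claude-3-7-sonnet-20250219
LOCAL_MODEL_NAME: example-model.Q4_K_M.gguf     # hypothetical filename; used when local
LOCAL_MODEL_QUANT_LEVEL: Q4_K_M
USE_LOCAL_SUMMARIZER_MODEL: false               # false = cloud summarizer (e.g., Gemini Flash)
LOCAL_MODELS_DIR: models/
N_GPU_LAYERS: -1                                # 0 = CPU, -1 = all layers on GPU
```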
SECTION 2: SUMMARIZER AND NEXT STEP CONFIGURATION
SUMMARIZER_MODEL
Filename of the local GGUF model or cloud model name (e.g., gemini-2.0-flash) used specifically for summarizing web content. Set during setup.sh.
SUMMARY_MAX_TOKENS
Maximum tokens for each generated summary.
NEXT_STEP_MODEL_TYPE
Sets the model type for determining the next steps in the research flow (gemini is recommended).
NEXT_STEP_GEMINI_MODEL_NAME
Specifies the exact Gemini model to use for next-step decisions (e.g., gemini-2.5-flash).
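A hedged sketch of this section (key layout assumed; the SUMMARY_MAX_TOKENS value is illustrative, not a documented default):

```yaml
# SECTION 2: SUMMARIZER AND NEXT STEP CONFIGURATION — illustrative values
SUMMARIZER_MODEL: gemini-2.0-flash       # or a local GGUF filename
SUMMARY_MAX_TOKENS: 1024                 # illustrative cap per generated summary
NEXT_STEP_MODEL_TYPE: gemini             # gemini is recommended
NEXT_STEP_GEMINI_MODEL_NAME: gemini-2.5-flash
```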
SECTION 3: CONTENT PROCESSING & MEMORY
CHUNK_SIZE
Target size (in characters) for splitting large documents when using local models. Set automatically by setup.sh based on the summarizer model's context window (0 disables chunking for cloud models).
CHUNK_OVERLAP
Number of characters overlapping between consecutive chunks.
USE_PROGRESSIVE_LOADING
Boolean. If true, the agent may use summaries first before loading full content.
ENABLE_THINKING
Boolean. If true, attempts to give the primary model more 'thinking time' (e.g., using Claude's extended thinking feature, if supported).
THINKING_BUDGET
Additional token budget for thinking steps if ENABLE_THINKING is true.
CONVERSATION_MEMORY_MAX_TOKENS
Maximum tokens to keep in the agent's short-term memory buffer. Set to ~90% of the primary model's context window.
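A hedged sketch of this section (all numbers are illustrative; setup.sh derives CHUNK_SIZE from the summarizer's context window):

```yaml
# SECTION 3: CONTENT PROCESSING & MEMORY — illustrative values
CHUNK_SIZE: 8000                         # characters; 0 disables chunking (cloud models)
CHUNK_OVERLAP: 200                       # characters shared between adjacent chunks
USE_PROGRESSIVE_LOADING: true            # summaries first, full content on demand
ENABLE_THINKING: false
THINKING_BUDGET: 2048                    # extra tokens, used only if ENABLE_THINKING is true
CONVERSATION_MEMORY_MAX_TOKENS: 180000   # ~90% of the primary model's context window
```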
SECTION 4: SEARCH & BROWSER SETTINGS
MIN_REGULAR_WEB_PAGES / MAX_REGULAR_WEB_PAGES
Minimum and maximum number of web pages to process from standard web searches.
MIN_POSTS_PER_SEARCH / MAX_POSTS_PER_SEARCH
Minimum and maximum number of Reddit posts to fetch per search.
BROWSER_NAVIGATION_TIMEOUT
Timeout (in milliseconds) for Playwright page navigation.
USE_CAPTCHA_SOLVER
Boolean. Enables integrated CAPTCHA solving attempts.
CAPTCHA_SOLVER_TIMEOUT
Timeout (in milliseconds) for the CAPTCHA solver.
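For example (illustrative values, assuming flat top-level keys as in the other sections):

```yaml
# SECTION 4: SEARCH & BROWSER SETTINGS — illustrative values
MIN_REGULAR_WEB_PAGES: 3
MAX_REGULAR_WEB_PAGES: 10
MIN_POSTS_PER_SEARCH: 2
MAX_POSTS_PER_SEARCH: 5
BROWSER_NAVIGATION_TIMEOUT: 30000        # milliseconds
USE_CAPTCHA_SOLVER: true
CAPTCHA_SOLVER_TIMEOUT: 60000            # milliseconds
```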
SECTION 5: TOOL CONFIGURATION
This section, nested under the tools: key in config.yaml, enables/disables specific tools available to the agent.
```yaml
tools:
  web_browser:
    enabled: true
  reddit_search:
    enabled: true
  reddit_extract_post:
    enabled: true
```
Set enabled to false to disable a tool. For MCP tools, you can add an mcp_tool_name if the key differs from the actual callable tool name provided by the MCP server.
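For example, a hedged sketch of an MCP-backed tool entry (the tool key and callable name below are hypothetical, chosen only to illustrate the mapping):

```yaml
tools:
  youtube_transcript:
    enabled: true
    # Hypothetical mapping: the config key above differs from the
    # callable tool name the MCP server actually exposes.
    mcp_tool_name: get_transcript
```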
SECTION 6: USAGE TRACKING & PRICING
TRACK_TOKEN_USAGE
Boolean. Enables token counting for LLM calls.
LOG_COST_SUMMARY
Boolean. If true and tracking is enabled, prints estimated costs.
Per-model pricing keys (e.g., GEMINI_COST_PER_1K_INPUT_TOKENS)
Define the cost per 1,000 input/output tokens for different models. Used only if tracking is enabled.
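A hedged sketch (the pricing figure is a placeholder, not a published rate):

```yaml
# SECTION 6: USAGE TRACKING & PRICING — illustrative values
TRACK_TOKEN_USAGE: true
LOG_COST_SUMMARY: true                     # prints estimated costs when tracking is on
GEMINI_COST_PER_1K_INPUT_TOKENS: 0.00125   # placeholder; check current provider pricing
```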
SECTION 7: LOGGING CONFIGURATION
LOG_LEVEL
Sets the minimum log severity level (DEBUG, INFO, WARNING, ERROR, or CRITICAL).
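For example:

```yaml
# SECTION 7: LOGGING CONFIGURATION
LOG_LEVEL: INFO   # one of DEBUG, INFO, WARNING, ERROR, CRITICAL
```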
mcp.json
This file defines external MCP (Model Context Protocol) servers that provide additional tools beyond the built-in ones.
```json
{
  "mcpServers": {
    "youtube-transcript": {
      "command": "npx",
      "args": [
        "-y",
        "@sinco-lab/mcp-youtube-transcript"
      ],
      "minToolCalls": 0,
      "maxToolCalls": 3,
      "description": "YouTube video transcripts."
    },
    "pubmedmcp": {
      "command": "uvx",
      "args": ["pubmedmcp@latest"],
      "minToolCalls": 0,
      "maxToolCalls": 5,
      "description": "Search PubMed papers and retrieve metadata/abstracts using the PubMed E-utilities API.",
      "timeout": 20,
      "env": {
        "UV_PRERELEASE": "allow",
        "UV_PYTHON": "3.12"
      }
    }
  }
}
```
Server Definition Keys
<server_name> (e.g., youtube-transcript, pubmedmcp)
The unique name identifying the MCP server. This name is used in config.yaml to link a tool to its provider.
command (String)
The command needed to start the MCP server process.
args (Array of strings)
Arguments for the command.
minToolCalls (Integer)
Minimum number of times this tool can be called.
maxToolCalls (Integer)
Maximum number of times this tool can be called.
description (String)
Description of the tool; overrides the default MCP tool description.
env (Object, optional)
Environment variables to set for the server process.
timeout (Integer, optional)
Client-side timeout (in seconds) when waiting for responses from this server's tools.
config/prompts.py
This file contains the prompts used by different models in the system. You can customize these prompts to change how the models behave.
Next Steps
With configuration understood, explore the Usage Guide to learn how to run CleverBee and start your research.