Zen MCP

Zen MCP is an open-source Model Context Protocol server that empowers AI-driven development by orchestrating collaboration between multiple large language models (e.g., Claude, Gemini, O3, and local models served via Ollama). With Zen MCP, you can seamlessly integrate different AI models into your workflow for code analysis, problem-solving, debugging, and collaborative development, maximizing the strengths of each model, all within a single conversation thread.

Author: BeehiveInnovations

What is Zen MCP?

Zen MCP is a powerful development tool that acts as an "AI orchestration server" via the Model Context Protocol. It allows one main AI (typically Claude) to coordinate, debate with, and leverage the distinct capabilities of other LLMs, such as Gemini Pro and O3, for specific software engineering tasks. Zen MCP gives developers a way to obtain diverse AI perspectives, automate model selection, and build advanced multi-model workflows with deep context retention, making your AI assistant feel like a full team of senior developers.

How to Configure Zen MCP

  1. Prerequisites:

    • Install Docker Desktop and Git.
    • (For Windows users) Enable WSL2 for Claude Code CLI.
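    You can confirm both tools are available before continuing:

      docker --version
      git --version
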
  2. Get API Keys:

    • OpenRouter: Register for a single key that gives access to many models through one API.
    • Google Gemini: Get your API key from Google AI Studio.
    • OpenAI: Get an API key from OpenAI Platform.
    • Local models: Set up custom endpoints for Ollama, vLLM, LM Studio, etc.
  3. Clone and Set Up Zen MCP:

    git clone https://github.com/BeehiveInnovations/zen-mcp-server.git
    cd zen-mcp-server
    ./setup-docker.sh
    

    This script builds the Docker images, creates a .env configuration file, and starts the Zen MCP server alongside Redis.
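
    To confirm the containers came up, you can list them (the container name below matches the one used in step 5; yours may differ if you customized the setup):

      docker ps --filter "name=zen-mcp"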

  4. Add API Keys:

    • Edit your .env file to include the required API keys or custom model endpoints.
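    A minimal .env might look like the following. The variable names are illustrative assumptions, so check the generated .env for the exact names, and set only the providers you actually use:

      # Placeholder values; supply only the keys for providers you plan to use
      GEMINI_API_KEY=your-gemini-key
      OPENAI_API_KEY=your-openai-key
      OPENROUTER_API_KEY=your-openrouter-key
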
  5. Add Zen MCP to Claude:

    • For Claude Code CLI:
      claude mcp add zen -s user -- docker exec -i zen-mcp-server python server.py
      
    • For Claude Desktop:
      • Update claude_desktop_config.json with MCP server configuration (copy instructions from setup).
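      The entry typically takes the standard mcpServers shape; the sketch below mirrors the CLI command above, and the setup script prints the exact block to copy:

        {
          "mcpServers": {
            "zen": {
              "command": "docker",
              "args": ["exec", "-i", "zen-mcp-server", "python", "server.py"]
            }
          }
        }
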
  6. Restart your Claude environment as needed.

How to Use Zen MCP

  1. Interact naturally with Claude, invoking "zen" when you want the server's tools.

    • Example: "Use zen to perform a code review on this function."
  2. Zen MCP automatically routes tasks to the best-fitting model/tool.

    • You can direct Zen to use a specific model (e.g., Gemini Pro, O3, Flash, or Ollama) or let it auto-select.
  3. Leverage collaborative multi-model conversations.

    • Tasks can be split: one model analyzes, another reviews, a third proposes fixes, all in the same conversation thread.
    • Previous context and findings carry over between steps and models.
  4. Use specific Zen MCP tools within commands:

    • Request code reviews, deep analysis, debugging, pre-commit checks, and more.
  5. Override model/tool selection if needed:

    • Add instructions like "Use o3 for logical debugging" or "Use flash for a quick check".
  6. Explore advanced usage:

    • Combine tools, use web search augmentation, or collaborate asynchronously with cross-tool continuation.

Key Features

  • Multi-model orchestration: Coordinate the strengths of Claude, Gemini, O3, and local models in unified workflows.
  • Automatic model selection: Claude intelligently picks the best model for each subtask, or you can specify.
  • Seamless context retention: Single-threaded conversations preserve context across tools and model switches.
  • Pre-built development tools: Includes collaborative chat, code review, pre-commit validation, debugging, and more.
  • AI-to-AI conversation threading: Models can debate, challenge, and request info from each other, delivering multi-perspective solutions.
  • Support for local models: Easily plug in self-hosted models like Llama via Ollama or vLLM for privacy and cost efficiency (see the endpoint sketch after this list).
  • Handles large context windows: Offload analysis of large codebases to models with big token limits (e.g., Gemini's 1M-token or O3's 200K-token windows).
  • Smart file and repo management: Auto-discovers files/repositories, expands directories, and intelligently manages token limits.
  • Incremental knowledge sharing: Only changed or new information is sent in each exchange, keeping workflows efficient even as they grow past 25K tokens.
  • Web search augmentation: Some tools can suggest and incorporate web search results on demand.
  • Plug-and-play integration: One-command Docker setup and quick linking to Claude environments (CLI or Desktop).
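
As a sketch of the local-model setup: pointing Zen MCP at a self-hosted, OpenAI-compatible server is typically just a .env change. The variable names below are illustrative assumptions (the URL is Ollama's standard OpenAI-compatible endpoint); consult the project's configuration docs for the exact names:

    # Illustrative .env entries for a local Ollama server
    CUSTOM_API_URL=http://localhost:11434/v1   # Ollama's OpenAI-compatible endpoint
    CUSTOM_API_KEY=                            # often left blank for local servers
    CUSTOM_MODEL_NAME=llama3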

Use Cases

  • Get both quick and deep code reviews leveraging Gemini's and Claude's different strengths
  • Brainstorm complex architecture or technical decisions—debate between models for the best solutions
  • Debug elusive logic errors—let O3 analyze logic, Gemini focus on architecture
  • Validate git commits before merging—pre-commit checks with multi-model opinions
  • Perform exploratory code analysis over large codebases that exceed Claude's native token window
  • Use a local (privacy-first) Llama model for code analysis, then escalate to an online model for deeper reasoning as needed
  • Keep a persistent, asynchronous conversation thread between models for extended problem solving
  • Rapidly switch between different analysis tools (e.g., from "analyze" to "codereview" to "debug") without resetting context, as in the example below
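
For instance, a single threaded session might chain tools like this (a hypothetical exchange; the tool names follow those listed above):

    "Use zen to analyze src/auth/ and summarize the login flow"
    "Now run codereview on the same files and flag any security issues"
    "Use o3 to debug the race condition the review found, keeping the prior findings in context"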

FAQ

Q: Do I need all API keys (Gemini, OpenAI, OpenRouter) to use Zen MCP?
A: No. You can get started with just one provider. However, for full multi-model orchestration, adding more keys gives you the flexibility to include more perspectives and model strengths.

Q: Does Zen MCP share my files or context with any external service?
A: Zen MCP only sends data to the APIs/models you configure. For maximum privacy, you can use local models (e.g., via Ollama) to ensure data never leaves your machine.

Q: How does conversation threading work? Will my history be saved?
A: Zen MCP uses Redis for persistent "conversation threading." Within a session, AI models retain context and can exchange updates for up to 5 messages or 1 hour. No long-term storage is retained by default.

Q: Can I use Zen MCP for non-coding tasks?
A: While optimized for code and development workflows, Zen MCP can be configured for broader analytical or reasoning tasks using supported models and tools.

Q: What happens if two API endpoints overlap (e.g., same model name)?
A: Native APIs take priority when there's a name conflict (e.g., "gemini" via the Gemini API vs. OpenRouter). You can resolve this by defining unique model aliases in custom_models.json, as sketched below.
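
As a rough sketch (the field names below are assumptions; check the repository's custom_models.json for the actual schema), an alias entry might look like:

    {
      "models": [
        {
          "model_name": "google/gemini-2.5-pro",
          "aliases": ["gemini-openrouter"],
          "context_window": 1000000
        }
      ]
    }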