Skip to content

MCP Server

The liter-llm binary can run as a Model Context Protocol (MCP) server. It exposes 22 tools backed by the same ProxyConfig used by the HTTP proxy. Model-routed tools honor provider routing, virtual keys, fallbacks, and cache settings; file, batch, and response-management tools require master-key access.

Launch it with liter-llm mcp. The server supports two transports: stdio for local clients like Claude Desktop and Cursor, and http for network-attached clients using the Streamable HTTP transport.

Install the CLI

The MCP server is part of the liter-llm binary. Install it any of these ways:

brew install xberg-io/tap/liter-llm   # Homebrew (macOS/Linux)
cargo install liter-llm-cli                 # from crates.io
npx @xberg-io/liter-llm-cli --help         # npm — self-installs the binary
docker run ghcr.io/xberg-io/liter-llm  # container image

A prebuilt binary is also attached to every GitHub release.

Easiest: the liter-llm plugin

To use the MCP server inside a coding agent (Claude Code, Cursor, Codex, Gemini CLI, Factory Droid, GitHub Copilot CLI), install the liter-llm plugin from the xberg-io/plugins marketplace. It auto-registers the liter-llm MCP server and resolves the binary for you on first run — no manual config:

/plugin marketplace add xberg-io/plugins
/plugin install liter-llm@xberg

The rest of this page covers running and configuring the server directly.

Quick start

Run the server over stdio against an auto-discovered liter-llm-proxy.toml. The config must include either mcp.stdio_key_id or mcp.stdio_trust_local = true:

liter-llm mcp

Run over HTTP on the default port 3001:

liter-llm mcp --transport http --host 127.0.0.1 --port 3001

The HTTP transport exposes a single endpoint: POST /mcp. Point any MCP HTTP client at http://127.0.0.1:3001/mcp and include Authorization: Bearer <master-or-virtual-key> on every request.

Command-line flags

Flag Default Description
--config auto-discover Path to the TOML config. Same format as the proxy.
--transport stdio Transport mode. One of stdio or http.
--host 127.0.0.1 Bind address for the HTTP transport. Ignored for stdio.
--port 3001 Bind port for the HTTP transport. Ignored for stdio.

The MCP server loads the same liter-llm-proxy.toml as the HTTP proxy. See Proxy Configuration for the full schema. Any [[models]], [[aliases]], [[keys]], [cache], [files], or [health] table defined there applies to MCP requests as well.

Tools

Every tool returns a JSON payload as a single text content part. Errors are propagated as MCP error objects with the liter-llm error type embedded in the message.

Virtual keys can call model-routed tools such as chat, embeddings, media, moderation, rerank, search, and OCR. File, batch, and response-management tools require the master key because they operate on provider-side resources outside a single routed model request.

LLM operations

Tool Description Key parameters
chat Send a chat completion request to an LLM. model, messages, temperature?, max_tokens?
embed Generate text embeddings for the given input. model, input
list_models List available models from configured providers. none

Media

Tool Description Key parameters
generate_image Generate images from a text prompt. prompt, model?, n?, size?
speech Generate speech audio from text (TTS). Returns base64 audio. model, input, voice
transcribe Transcribe audio to text (STT). model, file_base64

Classification and retrieval

Tool Description Key parameters
moderate Check content against moderation policies. input, model?
rerank Rerank documents by relevance to a query. model, query, documents
search Perform a web or document search. model, query
ocr Extract text from an image or document via OCR. model, image_url?, image_base64?, media_type?

Files

Tool Description Key parameters
create_file Upload a file to the LLM provider. filename, content_base64, purpose
list_files List uploaded files. purpose?, limit?
retrieve_file Retrieve metadata for an uploaded file. file_id
delete_file Delete an uploaded file. file_id
file_content Retrieve the raw content of an uploaded file. file_id

Batches

Tool Description Key parameters
create_batch Create a new batch processing job. input_file_id, endpoint, completion_window
list_batches List batch processing jobs. limit?, after?
retrieve_batch Retrieve a batch processing job by ID. batch_id
cancel_batch Cancel an in-progress batch processing job. batch_id

Responses API

Tool Description Key parameters
create_response Create a new response (Responses API). model, input
retrieve_response Retrieve a response by ID. response_id
cancel_response Cancel an in-progress response. response_id

The full parameter schema for every tool is defined in crates/liter-llm-proxy/src/mcp/params.rs and surfaced to MCP clients as JSON Schema through rmcp.

Every tool also carries MCP tool annotations — a human title plus readOnlyHint/destructiveHint/idempotentHint/openWorldHint. Query tools are read-only; create_* mutate without being destructive; delete_*/cancel_* are destructive and idempotent; all set openWorldHint since they reach external providers. Available by v1.8

Prompts, resources, and completion

Beyond tools, the server advertises three more MCP capabilities. Available by v1.8

  • Prompts — reusable templates: summarize, translate, and extract.
  • Resources — read-only catalog endpoints: liter-llm://models (configured models) and liter-llm://providers (the built-in registry), plus the templates liter-llm://pricing/{model} and liter-llm://provider/{name}.
  • Completion — argument autocompletion for model (from the configured models) and provider name (from the registry).

Claude Desktop

Add an entry to your claude_desktop_config.json:

{
  "mcpServers": {
    "liter-llm": {
      "command": "liter-llm",
      "args": ["mcp"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

Restart Claude Desktop. The 22 tools appear under the liter-llm server. Point liter-llm at a config file with --config /absolute/path/to/liter-llm-proxy.toml if you want virtual keys or a custom model list.

Cursor

Cursor reads MCP servers from ~/.cursor/mcp.json (or the workspace equivalent). Use the same shape as Claude Desktop:

{
  "mcpServers": {
    "liter-llm": {
      "command": "liter-llm",
      "args": ["mcp", "--config", "/absolute/path/to/liter-llm-proxy.toml"]
    }
  }
}
liter-llm-proxy.toml
[mcp]
stdio_trust_local = true

Use stdio_trust_local = true only for trusted local clients. To enforce a virtual-key policy instead, set stdio_key_id to an existing [[keys]].key.

claude_desktop_config.json (stdio)
{
  "mcpServers": {
    "liter-llm": {
      "command": "liter-llm",
      "args": ["mcp", "--config", "/absolute/path/to/liter-llm-proxy.toml"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

HTTP transport

Run the server in HTTP mode when the client is on a different machine or when you want to share one MCP server across several users. Pair it with a reverse proxy for TLS.

liter-llm mcp --transport http --host 0.0.0.0 --port 3001
HTTP transport launch
liter-llm mcp \
  --transport http \
  --host 127.0.0.1 \
  --port 3001 \
  --config ./liter-llm-proxy.toml
HTTP transport smoke test
# List the 22 tools exposed by the server.
curl -s http://127.0.0.1:3001/mcp \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $LITER_LLM_MASTER_KEY" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

The HTTP endpoint is POST /mcp. Each request opens a short-lived session managed by rmcp's LocalSessionManager and passes through the same Authorization: Bearer <key> middleware as the REST proxy. The resolved master key or virtual key is attached to the MCP request context before any tool runs. Available by v1.5

HTTP transport requires Bearer auth

Do not expose the HTTP MCP transport without TLS and a real master or virtual key. Unauthenticated requests return 401 before the MCP handler runs.

Shared configuration

The MCP server and the HTTP proxy use the same ProxyConfig loader. That means:

  • Models defined in [[models]] are callable as chat, embed, generate_image, and so on.
  • Glob overrides in [[aliases]] apply to MCP requests.
  • [cache] caches non-streaming responses across both surfaces.
  • [files] persists files uploaded via the create_file tool.
  • [[keys]] virtual keys are enforced on HTTP MCP requests via Bearer auth.
  • Stdio MCP calls use the configured startup context from [mcp]: stdio_key_id binds tools to one virtual key, while stdio_trust_local = true grants master context for trusted local clients.

The master key is also loaded. HTTP clients present it as a Bearer token; stdio clients can only use master context when stdio_trust_local = true is set explicitly.

Troubleshooting

  • "tool call failed: model 'foo' not found": the model parameter passed to the tool does not match any name in [[models]]. Check liter-llm-proxy.toml and restart or use --watch.
  • stdio refuses to start: add [mcp] stdio_key_id = "vk-..." for a configured virtual key, or [mcp] stdio_trust_local = true for a trusted local-only setup.
  • HTTP returns 401: include Authorization: Bearer <master-or-virtual-key> on the POST /mcp request.
  • stdio transport hangs on startup: the client expects a JSON-RPC handshake on stdin. Make sure you are launching liter-llm mcp from an MCP client, not an interactive shell.
  • HTTP transport returns 404: the endpoint is /mcp, not /. Every request is POST /mcp.
  • Image or audio tools return empty content: the underlying provider may not support the feature. Check Providers for per-provider capability.

Edit this page on GitHub