MCP Server¶
The liter-llm binary can run as a Model Context Protocol (MCP) server. It exposes 22 tools backed by the same ProxyConfig used by the HTTP proxy. Model-routed tools honor provider routing, virtual keys, fallbacks, and cache settings; file, batch, and response-management tools require master-key access.
Launch it with liter-llm mcp. The server supports two transports: stdio for local clients like Claude Desktop and Cursor, and http for network-attached clients using the Streamable HTTP transport.
Install the CLI¶
The MCP server is part of the liter-llm binary. Install it any of these ways:
brew install xberg-io/tap/liter-llm # Homebrew (macOS/Linux)
cargo install liter-llm-cli # from crates.io
npx @xberg-io/liter-llm-cli --help # npm — self-installs the binary
docker run ghcr.io/xberg-io/liter-llm # container image
A prebuilt binary is also attached to every GitHub release.
Easiest: the liter-llm plugin¶
To use the MCP server inside a coding agent (Claude Code, Cursor, Codex, Gemini CLI, Factory Droid, GitHub Copilot CLI), install the liter-llm plugin from the xberg-io/plugins marketplace. It auto-registers the liter-llm MCP server and resolves the binary for you on first run — no manual config:
The rest of this page covers running and configuring the server directly.
Quick start¶
Run the server over stdio against an auto-discovered liter-llm-proxy.toml. The config must include either mcp.stdio_key_id or mcp.stdio_trust_local = true:
Run over HTTP on the default port 3001:
The HTTP transport exposes a single endpoint: POST /mcp. Point any MCP HTTP client at http://127.0.0.1:3001/mcp and include Authorization: Bearer <master-or-virtual-key> on every request.
Command-line flags¶
| Flag | Default | Description |
|---|---|---|
--config |
auto-discover | Path to the TOML config. Same format as the proxy. |
--transport |
stdio |
Transport mode. One of stdio or http. |
--host |
127.0.0.1 |
Bind address for the HTTP transport. Ignored for stdio. |
--port |
3001 |
Bind port for the HTTP transport. Ignored for stdio. |
The MCP server loads the same liter-llm-proxy.toml as the HTTP proxy. See Proxy Configuration for the full schema. Any [[models]], [[aliases]], [[keys]], [cache], [files], or [health] table defined there applies to MCP requests as well.
Tools¶
Every tool returns a JSON payload as a single text content part. Errors are propagated as MCP error objects with the liter-llm error type embedded in the message.
Virtual keys can call model-routed tools such as chat, embeddings, media, moderation, rerank, search, and OCR. File, batch, and response-management tools require the master key because they operate on provider-side resources outside a single routed model request.
LLM operations¶
| Tool | Description | Key parameters |
|---|---|---|
chat |
Send a chat completion request to an LLM. | model, messages, temperature?, max_tokens? |
embed |
Generate text embeddings for the given input. | model, input |
list_models |
List available models from configured providers. | none |
Media¶
| Tool | Description | Key parameters |
|---|---|---|
generate_image |
Generate images from a text prompt. | prompt, model?, n?, size? |
speech |
Generate speech audio from text (TTS). Returns base64 audio. | model, input, voice |
transcribe |
Transcribe audio to text (STT). | model, file_base64 |
Classification and retrieval¶
| Tool | Description | Key parameters |
|---|---|---|
moderate |
Check content against moderation policies. | input, model? |
rerank |
Rerank documents by relevance to a query. | model, query, documents |
search |
Perform a web or document search. | model, query |
ocr |
Extract text from an image or document via OCR. | model, image_url?, image_base64?, media_type? |
Files¶
| Tool | Description | Key parameters |
|---|---|---|
create_file |
Upload a file to the LLM provider. | filename, content_base64, purpose |
list_files |
List uploaded files. | purpose?, limit? |
retrieve_file |
Retrieve metadata for an uploaded file. | file_id |
delete_file |
Delete an uploaded file. | file_id |
file_content |
Retrieve the raw content of an uploaded file. | file_id |
Batches¶
| Tool | Description | Key parameters |
|---|---|---|
create_batch |
Create a new batch processing job. | input_file_id, endpoint, completion_window |
list_batches |
List batch processing jobs. | limit?, after? |
retrieve_batch |
Retrieve a batch processing job by ID. | batch_id |
cancel_batch |
Cancel an in-progress batch processing job. | batch_id |
Responses API¶
| Tool | Description | Key parameters |
|---|---|---|
create_response |
Create a new response (Responses API). | model, input |
retrieve_response |
Retrieve a response by ID. | response_id |
cancel_response |
Cancel an in-progress response. | response_id |
The full parameter schema for every tool is defined in crates/liter-llm-proxy/src/mcp/params.rs and surfaced to MCP clients as JSON Schema through rmcp.
Every tool also carries MCP tool annotations — a human title plus readOnlyHint/destructiveHint/idempotentHint/openWorldHint. Query tools are read-only; create_* mutate without being destructive; delete_*/cancel_* are destructive and idempotent; all set openWorldHint since they reach external providers. Available by v1.8
Prompts, resources, and completion¶
Beyond tools, the server advertises three more MCP capabilities. Available by v1.8
- Prompts — reusable templates:
summarize,translate, andextract. - Resources — read-only catalog endpoints:
liter-llm://models(configured models) andliter-llm://providers(the built-in registry), plus the templatesliter-llm://pricing/{model}andliter-llm://provider/{name}. - Completion — argument autocompletion for
model(from the configured models) and providername(from the registry).
Claude Desktop¶
Add an entry to your claude_desktop_config.json:
{
"mcpServers": {
"liter-llm": {
"command": "liter-llm",
"args": ["mcp"],
"env": {
"OPENAI_API_KEY": "sk-...",
"ANTHROPIC_API_KEY": "sk-ant-..."
}
}
}
}
Restart Claude Desktop. The 22 tools appear under the liter-llm server. Point liter-llm at a config file with --config /absolute/path/to/liter-llm-proxy.toml if you want virtual keys or a custom model list.
Cursor¶
Cursor reads MCP servers from ~/.cursor/mcp.json (or the workspace equivalent). Use the same shape as Claude Desktop:
{
"mcpServers": {
"liter-llm": {
"command": "liter-llm",
"args": ["mcp", "--config", "/absolute/path/to/liter-llm-proxy.toml"]
}
}
}
Use stdio_trust_local = true only for trusted local clients. To enforce a virtual-key policy instead, set stdio_key_id to an existing [[keys]].key.
{
"mcpServers": {
"liter-llm": {
"command": "liter-llm",
"args": ["mcp", "--config", "/absolute/path/to/liter-llm-proxy.toml"],
"env": {
"OPENAI_API_KEY": "sk-...",
"ANTHROPIC_API_KEY": "sk-ant-..."
}
}
}
}
HTTP transport¶
Run the server in HTTP mode when the client is on a different machine or when you want to share one MCP server across several users. Pair it with a reverse proxy for TLS.
liter-llm mcp \
--transport http \
--host 127.0.0.1 \
--port 3001 \
--config ./liter-llm-proxy.toml
# List the 22 tools exposed by the server.
curl -s http://127.0.0.1:3001/mcp \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer $LITER_LLM_MASTER_KEY" \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
The HTTP endpoint is POST /mcp. Each request opens a short-lived session managed by rmcp's LocalSessionManager and passes through the same Authorization: Bearer <key> middleware as the REST proxy. The resolved master key or virtual key is attached to the MCP request context before any tool runs. Available by v1.5
HTTP transport requires Bearer auth
Do not expose the HTTP MCP transport without TLS and a real master or virtual key. Unauthenticated requests return 401 before the MCP handler runs.
Shared configuration¶
The MCP server and the HTTP proxy use the same ProxyConfig loader. That means:
- Models defined in
[[models]]are callable aschat,embed,generate_image, and so on. - Glob overrides in
[[aliases]]apply to MCP requests. [cache]caches non-streaming responses across both surfaces.[files]persists files uploaded via thecreate_filetool.[[keys]]virtual keys are enforced on HTTP MCP requests via Bearer auth.- Stdio MCP calls use the configured startup context from
[mcp]:stdio_key_idbinds tools to one virtual key, whilestdio_trust_local = truegrants master context for trusted local clients.
The master key is also loaded. HTTP clients present it as a Bearer token; stdio clients can only use master context when stdio_trust_local = true is set explicitly.
Troubleshooting¶
- "tool call failed: model 'foo' not found": the
modelparameter passed to the tool does not match anynamein[[models]]. Checkliter-llm-proxy.tomland restart or use--watch. - stdio refuses to start: add
[mcp] stdio_key_id = "vk-..."for a configured virtual key, or[mcp] stdio_trust_local = truefor a trusted local-only setup. - HTTP returns 401: include
Authorization: Bearer <master-or-virtual-key>on thePOST /mcprequest. - stdio transport hangs on startup: the client expects a JSON-RPC handshake on stdin. Make sure you are launching
liter-llm mcpfrom an MCP client, not an interactive shell. - HTTP transport returns 404: the endpoint is
/mcp, not/. Every request isPOST /mcp. - Image or audio tools return empty content: the underlying provider may not support the feature. Check Providers for per-provider capability.