MCP Server

The liter-llm binary can run as a Model Context Protocol (MCP) server. It exposes 22 tools backed by the same ProxyConfig used by the HTTP proxy, so every provider, virtual key, fallback, and cache layer that works for the REST API works for MCP clients too.

Launch it with liter-llm mcp. The server supports two transports: stdio for local clients like Claude Desktop and Cursor, and http for network-attached clients using the Streamable HTTP transport.

Quick start

Run the server over stdio against an auto-discovered liter-llm-proxy.toml:

liter-llm mcp

Run over HTTP on the default port 3001:

liter-llm mcp --transport http --host 127.0.0.1 --port 3001

The HTTP transport exposes a single endpoint: POST /mcp. Point any MCP HTTP client at http://127.0.0.1:3001/mcp.

Command-line flags

  • --config (default: auto-discover): Path to the TOML config. Same format as the proxy.
  • --transport (default: stdio): Transport mode. One of stdio or http.
  • --host (default: 127.0.0.1): Bind address for the HTTP transport. Ignored for stdio.
  • --port (default: 3001): Bind port for the HTTP transport. Ignored for stdio.

The MCP server loads the same liter-llm-proxy.toml as the HTTP proxy. See Proxy Configuration for the full schema. Any [[models]], [[aliases]], [[keys]], [cache], [files], or [health] table defined there applies to MCP requests as well.

Tools

Every tool returns a JSON payload as a single text content part. Errors are propagated as MCP error objects with the liter-llm error type embedded in the message.

LLM operations

  • chat: Send a chat completion request to an LLM. Parameters: model, messages, temperature?, max_tokens?
  • embed: Generate text embeddings for the given input. Parameters: model, input
  • list_models: List available models from configured providers. No parameters.
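
For orientation, here is a rough sketch of what a chat call looks like on the wire as an MCP tools/call request. The model name must match an entry in [[models]], and the OpenAI-style role/content message shape is assumed here; the authoritative schema is the one in params.rs.

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "chat",
    "arguments": {
      "model": "gpt-4o-mini",
      "messages": [{ "role": "user", "content": "Say hello in one word." }],
      "temperature": 0.2
    }
  }
}

The result comes back as described above: a single text content part whose text field holds the JSON payload of the chat completion.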

Media

  • generate_image: Generate images from a text prompt. Parameters: prompt, model?, n?, size?
  • speech: Generate speech audio from text (TTS). Returns base64 audio. Parameters: model, input, voice
  • transcribe: Transcribe audio to text (STT). Parameters: model, file_base64
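
As a sketch, a generate_image call follows the same tools/call shape; the prompt and size values below are illustrative, and the optional model parameter is omitted so the configured default applies.

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "generate_image",
    "arguments": {
      "prompt": "a watercolor fox on a snowy hill",
      "n": 1,
      "size": "1024x1024"
    }
  }
}

The audio tools work the same way: speech returns base64-encoded audio inside the text payload, and transcribe expects the audio passed in as file_base64.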

Classification and retrieval

  • moderate: Check content against moderation policies. Parameters: input, model?
  • rerank: Rerank documents by relevance to a query. Parameters: model, query, documents
  • search: Perform a web or document search. Parameters: model, query
  • ocr: Extract text from an image or document via OCR. Parameters: model, image_url?, image_base64?, media_type?
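
A sketch of a rerank call; the model name and document strings are illustrative, and the model must map to a rerank-capable entry in [[models]].

{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "tools/call",
  "params": {
    "name": "rerank",
    "arguments": {
      "model": "rerank-english-v3.0",
      "query": "when does the cafeteria open?",
      "documents": [
        "The cafeteria opens at nine on weekdays.",
        "API keys can be rotated from the dashboard."
      ]
    }
  }
}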

Files

  • create_file: Upload a file to the LLM provider. Parameters: filename, content_base64, purpose
  • list_files: List uploaded files. Parameters: purpose?, limit?
  • retrieve_file: Retrieve metadata for an uploaded file. Parameters: file_id
  • delete_file: Delete an uploaded file. Parameters: file_id
  • file_content: Retrieve the raw content of an uploaded file. Parameters: file_id
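
A sketch of a create_file call. The content_base64 value is a placeholder, and the purpose value follows the OpenAI file-purpose convention, which is an assumption here.

{
  "jsonrpc": "2.0",
  "id": 5,
  "method": "tools/call",
  "params": {
    "name": "create_file",
    "arguments": {
      "filename": "batch_input.jsonl",
      "content_base64": "eyJjdXN0b21faWQiOiAicmVxLTEiLCAuLi59",
      "purpose": "batch"
    }
  }
}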

Batches

  • create_batch: Create a new batch processing job. Parameters: input_file_id, endpoint, completion_window
  • list_batches: List batch processing jobs. Parameters: limit?, after?
  • retrieve_batch: Retrieve a batch processing job by ID. Parameters: batch_id
  • cancel_batch: Cancel an in-progress batch processing job. Parameters: batch_id
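
A sketch of a create_batch call. The input_file_id is the id returned by create_file; the endpoint and completion_window values below follow OpenAI's Batch API conventions and are assumptions here.

{
  "jsonrpc": "2.0",
  "id": 6,
  "method": "tools/call",
  "params": {
    "name": "create_batch",
    "arguments": {
      "input_file_id": "file-abc123",
      "endpoint": "/v1/chat/completions",
      "completion_window": "24h"
    }
  }
}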

Responses API

  • create_response: Create a new response (Responses API). Parameters: model, input
  • retrieve_response: Retrieve a response by ID. Parameters: response_id
  • cancel_response: Cancel an in-progress response. Parameters: response_id
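
A sketch of a create_response call; as with chat, the model name must match an entry in [[models]], and a plain string is assumed for input.

{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "create_response",
    "arguments": {
      "model": "gpt-4o-mini",
      "input": "Summarize the last deployment in two sentences."
    }
  }
}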

The full parameter schema for every tool is defined in crates/liter-llm-proxy/src/mcp/params.rs and surfaced to MCP clients as JSON Schema through rmcp.

Claude Desktop

Add an entry to your claude_desktop_config.json:

{
  "mcpServers": {
    "liter-llm": {
      "command": "liter-llm",
      "args": ["mcp"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

Restart Claude Desktop. The 22 tools appear under the liter-llm server. Point liter-llm at a config file with --config /absolute/path/to/liter-llm-proxy.toml if you want virtual keys or a custom model list.

Cursor

Cursor reads MCP servers from ~/.cursor/mcp.json (or the workspace equivalent). Use the same shape as Claude Desktop:

{
  "mcpServers": {
    "liter-llm": {
      "command": "liter-llm",
      "args": ["mcp", "--config", "/absolute/path/to/liter-llm-proxy.toml"]
    }
  }
}

HTTP transport

Run the server in HTTP mode when the client is on a different machine or when you want to share one MCP server across several users. Pair it with a reverse proxy for TLS.

liter-llm mcp --transport http --host 0.0.0.0 --port 3001

To bind to loopback and load an explicit config file instead:

liter-llm mcp \
  --transport http \
  --host 127.0.0.1 \
  --port 3001 \
  --config ./liter-llm-proxy.toml

Smoke-test the endpoint with curl:

# List the 22 tools exposed by the server. The Streamable HTTP transport
# expects clients to accept both JSON and SSE responses, hence the Accept header.
curl -s http://127.0.0.1:3001/mcp \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

The HTTP endpoint is POST /mcp. Each request opens a short-lived session managed by rmcp's LocalSessionManager. There is no authentication on the MCP HTTP transport itself, so bind to loopback or put it behind an authenticated reverse proxy.
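
Most MCP clients handle the protocol handshake for you. If you script against POST /mcp by hand and the server asks for an initialized session, send an initialize request first; a minimal body looks like the sketch below, where the client name, version, and protocol version are illustrative.

{
  "jsonrpc": "2.0",
  "id": 0,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-03-26",
    "capabilities": {},
    "clientInfo": { "name": "smoke-test", "version": "0.1.0" }
  }
}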

HTTP transport has no built-in auth

Unlike the REST proxy, liter-llm mcp --transport http does not check Bearer tokens. Do not expose it to the public internet without a reverse proxy that handles authentication.

Shared configuration

The MCP server and the HTTP proxy use the same ProxyConfig loader. That means:

  • Models defined in [[models]] are callable as chat, embed, generate_image, and so on.
  • Glob overrides in [[aliases]] apply to MCP requests.
  • [cache] caches non-streaming responses across both surfaces.
  • [files] persists files uploaded via the create_file tool.
  • [[keys]] virtual keys are loaded but not enforced on MCP calls, since the transports are assumed trusted.

The master key is also loaded, but the MCP surface does not send Bearer tokens, so virtual-key RPM, TPM, and budget caps do not apply to MCP invocations today. Use [rate_limit] and [budget] for global caps that cover both surfaces.

Troubleshooting

  • "tool call failed: model 'foo' not found": the model parameter passed to the tool does not match any name in [[models]]. Check liter-llm-proxy.toml and restart.
  • stdio transport hangs on startup: the client expects a JSON-RPC handshake on stdin. Make sure you are launching liter-llm mcp from an MCP client, not an interactive shell.
  • HTTP transport returns 404: the endpoint is /mcp, not /. Every request is POST /mcp.
  • Image or audio tools return empty content: the underlying provider may not support the feature. Check Providers for per-provider capability.