Embeddings¶
The embed method generates vector embeddings from text input. Embeddings are fixed-length numeric arrays that capture semantic meaning -- useful for search, clustering, and retrieval-augmented generation (RAG).
Basic Usage¶
```python
import asyncio
import os

from liter_llm import LlmClient


async def main() -> None:
    client = LlmClient(api_key=os.environ["OPENAI_API_KEY"])
    response = await client.embed(
        model="openai/text-embedding-3-small",
        input=["The quick brown fox jumps over the lazy dog"],
    )
    print(f"Dimensions: {len(response.data[0].embedding)}")
    print(f"First 5 values: {response.data[0].embedding[:5]}")


asyncio.run(main())
```
```typescript
import { LlmClient } from "@kreuzberg/liter-llm";

const client = new LlmClient({ apiKey: process.env.OPENAI_API_KEY! });

const response = await client.embed({
  model: "openai/text-embedding-3-small",
  input: ["The quick brown fox jumps over the lazy dog"],
});

console.log(`Dimensions: ${response.data[0].embedding.length}`);
console.log(`First 5 values: ${response.data[0].embedding.slice(0, 5)}`);
```
```go
package main

import (
	"context"
	"fmt"
	"os"

	llm "github.com/kreuzberg-dev/liter-llm/packages/go"
)

func main() {
	client := llm.NewClient(llm.WithAPIKey(os.Getenv("OPENAI_API_KEY")))
	resp, err := client.Embed(context.Background(), &llm.EmbeddingRequest{
		Model: "openai/text-embedding-3-small",
		Input: llm.NewEmbeddingInputMultiple([]string{"The quick brown fox jumps over the lazy dog"}),
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("Dimensions: %d\n", len(resp.Data[0].Embedding))
	fmt.Printf("First 5 values: %v\n", resp.Data[0].Embedding[:5])
}
```
Supported Providers¶
Not all providers support embeddings. The major embedding providers include:
| Provider | Prefix | Example model |
|---|---|---|
| OpenAI | `openai/` | `text-embedding-3-small`, `text-embedding-3-large` |
| Azure | `azure/` | `text-embedding-ada-002` |
| Cohere | `cohere/` | `embed-english-v3.0` |
| Voyage AI | `voyage/` | `voyage-3` |
| Mistral | `mistral/` | `mistral-embed` |
| Hugging Face | `huggingface/` | Various |
| Google Vertex AI | `vertex_ai/` | `text-embedding-004` |
| AWS Bedrock | `bedrock/` | `amazon.titan-embed-text-v2:0` |
| Ollama | `ollama/` | `nomic-embed-text` |
| Jina AI | `jina_ai/` | `jina-embeddings-v3` |
See the Providers page for the complete capability matrix.
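The provider prefix in the model string is what drives routing. As a tiny illustration of the convention, a helper like the following (hypothetical; liter-llm performs this routing internally, and `split_model_id` is not part of its public API) splits at the first slash:

```python
def split_model_id(model: str) -> tuple[str, str]:
    """Split a unified model identifier into (provider, model) parts.

    Hypothetical helper illustrating the prefix convention shown in the
    table above; the part before the first "/" selects the provider.
    """
    provider, _, name = model.partition("/")
    return provider, name


print(split_model_id("openai/text-embedding-3-small"))
print(split_model_id("bedrock/amazon.titan-embed-text-v2:0"))
```

Because only the prefix changes, switching providers means changing the model string (and supplying that provider's API key) while the rest of the `embed` call stays the same.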
Batch Embeddings¶
Pass multiple strings to embed them in a single request:
```python
response = await client.embed(
    model="openai/text-embedding-3-small",
    input=[
        "First document to embed",
        "Second document to embed",
        "Third document to embed",
    ],
)

for i, item in enumerate(response.data):
    print(f"Document {i}: {len(item.embedding)} dimensions")
```
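Batch embeddings are the building block for semantic search: embed your documents once, embed each query, and rank documents by cosine similarity. A minimal sketch, using toy 3-dimensional vectors in place of real `response.data[i].embedding` values:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for document embeddings.
doc_vectors = [
    [0.1, 0.9, 0.0],
    [0.2, 0.8, 0.1],
    [0.9, 0.0, 0.1],
]
query = [0.15, 0.85, 0.05]

# Rank document indices by similarity to the query, best first.
ranked = sorted(
    range(len(doc_vectors)),
    key=lambda i: cosine_similarity(query, doc_vectors[i]),
    reverse=True,
)
print(f"Best match: document {ranked[0]}")
```

In production you would typically store the vectors in a vector database rather than scanning them in Python, but the ranking principle is the same.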
Choosing a Model¶
Key considerations when selecting an embedding model:
| Factor | Guidance |
|---|---|
| Dimensions | Higher dimensions capture more nuance but use more storage. OpenAI's `text-embedding-3-small` outputs 1536 dimensions; `text-embedding-3-large` outputs 3072. |
| Cost | Embedding models are significantly cheaper per token than chat models. |
| Latency | Local providers such as Ollama avoid network round-trips but may produce lower-quality embeddings. |
| Quality | Evaluate on your specific retrieval task. The MTEB leaderboard is a good starting point. |
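The storage side of the dimensions trade-off is easy to estimate: a float32 index needs roughly documents × dimensions × 4 bytes. A quick back-of-the-envelope sketch (the document counts are illustrative):

```python
def index_size_bytes(num_docs: int, dims: int, bytes_per_value: int = 4) -> int:
    """Approximate raw size of a float32 vector index, ignoring metadata."""
    return num_docs * dims * bytes_per_value


# One million documents at 1536 dims (text-embedding-3-small)
# versus 3072 dims (text-embedding-3-large).
small = index_size_bytes(1_000_000, 1536)
large = index_size_bytes(1_000_000, 3072)
print(f"1536-dim index: {small / 1e9:.1f} GB")  # ~6.1 GB
print(f"3072-dim index: {large / 1e9:.1f} GB")  # ~12.3 GB
```

Doubling the dimensions doubles raw storage (and similarity-scan cost), so higher-dimensional models should earn their keep on your retrieval benchmarks.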