Embeddings

The embed method generates vector embeddings from text input. Embeddings are fixed-length numeric arrays that capture semantic meaning, which makes them useful for search, clustering, and retrieval-augmented generation (RAG).
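Two texts with related meanings yield vectors that point in similar directions, which is typically measured with cosine similarity. Here is a minimal, dependency-free sketch of that comparison (the helper below is illustrative, not part of the liter-llm API):

import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # dot(a, b) / (|a| * |b|): 1.0 means identical direction, 0.0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)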

Basic Usage

Python

import asyncio
import os
from liter_llm import LlmClient

async def main() -> None:
    client = LlmClient(api_key=os.environ["OPENAI_API_KEY"])
    response = await client.embed(
        model="openai/text-embedding-3-small",
        input=["The quick brown fox jumps over the lazy dog"],
    )
    print(f"Dimensions: {len(response.data[0].embedding)}")
    print(f"First 5 values: {response.data[0].embedding[:5]}")

asyncio.run(main())

TypeScript

import { LlmClient } from "@kreuzberg/liter-llm";

const client = new LlmClient({ apiKey: process.env.OPENAI_API_KEY! });
const response = await client.embed({
  model: "openai/text-embedding-3-small",
  input: ["The quick brown fox jumps over the lazy dog"],
});
console.log(`Dimensions: ${response.data[0].embedding.length}`);
console.log(`First 5 values: ${response.data[0].embedding.slice(0, 5)}`);

Go

package main

import (
    "context"
    "fmt"
    "os"

    llm "github.com/kreuzberg-dev/liter-llm/packages/go"
)

func main() {
    client := llm.NewClient(llm.WithAPIKey(os.Getenv("OPENAI_API_KEY")))
    resp, err := client.Embed(context.Background(), &llm.EmbeddingRequest{
        Model: "openai/text-embedding-3-small",
        Input: llm.NewEmbeddingInputMultiple([]string{"The quick brown fox jumps over the lazy dog"}),
    })
    if err != nil {
        panic(err)
    }
    fmt.Printf("Dimensions: %d\n", len(resp.Data[0].Embedding))
    fmt.Printf("First 5 values: %v\n", resp.Data[0].Embedding[:5])
}

Supported Providers

Not all providers support embeddings. The major embedding providers include:

| Provider | Prefix | Example models |
| --- | --- | --- |
| OpenAI | openai/ | text-embedding-3-small, text-embedding-3-large |
| Azure | azure/ | text-embedding-ada-002 |
| Cohere | cohere/ | embed-english-v3.0 |
| Voyage AI | voyage/ | voyage-3 |
| Mistral | mistral/ | mistral-embed |
| Hugging Face | huggingface/ | Various |
| Google Vertex AI | vertex_ai/ | text-embedding-004 |
| AWS Bedrock | bedrock/ | amazon.titan-embed-text-v2:0 |
| Ollama | ollama/ | nomic-embed-text |
| Jina AI | jina_ai/ | jina-embeddings-v3 |

See the Providers page for the complete capability matrix.
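Because the provider is encoded in the model prefix, switching providers only requires changing the model string. A sketch, assuming the relevant provider credentials are configured (how each provider's API key is supplied is provider-specific):

# Same call shape, different provider: only the model string changes.
response = await client.embed(
    model="cohere/embed-english-v3.0",
    input=["The quick brown fox jumps over the lazy dog"],
)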

Batch Embeddings

Pass multiple strings to embed them in a single request:

response = await client.embed(
    model="openai/text-embedding-3-small",
    input=[
        "First document to embed",
        "Second document to embed",
        "Third document to embed",
    ],
)
for i, item in enumerate(response.data):
    print(f"Document {i}: {len(item.embedding)} dimensions")

Choosing a Model

Key considerations when selecting an embedding model:

| Factor | Guidance |
| --- | --- |
| Dimensions | Higher-dimensional vectors capture more nuance but cost more to store and compare. OpenAI's text-embedding-3-small outputs 1536 dimensions by default; text-embedding-3-large outputs 3072. |
| Cost | Embedding models are significantly cheaper per token than chat models. |
| Latency | Local providers (Ollama) avoid network round-trips, but small local models may produce lower-quality embeddings than large hosted ones. |
| Quality | Evaluate on your own retrieval task; the MTEB leaderboard is a good starting point. |
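A lightweight way to compare candidates is a small retrieval check against queries whose correct document you already know. The sketch below reuses the cosine_similarity helper from the introduction; the sample documents and the rank_correct_doc helper are illustrative, not part of the liter-llm API:

docs = [
    "The quick brown fox jumps over the lazy dog",
    "Stock markets fell sharply on Monday",
]

async def rank_correct_doc(model: str, query: str, correct: int) -> int:
    # Embed the query and all documents in one request, then return the
    # rank (0 = best) that the known-correct document receives.
    response = await client.embed(model=model, input=[query, *docs])
    query_vec = response.data[0].embedding
    doc_vecs = [item.embedding for item in response.data[1:]]
    scores = [cosine_similarity(query_vec, vec) for vec in doc_vecs]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return order.index(correct)

for model in ["openai/text-embedding-3-small", "openai/text-embedding-3-large"]:
    rank = await rank_correct_doc(model, "a fast animal leaping", correct=0)
    print(f"{model}: correct document ranked #{rank}")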