Embeddings¶
The embed method generates vector embeddings from text input. Embeddings are fixed-length numeric arrays that capture semantic meaning -- useful for search, clustering, and retrieval-augmented generation (RAG).
Basic Usage¶
```python
import asyncio
import os

from liter_llm import LlmClient


async def main() -> None:
    client = LlmClient(api_key=os.environ["OPENAI_API_KEY"])
    response = await client.embed(
        model="openai/text-embedding-3-small",
        input=["The quick brown fox jumps over the lazy dog"],
    )
    print(f"Dimensions: {len(response.data[0].embedding)}")
    print(f"First 5 values: {response.data[0].embedding[:5]}")


asyncio.run(main())
```
```typescript
import { LlmClient } from "@kreuzberg/liter-llm";

const client = new LlmClient({ apiKey: process.env.OPENAI_API_KEY! });

const response = await client.embed({
  model: "openai/text-embedding-3-small",
  input: ["The quick brown fox jumps over the lazy dog"],
});

console.log(`Dimensions: ${response.data[0].embedding.length}`);
console.log(`First 5 values: ${response.data[0].embedding.slice(0, 5)}`);
```
```go
package main

import (
	"context"
	"fmt"
	"os"

	llm "github.com/kreuzberg-dev/liter-llm/packages/go"
)

func main() {
	client := llm.NewClient(llm.WithAPIKey(os.Getenv("OPENAI_API_KEY")))
	resp, err := client.Embed(context.Background(), &llm.EmbeddingRequest{
		Model: "openai/text-embedding-3-small",
		Input: llm.NewEmbeddingInputMultiple([]string{"The quick brown fox jumps over the lazy dog"}),
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("Dimensions: %d\n", len(resp.Data[0].Embedding))
	fmt.Printf("First 5 values: %v\n", resp.Data[0].Embedding[:5])
}
```
Supported Providers¶
Not all providers support embeddings. The major embedding providers include:
| Provider | Prefix | Example model |
|---|---|---|
| OpenAI | `openai/` | `text-embedding-3-small`, `text-embedding-3-large` |
| Azure | `azure/` | `text-embedding-ada-002` |
| Cohere | `cohere/` | `embed-english-v3.0` |
| Voyage AI | `voyage/` | `voyage-3` |
| Mistral | `mistral/` | `mistral-embed` |
| Hugging Face | `huggingface/` | Various |
| Google Vertex AI | `vertex_ai/` | `text-embedding-004` |
| AWS Bedrock | `bedrock/` | `amazon.titan-embed-text-v2:0` |
| Ollama | `ollama/` | `nomic-embed-text` |
| Jina AI | `jina_ai/` | `jina-embeddings-v3` |
See the Providers page for the complete capability matrix.
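The provider prefix in the model string is what drives routing. As a tiny illustration of the convention, a helper like the following (hypothetical; liter-llm performs this routing internally, and `split_model_id` is not part of its public API) splits at the first slash:

```python
def split_model_id(model: str) -> tuple[str, str]:
    """Split a unified model identifier into (provider, model) parts.

    Hypothetical helper illustrating the prefix convention shown in the
    table above; the part before the first "/" selects the provider.
    """
    provider, _, name = model.partition("/")
    return provider, name


print(split_model_id("openai/text-embedding-3-small"))
print(split_model_id("bedrock/amazon.titan-embed-text-v2:0"))
```

Because only the prefix changes, switching providers means changing the model string (and supplying that provider's API key) while the rest of the `embed` call stays the same.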
Batch Embeddings¶
Pass multiple strings to embed them in a single request:
```python
response = await client.embed(
    model="openai/text-embedding-3-small",
    input=[
        "First document to embed",
        "Second document to embed",
        "Third document to embed",
    ],
)

for i, item in enumerate(response.data):
    print(f"Document {i}: {len(item.embedding)} dimensions")
```
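Batch embeddings are the building block for semantic search: embed your documents once, embed each query, and rank documents by cosine similarity. A minimal sketch, using toy 3-dimensional vectors in place of real `response.data[i].embedding` values:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for document embeddings.
doc_vectors = [
    [0.1, 0.9, 0.0],
    [0.2, 0.8, 0.1],
    [0.9, 0.0, 0.1],
]
query = [0.15, 0.85, 0.05]

# Rank document indices by similarity to the query, best first.
ranked = sorted(
    range(len(doc_vectors)),
    key=lambda i: cosine_similarity(query, doc_vectors[i]),
    reverse=True,
)
print(f"Best match: document {ranked[0]}")
```

In production you would typically store the vectors in a vector database rather than scanning them in Python, but the ranking principle is the same.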
Choosing a Model¶
Key considerations when selecting an embedding model:
| Factor | Guidance |
|---|---|
| Dimensions | Higher dimensions capture more nuance but use more storage. OpenAI's `text-embedding-3-small` outputs 1536 dimensions; `text-embedding-3-large` outputs 3072. |
| Cost | Embedding models are significantly cheaper per token than chat models. |
| Latency | Local providers such as Ollama avoid network round-trips but may produce lower-quality embeddings. |
| Quality | Evaluate on your specific retrieval task. The MTEB leaderboard is a good starting point. |
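The storage side of the dimensions trade-off is easy to estimate: a float32 index needs roughly documents × dimensions × 4 bytes. A quick back-of-the-envelope sketch (the document counts are illustrative):

```python
def index_size_bytes(num_docs: int, dims: int, bytes_per_value: int = 4) -> int:
    """Approximate raw size of a float32 vector index, ignoring metadata."""
    return num_docs * dims * bytes_per_value


# One million documents at 1536 dims (text-embedding-3-small)
# versus 3072 dims (text-embedding-3-large).
small = index_size_bytes(1_000_000, 1536)
large = index_size_bytes(1_000_000, 3072)
print(f"1536-dim index: {small / 1e9:.1f} GB")  # ~6.1 GB
print(f"3072-dim index: {large / 1e9:.1f} GB")  # ~12.3 GB
```

Doubling the dimensions doubles raw storage (and similarity-scan cost), so higher-dimensional models should earn their keep on your retrieval benchmarks.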