# Embeddings

UniLM.jl supports text embeddings via the OpenAI Embeddings API, using the `text-embedding-3-small` model (1536 dimensions) by default.

## Basic Usage

```julia
using UniLM

# Single text
emb = Embeddings("Julia is a high-performance programming language")
println("Model: ", emb.model)
println("Embedding dimensions: ", length(emb.embeddings))
println("Pre-allocated (all zeros): ", all(x -> x == 0.0, emb.embeddings))
```

```
Model: text-embedding-3-small
Embedding dimensions: 1536
Pre-allocated (all zeros): true
```

After calling the API, the embeddings are filled in-place:

```julia
emb = Embeddings("Julia is a high-performance programming language for technical computing.")
embeddingrequest!(emb)
println("First 5 dimensions:")
for v in emb.embeddings[1:5]
    println("  ", round(v, digits=6))
end
println("L2 norm: ", round(sqrt(sum(x^2 for x in emb.embeddings)), digits=4))
```

```
First 5 dimensions:
  -0.039459
  -0.009293
  0.001677
  -0.028122
  0.063354
L2 norm: 0.9997
```

## Batch Embeddings

Embed multiple texts in a single API call:

```julia
texts = [
    "Julia is fast",
    "Python is popular",
    "Rust is safe"
]

emb = Embeddings(texts)
println("Model: ", emb.model)
println("Number of texts: ", length(emb.input))
println("Embeddings per text: ", length(emb.embeddings[1]), " dimensions")
```

```
Model: text-embedding-3-small
Number of texts: 3
Embeddings per text: 1536 dimensions
```

```julia
embeddingrequest!(emb)
println("Embedding dimensions per text: ", length(emb.embeddings[1]))
```

```
Embedding dimensions per text: 1536
```

## Computing Similarity

A common use case is computing cosine similarity between embeddings:

```julia
using LinearAlgebra

emb = Embeddings(["Julia", "Python", "Rust", "Fortran"])
embeddingrequest!(emb)

sim = dot(emb.embeddings[1], emb.embeddings[4]) /
      (norm(emb.embeddings[1]) * norm(emb.embeddings[4]))
println("Cosine similarity (Julia vs Fortran): ", round(sim, digits=4))
```

```
Cosine similarity (Julia vs Fortran): 0.3111
```
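Since the output earlier shows an L2 norm of roughly 1, the `text-embedding-3` models return approximately unit-normalised vectors, and cosine similarity is essentially a dot product. For comparing many texts at once, a small helper builds the full pairwise similarity matrix (a sketch; `cossim` and `similarity_matrix` are not part of UniLM.jl):

```julia
using LinearAlgebra

# Cosine similarity; dividing by the norms keeps the result correct
# even when the vectors are not perfectly unit-normalised.
cossim(a, b) = dot(a, b) / (norm(a) * norm(b))

# n×n matrix of pairwise similarities for a list of vectors.
similarity_matrix(vecs) = [cossim(a, b) for a in vecs, b in vecs]

# Toy vectors standing in for real API results:
vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
S = similarity_matrix(vecs)
# Diagonal entries are 1.0; S[1, 3] == 1/√2 ≈ 0.7071
```

With real data, pass `emb.embeddings` (after `embeddingrequest!`) in place of the toy vectors.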

## Available Models

| Model | Dimensions | Notes |
|:--|--:|:--|
| `text-embedding-3-small` | 1536 | Default, good balance |
| `text-embedding-3-large` | 3072 | Higher quality, more dimensions |
**Note:** The default `Embeddings` constructor pre-allocates for 1536 dimensions (`text-embedding-3-small`). To use `text-embedding-3-large`, you would need to adjust the embedding vector size accordingly.
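One way that adjustment might look (a hypothetical sketch: the `model` keyword and the manual `resize!` below are assumptions, not confirmed UniLM.jl API — check the Embeddings API page for the actual constructor signature):

```julia
# HYPOTHETICAL: assumes the constructor accepts a model name and
# that the pre-allocated buffer can be resized to 3072 dimensions.
emb = Embeddings("some text"; model = "text-embedding-3-large")
resize!(emb.embeddings, 3072)
embeddingrequest!(emb)
```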

## In-Place Design

The `Embeddings` struct pre-allocates the embedding vectors at construction time, and `embeddingrequest!` fills them in-place, so there is no allocation on the hot path. This is idiomatic Julia for performance-sensitive workloads.

```julia
emb = Embeddings("test")
println("Pre-allocated length: ", length(emb.embeddings))
println("All zeros before API call: ", all(x -> x == 0.0, emb.embeddings))
```

```
Pre-allocated length: 1536
All zeros before API call: true
```

## Retry Behaviour

`embeddingrequest!` automatically retries on HTTP 429, 500, and 503 errors with exponential backoff and jitter (up to 30 attempts, 60 s maximum delay). On 429 responses, the `Retry-After` header is respected.
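That kind of schedule can be sketched as follows (an illustration of exponential backoff with full jitter and a 60 s cap, not UniLM.jl's exact internals):

```julia
# Exponential backoff with full jitter: the raw ceiling doubles each
# attempt and is capped at `cap`; the actual delay is drawn uniformly
# from [0, raw) so that concurrent clients do not retry in lockstep.
function backoff_delay(attempt; base = 1.0, cap = 60.0)
    raw = min(cap, base * 2.0^(attempt - 1))
    rand() * raw
end

# The un-jittered ceiling grows geometrically until the cap dominates:
[min(60.0, 2.0^(a - 1)) for a in 1:8]
# -> [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```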

## API Reference

See the Embeddings API page for full type documentation.