Multi-Backend Support

UniLM.jl supports multiple LLM service backends through the ServiceEndpoint type hierarchy. Switching backends requires only changing the service parameter.

Available Backends

Backend            Type                     Env Variables
OpenAI (default)   OPENAIServiceEndpoint    OPENAI_API_KEY
Azure OpenAI       AZUREServiceEndpoint     AZURE_OPENAI_BASE_URL, AZURE_OPENAI_API_KEY, AZURE_OPENAI_API_VERSION
Google Gemini      GEMINIServiceEndpoint    GEMINI_API_KEY
DeepSeek           DeepSeekEndpoint         DEEPSEEK_API_KEY
Mistral            MistralEndpoint          MISTRAL_API_KEY
Ollama (local)     OllamaEndpoint           (none)
Generic            GenericOpenAIEndpoint    (passed to constructor)
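Because the backend is just a value passed through the service parameter, switching providers is a one-line change. A minimal sketch, assuming the endpoint types from the table above are exported and the relevant API keys are set in the environment:

```julia
using UniLM

# Same Chat API, three different backends — only `service` changes.
chat_openai = Chat(model="gpt-5.2")  # default: OPENAIServiceEndpoint
chat_gemini = Chat(service=GEMINIServiceEndpoint, model="gemini-2.5-flash")
chat_ollama = Chat(service=OllamaEndpoint(), model="llama3.1")  # local, no API key

for chat in (chat_openai, chat_gemini, chat_ollama)
    println(chat.service, " → ", chat.model)
end
```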

OpenAI (Default)

using UniLM

# OpenAI is the default — no need to specify service
chat = Chat(model="gpt-5.2")
println("Service: ", chat.service)
println("Model: ", chat.model)
Service: OPENAIServiceEndpoint
Model: gpt-5.2

Azure OpenAI

# Set environment variables
ENV["AZURE_OPENAI_BASE_URL"] = "https://your-resource.openai.azure.com"
ENV["AZURE_OPENAI_API_KEY"] = "your-key"
ENV["AZURE_OPENAI_API_VERSION"] = "2024-02-01"
ENV["AZURE_OPENAI_DEPLOY_NAME_GPT_5_2"] = "your-gpt52-deployment"

# Use Azure
chat = Chat(service=AZUREServiceEndpoint, model="gpt-5.2")
push!(chat, Message(Val(:system), "Hello from Azure!"))
push!(chat, Message(Val(:user), "Hi!"))
result = chatrequest!(chat)

Custom Deployment Names

If your Azure deployment has a custom name, register the model-to-deployment mapping so UniLM can resolve the request URL:

UniLM.add_azure_deploy_name!("my-custom-model", "my-deployment-name")
println("Registered deployments: ", collect(keys(UniLM._MODEL_ENDPOINTS_AZURE_OPENAI)))
delete!(UniLM._MODEL_ENDPOINTS_AZURE_OPENAI, "my-custom-model")  # cleanup
Dict{String, String} with 1 entry:
  "gpt-5.2" => "/openai/deployments/my-gpt52-deploy"

Google Gemini

ENV["GEMINI_API_KEY"] = "your-gemini-key"

chat = Chat(service=GEMINIServiceEndpoint, model="gemini-2.5-flash")
push!(chat, Message(Val(:system), "You are a helpful assistant."))
push!(chat, Message(Val(:user), "Hello!"))
result = chatrequest!(chat)

Available Gemini models:

  • "gemini-2.5-flash"
  • "gemini-2.5-pro"

Responses API Backend

The Responses API also supports the service parameter:

r = Respond(
    service=UniLM.OPENAIServiceEndpoint,
    model="gpt-5.2",
    input="Hello!",
)
println("Service: ", r.service)
println("Model: ", r.model)
Service: OPENAIServiceEndpoint
Model: gpt-5.2

OpenAI-Compatible Providers (Generic Endpoint)

Any provider that implements the OpenAI-compatible /v1/chat/completions endpoint can be used with GenericOpenAIEndpoint. This includes Ollama, vLLM, LM Studio, Mistral, and many others.

Ollama (local)

ep = OllamaEndpoint()  # defaults to http://localhost:11434
chat = Chat(service=ep, model="llama3.1")
println("URL: ", UniLM.get_url(chat))
URL: http://localhost:11434/v1/chat/completions

Mistral

chat = Chat(service=MistralEndpoint(), model="mistral-large-latest")
result = chatrequest!(chat)

DeepSeek

chat = Chat(service=DeepSeekEndpoint(), model="deepseek-chat")       # V3.2
chat = Chat(service=DeepSeekEndpoint(), model="deepseek-reasoner")   # V3.2 thinking mode

vLLM / LM Studio

# vLLM
chat = Chat(service=GenericOpenAIEndpoint("http://localhost:8000", ""), model="meta-llama/Llama-3.1-8B")

# LM Studio
chat = Chat(service=GenericOpenAIEndpoint("http://localhost:1234", ""), model="loaded-model")

Anthropic (compatibility layer)

Anthropic provides an OpenAI-compatible endpoint for evaluation purposes. Note: Anthropic considers this "not a long-term or production-ready solution" — features like response_format and strict are ignored.

chat = Chat(
    service=GenericOpenAIEndpoint("https://api.anthropic.com/v1", ENV["ANTHROPIC_API_KEY"]),
    model="claude-sonnet-4-6"
)

Custom Provider

ep = GenericOpenAIEndpoint("https://my-llm-server.example.com", "sk-my-key")
chat = Chat(service=ep, model="my-model")
println("URL: ", UniLM.get_url(chat))
println("Has auth: ", any(p -> p.first == "Authorization", UniLM.auth_header(ep)))
URL: https://my-llm-server.example.com/v1/chat/completions
Has auth: true

Embeddings with Generic Endpoint

Embeddings also support the service parameter:

emb = Embeddings("test"; service=OllamaEndpoint(), model="nomic-embed-text")
println("URL: ", UniLM.get_url(emb))
URL: http://localhost:11434/v1/embeddings

API Compatibility Tiers

API Surface        Standard Status              Supported Providers
Chat Completions   De facto standard            OpenAI, Azure, Gemini, Mistral, DeepSeek, Ollama, vLLM, LM Studio, Anthropic*
Embeddings         Widely adopted               OpenAI, Gemini, Mistral, Ollama, vLLM
Responses API      Emerging (Open Responses)    OpenAI, Ollama, vLLM, Amazon Bedrock
FIM Completion     Provider-specific            DeepSeek (beta), Ollama, vLLM
Image Generation   Limited                      OpenAI, Gemini, Ollama

*Anthropic compat layer is not production-recommended by Anthropic.
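The tiers above can also be checked at runtime rather than consulted manually. A hedged sketch that gates a provider-specific feature (FIM) on the capability query described in the next section; the :fim symbol and has_capability are as shown there, while the branch bodies are illustrative placeholders:

```julia
using UniLM

ep = DeepSeekEndpoint()  # reads DEEPSEEK_API_KEY from the environment

if has_capability(ep, :fim)
    # Safe to build a fill-in-the-middle request against this provider.
    println("FIM supported — proceeding")
else
    println("FIM not supported by this provider — falling back to chat")
end
```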

Querying Provider Capabilities

Use has_capability to check what a provider supports before making requests:

for (name, svc) in [
    ("OpenAI", OPENAIServiceEndpoint),
    ("DeepSeek", DeepSeekEndpoint("k")),
    ("Ollama", OllamaEndpoint())
]
    caps = join(sort(collect(provider_capabilities(svc))), ", ")
    println("$name: $caps")
end
OpenAI: chat, embeddings, images, json_output, responses, tools
DeepSeek: chat, fim, json_output, prefix_completion, tools
Ollama: chat, embeddings, fim, responses, tools

# Check specific capabilities
println("DeepSeek FIM: ", has_capability(DeepSeekEndpoint("k"), :fim))
println("OpenAI FIM: ", has_capability(OPENAIServiceEndpoint, :fim))
DeepSeek FIM: true
OpenAI FIM: false

See Also