# Multi-Backend Support

UniLM.jl supports multiple LLM service backends through the `ServiceEndpoint` type hierarchy. Switching backends requires only changing the `service` parameter.

## Available Backends
| Backend | Type | Env Variables |
|---|---|---|
| OpenAI (default) | `OPENAIServiceEndpoint` | `OPENAI_API_KEY` |
| Azure OpenAI | `AZUREServiceEndpoint` | `AZURE_OPENAI_BASE_URL`, `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_API_VERSION` |
| Google Gemini | `GEMINIServiceEndpoint` | `GEMINI_API_KEY` |
| DeepSeek | `DeepSeekEndpoint` | `DEEPSEEK_API_KEY` |
| Mistral | `MistralEndpoint` | `MISTRAL_API_KEY` |
| Ollama (local) | `OllamaEndpoint` | (none) |
| Generic | `GenericOpenAIEndpoint` | (passed to constructor) |
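To illustrate the switching claim above, the same conversation can be pointed at several of these backends by changing only `service` (a sketch; the model names are illustrative, and each hosted backend still needs its environment variable set as listed in the table):

```julia
using UniLM

# The conversation is built the same way regardless of backend.
msgs = [Message(Val(:system), "You are a helpful assistant."),
        Message(Val(:user), "Hello!")]

# Only the `service` (and a matching model name) changes per backend.
for (service, model) in [
    (OPENAIServiceEndpoint, "gpt-5.2"),          # reads OPENAI_API_KEY
    (GEMINIServiceEndpoint, "gemini-2.5-flash"), # reads GEMINI_API_KEY
    (OllamaEndpoint(), "llama3.1"),              # local, no key needed
]
    chat = Chat(service=service, model=model)
    for m in msgs
        push!(chat, m)
    end
    result = chatrequest!(chat)
end
```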
## OpenAI (Default)

```julia
using UniLM
using JSON

# OpenAI is the default — no need to specify service
chat = Chat(model="gpt-5.2")
println("Service: ", chat.service)
println("Model: ", chat.model)
```

```
Service: OPENAIServiceEndpoint
Model: gpt-5.2
```

## Azure OpenAI
```julia
# Set environment variables
ENV["AZURE_OPENAI_BASE_URL"] = "https://your-resource.openai.azure.com"
ENV["AZURE_OPENAI_API_KEY"] = "your-key"
ENV["AZURE_OPENAI_API_VERSION"] = "2024-02-01"
ENV["AZURE_OPENAI_DEPLOY_NAME_GPT_5_2"] = "your-gpt52-deployment"

# Use Azure
chat = Chat(service=AZUREServiceEndpoint, model="gpt-5.2")
push!(chat, Message(Val(:system), "Hello from Azure!"))
push!(chat, Message(Val(:user), "Hi!"))
result = chatrequest!(chat)
```

### Custom Deployment Names
If your Azure deployment has a custom name:
```julia
UniLM.add_azure_deploy_name!("my-custom-model", "my-deployment-name")
println("Registered deployments: ", collect(keys(UniLM._MODEL_ENDPOINTS_AZURE_OPENAI)))
delete!(UniLM._MODEL_ENDPOINTS_AZURE_OPENAI, "my-custom-model")  # cleanup
```

```
Dict{String, String} with 1 entry:
  "gpt-5.2" => "/openai/deployments/my-gpt52-deploy"
```

## Google Gemini
```julia
ENV["GEMINI_API_KEY"] = "your-gemini-key"

chat = Chat(service=GEMINIServiceEndpoint, model="gemini-2.5-flash")
push!(chat, Message(Val(:system), "You are a helpful assistant."))
push!(chat, Message(Val(:user), "Hello!"))
result = chatrequest!(chat)
```

Available Gemini models:

- `gemini-2.5-flash`
- `gemini-2.5-pro`
## Responses API Backend

The Responses API also supports the `service` parameter:

```julia
r = Respond(
    service=UniLM.OPENAIServiceEndpoint,
    model="gpt-5.2",
    input="Hello!",
)
println("Service: ", r.service)
println("Model: ", r.model)
```

```
Service: OPENAIServiceEndpoint
Model: gpt-5.2
```

## OpenAI-Compatible Providers (Generic Endpoint)
Any provider that implements the OpenAI-compatible `/v1/chat/completions` endpoint can be used with `GenericOpenAIEndpoint`. This includes Ollama, vLLM, LM Studio, Mistral, and many others.
### Ollama (local)

```julia
ep = OllamaEndpoint()  # defaults to http://localhost:11434
chat = Chat(service=ep, model="llama3.1")
println("URL: ", UniLM.get_url(chat))
```

```
URL: http://localhost:11434/v1/chat/completions
```

### Mistral
```julia
chat = Chat(service=MistralEndpoint(), model="mistral-large-latest")
result = chatrequest!(chat)
```

### DeepSeek
```julia
chat = Chat(service=DeepSeekEndpoint(), model="deepseek-chat")      # V3.2
chat = Chat(service=DeepSeekEndpoint(), model="deepseek-reasoner")  # V3.2 thinking mode
```

### vLLM / LM Studio
```julia
# vLLM
chat = Chat(service=GenericOpenAIEndpoint("http://localhost:8000", ""), model="meta-llama/Llama-3.1-8B")

# LM Studio
chat = Chat(service=GenericOpenAIEndpoint("http://localhost:1234", ""), model="loaded-model")
```

### Anthropic (compatibility layer)
Anthropic provides an OpenAI-compatible endpoint for evaluation purposes. Note: Anthropic considers this "not a long-term or production-ready solution", and features such as `response_format` and `strict` are ignored.
```julia
chat = Chat(
    service=GenericOpenAIEndpoint("https://api.anthropic.com/v1", ENV["ANTHROPIC_API_KEY"]),
    model="claude-sonnet-4-6"
)
```

### Custom Provider
```julia
ep = GenericOpenAIEndpoint("https://my-llm-server.example.com", "sk-my-key")
chat = Chat(service=ep, model="my-model")
println("URL: ", UniLM.get_url(chat))
println("Has auth: ", any(p -> p.first == "Authorization", UniLM.auth_header(ep)))
```

```
URL: https://my-llm-server.example.com/v1/chat/completions
Has auth: true
```

## Embeddings with Generic Endpoint
Embeddings also support the `service` parameter:

```julia
emb = Embeddings("test"; service=OllamaEndpoint(), model="nomic-embed-text")
println("URL: ", UniLM.get_url(emb))
```

```
URL: http://localhost:11434/v1/embeddings
```

## API Compatibility Tiers
| API Surface | Standard Status | Supported Providers |
|---|---|---|
| Chat Completions | De facto standard | OpenAI, Azure, Gemini, Mistral, DeepSeek, Ollama, vLLM, LM Studio, Anthropic* |
| Embeddings | Widely adopted | OpenAI, Gemini, Mistral, Ollama, vLLM |
| Responses API | Emerging (Open Responses) | OpenAI, Ollama, vLLM, Amazon Bedrock |
| FIM Completion | Provider-specific | DeepSeek (beta), Ollama, vLLM |
| Image Generation | Limited | OpenAI, Gemini, Ollama |
*Anthropic compat layer is not production-recommended by Anthropic.
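Since support varies by tier, it can be worth guarding a request on the provider's capabilities (queried via `has_capability`, covered in the next section). A minimal sketch, where `safe_embed` is a hypothetical helper, not part of the UniLM.jl API:

```julia
using UniLM

# Hypothetical guard: fail fast if the backend lacks the :embeddings capability
# before constructing the request.
function safe_embed(text, ep; model)
    has_capability(ep, :embeddings) || error("backend does not support embeddings")
    Embeddings(text; service=ep, model=model)
end

emb = safe_embed("test", OllamaEndpoint(); model="nomic-embed-text")
```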
## Querying Provider Capabilities

Use `has_capability` to check what a provider supports before making requests:

```julia
for (name, svc) in [
    ("OpenAI", OPENAIServiceEndpoint),
    ("DeepSeek", DeepSeekEndpoint("k")),
    ("Ollama", OllamaEndpoint())
]
    caps = join(sort(collect(provider_capabilities(svc))), ", ")
    println("$name: $caps")
end
```

```
OpenAI: chat, embeddings, images, json_output, responses, tools
DeepSeek: chat, fim, json_output, prefix_completion, tools
Ollama: chat, embeddings, fim, responses, tools
```

```julia
# Check specific capabilities
println("DeepSeek FIM: ", has_capability(DeepSeekEndpoint("k"), :fim))
println("OpenAI FIM: ", has_capability(OPENAIServiceEndpoint, :fim))
```

```
DeepSeek FIM: true
OpenAI FIM: false
```

## See Also
- `ServiceEndpoint`, `GenericOpenAIEndpoint` — endpoint types
- `OllamaEndpoint`, `MistralEndpoint` — convenience constructors
- `OPENAIServiceEndpoint`, `AZUREServiceEndpoint`, `GEMINIServiceEndpoint` — built-in backends