Service Endpoints

Types for configuring multi-backend service endpoints.

Abstract Type

UniLM.ServiceEndpoint (Type)
ServiceEndpoint

Abstract supertype for LLM service backends. Subtypes control URL routing and authentication.

Built-in subtypes:

  • OPENAIServiceEndpoint — OpenAI API (default)
  • AZUREServiceEndpoint — Azure OpenAI Service
  • GEMINIServiceEndpoint — Google Gemini via OpenAI-compatible endpoint
  • GenericOpenAIEndpoint — any OpenAI-compatible provider (Ollama, Mistral, vLLM, etc.)
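A quick sketch of the dispatch relationship, assuming the subtypes are exported as listed above:

using UniLM

# Every backend is a subtype of the same abstract type:
GenericOpenAIEndpoint <: ServiceEndpoint       # true
UniLM.AZUREServiceEndpoint <: ServiceEndpoint  # true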

Built-in Endpoints

UniLM.AZUREServiceEndpoint (Type)

Azure OpenAI Service endpoint. Requires the AZURE_OPENAI_BASE_URL, AZURE_OPENAI_API_KEY, and AZURE_OPENAI_API_VERSION environment variables.

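For illustration, a minimal setup sketch; the resource URL, key, and deployment below are placeholders, not real values:

using UniLM

ENV["AZURE_OPENAI_BASE_URL"]    = "https://my-resource.openai.azure.com"
ENV["AZURE_OPENAI_API_KEY"]     = "<your-azure-key>"   # never commit real keys
ENV["AZURE_OPENAI_API_VERSION"] = "2024-12-01-preview"

chat = Chat(service=UniLM.AZUREServiceEndpoint, model="gpt-5.2")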

Generic Endpoint

UniLM.GenericOpenAIEndpoint (Type)
GenericOpenAIEndpoint <: ServiceEndpoint

Configurable endpoint for any OpenAI-compatible API provider. Supports Chat Completions, Embeddings, and (where the provider implements it) the Responses API.

Fields

  • base_url::String: Base URL without trailing slash (e.g., "http://localhost:11434")
  • api_key::String: API key for Bearer auth. Use "" for local servers with no auth.

Example

# Ollama (local)
chat = Chat(service=GenericOpenAIEndpoint("http://localhost:11434", ""), model="llama3.1")

# Mistral
chat = Chat(service=GenericOpenAIEndpoint("https://api.mistral.ai", ENV["MISTRAL_API_KEY"]),
            model="mistral-large-latest")
UniLM.ServiceEndpointSpec (Type)
ServiceEndpointSpec

Type alias accepting both marker types (OPENAIServiceEndpoint) and instances (GenericOpenAIEndpoint(...)). Used as the type of service fields.

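Both forms can be passed wherever a service is expected; a sketch assuming the constructors shown earlier:

using UniLM

# Marker type (zero-configuration backends):
chat = Chat(service=UniLM.OPENAIServiceEndpoint, model="gpt-5.2")

# Instance (configurable backends):
chat = Chat(service=GenericOpenAIEndpoint("http://localhost:11434", ""),
            model="llama3.1")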
UniLM.OllamaEndpoint (Function)
OllamaEndpoint(; base_url="http://localhost:11434") -> GenericOpenAIEndpoint

Pre-configured endpoint for Ollama local server.

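Usage sketch; the non-default port below is purely illustrative:

using UniLM

# Default local server:
chat = Chat(service=OllamaEndpoint(), model="llama3.1")

# Ollama listening on a non-default port:
chat = Chat(service=OllamaEndpoint(base_url="http://localhost:11500"),
            model="llama3.1")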
UniLM.DeepSeekEndpoint (Type)
DeepSeekEndpoint <: ServiceEndpoint

Pre-configured endpoint for DeepSeek API. Supports chat completions, tool calling, FIM completion, and prefix completion.

FIM and prefix completion use the beta base URL (https://api.deepseek.com/beta).

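A hedged usage sketch; the constructor argument and the DEEPSEEK_API_KEY variable name are assumptions, not part of the documented API:

using UniLM

chat = Chat(service=DeepSeekEndpoint(ENV["DEEPSEEK_API_KEY"]),
            model="deepseek-chat")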

Configuration

Each endpoint reads its configuration from environment variables:

OpenAI (default)

Variable          Description
OPENAI_API_KEY    Your OpenAI API key

Azure OpenAI

Variable                            Description
AZURE_OPENAI_BASE_URL               Azure endpoint base URL
AZURE_OPENAI_API_KEY                Azure API key
AZURE_OPENAI_API_VERSION            API version (e.g. 2024-12-01-preview)
AZURE_OPENAI_DEPLOY_NAME_GPT_5_2    Deployment name for gpt-5.2

Google Gemini

Variable          Description
GEMINI_API_KEY    Your Gemini API key
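Selecting the Gemini backend mirrors the other marker types; the key is a placeholder and the model name is illustrative:

using UniLM

ENV["GEMINI_API_KEY"] = "<your-gemini-key>"
chat = Chat(service=UniLM.GEMINIServiceEndpoint, model="gemini-2.0-flash")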

Azure Deployment Mapping

Azure requires model-to-deployment name mappings. Use add_azure_deploy_name! to register custom mappings:

using UniLM

# Register a custom deployment for a specific model
UniLM.add_azure_deploy_name!("gpt-5.2", "my-gpt52-deploy")
println("Registered deployment: ", UniLM._MODEL_ENDPOINTS_AZURE_OPENAI["gpt-5.2"])
Registered deployment: /openai/deployments/my-gpt52-deploy

Selecting a Backend

Pass the service keyword to any request constructor:

chat = Chat(service=UniLM.AZUREServiceEndpoint, model="gpt-5.2")
println("Service: ", chat.service)
println("Model: ", chat.model)
Service: AZUREServiceEndpoint
Model: gpt-5.2