Getting Started

Prerequisites

  • Julia 1.12+ (as specified in Project.toml)
  • An API key for your chosen provider (OpenAI, DeepSeek, Gemini, Mistral, etc.) — or none at all for local providers like Ollama

Installation

using Pkg
Pkg.add(url="https://github.com/algunion/UniLM.jl")

Configuration

UniLM.jl reads API credentials from environment variables. Set them before making any requests:

OpenAI (default)

ENV["OPENAI_API_KEY"] = "sk-..."

Or via your shell:

export OPENAI_API_KEY="sk-..."

Azure OpenAI

export AZURE_OPENAI_BASE_URL="https://your-resource.openai.azure.com"
export AZURE_OPENAI_API_KEY="your-key"
export AZURE_OPENAI_API_VERSION="2024-02-01"
export AZURE_OPENAI_DEPLOY_NAME_GPT_5_2="your-gpt52-deployment"

Google Gemini

export GEMINI_API_KEY="your-gemini-key"

DeepSeek

export DEEPSEEK_API_KEY="sk-..."

Ollama (local — no key needed)

Run the Ollama server locally on localhost:11434; no API key or environment variable is required.
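To verify the server is reachable before making requests, one option is a quick HTTP check. This is an illustrative sketch, not part of UniLM.jl, and it assumes the HTTP.jl package is installed:

```julia
using HTTP

# A running Ollama server answers on its root endpoint.
# status_exception=false returns the response instead of throwing on non-2xx.
resp = HTTP.get("http://localhost:11434"; status_exception=false)
if resp.status == 200
    println("Ollama is reachable")
else
    println("Ollama not responding (status ", resp.status, ")")
end
```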

Your First Request

Using the Responses API

The simplest way to get started — one function call:

result = respond("Explain Julia's type system in 3 bullet points")
if result isa ResponseSuccess
    println(output_text(result))
else
    println("Request failed — see result for details")
end
- **Dynamic, but strongly typed with multiple dispatch:** Every value in Julia has a concrete type determined at runtime (no implicit type changes), and functions are selected via *multiple dispatch* based on the types of all arguments, enabling both generic and highly specialized code.

- **Parametric, hierarchical types (with abstract & concrete types):** Types form a lattice under `Any`, with **abstract types** for organizing behavior and **concrete types** for actual values. **Parametric types** (e.g., `Array{T,N}`, `Dict{K,V}`) let you write reusable, type-safe containers and methods.

- **Performance via type inference and specialization:** Julia’s compiler uses type inference to generate specialized machine code for concrete argument types. Keeping code *type-stable* (functions return consistent types) helps Julia reach near-C performance while still writing high-level code.

Using Chat Completions

For stateful, multi-turn conversations:

chat = Chat(model="gpt-4o-mini")
push!(chat, Message(Val(:system), "You are a concise Julia programming tutor."))
push!(chat, Message(Val(:user), "What is multiple dispatch? Answer in 2-3 sentences."))
result = chatrequest!(chat)
if result isa LLMSuccess
    println(result.message.content)
    println("\nFinish reason: ", result.message.finish_reason)
    println("Conversation length: ", length(chat))
else
    println("Request failed — see result for details")
end
Multiple dispatch is a programming paradigm primarily used in languages like Julia, where the method that gets executed is determined by the types of all arguments at runtime, rather than just the type of a single object. This allows for more flexible and efficient code, as developers can define specialized behavior for different combinations of argument types, enhancing both code clarity and performance.

Finish reason: stop
Conversation length: 3
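Because the Chat object accumulates history, a follow-up turn reuses the same object. A sketch continuing the conversation above:

```julia
# Continue the same conversation — chat already holds the system prompt,
# the first question, and the assistant's reply.
push!(chat, Message(Val(:user), "How does it differ from single dispatch?"))
result = chatrequest!(chat)
if result isa LLMSuccess
    println(result.message.content)
    println("Conversation length: ", length(chat))  # grows by two per turn
end
```

Each successful round trip appends both your message and the assistant's reply to the chat, so the model sees the full context on the next call.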

Generating Images

result = generate_image(
    "A watercolor painting of a friendly robot reading a Julia programming book",
    size="1024x1024", quality="medium"
)
if result isa ImageSuccess
    println("Success: true")
    println("Images: ", length(image_data(result)))
else
    println("Success: false")
    println("Images: 0")
end
Success: true
Images: 1

Using Keyword Arguments

For one-shot requests without managing Chat objects:

result = chatrequest!(
    systemprompt="You are a calculator. Respond only with the number.",
    userprompt="What is 42 * 17?",
    model="gpt-4o-mini",
    temperature=0.0
)
if result isa LLMSuccess
    println(result.message.content)
else
    println("Request failed — see result for details")
end
714
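For repeated one-shot calls, the keyword form composes easily into a small helper. The `ask` wrapper below is illustrative, not part of UniLM.jl:

```julia
# Hypothetical convenience wrapper around the keyword form shown above.
# Returns the reply text on success, or nothing on any failure.
function ask(question::AbstractString; model="gpt-4o-mini", temperature=0.0)
    result = chatrequest!(; userprompt=question, model=model, temperature=temperature)
    return result isa LLMSuccess ? result.message.content : nothing
end

println(something(ask("What is 42 * 17?"), "request failed"))
```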

Handling Results

All API calls return subtypes of LLMRequestResponse. Branch on the concrete result type with isa checks:

using UniLM
using InteractiveUtils

# Construct a chat to show the result type hierarchy
chat = Chat()
push!(chat, Message(Val(:system), "You are helpful."))
push!(chat, Message(Val(:user), "Hello!"))

# Show the type hierarchy:
println("LLMRequestResponse subtypes:")
for T in subtypes(UniLM.LLMRequestResponse)
    println("  ", T)
end
LLMRequestResponse subtypes:
  FIMCallError
  FIMFailure
  FIMSuccess
  ImageCallError
  ImageFailure
  ImageSuccess
  LLMCallError
  LLMFailure
  LLMSuccess
  ResponseCallError
  ResponseFailure
  ResponseSuccess
result = chatrequest!(chat)

if result isa LLMSuccess
    println("Assistant: ", result.message.content)
    println("Finish reason: ", result.message.finish_reason)
elseif result isa LLMFailure
    @warn "API returned HTTP $(result.status): $(result.response)"
elseif result isa LLMCallError
    @error "Call failed: $(result.error)"
end
Assistant: Hello! How can I help you today?
Finish reason: stop

For the Responses API:

result = respond("Hello!")

if result isa ResponseSuccess
    println(output_text(result))
    println("Status: ", result.response.status)
    println("Model: ", result.response.model)
elseif result isa ResponseFailure
    @warn "HTTP $(result.status)"
elseif result isa ResponseCallError
    @error result.error
end
Hello! How can I help you today?
Status: completed
Model: gpt-5.2-2025-12-11
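Network calls can fail transiently, and the failure/error result types make a retry loop straightforward. An illustrative sketch, not part of UniLM.jl:

```julia
# Hypothetical retry helper: retries with exponential backoff until the
# Responses API returns a ResponseSuccess, or gives up after `attempts` tries.
function respond_with_retry(prompt::AbstractString; attempts::Int=3)
    for i in 1:attempts
        result = respond(prompt)
        result isa ResponseSuccess && return result
        @warn "Attempt $i did not succeed"
        i < attempts && sleep(2.0^i)  # back off before retrying
    end
    return nothing
end
```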

What's Next?

  • Build multi-turn conversations → Chat Completions Guide
  • Use the newer Responses API → Responses API Guide
  • Generate images from prompts → Image Generation Guide
  • Call functions from the model → Tool Calling Guide
  • Stream tokens in real time → Streaming Guide
  • Get structured JSON output → Structured Output Guide
  • Use any provider → Multi-Backend Guide