Chat Completions

The Chat Completions API is the standard way to talk to LLM providers. UniLM.jl wraps it in a type-safe, stateful Chat object that tracks conversation history automatically — and works with every supported backend (OpenAI, DeepSeek, Ollama, Gemini, Mistral, and more).

Creating a Chat

chat = Chat(
    model="gpt-5.2",        # model name
    temperature=0.7,        # sampling temperature
)
println("Model: ", chat.model)
println("Messages: ", length(chat))
Model: gpt-5.2
Messages: 0

All parameters are optional with sensible defaults. See Chat for the full list.

Building Conversations

Messages are added with push!. UniLM.jl validates conversation structure as messages are added, so you cannot build an invalid message sequence:

# System message must come first
push!(chat, Message(Val(:system), "You are a helpful Julia programming tutor."))

# Then user messages
push!(chat, Message(Val(:user), "What are parametric types?"))

println("Conversation length: ", length(chat))
println("First message role: ", chat[1].role)
println("Last message role: ", chat[end].role)
Conversation length: 2
First message role: system
Last message role: user

The convenience Val(:system) and Val(:user) constructors keep things concise. You can also use the keyword constructor:

chat2 = Chat()
push!(chat2, Message(role="system", content="Be helpful"))
push!(chat2, Message(role="user", content="Tell me more"))
println("chat2 length: ", length(chat2))
chat2 length: 2

Conversation Rules

  • The first message must have role system
  • Messages must alternate roles (no two consecutive messages from the same role)
  • At least content, tool_calls, or refusal_message must be non-nothing
  • Violating these rules logs a warning, and the offending message is not added
# Demonstrate validation
chat3 = Chat()
push!(chat3, Message(Val(:system), "sys"))
push!(chat3, Message(Val(:user), "hello"))
push!(chat3, Message(Val(:user), "hello again"))  # rejected — same role
println("Length after duplicate push: ", length(chat3), " (second user msg rejected)")
┌ Warning: Cannot add message Message("user", "hello again", nothing, nothing, nothing, nothing, nothing) to conversation: Chat(OPENAIServiceEndpoint, "gpt-5.2", Message[Message("system", "sys", nothing, nothing, nothing, nothing, nothing), Message("user", "hello", nothing, nothing, nothing, nothing, nothing)], true, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, Base.RefValue{Float64}(0.0))
└ @ UniLM ~/work/UniLM.jl/UniLM.jl/src/api.jl:581
Length after duplicate push: 2 (second user msg rejected)

Sending Requests

result = chatrequest!(chat)

The ! suffix is a Julia convention — chatrequest! mutates chat by appending the assistant's response to the message history (when history=true).
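The mutation is easy to observe by comparing the conversation length before and after a request. This is a minimal sketch (it performs a real network call, so it assumes a configured API key; the default `history=true` behaviour is inferred from the text above):

```julia
# Sketch: chatrequest! appends the assistant reply to the chat itself
chat = Chat(model="gpt-4o-mini")
push!(chat, Message(Val(:system), "Be terse."))
push!(chat, Message(Val(:user), "Say hi."))

n_before = length(chat)       # 2: system + user
result = chatrequest!(chat)   # network call; mutates chat on success
println(length(chat))         # 3 after a successful request: the assistant
                              # message is now part of the history
```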

Result Handling

result = chatrequest!(chat)
if result isa LLMSuccess
    println(result.message.content)
    println("\nFinish reason: ", result.message.finish_reason)
    println("Conversation length: ", length(chat))
else
    println("Request failed — see result for details")
end
In Julia, **parametric types** are types that are **parameterized by one or more values (usually types)**. They let you write generic, reusable code while still keeping strong type information for performance and correctness.

## 1) Parametric composite types (structs)
You can define a type with a parameter `T`:

```julia
struct Box{T}
    value::T
end
```

- `Box{T}` is a *family* of types.
- `Box{Int}` and `Box{String}` are different concrete types.

Example:

```julia
b1 = Box(1)        # Box{Int64}
b2 = Box("hi")     # Box{String}
```

## 2) Why they matter
### Performance
Parametric types allow Julia to generate specialized, fast code:

- `Box{Int}` can be stored and handled efficiently because the compiler knows `value` is always an `Int`.
- Compare with an abstractly-typed field (slower and less precise):

```julia
struct BadBox
    value::Any
end
```

### Expressiveness / correctness
They encode relationships in types:

```julia
struct PairBox{A,B}
    a::A
    b::B
end
```

## 3) Parametric abstract types
Abstract types can also be parameterized:

```julia
abstract type AbstractVectorLike{T} end
```

This is common in Julia’s standard library (e.g., `AbstractArray{T,N}`).

## 4) Parametric methods (related concept)
Functions can also be parameterized using `where`:

```julia
same_type(x::T, y::T) where {T} = true
same_type(x, y) = false
```

Here `T` is a type parameter for the method.

## 5) Parameters can be non-types too
Type parameters can be values that are “compile-time constants”, often integers:

```julia
struct Tensor{T,N}
    data::Array{T,N}
end
```

`N` is the number of dimensions, so `Tensor{Float64,2}` differs from `Tensor{Float64,3}`.

---

If you want, I can show how parametric types relate to `UnionAll`, concrete vs abstract types, and how to choose good type parameters for performance.

Finish reason: stop
Conversation length: 3

One-Shot Requests via Keywords

Skip the Chat object entirely for simple one-off requests:

result = chatrequest!(
    systemprompt="You are a calculator. Respond only with the number.",
    userprompt="What is 42 * 17?",
    model="gpt-4o-mini",
    temperature=0.0
)
if result isa LLMSuccess
    println(result.message.content)
else
    println("Request failed — see result for details")
end
714

Multi-Turn Conversations

Because chatrequest! appends the response, you can keep chatting:

chat = Chat(model="gpt-4o-mini")
push!(chat, Message(Val(:system), "You are a concise Julia programming tutor."))
push!(chat, Message(Val(:user), "What is multiple dispatch? Answer in 2-3 sentences."))
result = chatrequest!(chat)
if result isa LLMSuccess
    println(result.message.content)
else
    println("Request failed — see result for details")
end
Multiple dispatch is a programming paradigm where function behavior is determined by the types of all its arguments, rather than just the type of a single object. This allows for more flexible and expressive code, enabling methods to be defined for combinations of argument types. In Julia, multiple dispatch is a core feature, allowing for greater code reusability and modularity.
push!(chat, Message(Val(:user), "Give a short Julia code example of it."))
result = chatrequest!(chat)
if result isa LLMSuccess
    println(result.message.content)
    println("\nConversation length: ", length(chat))
else
    println("Request failed — see result for details")
end
Sure! Here's a simple example of multiple dispatch in Julia:

```julia
# Define a function for different types of input
function greet(name::String)
    println("Hello, $name!")
end

function greet(age::Int)
    println("You are $age years old!")
end

# Call the function with different argument types
greet("Alice")  # Outputs: Hello, Alice!
greet(30)       # Outputs: You are 30 years old!
```

In this example, the `greet` function is defined twice with different argument types, showcasing how Julia uses multiple dispatch to choose the appropriate method based on the type of the argument passed.

Conversation length: 5

Checking Conversation Validity

println("Is chat valid? ", issendvalid(chat))  # false (last message is assistant, not user)

empty_chat = Chat()
println("Is empty chat valid? ", issendvalid(empty_chat))  # false
Is chat valid? false
Is empty chat valid? false

This checks:

  • At least 2 messages
  • First message is system
  • Last message is user
  • No consecutive same-role messages
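These rules imply that a chat which just received an assistant reply, like `chat` above, is not immediately sendable again; pushing the next user turn restores validity:

```julia
# chat currently ends with an assistant message, so it is not sendable
println(issendvalid(chat))   # false

# Pushing the next user message satisfies all four rules again:
# >= 2 messages, system first, user last, alternating roles
push!(chat, Message(Val(:user), "One more question, please."))
println(issendvalid(chat))   # true
```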

Models

UniLM.jl works with any model name string. Common choices:

Model             Usage
"gpt-5.2"         Best quality (default)
"gpt-4o-mini"     Fast and cheap
"gpt-4.1-mini"    Balanced performance
"o3"              Extended reasoning
"o4-mini"         Fast reasoning

Using Other Providers

Pass a service to target any supported backend:

# DeepSeek
chat = Chat(service=DeepSeekEndpoint(), model="deepseek-chat")

# Ollama (local)
chat = Chat(service=OllamaEndpoint(), model="llama3.1")

# Mistral
chat = Chat(service=MistralEndpoint(), model="mistral-large-latest")

See the Multi-Backend Guide for the full list of providers and configuration.

JSON Serialization

The Chat object serializes cleanly to JSON for the API:

println(JSON.json(chat))
{"messages":[{"role":"system","content":"You are a concise Julia programming tutor."},{"role":"user","content":"What is multiple dispatch? Answer in 2-3 sentences."},{"role":"assistant","content":"Multiple dispatch is a programming paradigm where function behavior is determined by the types of all its arguments, rather than just the type of a single object. This allows for more flexible and expressive code, enabling methods to be defined for combinations of argument types. In Julia, multiple dispatch is a core feature, allowing for greater code reusability and modularity.","finish_reason":"stop"},{"role":"user","content":"Give a short Julia code example of it."},{"role":"assistant","content":"Sure! Here's a simple example of multiple dispatch in Julia:\n\n```julia\n# Define a function for different types of input\nfunction greet(name::String)\n    println(\"Hello, $name!\")\nend\n\nfunction greet(age::Int)\n    println(\"You are $age years old!\")\nend\n\n# Call the function with different argument types\ngreet(\"Alice\")  # Outputs: Hello, Alice!\ngreet(30)       # Outputs: You are 30 years old!\n```\n\nIn this example, the `greet` function is defined twice with different argument types, showcasing how Julia uses multiple dispatch to choose the appropriate method based on the type of the argument passed.","finish_reason":"stop"}],"model":"gpt-4o-mini"}

Retry Behaviour

chatrequest! automatically retries on HTTP 429, 500, and 503 errors with exponential backoff and jitter (up to 30 attempts, max 60s delay). On 429 responses, the Retry-After header is respected. This is transparent and requires no configuration.
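The schedule described above can be sketched as follows. This is an illustration of exponential backoff with jitter, not UniLM.jl's actual implementation; the function name and its parameters are hypothetical:

```julia
# Illustrative sketch of exponential backoff with jitter (not UniLM.jl's code).
# The delay doubles with each attempt, is capped at `maxdelay`, and is
# randomized so that many clients retrying at once do not synchronize.
function backoff_delay(attempt::Int; base=1.0, maxdelay=60.0)
    raw = min(base * 2.0^(attempt - 1), maxdelay)
    return raw * (0.5 + 0.5 * rand())   # jitter: 50-100% of the capped delay
end

for attempt in 1:5
    println("attempt $attempt: up to $(round(backoff_delay(attempt); digits=2))s")
end
```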

Parameter Validation

The Chat constructor validates parameter ranges at construction time:

Parameter          Valid Range
temperature        0.0–2.0
top_p              0.0–1.0
n                  1–10
presence_penalty   -2.0–2.0
frequency_penalty  -2.0–2.0

Out-of-range values throw ArgumentError. Additionally, temperature and top_p are mutually exclusive.
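A quick way to confirm the fail-fast behaviour, using a deliberately out-of-range temperature:

```julia
# Out-of-range parameters fail at construction time, not at request time
try
    Chat(temperature=3.0)          # outside the 0.0–2.0 range
catch err
    println(err isa ArgumentError) # true
end
```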

See Also