Responses API

Types and functions for the Responses API — the newer, more flexible alternative to Chat Completions.

Request Type

UniLM.RespondType
Respond(; model="gpt-5.2", input, kwargs...)

Configuration struct for an OpenAI Responses API request.

Key Fields

  • model::String: Model to use (default: "gpt-5.2")
  • input::Any: A String or Vector{InputMessage} — the prompt input
  • instructions::String: System-level instructions
  • tools::Vector: Available tools (FunctionTool, WebSearchTool, FileSearchTool)
  • previous_response_id::String: Chain to a previous response for multi-turn
  • reasoning::Reasoning: Reasoning config for O-series models
  • text::TextConfig: Output format (text, json, json_schema)
  • temperature::Float64: Sampling temperature (0.0–2.0), mutually exclusive with top_p
  • top_p::Float64: Nucleus sampling (0.0–1.0), mutually exclusive with temperature
  • max_output_tokens::Int64: Max tokens in the response
  • stream::Bool: Enable streaming
  • truncation::String: "auto" or "disabled"
  • store::Bool: Whether to store the response for later retrieval
  • metadata::AbstractDict: Arbitrary metadata to attach
  • user::String: End-user identifier

Examples

# Simple text
Respond(input="Tell me a joke")

# With instructions
Respond(input="Translate to French: Hello", instructions="You are a translator")

# Multi-turn via chaining
Respond(input="Tell me more", previous_response_id="resp_abc123")

# With tools
Respond(
    input="What's the weather in NYC?",
    tools=ResponseTool[function_tool(
        "get_weather", "Get weather",
        parameters=Dict(
            "type" => "object",
            "properties" => Dict("location" => Dict("type" => "string"))
        )
    )]
)

# Reasoning (O-series models)
Respond(input="Solve this math problem...", model="o3", reasoning=Reasoning(effort="high"))
source

Construction

using UniLM
using JSON

# Simple text request
r = Respond(input="Tell me a joke")
println("Model: ", r.model)
println("Input: ", r.input)

# With instructions and tools
r2 = Respond(
    input="What's the weather in Paris?",
    instructions="You are a helpful weather assistant",
    tools=[web_search()]
)
println("Has instructions: ", !isnothing(r2.instructions))
println("Tools: ", length(r2.tools))
Model: gpt-5.2
Input: Tell me a joke
Has instructions: true
Tools: 1

Response Object

UniLM.ResponseObjectType
ResponseObject

Parsed response from the Responses API.

Accessors

  • output_text(r) — extract concatenated text output
  • function_calls(r) — extract function call outputs
  • r.id, r.status, r.model — basic metadata
  • r.output — full output array (raw dicts)
  • r.usage — token usage info
  • r.raw — the complete raw JSON dict
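Putting the accessors together — a minimal sketch, assuming a successful call:

```julia
using UniLM

result = respond("What is 2+2?")
if result isa ResponseSuccess
    r = result.response                     # the parsed ResponseObject
    println(r.id, " (", r.status, ", ", r.model, ")")
    println(output_text(r))                 # concatenated text output
    println(r.usage)                        # token usage info
end
```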
source

Result Types

UniLM.ResponseSuccessType
ResponseSuccess <: LLMRequestResponse

Successful response from the Responses API. Access the parsed response via .response.

source
UniLM.ResponseFailureType
ResponseFailure <: LLMRequestResponse

HTTP-level failure from the Responses API. Contains the response body and status code.

source
UniLM.ResponseCallErrorType
ResponseCallError <: LLMRequestResponse

Exception-level error during a Responses API call (network, parsing, etc.).

source

Accessor Functions

UniLM.output_textFunction
output_text(r::ResponseObject)::String
output_text(r::ResponseSuccess)::String

Extract the concatenated text output from a response.

Examples

result = respond("Hello!")
output_text(result)  # => "Hi there! How can I help?"
source
UniLM.function_callsFunction
function_calls(r::ResponseObject)::Vector{Dict{String,Any}}
function_calls(r::ResponseSuccess)::Vector{Dict{String,Any}}

Extract function call outputs from a response.

Each dict contains: "id", "call_id", "name", "arguments" (JSON string), "status".

Examples

result = respond("What's the weather?", tools=[function_tool("get_weather", ...)])
for call in function_calls(result)
    name = call["name"]
    args = JSON.parse(call["arguments"])
    # dispatch to your function...
end
source

Request Functions

UniLM.respondFunction
respond(r::Respond; retries=0, callback=nothing)

Send a request to the OpenAI Responses API.

Returns ResponseSuccess, ResponseFailure, or ResponseCallError.

For streaming, set stream=true and pass a callback:

callback(chunk::Union{String, ResponseObject}, close::Ref{Bool})

Examples

r = Respond(input="Tell me a joke")
result = respond(r)
if result isa ResponseSuccess
    println(output_text(result))
end
source
respond(input; kwargs...)

Convenience method: create a Respond from input + keyword arguments and send it.

Examples

# Simple text
result = respond("Tell me a joke")

# With instructions and model
result = respond("Translate: Hello", instructions="You are a translator", model="gpt-5.2")

# With tools
result = respond("Search for Julia news", tools=[web_search()])

# Multi-turn
r1 = respond("Tell me a joke")
r2 = respond("Tell me another", previous_response_id=r1.response.id)

# Streaming
respond("Tell me a story", stream=true) do chunk, close
    if chunk isa String
        print(chunk)  # partial text delta
    end
end
source
respond(callback::Function, input; kwargs...)

do-block form for streaming. Automatically sets stream=true.

Examples

respond("Tell me a story") do chunk, close
    if chunk isa String
        print(chunk)
    elseif chunk isa ResponseObject
        println("\nDone! Status: ", chunk.status)
    end
end
source
UniLM.get_responseFunction
get_response(response_id::String; service=OPENAIServiceEndpoint)

Retrieve an existing response by its ID.

Examples

result = get_response("resp_abc123")
if result isa ResponseSuccess
    println(output_text(result))
end
source
UniLM.delete_responseFunction
delete_response(response_id::String; service=OPENAIServiceEndpoint)

Delete a stored response by its ID. Returns a Dict with "id", "object", "deleted" keys.

Examples

result = delete_response("resp_abc123")
result["deleted"]  # => true
source
UniLM.list_input_itemsFunction
list_input_items(response_id::String; limit=20, order="desc", after=nothing, service=OPENAIServiceEndpoint)

List input items for a stored response. Returns a Dict with "data", "first_id", "last_id", "has_more".

Examples

items = list_input_items("resp_abc123")
for item in items["data"]
    println(item["type"], ": ", item)
end
source
UniLM.cancel_responseFunction
cancel_response(response_id::String; service=OPENAIServiceEndpoint)

Cancel an in-progress response by its ID. Returns ResponseSuccess on success.

Examples

# Start a background response, then cancel it
result = respond("Write a very long essay", background=true)
cancel_result = cancel_response(result.response.id)
if cancel_result isa ResponseSuccess
    println("Cancelled: ", cancel_result.response.status)
end
source
UniLM.compact_responseFunction
compact_response(; model, input, kwargs...)

Compact a conversation by running a compaction pass. Returns opaque, encrypted items that can be passed as input to subsequent requests, reducing token usage in long conversations.

Fields

  • model::String: Model to use for compaction
  • input::Any: The conversation items to compact (typically the full conversation history)

Returns a Dict with "id", "object", "output", and "usage" keys.

Examples

compacted = compact_response(model="gpt-5.2", input=[
    InputMessage(role="user", content="Hello"),
    Dict("type" => "message", "role" => "assistant", "status" => "completed",
         "content" => [Dict("type" => "output_text", "text" => "Hi there!")])
])
# Use compacted["output"] as input to the next request
source
UniLM.count_input_tokensFunction
count_input_tokens(; model, input, kwargs...)

Count the number of input tokens a request would use without actually generating a response. Useful for estimating costs or checking whether input fits within the context window.

Returns a Dict with "object" ("response.input_tokens") and "input_tokens" keys.

Examples

result = count_input_tokens(model="gpt-5.2", input="Tell me a joke")
println("Input tokens: ", result["input_tokens"])
source

Input Helpers

UniLM.InputMessageType
InputMessage(; role, content)

A structured input message for the Responses API.

Fields

  • role::String: "user", "assistant", "system", or "developer"
  • content::Any: String or a vector of content parts (see input_text, input_image, input_file)

Examples

InputMessage(role="user", content="What is 2+2?")
InputMessage(role="user", content=[input_text("Describe this:"), input_image("https://example.com/img.png")])
source
UniLM.input_textFunction
input_text(text::String)

Create an input_text content part for multimodal input messages.

source
UniLM.input_imageFunction
input_image(url::String; detail=nothing)

Create an input_image content part. detail can be "auto", "low", or "high".

source
UniLM.input_fileFunction
input_file(; url=nothing, id=nothing)

Create an input_file content part. Provide either a url or an id (the ID of an uploaded file).

source
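Both forms of input_file in use — by URL or by uploaded-file ID. The URL and file ID below are placeholders:

```julia
using UniLM

# Reference a file by URL (placeholder URL)
by_url = input_file(url="https://example.com/report.pdf")

# Reference a previously uploaded file by ID (placeholder ID)
by_id = input_file(id="file-abc123")

msg = InputMessage(role="user", content=[
    input_text("Summarize this document:"),
    by_url,
])
```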

Multimodal Input

# Text-only input
msg = InputMessage(role="user", content="What is Julia?")
println("Role: ", msg.role)

# Multimodal input
parts = [
    input_text("Describe this image:"),
    input_image("https://example.com/photo.jpg", detail="high")
]
println("Parts: ", length(parts))
println("Part types: ", [p[:type] for p in parts])
Role: user
Parts: 2
Part types: ["input_text", "input_image"]

Tool Types

UniLM.FunctionToolType
FunctionTool(; name, description=nothing, parameters=nothing, strict=nothing)

A function tool for the Responses API.

Examples

FunctionTool(
    name="get_weather",
    description="Get current weather for a location",
    parameters=Dict(
        "type" => "object",
        "properties" => Dict(
            "location" => Dict("type" => "string", "description" => "City name")
        ),
        "required" => ["location"]
    )
)
source
UniLM.WebSearchToolType
WebSearchTool(; search_context_size="medium", user_location=nothing)

A web search tool for the Responses API. Allows the model to search the web.

  • search_context_size: "low", "medium", or "high"
  • user_location: Dict with keys like "country", "city", "region", "timezone"
source
UniLM.FileSearchToolType
FileSearchTool(; vector_store_ids, max_num_results=nothing, ranking_options=nothing, filters=nothing)

A file search tool for the Responses API. Searches over uploaded vector stores.

source
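Using FileSearchTool against an existing vector store — the store ID here is a placeholder:

```julia
using UniLM

fs = FileSearchTool(
    vector_store_ids = ["vs_abc123"],   # placeholder vector store ID
    max_num_results  = 5
)
result = respond("Find the refund policy in our docs", tools=[fs])
```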
UniLM.MCPToolType
MCPTool(; server_label, server_url, require_approval="never", allowed_tools=nothing, headers=nothing)

A Model Context Protocol (MCP) tool for the Responses API. Connects the model to an external MCP server for tool execution.

source
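Constructing an MCPTool directly (the mcp_tool shorthand below does the same). The label, server URL, and tool name are placeholders:

```julia
using UniLM

mcp = MCPTool(
    server_label     = "docs_server",
    server_url       = "https://example.com/mcp",  # placeholder server URL
    require_approval = "never",
    allowed_tools    = ["search"]                  # placeholder tool name
)
result = respond("Look up the installation steps", tools=[mcp])
```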
UniLM.ComputerUseToolType
ComputerUseTool(; display_width=1024, display_height=768, environment=nothing)

A computer use tool for the Responses API. Allows the model to interact with a virtual display via screenshots, mouse, and keyboard.

source
UniLM.ImageGenerationToolType
ImageGenerationTool(; background=nothing, output_format=nothing, output_compression=nothing, quality=nothing, size=nothing)

An image generation tool for the Responses API. Allows the model to generate images inline during a response.

source
UniLM.CodeInterpreterToolType
CodeInterpreterTool(; container=nothing, file_ids=nothing)

A code interpreter tool for the Responses API. Allows the model to execute code in a sandboxed environment.

source

Tool Constructors

UniLM.function_toolFunction
function_tool(name, description=nothing; parameters=nothing, strict=nothing)

Shorthand constructor for FunctionTool.

source
function_tool(d::AbstractDict)

Construct a FunctionTool from a dict. Accepts both the bare format {"name": ...} and the wrapped format {"type": "function", "function": {"name": ...}}.

source
UniLM.mcp_toolFunction
mcp_tool(label, url; require_approval="never", allowed_tools=nothing, headers=nothing)

Shorthand constructor for MCPTool.

source
# Function tool
ft = function_tool("calculate", "Evaluate a math expression",
    parameters=Dict("type" => "object", "properties" => Dict(
        "expr" => Dict("type" => "string")
    ))
)
println("Function tool: ", ft.name)

# Web search
ws = web_search(context_size="high")
println("Web search context: ", ws.search_context_size)
Function tool: calculate
Web search context: high

Text Format

UniLM.TextConfigType
TextConfig(; format=TextFormatSpec())

Wrapper for the text field in the Responses API request body.

source
UniLM.TextFormatSpecType
TextFormatSpec(; type="text", name=nothing, description=nothing, schema=nothing, strict=nothing)

Output text format specification.

  • type: "text" (default), "json_object", or "json_schema"
  • For "json_schema": provide name, description, schema, and optionally strict
source
UniLM.json_schema_formatFunction
json_schema_format(name, description, schema; strict=nothing)

Create a JSON Schema output format for structured output.

source
json_schema_format(d::AbstractDict)

Construct a JSON Schema TextConfig from a dict with keys "name", "description", and "schema".

source
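Structured output end-to-end with json_schema_format, passed via the text field of the request. The schema is illustrative:

```julia
using UniLM
using JSON

fmt = json_schema_format(
    "person",
    "A person record",
    Dict(
        "type" => "object",
        "properties" => Dict(
            "name" => Dict("type" => "string"),
            "age"  => Dict("type" => "integer")
        ),
        "required" => ["name", "age"]
    );
    strict = true
)
result = respond("Extract: Ada Lovelace, 36", text=fmt)
if result isa ResponseSuccess
    data = JSON.parse(output_text(result))   # Dict with "name" and "age"
end
```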
tf = text_format()
println("Default format: ", tf.format.type)

jf = json_object_format()
println("JSON format: ", jf.format.type)
Default format: text
JSON format: json_object

Reasoning

UniLM.ReasoningType
Reasoning(; effort=nothing, summary=nothing)

Reasoning configuration for O-series models (o3, o4-mini, etc.).

  • effort: "none", "low", "medium", or "high"
  • summary: "auto", "concise", or "detailed" — configures reasoning summary generation (replaces the deprecated generate_summary field)
source
reasoning = UniLM.Reasoning(effort="high")
println("Effort: ", reasoning.effort)
Effort: high