Responses API

Types and functions for the Responses API — the newer, more flexible alternative to Chat Completions.

Request Type

UniLM.RespondType
Respond(; model="gpt-5.2", input, kwargs...)

Configuration struct for an OpenAI Responses API request.

Key Fields

  • model::String: Model to use (default: "gpt-5.2")
  • input::Any: A String or Vector{InputMessage} — the prompt input
  • instructions::String: System-level instructions
  • tools::Vector: Available tools (FunctionTool, WebSearchTool, FileSearchTool)
  • previous_response_id::String: Chain to a previous response for multi-turn
  • reasoning::Reasoning: Reasoning config for O-series models
  • text::TextConfig: Output format (text, json, json_schema)
  • temperature::Float64: Sampling temperature (0.0–2.0), mutually exclusive with top_p
  • top_p::Float64: Nucleus sampling (0.0–1.0), mutually exclusive with temperature
  • max_output_tokens::Int64: Max tokens in the response
  • stream::Bool: Enable streaming
  • truncation::String: "auto" or "disabled"
  • store::Bool: Whether to store the response for later retrieval
  • metadata::AbstractDict: Arbitrary metadata to attach
  • user::String: End-user identifier

Examples

# Simple text
Respond(input="Tell me a joke")

# With instructions
Respond(input="Translate to French: Hello", instructions="You are a translator")

# Multi-turn via chaining
Respond(input="Tell me more", previous_response_id="resp_abc123")

# With tools
Respond(
    input="What's the weather in NYC?",
    tools=ResponseTool[function_tool(
        "get_weather", "Get weather",
        parameters=Dict(
            "type" => "object",
            "properties" => Dict("location" => Dict("type" => "string"))
        )
    )]
)

# Reasoning (O-series models)
Respond(input="Solve this math problem...", model="o3", reasoning=Reasoning(effort="high"))
source

Construction

using UniLM
using JSON

# Simple text request
r = Respond(input="Tell me a joke")
println("Model: ", r.model)
println("Input: ", r.input)

# With instructions and tools
r2 = Respond(
    input="What's the weather in Paris?",
    instructions="You are a helpful weather assistant",
    tools=[web_search()]
)
println("Has instructions: ", !isnothing(r2.instructions))
println("Tools: ", length(r2.tools))
Model: gpt-5.2
Input: Tell me a joke
Has instructions: true
Tools: 1

Response Object

UniLM.ResponseObjectType
ResponseObject

Parsed response from the Responses API.

Accessors

  • output_text(r) — extract concatenated text output
  • function_calls(r) — extract function call outputs
  • r.id, r.status, r.model — basic metadata
  • r.output — full output array (raw dicts)
  • r.usage — token usage info
  • r.raw — the complete raw JSON dict
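Putting the accessors together — a minimal sketch, assuming a successful call:

```julia
using UniLM

result = respond("What is 2+2?")
if result isa ResponseSuccess
    r = result.response                     # the parsed ResponseObject
    println(r.id, " (", r.status, ", ", r.model, ")")
    println(output_text(r))                 # concatenated text output
    println(r.usage)                        # token usage info
end
```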
source

Result Types

UniLM.ResponseSuccessType
ResponseSuccess <: LLMRequestResponse

Successful response from the Responses API. Access the parsed response via .response.

source
UniLM.ResponseFailureType
ResponseFailure <: LLMRequestResponse

HTTP-level failure from the Responses API. Contains the response body and status code.

source
UniLM.ResponseCallErrorType
ResponseCallError <: LLMRequestResponse

Exception-level error during a Responses API call (network, parsing, etc.).

source

Accessor Functions

UniLM.output_textFunction
output_text(r::ResponseObject)::String
output_text(r::ResponseSuccess)::String

Extract the concatenated text output from a response.

Examples

result = respond("Hello!")
output_text(result)  # => "Hi there! How can I help?"
source
UniLM.function_callsFunction
function_calls(r::ResponseObject)::Vector{Dict{String,Any}}
function_calls(r::ResponseSuccess)::Vector{Dict{String,Any}}

Extract function call outputs from a response.

Each dict contains: "id", "call_id", "name", "arguments" (JSON string), "status".

Examples

result = respond("What's the weather?", tools=[function_tool("get_weather", ...)])
for call in function_calls(result)
    name = call["name"]
    args = JSON.parse(call["arguments"])
    # dispatch to your function...
end
source

Request Functions

UniLM.respondFunction
respond(r::Respond; retries=0, callback=nothing)

Send a request to the OpenAI Responses API.

Returns ResponseSuccess, ResponseFailure, or ResponseCallError.

For streaming, set stream=true and pass a callback:

callback(chunk::Union{String, ResponseObject}, close::Ref{Bool})

Examples

r = Respond(input="Tell me a joke")
result = respond(r)
if result isa ResponseSuccess
    println(output_text(result))
end
source
respond(input; kwargs...)

Convenience method: create a Respond from input + keyword arguments and send it.

Examples

# Simple text
result = respond("Tell me a joke")

# With instructions and model
result = respond("Translate: Hello", instructions="You are a translator", model="gpt-5.2")

# With tools
result = respond("Search for Julia news", tools=[web_search()])

# Multi-turn
r1 = respond("Tell me a joke")
r2 = respond("Tell me another", previous_response_id=r1.response.id)

# Streaming
respond("Tell me a story", stream=true) do chunk, close
    if chunk isa String
        print(chunk)  # partial text delta
    end
end
source
respond(callback::Function, input; kwargs...)

do-block form for streaming. Automatically sets stream=true.

Examples

respond("Tell me a story") do chunk, close
    if chunk isa String
        print(chunk)
    elseif chunk isa ResponseObject
        println("\nDone! Status: ", chunk.status)
    end
end
source
UniLM.get_responseFunction
get_response(response_id::String; service=OPENAIServiceEndpoint)

Retrieve an existing response by its ID.

Examples

result = get_response("resp_abc123")
if result isa ResponseSuccess
    println(output_text(result))
end
source
UniLM.delete_responseFunction
delete_response(response_id::String; service=OPENAIServiceEndpoint)

Delete a stored response by its ID. Returns a Dict with "id", "object", "deleted" keys.

Examples

result = delete_response("resp_abc123")
result["deleted"]  # => true
source
UniLM.list_input_itemsFunction
list_input_items(response_id::String; limit=20, order="desc", after=nothing, service=OPENAIServiceEndpoint)

List input items for a stored response. Returns a Dict with "data", "first_id", "last_id", "has_more".

Examples

items = list_input_items("resp_abc123")
for item in items["data"]
    println(item["type"], ": ", item)
end
source
UniLM.cancel_responseFunction
cancel_response(response_id::String; service=OPENAIServiceEndpoint)

Cancel an in-progress response by its ID. Returns ResponseSuccess on success.

Examples

# Start a background response, then cancel it
result = respond("Write a very long essay", background=true)
cancel_result = cancel_response(result.response.id)
if cancel_result isa ResponseSuccess
    println("Cancelled: ", cancel_result.response.status)
end
source
UniLM.compact_responseFunction
compact_response(; model, input, kwargs...)

Compact a conversation by running a compaction pass. Returns opaque, encrypted items that can be passed as input to subsequent requests, reducing token usage in long conversations.

Fields

  • model::String: Model to use for compaction
  • input::Any: The conversation items to compact (typically the full conversation history)

Returns a Dict with "id", "object", "output", and "usage" keys.

Examples

compacted = compact_response(model="gpt-5.2", input=[
    InputMessage(role="user", content="Hello"),
    Dict("type" => "message", "role" => "assistant", "status" => "completed",
         "content" => [Dict("type" => "output_text", "text" => "Hi there!")])
])
# Use compacted["output"] as input to the next request
source
UniLM.count_input_tokensFunction
count_input_tokens(; model, input, kwargs...)

Count the number of input tokens a request would use without actually generating a response. Useful for estimating costs or checking whether input fits within the context window.

Returns a Dict with "object" ("response.input_tokens") and "input_tokens" keys.

Examples

result = count_input_tokens(model="gpt-5.2", input="Tell me a joke")
println("Input tokens: ", result["input_tokens"])
source

Input Helpers

UniLM.InputMessageType
InputMessage(; role, content)

A structured input message for the Responses API.

Fields

  • role::String: "user", "assistant", "system", or "developer"
  • content::Any: String or a vector of content parts (see input_text, input_image, input_file)

Examples

InputMessage(role="user", content="What is 2+2?")
InputMessage(role="user", content=[input_text("Describe this:"), input_image("https://example.com/img.png")])
source
UniLM.input_textFunction
input_text(text::String)

Create an input_text content part for multimodal input messages.

source
UniLM.input_imageFunction
input_image(url::String; detail=nothing)

Create an input_image content part. detail can be "auto", "low", or "high".

source
UniLM.input_fileFunction
input_file(; url=nothing, id=nothing)

Create an input_file content part. Provide either a url or an id (the ID of an uploaded file).

source
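Both forms of input_file in use — by URL or by uploaded-file ID. The URL and file ID below are placeholders:

```julia
using UniLM

# Reference a file by URL (placeholder URL)
by_url = input_file(url="https://example.com/report.pdf")

# Reference a previously uploaded file by ID (placeholder ID)
by_id = input_file(id="file-abc123")

msg = InputMessage(role="user", content=[
    input_text("Summarize this document:"),
    by_url,
])
```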

Multimodal Input

# Text-only input
msg = InputMessage(role="user", content="What is Julia?")
println("Role: ", msg.role)

# Multimodal input
parts = [
    input_text("Describe this image:"),
    input_image("https://example.com/photo.jpg", detail="high")
]
println("Parts: ", length(parts))
println("Part types: ", [p[:type] for p in parts])
Role: user
Parts: 2
Part types: ["input_text", "input_image"]

Tool Types

UniLM.FunctionToolType
FunctionTool(; name, description=nothing, parameters=nothing, strict=nothing)

A function tool for the Responses API.

Examples

FunctionTool(
    name="get_weather",
    description="Get current weather for a location",
    parameters=Dict(
        "type" => "object",
        "properties" => Dict(
            "location" => Dict("type" => "string", "description" => "City name")
        ),
        "required" => ["location"]
    )
)
source
UniLM.WebSearchToolType
WebSearchTool(; search_context_size="medium", user_location=nothing)

A web search tool for the Responses API. Allows the model to search the web.

  • search_context_size: "low", "medium", or "high"
  • user_location: Dict with keys like "country", "city", "region", "timezone"
source
UniLM.FileSearchToolType
FileSearchTool(; vector_store_ids, max_num_results=nothing, ranking_options=nothing, filters=nothing)

A file search tool for the Responses API. Searches over uploaded vector stores.

source
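Using FileSearchTool against an existing vector store — the store ID here is a placeholder:

```julia
using UniLM

fs = FileSearchTool(
    vector_store_ids = ["vs_abc123"],   # placeholder vector store ID
    max_num_results  = 5
)
result = respond("Find the refund policy in our docs", tools=[fs])
```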
UniLM.MCPToolType
MCPTool(; server_label, server_url, require_approval="never", allowed_tools=nothing, headers=nothing)

A Model Context Protocol (MCP) tool for the Responses API. Connects the model to an external MCP server for tool execution.

source
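Constructing an MCPTool directly (the mcp_tool shorthand below does the same). The label, server URL, and tool name are placeholders:

```julia
using UniLM

mcp = MCPTool(
    server_label     = "docs_server",
    server_url       = "https://example.com/mcp",  # placeholder server URL
    require_approval = "never",
    allowed_tools    = ["search"]                  # placeholder tool name
)
result = respond("Look up the installation steps", tools=[mcp])
```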
UniLM.ComputerUseToolType
ComputerUseTool(; display_width=1024, display_height=768, environment=nothing)

A computer use tool for the Responses API. Allows the model to interact with a virtual display via screenshots, mouse, and keyboard.

source
UniLM.ImageGenerationToolType
ImageGenerationTool(; background=nothing, output_format=nothing, output_compression=nothing, quality=nothing, size=nothing)

An image generation tool for the Responses API. Allows the model to generate images inline during a response.

source
UniLM.CodeInterpreterToolType
CodeInterpreterTool(; container=nothing, file_ids=nothing)

A code interpreter tool for the Responses API. Allows the model to execute code in a sandboxed environment.

source

Tool Constructors

UniLM.function_toolFunction
function_tool(name, description=nothing; parameters=nothing, strict=nothing)

Shorthand constructor for FunctionTool.

source
function_tool(d::AbstractDict)

Construct a FunctionTool from a dict. Accepts both the bare format {"name": ...} and the wrapped format {"type": "function", "function": {"name": ...}}.

source
UniLM.mcp_toolFunction
mcp_tool(label, url; require_approval="never", allowed_tools=nothing, headers=nothing)

Shorthand constructor for MCPTool.

source
# Function tool
ft = function_tool("calculate", "Evaluate a math expression",
    parameters=Dict("type" => "object", "properties" => Dict(
        "expr" => Dict("type" => "string")
    ))
)
println("Function tool: ", ft.name)

# Web search
ws = web_search(context_size="high")
println("Web search context: ", ws.search_context_size)
Function tool: calculate
Web search context: high

Text Format

UniLM.TextConfigType
TextConfig(; format=TextFormatSpec())

Wrapper for the text field in the Responses API request body.

source
UniLM.TextFormatSpecType
TextFormatSpec(; type="text", name=nothing, description=nothing, schema=nothing, strict=nothing)

Output text format specification.

  • type: "text" (default), "json_object", or "json_schema"
  • For "json_schema": provide name, description, schema, and optionally strict
source
UniLM.json_schema_formatFunction
json_schema_format(name, description, schema; strict=nothing)

Create a JSON Schema output format for structured output.

source
json_schema_format(d::AbstractDict)

Construct a JSON Schema TextConfig from a dict with keys "name", "description", and "schema".

source
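Structured output end-to-end with json_schema_format, passed via the text field of the request. The schema is illustrative:

```julia
using UniLM
using JSON

fmt = json_schema_format(
    "person",
    "A person record",
    Dict(
        "type" => "object",
        "properties" => Dict(
            "name" => Dict("type" => "string"),
            "age"  => Dict("type" => "integer")
        ),
        "required" => ["name", "age"]
    );
    strict = true
)
result = respond("Extract: Ada Lovelace, 36", text=fmt)
if result isa ResponseSuccess
    data = JSON.parse(output_text(result))   # Dict with "name" and "age"
end
```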
tf = text_format()
println("Default format: ", tf.format.type)

jf = json_object_format()
println("JSON format: ", jf.format.type)
Default format: text
JSON format: json_object

Reasoning

UniLM.ReasoningType
Reasoning(; effort=nothing, summary=nothing)

Reasoning configuration for O-series models (o3, o4-mini, etc.).

  • effort: "none", "low", "medium", or "high"
  • summary: "auto", "concise", or "detailed" — configures reasoning summary generation (replaces the deprecated generate_summary field)
source
reasoning = UniLM.Reasoning(effort="high")
println("Effort: ", reasoning.effort)
Effort: high