FIM Types

Types and functions for FIM (Fill-in-the-Middle) Completion and Chat Prefix Completion.

Request Type

UniLM.FIMCompletion (Type)
FIMCompletion(; service, model, prompt, suffix=nothing, max_tokens=128, ...)

A Fill-in-the-Middle completion request. The model generates text between prompt (prefix) and suffix.

Supported by DeepSeekEndpoint (beta), Ollama, vLLM.

Example

fim = FIMCompletion(service=DeepSeekEndpoint(), prompt="def fib(a):",
    suffix="    return fib(a-1) + fib(a-2)", max_tokens=128)
result = fim_complete(fim)
println(fim_text(result))

Response Types

UniLM.FIMChoice (Type)
FIMChoice

A single completion choice from a FIM response.

Fields

  • text::String: The generated text
  • index::Int: Choice index (default 0)
  • finish_reason::Union{String,Nothing}: Why generation stopped (e.g. "stop", "length")
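As a hedged sketch of working with these fields (the keyword constructor used here is an assumption and is not shown in the docstring):

```julia
# Hypothetical keyword construction of a FIMChoice, for illustration only;
# in practice choices come back inside a parsed response.
choice = FIMChoice(text="    return a + b\n", index=0, finish_reason="stop")

# finish_reason == "length" means generation stopped at max_tokens.
if choice.finish_reason == "length"
    @warn "Completion truncated; consider raising max_tokens."
end
print(choice.text)
```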
UniLM.FIMResponse (Type)
FIMResponse

Parsed FIM completion response containing choices, usage, and raw data.

Fields

  • choices::Vector{FIMChoice}: Generated completions
  • usage::Union{TokenUsage,Nothing}: Token usage statistics
  • model::String: Model that generated the response
  • raw::Dict{String,Any}: Complete raw JSON response
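The fields above can be read directly. A minimal sketch, assuming field access exactly as listed (treat this as illustrative rather than a guaranteed API surface):

```julia
# Sketch: print a summary of a parsed FIMResponse.
function summarize(resp::FIMResponse)
    for c in resp.choices
        # something(x, default) picks the first non-nothing value
        println("choice #", c.index, " (", something(c.finish_reason, "unknown"), "):")
        println(c.text)
    end
    if resp.usage !== nothing
        println("usage: ", resp.usage)   # TokenUsage statistics
    end
    println("model: ", resp.model)
    # resp.raw holds the complete JSON payload for anything not surfaced above
end
```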

Request Functions

UniLM.fim_text (Function)
fim_text(result) -> String

Extract the generated text from a FIM completion result.

UniLM.prefix_complete (Function)
prefix_complete(chat::Chat; retries=0) -> LLMRequestResponse

Chat prefix completion: the model continues from a partial assistant message. The last message in chat must have role=assistant and contain the text prefix to continue from.

Supported by DeepSeekEndpoint (beta).

Example

chat = Chat(service=DeepSeekEndpoint(), model="deepseek-chat")
push!(chat, Message(Val(:user), "Write a quicksort in Python"))
push!(chat, Message(role=RoleAssistant, content="```python\n"))
result = prefix_complete(chat)

Example

using UniLM
using JSON

fim = FIMCompletion(
    service=DeepSeekEndpoint("demo"),
    prompt="def add(a, b):",
    suffix="    return result",
    max_tokens=64
)
println("JSON body:")
println(JSON.json(fim, 2))
JSON body:
{
  "prompt": "def add(a, b):",
  "suffix": "    return result",
  "max_tokens": 64,
  "model": "deepseek-chat"
}
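Putting the pieces together, a sketch of the full round trip. The comments on the result assume the parsed response is a FIMResponse as described above; treat that as an assumption rather than a documented guarantee:

```julia
using UniLM

fim = FIMCompletion(
    service=DeepSeekEndpoint(),   # assumes the API key is picked up from the environment
    prompt="def add(a, b):",
    suffix="    return result",
    max_tokens=64,
)

result = fim_complete(fim)
println(fim_text(result))         # the generated middle, e.g. the body of add

# Assumption: if `result` exposes the parsed FIMResponse, a truncated
# completion can be detected via choices[1].finish_reason == "length".
```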