Local LLM

Run text generation locally.

Create a local LLM from PrimordialClient, download the model, load it, then generate or stream text.

This page was auto-generated by OpenAI's GPT5.5. We are refining it to make it more helpful.

Create An Engine

The local LLM object is an actor, so its state is actor-isolated. Read properties such as isDownloaded and isLoaded with await.

import Primordial

let primordial = PrimordialClient()
let llm = primordial.localLLM()

let downloaded = await llm.isDownloaded
let loaded = await llm.isLoaded
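
To illustrate why these reads need await, here is a minimal sketch built on a stand-in actor. DemoModel, markDownloaded, and snapshot are hypothetical names used for illustration and are not part of Primordial; only the isDownloaded and isLoaded property names mirror the snippet above.

```swift
// Hypothetical stand-in for the local LLM actor; not the Primordial type.
actor DemoModel {
    // Actor-isolated state: code outside the actor must use `await` to read it.
    private(set) var isDownloaded = false
    private(set) var isLoaded = false

    // Illustration-only mutator so the state can change in the demo.
    func markDownloaded() {
        isDownloaded = true
    }
}

// Read both flags from outside the actor and return them as a pair.
func snapshot(of model: DemoModel) async -> (downloaded: Bool, loaded: Bool) {
    // Each property read is a separate hop onto the actor.
    let downloaded = await model.isDownloaded
    let loaded = await model.isLoaded
    return (downloaded, loaded)
}
```

Because each await is a separate hop onto the actor, two consecutive reads are not guaranteed to describe the same instant. Treat the pair as a best-effort snapshot rather than a consistent one.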

Download Then Load

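Call download and load separately when your app needs to control each stage of the model lifecycle, for example to fetch the model ahead of time and load it only when it is needed.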

try await llm.download { progress in
    print(progress.fractionCompleted)
    print(progress.currentFilename ?? "")
    print(progress.downloadedBytes, progress.totalBytes ?? 0)
}

try await llm.load()
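
When the app controls the lifecycle itself, it may want to skip steps that are already done. The following is a sketch of that idea, not Primordial's own implementation. DemoLocalModel, ensureReady, and FakeModel are hypothetical names, and the stand-in download() omits the progress closure shown above to keep the example small.

```swift
// Hypothetical protocol that mirrors the members used above.
// It is an illustration only, not a type exported by Primordial.
protocol DemoLocalModel: Actor {
    var isDownloaded: Bool { get }
    var isLoaded: Bool { get }
    func download() async throws
    func load() async throws
}

// Download only if the model is missing, then load only if it is not loaded.
func ensureReady<M: DemoLocalModel>(_ model: M) async throws {
    let downloaded = await model.isDownloaded
    if !downloaded {
        try await model.download()
    }
    let loaded = await model.isLoaded
    if !loaded {
        try await model.load()
    }
}

// In-memory fake used to exercise the helper without any network access.
actor FakeModel: DemoLocalModel {
    private(set) var isDownloaded = false
    private(set) var isLoaded = false
    private(set) var downloadCalls = 0

    func download() async throws {
        downloadCalls += 1
        isDownloaded = true
    }

    func load() async throws {
        isLoaded = true
    }
}
```

Calling ensureReady twice on the same FakeModel performs the download once, because the second call sees both flags already set.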

Download And Load

Use downloadAndLoad when the app wants both steps in a single call.

try await llm.downloadAndLoad { progress in
    print(progress.status)
}
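
The progress values passed to these callbacks can be turned into a single status line for a label or a log. The sketch below assumes field types that the snippets above do not state (Double, String?, Int64, Int64?). DemoProgress, wholeMiB, and progressLine are hypothetical names, not Primordial API.

```swift
// Hypothetical stand-in for the progress value. The field names follow the
// snippets above, but the types are assumptions made for this sketch.
struct DemoProgress {
    var fractionCompleted: Double
    var currentFilename: String?
    var downloadedBytes: Int64
    var totalBytes: Int64?
}

// Convert a byte count to whole mebibytes (1 MiB = 1,048,576 bytes).
func wholeMiB(_ bytes: Int64) -> Int64 {
    bytes / 1_048_576
}

// Build a one-line status string from a progress value.
func progressLine(_ progress: DemoProgress) -> String {
    // Treat NaN or infinity as 0 so the Int conversion below cannot trap.
    let raw = progress.fractionCompleted.isFinite ? progress.fractionCompleted : 0
    // Clamp so an out-of-range fraction never shows as -1% or 101%.
    let clamped = min(max(raw, 0), 1)
    let percent = Int((clamped * 100).rounded())
    let file = progress.currentFilename ?? "unknown file"
    let done = wholeMiB(progress.downloadedBytes)
    if let total = progress.totalBytes {
        return "\(percent)% \(file) (\(done)/\(wholeMiB(total)) MiB)"
    }
    // Total size is not known yet, so report only what has arrived.
    return "\(percent)% \(file) (\(done) MiB)"
}
```

Handling the nil cases in one place keeps the callback body short and avoids printing placeholder values such as 0 for an unknown total.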

Generate Text

Pass a single prompt to generate. The result carries the generated text, the token count, and an optional tokens-per-second figure.

let result = try await llm.generate(
    prompt: "Summarize this note in one sentence."
)

print(result.text)
print(result.tokenCount)
print(result.tokensPerSecond ?? 0)

Chat Messages

Pass an array of PrimordialLLMMessage values to generate from a conversation. Each message pairs a role, such as .system or .user, with its content.

let result = try await llm.generate(messages: [
    PrimordialLLMMessage(role: .system, content: "You summarize notes."),
    PrimordialLLMMessage(role: .user, content: "Today I met Sarah about launch planning.")
])

Stream Text

Iterate the sequence returned by stream with for try await and handle each chunk's text as it arrives.

for try await chunk in llm.stream(prompt: "Write three bullet points about this note.") {
    print(chunk.text, terminator: "")
}
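
Apps that print chunks as they arrive often also need the complete text once streaming ends. The sketch below accumulates chunks from a stand-in stream built with AsyncThrowingStream. DemoChunk, demoStream, and collectText are hypothetical names, and the only assumption about the real chunk type is the text property shown above.

```swift
// Hypothetical stand-in for a streamed chunk; only `text` mirrors the snippet above.
struct DemoChunk: Sendable {
    let text: String
}

// Build a finite demo stream that yields the given pieces and then finishes.
func demoStream(_ pieces: [String]) -> AsyncThrowingStream<DemoChunk, Error> {
    AsyncThrowingStream<DemoChunk, Error> { continuation in
        for piece in pieces {
            continuation.yield(DemoChunk(text: piece))
        }
        continuation.finish()
    }
}

// Consume any throwing async sequence of chunks and return the joined text.
func collectText<S: AsyncSequence>(from chunks: S) async throws -> String
where S.Element == DemoChunk {
    var full = ""
    for try await chunk in chunks {
        // Append in arrival order; an error thrown by the stream propagates to the caller.
        full += chunk.text
    }
    return full
}
```

If the stream throws partway through, collectText rethrows and the partial text is discarded. An app that wants to keep partial output would need to store it outside the function as chunks arrive.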

Model Selection And Cleanup

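// Choose a specific model configuration when creating the LLM.
let llm = primordial.localLLM(configuration: .bonsai8B1Bit)

// Unload the model when it is no longer needed.
await llm.unload()

// Check the size of the downloaded model in bytes, then delete it.
let bytes = try await llm.downloadedModelSize()
try await llm.removeDownloadedModel()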
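
Before offering to remove a model, an app may want to show its size in a readable form. The sketch below assumes the size is an integer byte count that fits in Int64, which the snippet above does not state. describeModelSize is a hypothetical helper, while ByteCountFormatter comes from Foundation.

```swift
import Foundation

// Format a raw byte count as a human-readable string using file-size units.
// The exact output depends on the current locale and platform.
func describeModelSize(_ bytes: Int64) -> String {
    ByteCountFormatter.string(fromByteCount: bytes, countStyle: .file)
}
```

A call such as describeModelSize(Int64(bytes)) would then produce text suitable for a settings screen or a confirmation prompt.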