Linchpin works directly with Ollama. No cloud API keys, no per-token billing, no data leaving your machine. The agent runtime is open source, the models are local, the loop is closed.
Ollama runs as a local HTTP server (default http://localhost:11434). Linchpin's ollama provider points at it. The agent's session state, event log, and sandbox containers all live in Linchpin; the model weights and inference live in Ollama. Both run on the same machine.
flowchart LR
you[You] -->|chat / HTTP| linchpin[Linchpin
runtime]
linchpin -->|/api/chat| ollama[Ollama
http://localhost:11434]
ollama --> weights[(local model
weights)]
linchpin --> pg[(Postgres
event log)]
linchpin --> sbx[per-session
sandbox]
style linchpin fill:#B83A1A,color:#F4F2EC
style ollama fill:#1A1A1A,color:#F4F2EC
Point Linchpin at your local Ollama in the .env:
# .env LINCHPIN_API_KEY=dev-key VAULT_ENCRYPTION_KEY=$(openssl rand -base64 32) # model provider MODEL_PROVIDER=ollama OLLAMA_HOST=http://host.docker.internal:11434
If Linchpin is on the same host as Ollama (typical), host.docker.internal resolves from inside the Linchpin container. If Ollama is on a different machine on the LAN, point at that host's IP. From there, define an agent whose model is whatever you have pulled — llama3.2, qwen2.5, mistral, deepseek-coder, anything Ollama can run.
Not every chat model holds together inside an agent loop. The honest signal is "does the model reliably emit tool calls when asked, and follow multi-step instructions across many turns." A working set today:
Smaller models (3B and below) will agent — but unreliably. Expect more "model forgot the tool format" and "model went in circles" failures. For a real workload, plan on 7B or larger.
# 1. pull a model ollama pull llama3.2:8b # 2. clone linchpin git clone https://github.com/linchpinhq/linchpin cd linchpin cp .env.example .env # set MODEL_PROVIDER=ollama and OLLAMA_HOST # 3. up docker compose up --build
Define an agent with model: "llama3.2:8b", open a session, send an event. Streaming output comes back over SSE. See the docs for the full curl flow.