Three things get called "agent" today and they sit at very different layers of the stack. Mixing them up is the source of a lot of "X is just Y" arguments online. Let me draw the layers.
The three layers
From bottom to top:
- Agent library. Code you import into your own service. Examples: LangGraph, LangChain agents, Mastra, Pydantic AI, the OpenAI Agents SDK. You write the FastAPI service (or whatever) that wraps the library. You decide how to persist state, how to expose the API, how to isolate tool calls, how to stream output. The library gives you a graph or a loop; everything else is yours.
- Agent runtime. A platform that runs agents. You define agents through an API; the runtime owns the orchestrator loop, the session state, the sandbox, the event log, the tool calls, and the streaming. Examples: Linchpin, and OpenHands in some configurations. The runtime is something you operate. You ship agents on top of it.
- Hosted agent service. A vendor runs the runtime for you. You configure agents through their API and consume them as a service. Examples: Anthropic Managed Agents, Devin, Manus, OpenAI's agent tools. You do not operate anything; you bring your prompts and pay per use.
flowchart TB
L3["Hosted agent service
(Anthropic Managed Agents, Devin, Manus)"]
L2["Agent runtime
(Linchpin, OpenHands)"]
L1["Agent library
(LangGraph, LangChain, Mastra)"]
L3 -->|"sits on"| L2
L2 -->|"is built around"| L1
style L2 fill:#B83A1A,color:#F4F2EC
The lines are not perfectly clean. Many agent libraries ship some runtime affordances (state persistence, checkpointing). Many hosted services expose enough surface that they feel like a runtime you happen not to operate. But the layers are real, and knowing which layer a project sits at will save you a lot of "are these competitors?" confusion.
What a runtime owns
A real agent runtime takes responsibility for a specific list of things. The list is what distinguishes a runtime from a library that ships with a server.
Sessions
An agent session is long-lived. The user sends a message, the agent reasons, the agent calls tools, the agent replies. Then later the user sends another message — and the agent picks up where it left off. The runtime is responsible for persisting that session, surfacing its state, resuming it, and tearing it down. Sessions outlive HTTP requests.
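To make "sessions outlive HTTP requests" concrete, here is a minimal sketch of a session store backed by SQLite. The `SessionStore` class, its table layout, and the status strings are assumptions made for illustration; they are not Linchpin's schema or any other runtime's.

```python
import sqlite3
import uuid


class SessionStore:
    """Hypothetical store: session state lives in SQLite, not in a request handler."""

    def __init__(self, path: str) -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sessions "
            "(id TEXT PRIMARY KEY, status TEXT NOT NULL)"
        )
        self.db.commit()

    def create(self) -> str:
        """Persist a new idle session and return its id."""
        session_id = uuid.uuid4().hex
        self.db.execute(
            "INSERT INTO sessions (id, status) VALUES (?, ?)", (session_id, "idle")
        )
        self.db.commit()
        return session_id

    def set_status(self, session_id: str, status: str) -> None:
        cur = self.db.execute(
            "UPDATE sessions SET status = ? WHERE id = ?", (status, session_id)
        )
        if cur.rowcount == 0:
            raise KeyError(session_id)
        self.db.commit()

    def get_status(self, session_id: str) -> str:
        """Surface the stored state; works after a restart because it reads from disk."""
        row = self.db.execute(
            "SELECT status FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        if row is None:
            raise KeyError(session_id)
        return row[0]

    def teardown(self, session_id: str) -> None:
        self.db.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
        self.db.commit()

    def close(self) -> None:
        self.db.close()
```

Opening a second `SessionStore` on the same file after closing the first stands in for a server restart: the session id still resolves, which is the property the runtime has to guarantee.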
Event log
Everything that happens in a session is recorded as events: user.message, agent.message_delta, agent.tool_call, agent.tool_result, session.status_running, session.status_idle. Append-only. Cursor-paginated. The event log is the source of truth, not the agent's in-memory state. You can replay it. You can subscribe to it. Other services can consume it.
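A rough in-memory sketch of what "append-only, cursor-paginated, replayable" can look like. The `Event`, `EventLog`, and `replay_status` names are hypothetical, and a real runtime would back this with a database table rather than a Python list; only the event type strings come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    seq: int       # position in the log, starting at 1; doubles as the cursor
    type: str      # e.g. "user.message", "agent.tool_call"
    payload: dict


class EventLog:
    """Append-only: events are added at the end and never edited or removed."""

    def __init__(self) -> None:
        self._events: list = []

    def append(self, type: str, payload: dict) -> Event:
        event = Event(seq=len(self._events) + 1, type=type, payload=payload)
        self._events.append(event)
        return event

    def page(self, after: int = 0, limit: int = 100):
        """Return events with seq > after, plus the cursor to pass next time."""
        batch = self._events[after : after + limit]
        next_cursor = batch[-1].seq if batch else after
        return batch, next_cursor


def replay_status(events) -> str:
    """Rebuild the session status purely from the log, ignoring any in-memory state."""
    status = "idle"
    for event in events:
        if event.type == "session.status_running":
            status = "running"
        elif event.type == "session.status_idle":
            status = "idle"
    return status
```

Because the status is derived by folding over the log, any consumer that can read the events reaches the same answer as the runtime does, which is what "source of truth" means here.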
Sandbox
When the agent runs bash, you don't want it running on the runtime's host. Each session gets a per-session sandbox — typically a Docker container on a network the operator controls. File I/O, shell commands, MCP servers all run inside the sandbox. The runtime mediates between the agent's tool calls and the sandbox's filesystem and processes.
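As a sketch of the per-session sandbox idea, the helpers below build the Docker CLI commands a runtime might issue. The image name, the network name, the resource limits, and the `sandbox-<session_id>` naming scheme are all assumptions for illustration; the functions only build argument lists and do not start anything.

```python
def sandbox_run_argv(session_id: str,
                     image: str = "python:3.12-slim",        # assumed image
                     network: str = "agent-sandbox-net"):    # assumed operator-controlled network
    """Command to start one long-lived container for one session."""
    return [
        "docker", "run", "--detach",
        "--name", f"sandbox-{session_id}",
        "--network", network,
        "--memory", "512m",      # illustrative resource caps
        "--cpus", "1",
        "--label", f"session={session_id}",
        image, "sleep", "infinity",   # keep the container alive between tool calls
    ]


def sandbox_exec_argv(session_id: str, command: str):
    """Command to run an agent's bash tool call inside the session's container."""
    return ["docker", "exec", f"sandbox-{session_id}", "bash", "-lc", command]


def sandbox_remove_argv(session_id: str):
    """Command to tear the sandbox down when the session ends."""
    return ["docker", "rm", "--force", f"sandbox-{session_id}"]
```

On a host with Docker installed, each list can be passed to `subprocess.run(argv, check=True, capture_output=True)`. The point of the shape is that the agent's command string only ever appears as an argument to `docker exec`, never as something the runtime's own host shell interprets.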
Tool execution
Tools come in a few flavors. Built-in tools (read, write, bash, edit, grep, etc.) are shipped by the runtime. MCP servers are registered as additional tool sources. Custom HTTP tools point at services the operator defines. The runtime is responsible for dispatching tool calls, policy-gating them (this agent in this environment can or cannot use this tool), and recording the results back into the event log.
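The dispatch, gate, and record steps can be sketched in a few lines. Everything here is hypothetical: the `ToolDispatcher` class, the policy shape keyed by `(agent, environment)`, and the result dictionary are illustration choices, not a real runtime's API.

```python
from typing import Callable

Recorder = Callable[[str, dict], None]


class ToolDispatcher:
    def __init__(self, tools: dict, policy: dict, record: Recorder) -> None:
        self.tools = tools      # tool name -> callable taking an args dict
        self.policy = policy    # (agent, environment) -> set of allowed tool names
        self.record = record    # appends (event_type, payload) to the event log

    def call(self, agent: str, environment: str, tool: str, args: dict) -> dict:
        # Every call is logged before anything runs, allowed or not.
        self.record("agent.tool_call", {"tool": tool, "args": args})
        allowed = self.policy.get((agent, environment), set())
        if tool not in allowed:
            result = {"ok": False, "error": f"tool '{tool}' denied by policy"}
        elif tool not in self.tools:
            result = {"ok": False, "error": f"unknown tool '{tool}'"}
        else:
            try:
                result = {"ok": True, "output": self.tools[tool](args)}
            except Exception as exc:  # a failing tool becomes a result, not a crash
                result = {"ok": False, "error": str(exc)}
        self.record("agent.tool_result", {"tool": tool, **result})
        return result
```

Denials are recorded as `agent.tool_result` events too, so the log shows what the agent attempted as well as what it was allowed to do.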
Model providers
The runtime calls models. It is responsible for the wire format, retries, streaming, and provider selection. A library typically supports many providers via adapters; a runtime usually picks a small number and supports them well. Linchpin uses two: OpenRouter for any cloud model and Ollama for any local model.
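Two of those responsibilities, provider selection and retries, fit in a short sketch. The `ollama/` prefix convention, the `TransientProviderError` class, and the retry parameters are assumptions for illustration, not how Linchpin or any specific runtime actually routes or retries.

```python
import time
from typing import Callable


class TransientProviderError(Exception):
    """Stand-in for a timeout, rate limit, or 5xx response from a provider."""


def select_provider(model: str) -> str:
    # Hypothetical convention: "ollama/<name>" means a local model,
    # anything else is sent to the cloud provider.
    return "ollama" if model.startswith("ollama/") else "openrouter"


def call_with_retries(send: Callable[[], str],
                      attempts: int = 3,
                      base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Call send(), retrying transient failures with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return send()
        except TransientProviderError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Here `send` would wrap the actual HTTP request to whichever provider `select_provider` picked. Taking `sleep` as a parameter is only there so the backoff can be exercised in tests without waiting.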
Streaming
Agents are slow. Users want to see the answer arriving. The runtime exposes the event log as a stream — typically Server-Sent Events — so clients can subscribe and render deltas as they happen, with replay from a cursor if the connection drops.
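On the wire, Server-Sent Events are plain text frames with `id:`, `event:`, and `data:` fields, and a reconnecting browser client sends the last id it saw in the `Last-Event-ID` header. A minimal sketch of formatting and cursor replay follows; the function names and the `(seq, type, payload)` tuple shape are assumptions for illustration.

```python
import json
from typing import Iterable, Iterator, Optional


def format_sse(seq: int, event_type: str, payload: dict) -> str:
    """Render one event as an SSE frame; the blank line terminates the frame."""
    # json.dumps escapes newlines, so the payload always fits on one data line.
    data = json.dumps(payload, separators=(",", ":"))
    return f"id: {seq}\nevent: {event_type}\ndata: {data}\n\n"


def stream_from(events: Iterable[tuple],
                last_event_id: Optional[str] = None) -> Iterator[str]:
    """Yield frames for events the client has not seen yet.

    `events` is an iterable of (seq, type, payload) tuples in log order.
    `last_event_id` is the value of the client's Last-Event-ID header, if any.
    """
    cursor = int(last_event_id) if last_event_id else 0
    for seq, event_type, payload in events:
        if seq > cursor:
            yield format_sse(seq, event_type, payload)
```

A client that dropped after event 2 reconnects with `Last-Event-ID: 2` and receives frames from event 3 onward, which is the cursor-replay behavior described above.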
Why this layer exists
You could, in principle, build all of this yourself on top of an agent library. People do. The reason runtimes exist as a distinct layer is that the work is not negligible, and it is the same work across every agent product:
- Spawning and tearing down sandbox containers.
- Storing and paginating an event log.
- Streaming events to many clients with cursor-replay semantics.
- Policy-gating tool calls.
- Resuming sessions after a server restart.
- Encrypting credentials per session and injecting them at tool time.
That list is roughly two months of work, and you have not yet written your actual product. Runtimes commoditize it: you deploy a runtime, then you ship the product on top of it.
Runtime vs hosted service
The split between a runtime and a hosted service is operational, not architectural. A hosted service is a runtime — just operated by someone else, on their infrastructure, with their pricing. Anthropic Managed Agents is roughly what you would get if you packaged a runtime and ran it as a SaaS.
The reasons to pick a self-hosted runtime over a hosted service are the usual self-hosting reasons: data residency, cost predictability, model choice, source availability, and the ability to embed the runtime inside another product. The reasons to pick a hosted service are the usual hosted-service reasons: no operations, vendor SLA, compliance certifications, day-one feature breadth.
Linchpin as an example
Linchpin is one example of a runtime. Three services and Postgres on a single VM. The API is HTTP. The streaming is SSE. Sandboxes are Docker. Models are OpenRouter or Ollama. Credentials are Fernet-encrypted. Apache-2.0. There is nothing exotic about any of these choices — they are the obvious defaults — but they are made, and they are pre-wired, which is what makes Linchpin a runtime and not a library.
If you are looking for the layer this post defines, see the homepage and docs. If you are looking for the comparison to one of the libraries or services above, the comparison pages have a side-by-side: vs OpenHands, vs LangGraph, vs Anthropic Managed Agents.
Further reading
- Self-hosting AI agents on a single VM — practical deployment
- Glossary — definitions for every term used here
- Self-hosted agents — who picks this layer and why