Scaling Managed Agents:
Decoupling the Brain from the Model
April 9, 2026 · 15 min read · Lattice Runtime Team
Managed agent infrastructure should not assume which model is the brain. Lattice Runtime is built around interfaces that stay stable as models, harnesses, and providers change.
A running theme in agent infrastructure is that harnesses encode assumptions — about what models cannot do, about where code runs, and about which model is doing the thinking. Those assumptions go stale fast.
Anthropic described this well in their recent engineering blog post on Claude Managed Agents. They found that workarounds built for Sonnet were dead weight on Opus. Context resets, retry logic, premature-completion guards — all tuned for one model's limitations, all wrong for the next.
We hit the same wall, but from a different angle. We were not just swapping model versions. We were swapping model providers. A planning agent running Opus. A code agent running Sonnet. A review agent running GPT for a second opinion. A local agent running Ollama for sensitive data that cannot leave the machine.
We built Lattice Runtime to solve both problems: a managed agent infrastructure that is opinionated about the interfaces — brain, session, hands — but not about which model is the brain or where the hands live.
The same old problem, one layer up
Anthropic framed their challenge as an old problem in computing: how to design a system for “programs as yet unthought of.” Operating systems solved this by virtualizing hardware into abstractions general enough for programs that did not exist yet.
We agree with the pattern. We disagree with where they drew the boundary.
Anthropic virtualized everything except the model. Their brain interface assumes Claude. Their session lives on Anthropic's cloud. Their sandbox runs in Anthropic's containers. They decoupled the components — brain, session, hands — but coupled them all to a single provider.
Lattice Runtime virtualizes one layer further: the model itself becomes a pluggable interface.
```go
// The brain doesn't know what model is behind this.
response := brain.Generate(ctx, GenerateRequest{
	Messages: session.GetEvents(lastCheckpoint),
	Tools:    runtime.AvailableTools(),
})
// Could be Claude, GPT, Gemini, or Ollama.
// Config change, not code change.
```

The brain calls generate(messages, tools) → response; it does not know whether Claude, GPT, Gemini, or a local Ollama instance is behind it. The session lives wherever you run it. The abstraction outlasts the provider.
Do not adopt a vendor
Anthropic described the “pets vs cattle” problem: when everything lives in one container, that container becomes a pet you cannot afford to lose.
We had the same problem, but the pet was not a container — it was a vendor.
When your managed agent infrastructure is Claude-only, your entire agent system becomes a pet tethered to one provider's API, pricing, uptime, and context window.
Vendor-locked architecture:
- Brain: Claude only
- Session: their servers
- Sandbox: their container

Provider fails → everything fails.
The fix is the same one Anthropic arrived at for containers: decouple. But we decouple at the provider level, not just the component level.
Decouple the brain from the model
In Lattice Runtime, the brain is a harness that calls a model-agnostic interface. It knows how to:
- Build context from the durable session log
- Call generate() with messages and tools, against any model
- Route tool calls to the appropriate runtime backend
- Write events back to the session for durability
Switching from Claude to GPT to Ollama is a config change, not a code change. Each component is an interface. Each can fail independently. Each can be swapped without disturbing the others.
Lattice Runtime: decoupled architecture
- The Brain: model-agnostic harness. generate(messages, tools) → response
- The Session: durable append-only log. getEvents() → event stream. Lives wherever you run it, survives crashes and model swaps, SHA-256 audit chain.
- The Hands: 6 runtime backends. execute(name, input) → string

Each component is an interface. Each can fail independently. Each can be swapped.
The session is not the context window
Long-horizon tasks exceed the context window. The standard fixes (compaction, trimming, memory tools) involve irreversible decisions about what to keep, and it is hard to know ahead of time which tokens future turns will need.
In Lattice Runtime, we separated two concerns:
The session is durable. Every event is written to an append-only log per workspace. Malformed entries are filtered at load time, never fatal. If streaming crashes mid-token, the partial message lifecycle ensures the session is repairable on next load.
The context window is constructed. A stream context builder reads from the session log and builds what the model sees on each turn. Different models get different context strategies. Compaction marks a boundary but does not delete — the originals are still in the log, queryable.
```go
// Context is constructed, not accumulated.
events := session.GetEvents(
	session.FromCheckpoint(lastCompaction),
	session.WithLimit(contextBudget),
)
// Different models get different strategies.
context := contextBuilder.Build(events, model.ContextConfig())
```

Session vs context window
Session Log (durable): an append-only, queryable, portable record. For example: event_001 user message, event_002 tool call (git diff), event_003 tool result, event_004 model response, event_005 compaction boundary, and so on.

Context Window (constructed):
- Build: the stream context builder reads from the session log
- Transform: different models get different context strategies
- Compact: summarize and mark a boundary; the originals stay in the log
- Recover: getEvents() can rewind, slice, or replay
The harness became cattle
Because the brain is model-agnostic, a crashed session can resume with a different model. If your Opus session hits a provider outage, the retry manager can fall back to Sonnet or GPT. The session does not care which brain is driving.
This is the structural advantage of decoupling at the provider level. In a vendor-locked system, a provider outage is a total system outage. In Lattice Runtime, it is a config change.
Crash → auto-recovery
- Kill, OOM, or provider outage
- Deterministic replay from last checkpoint
- Resume with a different model if needed

Lifecycle: KILLED → RESTARTING → RUNNING. SHA-256 chained.
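The "SHA-256 chained" property can be sketched concretely: each event's hash covers the previous event's hash plus its own payload, so editing any past event breaks every hash after it. The Event shape and helper names below are illustrative, not the actual log format:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Illustrative chained event; the real log schema is not shown here.
type Event struct {
	Payload  string
	PrevHash string
	Hash     string
}

// chainHash binds an event to everything before it.
func chainHash(prevHash, payload string) string {
	sum := sha256.Sum256([]byte(prevHash + payload))
	return hex.EncodeToString(sum[:])
}

func appendEvent(chain []Event, payload string) []Event {
	prev := ""
	if len(chain) > 0 {
		prev = chain[len(chain)-1].Hash
	}
	return append(chain, Event{payload, prev, chainHash(prev, payload)})
}

// verify recomputes every link; any tampering breaks the chain.
func verify(chain []Event) bool {
	prev := ""
	for _, e := range chain {
		if e.PrevHash != prev || e.Hash != chainHash(prev, e.Payload) {
			return false
		}
		prev = e.Hash
	}
	return true
}

func main() {
	var chain []Event
	chain = appendEvent(chain, "KILLED")
	chain = appendEvent(chain, "RESTARTING")
	chain = appendEvent(chain, "RUNNING")
	fmt.Println(verify(chain)) // true

	chain[0].Payload = "tampered"
	fmt.Println(verify(chain)) // false: every later hash is now wrong
}
```

This is what makes the audit log tamper-evident rather than merely append-only.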
The security boundary
Credential forwarding, not credential sharing. In the coupled design, untrusted code runs next to credentials. A prompt injection only needs to convince the model to read its environment. The structural fix is to ensure credentials are never reachable from the sandbox.
In Lattice Runtime, authentication is wired into runtimes without the agent session ever seeing raw credentials. Git tokens are baked into the clone during sandbox init. MCP OAuth tokens live in a secure vault, accessed through a proxy that fetches them per-session.
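A sketch of the forwarding idea: the token is consumed by a host-side init step and never enters anything the sandboxed code can read. Vault, Sandbox, and initSandbox are invented for illustration; the real flow uses git clone and an OAuth proxy as described above:

```go
package main

import "fmt"

// Illustrative vault; real tokens live in a secure store.
type Vault struct{ tokens map[string]string }

func (v Vault) TokenFor(session string) string { return v.tokens[session] }

// Sandbox models what untrusted code can observe.
type Sandbox struct {
	Env   map[string]string // readable by injected code
	Repos []string          // working copies baked in at init
}

// initSandbox uses the token on the host side to prepare the repo;
// only the resulting working copy crosses the boundary. The token is
// dropped here and never exported into Env.
func initSandbox(v Vault, session, repo string) *Sandbox {
	token := v.TokenFor(session)
	_ = token // consumed by the (elided) host-side clone, then discarded
	return &Sandbox{
		Env:   map[string]string{"HOME": "/work"},
		Repos: []string{repo},
	}
}

func main() {
	v := Vault{tokens: map[string]string{"sess-1": "ghp_secret"}}
	sb := initSandbox(v, "sess-1", "github.com/acme/api")
	_, leaked := sb.Env["GIT_TOKEN"]
	fmt.Println(leaked) // false: no prompt injection can read the token
}
```

The point is structural: a prompt injection can only read what is in the sandbox, and the credential was never there.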
Five governance gates. Every agent action passes through: Identity → Authorization → Constraints → Execute → Audit. Policy violations are structurally impossible — the infrastructure will not execute actions that fail any gate.
Five governance gates
1. Identity: OAuth2 / SAML / mTLS / API key
2. Authorization: RBAC + ABAC, Rego → SQL
3. Constraints: budget / PII / model lock / tool gate
4. Execute: Temporal durable workflow
5. Audit: SHA-256 chain, diff capture

Every agent action passes through all five gates. Policy violations are structurally impossible.
Many brains, many hands
Many brains. Each workspace runs its own agent session with its own model configuration. An orchestrator running Opus can delegate to workers running Sonnet. A review agent can use GPT for a second opinion. A sensitive-data agent can use Ollama so nothing leaves the machine.
Many hands. Each brain connects to hands through execute(name, input) → string. The harness does not know whether the sandbox is a container, a remote server, or a local shell. Because no hand is coupled to any brain, brains can pass hands to one another.
Lazy provisioning. Runtimes are provisioned on the first tool call that needs them, not at session start. A session that never touches the sandbox does not wait for one. This dropped our p50 time-to-first-token by roughly 60%.
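Lazy provisioning can be sketched with sync.Once: the expensive setup runs on the first tool call that needs it, exactly once, and never for sessions that stay out of the sandbox. LazyRuntime and its fields are illustrative assumptions:

```go
package main

import (
	"fmt"
	"sync"
)

// Illustrative lazy runtime; not the actual Lattice Runtime API.
type LazyRuntime struct {
	once    sync.Once
	sandbox string // stands in for a real container/VM handle
}

func (r *LazyRuntime) Execute(name, input string) string {
	r.once.Do(func() {
		// Expensive provisioning happens here, exactly once, and
		// only if a tool call ever reaches this runtime.
		r.sandbox = "sandbox-ready"
	})
	return fmt.Sprintf("%s(%s) on %s", name, input, r.sandbox)
}

func main() {
	rt := &LazyRuntime{}
	// Session start pays nothing: no sandbox exists yet, so
	// time-to-first-token does not wait on provisioning.
	fmt.Println(rt.Execute("ls", "/repo"))    // first call provisions
	fmt.Println(rt.Execute("cat", "main.go")) // reuses the same sandbox
}
```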
Many brains
- Orchestrator: Opus (planning, reasoning)
- Code Worker: Sonnet (execution, edits)
- Review Agent: GPT (second opinion)
- Sensitive Data Agent: Ollama (never leaves the machine)

Many hands
- Local: direct filesystem
- Worktree: Git isolation
- SSH: remote servers
- Docker: container sandbox
- Devcontainer: VS Code compatibility
- Lattice SSH: managed tunnel

All hands share one interface: execute(name, input) → string. Any hand works with any brain, and any brain can pass hands to another.
What is different
A side-by-side comparison of vendor-locked managed agent infrastructure vs Lattice Runtime.
| | Vendor-locked | Lattice Runtime |
| --- | --- | --- |
| Model | One provider only | Any model, any provider |
| Hosting | Provider cloud | Your cloud, your machine, or ours |
| Data | Provider servers | Wherever you run it |
| Multi-agent | Single-model | Mix models per agent role |
| Governance | Not built-in | 5 gates, budgets, crypto audit |
| Runtimes | Cloud container | 6 backends (local → K8s) |
| Crash recovery | Provider-dependent | Resume with any model |
| Session | Provider-managed | Durable, portable, yours |
| Context | Model-coupled | Constructed from durable log |
| Credentials | Shared environment | Forwarded, never exposed |
| Audit trail | Basic logging | SHA-256 hash chain, tamper-evident |
| Provisioning | Upfront container | Lazy, on first tool call |
Building in public
We are two people. We ship every day. We are rolling out access in phases.
The bet is simple: today's best model will not be tomorrow's best model. The team that bets on one provider will rewrite their agent infrastructure every time the leaderboard shifts. The team that bets on interfaces will swap a config line and keep shipping.
Lattice Runtime is that interface layer. One Go binary. Any model behind it. Your data stays on your machines. Every action passes through five governance gates before it executes. Every event is written to a tamper-evident audit log.
The abstraction outlasts the provider. That is the whole point.
Written by the Lattice Runtime team. View the full architecture →
Run agents on your terms.
Star the repo to get on our radar. We reach out to stargazers first.
Star the lattice-runtime repo on GitHub · Open an issue with your use case · Watch for the invite