Overview
When working with AI coding agents, two components work together:LLM (Language Model)
The underlying AI model—Claude, GPT, etc.—that reasons about code and generates responses.
Harness
The coding agent CLI: tools, commands, agentic loop, permissions, and all the features that let the model interact with your codebase.
The Harness
The harness is the coding agent CLI that wraps the model. It’s what transforms a language model into a functional coding assistant. Key components include:| Component | Examples |
|---|---|
| Tools | Read/write files, run bash commands, search code, web fetch |
| Agentic Loop | Multi-turn execution that lets the model plan, act, observe, and iterate |
| Permission System | Controls what the agent can modify, execute, or access |
| Sub-agents | Specialized agents for planning, exploration, verification |
| MCP Integrations | External tool servers (GitHub, Linear, databases) |
| Context Management | What files, history, and information the model sees |
Different coding agents have different harnesses. Claude Code, Cursor, Cline,
Aider—each has its own tool set, permission model, and execution flow.
Why Model Providers Train on Specific Harnesses
Model providers typically use reinforcement learning (RL) to improve their models for coding tasks. This training happens on a specific harness:- Anthropic trains Claude models using the Claude Code harness
- OpenAI trains models using their own coding agent infrastructure
- The model learns tool-calling patterns, when to ask for permission, how to handle errors
What This Means
A model optimized for one harness may behave differently in another:| Scenario | Impact |
|---|---|
| Claude in Claude Code | Well-optimized—trained on this harness |
| Claude in a different agent | Generally works well, but may have subtle differences in tool use |
| GPT in Claude Code’s harness | Works but tool-calling patterns may differ from native environment |
Delegate Tasks to Native Harnesses
This is where Twill comes in. Instead of forcing models to work in unfamiliar harnesses, Twill delegates coding tasks to agents running in their native environments. Each model operates with the tools, patterns, and execution flows it was trained onBy matching models to their native harnesses, you get optimal tool-calling
behavior, better error handling, and the full capabilities each provider
intended.