Built on CLI Agents

Twill doesn’t reinvent the wheel—it orchestrates existing CLI coding agents like Claude Code and Open Code. This is intentional:
  1. Tools developers already use: CLI agents are the same tools you use daily in your terminal. There’s no new interface to learn, and the code they produce follows the same patterns you’d write yourself.
  2. Optimized by AI labs: Model providers like Anthropic, OpenAI, and Google specifically train their models to excel at these CLI tools through reinforcement learning. By leveraging these agents, Twill benefits from billions of dollars of RL training rather than competing with it.

Sub-Agent Architecture

Twill uses a multi-agent system where specialized agents handle different parts of the workflow:
  1. Planning Agent: Analyzes your request, explores the codebase, and creates a detailed implementation plan with a confidence score. You review and approve the plan before any code changes are made.
  2. Code Agent: Implements the approved plan step by step. It follows existing patterns in your codebase and adapts when reality doesn’t match the plan.
  3. Verifier Agent: Runs mechanical checks (tests, linting, type checking, and builds) and inspects infrastructure logs to confirm services are healthy.
  4. Critique Agent: Reviews logical correctness and alignment with the original goal, analyzing whether the changes actually solve the stated problem.

If either the Verifier Agent or the Critique Agent finds issues, the Code Agent addresses them, and verification re-runs until everything passes.
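
The workflow described above can be pictured as a small control loop. The following is a minimal sketch, not Twill's actual implementation: `run_workflow`, `Review`, and the callback signatures are hypothetical names chosen for illustration, and `max_rounds` is an assumed safety limit that the description above does not mention.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Review:
    """Outcome of a verification or critique pass (hypothetical type)."""
    passed: bool
    issues: list[str] = field(default_factory=list)


def run_workflow(
    request: str,
    plan: Callable[[str], str],
    approve: Callable[[str], bool],
    implement: Callable[[str, list[str]], str],
    verify: Callable[[str], Review],
    critique: Callable[[str, str], Review],
    max_rounds: int = 5,  # assumption: a cap so the loop cannot run forever
) -> Optional[str]:
    proposed = plan(request)
    if not approve(proposed):
        return None  # no code changes happen without approval

    feedback: list[str] = []
    for _ in range(max_rounds):
        change = implement(proposed, feedback)
        mechanical = verify(change)          # tests, lint, types, build
        logical = critique(request, change)  # does it solve the stated problem?
        if mechanical.passed and logical.passed:
            return change
        # Feed every reported issue back to the code agent for the next round.
        feedback = mechanical.issues + logical.issues
    raise RuntimeError("verification did not pass within max_rounds")


# Stub agents standing in for real ones.
def fake_implement(plan_text: str, feedback: list[str]) -> str:
    return "patch with tests" if feedback else "patch"


def fake_verify(change: str) -> Review:
    ok = "tests" in change
    return Review(ok, [] if ok else ["test suite failed"])


result = run_workflow(
    request="fix login bug",
    plan=lambda r: f"plan for: {r}",
    approve=lambda p: True,
    implement=fake_implement,
    verify=fake_verify,
    critique=lambda request, change: Review(True),
)
print(result)  # patch with tests
```

In this sketch the first attempt fails the mechanical check, the reported issue is passed back to the code step, and the second attempt passes both reviews.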

Self-Verification in Sandboxes

What makes this verification loop actually work is that Twill runs your entire project inside an isolated sandbox environment. The agent doesn’t just write code; it also:
  • Starts your dev server using your entrypoint script
  • Runs your test suite to catch regressions
  • Exercises UI changes through browser automation
  • Calls API endpoints to verify backend changes
  • Inspects logs to confirm services are healthy
This means the agent catches its own mistakes before you ever see them. No more “it compiles but doesn’t work” pull requests.
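
As a rough illustration of the check-running step only (not the sandbox isolation itself), the sketch below runs a set of named commands and reports which ones failed. `run_checks` and `CheckResult` are hypothetical names, and the commands are Python one-liners standing in for a project's real test, lint, and build commands.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_checks(checks: dict[str, list[str]], cwd: str = ".") -> list[CheckResult]:
    """Run each command and treat a zero exit code as a pass."""
    results = []
    for name, command in checks.items():
        proc = subprocess.run(command, capture_output=True, text=True, cwd=cwd)
        results.append(
            CheckResult(name, proc.returncode == 0, proc.stdout + proc.stderr)
        )
    return results


# Stand-in commands; replace with your project's own test and lint commands.
checks = {
    "tests": [sys.executable, "-c", "print('3 passed')"],
    "lint": [sys.executable, "-c", "import sys; sys.exit(1)"],
}
results = run_checks(checks)
failed = [r.name for r in results if not r.passed]
print(failed)  # ['lint']
```

The list of failed checks, together with their captured output, is the kind of feedback a code agent would need in order to fix its own mistakes before a pull request is opened.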