Overview

A multi-model agent framework for running long, high-stakes projects with real quality control.

Modulatio is a TUI-and-CLI framework for orchestrating teams of LLM agents on projects that take more than one prompt. You define an objective, set the standards, pick the models, and the team — your Mod Squad — plans, executes, reviews, and iterates, with you in the loop at the levels that matter and out of the loop where automation is safe.

It’s designed for work like:

  • Drafting and revising long-form content (essays, reports, books) against an editorial standard
  • Running a small business loop (research, content production, social engagement, weekly review)
  • Multi-step research where the team has to track decisions across many sub-tasks
  • Any artifact class where quality matters and a single LLM call won’t cut it

What makes Modulatio different from “spin up a chat” or single-agent tools:

  • Real multi-model routing per agent. Drafter on GLM 5.1, Quality Control on Kimi-K2.5, Leader on Sonnet — each role on its own model and its own provider. Native to the architecture, not an afterthought.
  • Quality Control as a first-class subsystem. Modulatio’s quality gate is built on Total Quality Management (TQM) — the established quality-engineering discipline behind the ISO 9000 family of standards — applied in three layers: universal axes × per-artifact-kind standards × per-team overrides. The Quality Control (QC) agent reviews every artifact before it ships; rejects route back to the producer, and when the producer can’t clear the bar QC patches the artifact itself. The economics are the point: cheap producers generate the bulk, the smarter QC patches only the errors — the cost of a cheap model with the quality of a strong one.
  • An honest covering note on every product. When the lead has reservations it can’t resolve inside the team — citations it couldn’t independently verify, a claim worth double-checking — it ships them to you as an advisory Product Quality Report beside the work. Honest caveats, never a gate: they don’t block the deliverable or stall the run.
  • A producer is a model endpoint. No fixed roles, and no skills to assign — you give a producer an LLM and tag what it’s good at, and the team composes the skills each task needs from a shared library at run-time. Routing reasons over each model’s capabilities and never blocks on a gap.
  • Plan-mode end-to-end. Leader is a conversational partner. Project is the unit being led. Plan is the unit of execution. Long plans run as daemons in the background, with reflection between sub-objectives and Telegram approvals where needed.
  • One cooperative team, one sandbox. Modulatio is two things at once: a coding harness for hands-on work, and a long-horizon, self-looping digital factory staffed by a cast of models you design. Not many copies of one model, not sandboxed agents passing notes over a wall — your models, harnessed into a single functional team inside one sandbox.
  • Diversity as a quality edge. Models from different providers and training lineages bring different instincts to a problem; where one is blind, another sees. Mixing them turns variety of approach into better work instead of one model’s blind spots, repeated.
  • Local, cloud, or offline. No built-in preference between local and cloud models: lean on local models to cut cost further, mix in cloud where it pays, and keep working when the connection drops.
  • Open architecture. Your data, your vault, your providers, your models. No SaaS, no per-instance subscription, no vendor lock-in.
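
The three-layer standard described in the Quality Control bullet above (universal axes, per-artifact-kind standards, per-team overrides) can be pictured as a layered merge. The sketch below is one plausible reading, not Modulatio’s implementation: the function names, the axis names, and the idea of numeric thresholds per axis are all assumptions made for illustration.

```python
# Illustrative only: every name and number here is hypothetical,
# not taken from Modulatio's code or configuration format.

# Layer 1: axes that apply to every artifact.
UNIVERSAL_AXES = {"accuracy": 0.8, "clarity": 0.7}

# Layer 2: extra or stricter thresholds per artifact kind.
KIND_STANDARDS = {
    "essay": {"clarity": 0.85, "citations": 0.75},
    "code": {"accuracy": 0.9, "tests_pass": 1.0},
}


def resolve_standard(kind, team_overrides=None):
    """Merge the three layers; a later layer wins on a shared axis."""
    merged = dict(UNIVERSAL_AXES)
    merged.update(KIND_STANDARDS.get(kind, {}))
    merged.update(team_overrides or {})  # Layer 3: per-team overrides.
    return merged


def failing_axes(scores, standard):
    """Return, sorted, the axes whose score is missing or below threshold."""
    return sorted(
        axis for axis, threshold in standard.items()
        if scores.get(axis, 0.0) < threshold
    )
```

Under these assumptions, a team that tightens citations for essays would call resolve_standard("essay", {"citations": 0.9}), and a review that scored citations at 0.8 would come back from failing_axes with only that axis listed, which is the kind of targeted rejection a producer (or QC, when patching) would then act on.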

v0.9.6 release notes (the current release): reliability. The team always finishes the job. A producer’s attempts are budgeted per task (across every retry, hand-off, and re-run, never reset), and when that budget is spent QC finishes the work itself (patching the existing draft, or writing the artifact from the task’s brief), so a run lands a real deliverable instead of wedging. Plus a leaner team (the Leader is the only required role), lifecycle tooling (modulatio uninstall / modulatio repair), clearer signals (each QC review shows against its task; a plain message when the Leader’s model is unavailable), and fail-closed confinement for a Clay producer/QC seat.

v0.9.5 release notes: subscription seats. Bring your own Claude and GPT-5.5 to the team: Clay runs any seat through your Claude Code subscription (claude -p, the official harness; never a metered key, confined like any other seat, additive), GPT-5.5 runs through your OpenAI Codex subscription, and each seat can carry fallback models (the engine warns and restarts the whole task on a backup when a model is unavailable).

v0.9.4 release notes: the two-lane Leader. The Leader can now work on its own as a standalone coding agent (read / edit / run files in a folder you grant with /work; confined by default to its own workspace; widened only by a scoped approval, once / session / always / deny, with /rp to revoke; sandbox-required, fail-closed for anything it runs), with three autonomy modes, /yolo (auto-grant capabilities), /goal (delegate judgment), and /yolo-goal (both), and one fence through all of them: running free outside your own yard always needs permission (no mode opens the folder gate).

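
The per-task attempt budget from the v0.9.6 notes above can be sketched as a counter that survives hand-offs and re-runs, with QC as the fallback once it is spent. This is a minimal illustration under stated assumptions: TaskBudget, run_task, and the three demo callables are hypothetical names, and the round-robin hand-off between producers is an assumption, not a description of Modulatio’s engine.

```python
from dataclasses import dataclass


@dataclass
class TaskBudget:
    """Attempts allowed for one task. `spent` is never reset (hypothetical)."""
    limit: int
    spent: int = 0

    def try_spend(self) -> bool:
        """Consume one attempt if any remain; report whether it succeeded."""
        if self.spent >= self.limit:
            return False
        self.spent += 1
        return True


def run_task(brief, producers, review, qc_finish, budget):
    """Let producers attempt the task until QC accepts or the budget is spent.

    Returns (artifact, who_finished). When the budget runs out, QC finishes:
    it patches the last draft, or works from the brief if no draft exists.
    """
    draft = None
    turn = 0
    while budget.try_spend():
        producer = producers[turn % len(producers)]  # assumed hand-off order
        draft = producer(brief, draft)
        if review(draft):
            return draft, "producer"
        turn += 1  # the same budget carries over to the next attempt
    return qc_finish(brief, draft), "qc"


# Demo stand-ins (hypothetical): a producer that never clears the bar.
def weak_producer(brief, previous_draft):
    return f"draft of {brief}"


def review(draft):
    return draft.endswith("[fixed]")


def qc_finish(brief, draft):
    base = draft if draft is not None else f"QC-written {brief}"
    return base + " [fixed]"
```

With TaskBudget(limit=3), the weak producer gets exactly three attempts before QC patches its last draft. Running the same task again with the same budget object skips the producer entirely and has QC write from the brief, which is the "never reset" property the notes describe.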
v0.9.3 release notes: Feng-Tui, the harmonious terminal interface. A full phosphor-terminal reskin of the TUI (pure black, thin frames, three live-cycling monochrome variants, amber / green / cyan, switched with F2 and remembered across launches; state read as glyph + WORD; a low-res boot splash; a shared master-detail layout across the list tabs; a read-only skills preview; app-wide copy/paste; uniform delete guards; layout-only, no backend change).

v0.9.1 release notes: agent role refinement. Producers, the Leader, and QC work to a per-operation standard (the right definition of “done” per kind of work: a fix judged on the reported problem being gone, research on real synthesized sources, an assessment on evidence; no behavior change for work that declares no operation).

v0.9.0 release notes: stability + reporting. Two full-codebase debug passes (hundreds of fixes, no behavior change for a normal run) plus a crash / error / doctor log system (a LOGS tab + modulatio logs; capture-always, submit-on-consent, auto-redacted).

v0.8.9 release notes: security hardening. A full-codebase audit + two independent mirror-audits closed nine findings (keystone: a tool-call authorization bypass found by the independent pass), no behavior change for a normal run.

v0.8.8 release notes: deterministic assembly validation (QC cheap-passes a provably-correct code or media assembly; containment, not shape; SaaS imports expected; lossy media falls back honestly) and codify-the-win (learn techniques from QC recoveries, not just repeated fails; project-local, flagged non-independent).

v0.8.6 release notes: Leader self-remediation (fix fixable concerns in place, under a typed gate + engine-owned fix window) and JT generativity (refuse a job template a job can’t fill; derive a fitting one; cron skip-the-slot).

See the Assembly + the review-ledger and Skill system deep-dives.

v0.8.1 release notes — product-aware familial assemblers (document / code / data) and a content-addressed review-ledger so QC verifies the marks instead of re-reading the finished work.

v0.8.0 release notes — an Agent Client Protocol (ACP) server: drive the same conversational Leader from a Zed-class editor over stdio, with client-approved tool calls.

v0.7.2 release notes — conversation-first: approvals via the Leader (he asks before he touches anything), the Job-Template Library, attachments, and the constitution.

v0.7.0 release notes — the API key pool (pool by default, pin to isolate) and the Configuration tab.

v0.6.0 release notes — the role-language migration: routing reality wired on every headless path, the specialist/researcher roles collapsed into producers, and an operator-presence-aware Leader that judges when it runs alone and defers when you’re present.

v0.4.0 release notes — autonomous skill self-codification: the team learns from its own repeated failures, codifying recurring corrections into durable, git-versioned skill guidance.

v0.3.0 release notes — the skill-library keystone: producers as model endpoints, capability + availability routing that never blocks, and self-contained goal decomposition.

v0.2.2 release notes — web search (the first skill-library brick), source-credibility flagging, and a provably-terminating redo loop.

v0.2.1 release notes — in-place editing (--attach), surgical patch mode, the code read-toolkit, and the delivery / verify-goal fixes.

v0.2.0 release notes — the QC-thesis arc, the Product Quality Report, default standards, and the context-budget hardening.

Beta calibration — what the engine does well, what it does NOT do yet, the QC-as-fixer / self-heal scope, and how to report bugs. Read this before serious work.

Getting started — install, run the setup wizard, get to a working first plan.

Concepts — the mental model: project, plan, agent, skill, standard, vault. Read this once and the rest of the system gets a lot less mysterious.

Agents — the Leader + QC structural roles, producers (skill-holders), choosing models by role, and how the team composes around a project.

Plan lifecycle — what actually happens from “Leader, can you help me with X” to “plan complete, here’s your output.” The state machine, the approval gates, the reflection step, the audit trail.

Providers & models — every supported provider (Anthropic, xAI, OpenRouter, Ollama, LM Studio) with the auth flow, the gotchas, and how to point each agent at the model you want.

CLI reference — every command, every flag, with examples.

Roadmap — what’s shipping next, what’s under design, and the long-horizon pillars.

Troubleshooting — common errors mapped to fixes. Auth failures, model-not-found, paused plans, divergence flags, what to do when QC keeps rejecting.


Modulatio is standalone — it runs entirely on your machine, talks to whichever model providers you configure, and stores everything in your Obsidian vault (or any directory of your choosing).

It’s intentionally a different shape from chat UIs (single-agent, no quality gate, no plan persistence) and from cloud agent platforms (multi-tenant, SaaS, your data on their servers). If you want a tool you fully own that runs real teams of agents on real work, Modulatio is built for that.


If you hit something the docs don’t cover, or you find a mistake:

  • Errors at install or first run — start with Troubleshooting.
  • Provider auth issues — see the Providers & models page for that specific provider.
  • Conceptual confusion (“why is this paused?”, “why did QC reject?”) — Plan lifecycle covers state transitions; Agents covers role responsibilities.
  • Open an issue at the project repo with the relevant modulatio doctor output.