Context Engineering with ExoChat: Parsimony in Action
I wrote about the principle of parsimony in context engineering a week ago. That article defined parsimony for developer workflows: specs, cursor rules, agent instructions. The core idea: a context is parsimonious when nothing in it can be removed without introducing ambiguity or degrading the result.
That definition holds. But it was scoped to one audience: developers building with AI tools.
The same problem (what enters the context window and at what detail level) exists in every LLM-based product that talks to users. Customer support bots. Financial advisors. Mental health assistants. Pre-sales qualification flows. Any system where an LLM conducts a multi-turn conversation with a person.
Context engineering in dialogue systems is the same discipline as in development: you manage what the model sees at each step. The principle of parsimony tells you how: only what's needed, nothing else. The missing piece is a tool that lets you practice context engineering at scale, without writing code for every change.
ExoChat is that tool.
The Problem: Context Bloat in Dialogue Products
The typical approach to building an LLM dialogue product looks like this:
- Write a system prompt. Put everything in it: persona, rules, examples, edge cases, compliance disclosures, escalation instructions.
- Append conversation history.
- Send to the model. Hope it follows the rules.
This works for demos. It fails in production.
As the conversation grows, instructions compete with history for the token budget. The model's attention is finite. Instructions that were at the top of the prompt get pushed further from the model's effective focus. The result:
- Rule drift. The model starts ignoring instructions it followed perfectly three turns ago.
- Hallucinations. Lost context triggers confabulation to fill the gaps.
- Inconsistent behavior. Same question, different answers depending on conversation length.
- Escalating costs. Longer contexts burn more tokens, and the output quality doesn't improve.
The longer the conversation, the worse it gets. This isn't a model quality issue. It's a failure to engineer the context: to control what the model sees at each turn of the dialogue.
How ExoChat Does Context Engineering
ExoChat is a context engineering tool for dialogue systems. Its core design principle follows parsimony directly: at each state of the conversation, assemble only the context relevant to that state.
Not a monolithic system prompt. Not "everything we might need." The minimum viable context for the current conversational step. Assembled automatically, controlled visually.
Six mechanisms make this work:
State Graph (FSM)
The conversation is modeled as a finite state machine. Each state has an explicit goal, its own rules, and defined transitions to other states. The state graph is the primary context engineering structure. It determines what the model needs to know right now, not what it might need later.
An ExoChat FSM for a financial advisor might have states like: greeting → risk_profiling → product_recommendation → disclosure → confirmation. At the risk_profiling state, the model doesn't need disclosure text. At disclosure, it doesn't need the profiling questionnaire. Each state scopes its own context.
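The state graph above can be pictured as plain data. This is a minimal sketch, not ExoChat's actual schema; the class names and fields are invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class State:
    name: str
    goal: str                # what this state is trying to accomplish
    instructions: str        # state-specific rules for the model
    transitions: list[str] = field(default_factory=list)  # allowed next states

# Hypothetical financial-advisor flow; state names mirror the example above.
FLOW = {
    "greeting": State("greeting", "welcome the user",
                      "Be brief.", ["risk_profiling"]),
    "risk_profiling": State("risk_profiling", "assess risk tolerance",
                            "Ask one question at a time.", ["product_recommendation"]),
    "product_recommendation": State("product_recommendation", "suggest products",
                                    "Match products to the stored risk profile.", ["disclosure"]),
    "disclosure": State("disclosure", "present required disclosures",
                        "Read the disclosure policy verbatim.", ["confirmation"]),
    "confirmation": State("confirmation", "confirm the user's choice",
                          "Summarize and ask for explicit consent.", []),
}

def allowed_next(current: str) -> list[str]:
    """The graph, not the model, decides where the conversation may go."""
    return FLOW[current].transitions
```

Because each state carries its own instructions, scoping the context is a dictionary lookup, not a prompt-surgery exercise.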
Managed Prompts per State
Each state assembles its prompt independently:
- State-specific instructions (what to do, what to avoid)
- Relevant facts collected so far (not raw conversation history)
- Minimal history (summarized or filtered, not the full transcript)
The prompt controller builds context from these components. The model never sees the full system prompt. Only what applies to the current state.
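Per-state assembly can be sketched in a few lines. The function below is an illustration of the idea, not ExoChat's prompt controller; the component names and the three-turn history cutoff are assumptions:

```python
def assemble_prompt(state_instructions: str,
                    facts: dict[str, str],
                    recent_turns: list[str],
                    max_turns: int = 3) -> str:
    """Build the prompt for the current state only: its own instructions,
    structured facts, and a filtered slice of history (not the transcript)."""
    fact_lines = "\n".join(f"- {k}: {v}" for k, v in facts.items())
    history = "\n".join(recent_turns[-max_turns:])  # minimal history
    return (
        f"Instructions:\n{state_instructions}\n\n"
        f"Known facts:\n{fact_lines}\n\n"
        f"Recent turns:\n{history}"
    )

prompt = assemble_prompt(
    "Recommend a product matching the user's risk profile.",
    {"risk_tolerance": "moderate"},
    ["user: I'd like something stable", "bot: Noted.", "user: What do you suggest?"],
)
```

The full system prompt never appears anywhere in this function, which is the point: there is nothing to bloat.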
Fact Storage
When the user provides validated information (name, consent, risk tolerance, symptoms), ExoChat extracts and stores these as structured facts. Facts are not kept in raw conversation history where they'd consume tokens every turn. They're stored separately and injected only when a state needs them.
This is parsimony at the data level: the model gets risk_tolerance: moderate instead of re-reading the five-turn exchange where the user explained their preferences.
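A fact store with selective injection might look like this sketch. The interface is hypothetical, but it shows the data-level parsimony described above:

```python
class FactStore:
    """Validated facts live outside the raw transcript and cost tokens
    only when a state explicitly requests them."""

    def __init__(self) -> None:
        self._facts: dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self._facts[key] = value

    def inject(self, keys: list[str]) -> str:
        """Return only the facts the current state declares it needs."""
        return "\n".join(f"{k}: {self._facts[k]}" for k in keys if k in self._facts)

store = FactStore()
store.set("risk_tolerance", "moderate")
store.set("consent", "true")

# The disclosure state asks only for consent; the risk data stays out of its context.
snippet = store.inject(["consent"])
```
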
Model Routing
Not every step needs the same model. Classification ("did the user agree?") is a cheap operation; a small, fast model handles it. Complex generation ("explain this investment product in the user's terms") needs a capable model with richer context.
ExoChat routes requests by task type: classification, generation, verification, summarization. Each route gets a context budget appropriate to the task. A yes/no classifier doesn't need the full conversation history.
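A routing table pairing each task type with a model and a context budget could be as simple as this. The model names and budget figures are invented placeholders, not ExoChat configuration:

```python
from typing import NamedTuple

class Route(NamedTuple):
    model: str           # hypothetical model identifier
    context_budget: int  # max context tokens for this task type

# Illustrative table: cheap model and tiny budget for classification,
# capable model and larger budget for generation.
ROUTES = {
    "classification": Route("small-fast-model", 512),
    "generation":     Route("capable-model", 8192),
    "verification":   Route("small-fast-model", 1024),
    "summarization":  Route("small-fast-model", 2048),
}

def route(task_type: str) -> Route:
    return ROUTES[task_type]
```
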
Context Assembly from Policies
Compliance rules, disclosure requirements, escalation triggers. These are domain policies. ExoChat injects them only in states where they apply. A disclosure policy enters the context at the disclosure state, not at greeting. An escalation trigger for high-risk topics is active in relevant states, not everywhere.
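Scoping policies to states is a filter over a policy table. A sketch with invented policy entries:

```python
# Hypothetical policy table: each policy lists the states where it applies.
POLICIES = {
    "disclosure_text": {
        "states": {"disclosure"},
        "text": "Investments may lose value.",
    },
    "escalation_high_risk": {
        "states": {"risk_profiling", "product_recommendation"},
        "text": "Escalate to a human on mentions of financial distress.",
    },
}

def active_policies(state: str) -> list[str]:
    """Return only the policy texts that apply in the given state."""
    return [p["text"] for p in POLICIES.values() if state in p["states"]]
```

A greeting state gets an empty list; the disclosure text enters the context exactly once, where it matters.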
Transition Validators
Before the conversation moves from one state to another, validators check exit conditions. Did the user provide the required information? Did they confirm consent? Validators prevent premature transitions that would lose context or skip required steps.
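An exit-condition check can be expressed as required facts per state. The required-fact lists below are illustrative, not ExoChat's validator API:

```python
def validate_transition(state: str, facts: dict[str, str]) -> bool:
    """Allow leaving a state only once its exit conditions are met.
    Hypothetical example: each state names the facts it must have collected."""
    required = {
        "risk_profiling": ["risk_tolerance"],
        "disclosure": ["consent"],
    }
    return all(k in facts for k in required.get(state, []))

# The conversation may leave risk_profiling only once a tolerance is recorded.
ok = validate_transition("risk_profiling", {"risk_tolerance": "moderate"})
blocked = validate_transition("disclosure", {})  # consent missing: stay put
```
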
Comparison: Context Engineering Approaches for Dialogue Systems
Every LLM dialogue system deals with context engineering, consciously or not. The approaches differ in how much control they offer over what enters the context, who controls it, and how well they follow the principle of parsimony.
| Approach | Context Engineering Method | Who Controls | Parsimony | Iteration Speed |
|---|---|---|---|---|
| Raw Prompt Engineering | Monolithic system prompt | Developer | Low: everything upfront | Slow: code deploy per change |
| RAG | Chunk retrieval by similarity | Developer + embeddings | Medium: relevant chunks, but no conversation state awareness | Medium: update knowledge base |
| Agent Frameworks (LangChain, CrewAI) | Code-defined tool chains | Developer | Medium: tools scope context, but flow is code | Slow: code changes for flow |
| Visual Bot Builders (Voiceflow, Botpress) | Decision trees, intents | Designer | Medium: structured, but intent-based | Fast: visual editor |
| ExoChat (FSM + M2P) | State graph with per-state context assembly | Operator (no-code) | High: minimum viable context per state | Fast: visual editor, no deploy |
The key differences:
RAG answers "what knowledge to include" but not "what instructions and rules apply right now." Retrieval is stateless. It doesn't know where in the conversation you are. Context engineering without conversation state is incomplete.
Agent frameworks are developer tools. Every flow change (a new state, a different transition condition, a modified prompt) requires code. Context engineering is possible, but gated by developer availability. Parsimony degrades because nobody tunes it.
Visual bot builders (Voiceflow, Botpress) share the no-code philosophy. Operators can edit flows visually. They also allow different prompts per node. But the context engineering model is different in three ways:
- Routing: bot builders route by user intent (what did the user say?). ExoChat routes by state graph with validated transitions (what should the system do next?). Intent-based routing is reactive: it responds to user input. State-driven routing is proactive: the system leads the conversation toward a goal.
- Context assembly: bot builders typically pass the full conversation history plus the node's prompt to the model. ExoChat assembles context per state from structured facts, domain policies, and filtered history, not the raw transcript. This is where parsimony is enforced: each state gets only what it needs.
- Data model: bot builders work with conversation memory (full or summarized). ExoChat extracts validated facts (risk_tolerance: moderate, consent: true) and injects them selectively. Facts don't consume tokens sitting in raw history. They're available when a state requests them.
ExoChat is a context engineering tool that combines these three mechanisms with no-code editing and M2P conversation control. State-driven routing, per-state context assembly, structured fact storage. The operator manages what context enters each state. Parsimony is the default, not an afterthought.
Operator-First: Context Engineering Without Developers
Context engineering following the principle of parsimony requires constant tuning. A context that's parsimonious today becomes bloated tomorrow when you add a new product line, a compliance rule, or an edge case handler.
If every tuning cycle requires a developer to change code, review, test, and deploy, context quality degrades. The cost of maintaining parsimony exceeds the perceived benefit. The system prompt grows. Context bloat returns.
ExoChat is designed so that the operator (product manager, analyst, domain expert) does context engineering directly:
- Visual state graph editor. See the full conversation flow, edit per-state prompts, adjust transition conditions.
- Version control. Ship scenario versions with A/B testing and feature flags.
- ExoChat Quality Lab. Run synthetic users through the scenario, evaluate quality per persona, catch regressions before production.
- No deploy cycle. Changes go live when the operator publishes the version.
The operator does context engineering directly. They add context to a state where the model hallucinates. They remove context from a state where it's not needed. They test with the Quality Lab. They ship. No developer in the loop. Parsimony is maintained because the person closest to the domain manages the context.
ExoChat Quality Lab deserves its own article, and it will get one. In short: LLM dialogues operate in fuzzy logic territory where you can't write a unit test that says "response must equal X." Quality Lab solves this by simulating dozens of conversations in parallel, each with a different user persona and task, then automatically scoring every dialogue against a set of criteria. The result is a quality map of your ExoChat scenario: which personas get great service, which hit dead ends, and where the context engineering needs attention. More on this soon.
This is the difference between context engineering as an abstract discipline and context engineering as daily practice.
Context Engineering Across the Stack
The prompt engineering vs context engineering article described three levels of AI system complexity. Context engineering with parsimony applies at every level, but the tools and the people change.
| Layer | Context Engineering Tool | Who Does It |
|---|---|---|
| Developer workflow | CLAUDE.md, cursor rules, specs | Developer |
| Agent orchestration | Inter-agent context references, tool-first approach | Developer / Architect |
| User-facing dialogue | ExoChat FSM, per-state context assembly | Operator |
The principle of parsimony is the same everywhere: minimum viable context per task. The difference is who practices context engineering and with what tools.
At the developer layer, context engineering means writing compressed specs and directive rules. At the agent layer, it means passing references instead of full context between agents. At the dialogue layer, it means assembling per-state prompts from facts, policies, and scoped instructions. ExoChat is the tool that makes this possible without code.
From Principle to Product
Context engineering is the discipline. Parsimony is the principle that guides it: remove everything from context that doesn't contribute to the result. This applies to developer workflows, agent orchestration, and user-facing dialogue systems alike.
ExoChat is what happens when you build a context engineering tool around the principle of parsimony. State graphs scope context per conversation step. Managed prompts assemble minimum viable context. Fact storage replaces raw history. Model routing matches context budgets to task complexity. The operator, not the developer, practices context engineering daily. The system stays parsimonious because the person tuning it understands the domain.
The context window is finite. What you put in it determines what comes out. ExoChat is the tool that manages what gets in, following the principle of parsimony, in the hands of the people who know the domain best.
