# Agent Orchestration Patterns & Interactive Workflow Frameworks

## Deep Research Report — February 2026

> **Purpose:** Comprehensive comparison of how modern AI agent frameworks handle multi-agent orchestration with human-in-the-loop (HITL), state persistence, interactive UIs, and notification patterns — with architectural recommendations for building an interactive AI factory.

---

## Table of Contents

1. [Agent Orchestration Frameworks — Deep Comparison](#1-agent-orchestration-frameworks)
2. [State Machine Patterns for Agent Pipelines](#2-state-machine-patterns)
3. [Notification & Alerting Patterns](#3-notification--alerting-patterns)
4. [Chat-Embedded Interactive Modules](#4-chat-embedded-interactive-modules)
5. [MCP Apps / Interactive MCP Patterns](#5-mcp-apps--interactive-mcp-patterns)
6. [Production Examples](#6-production-examples)
7. [Architectural Recommendations](#7-architectural-recommendations)

---

## 1. Agent Orchestration Frameworks

### LangGraph — The Gold Standard for HITL

LangGraph is currently the most mature framework for human-in-the-loop agent workflows. It provides three core primitives:

**Three Pillars: Checkpointing → Interrupts → Commands**

1. **Checkpointing** — Persistent state that survives crashes, restarts, and even server migrations. Analogous to BizTalk's dehydration/rehydration pattern. The full agent state (variables, context, progress) is serialized to a backend (`MemorySaver` for dev, `PostgresSaver`/`SqliteSaver` for production).
2. **Interrupts** — Two flavors:
   - *Static:* Always pause at a specific node (`interrupt_before=["sensitive_action"]`)
   - *Dynamic:* Conditionally pause based on runtime state using `interrupt()` from `langgraph.types`
3. **Commands** — Resume a paused workflow with `Command(resume={...})`, correlated by `thread_id`.

```python
# LangGraph HITL pattern — dynamic interrupt
from langgraph.types import interrupt, Command
from langgraph.checkpoint.memory import MemorySaver

def process_transaction(state):
    if state["transaction_amount"] > 10000:
        human_decision = interrupt({
            "question": f"Approve transaction of ${state['transaction_amount']}?",
            "details": state["details"]
        })
        if not human_decision.get("approved"):
            return {"status": "rejected", "reason": human_decision.get("reason")}
    return {"status": "approved", "processed": True}

# `workflow` is a StateGraph with process_transaction as a node (built elsewhere)
graph = workflow.compile(checkpointer=MemorySaver())
config = {"configurable": {"thread_id": "txn-123"}}
result = graph.invoke(initial_state, config)

# Later (hours/days), resume with human input:
result = graph.invoke(
    Command(resume={"approved": True, "notes": "Verified identity"}),
    config
)
```

**Key Insight:** The graph doesn't replay from the start — it resumes from the exact checkpoint. The `thread_id` acts as a correlation key (similar to BizTalk correlation sets).

**Production template:** There's an open-source [LangGraph interrupt workflow template](https://github.com/KirtiJha/langgraph-interrupt-workflow-template) with a FastAPI + Next.js frontend that demonstrates the full pattern.

---

### CrewAI — Simpler but Less Flexible

CrewAI supports HITL through three mechanisms:

1. **`human_input=True` on Tasks** — When a task has this flag, the agent pauses after completing its work and asks the human to review/approve before finalizing. This is a task-level checkpoint, not node-level.
2. **`allow_delegation=True` on Agents** — Enables agents to delegate work to other agents, with optional human input at delegation points.
3. **Hierarchical Process** — Automatically assigns a manager agent that coordinates planning, delegation, and validation. The manager can route to humans.
```python
# CrewAI HITL pattern
task1 = Task(
    description="Conduct analysis of AI trends in 2024...",
    expected_output="Detailed report",
    human_input=True,  # Pauses for human review
    agent=researcher_agent
)
```

**Limitations:**

- No persistent checkpoint/resume like LangGraph — if the process crashes while waiting for human input, state is lost
- `human_input` is essentially a blocking `input()` call under the hood
- No dynamic interrupt capability (always pauses or never pauses, per task config)
- CrewAI Flows (newer) add `start`/`listen`/`router` steps with state persistence and resume, but HITL is still less mature than LangGraph's

---

### AutoGen — Conversation-Centric HITL

AutoGen (now at v0.4+ / "stable") models HITL through the `UserProxyAgent`:

1. **UserProxyAgent** — A special agent that acts as a proxy for human input. It blocks the team's execution until the user responds.
2. **Group Chat Orchestration** — In `RoundRobinGroupChat`, the UserProxyAgent is called in round-robin order. In `SelectorGroupChat`, a selector prompt/function dynamically decides when to route to the human.
3. **`human_input_mode`** (v0.2) — Three modes: `ALWAYS` (always ask), `TERMINATE` (ask only at termination), `NEVER` (fully autonomous).

```python
# AutoGen HITL pattern
from autogen_agentchat.agents import AssistantAgent, UserProxyAgent
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination
from autogen_agentchat.ui import Console

user_proxy = UserProxyAgent("user_proxy", input_func=input)
assistant = AssistantAgent("assistant", model_client=model_client)  # model_client configured elsewhere
termination = TextMentionTermination("APPROVE")
team = RoundRobinGroupChat([assistant, user_proxy], termination_condition=termination)

# run_stream() returns an async generator; Console drains it and prints each message
await Console(team.run_stream(task="Write a 4-line poem about the ocean."))
```

**Key Limitation:** `UserProxyAgent` blocks the entire team execution. AutoGen docs explicitly recommend it only for "short interactions requiring immediate feedback" like button clicks.
It puts the team in an **unstable state that cannot be saved or resumed**. For long-running approvals, you need external patterns.

**Integration examples:** AutoGen provides sample integrations with FastAPI, Chainlit, and Streamlit for web-based HITL.

---

### Microsoft Semantic Kernel — Process Framework with External Pub/Sub

Semantic Kernel's Process Framework (experimental, as of Feb 2026) takes an event-driven approach to HITL:

1. **Parameter Gating** — A step's `KernelFunction` only executes when ALL required parameters are provided. By adding a `userApproval` parameter, the step naturally waits for both the document AND the approval.
2. **ProxyStep + External Pub/Sub** — A `ProxyStep` bridges internal process events to external messaging systems. When a document is approved by the AI proofreader, an event is emitted externally via `IExternalKernelProcessMessageChannel`.
3. **External Event Injection** — Human decisions come back as `OnInputEvent("UserApprovedDocument")` events that route to the waiting step's parameter.

```csharp
// Semantic Kernel Process: parameter gating for HITL
public class PublishDocumentationStep : KernelProcessStep
{
    [KernelFunction]
    public DocumentInfo PublishDocumentation(
        DocumentInfo document,   // From AI proofreader
        bool userApproval)       // From human via external pub/sub
    {
        if (userApproval) { /* publish */ }
        return document;
    }
}

// Process wiring
processBuilder
    .OnInputEvent("UserApprovedDocument")
    .SendEventTo(new(docsPublishStep, parameterName: "userApproval"));
```

**Architectural Insight:** This is the most "enterprise-ready" pattern — clean separation between the process engine and the notification/approval system. The `IExternalKernelProcessMessageChannel` interface can be implemented for any pub/sub backend (Azure Service Bus, Redis, Kafka, etc.).

---

### Temporal.io — The Durability Champion

Temporal provides the most robust HITL primitive through **Signals**:

1. **Signals** — Asynchronous messages sent to a running Workflow to change its state. A `@workflow.signal` handler mutates workflow state, and the main workflow loop uses `workflow.wait_condition()` to react.
2. **Queries** — Read-only inspection of workflow state (e.g., "what's the current LLM output?") without affecting execution.
3. **Updates** — Synchronous, tracked write requests where the sender can wait for a response.

```python
# Temporal HITL pattern
from dataclasses import dataclass
from enum import StrEnum

from temporalio import workflow

class UserDecision(StrEnum):
    KEEP = "KEEP"
    EDIT = "EDIT"
    WAIT = "WAIT"

@dataclass
class UserDecisionSignal:
    decision: UserDecision  # signal payload, defined here for completeness

@workflow.defn
class ResearchWorkflow:
    def __init__(self):
        self._user_decision = UserDecisionSignal(decision=UserDecision.WAIT)

    @workflow.signal
    def user_decision(self, input: UserDecisionSignal):
        self._user_decision = input

    @workflow.run
    async def run(self, input):
        continue_loop = True
        while continue_loop:
            research = await workflow.execute_activity(llm_call, ...)
            # Wait for human signal
            await workflow.wait_condition(
                lambda: self._user_decision.decision != UserDecision.WAIT
            )
            if self._user_decision.decision == UserDecision.KEEP:
                continue_loop = False
            elif self._user_decision.decision == UserDecision.EDIT:
                # Incorporate feedback, reset, loop
                self._user_decision.decision = UserDecision.WAIT
```

**Why Temporal is unmatched for durability:**

- Workflow state survives crashes, restarts, and server migrations automatically
- No separate checkpointing step needed — it's intrinsic to the execution model
- Signals are durably stored — if a user approves and the server crashes, the approval is NOT lost
- Workflows can run for months/years without consuming resources while waiting
- Built-in retry, timeout, and heartbeat mechanisms

---

### Inngest — Serverless Step Functions with Event Matching

Inngest's `step.waitForEvent()` is elegant for serverless HITL:

```typescript
// Inngest HITL pattern
const processInvoice = inngest.createFunction(
  { id: "process-invoice" },
  { event: "app/invoice.created" },
  async ({ event, step }) => {
    const analysis = await step.run("analyze", () => analyzeInvoice(event.data));

    // Wait up to 7 days for human approval, match by invoiceId
    const approval = await step.waitForEvent("wait-for-approval", {
      event: "app/invoice.approved",
      timeout: "7d",
      match: "data.invoiceId",  // Correlation!
    });

    if (!approval) {
      await step.run("escalate", () => notifyManager(event.data));
      return;
    }
    await step.run("process", () => processPayment(approval.data));
  }
);
```

**Key Features:**

- **Event correlation** via `match` — automatically matches approval events to the correct waiting function by field value
- **Timeouts** with fallback logic — if no approval arrives in 7 days, escalate
- **Serverless** — no persistent server needed; the function is dehydrated and rehydrated on events
- **Realtime streaming** — combine with Inngest Realtime to stream status updates to the UI while waiting

**Limitation:** `waitForEvent` only listens for events from the moment the step executes — events sent before the wait starts are missed (a lookback feature is planned).

---

### Prefect / Dagster — Data Pipeline Focus, Limited HITL

Neither Prefect nor Dagster has first-class HITL primitives. They're optimized for data pipeline orchestration, not human-interactive workflows.

- **Dagster:** Sensors can trigger on external events, but there's no built-in "wait for human approval" gate. You'd need to implement it externally (e.g., poll a database for approval status).
- **Prefect:** Similar story — you can use pause/resume via the API, but it's not a core workflow primitive. Tasks are primarily designed for data transformations.
- **Both:** The Reddit consensus is "if you need human-interactive workflows, use Temporal.io instead."

---

## 2. State Machine Patterns for Agent Pipelines

### XState + Stately Agent — State Machines for LLM Agents

[Stately Agent](https://github.com/statelyai/agent) (published as `@statelyai/agent`) combines XState v5 state machines with LLM decision-making:

- **State machines guide agent behavior** — valid transitions are defined explicitly, preventing the agent from entering invalid states
- **Observations, feedback, and insights** feed into decision-making
- **First-class Vercel AI SDK integration** — supports OpenAI, Anthropic, Google, Mistral, Groq, etc.
- **Episodes** — complete sequences from initial state to goal (similar to RL episodes)

**Modeling "Waiting for Human" as a State:**

```typescript
// XState pattern: waiting-for-human as a first-class state
import { createMachine } from 'xstate';

const agentMachine = createMachine({
  id: 'pipeline',
  initial: 'gathering',
  states: {
    gathering: {
      invoke: { src: 'gatherData', onDone: 'analyzing' }
    },
    analyzing: {
      invoke: { src: 'runAnalysis', onDone: 'awaitingHumanReview' }
    },
    awaitingHumanReview: {
      // This is a "parking" state — no automatic transitions.
      // Only human events can move forward.
      on: {
        APPROVE: { target: 'publishing', actions: 'recordApproval' },
        REJECT: { target: 'revising', actions: 'recordRejection' },
        EDIT: { target: 'analyzing', actions: 'incorporateFeedback' }
      },
      after: {
        // Auto-escalate after 24 hours
        86400000: { target: 'escalating' }
      }
    },
    escalating: {
      invoke: { src: 'notifyManager', onDone: 'awaitingHumanReview' }
    },
    revising: {
      invoke: { src: 'applyRevisions', onDone: 'awaitingHumanReview' }
    },
    publishing: {
      invoke: { src: 'publishResult', onDone: 'complete' }
    },
    complete: { type: 'final' }
  }
});
```

**Persistent State Machines with Restate:** [Restate.dev](https://restate.dev) offers persistent serverless state machines that combine XState's modeling with durable execution — state machines survive crashes and can be distributed across serverless functions.
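The "parking" behaviour above (ignore everything except an explicit human event or the escalation timeout) can be exercised without any framework. Below is a minimal, dependency-free sketch; the state and event names mirror the machine above, but nothing here is XState or Stately Agent API:

```python
# Illustrative only: a hand-rolled transition table for the HITL "parking" state.
# State/event names mirror the XState machine above; this is not XState API.
TRANSITIONS = {
    "awaitingHumanReview": {
        "APPROVE": "publishing",
        "REJECT": "revising",
        "EDIT": "analyzing",
        "TIMEOUT": "escalating",  # stands in for the 24-hour `after` delay
    },
    "escalating": {"NOTIFIED": "awaitingHumanReview"},
}

def transition(state: str, event: str) -> str:
    """Events with no defined transition leave the state unchanged (parked)."""
    return TRANSITIONS.get(state, {}).get(event, state)

print(transition("awaitingHumanReview", "APPROVE"))      # publishing
print(transition("awaitingHumanReview", "STATUS_PING"))  # awaitingHumanReview (parked)
```

Because unhandled events are simply ignored, stray notifications or duplicate webhooks cannot knock the pipeline out of its review state; only the human events (or the timeout) move it forward.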
**Key Patterns:**

- **Parallel states** for concurrent agent work with sync points (XState parallel states)
- **Guard conditions** for branching based on human decisions
- **Delayed transitions** (`after`) for SLA enforcement and auto-escalation
- **Snapshot/restore** — XState v5 supports persisting machine state to any backend

---

## 3. Notification & Alerting Patterns

### Making Human Input OBVIOUS

The n8n community and PagerDuty have established robust patterns:

**Escalation Chain Pattern (PagerDuty-style):**

```
Level 1 (0 min):   Slack notification + email
Level 2 (15 min):  Direct message + phone push notification
Level 3 (30 min):  SMS to secondary reviewer
Level 4 (60 min):  Phone call to manager
Level 5 (4 hrs):   Auto-default to safest outcome + incident report
```

**SLA Tracking:**

- Record a `waiting_since` timestamp when entering the HITL state
- Display elapsed time in all notifications ("⏰ Waiting 2h 15m")
- Color-code: 🟢 < 1 hr, 🟡 1–4 hr, 🔴 > 4 hr, 🚨 > SLA threshold
- Dashboard showing all pending approvals with age

**Smart Batching (Context-Rich Notifications):**

Instead of 10 separate "approve this" notifications:

```
📋 5 MCP server builds need review:
┌────────────────────────────────────┐
│ 1. ✅ ghl-mcp     (tests pass, low risk)  [Approve] [Review]
│ 2. ⚠️ stripe-mcp  (1 warning, med risk)   [Approve] [Review]
│ 3. ❌ shopify-mcp (2 errors, high risk)   [Review Required]
│ 4. ✅ notion-mcp  (tests pass, low risk)  [Approve] [Review]
│ 5. ✅ cal-mcp     (tests pass, low risk)  [Approve] [Review]
└────────────────────────────────────┘
⏰ Oldest: 45 min ago | 🎯 SLA: 2 hours
[Approve All Low-Risk] [Review All]
```

**n8n Escalation Pattern:**

```
Wait Node (timeout: 2h)
  → IF approved → continue
  → IF timed out → notify backup owner
      → Wait Node (timeout: 1h)
          → IF approved → continue
          → IF timed out → auto-reject + incident log
```

---

## 4. Chat-Embedded Interactive Modules

### Slack Block Kit — The Reference Implementation

Slack Block Kit provides the canonical pattern for interactive chat elements:

```json
{
  "blocks": [
    {
      "type": "header",
      "text": { "type": "plain_text", "text": "🏭 MCP Build Review Required" }
    },
    {
      "type": "section",
      "fields": [
        { "type": "mrkdwn", "text": "*Server:*\nghl-mcp" },
        { "type": "mrkdwn", "text": "*Status:*\n✅ Tests Passing" }
      ]
    },
    {
      "type": "section",
      "fields": [
        { "type": "mrkdwn", "text": "*Build Time:*\n2m 34s" },
        { "type": "mrkdwn", "text": "*Risk Level:*\n🟢 Low" }
      ]
    },
    {
      "type": "actions",
      "elements": [
        {
          "type": "button",
          "text": { "type": "plain_text", "text": "✅ Approve" },
          "style": "primary",
          "value": "approve"
        },
        {
          "type": "button",
          "text": { "type": "plain_text", "text": "❌ Reject" },
          "style": "danger",
          "value": "reject"
        },
        {
          "type": "button",
          "text": { "type": "plain_text", "text": "👀 Review Details" },
          "value": "review"
        }
      ]
    }
  ]
}
```

### Discord Components — Buttons + Embeds

Discord supports interactive components via discord.js:

```javascript
const { ActionRowBuilder, ButtonBuilder, ButtonStyle, EmbedBuilder } = require('discord.js');

const row = new ActionRowBuilder().addComponents(
  new ButtonBuilder()
    .setCustomId('approve_build_123')
    .setLabel('✅ Approve')
    .setStyle(ButtonStyle.Success),
  new ButtonBuilder()
    .setCustomId('reject_build_123')
    .setLabel('❌ Reject')
    .setStyle(ButtonStyle.Danger),
  new ButtonBuilder()
    .setCustomId('details_build_123')
    .setLabel('📋 Details')
    .setStyle(ButtonStyle.Secondary)
);

const embed = new EmbedBuilder()
  .setTitle('🏭 Build Review: ghl-mcp')
  .addFields(
    { name: 'Status', value: '✅ Tests Passing', inline: true },
    { name: 'Risk', value: '🟢 Low', inline: true },
    { name: 'Waiting', value: '⏰ 15 minutes', inline: true }
  )
  .setColor(0x00ff00);

await channel.send({ embeds: [embed], components: [row] });
```

**Interactive Patterns for Chat:**

- **Card-based reviews** — embed with context + action buttons
- **Select menus** — choose from options (e.g., select which variant to deploy)
- **Modal forms** — full form inputs triggered by button click (Discord modals, Slack dialogs)
- **Progress indicators** — update embed fields in-place as the pipeline progresses
- **Threaded detail** — a "Review Details" button creates a thread with the full diff/logs

---

## 5. MCP Apps / Interactive MCP Patterns

### MCP Apps Extension — Interactive UIs in Chat (January 2026)

The [MCP Apps specification](https://modelcontextprotocol.io/docs/extensions/apps) (released Nov 2025, refined Jan 2026) is a game-changer for HITL:

**Core Pattern:**

1. A tool declares `_meta.ui.resourceUri` pointing to a `ui://` resource
2. The host fetches the HTML resource and renders it in a sandboxed iframe
3. **Bidirectional communication** via JSON-RPC between the app and host

```typescript
// MCP Apps: Interactive approval UI served by MCP server
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { registerAppTool, registerAppResource } from '@modelcontextprotocol/ext-apps/server';
import { createUIResource } from '@mcp-ui/server';

const approvalUI = createUIResource({
  uri: 'ui://factory/build-approval',
  content: {
    type: 'rawHtml',
    htmlString: `