Jake Shore ddfa0956fe Daily backup: 2026-02-03 — 4 new MCP servers, multi-panel threads, LocalBosses bug fixes

2026-02-03 23:01:52 -05:00


Research: How MCP Servers Handle Rich Interactive UIs

Date: 2026-02-03
Source: github.com/modelcontextprotocol/ext-apps (official repo) + MCP Apps blog/docs

Executive Summary

There is ONE canonical pattern for UI-to-server communication in MCP Apps: callServerTool. Every example whose UI needs data from the server after the initial render calls app.callServerTool() from the client side. There are NO alternative server-communication patterns. However, there are 3 complementary APIs for communicating with the HOST (not the server): updateModelContext, sendMessage, and openLink.

The key insight: Interactivity in MCP Apps is entirely client-side JavaScript. The server sends data once via tool results, and the UI is a fully interactive web app running in a sandboxed iframe. Buttons, sliders, drag-drop, forms — all work natively because it's just HTML/JS in an iframe. The only time you need callServerTool is when you need FRESH DATA from the server.

The 5 Proven Interactivity Patterns

Pattern 1: Client-Side State Only (Most Common)

Used by: budget-allocator, scenario-modeler, customer-segmentation, cohort-heatmap

The server sends ALL data upfront via structuredContent in the tool result. The UI is fully interactive using only local JavaScript state — no server calls needed after initial load.

Server → ontoolresult(structuredContent) → UI renders with local state
User interacts → local JS handles everything (sliders, charts, forms)

Example: Budget Allocator — Server sends categories, history, benchmarks. Sliders, charts, percentile badges all update locally. Zero callServerTool calls during interaction.

Example: Scenario Modeler — Server sends templates + default inputs. User adjusts sliders → React recalculates projections locally using useMemo. No server round-trips.

Key takeaway: If you can send all needed data upfront, do it. This is the simplest and most responsive pattern.
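A minimal sketch of this pattern, under assumed names: the `BudgetData` shape and the `initState`, `setPercent`, and `amounts` helpers are hypothetical and not taken from budget-allocator. It only illustrates that once the tool result has arrived, every slider change can be a pure local update with no server call.

```typescript
// Hypothetical shape of what the server sends once in structuredContent.
interface BudgetData {
  totalBudget: number;
  categories: { id: string; percent: number }[];
}

// Local UI state: category id -> percentage currently set on its slider.
type Allocations = { [id: string]: number };

// Seed local state from the one-time tool result.
function initState(data: BudgetData): Allocations {
  const state: Allocations = {};
  for (const c of data.categories) state[c.id] = c.percent;
  return state;
}

// Called from a slider's "input" listener; clamps to 0..100 and returns a new state object.
function setPercent(state: Allocations, id: string, percent: number): Allocations {
  const clamped = Math.min(100, Math.max(0, percent));
  return { ...state, [id]: clamped };
}

// Derived values a chart would read: the amount allocated to each category.
function amounts(data: BudgetData, state: Allocations): Allocations {
  const out: Allocations = {};
  for (const c of data.categories) out[c.id] = (data.totalBudget * state[c.id]) / 100;
  return out;
}
```

In a real UI, `initState` would run inside `app.ontoolresult`, and `setPercent` plus `amounts` would run on every slider event before re-rendering.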

Pattern 2: callServerTool for On-Demand Data

Used by: wiki-explorer, system-monitor, pdf-server

The UI calls app.callServerTool() when it needs data the server didn't send initially. This is the ONLY way to get fresh data from the server.

User clicks "Expand" → app.callServerTool({ name: "get-first-degree-links", arguments: { url } })
Server returns structuredContent → UI updates graph

Example: Wiki Explorer — Initial tool result gives first page's links. When user clicks "Expand" on a node, it calls callServerTool to fetch that page's links. The graph visualization (force-directed d3) is entirely client-side.
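As an illustration of the client-side half of this flow, the sketch below merges newly fetched links into local graph state. The `Graph` shape and `mergeLinks` are hypothetical, and treating the tool result as a plain list of URLs is an assumption, not the actual wiki-explorer payload.

```typescript
// Hypothetical local graph state kept by the UI.
interface Graph {
  nodes: string[]; // page URLs
  edges: { from: string; to: string }[];
}

// Merge the links returned for `source` into the graph, skipping duplicates.
// `links` stands in for the tool result's structuredContent (assumed shape).
function mergeLinks(graph: Graph, source: string, links: string[]): Graph {
  const nodes = graph.nodes.slice();
  const edges = graph.edges.slice();
  for (const url of links) {
    if (nodes.indexOf(url) === -1) nodes.push(url);
    const exists = edges.some((e) => e.from === source && e.to === url);
    if (!exists) edges.push({ from: source, to: url });
  }
  return { nodes, edges }; // new object; the input graph is left untouched
}
```

A click handler would await `app.callServerTool({ name: "get-first-degree-links", arguments: { url } })`, pass the returned links to `mergeLinks`, and hand the new graph to the renderer.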

Example: System Monitor — Two tools:

get-system-info (model-visible): Returns static config once
poll-system-stats (app-only, visibility: ["app"]): Returns live metrics

The UI polls poll-system-stats via setInterval + callServerTool every 2 seconds.
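The start/stop polling behaviour can be sketched as a small controller. Everything here is an assumption for illustration: `createPoller` is a hypothetical helper, and `fetchStats` stands in for a function that wraps `app.callServerTool({ name: "poll-system-stats", arguments: {} })`. The in-flight guard is an addition of this sketch, meant to avoid overlapping requests when a call takes longer than the interval.

```typescript
// Hypothetical polling controller; not taken from system-monitor.
function createPoller<T>(
  fetchStats: () => Promise<T>, // e.g. wraps app.callServerTool(...)
  onData: (data: T) => void,
  intervalMs: number,
) {
  let timer: ReturnType<typeof setInterval> | undefined;
  let inFlight = false;

  async function tick(): Promise<void> {
    if (inFlight) return; // previous request still pending, skip this tick
    inFlight = true;
    try {
      onData(await fetchStats());
    } catch (err) {
      console.error("poll failed:", err);
    } finally {
      inFlight = false;
    }
  }

  return {
    // Starting twice is a no-op, so a double click cannot create two timers.
    start(): void {
      if (timer === undefined) timer = setInterval(tick, intervalMs);
    },
    stop(): void {
      if (timer !== undefined) {
        clearInterval(timer);
        timer = undefined;
      }
    },
    isRunning(): boolean {
      return timer !== undefined;
    },
  };
}
```

Wiring `start` and `stop` to the UI's buttons keeps the timer handle in one place instead of in a module-level variable.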

Pattern 3: App-Only Tools with `visibility: ["app"]`

Used by: system-monitor, pdf-server

Tools marked visibility: ["app"] are hidden from the LLM — only the UI can call them. This is critical for polling, pagination, and UI-driven actions that shouldn't clutter model context.

// Server: register app-only tool
registerAppTool(server, "poll-system-stats", {
  _meta: { ui: { visibility: ["app"] } },  // Hidden from model
  // ...
});

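// Client: poll every 2 seconds; catch errors so a failed call
// does not become an unhandled promise rejection
setInterval(async () => {
  try {
    const result = await app.callServerTool({ name: "poll-system-stats", arguments: {} });
    updateUI(result.structuredContent);
  } catch (err) {
    console.error("poll-system-stats failed:", err);
  }
}, 2000);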
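To make the visibility rule concrete, here is a sketch of how tools could be partitioned by audience. The `ToolInfo` type and both filter functions are hypothetical, and the default of `["model", "app"]` for tools without a visibility field is an assumption of this sketch rather than something confirmed by the notes above.

```typescript
type Audience = "model" | "app";

// Minimal tool descriptor; only the fields this sketch reads.
interface ToolInfo {
  name: string;
  _meta?: { ui?: { visibility?: Audience[] } };
}

// Assumption: a tool with no visibility field is available to both audiences.
function visibilityOf(tool: ToolInfo): Audience[] {
  const v = tool._meta && tool._meta.ui && tool._meta.ui.visibility;
  return v ? v : ["model", "app"];
}

// Tools that would be listed to the LLM.
function toolsForModel(tools: ToolInfo[]): ToolInfo[] {
  return tools.filter((t) => visibilityOf(t).indexOf("model") !== -1);
}

// Tools the UI would be allowed to call through callServerTool.
function toolsForApp(tools: ToolInfo[]): ToolInfo[] {
  return tools.filter((t) => visibilityOf(t).indexOf("app") !== -1);
}
```

With the two system-monitor tools as input, `toolsForModel` would keep only get-system-info, while `toolsForApp` would keep both.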

Pattern 4: updateModelContext (Inform the Model)

Used by: map-server, transcript-server

app.updateModelContext() pushes context TO the model without triggering a response. It tells the model what the user is currently seeing/doing.

Example: Map Server — When user pans the map, debounced handler sends location + screenshot:

app.updateModelContext({
  content: [
    { type: "text", text: `Map centered on [${lat}, ${lon}], ${widthKm}km wide` },
    { type: "image", data: screenshotBase64, mimeType: "image/png" }
  ]
});

Example: Transcript Server — Updates model context with unsent transcript text using YAML frontmatter:

app.updateModelContext({
  content: [{ type: "text", text: `---\nstatus: listening\nunsent-entries: 3\n---\n\n${text}` }]
});
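The frontmatter string is easier to keep consistent when built by a helper. The sketch below is hypothetical (`buildTranscriptContext` and its arguments are not from transcript-server); it produces the same params shape as the call above, ready to pass to `app.updateModelContext()`.

```typescript
interface TextBlock {
  type: "text";
  text: string;
}

// Build the params object for app.updateModelContext().
// The frontmatter keys mirror the transcript example; the helper itself is hypothetical.
function buildTranscriptContext(status: string, unsent: string[]): { content: TextBlock[] } {
  const header = [
    "---",
    `status: ${status}`,
    `unsent-entries: ${unsent.length}`,
    "---",
  ].join("\n");
  // A blank line separates the frontmatter from the transcript body.
  return { content: [{ type: "text", text: header + "\n\n" + unsent.join("\n") }] };
}
```

Deriving `unsent-entries` from the array length means the count cannot drift from the text that is actually sent.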

Pattern 5: sendMessage (Trigger Model Response)

Used by: transcript-server

app.sendMessage() sends a message AS the user, triggering a model response. Used when the UI wants the model to act on accumulated data.

await app.sendMessage({
  role: "user",
  content: [{ type: "text", text: transcriptText }]
});
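One way to accumulate data and send it in a single message is a small buffer that is cleared only after a successful send. This is a sketch under assumptions: `TranscriptBuffer` and the `MessageSender` interface are hypothetical, and the interface models only the `sendMessage` call shape shown above, not the full App class.

```typescript
// Only the part of the app object this sketch needs.
interface MessageSender {
  sendMessage(params: {
    role: "user";
    content: { type: "text"; text: string }[];
  }): Promise<unknown>;
}

// Hypothetical buffer for transcript lines not yet sent to the model.
class TranscriptBuffer {
  private entries: string[] = [];

  add(line: string): void {
    this.entries.push(line);
  }

  size(): number {
    return this.entries.length;
  }

  // Send all pending lines as one user message. Lines are dropped only after
  // the send resolves, and lines added while it was pending are kept.
  async flush(app: MessageSender): Promise<boolean> {
    if (this.entries.length === 0) return false;
    const pending = this.entries.slice();
    await app.sendMessage({
      role: "user",
      content: [{ type: "text", text: pending.join("\n") }],
    });
    this.entries = this.entries.slice(pending.length);
    return true;
  }
}
```

If `sendMessage` rejects, the buffer is left intact, so the UI can offer a retry without losing text.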

How Specific Examples Handle Interactivity

| Example | Interactive Elements | Pattern Used | callServerTool? |
|---|---|---|---|
| Budget Allocator | Sliders, dropdowns, charts, sparklines | Client-side state only | ❌ No |
| Scenario Modeler | Sliders, template selector, React charts | Client-side state only | ❌ No |
| Cohort Heatmap | Hover tooltips, data viz | Client-side state only | ❌ No |
| Customer Segmentation | Scatter chart, tooltips | Client-side state only | ❌ No |
| Wiki Explorer | Click nodes, expand graph, zoom, reset | callServerTool on user action | ✅ Yes |
| System Monitor | Start/stop polling, live charts | callServerTool polling | ✅ Yes (every 2s) |
| Map Server | Pan, zoom, fullscreen 3D globe | updateModelContext on move | ❌ No (data from tool input) |
| Transcript | Start/stop recording, send transcript | updateModelContext + sendMessage | ❌ No |
| PDF Server | Page navigation, zoom | callServerTool for chunks | ✅ Yes (chunked loading) |
| ShaderToy | WebGL shader rendering | ontoolinputpartial for streaming | ❌ No |

What Does NOT Exist (Common Misconceptions)

No WebSocket/SSE from server to UI — Communication is request/response via callServerTool only
No server-push to UI — The server cannot push data to the UI after initial tool result. The UI must poll.
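No shared in-memory state between tool calls — In the stateless HTTP setup the examples use, each HTTP request creates a new server instance, so nothing held in server memory carries over to the next call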
No DOM access from server — The server never touches the UI. It only returns data.
No "re-render" from server — Once the UI is in the iframe, it's autonomous. The server doesn't control it.

The Minimal Working Interactive MCP App

1. Server registers tool with `_meta.ui.resourceUri`
2. Server registers resource that returns bundled HTML
3. HTML includes `<script>` that creates `new App()` and calls `app.connect()`
4. `app.ontoolresult` receives initial data from the tool call
5. UI renders with that data
6. User interactions handled by local JS event listeners
7. If fresh server data needed: `app.callServerTool()`
8. If model needs to know what user did: `app.updateModelContext()`
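Steps 3 to 5 can be sketched as a small wiring function. This is an illustration under assumptions: `AppLike` is a hypothetical stand-in for the subset of the App object used here (the real class has more members and stricter types), and `View` and `wireApp` are invented names.

```typescript
// Hypothetical stand-ins; the real App class is not reproduced here.
interface ToolResult {
  structuredContent?: unknown;
}
interface AppLike {
  ontoolinput?: (params: { arguments?: unknown }) => void;
  ontoolresult?: (result: ToolResult) => void;
  connect(): unknown;
}
interface View {
  showLoading(args: unknown): void;
  render(data: unknown): void;
}

// Assign both handlers first, then connect, so the handlers are already
// in place when the host starts delivering the tool input and result.
function wireApp(app: AppLike, view: View): void {
  app.ontoolinput = (params) => view.showLoading(params.arguments);
  app.ontoolresult = (result) => view.render(result.structuredContent);
  app.connect();
}
```

Keeping the view behind an interface also makes it possible to exercise the wiring with a fake app object before integrating with a real host.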

Recommendations for Our MCP App

Problem: Interactive components lose functionality after rendering

This is likely because we're treating the UI as a static render rather than a live web app. In MCP Apps, the UI is a full JavaScript application running in an iframe. Event listeners, React state, DOM manipulation — all work normally.

What We Should Adopt

Send all data upfront via structuredContent — Don't drip-feed data. Send everything the UI needs in the initial tool result. This is what budget-allocator and scenario-modeler do.
Build the UI as a standalone web app — Use Vite to bundle HTML+JS+CSS into a single file. The UI should work independently with hardcoded test data before integrating with MCP.
Use callServerTool only for fresh data — If the user needs to load more data (like wiki-explorer expanding nodes), create an app-only tool with visibility: ["app"].
Use updateModelContext to keep the model informed — When users interact (drag items, fill forms), send a summary to the model so it knows the current state.
Use app.ontoolresult + app.ontoolinput — These are the entry points. ontoolinput fires with the tool arguments (can show loading state). ontoolresult fires with the actual data.
Bundle everything with Vite + vite-plugin-singlefile — Every example uses this pattern. The entire UI (HTML, JS, CSS) becomes a single HTML file served as a resource.
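For the bundling step, a minimal Vite configuration might look like the sketch below. It is not copied from any of the examples, which may set additional build options; it only shows the `viteSingleFile` plugin from vite-plugin-singlefile being registered.

```typescript
// vite.config.ts: a minimal sketch, not copied from any of the examples.
import { defineConfig } from "vite";
import { viteSingleFile } from "vite-plugin-singlefile";

export default defineConfig({
  // Inlines the built JS and CSS into the HTML entry so the resource is one file.
  plugins: [viteSingleFile()],
});
```

The resulting single HTML file is what the server's resource handler would read from disk and return.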

Architecture Template

Server (server.ts):
  - registerAppTool(server, "my-tool", { _meta: { ui: { resourceUri } } }, handler)
  - registerAppTool(server, "my-app-action", { _meta: { ui: { visibility: ["app"] } } }, handler)  // optional
  - registerAppResource(server, resourceUri, ..., returns bundled HTML)

Client (src/mcp-app.ts or .tsx):
  - const app = new App({ name, version })
  - app.ontoolinput = (params) => { /* show loading/preview */ }
  - app.ontoolresult = (result) => { /* render UI with result.structuredContent */ }
  - app.connect()
  - Event listeners for user interactions
  - app.callServerTool() when fresh server data needed
  - app.updateModelContext() when model needs to know about user actions
