clawdbot-workspace/mcp-factory-reviews/boss-alexei-proposals.md
2026-02-04 23:01:37 -05:00

Boss Alexei — Final Review & Improvement Proposals

Reviewer: Alexei, MCP Protocol & Ecosystem Authority
Date: 2026-02-04
Scope: MCP-FACTORY.md + all 5 skill files
Verdict: Strong foundation, needs targeted updates for 2025-11-25 spec compliance and several cross-skill gaps


Pass 1 Notes (per skill)

1. MCP-FACTORY.md

Good:

  • Clean pipeline visualization (P1→P7)
  • Clear inputs/outputs/quality gates per phase
  • Agent role mapping with model recommendations (Opus vs Sonnet)
  • Parallel execution noted (Agents 2 & 3)
  • Current inventory tracking with priority guidance

Issues Found:

  • Phase count mismatch: The pipeline diagram lists 7 phases (P1-P7), the factory doc text describes 6 phases with P7 = Ship, and the skills individually say "Phase X of 5." Needs alignment.
  • No mention of new 2025-11-25 spec features: Tasks (async operations), URL mode elicitation, server icons, OAuth Client ID Metadata — these are all in the current spec but absent from the pipeline.
  • No MCP Registry awareness: The MCP Registry launched preview Sep 2025 and is heading to GA. The pipeline should include server registration as a step.
  • Missing post-ship lifecycle: No guidance on monitoring deployed servers, handling API changes, or re-running QA when APIs evolve.
  • Missing version control strategy: No git branching or versioning strategy for the pipeline artifacts themselves.
  • 30 "untested" servers: No prioritization criteria beyond "test against live APIs." Should rank by: business value, credential availability, API stability.

2. mcp-api-analyzer/SKILL.md

Good:

  • Extremely thorough API reading methodology (priority-ordered reading list)
  • Excellent pagination pattern catalog (8 types — best I've seen)
  • API style detection table (REST, GraphQL, SOAP, gRPC, WebSocket)
  • 6-part description formula is excellent
  • Token budget awareness with concrete targets
  • Tool count optimization table
  • Disambiguation tables per group
  • Content annotations planning (audience + priority)
  • Elicitation candidates section
  • Semantic clustering verb prefixes

Issues Found:

  • Pipeline position says "Phase 1 of 5" but MCP-FACTORY.md shows 7 phases
  • Missing: Tasks/async analysis — The 2025-11-25 spec adds experimental Tasks (async operations with polling). The analyzer should identify which tools are candidates for async execution (long-running reports, bulk exports, data migrations).
  • Missing: Icon planning — The 2025-11-25 spec allows icons on tools, resources, prompts. Analysis should note icon candidates.
  • Missing: Server identity / registry metadata — Should note if the service has official branding, logos, and metadata for MCP Registry listing.
  • Section numbering jumps — Goes 1→2→3→3b→4→5→6→6b→7→7b→8→9→10. The template (Section 4) uses sequential numbers but then sections 5-10 follow outside. Confusing.
  • Content annotations placement is ambiguous — Content annotations (audience, priority) go on content blocks in tool results, not on tool definitions. The way they're listed alongside tool definitions in the inventory could confuse builders.
  • The Calendly example uses collection as the data key and next_page_token for pagination, which differs from the standard data/meta envelope documented in the template.
  • No guidance on beta/preview endpoints or incomplete documentation handling.

3. mcp-server-builder/SKILL.md

Good:

  • Comprehensive template variable reference with verification step
  • All 4 auth patterns (API key, OAuth2 client credentials, Basic, multi-tenant)
  • Circuit breaker implementation with proper state machine
  • Pluggable pagination (5 strategies)
  • Health check tool always included — excellent practice
  • Structured JSON logging on stderr
  • Both transports (stdio + Streamable HTTP)
  • One-file pattern for ≤15 tools
  • Error classification (protocol vs tool execution) — matches spec exactly
  • Token budget targets are realistic
  • outputSchema with JSON Schema 2020-12 guidance
  • structuredContent dual-return pattern
  • resource_link in GET single-entity results

Issues Found:

  • SDK version should be ^1.26.0: v1.26.0 was released Feb 4, 2026 and fixes a security vulnerability (GHSA-345p-7cg4-v4c7: sharing server/transport instances can leak cross-client response data). The skills pin ^1.25.0 which would receive this as a compatible update, but explicitly recommending ^1.26.0 is safer.
  • SDK v2 migration warning needed: The TypeScript SDK v2 is in pre-alpha with stable release expected Q1 2026. Skills should note this and recommend pinning v1.x for now.
  • Zod version compatibility: Known issues between Zod v4.x and MCP SDK v1.x (issue #1429). The skill pins ^3.25.0 — this is correct for v1.x but needs a warning about not upgrading to Zod v4 until SDK v2.
  • Missing: Tasks capability — The 2025-11-25 spec adds experimental tasks support (SEP-1686). For long-running tool calls, servers can declare tasks.requests.tools.call and tools can set execution.taskSupport. This is absent from the builder.
  • Missing: Server icons — 2025-11-25 adds icons to tools, resources, prompts, resource templates. The skill mentions icons in section 7 but only as "optional." Should provide concrete guidance on when/how to include them.
  • Missing: URL mode elicitation — 2025-11-25 adds URL mode for elicitation, allowing servers to direct users to external URLs. Useful for OAuth flows and external confirmations.
  • Missing: OAuth Client ID Metadata — New recommended client registration mechanism (SEP-991). Relevant for the OAuth2 auth patterns.
  • ToolDefinition type in types.ts doesn't list title as a required field — but the skill says it's required per spec. The type should enforce this.
  • HTTP transport session management is simplistic — no cleanup of stale sessions, no TTL. Should add session expiry logic.
  • crypto.randomUUID() in HTTP transport — the crypto module isn't imported (global crypto works in Node 18+ but should be explicit).
  • Capabilities declaration includes resources: {} and prompts: {} but no resources or prompts are implemented. Should either implement or remove to avoid misleading clients.
  • Env var placeholder {SERVICE}_API_KEY in the one-file pattern won't work as-is in TypeScript — needs process.env['{SERVICE}_API_KEY'] syntax.
  • Pagination: cursor strategy page parameter — The cursor pagination falls back to a page parameter which doesn't make sense for cursor-based pagination.
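
For the session-expiry point in the list above, a minimal sketch of idle-TTL cleanup could look like the following. The class and method names (SessionStore, touch, sweep) are hypothetical and are not part of the MCP SDK; the clock is injected only so the behavior can be checked deterministically.

```typescript
// Hypothetical in-memory session registry with idle expiry.
// Nothing here is an SDK API; it only illustrates the TTL idea.
interface SessionEntry<T> {
  value: T;
  lastSeen: number;
}

class SessionStore<T> {
  private readonly entries = new Map<string, SessionEntry<T>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(id: string, value: T): void {
    this.entries.set(id, { value, lastSeen: this.now() });
  }

  // Returns the session and refreshes its idle timer,
  // or undefined if the session is missing or already expired.
  touch(id: string): T | undefined {
    const entry = this.entries.get(id);
    if (!entry) return undefined;
    if (this.now() - entry.lastSeen > this.ttlMs) {
      this.entries.delete(id);
      return undefined;
    }
    entry.lastSeen = this.now();
    return entry.value;
  }

  // Removes every expired session and returns the removed ids,
  // so the caller can close the matching transports.
  sweep(): string[] {
    const removed: string[] = [];
    const cutoff = this.now() - this.ttlMs;
    this.entries.forEach((entry, id) => {
      if (entry.lastSeen < cutoff) {
        this.entries.delete(id);
        removed.push(id);
      }
    });
    return removed;
  }

  get size(): number {
    return this.entries.size;
  }
}
```

In the HTTP transport, a periodic timer could call sweep() and close the transport for each returned id; the interval and TTL values would need tuning per deployment.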

4. mcp-app-designer/SKILL.md

Good:

  • Comprehensive design system with WCAG AA compliance and verified contrast ratios
  • 9 app type templates including Interactive Data Grid
  • Data visualization primitives (SVG line/area, donut, sparklines, progress bars, horizontal bars) — all pure CSS/SVG
  • Bidirectional communication patterns (refresh, navigate, tool_call)
  • Error boundary with window.onerror
  • Three required states (loading/empty/data) with type-specific empty states
  • Data validation utility (validateData())
  • Exponential backoff polling with visibility change handling
  • prefers-reduced-motion support
  • Accessibility (sr-only, focus management, ARIA roles/labels)
  • Micro-interactions (staggered rows, count animation, cross-fade)

Issues Found:

  • postMessage origin not validated — The template accepts messages from any origin ('*'). This is flagged in QA but should be fixed at the source in the template itself.
  • escapeHtml() creates a DOM element every time — Inefficient for large datasets. Should use a regex-based approach for performance.
  • APP_ID placeholder '{app-id}' has no reminder in the execution workflow to replace it.
  • Interactive Data Grid search has a logic bug: handleSearch calls handleSort then immediately toggles the direction back — this is fragile and will break if sort logic changes.
  • No file size budget in the designer skill — The 50KB limit is in the QA skill but not mentioned in the designer skill. Builders won't know until QA.
  • No virtualization for large datasets — At 100+ rows, rendering becomes slow. Should recommend virtual scrolling or pagination for grid apps.
  • Form/wizard template has no submit handler — It renders the form but doesn't actually submit data back to the host. Needs sendToHost('tool_call', { tool: 'create_*', args: formData }).
  • Missing: Print styles — No @media print rules.
  • Missing: i18n/localization guidance — Date/number formatting is hardcoded to en-US.
  • Missing: How apps handle structuredContent directly — The data flow section explains the APP_DATA bridge but doesn't address future direct structuredContent consumption.
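  • The donut chart helper looked wrong on first read: offset -= seg.percent appeared to need offset += seg.percent (offset moves clockwise). Retracted in Pass 2 (Technical Accuracy, item 6): decreasing stroke-dashoffset already moves the dash start clockwise, so the original code is correct.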

5. mcp-localbosses-integrator/SKILL.md

Good:

  • Extremely detailed file-by-file integration guide
  • Complete Calendly example walkthrough
  • APP_DATA failure modes with robust parser pattern
  • System prompt engineering guidelines with token budgets
  • Thread lifecycle documentation
  • Thread state management with localStorage concerns and cleanup pattern
  • Three rollback strategies (git, feature-flag, manifest)
  • Integration validation script (cross-reference all 4 files)
  • Few-shot examples in system prompts
  • Notes on MCP Elicitation, Prompts, Roots futures
  • Intake question quality criteria with good/bad examples

Issues Found:

  • APP_DATA is fragile — The entire data flow depends on the LLM correctly generating JSON within HTML comment markers. The failure modes section acknowledges this but the architecture is inherently lossy.
  • structuredContent → APP_DATA bridge section is truncated — The file was cut off at the end. The roadmap section is incomplete.
  • Validation script assumes ts-node — Not always installed. Should provide a compiled JS alternative.
  • Editing 4 shared files doesn't scale — Each new service touches channels.ts, appNames.ts, app-intakes.ts, route.ts. With 30+ services, merge conflicts are inevitable. The manifest-based approach (Strategy 3) should be prioritized.
  • No mention of MCP server lifecycle — What happens when the MCP server crashes mid-conversation? How does the chat route handle tool call failures?
  • Missing: Multiple MCP servers per channel — Some channels might need tools from 2+ servers. No guidance on this.
  • Feature-flag rollback uses enabled property but this isn't in the channel interface definition. Would cause a TypeScript error.
  • System prompt token budgets are reasonable but not verified — no script to actually count tokens.
  • Missing: How to test locally before deploying to production.

6. mcp-qa-tester/SKILL.md

Good:

  • Comprehensive 6-layer architecture (actually 9 sub-layers: 0, 1, 2, 2.5, 3, 3.5, 4, 4.5, 5)
  • Quantitative metrics with specific, measurable targets
  • MCP Inspector integration (Layer 0)
  • Protocol compliance test script with initialize → tools/list → tools/call lifecycle
  • structuredContent validation against outputSchema using Ajv
  • Playwright visual tests with all 3 states
  • BackstopJS regression testing
  • axe-core accessibility auditing with scoring
  • Color contrast audit script
  • VoiceOver testing procedure
  • MSW for API mocking in unit tests
  • Tool routing smoke tests with fixture files
  • APP_DATA schema validator
  • Performance benchmarks (cold start, latency, memory, file size)
  • Security testing (XSS payloads, CSP, key exposure, postMessage origin)
  • Chaos testing (API 500s, wrong data format, huge datasets, rapid-fire)
  • Credential management strategy with categories
  • Fixture library with edge cases, adversarial data, and scale generator
  • Automated QA shell script
  • Report template with trend tracking

Issues Found:

  • Protocol test subprocess handling (non-issue on review): The test sends raw JSON lines, and stdio MCP uses newline-delimited JSON-RPC. The readline approach works as long as the server outputs one JSON-RPC message per line, which is standard, so my initial concern was wrong.
  • Layer 3.1 tests fetch directly rather than tool handlers — The MSW tests call the mock API endpoints, not the actual tool handler code. Should import and test the real handlers.
  • Cold start benchmark sends an initialize message on stdin but then head -1 reads the first line — this should work but timing via date commands is imprecise. Should use performance.now() inside Node.
  • Missing: Tasks protocol testing — No tests for the new tasks capability (async operations).
  • Missing: Elicitation testing — No tests for elicitation/create flows.
  • Missing: CI/CD integration guidance — The test suite is designed to run manually. No GitHub Actions / CI pipeline template.
  • Missing: Load testing for HTTP transport (concurrent connections, session management).
  • Missing: Test coverage requirements — No minimum coverage thresholds.
  • BackstopJS requires global install (npm install -g backstopjs) which isn't in the setup section.
  • Ajv dependency (non-issue on review): The structuredContent test imports Ajv, and the ajv package is included in the dependency installation in Section "Adding Tests" (npm install -D ... ajv ...). That's fine.

Pass 2 Notes (what I missed first time, contradictions found)

Cross-Skill Contradictions

  1. Phase numbering inconsistency:

    • MCP-FACTORY.md: "Phase 1-7" (7 phases)
    • mcp-api-analyzer: "Phase 1 of 5"
    • mcp-server-builder: "Phase 2 of 5"
    • mcp-app-designer: "Phase 3 of 5"
    • mcp-localbosses-integrator: "Phase 4 of 5"
    • mcp-qa-tester: Doesn't state a phase number
    • Fix: Standardize to "Phase X of 6" (Analysis, Build, Design, Integrate, Test, Ship) or explicitly document that Phases 6 & 7 in the factory doc are embedded.
  2. SDK version pinning:

    • Server builder: "@modelcontextprotocol/sdk": "^1.25.0"
    • QA tester: References ^1.25.0 in quality gates
    • Reality: v1.26.0 is latest (released same day as this review) with a security fix. And SDK v2 is coming Q1 2026.
    • Fix: Update to ^1.26.0, add migration warning for v2.
  3. Zod version:

    • Server builder: "zod": "^3.25.0"
    • QA tester: Validates Zod at ^3.25.0
    • Reality: Known Zod v4 incompatibility with MCP SDK v1.x (issue #1429). The ^3.25.0 pin is correct and will not pull in Zod v4, but nothing stops someone from upgrading manually. Need explicit warning.
    • Fix: Add note: "Do NOT use Zod v4.x with MCP SDK v1.x — known incompatibility."
  4. Tool definition title field:

    • Analyzer: Includes title in tool inventory template (Section 6)
    • Builder: Says title is REQUIRED (Section 7), but the ToolDefinition type in types.ts doesn't mark it required
    • Fix: Update ToolDefinition type to make title non-optional.
  5. Content annotations location:

    • Analyzer (Section 6b): Plans audience and priority per tool type
    • Builder: Never implements content annotations on tool results
    • Gap: The analyzer plans them but the builder never uses them. Content annotations go on content blocks inside tool results, e.g., { type: "text", text: "...", annotations: { audience: ["user"], priority: 0.7 } }. The builder's tool handlers don't include these.
    • Fix: Add content annotations to the builder's tool handler template.
  6. App data shape expectations:

    • Analyzer (Section 7): Defines app candidates with data source tools
    • Designer: Each app type expects a specific data shape (documented per template)
    • Builder: Tool handlers return structuredContent with whatever shape the API returns
    • Integrator: System prompts tell the AI to generate APP_DATA matching the app's expected shape
    • Gap: There's no formal contract between the builder's outputSchema and the designer's expected data shape. The bridge is the LLM in the integrator's system prompt, which is lossy.
    • Fix: Add a "Data Contract" section where the analyzer explicitly maps tool output schemas to app input schemas. The integrator's system prompt should reference these contracts.
  7. App file location:

    • Factory: Says {service}-mcp/app-ui/ or {service}-mcp/ui/
    • Builder: Creates app-ui/ directory
    • Designer: Says output goes to {service}-mcp/app-ui/
    • Integrator: Route.ts checks {dir}/filename.html in APP_DIRS
    • Minor inconsistency: Factory mentions ui/ as alternative but designer only uses app-ui/.
    • Fix: Standardize on app-ui/ everywhere.
  8. Capabilities declaration:

    • Builder: Declares capabilities: { tools, resources, prompts, logging }
    • Reality: No resources or prompts are implemented. Declaring empty capabilities is technically valid per spec (it says "the server supports this feature") but misleading if nothing is there.
    • Fix: Only declare tools and logging unless resources/prompts are actually implemented.

Handoff Gaps

  1. Analyzer → Builder handoff:

    • Analyzer outputs: {service}-api-analysis.md
    • Builder expects: Same file
    • Gap: The analyzer's elicitation candidates section has no corresponding implementation in the builder. The builder doesn't implement elicitation/create.
    • Gap: The analyzer's content annotations planning has no corresponding implementation in the builder's handlers.
    • Gap: The analyzer's outputSchema format in the tool inventory template uses a simplified format, but the builder needs full JSON Schema 2020-12.
  2. Builder → Designer handoff:

    • Builder outputs: Compiled server + tool definitions
    • Designer expects: Analysis doc (app candidates) + tool definitions
    • Gap: The designer uses the analysis doc's app candidates section, not the actual built server's tool definitions. If the builder modified tool names or schemas during implementation, the designer wouldn't know.
    • Fix: The designer should also read the built server's tool definitions as input validation.
  3. Designer → Integrator handoff:

    • Designer outputs: HTML files in app-ui/
    • Integrator expects: HTML files + analysis doc + server
    • Gap: The integrator's APP_DATA format tables (Section 7, "Required Fields Per App Type") define data shapes that must match what the designer's render() functions expect. But these are defined in two different places — the designer has expected data shapes per template, and the integrator has required APP_DATA fields per type. They're not cross-referenced.
    • Fix: Create a single "Data Shape Contract" document that both reference.
  4. Integrator → QA handoff:

    • Integrator outputs: Wired LocalBosses channel
    • QA expects: Integrated channel for testing
    • Gap: The QA skill has a tool routing smoke test that needs test-fixtures/tool-routing.json, but the integrator doesn't generate this file. Who creates it?
    • Fix: The integrator should generate a baseline tool-routing.json from the system prompt's tool routing rules.

Technical Accuracy of Code Examples

  1. Builder: process.env.{SERVICE}_API_KEY — This is not valid TypeScript. Needs bracket notation: process.env['{SERVICE}_API_KEY'] or the template variable should be replaced before build.

  2. Builder: HTTP transport crypto.randomUUID() — Works in Node 18+ via the global crypto, but for explicitness and to support older Node versions, should import: import { randomUUID } from 'crypto';

  3. Builder: StreamableHTTPServerTransport constructor — Uses sessionIdGenerator parameter. Verified this is correct per SDK v1.25.x API.

  4. Designer: escapeHtml function — Creates a temporary DOM element per call. For a grid with 1000 cells, that's 6000+ DOM element creations. Should use a string-replacement approach:

    function escapeHtml(text) {
      if (text == null) return ''; // only null/undefined become empty; 0 and false still render
      return String(text)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
    }
    
  5. Designer: Interactive Data Grid handleSearch — Calls handleSort twice to re-apply current sort after filtering. This toggles direction twice, which works but is fragile. Better approach: extract sort logic into a separate applySort() function.

  6. Designer: Donut chart helper: In Pass 1 I flagged offset -= seg.percent as moving counter-clockwise and proposed offset += seg.percent. Reviewing SVG stroke-dashoffset semantics: decreasing the offset moves the dash start forward (clockwise). So offset -= seg.percent is correct, and I retract this note.

  7. QA: Protocol test readline interface — Uses this.proc.stdout! with readline. MCP stdio transport uses newline-delimited JSON-RPC, so readline by line is correct.

  8. QA: Cold start benchmark: echo '...' | timeout 10 node dist/index.js | head -1 sends initialize and then closes stdin without waiting for the response, so the server might not respond before stdin closes. A more robust approach would use a Node script with proper bidirectional communication.
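
For item 5 above (Interactive Data Grid), one way to separate sorting from toggling is sketched below. It assumes a simple state object and row shape; GridState, Row, applyFilter, and visibleRows are hypothetical names, and only handleSort, handleSearch, and applySort come from the notes.

```typescript
type SortDir = 'asc' | 'desc';
type Row = Record<string, string | number>;

interface GridState {
  query: string;
  sortKey: string | null;
  sortDir: SortDir;
}

function compareValues(x: string | number, y: string | number): number {
  if (typeof x === 'number' && typeof y === 'number') return x - y;
  return String(x).localeCompare(String(y));
}

// Pure: sorts a copy using the current state and never toggles direction.
function applySort(rows: readonly Row[], state: GridState): Row[] {
  const sorted = [...rows];
  const key = state.sortKey;
  if (key === null) return sorted;
  const sign = state.sortDir === 'asc' ? 1 : -1;
  return sorted.sort((a, b) => sign * compareValues(a[key] ?? '', b[key] ?? ''));
}

// Pure: keeps rows where any cell contains the query (case-insensitive).
function applyFilter(rows: readonly Row[], query: string): Row[] {
  const q = query.trim().toLowerCase();
  if (q === '') return [...rows];
  return rows.filter((row) =>
    Object.values(row).some((v) => String(v).toLowerCase().includes(q)),
  );
}

// The only place that toggles direction: a header click.
function handleSort(state: GridState, key: string): GridState {
  const sortDir: SortDir =
    state.sortKey === key && state.sortDir === 'asc' ? 'desc' : 'asc';
  return { ...state, sortKey: key, sortDir };
}

// Search updates the query and leaves the sort untouched.
function handleSearch(state: GridState, query: string): GridState {
  return { ...state, query };
}

function visibleRows(all: readonly Row[], state: GridState): Row[] {
  return applySort(applyFilter(all, state.query), state);
}
```

With this split, handleSearch no longer needs to call handleSort at all, so a later change to the toggle logic cannot silently flip the direction during a search.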


Research Findings (latest updates we need to incorporate)

1. SDK Version: v1.26.0 (Released Feb 4, 2026)

What changed:

  • Security fix: GHSA-345p-7cg4-v4c7 — "Sharing server/transport instances can leak cross-client response data"
  • Client Credentials OAuth scopes support fix
  • Dependency vulnerability fixes

Action: Update all SDK version references from ^1.25.0 to ^1.26.0.

2. SDK v2 (Pre-Alpha, Stable Q1 2026)

The TypeScript SDK main branch is v2 (pre-alpha). Stable v2 expected Q1 2026. Key implications:

  • v1.x will receive bug fixes and security updates for 6+ months after v2 ships
  • Servers built now on v1.x will need a migration path
  • v2 likely has breaking API changes

Action: Add a "Future-Proofing" section to the builder skill warning about v2 and recommending pinning v1.x.

3. 2025-11-25 Spec — Features Missing from Skills

| Feature | Spec Section | Impact | Priority |
|---------|--------------|--------|----------|
| Tasks (experimental) | SEP-1686 | Long-running ops can return immediately with task ID, client polls for result | HIGH: our skills don't mention async at all |
| URL Mode Elicitation | SEP-1036 | Servers direct users to external URLs (OAuth, payment confirmations) | MEDIUM: useful for OAuth flows |
| Server/Tool Icons | SEP-973 | icons array on tools, resources, prompts, resource templates | LOW: cosmetic but improves UX |
| Tool Names Guidance | SEP-986 | Official spec guidance on tool naming conventions | LOW: our naming is already good |
| Tool Calling in Sampling | SEP-1577 | tools and toolChoice params in sampling/createMessage | LOW: not relevant for our server-side |
| OAuth Client ID Metadata | SEP-991 | Recommended client registration without DCR | MEDIUM: simplifies OAuth |
| OpenID Connect Discovery | PR #797 | Enhanced auth server discovery | MEDIUM: OAuth flows |
| Incremental Scope Consent | SEP-835 | WWW-Authenticate for incremental OAuth scopes | LOW: edge case |
| Elicitation Enhancements | SEP-1034, 1330 | Default values, titled enums, multi-select | MEDIUM: makes elicitation more powerful |
| JSON Schema 2020-12 Default | SEP-1613 | Official dialect for MCP schemas | Already covered |
| Input Validation = Tool Error | SEP-1303 | Clarified in spec | Already covered |

4. MCP Registry (Preview, Sep 2025)

The MCP Registry is an open catalog and API for server discovery. Launched preview Sep 2025.

  • Public and private sub-registries
  • Native API for clients to discover servers
  • Server identity via .well-known URLs planned for future

Action: Add a Phase 6.5 or post-ship step: "Register server in MCP Registry."

5. Zod v4 Incompatibility

MCP SDK v1.x is incompatible with Zod v4.x (issue #1429). The error is w._parse is not a function.

  • Our skills correctly pin ^3.25.0 which stays on Zod v3.x
  • But if someone manually installs Zod v4, it breaks

Action: Add explicit warning in builder skill.
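
To make the pinning claims concrete, here is a simplified sketch of caret-range matching. It is not the npm semver implementation: it only handles plain MAJOR.MINOR.PATCH versions with major ≥ 1 and ignores pre-release tags. The function names are hypothetical.

```typescript
// Simplified caret-range check. Real semver has extra rules for 0.x
// versions and pre-release tags, which this sketch deliberately rejects.
function parseVersion(v: string): [number, number, number] {
  const m = /^\^?(\d+)\.(\d+)\.(\d+)$/.exec(v);
  if (!m) throw new Error(`Unsupported version string: ${v}`);
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// True when `version` falls inside ^range:
// same major, and minor.patch at or above the range's minor.patch.
function satisfiesCaret(version: string, range: string): boolean {
  const [maj, min, pat] = parseVersion(version);
  const [rMaj, rMin, rPat] = parseVersion(range);
  if (rMaj === 0) throw new Error('0.x ranges are outside this sketch');
  if (maj !== rMaj) return false;
  if (min !== rMin) return min > rMin;
  return pat >= rPat;
}
```

Under these rules, satisfiesCaret('1.26.0', '^1.25.0') is true (the SDK security fix arrives as a compatible update) and satisfiesCaret('4.0.0', '^3.25.0') is false (the Zod pin never resolves to v4 on its own), which is why the remaining risk is a manual upgrade.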


Proposed Improvements (specific, actionable)

P0 — Critical (do before next build)

1. Update SDK Version Pin

File: mcp-server-builder/SKILL.md (Section 3, package.json template)

// BEFORE
"@modelcontextprotocol/sdk": "^1.25.0",

// AFTER
"@modelcontextprotocol/sdk": "^1.26.0",

Add note after package.json:

> **Security Note (Feb 2026):** v1.26.0 fixes GHSA-345p-7cg4-v4c7 (cross-client data leak 
> in shared transport instances). Always use ≥1.26.0.
>
> **SDK v2 Warning:** The TypeScript SDK v2 is in pre-alpha (stable expected Q1 2026). 
> Pin to v1.x for production. v1.x will receive bug fixes for 6+ months after v2 ships.
> Do NOT use Zod v4.x with SDK v1.x — known incompatibility (issue #1429).

Also update QA tester references.

2. Fix ToolDefinition Type to Require title

File: mcp-server-builder/SKILL.md (Section 4.1, types.ts)

// BEFORE
export interface ToolDefinition {
  name: string;
  title: string;  // exists but not enforced differently from other fields

// AFTER — add JSDoc to clarify requirement
export interface ToolDefinition {
  /** Machine-readable name (snake_case). REQUIRED. */
  name: string;
  /** Human-readable display name. REQUIRED per 2025-11-25 spec. */
  title: string;

The type already has title: string (non-optional), so it IS required at the type level. But the outputSchema is optional in the type (outputSchema?: ...). Per the skill's own Section 7, outputSchema is "REQUIRED (2025-06-18+)". Fix:

// Make outputSchema required in the type:
outputSchema: Record<string, unknown>;  // Remove the ?

3. Add Content Annotations to Builder Tool Handlers

File: mcp-server-builder/SKILL.md (Section 4.6, tool group template)

The analyzer plans content annotations per tool type, but the builder never implements them. Add to the handler return pattern:

// In list handler:
return {
  content: [
    {
      type: "text",
      text: JSON.stringify(result, null, 2),
      annotations: { audience: ["user", "assistant"], priority: 0.7 },
    },
  ],
  structuredContent: result,
};

// In get handler:
return {
  content: [
    {
      type: "text",
      text: JSON.stringify(result, null, 2),
      annotations: { audience: ["user"], priority: 0.8 },
    },
    {
      type: "resource_link",
      uri: `{service}://contacts/${contact_id}`,
      name: `Contact ${contact_id}`,
      mimeType: "application/json",
    },
  ],
  structuredContent: result,
};

// In delete handler:
return {
  content: [
    {
      type: "text",
      text: JSON.stringify(result, null, 2),
      annotations: { audience: ["user"], priority: 1.0 },
    },
  ],
  structuredContent: result,
};
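
Since the three handlers above differ only in audience and priority, the text-only case could be factored into a small helper. This is a sketch under local type definitions; annotatedResult, TextBlock, and ToolResult are hypothetical names declared here, not SDK exports.

```typescript
type Audience = 'user' | 'assistant';

interface TextBlock {
  type: 'text';
  text: string;
  annotations: { audience: Audience[]; priority: number };
}

interface ToolResult<T> {
  content: TextBlock[];
  structuredContent: T;
}

// Builds the dual-return shape (annotated text + structuredContent)
// used in the list and delete handlers above.
function annotatedResult<T>(
  result: T,
  audience: Audience[],
  priority: number,
): ToolResult<T> {
  if (priority < 0 || priority > 1) {
    throw new RangeError('priority must be between 0 and 1');
  }
  return {
    content: [
      {
        type: 'text',
        text: JSON.stringify(result, null, 2),
        annotations: { audience, priority },
      },
    ],
    structuredContent: result,
  };
}
```

A delete handler would then return annotatedResult(result, ['user'], 1.0). The get handler's extra resource_link block is not covered by this sketch and would still be appended separately.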

4. Fix escapeHtml in App Designer

File: mcp-app-designer/SKILL.md (Section 5, template script)

// BEFORE (DOM-based, slow for large datasets)
function escapeHtml(text) {
  if (!text) return '';
  const div = document.createElement('div');
  div.textContent = String(text);
  return div.innerHTML;
}

// AFTER (string-based, no per-call DOM allocation)
function escapeHtml(text) {
  if (text == null) return ''; // only null/undefined become empty; 0 and false still render
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

5. Fix Capabilities Declaration

File: mcp-server-builder/SKILL.md (Section 4.7, index.ts)

// BEFORE — declares resources and prompts but doesn't implement them
capabilities: {
  tools: { listChanged: false },
  resources: {},
  prompts: {},
  logging: {},
},

// AFTER — only declare what's implemented
capabilities: {
  tools: { listChanged: false },
  logging: {},
  // Add resources/prompts ONLY when the server actually implements them:
  // resources: { subscribe: false, listChanged: false },
  // prompts: { listChanged: false },
},
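
Rather than relying on someone to uncomment those lines, the declaration could be derived from what the server actually registers. A minimal sketch, assuming the builder can count its registered resources and prompts; buildCapabilities and ImplementedFeatures are hypothetical names, and the Capabilities type below is a local stand-in, not the SDK type.

```typescript
interface ImplementedFeatures {
  resourceCount: number;
  promptCount: number;
}

interface Capabilities {
  tools: { listChanged: boolean };
  logging: Record<string, never>;
  resources?: { subscribe: boolean; listChanged: boolean };
  prompts?: { listChanged: boolean };
}

// Declares resources/prompts only when at least one is implemented.
function buildCapabilities(features: ImplementedFeatures): Capabilities {
  const caps: Capabilities = {
    tools: { listChanged: false },
    logging: {},
  };
  if (features.resourceCount > 0) {
    caps.resources = { subscribe: false, listChanged: false };
  }
  if (features.promptCount > 0) {
    caps.prompts = { listChanged: false };
  }
  return caps;
}
```

This keeps the declared capabilities and the implementation from drifting apart as resources or prompts are added later.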

P1 — Important (do in next cycle)

6. Add Tasks (Async Operations) Support

File: mcp-api-analyzer/SKILL.md — add Section 7c: "Task Candidates"
File: mcp-server-builder/SKILL.md — add Section X: "Async Tasks"

In the analyzer, add:

## 7c. Task Candidates (Async Operations)

Identify tools where the operation may take >10 seconds and should be executed 
asynchronously using MCP Tasks (spec 2025-11-25, experimental).

### When to flag a tool for async/task support:
- **Report generation** — compiling analytics, PDFs, exports
- **Bulk operations** — updating 100+ records, mass imports
- **External processing** — waiting on third-party webhooks, payment processing
- **Data migration** — moving large datasets between systems

### Task Candidate Template:

| Tool | Typical Duration | Task Support | Polling Interval |
|------|-----------------|-------------|-----------------|
| `export_report` | 30-120s | required | 5000ms |
| `bulk_update` | 10-60s | optional | 3000ms |
| `generate_invoice_pdf` | 5-15s | optional | 2000ms |

In the builder, add task-enabled tool pattern:

// Tool definition with task support
{
  name: "export_report",
  title: "Export Report",
  description: "...",
  inputSchema: { ... },
  outputSchema: { ... },
  annotations: { readOnlyHint: true, ... },
  execution: {
    taskSupport: "optional",  // "required" | "optional" | "forbidden"
  },
}

// In capabilities:
capabilities: {
  tools: { listChanged: false },
  tasks: {
    list: {},
    cancel: {},
    requests: { tools: { call: {} } },
  },
}
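
The "return immediately, poll later" flow behind that declaration can be illustrated with a toy in-memory store. This is not the SDK's task API: TaskStore, TaskRecord, the status strings, and pollIntervalMs are all placeholder names chosen for the sketch, and the real field names should be taken from the 2025-11-25 spec.

```typescript
type TaskStatus = 'working' | 'completed' | 'failed';

interface TaskRecord<T> {
  taskId: string;
  status: TaskStatus;
  pollIntervalMs: number;
  result?: T;
  error?: string;
}

class TaskStore<T> {
  private readonly tasks = new Map<string, TaskRecord<T>>();
  private nextId = 1;

  // Called by the tool handler: registers the task and returns at once.
  create(pollIntervalMs: number): TaskRecord<T> {
    const record: TaskRecord<T> = {
      taskId: `task-${this.nextId++}`,
      status: 'working',
      pollIntervalMs,
    };
    this.tasks.set(record.taskId, record);
    return { ...record };
  }

  // Called by the background work when it finishes.
  complete(taskId: string, result: T): void {
    const record = this.requireWorking(taskId);
    record.status = 'completed';
    record.result = result;
  }

  fail(taskId: string, error: string): void {
    const record = this.requireWorking(taskId);
    record.status = 'failed';
    record.error = error;
  }

  // Called on each client poll; returns a copy so callers cannot mutate state.
  get(taskId: string): TaskRecord<T> | undefined {
    const record = this.tasks.get(taskId);
    return record ? { ...record } : undefined;
  }

  private requireWorking(taskId: string): TaskRecord<T> {
    const record = this.tasks.get(taskId);
    if (!record) throw new Error(`Unknown task: ${taskId}`);
    if (record.status !== 'working') {
      throw new Error(`Task already finished: ${taskId}`);
    }
    return record;
  }
}
```

For export_report, the handler would call create(5000), kick off the export in the background, and return the task id; the polling interval matches the 5000ms in the Task Candidate Template.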

7. Add Form Submit Handler to App Designer

File: mcp-app-designer/SKILL.md (Section 6.4, Form/Wizard template)

The form template renders fields but has no submit action. Add:

// Add submit button to form HTML:
`<button class="btn-primary" onclick="submitForm()" style="width:100%;margin-top:16px">
  Create ${escapeHtml(title)}
</button>`

// Add submit handler:
function submitForm() {
  const form = document.getElementById('appForm');
  const formData = {};
  const fields = form.querySelectorAll('input, select, textarea');
  fields.forEach(field => {
    if (field.name) formData[field.name] = field.value;
  });
  
  // Validate required fields
  const missing = [...fields].filter(f => f.required && !f.value);
  if (missing.length > 0) {
    missing[0].focus();
    missing[0].style.borderColor = '#f04747';
    return;
  }
  
  // Send to host for tool execution
  sendToHost('tool_call', {
    tool: data.submitTool || 'create_' + APP_ID.split('-').pop(),
    args: formData
  });
  
  // Show confirmation
  showState('empty');
  document.querySelector('#empty .empty-state-icon').textContent = '✅';
  document.querySelector('#empty .empty-state-title').textContent = 'Submitted!';
  document.querySelector('#empty .empty-state-text').textContent = 'Your request has been sent.';
}

8. Add File Size Budget to App Designer

File: mcp-app-designer/SKILL.md (Section 10, Rules & Constraints)

Add to MUST list:

- [x] File size under 50KB per app (ideally under 30KB)

Add to Section 12 (Execution Workflow), step 2k:

   k. Check file size: `wc -c < app.html` should be under 51200 bytes
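
If the same budget should also be enforced from a script (for example in QA), a byte-accurate check might look like this sketch. It counts UTF-8 bytes, as wc -c does, rather than string length; checkAppSize and utf8ByteLength are hypothetical helper names, and "under" is read strictly.

```typescript
const MAX_APP_BYTES = 50 * 1024; // 51200, the budget from the QA skill

// Counts UTF-8 bytes per code point, so emoji and accented
// characters are weighed the same way `wc -c` weighs them.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0) ?? 0;
    bytes += cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
  }
  return bytes;
}

function checkAppSize(
  html: string,
  limit: number = MAX_APP_BYTES,
): { bytes: number; ok: boolean } {
  const bytes = utf8ByteLength(html);
  return { bytes, ok: bytes < limit };
}
```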

9. Standardize Phase Numbering

All files: Update phase references to be consistent.

Options:

  • A) 6 phases: Analyze (1), Build (2), Design (3), Integrate (4), Test (5), Ship (6)
  • B) 5 phases: Analyze (1), Build (2), Design (3), Integrate (4), Test (5) — ship is implicit

Recommendation: Option A. Update MCP-FACTORY.md pipeline to 6 phases and update each skill header.

10. Add postMessage Origin Validation

File: mcp-app-designer/SKILL.md (Section 5, template)

// BEFORE
window.addEventListener('message', (event) => {
  try {
    const msg = event.data;
    // ... process message

// AFTER
const TRUSTED_ORIGINS = [window.location.origin, 'http://localhost:3000', 'http://192.168.0.25:3000'];

window.addEventListener('message', (event) => {
  // Log (but do not reject) messages from origins outside the allowlist.
  // Messages from the parent frame are still accepted, because the iframe
  // may not know the parent's origin in advance.
  if (event.origin && !TRUSTED_ORIGINS.includes(event.origin)) {
    console.warn('[App] Message from unexpected origin:', event.origin);
  }
  try {
    const msg = event.data;
    // ... process message

Note: In the iframe context, messages from the parent are the primary use case. Full origin validation is tricky because the iframe may not know the parent's origin. A pragmatic approach is to validate message structure rather than origin.
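
Validating message structure could be done with a small type guard like the sketch below. The message shape ({ type, payload }) and the allowed type names are assumptions for illustration; the real list should come from the template's own message protocol.

```typescript
interface AppMessage {
  type: string;
  payload: Record<string, unknown>;
}

// Hypothetical allowlist; replace with the types the template really handles.
const ALLOWED_TYPES = new Set<string>(['data', 'refresh', 'navigate']);

// Accepts only plain objects with a known string type and an object payload.
function isAppMessage(value: unknown): value is AppMessage {
  if (typeof value !== 'object' || value === null) return false;
  const msg = value as Record<string, unknown>;
  const type = msg['type'];
  if (typeof type !== 'string' || !ALLOWED_TYPES.has(type)) return false;
  const payload = msg['payload'];
  return typeof payload === 'object' && payload !== null && !Array.isArray(payload);
}
```

Inside the listener, the first line of the try block would become a guard such as if (!isAppMessage(event.data)) return;, so malformed or unknown messages are dropped before any rendering code sees them.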

P2 — Nice to Have (future improvements)

11. Add MCP Registry Registration Step

File: MCP-FACTORY.md — add after Phase 6:

## Phase 6.5: Registry Registration (Optional)

Register the server in the MCP Registry for discoverability.
- Server metadata (name, description, icon, capabilities)
- Authentication requirements
- Tool catalog summary
- Registry API: https://registry.modelcontextprotocol.io

12. Add Data Shape Contract Section

Create a new concept: a shared contract between builder (outputSchema) and designer (expected data shape). Add to the analyzer skill as a new section after App Candidates:

## 7d. Data Shape Contracts

For each app, define the exact mapping from tool outputSchema to app render input:

| App | Source Tool | Tool OutputSchema Key Fields | App Expected Fields | Transform Notes |
|-----|------------|-----|-----|------|
| `svc-contact-grid` | `list_contacts` | `data[].{name,email,status}`, `meta.{total,page}` | `data[].{name,email,status}`, `meta.{total,page}` | Direct pass-through |
| `svc-dashboard` | `get_analytics` | `{revenue,contacts,deals}` | `metrics.{revenue,contacts,deals}`, `recent[]` | LLM restructures into metrics + recent |
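
A contract like this can also be checked mechanically. The sketch below assumes the path notation from the table (`a.b` for nesting, `name[]` for an array), expanded to one path per field; `missingFields` and `hasPath` are hypothetical helper names, not part of any skill.

```javascript
// Hypothetical contract checker using the table's notation, one path per field.
function hasPath(value, segments) {
  if (segments.length === 0) return value !== undefined;
  if (value === null || typeof value !== 'object') return false;
  const [head, ...rest] = segments;
  if (head.endsWith('[]')) {
    const arr = value[head.slice(0, -2)];
    // Every element must satisfy the remaining path; an empty array passes.
    return Array.isArray(arr) && arr.every((item) => hasPath(item, rest));
  }
  return hasPath(value[head], rest);
}

function missingFields(payload, paths) {
  // Return the contract paths that the payload does not satisfy.
  return paths.filter((path) => !hasPath(payload, path.split('.')));
}

// Raw `get_analytics` output checked against what `svc-dashboard` expects.
const dashboardContract = ['metrics.revenue', 'metrics.contacts', 'metrics.deals', 'recent[]'];
const gaps = missingFields({ revenue: 1, contacts: 2, deals: 3 }, dashboardContract);
```

Here `gaps` lists all four paths, which makes the "LLM restructures" row visible as a real transform requirement and not a pass-through.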

13. Add Virtual Scrolling Guidance for Large Grids

File: mcp-app-designer/SKILL.md — add note in Section 6.9 (Interactive Data Grid):

> **Performance Note:** For datasets over 100 rows, consider implementing virtual 
> scrolling. Render only visible rows + a buffer zone. Alternative: paginate client-side 
> (show 50 rows with prev/next controls, all data already loaded).
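
The "visible rows + buffer" calculation can be sketched as a pure function. This assumes a fixed row height, which the note does not specify; the function name and the default buffer of 5 rows are illustrative.

```javascript
// Hypothetical windowing helper, assuming every row has the same pixel height.
function visibleRange(scrollTop, viewportHeight, rowHeight, totalRows, buffer = 5) {
  // First row intersecting the top edge of the viewport.
  const first = Math.floor(scrollTop / rowHeight);
  // One past the last row intersecting the bottom edge.
  const lastExclusive = Math.ceil((scrollTop + viewportHeight) / rowHeight);
  const start = Math.max(0, first - buffer);
  const end = Math.min(totalRows, lastExclusive + buffer);
  // offsetY positions the rendered slice inside a full-height spacer element.
  return { start, end, offsetY: start * rowHeight };
}
```

On each scroll event the grid would render only `rows.slice(start, end)`, translated down by `offsetY`, inside a container whose height is `totalRows * rowHeight` so the scrollbar still reflects the full dataset.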

14. Improve QA Tool Routing Tests to Use Real Handlers

File: mcp-qa-tester/SKILL.md (Layer 3.1)

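The current MSW tests call fetch directly, so they exercise the mock server rather than the tool handlers. Better: invoke the real handlers and let MSW intercept their requests: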

// Import actual tool handlers
import { getTools } from '../src/tools/contacts.js';
import { APIClient } from '../src/client.js';

// Create client with mock API (MSW intercepts fetch)
const client = new APIClient('test-key');
const { handlers } = getTools(client);

test('list_contacts handler returns correct shape', async () => {
  const result = await handlers.list_contacts({ page: 1, pageSize: 25 });
  expect(result.content).toBeDefined();
  expect(result.structuredContent).toBeDefined();
  expect(result.structuredContent.data).toBeInstanceOf(Array);
});

15. Add CI Pipeline Template

File: mcp-qa-tester/SKILL.md — add new section:

# .github/workflows/mcp-qa.yml
name: MCP QA Pipeline
on: [push, pull_request]
jobs:
  qa:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '22' }
      - run: npm ci
      - run: npm run build
      - run: npx tsc --noEmit
      - run: npx jest --ci --coverage
      - run: npx playwright install --with-deps
      - run: npx playwright test
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: test-results
          path: test-results/

Cross-Skill Issues (contradictions, handoff gaps, inconsistencies)

Issue Matrix

| # | Issue | Skills Affected | Severity | Fix |
|---|-------|-----------------|----------|-----|
| 1 | Phase count mismatch (5 vs 7) | All | Low | Standardize numbering |
| 2 | SDK version ^1.25.0 vs ^1.26.0 (security) | Builder, QA | High | Update to ^1.26.0 |
| 3 | Content annotations planned but not built | Analyzer → Builder | Medium | Add to builder handlers |
| 4 | Data shape contract gap (tool output ≠ app input) | Analyzer → Designer → Integrator | High | Add data shape contracts |
| 5 | Capabilities declare resources/prompts but none exist | Builder | Medium | Only declare implemented |
| 6 | App file location inconsistency (app-ui/ vs ui/) | Factory, Builder, Designer | Low | Standardize app-ui/ |
| 7 | Tool routing fixtures not generated by integrator | Integrator → QA | Medium | Auto-generate from prompts |
| 8 | escapeHtml DOM-based (slow) in designer | Designer, QA | Medium | Switch to regex-based |
| 9 | No Tasks (async) support across pipeline | All | Medium | Add to analyzer + builder |
| 10 | No MCP Registry awareness | All | Low | Add registry step |
| 11 | Form template has no submit handler | Designer | Medium | Add submitForm() |
| 12 | postMessage origin not validated | Designer, QA | Medium | Add validation or structured checks |
| 13 | Env var {SERVICE}_API_KEY syntax invalid in TS | Builder | High | Use bracket notation in one-file pattern |
| 14 | structuredContent → APP_DATA bridge section truncated | Integrator | Low | Complete the section |
| 15 | Feature-flag rollback uses undeclared enabled property | Integrator | Low | Add to interface or use env var |
| 16 | No file size budget in designer skill | Designer | Medium | Add 50KB limit to rules |
| 17 | handleSearch sort workaround is fragile | Designer | Low | Extract applySort() function |
| 18 | Missing Zod v4 incompatibility warning | Builder | Medium | Add explicit warning |

Handoff Chain Integrity

Analyzer → Builder: 85% aligned
  ✅ Tool names, descriptions, schemas transfer well
  ❌ Elicitation candidates not implemented
  ❌ Content annotations planned but not built
  ❌ Task candidates not planned/implemented
  
Builder → Designer: 70% aligned
  ✅ HTML apps can render tool output
  ❌ No formal data shape contract
  ❌ Designer doesn't validate against built tool schemas
  ❌ structuredContent → APP_DATA bridge is lossy

Designer → Integrator: 90% aligned
  ✅ HTML files, APP_IDs, routing all documented
  ❌ Data shape expectations documented in two places
  ❌ Form submit handler missing

Integrator → QA: 80% aligned
  ✅ QA knows what to test
  ❌ Tool routing fixtures not auto-generated
  ❌ No Tasks/elicitation test coverage
  ❌ Protocol test could be more robust

Final Assessment

Overall Quality: 8.5/10 — This is genuinely impressive work. The skills are more comprehensive than most production MCP documentation I've seen. The pipeline concept is solid, the templates are battle-tested, and the attention to detail (WCAG compliance, error boundaries, circuit breakers, structured logging) is professional-grade.

Biggest Wins:

  1. The 6-part tool description formula with "when NOT to use" disambiguation
  2. The pluggable pagination strategies (5 types)
  3. The QA framework with quantitative metrics and 9 testing layers
  4. The circuit breaker + structured logging in every server
  5. The app designer's 9 template types with full accessibility

Biggest Gaps:

  1. No Tasks (async operations) support — this is in the current spec
  2. Content annotations planned but never implemented
  3. Data shape contracts between tools and apps don't exist
  4. SDK version needs security update
  5. The APP_DATA bridge architecture is inherently fragile (LLM as data serializer)

My recommendation: Fix P0 items immediately (SDK version, capabilities, escapeHtml, env var syntax). Schedule P1 items for the next iteration (Tasks, form submit, phase numbering, origin validation). P2 items can be done opportunistically.

These skills are 90% of the way to being the #1 MCP development process in the world. The remaining 10% is spec currency, cross-skill contracts, and the async operations story.

—Alexei