clawdbot-workspace/design-hitl-modal-collection.md
2026-02-06 23:01:30 -05:00

🏭 GooseFactory HITL Modal Collection

Interactive MCP App Modals for Human-in-the-Loop Input Collection

Design Document v1.0: 25 Bold, Experiential Modal Types
Built for MCP Apps: sandboxed HTML iframes rendered inside AI chat interfaces.
Operator: Jake | System: GooseFactory | Pipelines: 64+ MCP servers


Design Philosophy

The Three Laws of GooseFactory Modals:

  1. 3-Second Rule — The operator must know exactly what to do within 3 seconds of seeing the modal. No reading instructions. No figuring it out. Giant affordances, obvious actions.
  2. Dopamine Design — Every interaction should feel satisfying. Animations, sounds (optional), color transitions, micro-celebrations. The operator should want to review the next item.
  3. Data Density — Every tap, drag, swipe, and click captures nuanced signal that a boring 1-10 scale never could. We're training AI systems — the richer the signal, the smarter they get.

Technical Constraints (MCP App iframe):

  • Self-contained HTML/CSS/JS — no external dependencies (inline everything)
  • Communicates decisions back via window.parent.postMessage() or form submission
  • Responsive: works in chat panels ~400-600px wide
  • Dark mode first (chat interfaces are typically dark)
  • All animations via CSS transitions/keyframes (no heavy JS libs)
  • Touch-friendly targets: minimum 48px tap zones
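
As a rough illustration of the postMessage constraint above, the sketch below wraps a modal's result in a small envelope and posts it to the host. The `goosefactory:hitl` message type, the envelope field names, and the `buildResult` / `submitResult` helpers are assumptions made for this example; the document does not define a wire format.

```javascript
// Hypothetical envelope: the "goosefactory:hitl" type and the field names
// are illustrative assumptions, not a format defined by this document.
function buildResult(modalId, data, openedAtMs, nowMs = Date.now()) {
  return {
    type: "goosefactory:hitl",
    modal: modalId,
    data,
    response_time_ms: nowMs - openedAtMs,
    timestamp: new Date(nowMs).toISOString(),
  };
}

// Post the result to the host page. Inside the iframe `target` defaults to
// window.parent; a stub can be passed in tests. When the host origin is
// known, pass it as targetOrigin instead of the permissive "*".
function submitResult(result, targetOrigin = "*", target = window.parent) {
  target.postMessage(result, targetOrigin);
}
```

A modal would typically call `submitResult(buildResult("traffic-light", { decision: "pass" }, openedAt))` from its submit handler, where `openedAt` is the time the modal was first rendered.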

The Collection


1. The Traffic Light

When: Binary/ternary quality gates — "Does this MCP server's output pass basic requirements?" Perfect for initial screening of generated code, API responses, or config files.

Visual: Three massive circular lights stacked vertically against a matte black background, mimicking a real traffic light housing. Each circle is 120px diameter. Default state: all dim gray (#333). On hover, the light glows with its color bleeding outward like real LED diffusion.

  • 🟢 GREEN = "Ship it" — glows #00FF66 with a soft pulse animation
  • 🟡 YELLOW = "Needs polish" — glows #FFD700 with a slow blink
  • 🔴 RED = "Stop, rework" — glows #FF3333 with a hard static glow

Below the lights: a compact summary card showing what's being reviewed (pipeline name, deliverable type, 2-line preview). The traffic light housing has a subtle metallic gradient border (#444 to #222).

Flow:

  1. Modal appears with all three lights dim. Summary card shows the item under review.
  2. Operator clicks one light — it BLAZES on with a glow animation, the other two fade to near-invisible.
  3. If GREEN → a small "+feedback" text input slides up from the bottom (optional, 60-char max). Auto-submits after 5 seconds if no text entered.
  4. If YELLOW → a row of quick-tag pills appears: "Naming", "Logic", "Style", "Docs", "Tests", "Performance" — operator taps 1-3 to indicate what needs polish.
  5. If RED → same tag pills but colored red, plus a mandatory 1-line "What's wrong?" input.
  6. Submission: the chosen light does a final bright flash, then the modal smoothly collapses.

Data Collected:

  • decision: "pass" | "marginal" | "fail"
  • tags: string[] (what specifically needs work)
  • feedback: string (optional text)
  • response_time_ms: number (how quickly operator decided — fast = high confidence)
  • timestamp: ISO string
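
The flow rules and data fields above could be combined into one validation step before submission. This is a minimal sketch: `buildTrafficLightPayload` and its error strings are hypothetical names, and the rules encoded are the ones stated in the flow (60-character optional feedback on green, 1-3 tags on yellow, mandatory text on red).

```javascript
// Maps the clicked light to the documented `decision` values.
const DECISIONS = { green: "pass", yellow: "marginal", red: "fail" };

// Hypothetical helper: validates a Traffic Light submission against the
// flow rules and returns the payload alongside any rule violations.
function buildTrafficLightPayload(light, tags, feedback, openedAtMs, nowMs = Date.now()) {
  const errors = [];
  const text = (feedback || "").trim();
  if (!Object.prototype.hasOwnProperty.call(DECISIONS, light)) {
    errors.push("unknown light: " + light);
  }
  if (light === "green" && text.length > 60) {
    errors.push("feedback exceeds 60 characters");
  }
  if (light === "yellow" && (tags.length < 1 || tags.length > 3)) {
    errors.push("pick 1-3 tags");
  }
  if (light === "red" && text === "") {
    errors.push("feedback is required for a red decision");
  }
  return {
    ok: errors.length === 0,
    errors,
    payload: {
      decision: DECISIONS[light],
      tags: [...tags],
      feedback: text,
      response_time_ms: nowMs - openedAtMs,
      timestamp: new Date(nowMs).toISOString(),
    },
  };
}
```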

Why It's Special: The metaphor is universal: everyone knows traffic lights, so there is almost no cognitive load. The glow effect makes clicking feel powerful, like you're actually controlling a signal. Response time doubles as an implicit confidence signal: a fast click suggests the operator was certain, a slow click suggests doubt (a valuable training signal).

HTML/CSS Sketch:

<div class="traffic-light-housing">
  <div class="light green" data-decision="pass" onclick="select(this)">
    <span class="label">SHIP IT</span>
  </div>
  <div class="light yellow" data-decision="marginal" onclick="select(this)">
    <span class="label">POLISH</span>
  </div>
  <div class="light red" data-decision="fail" onclick="select(this)">
    <span class="label">REWORK</span>
  </div>
</div>
<div class="summary-card">
  <h3>{{pipeline_name}}</h3>
  <p class="preview">{{deliverable_preview}}</p>
</div>
<div class="tag-tray hidden">
  <button class="tag-pill">Naming</button>
  <button class="tag-pill">Logic</button>
  <button class="tag-pill">Style</button>
  <!-- ... -->
</div>
<style>
  .light { width: 120px; height: 120px; border-radius: 50%; background: #333;
           transition: all 0.3s ease; cursor: pointer; position: relative; }
  .light:hover { box-shadow: 0 0 40px var(--glow-color); }
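  .light.selected { background: var(--glow-color); box-shadow: 0 0 60px 20px var(--glow-color); }
  /* Per the visual spec: green pulses, yellow blinks slowly, red holds a static glow.
     The 2s blink period is a placeholder value. */
  .green.selected { animation: pulse 1.5s ease-in-out infinite; }
  .yellow.selected { animation: blink 2s ease-in-out infinite; }
  .green { --glow-color: #00FF66; }
  .yellow { --glow-color: #FFD700; }
  .red { --glow-color: #FF3333; }
  @keyframes pulse { 0%,100% { box-shadow: 0 0 40px var(--glow-color); }
                     50% { box-shadow: 0 0 80px 30px var(--glow-color); } }
  @keyframes blink { 0%,100% { opacity: 1; } 50% { opacity: 0.4; } }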
</style>

2. The Tinder Swipe

When: Batch review of multiple outputs — "Here are 8 generated API endpoints, swipe through them." Best for high-volume, fast-paced review sessions where you need gut reactions.

Visual: A card stack in the center of the modal. The top card is a ~350px wide, ~200px tall rounded card with a subtle shadow stack behind it showing 2-3 more cards. The card displays the item name, a 3-line preview, and a small metadata badge (pipeline, timestamp).

The background is dark (#1a1a2e). As you drag left, the card rotates and the background tints red. As you drag right, it tints green. Drag up = card tints gold with a watermark.

Below the card: three hint icons — ← Reject | ↑ Love It | → Approve — in muted text.

Flow:

  1. Card stack appears. Counter in top-right: "1 of 8".
  2. Operator can:
    • Swipe right (or tap the Approve hint) → Card flies off right with green trail. Approved.
    • Swipe left (or tap the Reject hint) → Card flies off left with red trail. Rejected.
    • Swipe up (or tap the Love It hint) → Card flies up with golden sparkle effect. "Love it" means exemplary output.
    • Tap the card → Card flips over (3D CSS transform) to show full details/diff.
  3. After each swipe, next card slides in from the bottom with a satisfying spring animation.
  4. After all cards: summary screen shows approved (green stack), rejected (red stack), loved (gold stack) with counts. "Confirm & Submit" button.

Data Collected:

  • decisions: Array of { item_id, decision: "approve"|"reject"|"love", swipe_speed, hesitation_ms }
  • cards_flipped: which items operator examined in detail (signals complexity/uncertainty)
  • total_session_time: aggregate review speed
  • order_effects: did operator get more lenient or strict over time?
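
One way the swipe direction and the per-card signals above might be derived from a drag gesture is sketched below. The 100px commit threshold, the px-per-ms unit for `swipe_speed`, and the helper names `classifySwipe` and `recordSwipe` are assumptions for illustration.

```javascript
// Classify a finished drag by its offset from the start point.
// Screen y grows downward, so an upward swipe has a negative dy.
// thresholdPx = 100 is an assumed value; tune it to the panel width.
function classifySwipe(dx, dy, thresholdPx = 100) {
  const absX = Math.abs(dx);
  const absY = Math.abs(dy);
  if (Math.max(absX, absY) < thresholdPx) return null; // too short: snap back
  if (absY > absX) return dy < 0 ? "love" : null;      // downward has no meaning
  return dx > 0 ? "approve" : "reject";
}

// Build one entry of the `decisions` array. Returns null when the drag
// did not commit to a direction.
function recordSwipe(itemId, shownAtMs, dragStartMs, dragEndMs, dx, dy) {
  const decision = classifySwipe(dx, dy);
  if (decision === null) return null;
  const durationMs = Math.max(dragEndMs - dragStartMs, 1); // avoid divide-by-zero
  return {
    item_id: itemId,
    decision,
    swipe_speed: Math.hypot(dx, dy) / durationMs, // pixels per millisecond
    hesitation_ms: dragStartMs - shownAtMs,       // time before the drag began
  };
}
```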

Why It's Special: The swipe mechanic is muscle memory for anyone under 40. It's FAST — you can review 8 items in under 30 seconds. The swipe speed and hesitation are hidden confidence signals. Card flips reward curiosity without forcing detail views. The gold "love it" swipe captures exemplary outputs that become training examples.

HTML/CSS Sketch:

<div class="swipe-container">
  <div class="card-stack">
    <div class="card" draggable style="--rotation: 0deg; --x: 0px;">
      <div class="card-front">
        <span class="badge">{{pipeline}}</span>
        <h3>{{item_name}}</h3>
        <p class="preview">{{preview_text}}</p>
      </div>
      <div class="card-back">
        <pre>{{full_content}}</pre>
      </div>
    </div>
  </div>
  <div class="swipe-hints">
    <span class="hint-left">❌ Reject</span>
    <span class="hint-up">⭐ Love</span>
    <span class="hint-right">✅ Approve</span>
  </div>
</div>
<style>
  .card { width: 350px; height: 200px; border-radius: 16px; background: #2a2a3e;
          position: absolute; transition: transform 0.1s; transform-style: preserve-3d;
          touch-action: none; cursor: grab; }
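  /* Shared exit transition so all three directions animate both position and opacity. */
  .card.swiped-right, .card.swiped-left, .card.swiped-up { opacity: 0; transition: all 0.4s ease-out; }
  .card.swiped-right { transform: translateX(500px) rotate(30deg); }
  .card.swiped-left { transform: translateX(-500px) rotate(-30deg); }
  .card.swiped-up { transform: translateY(-500px) scale(0.8); }
  .card.flipped { transform: rotateY(180deg); transition: transform 0.4s; }
  /* Both faces overlap and hide their backs so only one is visible at a time. */
  .card-front, .card-back { position: absolute; inset: 0; backface-visibility: hidden; }
  .card-back { transform: rotateY(180deg); overflow: auto; }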
</style>

3. The Report Card

When: Comprehensive quality assessment — "Rate this MCP server across all dimensions." Best for milestone reviews, final approvals, or when training the AI to understand multi-dimensional quality.

Visual: A literal report card design on cream/ivory paper (#FFF8E7) with a subtle paper texture. School-style header: "GooseFactory Academy — Progress Report". Student name = the MCP server/pipeline name. Each row is a subject (dimension) with letter grade buttons A through F.

Grade buttons are styled like old-school bubble-fill circles. When clicked, they fill in with a satisfying "pencil fill" animation (radial fill from center). The selected grade letter appears bold.

Color coding: A = deep blue, B = green, C = yellow, D = orange, F = red.

At the bottom: "Teacher's Comments" — a lined textarea that looks like ruled notebook paper.

Flow:

  1. Report card slides in from the top like a paper being handed to you.
  2. Dimensions listed (customizable per pipeline type):
    • Code Quality — Clean, readable, follows conventions
    • Functionality — Does it actually work correctly?
    • Documentation — Is it well-documented?
    • Error Handling — Graceful failures?
    • Performance — Efficient?
    • Creativity — Novel approach? (for applicable items)
  3. Operator fills in bubble grades for each. Hover shows tooltip: "A = Exceptional, B = Good, C = Acceptable, D = Below Standard, F = Failing"
  4. Optional "Teacher's Comments" at bottom.
  5. GPA auto-calculates and displays in a circled number (4.0 scale) at the top right.
  6. "Sign & Submit" button styled as a signature line.

Data Collected:

  • grades: { dimension: letter_grade } map
  • gpa: calculated float
  • comments: string
  • grade_distribution: enables analysis of which dimensions the AI is strong/weak on across all reviews
  • grade_changes: if operator changed any grade before submitting (indecision tracking)
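
The `gpa` field could be computed as below. This sketch assumes the conventional 4.0 mapping (A=4, B=3, C=2, D=1, F=0) and an unweighted mean across dimensions; the document does not say whether dimensions are weighted, and `computeGpa` is a hypothetical name.

```javascript
// Conventional 4.0-scale points; the document names the scale but not the mapping.
const GRADE_POINTS = { A: 4, B: 3, C: 2, D: 1, F: 0 };

// grades: { dimension: letter } map, e.g. { code_quality: "A" }.
// Returns the unweighted mean rounded to two decimals, or null if nothing is graded yet.
function computeGpa(grades) {
  const letters = Object.values(grades);
  if (letters.length === 0) return null;
  let total = 0;
  for (const letter of letters) {
    if (!Object.prototype.hasOwnProperty.call(GRADE_POINTS, letter)) {
      throw new RangeError("unknown grade: " + letter);
    }
    total += GRADE_POINTS[letter];
  }
  return Math.round((total / letters.length) * 100) / 100;
}
```

Recomputing on every bubble click keeps the circled number in sync while the operator is still filling in grades.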

Why It's Special: Multi-dimensional feedback in a format everyone instantly understands. The bubble-fill animation is deeply satisfying (nostalgia + tactile feedback). The GPA gives a single summary number while preserving per-dimension detail. Over time, the AI learns which dimensions it's consistently graded poorly on.

HTML/CSS Sketch:

<div class="report-card">
  <div class="header">
    <h2>GooseFactory Academy</h2>
    <div class="gpa-circle">{{gpa}}</div>
  </div>
  <div class="subject-row" data-dim="code_quality">
    <span class="subject-name">Code Quality</span>
    <div class="grade-bubbles">
      <button class="bubble" data-grade="A">A</button>
      <button class="bubble" data-grade="B">B</button>
      <button class="bubble" data-grade="C">C</button>
      <button class="bubble" data-grade="D">D</button>
      <button class="bubble" data-grade="F">F</button>
    </div>
  </div>
  <!-- repeat for each dimension -->
  <textarea class="teachers-comments" placeholder="Teacher's comments..."></textarea>
  <button class="submit-btn">Sign & Submit ✍️</button>
</div>
<style>
  .report-card { background: #FFF8E7; border-radius: 4px; padding: 24px;
                 font-family: 'Georgia', serif; box-shadow: 2px 4px 12px rgba(0,0,0,0.3); }
  .bubble { width: 40px; height: 40px; border-radius: 50%; border: 2px solid #666;
            background: transparent; font-weight: bold; cursor: pointer; transition: all 0.3s; }
  .bubble.filled { background: radial-gradient(circle, var(--grade-color) 0%, var(--grade-color) 70%, transparent 100%);
                   color: white; animation: fill-in 0.3s ease; }
  @keyframes fill-in { from { transform: scale(0.8); } to { transform: scale(1); } }
  .teachers-comments { width: 100%; min-height: 80px; background: repeating-linear-gradient(
    transparent, transparent 27px, #ccc 27px, #ccc 28px); border: none; font-family: inherit;
    line-height: 28px; resize: vertical; }
</style>

4. The Thermometer

When: Subjective quality feel — "How hot is this output?" Best for creative work, marketing copy, UX designs, or anything where quality is more "vibe" than objective criteria.

Visual: A giant mercury thermometer on the left side of the modal, taking up the full height (~400px). The mercury column is a gradient from icy blue at the bottom (#00BFFF) through green → yellow → orange to blazing red at the top (#FF4444).

The bulb at the bottom pulses gently. As the operator drags the temperature up, the background transitions through matching color temperatures — blue frost effects at the bottom, warm orange glow at the top, with particle effects (snowflakes at cold end, flame particles at hot end).

Temperature labels on the right side of the thermometer:

  • 🔥 100° — "Absolutely fire"
  • 😎 80° — "Solid work"
  • 😐 60° — "Meh, it's okay"
  • 🥶 40° — "Pretty cold"
  • 💀 20° — "Dead on arrival"

Flow:

  1. Thermometer appears at a neutral 50° (yellow/green zone).
  2. Operator drags the mercury level up or down. The entire modal background shifts color temperature. Haptic feedback on mobile.
  3. At extreme temperatures (>90° or <20°), celebratory/alarm effects trigger (confetti for hot, frost crystals for cold).
  4. Release the drag → temperature locks in with a "ding" marker line appearing.
  5. Below the thermometer: "What's driving this temperature?" with quick-select chips that change based on temperature range:
    • Hot: "Clever solution", "Clean code", "Great UX", "Innovative"
    • Cold: "Buggy", "Confusing", "Over-engineered", "Missing requirements"
  6. Submit button reads the temperature: "Submit 78°" (dynamic).

Data Collected:

  • temperature: 0-100 integer
  • temperature_zone: "freezing"|"cold"|"lukewarm"|"warm"|"hot"|"blazing"
  • drivers: string[] (selected chips explaining the rating)
  • drag_journey: array of temperature values during drag (shows hesitation — did they go up then back down?)
  • final_hold_time_ms: how long they held before releasing (confidence proxy)
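
The fields above might be derived from pointer positions roughly as follows. The zone cut-offs (20/40/60/80/90) are assumptions, since the document lists the zone names but not their boundaries, and the helper names are hypothetical.

```javascript
// Assumed upper bounds (exclusive) for each zone; anything at or above 90 is "blazing".
const ZONES = [
  [20, "freezing"], [40, "cold"], [60, "lukewarm"], [80, "warm"], [90, "hot"],
];

function temperatureZone(temp) {
  for (const [upper, name] of ZONES) {
    if (temp < upper) return name;
  }
  return "blazing";
}

// Convert a pointer y-coordinate into a 0-100 temperature.
// The top of the track is 100 and the bottom is 0; values are clamped.
function pointerToTemperature(clientY, trackTop, trackHeight) {
  const fraction = 1 - (clientY - trackTop) / trackHeight;
  return Math.round(Math.min(1, Math.max(0, fraction)) * 100);
}

// Count direction changes in drag_journey: a simple hesitation measure.
function countReversals(journey) {
  let reversals = 0;
  let lastDir = 0;
  for (let i = 1; i < journey.length; i++) {
    const dir = Math.sign(journey[i] - journey[i - 1]);
    if (dir !== 0 && lastDir !== 0 && dir !== lastDir) reversals++;
    if (dir !== 0) lastDir = dir;
  }
  return reversals;
}
```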

Why It's Special: Continuous scale with physical metaphor beats discrete 1-10. The color/particle effects make it visceral — you FEEL the quality level you're setting. The drag journey captures the operator's internal deliberation process. It's also just plain fun to play with.

HTML/CSS Sketch:

<div class="thermo-modal" style="--temp: 50; --temp-color: #88CC00;">
  <div class="thermometer">
    <div class="mercury" style="height: calc(var(--temp) * 1%);">
      <div class="mercury-top" draggable></div>
    </div>
    <div class="bulb"></div>
    <div class="scale-labels">
      <span class="label" style="bottom: 100%">🔥 100°</span>
      <span class="label" style="bottom: 80%">😎 80°</span>
      <span class="label" style="bottom: 60%">😐 60°</span>
      <span class="label" style="bottom: 40%">🥶 40°</span>
      <span class="label" style="bottom: 20%">💀 20°</span>
    </div>
  </div>
  <div class="driver-chips">
    <!-- dynamically populated based on temperature range -->
  </div>
  <button class="submit">Submit <span class="temp-display">50</span>°</button>
</div>
<style>
  .thermo-modal { background: var(--temp-color); transition: background 0.2s; min-height: 400px; }
  .thermometer { width: 60px; height: 350px; background: #eee; border-radius: 30px;
                 position: relative; margin: 20px auto; }
  .mercury { width: 40px; background: linear-gradient(to top, #00BFFF, #00FF66, #FFD700, #FF8C00, #FF4444);
             border-radius: 20px; position: absolute; bottom: 0; left: 10px;
             transition: height 0.05s; }
  .bulb { width: 70px; height: 70px; border-radius: 50%; background: #FF4444;
          position: absolute; bottom: -20px; left: -5px; animation: bulb-pulse 2s infinite; }
  @keyframes bulb-pulse { 0%,100% { transform: scale(1); } 50% { transform: scale(1.05); } }
</style>

5. The Spotlight

When: Code review, design review, or any content where specific parts need annotation — "Show me exactly what's good and what's bad in this output."

Visual: The deliverable content (code, text, design mockup) displayed in a dark panel with a "spotlight" effect — most of the content is dimmed to 30% opacity, and a bright circular spotlight (200px diameter, feathered edge) follows the cursor. Only the area under the spotlight is fully visible.

Two mode buttons at the top: 🟢 "Highlight Good" and 🔴 "Highlight Bad". When in green mode, clicking on content area places a green highlight annotation. Red mode = red highlight. Each annotation creates a small numbered marker.

A sidebar shows all annotations as a numbered list with auto-generated line/section references.

Flow:

  1. Content loads dimmed. Spotlight follows cursor — operator naturally scans the content.
  2. Operator toggles between 🟢 and 🔴 mode using buttons or keyboard shortcut (G/B).
  3. Click on any area → spotlight freezes, highlight appears, and a small input popup asks for optional 1-line comment ("Why?").
  4. Enter/ESC → spotlight resumes following cursor.
  5. Operator can place unlimited annotations. Sidebar tracks them all.
  6. "Done Reviewing" button at bottom shows summary: X good highlights, Y bad highlights.
  7. Optional: "Overall verdict" quick-select before final submit.

Data Collected:

  • annotations: Array of { line_number, char_range, type: "good"|"bad", comment: string }
  • scan_pattern: heatmap of where the spotlight lingered (attention tracking)
  • time_per_section: how long operator spent in each area
  • annotation_density: ratio of good to bad highlights
  • overall_verdict: string
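
A possible way to derive `annotation_density` and `time_per_section` from raw events is sketched here. It assumes the spotlight position is sampled as `{ t_ms, line }` records in time order, which is an illustrative format rather than one the document specifies; both function names are hypothetical.

```javascript
// Count highlights by type and compute the good:bad ratio.
// The ratio is null when there are no bad highlights (division by zero).
function summarizeAnnotations(annotations) {
  const good = annotations.filter((a) => a.type === "good").length;
  const bad = annotations.filter((a) => a.type === "bad").length;
  return { good, bad, annotation_density: bad === 0 ? null : good / bad };
}

// samples: time-ordered [{ t_ms, line }] readings of the line under the spotlight.
// Each interval between two samples is credited to the line of the earlier sample.
function accumulateDwell(samples) {
  const perLine = {};
  for (let i = 1; i < samples.length; i++) {
    const line = samples[i - 1].line;
    const dt = samples[i].t_ms - samples[i - 1].t_ms;
    perLine[line] = (perLine[line] || 0) + dt;
  }
  return perLine;
}
```

Lines that never appear in the dwell map were never under the spotlight, which is the "they never looked at it" signal described in this section.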

Why It's Special: Instead of "the code is bad", we get "lines 14-18 are bad because X." The spotlight mechanic forces focused attention instead of skimming. The scan heatmap reveals what the operator even looked at — if they never looked at the error handling section, that's data too. It transforms passive reading into active investigation.

HTML/CSS Sketch:

<div class="spotlight-review">
  <div class="mode-bar">
    <button class="mode-btn active" data-mode="good">🟢 Highlight Good (G)</button>
    <button class="mode-btn" data-mode="bad">🔴 Highlight Bad (B)</button>
  </div>
  <div class="content-area" onmousemove="moveSpotlight(event)">
    <pre class="code-content">
      {{content_here}}
    </pre>
    <div class="annotations-layer">
      <!-- dynamically placed annotation markers -->
    </div>
  </div>
  <div class="annotation-sidebar">
    <h4>Annotations (0)</h4>
    <ol class="annotation-list"></ol>
  </div>
  <button class="done-btn">Done Reviewing ✓</button>
</div>
<style>
  .content-area { position: relative; background: #1e1e1e; cursor: crosshair; overflow: auto; }
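  /* 100px radius gives the 200px spotlight; content outside it dims to 30% opacity. */
  .code-content { color: #d4d4d4; font-size: 13px; line-height: 1.6;
                  -webkit-mask-image: radial-gradient(circle 100px at var(--mx, 50%) var(--my, 50%),
                  rgba(0,0,0,1) 40%, rgba(0,0,0,0.3) 100%);
                  mask-image: radial-gradient(circle 100px at var(--mx, 50%) var(--my, 50%),
                  rgba(0,0,0,1) 40%, rgba(0,0,0,0.3) 100%); }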
  .annotation-marker { position: absolute; width: 24px; height: 24px; border-radius: 50%;
                       font-size: 12px; display: flex; align-items: center; justify-content: center;
                       color: white; font-weight: bold; cursor: pointer; z-index: 10; }
  .annotation-marker.good { background: #00CC55; }
  .annotation-marker.bad { background: #FF3344; }
</style>

6. The Ranking Arena

When: Comparing multiple alternatives — "We generated 5 possible implementations, rank them best to worst." Perfect for A/B/C/D testing, choosing between approaches, or prioritizing a backlog.

Visual: Items displayed as draggable cards in a single column, numbered 1 through N. Each card has a grip handle on the left (⠿ dots), item name, and a compact preview. The current rank number is displayed in a large bold circle on the left.

As cards are dragged, other cards smoothly animate out of the way (like iOS list reordering). The #1 position has a golden glow border. The last position has a dim red border. Middle positions are neutral.

A "confidence bar" at the bottom shows how much you've reordered: "You made 4 changes" — more changes = more signal.

Flow:

  1. Items appear in their default order (e.g., alphabetical or order generated).
  2. Operator drags cards to reorder. Each drag triggers satisfying slide animations.
  3. Tap a card to expand its preview (accordion-style) without losing position.
  4. "Lock" button on each card — operator can lock confident placements (card grays out, no longer draggable). This helps with large lists.
  5. Optional: "Add notes" icon on each card for per-item comments.
  6. "Submit Rankings" button shows final order with rank numbers.

Data Collected:

  • final_ranking: ordered array of item_ids
  • ranking_changes: array of every swap made (shows deliberation process)
  • locked_items: which items operator was most confident about
  • expanded_items: which items needed closer inspection
  • per_item_notes: optional comments
  • time_to_first_move: how long before operator started reordering (comprehension time)
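
The reorder-and-log behavior could be modeled as a pure state update, sketched below. The state shape and the `moveItem` name are assumptions. As a simplification, a locked card cannot be dragged itself but can still shift when other cards move past it.

```javascript
// state: { ranking: string[], locked: string[], changes: object[] }
// Returns a new state with the item moved, or the same state if the move is
// invalid, a no-op, or the dragged card is locked.
function moveItem(state, fromIndex, toIndex, nowMs = Date.now()) {
  const { ranking, locked, changes } = state;
  const id = ranking[fromIndex];
  if (id === undefined || toIndex < 0 || toIndex >= ranking.length) return state;
  if (fromIndex === toIndex || locked.includes(id)) return state;
  const next = ranking.slice();
  next.splice(fromIndex, 1);   // remove from the old position
  next.splice(toIndex, 0, id); // insert at the new position
  return {
    ranking: next,
    locked,
    changes: [...changes, { item_id: id, from: fromIndex, to: toIndex, at_ms: nowMs }],
  };
}
```

With this shape, `final_ranking` is `state.ranking` at submit time, `ranking_changes` is `state.changes`, and the move counter shown in the confidence bar is `state.changes.length`.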

Why It's Special: Forced ranking eliminates the "everything is a 7" problem. The drag interaction is physical and intuitive. Locked items reveal confidence. The change history shows which comparisons were hardest. Over time, the AI learns not just what's best, but the operator's priority hierarchy.

HTML/CSS Sketch:

<div class="ranking-arena">
  <h3>Rank Best → Worst</h3>
  <div class="rank-list" id="sortable">
    <div class="rank-card" draggable="true" data-id="item1">
      <div class="rank-number">1</div>
      <div class="grip">⠿</div>
      <div class="card-content">
        <h4>{{item_name}}</h4>
        <p class="preview">{{preview}}</p>
      </div>
      <button class="lock-btn">🔒</button>
    </div>
    <!-- more cards -->
  </div>
  <div class="confidence-bar">
    <span>Reordering moves: <strong id="move-count">0</strong></span>
  </div>
  <button class="submit-btn">Submit Rankings</button>
</div>
<style>
  .rank-card { display: flex; align-items: center; gap: 12px; padding: 12px 16px;
               background: #2a2a3e; border-radius: 12px; margin: 8px 0; cursor: grab;
               transition: transform 0.2s, box-shadow 0.2s; user-select: none; }
  .rank-card:first-child { border: 2px solid #FFD700; box-shadow: 0 0 20px rgba(255,215,0,0.2); }
  .rank-card.dragging { transform: scale(1.03); box-shadow: 0 8px 24px rgba(0,0,0,0.4);
                        z-index: 100; opacity: 0.9; }
  .rank-card.locked { opacity: 0.5; cursor: not-allowed; }
  .rank-number { width: 36px; height: 36px; border-radius: 50%; background: #444;
                 display: flex; align-items: center; justify-content: center;
                 font-weight: bold; font-size: 18px; }
  .rank-card:first-child .rank-number { background: #FFD700; color: #000; }
</style>

7. The Speed Round

When: Large batch of small decisions — "Quick pass/fail on 15 generated test cases." Best when individual items are low-stakes and the goal is coverage, not depth.

Visual: Fullscreen dark mode. A giant countdown timer bar across the top (starts at 60 seconds, shrinks like a progress bar). Center: current item displayed BIG. Two enormous buttons spanning the bottom half: left = red "FAIL", right = green "PASS".

Score counter in top-right: "7/15 reviewed". Speed stat in top-left: "Avg: 3.2s/item".

When you hit a button, the item FLIES off screen (left for fail, right for pass) and the next one SNAPS in. Screen briefly flashes the color of your choice (100ms red or green tint).

If the timer runs out: remaining items are marked "SKIPPED" — creating urgency.

Flow:

  1. "SPEED ROUND — 15 items — 60 seconds — GO!" splash screen with 3-2-1 countdown.
  2. First item appears. Timer starts.
  3. Operator hammers FAIL or PASS for each item. Animations are FAST (200ms transitions).
  4. Streak counter appears after 3 consecutive same-direction choices: "3x PASS STREAK 🔥"
  5. If operator pauses >5s on an item, a "SKIP →" button gently fades in.
  6. Timer hits 0 → remaining items auto-skipped. Final score screen:
    • "12/15 reviewed in 47 seconds"
    • "8 passed, 4 failed, 3 skipped"
    • "Fastest decision: 1.2s | Slowest: 8.4s"
  7. One-tap "Submit" to confirm.

Data Collected:

  • decisions: Array of { item_id, decision: "pass"|"fail"|"skip", time_ms }
  • streaks: consecutive same-decision runs (reveals batch quality patterns)
  • total_time: seconds
  • completion_rate: items reviewed / total
  • slowest_item: which item caused the most hesitation (flag for detailed review)
  • speed_curve: are decisions getting faster or slower over the batch?
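
The final score screen and several of the fields above can be derived from the `decisions` array alone. The sketch below assumes skipped items are excluded from timing stats and reset any running streak; `summarizeSpeedRound` and the output field names beyond those listed above are hypothetical.

```javascript
// decisions: [{ item_id, decision: "pass"|"fail"|"skip", time_ms }]
function summarizeSpeedRound(decisions, totalItems) {
  const reviewed = decisions.filter((d) => d.decision !== "skip");
  let slowest = null;
  let totalMs = 0;
  for (const d of reviewed) {
    totalMs += d.time_ms;
    if (slowest === null || d.time_ms > slowest.time_ms) slowest = d;
  }
  // Longest run of identical consecutive decisions; a skip breaks the run.
  let longest = 0;
  let run = 0;
  let prev = null;
  for (const d of decisions) {
    if (d.decision === "skip") { run = 0; prev = null; continue; }
    run = d.decision === prev ? run + 1 : 1;
    prev = d.decision;
    longest = Math.max(longest, run);
  }
  return {
    completion_rate: totalItems === 0 ? 0 : reviewed.length / totalItems,
    avg_time_ms: reviewed.length === 0 ? null : totalMs / reviewed.length,
    slowest_item: slowest === null ? null : slowest.item_id,
    longest_streak: longest,
  };
}
```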

Why It's Special: Gamification creates FLOW STATE. The timer creates healthy urgency that prevents over-analysis of low-stakes items. The speed data is incredibly valuable — slow items are the ones the AI should focus on improving. Streak data reveals batch quality patterns. It's legitimately fun.

HTML/CSS Sketch:

<div class="speed-round">
  <div class="timer-bar"><div class="timer-fill" id="timer"></div></div>
  <div class="stats-row">
    <span class="speed">⚡ Avg: <strong>--</strong>s</span>
    <span class="counter"><strong id="count">0</strong>/15</span>
  </div>
  <div class="item-display" id="current-item">
    <h2>{{item_name}}</h2>
    <p>{{item_preview}}</p>
  </div>
  <div class="button-row">
    <button class="mega-btn fail" onclick="decide('fail')">❌ FAIL</button>
    <button class="mega-btn pass" onclick="decide('pass')">✅ PASS</button>
  </div>
  <div class="streak-indicator hidden" id="streak">🔥 3x STREAK</div>
</div>
<style>
  .speed-round { height: 100%; display: flex; flex-direction: column; background: #111; }
  .timer-bar { height: 6px; background: #333; width: 100%; }
  .timer-fill { height: 100%; background: linear-gradient(90deg, #FF4444, #FFD700, #00FF66);
                width: 100%; transition: width 1s linear; }
  .mega-btn { flex: 1; height: 120px; font-size: 32px; font-weight: bold; border: none;
              cursor: pointer; transition: transform 0.1s; }
  .mega-btn:active { transform: scale(0.95); }
  .mega-btn.fail { background: #FF3333; color: white; }
  .mega-btn.pass { background: #00CC55; color: white; }
  .item-display { flex: 1; display: flex; flex-direction: column; justify-content: center;
                  align-items: center; padding: 24px; text-align: center; }
  @keyframes flash-green { 0% { background: #00FF66; } 100% { background: #111; } }
  @keyframes flash-red { 0% { background: #FF4444; } 100% { background: #111; } }
</style>

8. The Emoji Scale

When: Quick emotional/gut reaction — "How do you feel about this output?" Perfect for subjective quality checks, creative work, or when you want sentiment data beyond numbers.

Visual: A horizontal row of 7 emojis, each ~64px, evenly spaced across the modal width. From left to right: 🤮 😡 😕 😐 🙂 😊 😍. Below each emoji is a subtle label: "Terrible" → "Okay" → "Love it".

Default: no selection. On hover, emoji scales up to 80px with a bounce animation. Selected emoji goes to 96px with a gentle floating animation, and unselected emojis shrink to 48px and desaturate.

Background color smoothly transitions to match the emotion: 🤮 = sickly green tint, 😡 = red, 😐 = gray, 😍 = pink/rosy.

Flow:

  1. Question at top: "How does this {{deliverable_type}} make you feel?"
  2. Brief summary of what's being reviewed.
  3. Operator hovers emojis — each one animates invitingly.
  4. Click an emoji → it grows and floats, others recede. Background shifts color.
  5. A "Why this feeling?" text input slides up (optional, max 140 chars — tweet-length).
  6. "Submit" button styled to match the chosen emotion's color.

Data Collected:

  • sentiment: -3 to +3 (mapped from emoji position)
  • emoji_chosen: specific emoji string
  • hover_journey: which emojis were hovered before selecting (shows deliberation)
  • reason: optional string
  • response_time: snap judgment vs. deliberation
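
The position-to-sentiment mapping and the hover log could look like the sketch below. The emoji order comes from the visual description; `sentimentFor` and `recordHover` are hypothetical helper names, and collapsing repeated hovers on the same emoji is an assumed design choice.

```javascript
// Emoji order as laid out left to right in the modal.
const EMOJIS = ["🤮", "😡", "😕", "😐", "🙂", "😊", "😍"];

// Map an emoji to the -3..+3 sentiment score by its position in the row.
function sentimentFor(emoji) {
  const index = EMOJIS.indexOf(emoji);
  if (index === -1) throw new RangeError("emoji is not on the scale");
  return index - 3;
}

// Append to hover_journey, ignoring repeated hovers on the same emoji
// so the log records changes of mind rather than pointer jitter.
function recordHover(journey, emoji) {
  return journey[journey.length - 1] === emoji ? journey : [...journey, emoji];
}
```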

Why It's Special: Emojis bypass rational analysis and capture GUT FEELINGS. The hover journey reveals the operator's internal debate. A 1-10 scale requires cognitive effort; an emoji selection is instant and emotional. The background color shift creates an immersive micro-experience.

HTML/CSS Sketch:

<div class="emoji-scale" style="--bg-tint: transparent;">
  <h3>How does this make you feel?</h3>
  <div class="emoji-row">
    <button class="emoji-btn" data-value="-3" data-color="#88AA00">🤮</button>
    <button class="emoji-btn" data-value="-2" data-color="#FF4444">😡</button>
    <button class="emoji-btn" data-value="-1" data-color="#CC8844">😕</button>
    <button class="emoji-btn" data-value="0" data-color="#888888">😐</button>
    <button class="emoji-btn" data-value="1" data-color="#66AA44">🙂</button>
    <button class="emoji-btn" data-value="2" data-color="#44AA88">😊</button>
    <button class="emoji-btn" data-value="3" data-color="#FF69B4">😍</button>
  </div>
  <input type="text" class="reason-input hidden" placeholder="Why this feeling? (optional)" maxlength="140">
  <button class="submit-btn hidden">Submit</button>
</div>
<style>
  .emoji-scale { text-align: center; padding: 24px; transition: background 0.5s; }
  .emoji-btn { font-size: 48px; background: none; border: none; cursor: pointer;
               transition: all 0.3s cubic-bezier(0.175, 0.885, 0.32, 1.275); padding: 8px; }
  .emoji-btn:hover { font-size: 64px; transform: translateY(-8px); }
  .emoji-btn.selected { font-size: 80px; animation: float 2s ease-in-out infinite; }
  .emoji-btn.unselected { font-size: 36px; filter: grayscale(60%); opacity: 0.5; }
  @keyframes float { 0%,100% { transform: translateY(0); } 50% { transform: translateY(-10px); } }
</style>

9. The Before/After

When: Comparing revisions — "Here's the original vs. the improved version." Best for code refactors, copy editing, design iterations, or any A→B comparison.

Visual: Two panels side by side (or overlaid). A vertical slider handle in the center — drag left to see more "after", drag right to see more "before". The handle is a bold white line with a circular grip (like photo retouching comparisons).

Labels: Left panel header = "BEFORE" (red tint, #FF444420), Right panel = "AFTER" (green tint, #00FF6620). Both panels contain the content at the same scroll position (synced scrolling).

Differences are highlighted: removed text in red, added text in green (classic diff coloring).

Flow:

  1. Slider starts at 50/50 (both panels equally visible).
  2. Operator drags the slider to compare. Synced scrolling keeps alignment.
  3. Below the comparison: "Which is better?" with three buttons:
    • ⬅️ "Before was better" (red)
    • ➡️ "After is better" (green)
    • ↔️ "About the same" (gray)
  4. Then: "How much better?" — a small 1-5 star rating appears.
  5. "What changed for better/worse?" — optional chip selectors: "Clarity", "Performance", "Readability", "Correctness", "Style".
  6. Submit.

Data Collected:

  • preference: "before"|"after"|"same"
  • improvement_magnitude: 1-5 stars
  • improvement_dimensions: string[] (which aspects improved)
  • slider_time_distribution: how much time spent looking at before vs after (from slider position tracking)
  • scroll_depth: how far the operator scrolled (did they review the whole thing?)
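
One plausible reading of `slider_time_distribution` is a time-weighted split based on how much of each panel was visible. The sketch below assumes slider positions are logged as `{ t_ms, pos }` samples, where `pos` is the handle position in percent from the left edge (the share of the view showing "before"). Both the sample format and the weighting are assumptions.

```javascript
// samples: time-ordered [{ t_ms, pos }], pos in 0..100 (percent from the left edge).
// endMs: when the operator submitted. Each interval is split between the two
// panels in proportion to the visible share at the start of the interval.
function sliderTimeDistribution(samples, endMs) {
  let beforeMs = 0;
  let afterMs = 0;
  for (let i = 0; i < samples.length; i++) {
    const next = i + 1 < samples.length ? samples[i + 1].t_ms : endMs;
    const dt = Math.max(next - samples[i].t_ms, 0);
    const beforeShare = Math.min(100, Math.max(0, samples[i].pos)) / 100;
    beforeMs += dt * beforeShare;
    afterMs += dt * (1 - beforeShare);
  }
  return { before_ms: beforeMs, after_ms: afterMs };
}
```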

Why It's Special: Seeing the actual diff side by side conveys far more than a written description of the change. The slider mechanic makes comparison feel like using professional tools. Slider position tracking reveals which version the operator spent more time examining. Scroll depth catches lazy reviews.

HTML/CSS Sketch:

<div class="before-after">
  <div class="comparison-container">
    <div class="panel before" style="clip-path: inset(0 var(--after-pct, 50%) 0 0);">
      <div class="panel-header before-header">BEFORE</div>
      <pre class="content">{{before_content}}</pre>
    </div>
    <div class="panel after" style="clip-path: inset(0 0 0 var(--before-pct, 50%));">
      <div class="panel-header after-header">AFTER</div>
      <pre class="content">{{after_content}}</pre>
    </div>
    <div class="slider-handle" draggable style="left: var(--slider-pos, 50%);">
      <div class="handle-grip"></div>
    </div>
  </div>
  <div class="verdict-row">
    <button class="verdict before-better">⬅️ Before</button>
    <button class="verdict same">↔️ Same</button>
    <button class="verdict after-better">➡️ After</button>
  </div>
</div>
<style>
  .comparison-container { position: relative; height: 300px; overflow: hidden; border-radius: 12px; }
  .panel { position: absolute; top: 0; left: 0; right: 0; bottom: 0; overflow-y: auto; }
  .before { background: rgba(255,68,68,0.05); }
  .after { background: rgba(0,255,102,0.05); }
  .slider-handle { position: absolute; top: 0; bottom: 0; width: 4px; background: white;
                   cursor: ew-resize; z-index: 10; transform: translateX(-50%); }
  .handle-grip { position: absolute; top: 50%; left: 50%; transform: translate(-50%,-50%);
                 width: 40px; height: 40px; border-radius: 50%; background: white;
                 display: flex; align-items: center; justify-content: center; box-shadow: 0 2px 8px rgba(0,0,0,0.3); }
</style>

10. The Priority Poker

When: Estimating effort, complexity, or risk — "How much work would it take to fix this?" or "How risky is shipping this?" Borrowed from agile planning poker.

Visual: A fan of playing cards at the bottom of the screen, face-up, showing Fibonacci-ish numbers: 1, 2, 3, 5, 8, 13, 21, ?, ☕. The cards are styled as actual playing cards with a dark blue back pattern and white face. They're fanned out with a slight arc, like holding a hand of cards.

The question being estimated appears above in a card-table-green area (#1a6b3c). Felt texture background.

Flow:

  1. Question displayed on the "table": "Estimate the effort to refactor {{pipeline_name}}'s error handling"
  2. Card fan displayed at bottom. Each card has its number and a hint label:
    • 1 = "Trivial", 3 = "Easy", 5 = "Medium", 8 = "Hard", 13 = "Very Hard", 21 = "Enormous", ? = "No idea", ☕ = "Need a break"
  3. Operator clicks a card → it animates up to the "table" area, flipping as it goes (like dealing a card). Other cards recede.
  4. The played card lands face-up on the table with a satisfying thwack animation.
  5. "Any context?" — small optional input.
  6. "Deal" button to submit (poker terminology).

Data Collected:

  • estimate: number | "unknown" | "break"
  • context: optional string
  • selection_time: how long to choose (longer = more uncertain)
  • hovered_cards: which cards were hovered (shows the range being considered)

Why It's Special: Planning poker is a proven estimation technique from agile. The card mechanic is playful and satisfying. The "?" card legitimizes "I don't know" as valid input. The "☕" card is a self-care signal that gets tracked (burnout detection). Card hover tracking reveals the operator's estimation range, not just their point estimate.
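The hover tracking described above could be reduced to an estimation range roughly as follows. This is a sketch under assumptions: `summarizePokerRound` and `considered_range` are hypothetical names, and the card values mirror the `data-value` attributes used in the sketch ("?" and "break" for the two non-numeric cards).

```javascript
// Hypothetical helper; only the card values come from the design, the rest is assumed.
const NUMERIC_CARDS = [1, 2, 3, 5, 8, 13, 21];

function summarizePokerRound(playedValue, hoveredValues, selectionTimeMs) {
  // Non-numeric cards ("?", "break") become NaN and are filtered out of the range.
  const numericHovers = hoveredValues
    .map(Number)
    .filter((v) => NUMERIC_CARDS.includes(v));
  const estimate =
    playedValue === "?" ? "unknown" : playedValue === "break" ? "break" : Number(playedValue);
  return {
    estimate,
    selection_time: selectionTimeMs,
    hovered_cards: [...new Set(hoveredValues)], // de-duplicated, first-hover order
    considered_range: numericHovers.length
      ? [Math.min(...numericHovers), Math.max(...numericHovers)]
      : null,
  };
}

// Usage: the operator wavered between 5 and 13 before playing the 8.
console.log(summarizePokerRound("8", ["5", "8", "13", "8", "?"], 4200));
```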

HTML/CSS Sketch:

<div class="poker-table">
  <div class="felt-area">
    <div class="question-card">
      <h3>Estimate the effort:</h3>
      <p>{{estimation_question}}</p>
    </div>
    <div class="played-card-zone" id="played"></div>
  </div>
  <div class="card-fan">
    <button class="poker-card" data-value="1" style="--i:0">
      <span class="card-number">1</span>
      <span class="card-label">Trivial</span>
    </button>
    <button class="poker-card" data-value="2" style="--i:1">
      <span class="card-number">2</span>
    </button>
    <!-- ... more cards ... -->
    <button class="poker-card" data-value="?" style="--i:7">
      <span class="card-number">?</span>
      <span class="card-label">No idea</span>
    </button>
    <button class="poker-card" data-value="break" style="--i:8">
      <span class="card-number">☕</span>
    </button>
  </div>
</div>
<style>
  .poker-table { background: #0d3d1f; min-height: 400px; }
  .felt-area { background: #1a6b3c; border-radius: 100px / 60px; padding: 40px; margin: 20px;
               background-image: url("data:image/svg+xml,..."); /* felt texture */ }
  .card-fan { display: flex; justify-content: center; padding: 20px; }
  /* Negative margins overlap the cards; `gap` does not accept negative values. */
  .poker-card { width: 70px; height: 100px; background: white; border-radius: 8px;
                border: 2px solid #ddd; cursor: pointer; position: relative; margin: 0 -5px;
                /* max(x, -x) stands in for abs(), which older browsers do not support */
                transform: rotate(calc((var(--i) - 4) * 5deg))
                           translateY(calc(max(var(--i) - 4, 4 - var(--i)) * 5px));
                transition: all 0.3s; box-shadow: 2px 4px 8px rgba(0,0,0,0.3); }
  .poker-card:hover { transform: translateY(-20px) scale(1.1); z-index: 10; }
  .poker-card.played { animation: deal 0.5s ease forwards; }
  @keyframes deal { 0% { transform: scale(1); } 50% { transform: translateY(-200px) rotateY(180deg) scale(1.2); }
                    100% { transform: translateY(-250px) rotateY(360deg) scale(1); } }
</style>

11. The Mission Briefing

When: High-stakes deployment decisions — "Ready to ship this MCP server to production?" Best for go/no-go gates on critical releases, security-sensitive changes, or customer-facing launches.

Visual: Military/NASA mission control aesthetic. Dark background (#0a0a14) with scan-line overlay effect. Green monospace text (#00FF41, font-family: 'Courier New'). A "CLASSIFIED" watermark at 10% opacity.

The briefing is structured as a dossier:

  • MISSION: {{pipeline_name}} deployment
  • STATUS: Awaiting authorization
  • INTEL: Key metrics displayed in a grid (tests passed, coverage %, build time, etc.)
  • RISK ASSESSMENT: Color-coded risk indicators

At the bottom: two giant buttons with key-turn animations:

  • 🔴 ABORT MISSION — red, requires 1-second hold to activate (preventing accidental aborts)
  • 🟢 GO FOR LAUNCH — green, same hold-to-confirm mechanic

A toggle switch row: "Flight checks" — a series of ON/OFF toggle switches the operator must flip (like pre-flight checklist), each confirming a specific requirement.

Flow:

  1. Modal appears with an "INCOMING TRANSMISSION" typewriter animation.
  2. Briefing content types out line by line (fast, ~20ms per char).
  3. Flight check toggles are all OFF. Operator flips each one they can confirm:
    • ☐ Tests passing → ☑
    • ☐ No security vulnerabilities → ☑
    • ☐ Documentation updated → ☑
    • ☐ Backward compatible → ☑
  4. GO button is disabled until at least N checks are flipped ON.
  5. Operator holds GO or ABORT for 1 second → ring fill animation around the button.
  6. On release: "MISSION {{APPROVED/ABORTED}} — TRANSMISSION SENT" with appropriate animation (green pulse for go, red flash for abort).

Data Collected:

  • decision: "go" | "abort"
  • checks_confirmed: string[] (which flight checks were toggled)
  • checks_skipped: string[] (which were left off — reveals risk acceptance)
  • hold_duration_ms: how long the button was held (shorter hold = more confident)
  • briefing_read_time: time before first interaction (did they actually read it?)
  • decision_after_reading: time between finishing reading and clicking (deliberation time)

Why It's Special: The military aesthetic elevates the decision's importance — it FEELS like it matters. The hold-to-confirm prevents accidents. The flight checks force the operator to consciously acknowledge each requirement. The typewriter effect forces reading. This modal is for decisions that SHOULD feel heavy.
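The hold-to-confirm mechanic and the flight-check gate can be modelled without any DOM, which makes them easy to test. The sketch below is one possible shape, not the modal's actual implementation: `createHoldTracker` and `canLaunch` are hypothetical names, and timestamps are passed in explicitly (in a browser they could come from `performance.now()`).

```javascript
// Hypothetical sketch; names and structure are assumptions.
const HOLD_MS = 1000; // the design calls for a 1-second hold

function createHoldTracker(requiredMs = HOLD_MS) {
  let action = null;
  let startedAt = null;
  return {
    // Call on press; `which` is "go" or "abort".
    start(which, now) {
      action = which;
      startedAt = now;
    },
    // Call on release or when the pointer leaves the button.
    // Returns the decision if the hold lasted long enough, otherwise null.
    end(now) {
      if (startedAt === null) return null;
      const held = now - startedAt;
      const result = held >= requiredMs ? { decision: action, hold_duration_ms: held } : null;
      action = null;
      startedAt = null;
      return result;
    },
  };
}

// GO stays disabled until at least `minRequired` flight checks are ON.
function canLaunch(checks, minRequired) {
  return Object.values(checks).filter(Boolean).length >= minRequired;
}

// Usage: a released-too-early press is ignored, a full hold confirms.
const holdTracker = createHoldTracker();
holdTracker.start("go", 0);
console.log(holdTracker.end(400));   // too short, nothing is submitted
holdTracker.start("go", 1000);
console.log(holdTracker.end(2100));  // held 1100 ms
```

Handlers such as the sketch's `startHold` and `endHold` could delegate to a tracker like this, keeping the ring animation purely cosmetic.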

HTML/CSS Sketch:

<div class="mission-briefing">
  <div class="scanlines"></div>
  <div class="watermark">CLASSIFIED</div>
  <div class="terminal-text" id="briefing">
    <p class="typewriter">MISSION: {{pipeline}} DEPLOYMENT</p>
    <p class="typewriter">STATUS: AWAITING AUTHORIZATION</p>
    <div class="intel-grid">
      <div class="intel-item"><span class="label">TESTS</span><span class="value pass">47/47</span></div>
      <div class="intel-item"><span class="label">COVERAGE</span><span class="value warn">78%</span></div>
    </div>
  </div>
  <div class="flight-checks">
    <label class="toggle-switch"><input type="checkbox"><span class="slider"></span> Tests passing</label>
    <label class="toggle-switch"><input type="checkbox"><span class="slider"></span> Security clear</label>
    <label class="toggle-switch"><input type="checkbox"><span class="slider"></span> Docs updated</label>
  </div>
  <div class="launch-controls">
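    <!-- Pointer events cover mouse and touch; leaving the button cancels the hold. -->
    <!-- r=45 gives a circumference of about 283, matching stroke-dasharray below. -->
    <button class="hold-btn abort" onpointerdown="startHold('abort')" onpointerup="endHold()" onpointerleave="endHold()">
      <svg class="ring" viewBox="0 0 100 100"><circle cx="50" cy="50" r="45" fill="none" stroke="currentColor"/></svg> 🔴 ABORT MISSION
    </button>
    <button class="hold-btn go" onpointerdown="startHold('go')" onpointerup="endHold()" onpointerleave="endHold()" disabled>
      <svg class="ring" viewBox="0 0 100 100"><circle cx="50" cy="50" r="45" fill="none" stroke="currentColor"/></svg> 🟢 GO FOR LAUNCH
    </button>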
  </div>
</div>
<style>
  .mission-briefing { background: #0a0a14; color: #00FF41; font-family: 'Courier New', monospace;
                      padding: 24px; position: relative; overflow: hidden; }
  .scanlines { position: absolute; top: 0; left: 0; right: 0; bottom: 0; pointer-events: none;
               background: repeating-linear-gradient(transparent, transparent 2px, rgba(0,0,0,0.1) 2px, rgba(0,0,0,0.1) 4px); }
  /* 10% opacity, as specified in the visual description */
  .watermark { position: absolute; top: 50%; left: 50%; transform: translate(-50%,-50%) rotate(-30deg);
               font-size: 80px; opacity: 0.1; font-weight: bold; letter-spacing: 20px; }
  .hold-btn { padding: 20px 40px; font-size: 18px; font-weight: bold; border: 2px solid;
              background: transparent; cursor: pointer; position: relative; font-family: inherit; }
  .hold-btn.go { border-color: #00FF41; color: #00FF41; }
  .hold-btn.abort { border-color: #FF3333; color: #FF3333; }
  .hold-btn .ring circle { stroke-dasharray: 283; stroke-dashoffset: 283;
                           transition: stroke-dashoffset 1s linear; }
  .hold-btn:active .ring circle { stroke-dashoffset: 0; }
</style>

12. The Judge's Scorecard

When: Multi-criteria scoring with different weights — "Rate this MCP server like an Olympic judge." Good for comprehensive evaluations where some criteria matter more than others.

Visual: Olympic/competition scoring aesthetic. Dark navy background with gold accents. A row of score displays across the top — each one looks like a digital scoreboard (LED-style numbers, #FF6600 on black). Each score slot represents a different judge "dimension."

Below each score: the dimension name and its weight (shown as ×1, ×2, ×3 importance). A large "FINAL SCORE" display at the bottom calculates the weighted average in real-time.

Score input: each dimension has a clickable 0.0 to 10.0 display. Click → number dial appears (like a slot machine) where you scroll to set the score.

Flow:

  1. Scoreboard appears with all scores at "—.—" (not yet rated).
  2. Dimensions displayed with weights:
    • Functionality (×3) — does it work?
    • Code Quality (×2) — is it clean?
    • Innovation (×1) — is it creative?
    • Documentation (×2) — is it explained?
    • Performance (×2) — is it fast?
  3. Operator clicks each score display → slot-machine-style number picker scrolls into view.
  4. Spin/drag to set score (0.0 to 10.0, 0.5 increments).
  5. Each score locks in with a "ding" and LED display updates.
  6. FINAL SCORE updates in real-time as each dimension is scored.
  7. After all scores: a medal is awarded based on final score (🥇 >9.0, 🥈 >7.5, 🥉 >6.0, no medal ≤6.0).
  8. Submit with optional one-liner.

Data Collected:

  • scores: { dimension: float } map
  • weights: { dimension: multiplier } (sent for reference)
  • weighted_final: calculated float
  • medal: "gold"|"silver"|"bronze"|"none"
  • scoring_order: which dimensions rated first (reveals priority)
  • score_adjustments: any scores changed after initial set

Why It's Special: Weighted scoring captures that some things matter more than others. The LED/scoreboard aesthetic makes it feel official and consequential. The medal system creates a target the AI can aim for. Score adjustment tracking catches second-guessing. Real-time weighted average gives the operator immediate feedback on overall quality.
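The real-time weighted average and the medal cutoffs can be sketched as two small functions. The weights and thresholds come from the flow above; the function names, the snake_case dimension keys, and the choice to average only over dimensions rated so far are assumptions.

```javascript
// Weights from the design; the key spelling is an assumption.
const WEIGHTS = { functionality: 3, code_quality: 2, innovation: 1, documentation: 2, performance: 2 };

// Weighted average over the dimensions rated so far, rounded to one decimal.
// Returns null while nothing has been rated.
function weightedFinal(scores, weights = WEIGHTS) {
  let total = 0;
  let weightSum = 0;
  for (const [dim, score] of Object.entries(scores)) {
    if (typeof score !== "number") continue; // dimension not rated yet
    const w = weights[dim] ?? 1;             // unknown dimensions default to weight 1 (assumption)
    total += score * w;
    weightSum += w;
  }
  return weightSum === 0 ? null : Math.round((total / weightSum) * 10) / 10;
}

// Strict ">" cutoffs; a score sitting exactly on a threshold gets the lower tier.
function medalFor(finalScore) {
  if (finalScore === null) return "none";
  if (finalScore > 9.0) return "gold";
  if (finalScore > 7.5) return "silver";
  if (finalScore > 6.0) return "bronze";
  return "none";
}

// Usage: (9*3 + 8*2 + 7*1 + 8*2 + 9*2) / 10 = 8.4
const exampleScores = { functionality: 9, code_quality: 8, innovation: 7, documentation: 8, performance: 9 };
const exampleFinal = weightedFinal(exampleScores);
console.log(exampleFinal, medalFor(exampleFinal));
```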

HTML/CSS Sketch:

<div class="judges-scorecard">
  <div class="score-displays">
    <div class="score-slot" data-dim="functionality" data-weight="3">
      <div class="led-display">—.—</div>
      <div class="dim-label">Functionality</div>
      <div class="weight-badge">×3</div>
    </div>
    <!-- more slots -->
  </div>
  <div class="final-score-area">
    <div class="medal-icon" id="medal"></div>
    <div class="final-display">
      <span class="final-label">FINAL SCORE</span>
      <span class="final-number" id="final">0.0</span>
    </div>
  </div>
</div>
<style>
  .judges-scorecard { background: #0a0a2e; color: white; padding: 24px; }
  .led-display { font-family: 'Courier New', monospace; font-size: 48px; color: #FF6600;
                 background: #111; padding: 8px 16px; border-radius: 4px; text-align: center;
                 text-shadow: 0 0 10px #FF660088; cursor: pointer; min-width: 80px; }
  .weight-badge { background: #FFD700; color: #000; font-size: 12px; font-weight: bold;
                  padding: 2px 8px; border-radius: 10px; display: inline-block; }
  .final-display { text-align: center; margin-top: 24px; }
  .final-number { font-size: 72px; color: #FFD700; font-weight: bold;
                  text-shadow: 0 0 30px #FFD70066; }
  .medal-icon { font-size: 64px; text-align: center; transition: all 0.5s; }
</style>

13. The Quick Pulse

When: Ultra-fast check-in — "One tap, how are you feeling about the factory's output today?" Best for periodic sentiment collection, ambient mood tracking, or when you need data with zero friction.

Visual: A horizontal gradient bar spanning the full width of the modal. Left end = deep red (#FF2222), middle = neutral gray (#888), right end = vibrant green (#22FF66). The bar is 80px tall with rounded ends.

Above the bar: the question in large text. Below: labels at each end ("Terrible" ← → "Excellent") and center ("Meh").

The bar is clickable ANYWHERE. On click, a white pin/marker drops at that position with a bounce animation. The exact position maps to a -100 to +100 sentiment score.

No buttons needed — clicking the bar IS the submission (with a 2-second "undo" toast if misclicked).

Flow:

  1. Question appears: "Quick pulse: How's {{pipeline_name}} doing?"
  2. Gradient bar below. Operator clicks anywhere on it.
  3. Pin drops with satisfying bounce. Score appears next to pin: "+67".
  4. Small "Undo" link appears for 2 seconds, then auto-submits.
  5. Modal fades with a "Thanks! 🫡" micro-animation.

Data Collected:

  • sentiment_score: -100 to +100
  • click_position_pct: 0-100% (raw position)
  • undo_used: boolean
  • response_time_ms: time from modal appearance to click
  • click_precision: distance from center of gradient (extremity of opinion)

Why It's Special: ZERO friction. One tap and done. Takes literally 1 second. No buttons, no choices to parse, no cognitive load. The continuous gradient captures nuance that a 5-star rating never could. Perfect for high-frequency, low-burden check-ins. Extreme positions (far left/right) signal strong opinions.
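Mapping a click on the bar to the -100 to +100 score is a linear rescale. A minimal sketch, assuming a hypothetical `pulseFromClick` helper that receives the click's x offset within the bar and the bar's width in pixels:

```javascript
// Hypothetical helper; the output keys follow the "Data Collected" list.
function pulseFromClick(clickX, barWidth) {
  // Clamp so clicks on the rounded ends never fall outside 0-100%.
  const pct = Math.min(100, Math.max(0, (clickX / barWidth) * 100));
  const score = Math.round(pct * 2 - 100); // 0% -> -100, 50% -> 0, 100% -> +100
  return {
    sentiment_score: score,
    click_position_pct: Math.round(pct * 10) / 10,
    click_precision: Math.abs(score), // distance from the neutral center
  };
}

// Usage: a click 334px into a 400px-wide bar.
console.log(pulseFromClick(334, 400));
```

In the browser, `clickX` could be computed as `event.clientX - bar.getBoundingClientRect().left`, with `barWidth` taken from the same rectangle.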

HTML/CSS Sketch:

<div class="quick-pulse">
  <h3>Quick pulse: How's {{pipeline}} doing?</h3>
  <div class="pulse-bar" onclick="setPulse(event)">
    <div class="pin hidden" id="pin">📍<span class="score-label">+0</span></div>
  </div>
  <div class="bar-labels">
    <span>Terrible</span><span>Meh</span><span>Excellent</span>
  </div>
  <div class="undo-toast hidden">Misclick? <a onclick="undo()">Undo</a> (2s)</div>
</div>
<style>
  .pulse-bar { height: 80px; border-radius: 40px; cursor: pointer; position: relative;
               background: linear-gradient(90deg, #FF2222, #FF8844, #888888, #44CC66, #22FF66);
               transition: box-shadow 0.2s; }
  .pulse-bar:hover { box-shadow: 0 0 20px rgba(255,255,255,0.2); }
  .pin { position: absolute; top: -30px; transform: translateX(-50%); font-size: 24px;
         animation: pin-drop 0.4s cubic-bezier(0.175, 0.885, 0.32, 1.275); }
  @keyframes pin-drop { 0% { top: -80px; opacity: 0; } 60% { top: -20px; }
                        80% { top: -35px; } 100% { top: -30px; opacity: 1; } }
</style>

14. The Decision Tree

When: Complex decisions with branching logic — "What should we do about this failing pipeline?" Best when the right action depends on multiple factors and you want to narrow down systematically.

Visual: A flowchart-style tree that reveals itself step by step. Dark background with neon blue connecting lines (#00AAFF). Each node is a rounded card with a question and 2-3 choice buttons. Selected path illuminates; unselected branches dim and blur.

The tree starts with one root node and branches down. Lines animate (dotted → solid) as you progress. Current node has a pulsing border. Previous selections show as small breadcrumbs at the top.

Flow:

  1. Root question appears: "What's the main issue with {{pipeline}}?"
  2. Options: "Output Quality" | "Performance" | "Reliability" | "Not sure"
  3. Operator picks one → selected branch animates down, revealing the next question.
    • If "Output Quality" → "Is the output wrong or just mediocre?"
      • "Wrong" → "Is it a logic error or data error?"
      • "Mediocre" → "What would make it better?" (text input)
  4. Tree continues 2-4 levels deep, narrowing the diagnosis.
  5. Each path terminates with a recommended action and confirm button.
  6. Breadcrumb trail at top shows full decision path.

Data Collected:

  • decision_path: ordered array of choices made at each branch
  • terminal_action: the final recommended action
  • path_depth: how many levels before resolution
  • backtrack_count: times operator went back to change a choice
  • time_per_node: deliberation time at each decision point

Why It's Special: Instead of asking one vague question, the tree narrows systematically. This captures not just the decision but the REASONING PROCESS. The AI learns which diagnostic paths lead to which conclusions. Backtracking reveals where the decision wasn't obvious. It makes the operator feel like a detective solving a problem, not filling a form.
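The path, depth, and backtrack data could be captured by a small session object that walks a nested tree. Everything below is an illustrative assumption: the tree shape, the `createTreeSession` name, and the placeholder action names (`open_bug`, `request_feedback`, `profile_pipeline`) are not part of the design.

```javascript
// Hypothetical tree: inner nodes have `question` + `options`, leaves have an `action`.
const EXAMPLE_TREE = {
  question: "What's the main issue?",
  options: {
    "Output Quality": {
      question: "Is the output wrong or just mediocre?",
      options: {
        Wrong: { action: "open_bug" },             // placeholder action name
        Mediocre: { action: "request_feedback" },  // placeholder action name
      },
    },
    Performance: { action: "profile_pipeline" },   // placeholder action name
  },
};

function createTreeSession(root) {
  const stack = [root]; // nodes along the current path, root first
  const path = [];      // option labels chosen along the current path
  let backtracks = 0;
  const current = () => stack[stack.length - 1];
  return {
    current,
    choose(option) {
      const next = current().options?.[option];
      if (!next) throw new Error(`Unknown option: ${option}`);
      path.push(option);
      stack.push(next);
    },
    // Undo the most recent choice and count it as a backtrack.
    back() {
      if (path.length === 0) return;
      path.pop();
      stack.pop();
      backtracks++;
    },
    result() {
      return {
        decision_path: [...path],
        terminal_action: current().action ?? null,
        path_depth: path.length,
        backtrack_count: backtracks,
      };
    },
  };
}

// Usage: one false start, then a two-level diagnosis.
const treeSession = createTreeSession(EXAMPLE_TREE);
treeSession.choose("Performance");
treeSession.back();
treeSession.choose("Output Quality");
treeSession.choose("Wrong");
console.log(treeSession.result());
```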

HTML/CSS Sketch:

<div class="decision-tree">
  <div class="breadcrumbs" id="breadcrumbs"></div>
  <div class="tree-canvas">
    <div class="tree-node active" id="root">
      <p class="node-question">What's the main issue?</p>
      <div class="node-options">
        <button class="tree-btn" onclick="selectBranch('quality')">Output Quality</button>
        <button class="tree-btn" onclick="selectBranch('performance')">Performance</button>
        <button class="tree-btn" onclick="selectBranch('reliability')">Reliability</button>
      </div>
    </div>
    <svg class="tree-lines">
      <line class="branch-line" x1="50%" y1="0" x2="30%" y2="100"/>
    </svg>
    <!-- child nodes render dynamically -->
  </div>
</div>
<style>
  .decision-tree { background: #0d0d1a; padding: 24px; min-height: 400px; }
  .tree-node { background: #1a1a2e; border: 2px solid #333; border-radius: 12px; padding: 16px;
               max-width: 300px; margin: 0 auto 20px; transition: all 0.4s; }
  .tree-node.active { border-color: #00AAFF; box-shadow: 0 0 20px rgba(0,170,255,0.3);
                      animation: node-pulse 2s infinite; }
  .tree-node.selected { border-color: #00FF66; opacity: 0.7; }
  .tree-node.dimmed { opacity: 0.2; filter: blur(2px); }
  .tree-btn { padding: 10px 20px; border-radius: 8px; background: #00AAFF22;
              border: 1px solid #00AAFF; color: #00AAFF; cursor: pointer; }
  .tree-btn:hover { background: #00AAFF44; }
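  .branch-line { stroke: #00AAFF; stroke-width: 2; stroke-dasharray: 8 4;
                 animation: dash-flow 1s linear infinite; }
  /* Shift the dashes by one full period (8 + 4) so the line appears to flow. */
  @keyframes dash-flow { to { stroke-dashoffset: -12; } }
  @keyframes node-pulse { 0%,100% { box-shadow: 0 0 20px rgba(0,170,255,0.3); }
                         50% { box-shadow: 0 0 40px rgba(0,170,255,0.5); } }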
</style>

15. The Hot Take

When: Time-sensitive decisions — "This PR has been waiting 4 hours, decide NOW." Best for creating urgency on items that have been pending too long, or for preventing analysis paralysis.

Visual: A massive circular countdown timer dominating the center of the modal (250px diameter). The ring counts down from the allotted time (e.g., 30 seconds) with the ring depleting clockwise. Color transitions: green → yellow → orange → red as time runs out.

Below the timer: the decision question and a brief summary. At the bottom: decision buttons that get BIGGER as time runs out (subtle scale animation, starting at 100% and ending at 120% at time-zero).

If the timer expires: "AUTO-SKIPPED ⏭️" — the item goes back to the queue with a "timed_out" flag.

Flow:

  1. Modal appears with a 3-second "Incoming!" warning.
  2. Timer starts (configurable: 15s, 30s, 60s). Ring depletes.
  3. Decision presented with buttons: "APPROVE" | "REJECT" | "NEEDS WORK"
  4. As timer enters final 10 seconds, a subtle heartbeat pulse overlays the modal.
  5. Clicking a button BEFORE timeout → timer stops, confetti burst, "Decided in Xs!"
  6. Timer expiring → item auto-skipped, "Too slow! Item re-queued."
  7. Either way, a stats bar shows: "Your avg decision time: 12s | Items pending: 47"

Data Collected:

  • decision: "approve"|"reject"|"needs_work"|"timed_out"
  • time_remaining: seconds left when decided
  • timer_duration: how long they had
  • in_panic_zone: boolean (decided in last 25% of time)
  • auto_skipped: boolean

Why It's Special: Urgency WORKS. Parkinson's law says work expands to fill time — constraining time produces faster, more instinctive decisions. Items that consistently time out reveal where the operator is uncertain (valuable signal). The growing buttons are a subtle pressure mechanic. The heartbeat effect creates genuine tension. This modal says "your attention is needed NOW."
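All of the time-dependent visuals (ring depletion, button growth, heartbeat, panic zone) can be derived from elapsed time alone. The sketch below assumes a hypothetical `hotTakeState` function called once per tick; the 691 circumference, the 100% to 120% button scale, the final-10-seconds heartbeat, and the last-25% panic zone come from the design, while the linear interpolation is an assumption.

```javascript
const CIRCUMFERENCE = 691; // roughly 2 * PI * 110, matching the sketch's stroke-dasharray

// Hypothetical per-tick state; elapsedS and durationS are in seconds.
function hotTakeState(elapsedS, durationS) {
  const remaining = Math.max(0, durationS - elapsedS);
  const progress = 1 - remaining / durationS; // 0 at start, 1 at time-zero
  return {
    time_remaining: remaining,
    dashoffset: Math.round(CIRCUMFERENCE * progress), // larger offset = more of the ring hidden
    button_scale: 1 + 0.2 * progress,                 // value for the --scale custom property
    heartbeat: remaining > 0 && remaining <= 10,
    in_panic_zone: remaining <= durationS * 0.25,
    timed_out: remaining === 0,
  };
}

// Usage: 24 seconds into a 30-second timer.
console.log(hotTakeState(24, 30));
```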

HTML/CSS Sketch:

<div class="hot-take">
  <div class="timer-ring">
    <svg viewBox="0 0 250 250">
      <circle class="ring-bg" cx="125" cy="125" r="110" fill="none" stroke="#333" stroke-width="12"/>
      <circle class="ring-progress" cx="125" cy="125" r="110" fill="none" stroke="#00FF66"
              stroke-width="12" stroke-dasharray="691" stroke-dashoffset="0"
              stroke-linecap="round" transform="rotate(-90 125 125)"/>
    </svg>
    <div class="timer-text" id="timer-display">30</div>
  </div>
  <div class="question">
    <h3>{{decision_question}}</h3>
    <p class="summary">{{brief_summary}}</p>
  </div>
  <div class="hot-buttons" style="--scale: 1;">
    <button class="hot-btn approve">✅ APPROVE</button>
    <button class="hot-btn reject">❌ REJECT</button>
    <button class="hot-btn rework">🔧 NEEDS WORK</button>
  </div>
</div>
<style>
  .hot-take { text-align: center; padding: 24px; background: #111; }
  .timer-ring { width: 250px; height: 250px; margin: 0 auto; position: relative; }
  .ring-progress { transition: stroke-dashoffset 1s linear, stroke 1s; }
  .ring-progress.warning { stroke: #FFD700; }
  .ring-progress.danger { stroke: #FF3333; }
  .timer-text { position: absolute; top: 50%; left: 50%; transform: translate(-50%,-50%);
                font-size: 64px; font-weight: bold; color: white; }
  .hot-buttons { display: flex; gap: 12px; justify-content: center; margin-top: 24px; }
  .hot-btn { padding: 16px 32px; font-size: 18px; font-weight: bold; border: none;
             border-radius: 12px; cursor: pointer; transform: scale(var(--scale));
             transition: transform 0.3s; }
  .hot-take.heartbeat { animation: heartbeat 0.8s ease-in-out infinite; }
  @keyframes heartbeat { 0%,100% { transform: scale(1); } 50% { transform: scale(1.01); } }
</style>

16. The Side-by-Side Arena

When: A/B comparison — "Two approaches to the same problem, which is better?" Best when the AI generated multiple solutions and needs preference data.

Visual: Two contender panels side by side, styled like a fighting game character select screen. Left panel = "CONTENDER A" with a blue border and blue glow. Right panel = "CONTENDER B" with an orange border and orange glow. A giant "VS" in the center with animated electricity effects.

Each panel shows: approach name, strategy summary, key metrics, code/content preview. Panels are scrollable independently.

Below: "AND THE WINNER IS..." with two massive buttons: 🔵 A and 🟠 B, plus a smaller "🤝 TIE" button in the center.

Flow:

  1. "VERSUS" splash animation (0.5s).
  2. Both contenders revealed simultaneously.
  3. Operator examines both (scrollable panels). Can click to expand either.
  4. Clicks winner → winning panel does a celebration animation (scales up, sparkles), loser dims and shrinks.
  5. "What made the winner better?" — quick chips: "Cleaner", "Faster", "Simpler", "More Complete", "Better DX".
  6. "Margin of victory?": Narrow/Clear/Decisive (close match vs. blowout).
  7. Submit.

Data Collected:

  • winner: "A"|"B"|"tie"
  • winning_factors: string[]
  • victory_margin: "narrow"|"clear"|"decisive"
  • time_on_A: ms spent scrolling/viewing panel A
  • time_on_B: ms spent viewing panel B
  • panels_expanded: which panels were expanded for detail view

Why It's Special: Direct comparison eliminates rating-scale ambiguity. The competitive framing (VS, arena, contenders) makes it engaging. Time-on-panel tracking reveals which approach was easier to understand. Victory margin captures nuance: a narrow A win is very different from a decisive A win. This data directly trains preference models.
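Time-on-panel tracking amounts to accumulating the intervals during which each panel has the operator's attention. A minimal sketch, assuming a hypothetical `createDwellTracker` that is told whenever attention moves (for example on pointer enter or scroll) and receives timestamps explicitly:

```javascript
// Hypothetical tracker; how "attention" is detected is left to the caller.
function createDwellTracker() {
  const totals = { A: 0, B: 0 };
  let active = null; // "A", "B", or null when neither panel is in focus
  let since = 0;
  return {
    // Call when attention moves to a panel ("A" or "B") or away from both (null).
    focus(panel, now) {
      if (active !== null) totals[active] += now - since;
      active = panel;
      since = now;
    },
    // Snapshot including the still-open interval, e.g. at submit time.
    snapshot(now) {
      return {
        time_on_A: totals.A + (active === "A" ? now - since : 0),
        time_on_B: totals.B + (active === "B" ? now - since : 0),
      };
    },
  };
}

// Usage: 3s on A, 1.5s on B, a pause, then 1s more on A before submitting.
const dwell = createDwellTracker();
dwell.focus("A", 0);
dwell.focus("B", 3000);
dwell.focus(null, 4500);
dwell.focus("A", 5000);
console.log(dwell.snapshot(6000));
```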

HTML/CSS Sketch:

<div class="arena">
  <div class="contender contender-a">
    <div class="contender-header">🔵 CONTENDER A</div>
    <div class="contender-content">{{content_a}}</div>
  </div>
  <div class="vs-badge">
    <span>VS</span>
    <div class="electricity"></div>
  </div>
  <div class="contender contender-b">
    <div class="contender-header">🟠 CONTENDER B</div>
    <div class="contender-content">{{content_b}}</div>
  </div>
  <div class="verdict-section">
    <h3>AND THE WINNER IS...</h3>
    <div class="winner-buttons">
      <button class="win-btn a-btn">🔵 A</button>
      <button class="win-btn tie-btn">🤝 TIE</button>
      <button class="win-btn b-btn">🟠 B</button>
    </div>
  </div>
</div>
<style>
  .arena { display: grid; grid-template-columns: 1fr auto 1fr; gap: 12px; padding: 16px;
           background: #111; }
  .contender { border-radius: 12px; padding: 16px; max-height: 300px; overflow-y: auto; }
  .contender-a { border: 2px solid #4488FF; background: rgba(68,136,255,0.1);
                 box-shadow: 0 0 30px rgba(68,136,255,0.2); }
  .contender-b { border: 2px solid #FF8844; background: rgba(255,136,68,0.1);
                 box-shadow: 0 0 30px rgba(255,136,68,0.2); }
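  .vs-badge { display: flex; align-items: center; justify-content: center; font-size: 36px;
              font-weight: 900; color: white; text-shadow: 0 0 20px white; }
  .contender.winner { animation: celebrate 0.5s ease; transform: scale(1.05);
                      box-shadow: 0 0 60px rgba(255,215,0,0.5); }
  .contender.loser { opacity: 0.3; transform: scale(0.95); filter: grayscale(80%); }
  /* The verdict row spans all three grid columns instead of landing in the first cell. */
  .verdict-section { grid-column: 1 / -1; text-align: center; }
  /* Overshoot, then settle at the winner's resting scale. */
  @keyframes celebrate { 0% { transform: scale(1); } 50% { transform: scale(1.12); } 100% { transform: scale(1.05); } }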
</style>

17. The Checklist Ceremony

When: Verification that all requirements are met — "Does this MCP server check all the boxes?" Best for acceptance criteria, quality gates, or compliance checks.

Visual: A clean, minimal design on dark background. Each checklist item is a full-width row (~60px tall) with a large custom checkbox on the left and the criterion text on the right.

The magic is in the animation: checking a box triggers a cascade in which the checkbox morphs from ☐ to ☑ with a satisfying "pop" plus a green ripple effect that washes across the row. A progress bar at the top fills incrementally with each check.

Unchecked items have a subtle red-orange glow to draw attention. When ALL items are checked, a massive golden "ALL CLEAR ✨" celebration animates across the screen.

Flow:

  1. Checklist appears with all items unchecked. Progress bar at 0%.
  2. Each item shows the criterion + a brief auto-generated note about whether the AI thinks it passes.
  3. Operator checks items they confirm. Each check = satisfying animation + progress bar fill.
  4. If an item DOESN'T pass, operator leaves it unchecked and can tap a small "📝" icon to add a note about what's missing.
  5. When all items are checked → "ALL CLEAR" celebration → "Ship It" button.
  6. If ANY items are unchecked → "Submit with {{N}} issues noted" button (amber colored).

Data Collected:

  • checks: { criterion: boolean } map
  • notes: { criterion: string } (for unchecked items)
  • check_order: which items checked first/last
  • total_checked: count
  • completion_rate: pct
  • all_clear: boolean
  • time_per_check: how long between each check (rushing vs careful)

Why It's Special: EVERY CHECK FEELS LIKE AN ACHIEVEMENT. The progressive animations create momentum — you WANT to check the next box. The celebration at 100% creates a clear reward. The AI's pre-assessment for each item educates the operator. Unchecked items with notes become specific action items, not vague "needs work" feedback.
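The derived fields (`total_checked`, `completion_rate`, `all_clear`) and the choice between the two submit buttons follow directly from the `checks` map. A sketch, assuming a hypothetical `summarizeChecklist` helper; `submit_label` is an illustrative extra, and the singular/plural handling is an addition to the `{{N}} issues` template.

```javascript
// Hypothetical helper; `checks` is the { criterion: boolean } map from the data list.
function summarizeChecklist(checks) {
  const criteria = Object.keys(checks);
  const totalChecked = criteria.filter((c) => checks[c]).length;
  const issues = criteria.length - totalChecked;
  return {
    total_checked: totalChecked,
    completion_rate: criteria.length ? Math.round((totalChecked / criteria.length) * 100) : 0,
    all_clear: criteria.length > 0 && issues === 0,
    submit_label:
      issues === 0 ? "Ship It" : `Submit with ${issues} issue${issues === 1 ? "" : "s"} noted`,
  };
}

// Usage: two of four criteria confirmed.
console.log(
  summarizeChecklist({ error_codes: true, auth: true, rate_limits: false, docs: false })
);
```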

HTML/CSS Sketch:

<div class="checklist-ceremony">
  <div class="progress-bar">
    <div class="progress-fill" style="width: 0%;"></div>
    <span class="progress-text">0/8 verified</span>
  </div>
  <div class="checklist">
    <div class="check-row">
      <div class="checkbox" onclick="toggleCheck(this)">
        <div class="check-icon"></div>
      </div>
      <div class="criterion">
        <span class="criterion-text">All API endpoints return proper error codes</span>
        <span class="ai-note">AI: Tests show 200/400/500 responses ✓</span>
      </div>
      <button class="note-btn">📝</button>
    </div>
    <!-- more rows -->
  </div>
  <div class="all-clear hidden">✨ ALL CLEAR ✨</div>
  <button class="submit-btn">Ship It 🚀</button>
</div>
<style>
  .check-row { display: flex; align-items: center; gap: 16px; padding: 12px 16px;
               border-radius: 8px; margin: 4px 0; transition: all 0.3s; }
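  /* Rows carry no "unchecked" class in the markup, so target every row that is not yet checked. */
  .check-row:not(.checked) { box-shadow: inset 0 0 0 1px rgba(255,140,0,0.3); }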
  .checkbox { width: 48px; height: 48px; border-radius: 12px; border: 3px solid #555;
              display: flex; align-items: center; justify-content: center; cursor: pointer;
              transition: all 0.3s; }
  .checkbox.checked { background: #00CC55; border-color: #00CC55;
                      animation: check-pop 0.4s cubic-bezier(0.175, 0.885, 0.32, 1.275); }
  .check-row.checked { background: rgba(0,204,85,0.1); }
  @keyframes check-pop { 0% { transform: scale(0.5); } 50% { transform: scale(1.2); }
                         100% { transform: scale(1); } }
  .progress-fill { height: 100%; background: linear-gradient(90deg, #00CC55, #00FF66);
                   border-radius: 4px; transition: width 0.5s ease; }
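  .all-clear { font-size: 48px; text-align: center; animation: celebrate 1s ease;
               color: #FFD700; text-shadow: 0 0 30px #FFD700; }
  @keyframes celebrate { 0% { transform: scale(0.5); opacity: 0; } 60% { transform: scale(1.15); opacity: 1; }
                         100% { transform: scale(1); } }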
</style>

18. The Confidence Meter

When: Any approval where certainty level matters — "How sure are you about this approval?" Pairs with any other modal as a follow-up step.

Visual: A semicircular gauge (like a speedometer/tachometer) with the needle pointing upward at the center (default: uncertain). Left = "Wild Guess" (red zone), Center = "Educated Guess" (yellow zone), Right = "Rock Solid" (green zone).

The gauge has markings like an actual car dashboard. The needle is draggable — grab it and sweep it to your confidence level. As you move the needle, the zone illumination changes.

Below the gauge: a text adaptation of the selected zone. E.g., "You're 72% confident — this is an 'Educated Guess' level approval."

Flow:

  1. Question: "How confident are you in your decision on {{item}}?"
  2. Gauge appears with needle at center (50%).
  3. Operator drags the needle (or clicks on the arc to snap to a position).
  4. Below: dynamic text explains the confidence level.
  5. If confidence is LOW (<30%): "What would make you more confident?" text input appears.
  6. If confidence is HIGH (>80%): "Want to mark this as a training example?" toggle appears.
  7. Submit.

Data Collected:

  • confidence_pct: 0-100
  • confidence_zone: "guess"|"educated_guess"|"confident"|"certain"
  • low_confidence_reason: string (if applicable)
  • is_training_example: boolean (if high confidence)
  • needle_journey: array of values during drag (showing the deliberation)
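The mapping from needle position to the collected fields could look roughly like the sketch below. The function names are hypothetical, and the 60% boundary between "educated_guess" and "confident" is an assumption; only the 30% and 80% cut-offs come from the flow above.

```javascript
// Hypothetical helpers; names and the 60% boundary are assumptions, not part of the spec.

// Map a needle angle (-90 = far left, +90 = far right) to a 0-100 percentage.
function angleToConfidence(angleDeg) {
  const clamped = Math.max(-90, Math.min(90, angleDeg));
  return Math.round(((clamped + 90) / 180) * 100);
}

// The 30 and 80 boundaries match the follow-up rules in the flow; 60 is a guess.
function confidenceZone(pct) {
  if (pct < 30) return "guess";
  if (pct < 60) return "educated_guess";
  if (pct <= 80) return "confident";
  return "certain";
}

// Decide which follow-up prompt to reveal (steps 5 and 6 of the flow).
function followUpFor(pct) {
  if (pct < 30) return "low_confidence_reason";
  if (pct > 80) return "is_training_example";
  return null;
}
```

A drag handler would call `angleToConfidence` on every pointer move and push each result onto `needle_journey` before updating the readout.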

Why It's Special: Most review systems treat all approvals as equal. But "I'm 90% sure" and "I'm 40% sure" are VASTLY different signals. Low-confidence approvals can trigger secondary review. High-confidence approvals become training examples. The speedometer metaphor is instantly intuitive and satisfying to interact with.

HTML/CSS Sketch:

<div class="confidence-meter">
  <h3>How confident are you?</h3>
  <div class="gauge-container">
    <svg class="gauge" viewBox="0 0 300 180">
      <path class="zone red" d="M 30 150 A 120 120 0 0 1 90 40" fill="none" stroke="#FF3333" stroke-width="20"/>
      <path class="zone yellow" d="M 90 40 A 120 120 0 0 1 210 40" fill="none" stroke="#FFD700" stroke-width="20"/>
      <path class="zone green" d="M 210 40 A 120 120 0 0 1 270 150" fill="none" stroke="#00CC55" stroke-width="20"/>
      <!-- var() is not resolved inside the SVG transform attribute, so rotate via CSS instead -->
      <line class="needle" x1="150" y1="150" x2="150" y2="30"
            style="transform: rotate(var(--angle, 0deg));" stroke="white" stroke-width="3"/>
      <circle cx="150" cy="150" r="8" fill="white"/>
    </svg>
    <div class="confidence-text">
      <span class="pct" id="conf-pct">50%</span>
      <span class="zone-label" id="conf-zone">Educated Guess</span>
    </div>
  </div>
  <div class="follow-up hidden" id="low-conf">
    <p>What would make you more confident?</p>
    <input type="text" placeholder="e.g., Need to see test results...">
  </div>
</div>
<style>
  .gauge-container { position: relative; width: 300px; margin: 0 auto; }
  .needle { transform-origin: 150px 150px; transition: transform 0.1s;
            filter: drop-shadow(0 0 4px white); }
  .zone { stroke-linecap: round; opacity: 0.3; }
  .zone.active { opacity: 1; filter: drop-shadow(0 0 10px currentColor); }
  .confidence-text { text-align: center; margin-top: -20px; }
  .pct { font-size: 48px; font-weight: bold; }
</style>

19. The Voice of the Customer

When: Empathy-driven evaluation — "Put yourself in the end user's shoes. Would they be happy with this?" Best for user-facing features, API design, documentation, or UX copy.

Visual: A persona card at the top: avatar silhouette, persona name (e.g., "Alex, Solo Developer"), brief context ("Building a weekend project, just wants it to work"). This is the imaginary customer.

Below: the deliverable being reviewed, framed as "What Alex would see/experience."

Then: a customer satisfaction interface styled like a product review:

  • Star rating (1-5 large stars)
  • "Would you recommend this?" (thumbs up/down)
  • "How does this compare to alternatives?" (worse/same/better)
  • "Write a review as Alex" (character-limited text area styled like an app store review)

Flow:

  1. Persona card introduces the "customer" with context about their needs.
  2. Deliverable shown from the customer's perspective (e.g., API docs shown as a developer would encounter them).
  3. Operator role-plays as the customer:
    • Stars (1-5) for overall satisfaction
    • Recommend? (👍/👎)
    • vs. Alternatives? (worse/same/better)
  4. "Write a brief review as {{persona_name}}" — textarea styled as app store review.
  5. Submit with the "Post Review" button.

Data Collected:

  • persona_used: string
  • star_rating: 1-5
  • would_recommend: boolean
  • vs_alternatives: "worse"|"same"|"better"
  • review_text: string
  • empathy_shift: did the persona-framing change the operator's typical assessment pattern?
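Before submission, the form values could be normalized into the fields listed above. This is a minimal sketch: `buildCustomerReview` and its error messages are hypothetical, and the 280-character cap is taken from the `maxlength` on the textarea in the sketch below.

```javascript
// Hypothetical payload builder; field names follow the "Data Collected" list,
// the function name and error messages are assumptions.
function buildCustomerReview({ persona, stars, recommend, vsAlternatives, reviewText }) {
  if (!Number.isInteger(stars) || stars < 1 || stars > 5) {
    throw new RangeError("star_rating must be an integer from 1 to 5");
  }
  if (!["worse", "same", "better"].includes(vsAlternatives)) {
    throw new RangeError("vs_alternatives must be worse, same, or better");
  }
  return {
    persona_used: persona,
    star_rating: stars,
    would_recommend: Boolean(recommend),
    vs_alternatives: vsAlternatives,
    // Mirror the textarea's 280-character limit in case the value arrives another way.
    review_text: String(reviewText ?? "").trim().slice(0, 280),
  };
}
```

`empathy_shift` is left out on purpose: it compares against the operator's historical ratings, which would have to be computed by whatever system stores past reviews.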

Why It's Special: Forces EMPATHY. Most reviews are from the creator's perspective ("is the code clean?") not the consumer's ("can I figure this out?"). The persona card creates psychological distance that produces more honest evaluations. The app-store-review format is familiar and encourages specific, actionable feedback.

HTML/CSS Sketch:

<div class="voice-of-customer">
  <div class="persona-card">
    <div class="avatar">👤</div>
    <div class="persona-info">
      <h3>Alex, Solo Developer</h3>
      <p>Building a weekend project. Just wants clear docs and working examples.</p>
    </div>
  </div>
  <div class="deliverable-preview">
    <h4>What Alex sees:</h4>
    <div class="preview-frame">{{deliverable_content}}</div>
  </div>
  <div class="review-form">
    <div class="star-rating">
      <span class="star" data-value="1">★</span>
      <span class="star" data-value="2">★</span>
      <span class="star" data-value="3">★</span>
      <span class="star" data-value="4">★</span>
      <span class="star" data-value="5">★</span>
    </div>
    <div class="recommend-row">
      Would Alex recommend? <button class="thumb up">👍</button> <button class="thumb down">👎</button>
    </div>
    <textarea class="review-text" placeholder="Write a brief review as Alex..." maxlength="280"></textarea>
    <button class="submit-btn">Post Review ⭐</button>
  </div>
</div>
<style>
  .persona-card { display: flex; gap: 16px; align-items: center; background: #2a2a3e;
                  padding: 16px; border-radius: 12px; border-left: 4px solid #00AAFF; }
  .avatar { font-size: 48px; }
  .star { font-size: 40px; color: #555; cursor: pointer; transition: all 0.2s; }
  .star.active { color: #FFD700; transform: scale(1.1); text-shadow: 0 0 10px #FFD700; }
  .star:hover { transform: scale(1.2); }
  .review-text { width: 100%; min-height: 80px; background: #1a1a2e; border: 1px solid #444;
                 border-radius: 8px; color: white; padding: 12px; resize: vertical;
                 font-family: -apple-system, sans-serif; }
</style>

20. The Retrospective Board

When: Periodic holistic review — "Let's reflect on this pipeline's performance this week." Best for weekly reviews, sprint retros, or when you want structured qualitative feedback.

Visual: Three columns in a horizontal layout, styled like sticky notes on a whiteboard:

  • 🟢 START (green column) — "What should we start doing?"
  • 🔴 STOP (red column) — "What should we stop doing?"
  • 🔵 CONTINUE (blue column) — "What's working well? Keep it up."

Each column has a "+" button at the top. Clicking it adds a new "sticky note" (small card with a text input). Notes can be dragged between columns if the operator changes their mind.

The whiteboard has a subtle grid background. Sticky notes have slight random rotation (±3°) and a paper shadow for realism.

Flow:

  1. Retro board appears with 3 empty columns. Pipeline name and date range at the top.
  2. Operator clicks "+" in any column to add a note. A blank sticky appears with auto-focused text input.
  3. Type a brief note (max 100 chars per note), hit Enter → note "sticks" with a subtle thwap animation.
  4. Add as many notes per column as desired.
  5. Drag notes between columns to reclassify.
  6. Optional: click a note to add a priority tag (🔥 urgent, 📌 important, 💡 idea).
  7. "Wrap Up Retro" button submits all notes.

Data Collected:

  • start: Array of { text, priority }
  • stop: Array of { text, priority }
  • continue: Array of { text, priority }
  • column_balance: ratio of notes per column (all "stop" = bad sign)
  • note_count: total notes (engagement level)
  • reclassified_notes: notes that were dragged between columns (shows initial misassessment)
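The derived fields can be computed from a flat list of notes at submit time. The sketch below assumes each note records both its current column and the column it was first created in; that note shape and the name `summarizeRetro` are hypothetical.

```javascript
// Hypothetical note shape: { text, priority, column, originalColumn }.
function summarizeRetro(notes) {
  const columns = { start: [], stop: [], continue: [] };
  for (const note of notes) {
    columns[note.column].push({ text: note.text, priority: note.priority ?? null });
  }
  const total = notes.length;
  // Share of all notes that sit in a given column (0 when the board is empty).
  const share = (name) => (total === 0 ? 0 : columns[name].length / total);
  return {
    ...columns,
    column_balance: { start: share("start"), stop: share("stop"), continue: share("continue") },
    note_count: total,
    reclassified_notes: notes
      .filter((n) => n.originalColumn !== n.column)
      .map((n) => ({ text: n.text, from: n.originalColumn, to: n.column })),
  };
}
```

A `column_balance.stop` close to 1 is the "all STOP" warning sign described above.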

Why It's Special: Start/Stop/Continue is a proven retrospective framework. The sticky note mechanic feels tangible and collaborative. Random rotation adds organic personality. The freedom to add unlimited notes prevents forced choice. Column balance is a meta-metric: if the operator only fills "STOP", the AI knows there's serious dissatisfaction. Reclassification tracking reveals the operator's thought process.

HTML/CSS Sketch:

<div class="retro-board">
  <h3>Retrospective: {{pipeline}} — Week of {{date}}</h3>
  <div class="columns">
    <div class="retro-column start">
      <h4>🟢 START</h4>
      <button class="add-note" onclick="addNote('start')">+ Add note</button>
      <div class="notes-area" data-column="start">
        <!-- sticky notes appear here -->
      </div>
    </div>
    <div class="retro-column stop">
      <h4>🔴 STOP</h4>
      <button class="add-note" onclick="addNote('stop')">+ Add note</button>
      <div class="notes-area" data-column="stop"></div>
    </div>
    <div class="retro-column continue">
      <h4>🔵 CONTINUE</h4>
      <button class="add-note" onclick="addNote('continue')">+ Add note</button>
      <div class="notes-area" data-column="continue"></div>
    </div>
  </div>
  <button class="submit-btn">Wrap Up Retro 📋</button>
</div>
<style>
  .retro-board { background: #f5f5f0; color: #333; padding: 24px;
                 background-image: linear-gradient(#ddd 1px, transparent 1px),
                 linear-gradient(90deg, #ddd 1px, transparent 1px);
                 background-size: 40px 40px; }
  .columns { display: grid; grid-template-columns: 1fr 1fr 1fr; gap: 16px; }
  .retro-column { min-height: 300px; padding: 12px; border-radius: 8px; }
  .retro-column.start { background: rgba(0,204,85,0.1); }
  .retro-column.stop { background: rgba(255,51,51,0.1); }
  .retro-column.continue { background: rgba(68,136,255,0.1); }
  .sticky-note { background: #FFFFA5; padding: 12px; border-radius: 2px; margin: 8px 0;
                 box-shadow: 2px 3px 6px rgba(0,0,0,0.2); cursor: grab;
                 /* Per-note tilt (within ±3deg) is supplied by script through --tilt */
                 transform: rotate(var(--tilt, 0deg)); font-family: 'Comic Sans MS', cursive;
                 min-height: 60px; }
  .sticky-note.stop { background: #FFAAAA; }
  .sticky-note.continue { background: #AACCFF; }
</style>
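The "+ Add note" buttons in the sketch call `addNote()`, which is not defined anywhere in this document. One possible implementation is sketched below; the element structure, the `--tilt` custom property, and the helper `randomTiltDeg` are assumptions. The tilt is generated in script because the stylesheet cannot produce a per-element random value on its own.

```javascript
// Hypothetical implementation of the addNote() handler referenced by the sketch.

// Tilt in degrees within ±3, matching the "slight random rotation" in the visual spec.
function randomTiltDeg(rand = Math.random) {
  return (rand() - 0.5) * 6;
}

function addNote(column) {
  const area = document.querySelector(`.notes-area[data-column="${column}"]`);
  const note = document.createElement("div");
  note.className = `sticky-note ${column}`;
  note.dataset.originalColumn = column; // lets the submit step detect reclassified notes
  note.style.setProperty("--tilt", `${randomTiltDeg().toFixed(2)}deg`);
  note.style.transform = "rotate(var(--tilt))";

  const input = document.createElement("textarea");
  input.maxLength = 100; // "max 100 chars per note" from the flow
  note.appendChild(input);
  area.appendChild(note);
  input.focus();
  return note;
}
```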

🎁 BONUS MODALS (5 Additional Creative Designs)


21. The Slot Machine

When: Adding an element of serendipity to quality reviews — "Pull the lever and see if the output hits the jackpot." Best for breaking monotony during long review sessions, or when you want to capture snap quality judgments with a fun wrapper.

Visual: A classic 3-reel slot machine in bold retro casino style. Gold trim, cherry red body, dark background. The three reels don't auto-spin — instead, they display the operator's ASSESSMENT. Each reel represents a dimension:

  • Reel 1: Quality (💎 Diamond = Excellent → 🍋 Lemon = Poor)
  • Reel 2: Completeness (7 = Full → 🍒 = Partial)
  • Reel 3: Would Ship? (🎰 BAR = Yes → bottom symbol = No)

The operator clicks each reel to cycle through its symbols (3-4 options each). Then pulls a big lever handle on the right side to submit.

If all three reels land on top symbols (💎 7 🎰) → JACKPOT animation: flashing lights, coins falling, "TRIPLE CROWN OUTPUT!" text.

Flow:

  1. Slot machine appears with reels showing "?" symbols.
  2. Click Reel 1 → cycles through quality symbols with a spinning animation. Click to stop on chosen symbol.
  3. Repeat for Reels 2 and 3.
  4. Pull the lever (click and drag down) → reels do a final fast spin and land on chosen values.
  5. If jackpot → celebration. If all bottom symbols → "BUST" with sympathetic animation.
  6. Coin counter at top shows "Jackpots this session: 3/12 reviews"
  7. Auto-submit after lever pull with a satisfying "ka-ching."

Data Collected:

  • reel_1_quality: symbol/score
  • reel_2_completeness: symbol/score
  • reel_3_shippable: symbol/score
  • is_jackpot: boolean (all top marks)
  • is_bust: boolean (all bottom marks)
  • jackpot_rate: running ratio across session
  • time_per_reel: deliberation per dimension
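Jackpot and bust detection reduce to checking whether every reel sits on its first or last symbol. In the sketch below, only the top and bottom symbols of reels 1 and 2 come from the spec; the middle symbols, reel 3's bottom symbol, and the function names are placeholders.

```javascript
// Hypothetical reel definitions, ordered best to worst. Middle symbols and the
// last symbol of reel 3 are placeholders, not taken from the spec.
const REELS = [
  ["💎", "⭐", "🔔", "🍋"],
  ["7", "🔔", "🍒"],
  ["🎰", "🔔", "✖"],
];

// selections[i] is an index into REELS[i], or -1 while the reel still shows "?".
function nextSelection(selections, reel) {
  const next = selections.slice();
  next[reel] = (selections[reel] + 1) % REELS[reel].length;
  return next;
}

function scoreSpin(selections) {
  if (selections.some((s) => s < 0)) {
    throw new Error("every reel needs a symbol before the lever pull");
  }
  return {
    reel_1_quality: REELS[0][selections[0]],
    reel_2_completeness: REELS[1][selections[1]],
    reel_3_shippable: REELS[2][selections[2]],
    is_jackpot: selections.every((s) => s === 0),
    is_bust: selections.every((s, i) => s === REELS[i].length - 1),
  };
}
```

`jackpot_rate` would then be a running count of `is_jackpot` results divided by the number of reviews in the session.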

Why It's Special: It's a SLOT MACHINE. In a code review. The absurdity alone makes it memorable and fun. The three-reel structure forces multi-dimensional assessment. The jackpot mechanic creates a target for AI to aim for. The session-long jackpot counter gamifies the entire review batch.

HTML/CSS Sketch:

<div class="slot-machine">
  <div class="machine-body">
    <div class="display-header">🎰 QUALITY SLOTS 🎰</div>
    <div class="jackpot-counter">Jackpots: <strong>0</strong>/0</div>
    <div class="reels">
      <div class="reel" onclick="cycleReel(0)">
        <div class="reel-window"><span class="symbol">?</span></div>
        <div class="reel-label">Quality</div>
      </div>
      <div class="reel" onclick="cycleReel(1)">
        <div class="reel-window"><span class="symbol">?</span></div>
        <div class="reel-label">Complete</div>
      </div>
      <div class="reel" onclick="cycleReel(2)">
        <div class="reel-window"><span class="symbol">?</span></div>
        <div class="reel-label">Ship?</div>
      </div>
    </div>
    <div class="lever-container">
      <div class="lever" draggable="true">
        <div class="lever-knob"></div>
        <div class="lever-arm"></div>
      </div>
    </div>
  </div>
</div>
<style>
  .slot-machine { background: #1a0a2e; padding: 24px; text-align: center; }
  .machine-body { background: linear-gradient(135deg, #8B0000, #CC0000); border-radius: 20px;
                  padding: 24px; border: 4px solid #FFD700; max-width: 400px; margin: 0 auto;
                  box-shadow: 0 0 40px rgba(255,215,0,0.3); }
  .reels { display: flex; justify-content: center; gap: 8px; margin: 20px 0; }
  .reel-window { width: 80px; height: 80px; background: white; border-radius: 8px;
                 display: flex; align-items: center; justify-content: center;
                 font-size: 48px; cursor: pointer; border: 3px solid #333; }
  .reel-window:active { animation: reel-spin 0.3s ease; }
  @keyframes reel-spin { 0% { transform: translateY(0); } 25% { transform: translateY(-20px); }
                         75% { transform: translateY(20px); } 100% { transform: translateY(0); } }
  .lever { cursor: pointer; display: inline-block; }
  .lever-knob { width: 40px; height: 40px; background: #FFD700; border-radius: 50%;
                border: 3px solid #B8860B; margin: 0 auto; }
  .lever-arm { width: 8px; height: 60px; background: #888; margin: 0 auto; border-radius: 4px; }
</style>

22. The Mood Ring

When: Ambient quality sensing — "What color is your vibe for this output?" Best for creative and subjective work where traditional metrics don't apply.

Visual: A large circular ring (200px diameter) that shifts through a full color spectrum as the operator moves their cursor (or finger) around the ring's circumference. Think of a color picker wheel, but each color maps to a mood/quality:

  • Deep purple = "Mystified / Don't understand"
  • Blue = "Cool, calm, solid"
  • Green = "Healthy, good to go"
  • Yellow = "Cautious, something's off"
  • Orange = "Worried, needs attention"
  • Red = "Alarmed, major issues"
  • Pink = "Delighted, love it"

The ring pulses gently with the currently hovered color. The center of the ring displays the mood label for the current position. Background shifts to a gradient matching the selected hue.

Flow:

  1. "What color does this {{deliverable}} give you?" instruction text.
  2. Ring appears. Operator moves cursor around it — the ring GLOWS at the current position, center label updates in real-time.
  3. Click to lock the color/mood.
  4. The entire ring settles into the chosen color with a ripple effect radiating outward.
  5. "Tell me more about this {{mood}} feeling?" — optional text.
  6. Submit.

Data Collected:

  • color_hex: exact color chosen
  • mood_label: mapped string
  • hue_angle: 0-360 degrees (precise position)
  • orbit_pattern: how many times / how far the operator circled before choosing
  • optional_explanation: string
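The pointer position has to be converted into an angle around the ring before a mood can be looked up. The sketch below assumes the angle is measured clockwise from 12 o'clock (where the conic gradient in the HTML sketch starts) and that the seven moods are evenly spaced in gradient order; the short labels and function names are hypothetical.

```javascript
// Moods in the order their colors appear around the ring in the sketch's conic-gradient
// (purple, blue, green, yellow, orange, red, pink). Short labels are assumptions.
const MOODS = ["Mystified", "Calm", "Healthy", "Cautious", "Worried", "Alarmed", "Delighted"];

// Angle of the pointer around the ring's center, in degrees clockwise from 12 o'clock.
function ringAngle(x, y, cx, cy) {
  const deg = (Math.atan2(x - cx, cy - y) * 180) / Math.PI;
  return (deg + 360) % 360;
}

// Pick the mood whose color stop is nearest; stops are evenly spaced around the ring.
function moodAt(angleDeg) {
  const step = 360 / MOODS.length;
  return MOODS[Math.round(angleDeg / step) % MOODS.length];
}

// Total angular distance travelled across sampled angles (one way to fill orbit_pattern).
function orbitDegrees(angles) {
  let total = 0;
  for (let i = 1; i < angles.length; i++) {
    const delta = Math.abs(angles[i] - angles[i - 1]);
    total += Math.min(delta, 360 - delta); // take the short way across the 0/360 seam
  }
  return total;
}
```

In a mousemove handler, `cx` and `cy` would come from the ring's bounding box, and each sampled angle would be appended to a list that `orbitDegrees` summarizes when the operator clicks to lock the color.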

Why It's Special: It's synesthetic — mapping quality to COLOR bypasses analytical thinking entirely. The orbit pattern (how much the operator circled) is a unique indecision metric. Colors map to emotional states that are hard to express in words. It's also visually stunning and unlike anything else in a review tool.

HTML/CSS Sketch:

<div class="mood-ring-modal" style="--hue: 120;">
  <h3>What color does this give you?</h3>
  <div class="ring-container">
    <div class="mood-ring" onmousemove="updateHue(event)" onclick="lockColor()">
      <div class="ring-glow" style="background: conic-gradient(from 0deg, 
        #8B00FF, #0000FF, #00FF00, #FFFF00, #FF8800, #FF0000, #FF69B4, #8B00FF);"></div>
      <div class="ring-center">
        <span class="mood-label" id="mood">Healthy</span>
      </div>
    </div>
  </div>
</div>
<style>
  .mood-ring-modal { background: hsl(var(--hue), 30%, 10%); transition: background 0.3s;
                     padding: 40px; text-align: center; }
  .mood-ring { width: 220px; height: 220px; border-radius: 50%; position: relative;
               cursor: crosshair; margin: 0 auto; }
  .ring-glow { width: 100%; height: 100%; border-radius: 50%; padding: 30px;
               box-sizing: border-box; }
  .ring-center { position: absolute; top: 30px; left: 30px; right: 30px; bottom: 30px;
                 border-radius: 50%; background: #1a1a2e; display: flex;
                 align-items: center; justify-content: center; }
  .mood-label { font-size: 18px; font-weight: bold; transition: color 0.3s; }
</style>

23. The War Room Dashboard

When: Aggregate review of multiple pipeline statuses — "Here's the state of all 12 active pipelines, what needs attention?" Best for daily stand-up reviews or when the operator needs to triage across many pipelines simultaneously.

Visual: A grid of 12+ mini-cards (3 or 4 columns), each representing a pipeline. Each card shows: pipeline name, last output status (color-coded dot), a micro-sparkline of recent quality scores, and a small action button.

Think of it like a server monitoring dashboard (Datadog/Grafana style) but for AI pipeline quality. Cards that need attention have a pulsing amber or red border. Cards that are healthy have a subtle green glow.

The operator can bulk-select cards and apply actions, or tap individual cards for quick review.

Flow:

  1. Dashboard grid loads with all pipelines. Auto-sorted: most urgent first (red > amber > green).
  2. Each card shows:
    • Pipeline name + icon
    • Status dot (red/amber/green)
    • Last 7 quality scores as a tiny sparkline
    • "Since last review" timestamp
  3. Operator can:
    • Tap a card → quick popup with last output preview + approve/reject/skip
    • Long-press to select → multi-select → bulk action ("Approve all selected", "Flag all for deep review")
  4. Top bar shows aggregate stats: "5 need attention, 4 healthy, 3 pending"
  5. "All reviewed" button when every card has been actioned.

Data Collected:

  • per_pipeline_action: { pipeline_id: action } map
  • triage_order: which pipelines reviewed first (priority signal)
  • bulk_actions: what was bulk-approved vs individually reviewed
  • ignored_pipelines: any pipelines not touched (oversight tracking)
  • time_per_pipeline: engagement level per pipeline
  • total_triage_time: overall review duration
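The urgency sort from step 1 and the `ignored_pipelines` field can both be derived from a simple list of cards. A minimal sketch follows, assuming each card carries an `id` and a `status` of "red", "amber", or "green"; that shape and the function names are hypothetical.

```javascript
// Assumed status values and card shape; the spec only names the three colors.
const URGENCY = { red: 0, amber: 1, green: 2 };

// Most urgent first. Copies the array so the caller's order is untouched;
// Array.prototype.sort is stable, so equal statuses keep their incoming order.
function sortByUrgency(pipelines) {
  return [...pipelines].sort((a, b) => URGENCY[a.status] - URGENCY[b.status]);
}

// actions is the per_pipeline_action map: { pipeline_id: action }.
function ignoredPipelines(pipelines, actions) {
  return pipelines.filter((p) => !(p.id in actions)).map((p) => p.id);
}
```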

Why It's Special: Instead of serial one-at-a-time review, the operator sees the FULL PICTURE and can triage. Bulk actions save massive time for healthy pipelines. The sparklines give trend context without requiring deep dives. The sorting puts problems first. This is how you efficiently manage 64 pipelines — not one modal at a time.

HTML/CSS Sketch:

<div class="war-room">
  <div class="war-room-header">
    <h3>🏭 Pipeline War Room</h3>
    <div class="stats">
      <span class="stat urgent">🔴 5 need attention</span>
      <span class="stat ok">🟢 4 healthy</span>
      <span class="stat pending">🟡 3 pending</span>
    </div>
  </div>
  <div class="pipeline-grid">
    <div class="pipeline-card urgent" data-id="pipeline-1">
      <div class="card-header">
        <span class="status-dot red"></span>
        <span class="pipeline-name">GHL MCP</span>
      </div>
      <div class="sparkline">
        <svg viewBox="0 0 100 30"><polyline points="0,25 15,20 30,22 45,15 60,28 75,30 90,27"/></svg>
      </div>
      <span class="last-review">2h ago</span>
      <div class="quick-actions">
        <button class="qa-btn approve" aria-label="Approve">✓</button>
        <button class="qa-btn reject" aria-label="Reject">✗</button>
        <button class="qa-btn skip" aria-label="Skip">⏭</button>
      </div>
    </div>
    <!-- more pipeline cards -->
  </div>
  <button class="bulk-action-btn hidden">Bulk Action (0 selected)</button>
</div>
<style>
  .war-room { background: #0d1117; padding: 16px; color: #e6e6e6; }
  .pipeline-grid { display: grid; grid-template-columns: repeat(auto-fill, minmax(160px, 1fr));
                   gap: 12px; margin-top: 16px; }
  .pipeline-card { background: #161b22; border-radius: 12px; padding: 12px;
                   border: 2px solid transparent; transition: all 0.3s; cursor: pointer; }
  .pipeline-card.urgent { border-color: #FF4444; animation: urgent-pulse 2s infinite; }
  .pipeline-card.healthy { border-color: rgba(0,204,85,0.3); }
  .pipeline-card.selected { border-color: #00AAFF; background: #00AAFF11; }
  .status-dot { width: 10px; height: 10px; border-radius: 50%; display: inline-block; }
  .status-dot.red { background: #FF4444; box-shadow: 0 0 8px #FF4444; }
  .status-dot.green { background: #00CC55; }
  .sparkline svg { width: 100%; height: 30px; }
  .sparkline polyline { fill: none; stroke: #666; stroke-width: 2; }
  @keyframes urgent-pulse { 0%,100% { border-color: #FF4444; }
                           50% { border-color: #FF444466; } }
</style>

24. The Applause Meter

When: Celebrating and rating exceptional outputs — "How much applause does this deserve?" Best for positive reinforcement: when you want the AI to know what GREAT looks like, not just what's wrong.

Visual: A horizontal meter across the bottom half of the modal, styled like a live audience applause meter (think TV game shows). The bar fills from left to right based on how fast/many times the operator clicks/taps the "CLAP" button.

A giant 👏 button in the center (100px+). Every tap: the button does a scale-bounce animation and the meter fills a tiny bit. The faster you tap, the more it fills. Tapping is limited to a 5-second window.

Visual feedback as the meter fills:

  • 0-25%: "Polite golf clap" — quiet, small reactions
  • 25-50%: "Nice applause" — normal clapping animation
  • 50-75%: "Standing ovation!" — crowd goes wild, screen shakes slightly
  • 75-100%: "THUNDEROUS APPLAUSE 🌩️" — confetti, screen flash, maximum celebration

The deliverable name and a "why are you applauding?" quick-select follow.

Flow:

  1. "APPLAUSE METER" title with the deliverable shown.
  2. "Tap the clap as much as you think this deserves! You have 5 seconds. GO!"
  3. 5-second countdown starts. Giant 👏 button active.
  4. Operator taps rapidly. Each tap = meter fills, animations intensify.
  5. Timer ends → final level locks in.
  6. "What deserves the applause?" quick chips: "Clever", "Clean", "Fast", "Thorough", "Surprising".
  7. Submit.

Data Collected:

  • tap_count: total claps
  • taps_per_second: speed curve (burst at start? steady? accelerating?)
  • final_meter_pct: 0-100
  • applause_level: "polite"|"nice"|"standing_ovation"|"thunderous"
  • applause_reasons: string[]
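The fields above can be derived from a list of tap timestamps. The spec does not say how much each tap fills the meter, so the sketch below assumes a fixed 2% per tap (50 taps fill the bar); that constant and the function names are hypothetical. Values that fall exactly on a band boundary are assigned to the higher band.

```javascript
// Assumption: each tap adds a fixed 2% to the meter, so 50 taps in the window fill it.
const PCT_PER_TAP = 2;
const WINDOW_MS = 5000;

function applauseLevel(pct) {
  if (pct < 25) return "polite";
  if (pct < 50) return "nice";
  if (pct < 75) return "standing_ovation";
  return "thunderous";
}

// tapTimes: milliseconds since the countdown started, one entry per tap.
function summarizeApplause(tapTimes) {
  const valid = tapTimes.filter((t) => t >= 0 && t < WINDOW_MS); // drop taps after the timer ends
  const perSecond = new Array(WINDOW_MS / 1000).fill(0);
  for (const t of valid) perSecond[Math.floor(t / 1000)] += 1;
  const pct = Math.min(100, valid.length * PCT_PER_TAP);
  return {
    tap_count: valid.length,
    taps_per_second: perSecond,
    final_meter_pct: pct,
    applause_level: applauseLevel(pct),
  };
}
```

The per-second buckets are what make the speed curve visible: a front-loaded array suggests an initial burst, a flat one suggests steady tapping.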

Why It's Special: It's PHYSICAL. Tapping repeatedly creates emotional investment: you're literally putting effort into your rating. More taps = more positive. The speed curve reveals the enthusiasm pattern: an initial burst = genuine excitement, tapering = mere politeness. It's also hilarious and memorable. Positive reinforcement signals are just as valuable as criticism for AI training.

HTML/CSS Sketch:

<div class="applause-meter">
  <h3>👏 APPLAUSE METER 👏</h3>
  <p class="deliverable">For: {{deliverable_name}}</p>
  <div class="meter-container">
    <div class="meter-bar"><div class="meter-fill" style="width: 0%"></div></div>
    <div class="meter-labels">
      <span>Golf clap</span><span>Nice</span><span>Standing O!</span><span>THUNDER</span>
    </div>
  </div>
  <div class="clap-zone">
    <button class="clap-btn" onclick="clap()">👏</button>
    <div class="timer">5.0s</div>
    <div class="tap-count">0 claps</div>
  </div>
  <div class="reason-chips hidden">
    <button class="chip">Clever</button>
    <button class="chip">Clean</button>
    <button class="chip">Fast</button>
    <button class="chip">Thorough</button>
    <button class="chip">Surprising</button>
  </div>
</div>
<style>
  .applause-meter { background: #1a0a2e; padding: 24px; text-align: center; color: white; }
  .meter-bar { height: 40px; background: #333; border-radius: 20px; overflow: hidden;
               margin: 20px 0; }
  .meter-fill { height: 100%; border-radius: 20px; transition: width 0.1s;
                background: linear-gradient(90deg, #4488FF, #44FF88, #FFD700, #FF4444); }
  .clap-btn { font-size: 80px; background: none; border: none; cursor: pointer;
              transition: transform 0.1s; user-select: none; -webkit-user-select: none; }
  .clap-btn:active { transform: scale(1.3); }
  @keyframes clap-bounce { 0% { transform: scale(1); } 50% { transform: scale(1.2); }
                           100% { transform: scale(1); } }
  .applause-meter.shaking { animation: shake 0.1s infinite; }
  @keyframes shake { 0%,100% { transform: translateX(0); } 50% { transform: translateX(3px); } }
</style>

25. The Crystal Ball

When: Predictive assessment — "Will this MCP server succeed in production?" Best for forward-looking decisions: risk assessment, market viability, longevity predictions.

Visual: A dark, mystical aesthetic. A large glowing crystal ball (CSS radial gradient) in the center, sitting on an ornate gold base. Mist/fog effects swirl inside the ball (animated gradient or SVG filter).

The operator "consults the oracle" by choosing a prediction. The crystal ball's color shifts based on the prediction:

  • Deep blue/clear = "Bright future"
  • Murky amber = "Uncertain"
  • Dark red/black = "Doomed"

Below the ball: prediction buttons styled as tarot/oracle cards laid face-down. Flip them to reveal predictions.

Flow:

  1. "Consult the Oracle: What future do you see for {{pipeline}}?" with mystical font.
  2. Crystal ball swirls with neutral mist.
  3. Three oracle cards below, face down:
    • Card 1: "DESTINY" — Will this succeed? (Yes/Maybe/No)
    • Card 2: "TIMELINE" — How long until issues arise? (Never/Months/Weeks/Days)
    • Card 3: "WEAKNESS" — What's the biggest risk? (select from options)
  4. Operator clicks each card to flip it, then selects an answer.
  5. After all three cards are answered, the crystal ball colors shift to reflect the overall prediction.
  6. "Seal the prophecy" button submits.

Data Collected:

  • destiny: "success"|"maybe"|"failure"
  • timeline_to_issues: "never"|"months"|"weeks"|"days"
  • biggest_risk: string (from predefined options)
  • overall_outlook: "bright"|"uncertain"|"dark" (the combined prediction, shown as the final ball color)
  • card_flip_order: which dimension the operator assessed first
  • prediction_confidence: derived from flip speed and hesitation
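Tracking `card_flip_order` and gating the submit button on all three answers (steps 4 to 6 of the flow) needs only a small piece of state. The sketch below uses the dimension names from the HTML sketch's `data-dimension` attributes; the state shape and function names are hypothetical.

```javascript
// Dimension names match the data-dimension attributes in the HTML sketch.
const DIMENSIONS = ["destiny", "timeline", "risk"];

function createProphecy() {
  return { answers: {}, card_flip_order: [] };
}

// Record the first time each card is flipped; repeat flips do not change the order.
function recordFlip(state, dimension) {
  if (state.card_flip_order.includes(dimension)) return state;
  return { ...state, card_flip_order: [...state.card_flip_order, dimension] };
}

function recordAnswer(state, dimension, value) {
  return { ...state, answers: { ...state.answers, [dimension]: value } };
}

// "Seal the Prophecy" should only be revealed once all three cards are answered.
function canSeal(state) {
  return DIMENSIONS.every((d) => d in state.answers);
}
```

How the three answers combine into `overall_outlook` is not defined in this document, so that step is deliberately left out.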

Why It's Special: Reframing "risk assessment" as "fortune telling" sounds silly but actually WORKS psychologically. It gives the operator permission to use INTUITION instead of pure analysis. Experienced operators have gut feelings about what will succeed — this captures that predictive signal. The mystical aesthetic makes a boring risk form feel magical. Plus, tracking prediction accuracy over time reveals which operators have the best instincts.

HTML/CSS Sketch:

<div class="crystal-ball-modal">
  <h3 class="mystical-title">🔮 Consult the Oracle 🔮</h3>
  <p>What future do you see for <strong>{{pipeline}}</strong>?</p>
  <div class="ball-container">
    <div class="crystal-ball">
      <div class="mist"></div>
      <div class="inner-glow"></div>
    </div>
    <div class="ball-base"></div>
  </div>
  <div class="oracle-cards">
    <div class="oracle-card" onclick="flipCard(this)" data-dimension="destiny">
      <div class="card-back">🌟 DESTINY</div>
      <div class="card-front">
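        <!-- stopPropagation keeps the answer click from bubbling up and re-flipping the card -->
        <button onclick="event.stopPropagation(); predict('success')">✨ Bright</button>
        <button onclick="event.stopPropagation(); predict('maybe')">🌫️ Unclear</button>
        <button onclick="event.stopPropagation(); predict('failure')">💀 Doomed</button>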
      </div>
    </div>
    <div class="oracle-card" onclick="flipCard(this)" data-dimension="timeline">
      <div class="card-back">⏳ TIMELINE</div>
      <div class="card-front">
        <button>Never</button><button>Months</button><button>Weeks</button><button>Days</button>
      </div>
    </div>
    <div class="oracle-card" onclick="flipCard(this)" data-dimension="risk">
      <div class="card-back">⚡ WEAKNESS</div>
      <div class="card-front">
        <button>Reliability</button><button>Scale</button><button>Security</button><button>Complexity</button>
      </div>
    </div>
  </div>
  <button class="seal-btn hidden">Seal the Prophecy 🔮</button>
</div>
<style>
  .crystal-ball-modal { background: #0a0014; color: #d4c4ff; padding: 24px; text-align: center;
                        font-family: 'Georgia', serif; }
  .mystical-title { font-size: 24px; letter-spacing: 4px; color: #bb88ff;
                    text-shadow: 0 0 20px #bb88ff44; }
  .crystal-ball { width: 180px; height: 180px; border-radius: 50%; margin: 0 auto;
                  background: radial-gradient(circle at 35% 35%, rgba(255,255,255,0.3),
                  rgba(100,100,200,0.2), rgba(50,0,100,0.6));
                  box-shadow: 0 0 60px rgba(150,100,255,0.4), inset 0 0 40px rgba(150,100,255,0.2);
                  position: relative; overflow: hidden; }
  .mist { position: absolute; top: 0; left: 0; right: 0; bottom: 0;
          background: radial-gradient(ellipse at 40% 60%, rgba(200,180,255,0.3), transparent);
          animation: mist-swirl 4s ease-in-out infinite alternate; }
  @keyframes mist-swirl { 0% { transform: rotate(0deg) scale(1); }
                          100% { transform: rotate(30deg) scale(1.1); } }
  .ball-base { width: 120px; height: 30px; background: linear-gradient(#B8860B, #FFD700, #B8860B);
               border-radius: 0 0 60px 60px; margin: -10px auto 0; }
  .oracle-card { width: 120px; height: 160px; perspective: 600px; cursor: pointer;
                 display: inline-block; margin: 10px; }
  .oracle-card .card-back { background: linear-gradient(135deg, #2a1a4e, #4a2a8e);
                            border: 2px solid #bb88ff; border-radius: 8px; height: 100%;
                            display: flex; align-items: center; justify-content: center;
                            font-weight: bold; backface-visibility: hidden; }
  .oracle-card.flipped .card-back { transform: rotateY(180deg); }
  .oracle-card.flipped .card-front { transform: rotateY(0deg); }
</style>

📐 Implementation Notes

MCP App Architecture

Each modal is a self-contained HTML page:

/servers/{pipeline}/src/apps/hitl-modals/
├── traffic-light.html
├── tinder-swipe.html
├── report-card.html
├── thermometer.html
├── spotlight.html
├── ranking-arena.html
├── speed-round.html
├── emoji-scale.html
├── before-after.html
├── priority-poker.html
├── mission-briefing.html
├── judges-scorecard.html
├── quick-pulse.html
├── decision-tree.html
├── hot-take.html
├── side-by-side-arena.html
├── checklist-ceremony.html
├── confidence-meter.html
├── voice-of-customer.html
├── retrospective-board.html
├── slot-machine.html
├── mood-ring.html
├── war-room-dashboard.html
├── applause-meter.html
├── crystal-ball.html
└── shared/
    ├── base-styles.css        # Common dark mode, typography, animations
    ├── submit-handler.js      # postMessage/form submission logic
    ├── timing-tracker.js      # Hidden metrics (response time, hover, etc.)
    └── celebration-effects.js # Confetti, sparkles, shakes, etc.

Data Submission Contract

Every modal submits via window.parent.postMessage():

window.parent.postMessage({
  type: 'hitl_response',
  modal_type: 'traffic_light',  // which modal was used
  pipeline_id: '{{pipeline_id}}',
  item_id: '{{item_id}}',
  timestamp: new Date().toISOString(),
  response_time_ms: Date.now() - modalOpenedAt,
  
  // Modal-specific data
  data: {
    decision: 'pass',
    tags: ['clean_code', 'good_naming'],
    feedback: 'Solid implementation',
    // ... varies per modal type
  },
  
  // Hidden metrics (always collected)
  meta: {
    hover_journey: [ /* ... */ ],
    hesitation_points: [ /* ... */ ],
    total_interactions: 5,
    viewport_size: { w: 400, h: 600 }
  }
}, '*');
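On the receiving side, the host would need to check each message against this contract before using it. The following is a minimal sketch under stated assumptions: the function name `validateHitlResponse`, the error strings, and the listener wiring are hypothetical, and the checks cover only the top-level fields shown above, not the per-modal `data` payload.

```javascript
// Hypothetical host-side validator for the contract above.
// Returns a list of problems; an empty list means the message looks well-formed.
function validateHitlResponse(msg) {
  const isObject = (v) => v !== null && typeof v === 'object' && !Array.isArray(v);
  if (!isObject(msg)) return ['message is not an object'];

  const errors = [];
  if (msg.type !== 'hitl_response') errors.push('type must be "hitl_response"');

  for (const key of ['modal_type', 'pipeline_id', 'item_id', 'timestamp']) {
    if (typeof msg[key] !== 'string' || msg[key] === '') {
      errors.push(`${key} must be a non-empty string`);
    }
  }
  if (typeof msg.timestamp === 'string' && msg.timestamp !== '' &&
      Number.isNaN(Date.parse(msg.timestamp))) {
    errors.push('timestamp must be a parseable date string');
  }
  if (!Number.isFinite(msg.response_time_ms) || msg.response_time_ms < 0) {
    errors.push('response_time_ms must be a non-negative number');
  }
  if (!isObject(msg.data)) errors.push('data must be an object');
  if (!isObject(msg.meta)) errors.push('meta must be an object');
  return errors;
}

// Browser wiring; skipped when no window is present (for example under Node).
if (typeof window !== 'undefined') {
  window.addEventListener('message', (event) => {
    // A real host would likely also compare event.source with the modal
    // iframe's contentWindow before trusting the payload.
    const errors = validateHitlResponse(event.data);
    if (errors.length === 0) {
      console.log('HITL response accepted:', event.data.modal_type);
    } else {
      console.warn('HITL response rejected:', errors);
    }
  });
}
```

Returning a list of problems instead of a boolean makes it easier to log exactly which field a misbehaving modal got wrong.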

Modal Selection Logic

The system should auto-select which modal to show based on context:

| Context | Recommended Modal |
|---|---|
| Single item pass/fail | Traffic Light, Quick Pulse |
| Batch review (5+ items) | Tinder Swipe, Speed Round |
| Multi-dimensional quality | Report Card, Judge's Scorecard |
| A/B comparison | Side-by-Side Arena, Before/After |
| Subjective/creative work | Thermometer, Emoji Scale, Mood Ring |
| High-stakes deployment | Mission Briefing, Confidence Meter |
| Code/content annotation | Spotlight |
| Prioritization | Ranking Arena, Priority Poker |
| Effort estimation | Priority Poker |
| Time-sensitive | Hot Take |
| Positive reinforcement | Applause Meter, Slot Machine |
| Forward-looking risk | Crystal Ball |
| Verification checklist | Checklist Ceremony |
| Complex diagnostic | Decision Tree |
| Empathy/user perspective | Voice of the Customer |
| Periodic reflection | Retrospective Board |
| Multi-pipeline triage | War Room Dashboard |
| Breaking monotony | Slot Machine, Crystal Ball |
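The mapping above can be expressed as a plain lookup. This is a sketch only: the context keys (such as `single_item_pass_fail`), the function name `recommendModals`, and the Traffic Light fallback for unknown contexts are assumptions, and the modal identifiers simply follow the `traffic_light` style used in the submission contract.

```javascript
// Lookup mirroring the context table; keys are hypothetical identifiers.
const MODALS_BY_CONTEXT = new Map([
  ['single_item_pass_fail', ['traffic_light', 'quick_pulse']],
  ['batch_review', ['tinder_swipe', 'speed_round']],
  ['multi_dimensional_quality', ['report_card', 'judges_scorecard']],
  ['ab_comparison', ['side_by_side_arena', 'before_after']],
  ['subjective_creative', ['thermometer', 'emoji_scale', 'mood_ring']],
  ['high_stakes_deployment', ['mission_briefing', 'confidence_meter']],
  ['annotation', ['spotlight']],
  ['prioritization', ['ranking_arena', 'priority_poker']],
  ['effort_estimation', ['priority_poker']],
  ['time_sensitive', ['hot_take']],
  ['positive_reinforcement', ['applause_meter', 'slot_machine']],
  ['forward_looking_risk', ['crystal_ball']],
  ['verification_checklist', ['checklist_ceremony']],
  ['complex_diagnostic', ['decision_tree']],
  ['user_perspective', ['voice_of_customer']],
  ['periodic_reflection', ['retrospective_board']],
  ['multi_pipeline_triage', ['war_room_dashboard']],
  ['breaking_monotony', ['slot_machine', 'crystal_ball']],
]);

// Returns a fresh array of candidate modal ids for the given context.
// Unknown contexts fall back to a simple gate (an assumption, not a documented rule).
function recommendModals(context, fallback = ['traffic_light']) {
  return [...(MODALS_BY_CONTEXT.get(context) ?? fallback)];
}
```

For example, `recommendModals('ab_comparison')` yields the two comparison modals, and the caller can then hand that candidate list to whatever rotation logic picks the one actually shown.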

Rotation Strategy

To prevent modal fatigue:

  1. Never show the same modal type twice in a row
  2. Weight toward faster modals during high-volume periods
  3. Save deep modals (Report Card, Retrospective) for milestone reviews
  4. Introduce new modals gradually — start with 5, add new ones weekly
  5. Track per-modal engagement — if completion rates drop for a modal type, rotate it out temporarily
  6. Let the operator request favorites — "I want the Tinder Swipe for this batch"
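A rough sketch of how rules 1, 2, 5, and 6 might combine into a single picker follows. Everything here is an assumption for illustration: the function name `pickNextModal`, the candidate fields (`speed`, `completionRate`), the 3x weight for fast modals, the 0.5 completion-rate floor, and the choice to let an operator request override the no-repeat rule. Rules 3 and 4 concern scheduling and rollout, so they are left to the caller.

```javascript
// Hypothetical rotation picker.
// candidates: [{ id, speed: 'fast' | 'deep', completionRate: 0..1 }]
// state: { lastModal, highVolume, requested, minCompletionRate }
function pickNextModal(candidates, state = {}, random = Math.random) {
  const { lastModal = null, highVolume = false, requested = null,
          minCompletionRate = 0.5 } = state;

  // Rule 6: an explicit operator request wins (assumed to override rule 1).
  if (requested && candidates.some((c) => c.id === requested)) return requested;

  // Rule 1: never show the same modal type twice in a row.
  let pool = candidates.filter((c) => c.id !== lastModal);

  // Rule 5: temporarily rotate out modals whose completion rate has dropped,
  // unless that would leave nothing to show.
  const healthy = pool.filter((c) => c.completionRate >= minCompletionRate);
  if (healthy.length > 0) pool = healthy;
  if (pool.length === 0) return null;

  // Rule 2: weight toward faster modals during high-volume periods.
  const weights = pool.map((c) => (highVolume && c.speed === 'fast' ? 3 : 1));
  const total = weights.reduce((sum, w) => sum + w, 0);

  // Weighted random draw over the remaining pool.
  let r = random() * total;
  for (let i = 0; i < pool.length; i += 1) {
    r -= weights[i];
    if (r < 0) return pool[i].id;
  }
  return pool[pool.length - 1].id;
}
```

Passing `random` as a parameter keeps the draw reproducible in tests, and returning `null` when the pool is empty leaves the caller to decide whether repeating the last modal is acceptable.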

Summary

This collection provides 25 distinct modal types covering every HITL scenario in the GooseFactory pipeline:

  • 4 Quick Decision modals (Traffic Light, Quick Pulse, Emoji Scale, Tinder Swipe)
  • 4 Deep Assessment modals (Report Card, Judge's Scorecard, Checklist Ceremony, Spotlight)
  • 3 Comparison modals (Side-by-Side Arena, Before/After, Ranking Arena)
  • 3 Gamified modals (Speed Round, Slot Machine, Applause Meter)
  • 3 Estimation/Prediction modals (Priority Poker, Confidence Meter, Crystal Ball)
  • 3 Subjective/Emotional modals (Thermometer, Mood Ring, Voice of Customer)
  • 3 Structured Process modals (Mission Briefing, Decision Tree, Retrospective Board)
  • 2 Urgency/Time modals (Hot Take, plus Speed Round, which is also counted under Gamified, so the categories sum to 26 across 25 distinct modals)
  • 1 Aggregate Overview modal (War Room Dashboard)

Every modal captures hidden behavioral metrics (response time, hover patterns, hesitation, order effects) that enrich the training signal far beyond what explicit ratings provide.

Total estimated implementation time: ~2-3 days for the full collection using the shared component library, with the simpler modals (Traffic Light, Quick Pulse, Emoji Scale) buildable in <1 hour each.


Document created for GooseFactory HITL system. All modals designed for MCP App iframe rendering.