clawdbot-workspace/proposals/attachments/research-vault-architecture.md
2026-02-16 23:01:00 -05:00


# Research Vault Architecture — Prompt-Chaining System
## Proposed Design for Ingredient-Mechanism Research Automation
---
## Overview
A file-based, multi-step agentic workflow in Cursor + Claude Code that turns a single ingredient name into a complete, formatted research document — zero manual copy-paste.
**User flow:** Copywriter types ingredient name → system runs 4-step prompt chain → formatted markdown document appears in `/outputs/`

---
## Vault Structure
```
research-vault/
├── .claude/
│   └── claude.md                      # ROOT ORCHESTRATOR
│                                      #   Defines: agent identity, execution order, error handling
│                                      #   Inherited by all subdirectories
├── prompts/
│   ├── .claude/
│   │   └── claude.md                  # PROMPT-RUNNER SUB-AGENT
│   │                                  #   Scoped rules: output format enforcement,
│   │                                  #   citation requirements, no-hallucination policy
│   │
│   ├── 01-mechanism-scoring.md        # Step 1: Score ingredient vs. mechanisms
│   ├── 02-study-retrieval.md          # Step 2: MCP search for real studies
│   ├── 03-evidence-synthesis.md       # Step 3: Synthesize findings
│   └── 04-output-formatting.md        # Step 4: Format to template
├── inputs/
│   ├── .claude/
│   │   └── claude.md                  # INPUT WATCHER rules
│   │                                  #   Validates ingredient names, triggers orchestrator
│   │
│   ├── queue.md                       # Batch mode: list of ingredients
│   └── current.md                     # Single-run mode: one ingredient
├── outputs/
│   ├── ashwagandha-2026-02-16.md      # Example completed output
│   ├── berberine-2026-02-16.md
│   └── _index.md                      # Auto-generated output log
├── templates/
│   ├── research-document-template.md  # Master output template
│   └── scoring-rubric.md              # Mechanism scoring criteria
├── reference/
│   ├── mechanisms-of-action.md        # Known mechanisms database
│   └── scoring-guidelines.md          # How to score ingredient-mechanism fit
├── config/
│   └── mcp-settings.json              # MCP server config for search
└── docs/
    ├── handoff-guide.md               # 2-3 page written guide
    └── troubleshooting.md             # Common issues + fixes
```
---
## Claude.md Inheritance Chain
### Root `.claude/claude.md` (Orchestrator)
```markdown
# Research Vault Orchestrator

## Identity
You are a research automation orchestrator. You manage the full pipeline
from ingredient input to formatted output.

## Execution Order
When triggered with an ingredient name:

1. Read the ingredient from `inputs/current.md`
2. Load `templates/scoring-rubric.md` and `reference/mechanisms-of-action.md`
3. Execute prompts in order: 01 → 02 → 03 → 04
4. Each prompt reads the previous step's intermediate output
5. Write the final output to `outputs/{ingredient}-{date}.md`
6. Update `outputs/_index.md` with an entry

## Error Handling
- If MCP search returns no results: note "No studies found" and continue
- If any step fails: save partial output with an [INCOMPLETE] tag
- Never fabricate citations — use only MCP search results

## Rules
- Always follow the template in `templates/research-document-template.md`
- Include real DOIs and PubMed IDs when available
- Score each mechanism 1-10 with justification
```
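The execution order and error-handling rules above are enforced by the model, not by code, but they can be read as a driver loop. A minimal Python sketch of that reading, where `run_step` is a hypothetical stand-in for invoking Claude on one prompt file (nothing below is part of the actual vault):

```python
from datetime import date
from pathlib import Path

PROMPTS = [
    "01-mechanism-scoring.md",
    "02-study-retrieval.md",
    "03-evidence-synthesis.md",
    "04-output-formatting.md",
]

def run_step(prompt_file: str, context: str) -> str:
    """Hypothetical stand-in: invoke Claude with one prompt file plus the
    previous step's intermediate output; return the new output."""
    return f"{context}\n<!-- ran {prompt_file} -->"

def run_pipeline(vault: Path) -> Path:
    # 1. Read the ingredient from inputs/current.md
    ingredient = (vault / "inputs" / "current.md").read_text().strip().lower()
    out_path = vault / "outputs" / f"{ingredient}-{date.today().isoformat()}.md"
    context = ingredient
    try:
        # 3-4. Execute prompts in order; each step sees the previous output
        for prompt in PROMPTS:
            context = run_step(prompt, context)
    except Exception:
        # Save partial output with an [INCOMPLETE] tag, per the error rules
        context += "\n[INCOMPLETE]"
    # 5. Write the final (or partial) output
    out_path.write_text(context)
    # 6. Append an entry to the auto-generated log
    with (vault / "outputs" / "_index.md").open("a") as idx:
        idx.write(f"- {out_path.name}\n")
    return out_path
```

The try/except mirrors the "save partial output" rule: a failed step still produces a tagged file rather than nothing.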
### Prompts `.claude/claude.md` (Sub-Agent)
```markdown
# Prompt Runner Sub-Agent

## Scope
You execute individual research prompts. You do NOT orchestrate.

## Rules
- Output in markdown only
- Every claim must cite a source (study, review, or meta-analysis)
- If no evidence exists, state "Insufficient evidence" — never hallucinate
- Follow the scoring rubric exactly
- Write intermediate results to a temp file for the next step
```
---
## Prompt Chain Detail
### Step 1: Mechanism Scoring (`01-mechanism-scoring.md`)
- **Input:** Ingredient name + mechanisms list
- **Process:** Score each mechanism of action (1-10) for relevance to the ingredient
- **Output:** Ranked list of mechanisms with preliminary scores and reasoning

### Step 2: Study Retrieval (`02-study-retrieval.md`)
- **Input:** Top-scored mechanisms from Step 1
- **Process:** MCP search for clinical studies, systematic reviews, and meta-analyses
- **Output:** Evidence table with citations, sample sizes, outcomes, and DOIs

### Step 3: Evidence Synthesis (`03-evidence-synthesis.md`)
- **Input:** Mechanism scores + evidence table
- **Process:** Adjust scores based on evidence quality; synthesize narrative
- **Output:** Updated scores + synthesis paragraphs per mechanism

### Step 4: Output Formatting (`04-output-formatting.md`)
- **Input:** All previous outputs + template
- **Process:** Format into the final document structure
- **Output:** Publication-ready research document in `outputs/`
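The step inputs and outputs imply a small data contract passed down the chain. A sketch of those shapes as Python dataclasses (field names are illustrative assumptions, not fixed by this design):

```python
from dataclasses import dataclass, field

@dataclass
class MechanismScore:
    """Step 1 output; scores are adjusted again in Step 3."""
    mechanism: str
    score: int        # 1-10, per the scoring rubric
    reasoning: str

@dataclass
class Study:
    """One row of the Step 2 evidence table."""
    citation: str
    sample_size: int
    outcome: str
    doi: str

@dataclass
class Synthesis:
    """Step 3 output, consumed by the Step 4 formatter."""
    mechanism: MechanismScore
    studies: list[Study] = field(default_factory=list)
    narrative: str = ""
```

In the actual vault these shapes would live as markdown tables in the intermediate temp files, but keeping them explicit makes the handoff between prompts easy to validate.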
---
## MCP Integration
```json
{
  "mcpServers": {
    "web-search": {
      "command": "npx",
      "args": ["-y", "@anthropic/mcp-server-web-search"],
      "env": {
        "BRAVE_API_KEY": "${BRAVE_API_KEY}"
      }
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "./"]
    }
  }
}
```
**Search strategy:** For each mechanism, the agent runs 2-3 targeted queries:
1. `"{ingredient}" "{mechanism}" clinical trial site:pubmed.ncbi.nlm.nih.gov`
2. `"{ingredient}" "{mechanism}" systematic review OR meta-analysis`
3. `"{ingredient}" "{mechanism}" randomized controlled trial 2020..2026`
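Each template expands mechanically for a given ingredient-mechanism pair. A minimal sketch of that expansion:

```python
def build_queries(ingredient: str, mechanism: str) -> list[str]:
    """Expand the three search-strategy templates for one pair."""
    return [
        f'"{ingredient}" "{mechanism}" clinical trial site:pubmed.ncbi.nlm.nih.gov',
        f'"{ingredient}" "{mechanism}" systematic review OR meta-analysis',
        f'"{ingredient}" "{mechanism}" randomized controlled trial 2020..2026',
    ]
```

Quoting both terms keeps multi-word mechanisms (e.g. "cortisol reduction") intact as exact phrases.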
---
## Handoff Design
### For the Copy Chief (non-technical operator):
**To run a single ingredient:**
1. Open `inputs/current.md` in Obsidian
2. Type the ingredient name (e.g., "Ashwagandha")
3. Save the file
4. In Cursor terminal: `claude "Run research pipeline for the ingredient in inputs/current.md"`
5. Wait 2-5 minutes
6. Find your formatted document in `outputs/`
**To run a batch:**
1. Add ingredient names to `inputs/queue.md` (one per line)
2. In Cursor terminal: `claude "Process all ingredients in inputs/queue.md"`
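Batch mode is just "run the single-ingredient pipeline once per non-empty line of `queue.md`". A minimal sketch, where `run_pipeline_for` is a hypothetical stand-in for the single-run flow:

```python
from pathlib import Path

def process_queue(queue_file: Path, run_pipeline_for) -> list[str]:
    """Run the pipeline once for each non-empty line of queue.md."""
    processed = []
    for line in queue_file.read_text().splitlines():
        ingredient = line.strip().lstrip("- ")  # tolerate bullet-list formatting
        if not ingredient:
            continue  # skip blank lines
        run_pipeline_for(ingredient)
        processed.append(ingredient)
    return processed
```

Tolerating both plain lines and `- ` bullets means the Copy Chief can paste a list straight out of Obsidian without reformatting it.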
**To modify prompts:**
- Edit any file in `prompts/` — they're plain markdown
- The template in `templates/research-document-template.md` controls final format
- Scoring criteria in `templates/scoring-rubric.md` control how mechanisms are rated
---
## Timeline
| Day | Deliverable |
|-----|------------|
| 1-2 | Vault structure + claude.md rules + MCP config |
| 3-4 | Prompt chain implementation + testing with 2-3 ingredients |
| 5 | Working V1 + handoff guide draft |
| 6-7 | Copy Chief walkthrough + iteration on feedback |
| 8-10 | Polish, edge cases, batch mode, final documentation |
---
*Prepared by Jake Shore | https://portfolio.mcpengage.com*