clawdbot-workspace/proposals/attachments/research-vault-architecture.md
2026-02-16 23:01:00 -05:00


# Research Vault Architecture — Prompt-Chaining System
## Proposed Design for Ingredient-Mechanism Research Automation
---
## Overview
A file-based, multi-step agentic workflow in Cursor + Claude Code that turns a single ingredient name into a complete, formatted research document — zero manual copy-paste.
**User flow:** Copywriter types ingredient name → system runs 4-step prompt chain → formatted markdown document appears in `/outputs/`

---
## Vault Structure
```
research-vault/
├── .claude/
│   └── claude.md                      # ROOT ORCHESTRATOR
│                                      #   Defines: agent identity, execution order, error handling
│                                      #   Inherited by all subdirectories
├── prompts/
│   ├── .claude/
│   │   └── claude.md                  # PROMPT-RUNNER SUB-AGENT
│   │                                  #   Scoped rules: output format enforcement,
│   │                                  #   citation requirements, no-hallucination policy
│   │
│   ├── 01-mechanism-scoring.md        # Step 1: Score ingredient vs. mechanisms
│   ├── 02-study-retrieval.md          # Step 2: MCP search for real studies
│   ├── 03-evidence-synthesis.md       # Step 3: Synthesize findings
│   └── 04-output-formatting.md        # Step 4: Format to template
├── inputs/
│   ├── .claude/
│   │   └── claude.md                  # INPUT WATCHER rules
│   │                                  #   Validates ingredient names, triggers orchestrator
│   │
│   ├── queue.md                       # Batch mode: list of ingredients
│   └── current.md                     # Single-run mode: one ingredient
├── outputs/
│   ├── ashwagandha-2026-02-16.md      # Example completed output
│   ├── berberine-2026-02-16.md
│   └── _index.md                      # Auto-generated output log
├── templates/
│   ├── research-document-template.md  # Master output template
│   └── scoring-rubric.md              # Mechanism scoring criteria
├── reference/
│   ├── mechanisms-of-action.md        # Known mechanisms database
│   └── scoring-guidelines.md          # How to score ingredient-mechanism fit
├── config/
│   └── mcp-settings.json              # MCP server config for search
└── docs/
    ├── handoff-guide.md               # 2-3 page written guide
    └── troubleshooting.md             # Common issues + fixes
```
---
## Claude.md Inheritance Chain
### Root `.claude/claude.md` (Orchestrator)
```markdown
# Research Vault Orchestrator

## Identity
You are a research automation orchestrator. You manage the full pipeline
from ingredient input to formatted output.

## Execution Order
When triggered with an ingredient name:

1. Read the ingredient from `inputs/current.md`
2. Load `templates/scoring-rubric.md` and `reference/mechanisms-of-action.md`
3. Execute prompts in order: 01 → 02 → 03 → 04
4. Each prompt reads the previous step's intermediate output
5. Write the final output to `outputs/{ingredient}-{date}.md`
6. Update `outputs/_index.md` with an entry

## Error Handling
- If MCP search returns no results: note "No studies found" and continue
- If any step fails: save partial output with an [INCOMPLETE] tag
- Never fabricate citations — use only MCP search results

## Rules
- Always follow the template in `templates/research-document-template.md`
- Include real DOIs and PubMed IDs when available
- Score each mechanism 1-10 with justification
```
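The execution order and error-handling rules above are enforced by the model, not by code, but they can be read as a driver loop. A minimal Python sketch of that reading, where `run_step` is a hypothetical stand-in for invoking Claude on one prompt file (nothing below is part of the actual vault):

```python
from datetime import date
from pathlib import Path

PROMPTS = [
    "01-mechanism-scoring.md",
    "02-study-retrieval.md",
    "03-evidence-synthesis.md",
    "04-output-formatting.md",
]

def run_step(prompt_file: str, context: str) -> str:
    """Hypothetical stand-in: invoke Claude with one prompt file plus the
    previous step's intermediate output; return the new output."""
    return f"{context}\n<!-- ran {prompt_file} -->"

def run_pipeline(vault: Path) -> Path:
    # 1. Read the ingredient from inputs/current.md
    ingredient = (vault / "inputs" / "current.md").read_text().strip().lower()
    out_path = vault / "outputs" / f"{ingredient}-{date.today().isoformat()}.md"
    context = ingredient
    try:
        # 3-4. Execute prompts in order; each step sees the previous output
        for prompt in PROMPTS:
            context = run_step(prompt, context)
    except Exception:
        # Save partial output with an [INCOMPLETE] tag, per the error rules
        context += "\n[INCOMPLETE]"
    # 5. Write the final (or partial) output
    out_path.write_text(context)
    # 6. Append an entry to the auto-generated log
    with (vault / "outputs" / "_index.md").open("a") as idx:
        idx.write(f"- {out_path.name}\n")
    return out_path
```

The try/except mirrors the "save partial output" rule: a failed step still produces a tagged file rather than nothing.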
### Prompts `.claude/claude.md` (Sub-Agent)
```markdown
# Prompt Runner Sub-Agent

## Scope
You execute individual research prompts. You do NOT orchestrate.

## Rules
- Output in markdown only
- Every claim must cite a source (study, review, or meta-analysis)
- If no evidence exists, state "Insufficient evidence" — never hallucinate
- Follow the scoring rubric exactly
- Write intermediate results to a temp file for the next step
```
---
## Prompt Chain Detail
### Step 1: Mechanism Scoring (`01-mechanism-scoring.md`)
- **Input:** Ingredient name + mechanisms list
- **Process:** Score each mechanism of action (1-10) for relevance to the ingredient
- **Output:** Ranked list of mechanisms with preliminary scores and reasoning

### Step 2: Study Retrieval (`02-study-retrieval.md`)
- **Input:** Top-scored mechanisms from Step 1
- **Process:** MCP search for clinical studies, systematic reviews, and meta-analyses
- **Output:** Evidence table with citations, sample sizes, outcomes, and DOIs

### Step 3: Evidence Synthesis (`03-evidence-synthesis.md`)
- **Input:** Mechanism scores + evidence table
- **Process:** Adjust scores based on evidence quality; synthesize narrative
- **Output:** Updated scores + synthesis paragraphs per mechanism

### Step 4: Output Formatting (`04-output-formatting.md`)
- **Input:** All previous outputs + template
- **Process:** Format into the final document structure
- **Output:** Publication-ready research document in `outputs/`
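The step inputs and outputs imply a small data contract passed down the chain. A sketch of those shapes as Python dataclasses (field names are illustrative assumptions, not fixed by this design):

```python
from dataclasses import dataclass, field

@dataclass
class MechanismScore:
    """Step 1 output; scores are adjusted again in Step 3."""
    mechanism: str
    score: int        # 1-10, per the scoring rubric
    reasoning: str

@dataclass
class Study:
    """One row of the Step 2 evidence table."""
    citation: str
    sample_size: int
    outcome: str
    doi: str

@dataclass
class Synthesis:
    """Step 3 output, consumed by the Step 4 formatter."""
    mechanism: MechanismScore
    studies: list[Study] = field(default_factory=list)
    narrative: str = ""
```

In the actual vault these shapes would live as markdown tables in the intermediate temp files, but keeping them explicit makes the handoff between prompts easy to validate.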
---
## MCP Integration
```json
{
  "mcpServers": {
    "web-search": {
      "command": "npx",
      "args": ["-y", "@anthropic/mcp-server-web-search"],
      "env": {
        "BRAVE_API_KEY": "${BRAVE_API_KEY}"
      }
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "./"]
    }
  }
}
```
**Search strategy:** For each mechanism, the agent runs 2-3 targeted queries:
1. `"{ingredient}" "{mechanism}" clinical trial site:pubmed.ncbi.nlm.nih.gov`
2. `"{ingredient}" "{mechanism}" systematic review OR meta-analysis`
3. `"{ingredient}" "{mechanism}" randomized controlled trial 2020..2026`
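Each template expands mechanically for a given ingredient-mechanism pair. A minimal sketch of that expansion:

```python
def build_queries(ingredient: str, mechanism: str) -> list[str]:
    """Expand the three search-strategy templates for one pair."""
    return [
        f'"{ingredient}" "{mechanism}" clinical trial site:pubmed.ncbi.nlm.nih.gov',
        f'"{ingredient}" "{mechanism}" systematic review OR meta-analysis',
        f'"{ingredient}" "{mechanism}" randomized controlled trial 2020..2026',
    ]
```

Quoting both terms keeps multi-word mechanisms (e.g. "cortisol reduction") intact as exact phrases.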
---
## Handoff Design
### For the Copy Chief (non-technical operator):
**To run a single ingredient:**
1. Open `inputs/current.md` in Obsidian
2. Type the ingredient name (e.g., "Ashwagandha")
3. Save the file
4. In Cursor terminal: `claude "Run research pipeline for the ingredient in inputs/current.md"`
5. Wait 2-5 minutes
6. Find your formatted document in `outputs/`
**To run a batch:**
1. Add ingredient names to `inputs/queue.md` (one per line)
2. In Cursor terminal: `claude "Process all ingredients in inputs/queue.md"`
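Batch mode is just "run the single-ingredient pipeline once per non-empty line of `queue.md`". A minimal sketch, where `run_pipeline_for` is a hypothetical stand-in for the single-run flow:

```python
from pathlib import Path

def process_queue(queue_file: Path, run_pipeline_for) -> list[str]:
    """Run the pipeline once for each non-empty line of queue.md."""
    processed = []
    for line in queue_file.read_text().splitlines():
        ingredient = line.strip().lstrip("- ")  # tolerate bullet-list formatting
        if not ingredient:
            continue  # skip blank lines
        run_pipeline_for(ingredient)
        processed.append(ingredient)
    return processed
```

Tolerating both plain lines and `- ` bullets means the Copy Chief can paste a list straight out of Obsidian without reformatting it.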
**To modify prompts:**
- Edit any file in `prompts/` — they're plain markdown
- The template in `templates/research-document-template.md` controls final format
- Scoring criteria in `templates/scoring-rubric.md` control how mechanisms are rated
---
## Timeline
| Day | Deliverable |
|-----|------------|
| 1-2 | Vault structure + claude.md rules + MCP config |
| 3-4 | Prompt chain implementation + testing with 2-3 ingredients |
| 5 | Working V1 + handoff guide draft |
| 6-7 | Copy Chief walkthrough + iteration on feedback |
| 8-10 | Polish, edge cases, batch mode, final documentation |
---
*Prepared by Jake Shore | https://portfolio.mcpengage.com*