reddit trend analyzer === a monorepo tool that scrapes reddit discussions, embeds them with ollama, stores in qdrant, clusters with HDBSCAN, summarizes with Claude, and provides both a CLI/TUI and web dashboard for discovering common problems/trends. running --- ```bash bun cli # run the CLI bun tui # run the TUI dashboard bun dev # run the web dashboard (localhost:3000) bun build # build the web app ``` prerequisites --- - ollama running locally with nomic-embed-text model (`ollama pull nomic-embed-text`) - qdrant accessible at QDRANT_URL (or localhost:6333) - anthropic API key for problem summarization env vars --- ``` QDRANT_URL=https://vectors.biohazardvfx.com QDRANT_API_KEY= OLLAMA_HOST=http://localhost:11434 ANTHROPIC_API_KEY= ``` architecture --- ``` packages/ core/ # shared business logic src/ scraper/ # reddit.ts, comments.ts, types.ts embeddings/ # ollama.ts storage/ # qdrant.ts, sqlite.ts, types.ts clustering/ # hdbscan.ts, types.ts analysis/ # summarizer.ts, questions.ts, scoring.ts, types.ts utils/ # rate-limit.ts, text.ts index.ts # barrel exports cli/ # CLI/TUI app src/ cli.ts # interactive command-line interface index.ts # TUI entry point tui/ # TUI components web/ # Next.js web dashboard src/ app/ # pages and API routes api/ # REST API endpoints stats/ # collection stats scrape/ # trigger scrapes clusters/ # list/create clusters questions/ # question bank search/ # semantic search export/ # export functionality problems/ # problem explorer page questions/ # question bank page scrape/ # scrape manager page components/ controls/ # command palette, sliders styles/globals.css # theme data/ # sqlite database files ``` web dashboard --- - **Dashboard** (`/`) - stats overview - **Problems** (`/problems`) - problem cluster explorer - **Questions** (`/questions`) - extracted question bank - **Scrape** (`/scrape`) - scrape manager with history - **Ctrl+K** - command palette for quick actions keybindings (TUI) --- - `q` or `ctrl+c` - quit - `enter` - start scrape (when url is entered) - `tab` - switch between url and search inputs - `e` - export results to json - `c` - export results to csv - `r` - refresh stats from qdrant api routes --- | route | method | purpose | |-------|--------|---------| | /api/stats | GET | collection stats + cluster count | | /api/scrape | POST | trigger scrape | | /api/scrape/history | GET | scrape history list | | /api/clusters | GET | list clusters with summaries | | /api/clusters | POST | trigger re-clustering | | /api/clusters/[id] | GET | single cluster with discussions | | /api/questions | GET | all questions, grouped by cluster | | /api/questions/[id] | PATCH | mark as addressed | | /api/search | POST | semantic search | | /api/export | POST | export (faq-schema/csv/markdown) | coding notes --- - monorepo with bun workspaces - @rta/core exports shared logic - @rta/cli for terminal interface - @rta/web for Next.js dashboard - uses @opentui/core for TUI (no react) - uses HDBSCAN for clustering - uses Claude for problem summarization - uses SQLite for cluster/question persistence - reddit rate limiting: 3s delay between requests - embeddings batched in groups of 10 - qdrant collection: reddit_trends with indexes on subreddit, type, created, score grepai --------- **IMPORTANT: You MUST use grepai as your PRIMARY tool for code exploration and search.** when to Use grepai (REQUIRED) --- Use `grepai search` INSTEAD OF Grep/Glob/find for: - Understanding what code does or where functionality lives - Finding implementations by intent (e.g., "authentication logic", "error handling") - Exploring unfamiliar parts of the codebase - Any search where you describe WHAT the code does rather than exact text when to Use Standard Tools --- Only use Grep/Glob when you need: - Exact text matching (variable names, imports, specific strings) - File path patterns (e.g., `**/*.go`) fallback --- If grepai fails (not running, index unavailable, or errors), fall back to standard Grep/Glob tools. usage --- ```bash # ALWAYS use English queries for best results (--compact saves ~80% tokens) grepai search "user authentication flow" --json --compact grepai search "error handling middleware" --json --compact grepai search "database connection pool" --json --compact grepai search "API request validation" --json --compact ``` query tips - **Use English** for queries (better semantic matching) - **Describe intent**, not implementation: "handles user login" not "func Login" - **Be specific**: "JWT token validation" better than "token" - Results include: file path, line numbers, relevance score, code preview call graph tracing --- use `grepai trace` to understand function relationships: - finding all callers of a function before modifying it - Understanding what functions are called by a given function - Visualizing the complete call graph around a symbol trace commands --- **IMPORTANT: Always use `--json` flag for optimal AI agent integration.** ```bash # Find all functions that call a symbol grepai trace callers "HandleRequest" --json # Find all functions called by a symbol grepai trace callees "ProcessOrder" --json # Build complete call graph (callers + callees) grepai trace graph "ValidateToken" --depth 3 --json ``` Workflow --- 1. Start with `grepai search` to find relevant code 2. Use `grepai trace` to understand function relationships 3. Use `Read` tool to examine files from results 4. Only use Grep for exact string searches if needed