# Reonomy Scraper Platform — Architecture
## Overview
Full-stack Reonomy property data extraction platform with:
- REST API server (Express.js)
- MCP Server (TypeScript, stdio transport)
- MCP App(s) with React UI
- Integration with LocalBosses web app
## Components
### 1. API Server (`/api`)
Express.js REST API that orchestrates scraping.
**Endpoints:**
- `POST /api/scrape` — Start a scrape job with search + output config
- `GET /api/scrape/:jobId` — Check job status
- `GET /api/scrape/:jobId/results` — Get results (JSON)
- `GET /api/filters` — List all available Reonomy filter options
- `GET /api/exports/:jobId?format=csv|json` — Export results
**Auth:** API key header (`X-API-Key`)
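
The job lifecycle behind these endpoints and the `X-API-Key` check could look roughly like the sketch below. It is a minimal in-memory illustration, not the server's actual code: `JobStore`, `hasValidApiKey`, the `job-N` ID format, and the status names are all assumptions.

```typescript
// Hypothetical sketch: names, statuses, and ID format are assumptions.
type JobStatus = "queued" | "running" | "done" | "failed";

interface Job {
  id: string;
  status: JobStatus;
  results: unknown[];
}

// Header names are case-insensitive in HTTP, so compare in lowercase.
function hasValidApiKey(
  headers: Record<string, string | undefined>,
  validKeys: Set<string>,
): boolean {
  const entry = Object.entries(headers).find(
    ([name]) => name.toLowerCase() === "x-api-key",
  );
  const key = entry?.[1];
  return key !== undefined && validKeys.has(key);
}

class JobStore {
  private jobs = new Map<string, Job>();
  private nextId = 1;

  // Would back POST /api/scrape: register a job and return it with its ID.
  create(): Job {
    const job: Job = { id: `job-${this.nextId++}`, status: "queued", results: [] };
    this.jobs.set(job.id, job);
    return job;
  }

  // Would back GET /api/scrape/:jobId: undefined maps naturally to a 404.
  get(id: string): Job | undefined {
    return this.jobs.get(id);
  }

  // Called once extraction finishes, so the results endpoint has data to serve.
  complete(id: string, results: unknown[]): void {
    const job = this.jobs.get(id);
    if (!job) throw new Error(`Unknown job: ${id}`);
    job.status = "done";
    job.results = results;
  }
}
```

In an Express route, `hasValidApiKey(req.headers, keys)` would run before any handler, and a `false` result would translate to a 401 response.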
### 2. Scraper Engine (`/engine`)
Core scraping logic (evolved from v13). Modular extraction.
**Modules:**
- `auth.js` — Login, session management, state save/load
- `search-builder.js` — Translates filter config → Reonomy UI actions
- `extractor.js` — Modular tab extraction (only grabs what user requested)
- `anti-detection.js` — Random delays, humanization, daily limits
- `queue.js` — Job queue with rate limiting
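
As an illustration of what `anti-detection.js` is described as doing (random delays and daily limits), here is a minimal sketch. The names `randomDelayMs` and `DailyLimiter` are hypothetical, and resetting the counter on the UTC calendar day is an assumption. The random source and the clock are injectable so the behavior can be tested deterministically.

```typescript
// Hypothetical sketch: function and class names are assumptions for illustration.

// Integer delay drawn uniformly from [minMs, maxMs] for integer bounds.
function randomDelayMs(
  minMs: number,
  maxMs: number,
  rng: () => number = Math.random,
): number {
  if (minMs > maxMs) throw new Error("minMs must not exceed maxMs");
  return Math.floor(minMs + rng() * (maxMs - minMs + 1));
}

class DailyLimiter {
  private day = "";
  private used = 0;
  private readonly maxPerDay: number;
  private readonly now: () => Date;

  constructor(maxPerDay: number, now: () => Date = () => new Date()) {
    this.maxPerDay = maxPerDay;
    this.now = now;
  }

  // Returns true and counts the request if today's cap has not been reached.
  tryConsume(): boolean {
    const today = this.now().toISOString().slice(0, 10); // UTC calendar day
    if (today !== this.day) {
      // First request of a new day: reset the counter.
      this.day = today;
      this.used = 0;
    }
    if (this.used >= this.maxPerDay) return false;
    this.used++;
    return true;
  }
}
```

A scrape loop would call `tryConsume()` before each property visit and sleep for `randomDelayMs(...)` between visits, stopping the job once the limiter returns `false`.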
**Extraction modules (per tab):**
- `extract-building.js` — Building & Lot data
- `extract-owner.js` — Owner names, phones, emails (CURRENT v13 logic)
- `extract-sales.js` — Sale history
- `extract-debt.js` — Mortgage/lender info
- `extract-tax.js` — Tax assessed values
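
Since the extractor only grabs what the user requested, the orchestrator needs some way to decide which tabs to open. One possible approach is sketched below. The mapping from output groups to tabs is an assumption inferred from the module list above, and `tabsToExtract` is a hypothetical name.

```typescript
// Hypothetical sketch: the group-to-tab mapping is an assumption.
type Tab = "building" | "owner" | "sales" | "debt" | "tax";

type OutputGroup =
  | "propertyInfo"
  | "ownerInfo"
  | "contactInfo"
  | "salesInfo"
  | "debtInfo"
  | "taxInfo";

// Owner names and contact details are assumed to live on the same tab.
const GROUP_TO_TAB: Record<OutputGroup, Tab> = {
  propertyInfo: "building",
  ownerInfo: "owner",
  contactInfo: "owner",
  salesInfo: "sales",
  debtInfo: "debt",
  taxInfo: "tax",
};

// Returns each tab at most once, and only for groups with at least one field.
function tabsToExtract(output: Partial<Record<OutputGroup, string[]>>): Tab[] {
  const tabs = new Set<Tab>();
  for (const group of Object.keys(GROUP_TO_TAB) as OutputGroup[]) {
    const fields = output[group];
    if (fields && fields.length > 0) tabs.add(GROUP_TO_TAB[group]);
  }
  return Array.from(tabs);
}
```

Skipping unrequested tabs reduces page interactions per property, which also helps stay under the daily limits.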
### 3. MCP Server (`/mcp-server`)
TypeScript MCP server exposing Reonomy tools.
**Tools:**
- `reonomy_search` — Configure search filters, returns search ID + count
- `reonomy_scrape` — Start extraction from a search (with output config)
- `reonomy_get_results` — Fetch results for a job
- `reonomy_get_filters` — List all available filter options
- `reonomy_export` — Export results as CSV/JSON
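
MCP tools are advertised with a name, a description, and a JSON Schema for their input. The sketch below shows how the five tools might be described as plain data. Only the tool names come from the list above; every input field (`location`, `searchId`, `jobId`, `format`) is an assumption, and no SDK call is shown.

```typescript
// Hypothetical sketch: input fields are assumptions; only tool names are from the list.
interface ToolDescriptor {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description?: string; enum?: string[] }>;
    required?: string[];
  };
}

const tools: ToolDescriptor[] = [
  {
    name: "reonomy_search",
    description: "Configure search filters; returns a search ID and result count",
    inputSchema: {
      type: "object",
      properties: { location: { type: "string", description: 'e.g. "Miami-Dade, FL"' } },
      required: ["location"],
    },
  },
  {
    name: "reonomy_scrape",
    description: "Start extraction from a search",
    inputSchema: {
      type: "object",
      properties: { searchId: { type: "string" } },
      required: ["searchId"],
    },
  },
  {
    name: "reonomy_get_results",
    description: "Fetch results for a job",
    inputSchema: {
      type: "object",
      properties: { jobId: { type: "string" } },
      required: ["jobId"],
    },
  },
  {
    name: "reonomy_get_filters",
    description: "List all available filter options",
    inputSchema: { type: "object", properties: {} },
  },
  {
    name: "reonomy_export",
    description: "Export results as CSV or JSON",
    inputSchema: {
      type: "object",
      properties: {
        jobId: { type: "string" },
        format: { type: "string", enum: ["csv", "json"] },
      },
      required: ["jobId"],
    },
  },
];

// Look up a descriptor by name, e.g. when dispatching a tool call.
function findTool(name: string): ToolDescriptor | undefined {
  return tools.find((tool) => tool.name === name);
}
```

Keeping the descriptors as data makes it easy to reuse the same schemas for validating the REST API request bodies.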
**Resources:**
- `reonomy://app/search` — Search configuration UI
- `reonomy://app/results` — Results viewer UI
- `reonomy://app/dashboard` — Dashboard with stats
### 4. MCP App(s) (`/mcp-app`)
React + Vite, with each app bundled to a single HTML file.
**Apps:**
- **Search Builder App** — Visual filter configuration
  - Location picker, property type checkboxes
  - Owner filters (phone/email toggles)
  - Building & Lot ranges
  - Output field selection
  - "Start Scrape" button
- **Results Viewer App** — Table/card view of scraped leads
  - Sortable/filterable data table
  - Expandable owner cards with contact info
  - Export to CSV button
  - Job status indicator
- **Dashboard App** — Scrape stats overview
  - Total leads, daily usage, job history
  - Properties by type chart
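
The Results Viewer's CSV export needs to serialize rows safely, because owner and company names often contain commas or quotes. A minimal sketch follows, assuming a flat row shape and RFC 4180 quoting; `toCsv` and `escapeCsv` are hypothetical names.

```typescript
// Hypothetical sketch: the flat row shape is an assumption; quoting follows RFC 4180.
type Row = Record<string, string | number | boolean | null | undefined>;

function escapeCsv(value: Row[string]): string {
  const text = value === null || value === undefined ? "" : String(value);
  // Quote fields containing commas, quotes, or line breaks; double embedded quotes.
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Emits a header line followed by one line per row, in the given column order.
function toCsv(rows: Row[], columns: string[]): string {
  const header = columns.map(escapeCsv).join(",");
  const lines = rows.map((row) =>
    columns.map((col) => escapeCsv(row[col])).join(","),
  );
  return [header, ...lines].join("\r\n");
}
```

Passing the column list explicitly lets the export honor the user's output field selection instead of dumping every stored field.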
### 5. LocalBosses Integration
- Add Reonomy channel to toolbar
- Wire MCP apps into iframe system
- API endpoints accessible from app
## Tech Stack
- **Runtime:** Node.js 22
- **API:** Express.js
- **MCP Server:** @modelcontextprotocol/sdk (TypeScript)
- **MCP App:** React 18 + Vite + Tailwind
- **Browser Automation:** agent-browser CLI
- **Queue:** In-memory (Bull optional for production)
- **Storage:** SQLite (better-sqlite3) for results + job tracking
- **Rate Limiting:** Built-in daily caps + per-request delays
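
The in-memory queue with per-request delays could be as small as the sketch below. It is a hypothetical illustration (`SpacedQueue` and `poll` are assumed names): a FIFO queue that refuses to hand out the next item until a minimum gap has passed since the last dispatch. The caller supplies the current time, which keeps the class free of timers.

```typescript
// Hypothetical sketch: class and method names are assumptions for illustration.
class SpacedQueue<T> {
  private items: T[] = [];
  private lastDispatchMs = -Infinity;
  private readonly minGapMs: number;

  constructor(minGapMs: number) {
    this.minGapMs = minGapMs;
  }

  enqueue(item: T): void {
    this.items.push(item);
  }

  // Returns the next item if the minimum gap has elapsed, otherwise undefined.
  poll(nowMs: number): T | undefined {
    if (this.items.length === 0) return undefined;
    if (nowMs - this.lastDispatchMs < this.minGapMs) return undefined;
    this.lastDispatchMs = nowMs;
    return this.items.shift();
  }

  get size(): number {
    return this.items.length;
  }
}
```

A worker loop would call `poll(Date.now())` on an interval. Because the state lives in process memory, queued jobs are lost on restart, which is one reason a persistent queue such as Bull is listed as the production option.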
## File Structure
```
reonomy-api/
├── ARCHITECTURE.md
├── package.json
├── tsconfig.json
├── src/
│   ├── server.ts              # Express API entry
│   ├── routes/
│   │   ├── scrape.ts          # /api/scrape endpoints
│   │   ├── filters.ts         # /api/filters
│   │   └── exports.ts         # /api/exports
│   ├── engine/
│   │   ├── auth.ts            # Reonomy auth
│   │   ├── search-builder.ts
│   │   ├── extractor.ts       # Orchestrates tab extraction
│   │   ├── anti-detection.ts
│   │   ├── queue.ts
│   │   └── extractors/
│   │       ├── building.ts
│   │       ├── owner.ts
│   │       ├── sales.ts
│   │       ├── debt.ts
│   │       └── tax.ts
│   ├── mcp/
│   │   ├── server.ts          # MCP server entry
│   │   ├── tools.ts           # Tool definitions
│   │   └── resources.ts       # Resource definitions
│   ├── db/
│   │   ├── schema.ts          # SQLite schema
│   │   └── queries.ts         # DB operations
│   └── types.ts               # Shared types
├── mcp-app/
│   ├── package.json
│   ├── vite.config.ts
│   ├── src/
│   │   ├── apps/
│   │   │   ├── SearchBuilder.tsx
│   │   │   ├── ResultsViewer.tsx
│   │   │   └── Dashboard.tsx
│   │   ├── components/
│   │   │   ├── FilterPanel.tsx
│   │   │   ├── PropertyCard.tsx
│   │   │   ├── DataTable.tsx
│   │   │   └── ExportButton.tsx
│   │   └── lib/
│   │       └── mcp-app-sdk.ts
│   └── dist/                  # Built HTML files
└── data/
    └── reonomy.db             # SQLite database
```
## Search Config Schema
```typescript
interface SearchConfig {
  location: string; // "Miami-Dade, FL"
  propertyTypes?: string[]; // ["Multifamily", "Multi Family (General)"]
  building?: {
    yearBuiltFrom?: number;
    yearBuiltUntil?: number;
    yearRenovatedFrom?: number;
    yearRenovatedUntil?: number;
    zoning?: string;
    lotSizeSfMin?: number;
    lotSizeSfMax?: number;
    lotSizeAcresMin?: number;
    lotSizeAcresMax?: number;
    opportunityZone?: boolean;
    totalUnitsMin?: number;
    totalUnitsMax?: number;
    buildingAreaMin?: number;
    buildingAreaMax?: number;
  };
  owner?: {
    nameOrCompany?: string;
    ownerType?: "Company" | "Person";
    includesPhone?: boolean;
    includesEmail?: boolean;
    includesMailingAddress?: boolean;
    portfolioMin?: number;
    portfolioMax?: number;
    ownerOccupied?: boolean;
    inStateOwner?: boolean;
    portfolioValueMin?: number;
    portfolioValueMax?: number;
    reportedOwner?: string;
    mailingAddress?: string;
  };
  occupants?: {
    name?: string;
    naicsSic?: string;
    website?: string;
  };
  sales?: {
    dateRange?: string;
    multiParcel?: boolean;
    priceMin?: number;
    priceMax?: number;
    pricePerSfMin?: number;
    pricePerSfMax?: number;
    likelyToSell?: boolean;
  };
  debt?: {
    amountMin?: number;
    amountMax?: number;
    originationFrom?: string;
    originationUntil?: string;
    maturityFrom?: string;
    maturityUntil?: string;
    lenderName?: string;
    cmbsLoan?: boolean;
  };
  distressed?: {
    auctionDateFrom?: string;
    auctionDateUntil?: string;
    preForeclosureCategory?: string;
    cmbsWatchlist?: boolean;
  };
}

interface OutputConfig {
  propertyInfo?: ("address" | "type" | "units" | "sqft" | "yearBuilt" | "lotSize" | "zoning" | "opportunityZone" | "apn" | "legal")[];
  ownerInfo?: ("name" | "company" | "portfolioSize" | "portfolioValue" | "ownerType")[];
  contactInfo?: ("phones" | "emails")[];
  salesInfo?: ("lastSaleDate" | "lastSalePrice" | "buyer" | "seller" | "deedType")[];
  debtInfo?: ("lender" | "loanType" | "mortgageAmount" | "maturityDate" | "interestType")[];
  taxInfo?: ("assessedValue" | "taxAmount")[];
}
```
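
Many fields in the schema come in `Min`/`Max` pairs, so a request can be sanity-checked before any browser automation starts. The sketch below walks a config and reports pairs where the minimum exceeds the maximum. `findInvertedRanges` is a hypothetical helper, and the `{ search, output }` request shape is an assumption about how the two configs are combined in the body of `POST /api/scrape`.

```typescript
// Hypothetical sketch: this helper is not part of the schema above.
function findInvertedRanges(
  config: Record<string, unknown>,
  path = "",
): string[] {
  const problems: string[] = [];
  for (const [key, value] of Object.entries(config)) {
    const here = path ? `${path}.${key}` : key;
    if (typeof value === "object" && value !== null && !Array.isArray(value)) {
      // Recurse into nested groups such as `building` or `owner`.
      problems.push(...findInvertedRanges(value as Record<string, unknown>, here));
    } else if (key.endsWith("Min") && typeof value === "number") {
      // Compare against the sibling field with the same stem and a `Max` suffix.
      const max = config[`${key.slice(0, -3)}Max`];
      if (typeof max === "number" && value > max) problems.push(here);
    }
  }
  return problems;
}

// Assumed request shape; totalUnitsMin/totalUnitsMax are deliberately inverted.
const request = {
  search: {
    location: "Miami-Dade, FL",
    propertyTypes: ["Multifamily"],
    building: { totalUnitsMin: 50, totalUnitsMax: 10 },
    owner: { includesPhone: true, portfolioMin: 2, portfolioMax: 20 },
  },
  output: {
    ownerInfo: ["name", "company"],
    contactInfo: ["phones", "emails"],
  },
};

const problems = findInvertedRanges(request.search);
```

Here `problems` contains the single path `building.totalUnitsMin`, which an API handler could return in a 400 response instead of starting a job that would find nothing.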