# Reonomy Scraper - Complete Analysis & Memory

**Last Updated:** 2026-01-13 19:43Z

---

## 🎯 Critical URL Pattern Discovery

### ✅ Working URL Patterns

```
# Search Page (property list)
https://app.reonomy.com/#!/search/{search-id}

# Property Page (with tabs)
https://app.reonomy.com/#!/property/{property-id}

# Ownership Page (WITH CONTACT INFO) ← KEY!
https://app.reonomy.com/#!/search/{search-id}/property/{property-id}/ownership
```

**Key Insight:** The `/ownership` suffix is required to get emails and phones. Direct property pages don't show contact info.

---

## 📊 DOM Structure & Contact Selectors

### Page Layout

- **Left Panel**: Map view
- **Right Panel**: Property cards (scrollable list)
- **Property Details Page**: 3 tabs
  1. **Owner** (RIGHT side, default tab) ← Contains contact info
  2. **Building and Lot** (property details)
  3. **Occupants** (tenant info)

### Contact Info Extraction (PROVEN WORKING)

```javascript
// Emails (from manually tested property)
const emails = [];
document.querySelectorAll('a[href^="mailto:"]').forEach(a => {
  const email = a.href.replace('mailto:', '');
  if (email && email.length > 5) {
    emails.push(email); // Found email!
  }
});

// Phones (from manually tested property)
const phones = [];
document.querySelectorAll('a[href^="tel:"]').forEach(a => {
  const phone = a.href.replace('tel:', '');
  if (phone && phone.length > 7) {
    phones.push(phone); // Found phone!
  }
});
```

### Property Address Extraction

```javascript
// From the first h1-h6 heading on the page
const heading = document.querySelector('h1, h2, h3, h4, h5, h6');
// querySelector returns null when no heading exists, so guard before reading it
const address = heading ? heading.textContent.trim() : null;
// Format: "123 main st, city, ST 12345"
```

### Owner Name Extraction

```javascript
// From page text
const ownerPattern = /Owner:\s*(\d+)\s+properties?\s*in\s*([A-Za-z\s,]+(?:\s*,\s+[A-Z]{2})?)/i;
const ownerMatch = document.body.innerText.match(ownerPattern);
// match() returns null when the pattern is absent, so guard before indexing.
// Group 2 is the text that follows "in"; verify on a live page that this is the owner name.
const ownerName = ownerMatch?.[2]?.trim() ?? null;
// e.g., "Helen Christian"
```

---

## 🐛 Issues Encountered

### Issue 1: Account Tier / Access Levels

- **Problem:** When the scraper navigates to `/ownership` URLs, it finds 0 emails/phones
- **Root Cause:** Different properties may have different access levels based on:
  - Premium/Free account tier
  - Property type (commercial vs residential)
  - Geographic location
  - Whether you've previously viewed the property
- **Evidence:** A manually inspected property showed 4 emails + 4 phones, but the scraper found 0

### Issue 2: Page Loading Timing

- **Problem:** Contact info loads dynamically via JavaScript/AJAX after the initial page load
- **Evidence:** Reonomy uses an SPA (Single Page Application) framework
- **Solution Needed:** Increased wait times (10-15 seconds) + checking for specific selectors

### Issue 3: Dynamic Property IDs

- **Problem:** Property IDs extracted from search results may not be the most recent/current ones
- **Evidence:** Different searches produce different property lists
- **Solution Needed:** Check the URL to confirm we're on the correct search

---

## 📂 Scraper Versions

### v1-v3.js - Basic (from earlier attempts)

- ❌ Wrong URL pattern (missing `/search/{id}`)
- ❌ Wrong selectors (complex CSS)
- ❌ No contact info extraction

### v2-v4-final.js - Direct Navigation (failed)

- ✅ Correct URL pattern: `/search/{search-id}/property/{id}/ownership`
- ❌ Navigates directly to `/ownership` without clicking through the property
- ❌ Finds 0 emails/phones on all properties

### v3-v4-v5-v6-v7-v8-v9 (various click-through attempts)

- ✅ All attempted to click property buttons first
- ❌ All found 0 emails/phones on properties
- ⚠️ Possible cause: Account access limitations, dynamic loading, wrong page state

### v9 (LATEST) - Owner Tab Extraction (current best approach)

- ✅ Extracts data from **Owner tab** (right side, default view)
- ✅ No tab clicking needed - contact info is visible by default
- ✅ Extracts: address, city, state, zip, square footage, property type, owner names, emails, phones
- ✅ Correct URL pattern with `/ownership` suffix
- ✅ 8 second wait for content to load
- ✅ Click-through approach: property button → property page → extract Owner tab → go back → next property

**File:** `reonomy-scraper-v9-owner-tab.js`

---

## 🎯 Recommended Approach

### Workflow (Based on manual inspection)

1. **Login** to Reonomy
2. **Navigate** to search
3. **Apply advanced filters** (optional but helpful):
   - "Has Phone" checkbox
   - "Has Email" checkbox
4. **Search** for location (e.g., "Eatontown, NJ")
5. **Extract property IDs** from search results
6. **For each property**:
   - Click property button (navigate into property page)
   - Wait 5-8 seconds for page to load
   - Navigate to `/ownership` tab (CRITICAL - this is where contact info is!)
   - Wait 8-10 seconds for ownership tab content to load
   - Extract contact info:
     - Emails: `a[href^="mailto:"]`
     - Phones: `a[href^="tel:"]`
     - Owner name: From page text regex
     - Property address: From h1-h6 heading
   - Go back to search results
7. **Repeat** for next property

### Key Differences from Previous Attempts

| Aspect | Old Approach | New Approach (v9) |
|---------|-------------|----------------|
| **URL** | `/property/{id}` | `/search/{id}/property/{id}/ownership` |
| **Navigation** | Direct to page | Click property → Go to ownership |
| **View** | Dashboard/Search | Owner tab (default right side) |
| **Wait Time** | 2-3 seconds | 8-10 seconds (longer) |
| **Data Source** | Not found | Owner tab content |

---

## 🚀 How to Use v9 Scraper

```bash
# Run with default settings (Eatontown, NJ)
cd /Users/jakeshore/.clawdbot/workspace
node reonomy-scraper-v9-owner-tab.js

# Run with custom location
REONOMY_LOCATION="Your City, ST" node reonomy-scraper-v9-owner-tab.js

# Run in visible mode (watch it work)
HEADLESS=false node reonomy-scraper-v9-owner-tab.js
```

### Configuration Options

```bash
# Change email/password
REONOMY_EMAIL="your-email@example.com" REONOMY_PASSWORD="yourpassword" node reonomy-scraper-v9-owner-tab.js

# Change max properties (default: 20)
MAX_PROPERTIES=50 node reonomy-scraper-v9-owner-tab.js
```

### Output

- **File:** `reonomy-leads-v9-owner-tab.json`
- **Format:** JSON with scrapeDate, location, searchId, leadCount, leads[]
- **Each lead contains:**
  - scrapeDate
  - propertyId
  - propertyUrl
  - ownershipUrl (with `/ownership` suffix)
  - address
  - city, state, zip
  - squareFootage
  - propertyType
  - ownerNames (array)
  - emails (array)
  - phones (array)

---

## 🎯 What Makes v9 Different

1. **Correct URL Pattern** - Uses `/search/{search-id}/property/{id}/ownership` (not just `/property/{id}`)
2. **Owner Tab Extraction** - Extracts from Owner tab content directly (no need to click "View Contact" button)
3. **Click-Through Workflow** - Property button → Navigate → Extract → Go back → Next property
4. **Longer Wait Times** - 10 second wait after navigation, 10 second wait after going to ownership tab
5. **Full Data Extraction** - Not just emails/phones, but also: address, city, state, zip, square footage, property type, owner names

---

## 🔧 If v9 Still Fails

### Manual Debugging Steps

1. Run in visible mode to watch the browser
2. Check if the Owner tab is the default view (it should be)
3. Verify we're on the correct search results page
4. Check if property IDs are being extracted correctly
5. Look for any "Upgrade to view contact" or "Premium only" messages

### Alternative: Try Specific Properties

Start from the manually tested property that had contact info:

- Search for: "Center Hill, FL" or the specific address of that property
- Navigate directly to that property's ownership tab

### Alternative: Check "Recently Viewed Properties"

Your account shows "Recently Viewed Properties" on the home page. These may have guaranteed access to contact info.

---

## 📝 Summary

**We've learned:**

- ✅ Correct URL pattern for contact info: `/search/{id}/property/{id}/ownership`
- ✅ Contact info is in **Owner tab** (right side, default)
- ✅ Emails: `a[href^="mailto:"]`
- ✅ Phones: `a[href^="tel:"]`
- ✅ Can extract: address, owner names, property details
- ⚠️ Contact info may be limited by account tier or property type

**Current Best Approach:** v9 Owner Tab Extractor

**Next Step:** Test v9 and see if it successfully finds contact info on properties that have it available.
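
As a starting point for that test, the URL pattern and link selectors summarized above can be checked outside the browser. The sketch below is only an illustration and is not the code in `reonomy-scraper-v9-owner-tab.js`: `buildOwnershipUrl` and `parseContactHrefs` are hypothetical helper names, the length thresholds mirror the ones in the extraction snippets, and stripping a `?subject=...` query from `mailto:` links is an added assumption.

```javascript
// Hypothetical helpers (names are illustrative, not taken from the v9 scraper).

// Build the ownership URL from the documented pattern:
// https://app.reonomy.com/#!/search/{search-id}/property/{property-id}/ownership
function buildOwnershipUrl(searchId, propertyId) {
  return `https://app.reonomy.com/#!/search/${searchId}/property/${propertyId}/ownership`;
}

// Split a list of raw href strings (as collected from mailto:/tel: anchors)
// into de-duplicated emails and phones.
function parseContactHrefs(hrefs) {
  const emails = new Set();
  const phones = new Set();
  for (const href of hrefs) {
    if (href.startsWith('mailto:')) {
      // Assumption: drop any "?subject=..." query a mailto link may carry
      const email = href.slice('mailto:'.length).split('?')[0].trim();
      if (email.length > 5) emails.add(email);
    } else if (href.startsWith('tel:')) {
      const phone = href.slice('tel:'.length).trim();
      if (phone.length > 7) phones.add(phone);
    }
    // Any other href is ignored
  }
  return { emails: [...emails], phones: [...phones] };
}
```

In the page context, the input list could be gathered with `Array.from(document.querySelectorAll('a[href^="mailto:"], a[href^="tel:"]'), a => a.getAttribute('href'))`. If that list comes back empty on a property known to have contacts, the cause is more likely load timing or account access than parsing.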