# Reonomy Lead Scraper
A browser automation tool that scrapes property and owner leads from [Reonomy](https://www.reonomy.com/) and exports them to Google Sheets.
## Features
- ✅ Automated login to Reonomy
- 🔍 Search for properties by location
- 📊 Extract lead data:
  - Owner Name
  - Property Address
  - City, State, ZIP
  - Property Type
  - Square Footage
  - Owner Location
  - Property Count
  - Property/Owner URLs
- 📈 Export to Google Sheets via `gog` CLI
- 🔐 Secure credential handling (environment variables or 1Password)
- 🖥️ Headless or visible browser mode
## Prerequisites
### Required Tools
1. **Node.js** (v14 or higher)
```bash
# Check if installed
node --version
```
2. **gog CLI** - Google Workspace command-line tool
```bash
# Install via Homebrew
brew install gog
# Or from GitHub
# https://github.com/stripe/gog
# Authenticate
gog auth login
```
3. **Puppeteer** (installed via npm with this script)
### Optional Tools
- **1Password CLI** (`op`) - For secure credential storage
```bash
brew install --cask 1password-cli
```
## Installation
1. Clone or navigate to the workspace directory:
```bash
cd /Users/jakeshore/.clawdbot/workspace
```
2. Install Node.js dependencies:
```bash
npm install
```
3. Make the script executable (should already be done):
```bash
chmod +x scrape-reonomy.sh
```
## Setup
### Option 1: Environment Variables (Recommended for Development)
Set your Reonomy credentials as environment variables:
```bash
export REONOMY_EMAIL="henry@realestateenhanced.com"
export REONOMY_PASSWORD="your_password_here"
```
Or add to your shell profile (e.g., `~/.zshrc` or `~/.bash_profile`):
```bash
echo 'export REONOMY_EMAIL="henry@realestateenhanced.com"' >> ~/.zshrc
echo 'export REONOMY_PASSWORD="your_password_here"' >> ~/.zshrc
source ~/.zshrc
```
### Option 2: 1Password (Recommended for Production)
1. Create a 1Password item named "Reonomy"
2. Add fields:
- `email`: Your Reonomy email
- `password`: Your Reonomy password
3. Use the `--1password` flag when running the scraper:
```bash
./scrape-reonomy.sh --1password
```
### Option 3: Interactive Prompt
If you don't set credentials, the script will prompt you for them:
```bash
./scrape-reonomy.sh
```
## Usage
### Basic Usage
Run the scraper with default settings (searches "New York, NY"):
```bash
./scrape-reonomy.sh
```
### Search a Different Location
```bash
./scrape-reonomy.sh --location "Los Angeles, CA"
```
### Use Existing Google Sheet
```bash
./scrape-reonomy.sh --sheet "1ABC123XYZ..."
```
### Run in Headless Mode (No Browser Window)
```bash
./scrape-reonomy.sh --headless
```
### Combined Options
```bash
# Search Chicago, use headless mode, save to existing sheet
./scrape-reonomy.sh \
--location "Chicago, IL" \
--headless \
--sheet "1ABC123XYZ..."
```
### Using 1Password
```bash
./scrape-reonomy.sh --1password --headless
```
### Direct Node.js Usage
You can also run the scraper directly with Node.js:
```bash
REONOMY_EMAIL="..." \
REONOMY_PASSWORD="..." \
REONOMY_LOCATION="Miami, FL" \
HEADLESS=true \
node reonomy-scraper.js
```
## Output
### Google Sheet
The scraper creates or appends to a Google Sheet with the following columns:
| Column | Description |
|--------|-------------|
| Scrape Date | Date the lead was scraped |
| Owner Name | Property owner's name |
| Property Address | Street address of the property |
| City | Property city |
| State | Property state |
| ZIP | Property ZIP code |
| Property Type | Type of property (e.g., "General Industrial") |
| Square Footage | Property size |
| Owner Location | Owner's location |
| Property Count | Number of properties owned |
| Property URL | Direct link to property page |
| Owner URL | Direct link to owner profile |
| Email | Owner email (if available) |
| Phone | Owner phone (if available) |
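Fields like square footage and city/state/ZIP arrive from the page as display strings. A small normalizer keeps the sheet clean; the helpers below are illustrative sketches (the real parsing lives in `reonomy-scraper.js`, and Reonomy's actual display formats may differ):

```javascript
// "12,500 sq ft" -> 12500; returns null when no number is present.
function parseSquareFootage(text) {
  if (!text) return null;
  const match = String(text).replace(/,/g, '').match(/(\d+(?:\.\d+)?)/);
  return match ? Number(match[1]) : null;
}

// "Brooklyn, NY 11201" -> { city, state, zip }; best-effort, null parts when absent.
function parseCityStateZip(text) {
  const match = String(text || '').match(/^(.*?),\s*([A-Z]{2})\s*(\d{5})?/);
  if (!match) return { city: null, state: null, zip: null };
  return { city: match[1].trim(), state: match[2], zip: match[3] || null };
}
```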
### Log File
Detailed logs are saved to:
```
/Users/jakeshore/.clawdbot/workspace/reonomy-scraper.log
```
## Command-Line Options
| Option | Description |
|--------|-------------|
| `-h, --help` | Show help message |
| `-l, --location LOC` | Search location (default: "New York, NY") |
| `-s, --sheet ID` | Google Sheet ID (creates new sheet if not provided) |
| `-H, --headless` | Run in headless mode (no browser window) |
| `--no-headless` | Run with visible browser |
| `--1password` | Fetch credentials from 1Password |
## Environment Variables
| Variable | Required | Description |
|----------|----------|-------------|
| `REONOMY_EMAIL` | Yes | Your Reonomy email address |
| `REONOMY_PASSWORD` | Yes | Your Reonomy password |
| `REONOMY_LOCATION` | No | Search location (default: "New York, NY") |
| `REONOMY_SHEET_ID` | No | Google Sheet ID (creates new sheet if not set) |
| `REONOMY_SHEET_TITLE` | No | Title for new sheet (default: "Reonomy Leads") |
| `HEADLESS` | No | Run in headless mode ("true" or "false") |
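How these variables map to runtime settings can be sketched as follows. The `loadConfig` helper is illustrative (the script's internal structure may differ), but the defaults mirror the table above:

```javascript
// Hypothetical config loader: every optional variable falls back to its
// documented default.
function loadConfig(env = process.env) {
  return {
    email: env.REONOMY_EMAIL,                        // required
    password: env.REONOMY_PASSWORD,                  // required
    location: env.REONOMY_LOCATION || 'New York, NY',
    sheetId: env.REONOMY_SHEET_ID || null,           // null => create a new sheet
    sheetTitle: env.REONOMY_SHEET_TITLE || 'Reonomy Leads',
    headless: env.HEADLESS === 'true',               // anything else => visible browser
  };
}
```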
## Troubleshooting
### "Login failed" Error
- Verify your credentials are correct
- Check if Reonomy has changed their login process
- Try running without headless mode to see what's happening:
```bash
./scrape-reonomy.sh --no-headless
```
### "gog command failed" Error
- Ensure `gog` is installed and authenticated:
```bash
gog auth login
```
- Check your Google account has Google Sheets access
### "No leads extracted" Warning
- The page structure may have changed
- The search location might not have results
- Check the screenshot saved to `/tmp/reonomy-no-leads.png` or `/tmp/reonomy-error.png`
### Puppeteer Issues
If you encounter browser-related errors, try:
```bash
npm install puppeteer --force
```
## Security Notes
### Credential Security
⚠️ **Important**: Never commit your credentials to version control!
**Best Practices:**
1. Use environment variables (set in your shell profile)
2. Use 1Password for production environments
3. Add `.env` files to `.gitignore`
4. Never hardcode credentials in scripts
### Recommended `.gitignore`
```gitignore
# Credentials
.env
.reonomy-credentials.*
# Logs
*.log
reonomy-scraper.log
# Screenshots
*.png
/tmp/reonomy-*.png
# Node
node_modules/
package-lock.json
```
## Advanced Usage
### Scheduled Scraping
You can set up a cron job to scrape automatically:
```bash
# Edit crontab
crontab -e
# Add line to scrape every morning at 9 AM
0 9 * * * /Users/jakeshore/.clawdbot/workspace/scrape-reonomy.sh --headless --1password >> /tmp/reonomy-cron.log 2>&1
```
### Custom Search Parameters
The scraper currently searches by location. To customize:
1. Edit `reonomy-scraper.js`
2. Modify the `extractLeadsFromPage` function
3. Add filters for:
   - Property type
   - Price range
   - Building size
   - Owner type
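As a starting point, such filters could be applied after extraction. The sketch below is a hypothetical example (the field names assume leads shaped like the sheet columns; the criteria keys are illustrative):

```javascript
// Hypothetical post-extraction filter. `leads` is an array of objects with
// fields mirroring the sheet columns (propertyType, squareFootage, ...).
function filterLeads(leads, { propertyType, minSqft, maxSqft } = {}) {
  return leads.filter((lead) => {
    if (propertyType && lead.propertyType !== propertyType) return false;
    if (minSqft != null && !(lead.squareFootage >= minSqft)) return false;
    if (maxSqft != null && !(lead.squareFootage <= maxSqft)) return false;
    return true;
  });
}
```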
### Integrating with Other Tools
The Google Sheet can be connected to:
- Google Data Studio for dashboards
- Zapier for automations
- Custom scripts for further processing
## Development
### File Structure
```
workspace/
├── reonomy-scraper.js # Main scraper script
├── scrape-reonomy.sh # Shell wrapper
├── package.json # Node.js dependencies
├── README.md # This file
├── reonomy-scraper.log # Run logs
└── node_modules/ # Dependencies
```
### Testing
Test the scraper in visible mode first:
```bash
./scrape-reonomy.sh --no-headless --location "Brooklyn, NY"
```
### Extending the Scraper
To add new data fields:
1. Update the `headers` array in `initializeSheet()`
2. Update the `extractLeadsFromPage()` function
3. Add new parsing functions as needed
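For example, adding a hypothetical "Year Built" column would touch both places. The snippet below is a sketch only; the actual `headers` array and extraction code live in `reonomy-scraper.js`:

```javascript
// Step 1: append the new column to the headers array in initializeSheet()
// (abbreviated; keep the existing columns in place).
const headers = [
  'Scrape Date', 'Owner Name', 'Property Address', /* ...existing columns... */
  'Year Built', // new column appended at the end
];

// Step 2: a new parsing helper to call from extractLeadsFromPage().
// Pulls the first plausible 4-digit year (1800-2099) out of a display string.
function parseYearBuilt(text) {
  const match = String(text || '').match(/\b(18|19|20)\d{2}\b/);
  return match ? Number(match[0]) : null;
}
```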
## Support
### Getting Help
- Check the log file: `reonomy-scraper.log`
- Run with visible browser to see issues: `--no-headless`
- Check screenshots in `/tmp/` directory
### Common Issues
| Issue | Solution |
|-------|----------|
| Login fails | Verify credentials, try manual login |
| No leads found | Try a different location, check search results |
| Google Sheets error | Run `gog auth login` to re-authenticate |
| Browser timeout | Increase timeout in the script |
## License
This tool is for educational and personal use. Respect Reonomy's Terms of Service when scraping.
## Changelog
### v1.0.0 (Current)
- Initial release
- Automated login
- Location-based search
- Google Sheets export
- 1Password integration
- Headless mode support