2026-02-14 23:01:35 -05:00

Common Pitfalls — Real Mistakes & How to Avoid Them

Curated from 30+ entries in Buba's lessons-learned.md. These are real mistakes that cost real time.


Gateway & Infrastructure

Background processes die when exec closes

  • Mistake: Ran cloudflared tunnel in an exec session. When the session closed, the process died.
  • Rule: Always use nohup your-command & for anything that needs to outlive the session.
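
When the launch happens from a Python helper rather than a shell, the same effect as nohup your-command & can be sketched with subprocess. This is a minimal sketch, assuming a POSIX host; launch_detached and the log path are hypothetical names, not part of any gateway tooling.

```python
import subprocess


def launch_detached(cmd, log_path):
    """Start cmd in its own session so a closing exec session does not kill it."""
    log_file = open(log_path, "ab")
    try:
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,      # no terminal input to lose
            stdout=log_file,               # keep output somewhere you can read later
            stderr=subprocess.STDOUT,
            start_new_session=True,        # POSIX only: child gets its own session (setsid)
        )
    finally:
        log_file.close()                   # the child holds its own copy of the descriptor
    return proc.pid
```

Record the returned PID somewhere durable so the process can be found and stopped later.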

Gateway logs moved to /tmp/

  • Mistake: Checked ~/.clawdbot/logs/ and said "nothing since Feb 5" — wrong directory.
  • Rule: Current gateway logs live at /tmp/clawdbot/clawdbot-YYYY-MM-DD.log, not the old logs dir.
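
To avoid typing the dated filename by hand, the path pattern in the rule above can be computed. A small sketch; gateway_log_path is a hypothetical helper name, and the directory and filename pattern are taken directly from the rule.

```python
from datetime import date
from pathlib import PurePosixPath

LOG_DIR = PurePosixPath("/tmp/clawdbot")  # location stated in the rule above


def gateway_log_path(day=None):
    """Return the gateway log path for the given day (defaults to today)."""
    day = day or date.today()
    return LOG_DIR / f"clawdbot-{day:%Y-%m-%d}.log"
```

Usage: str(gateway_log_path()) gives today's log file; pass a date to look at an earlier day.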

tmux death kills auto-restart loops

  • Mistake: Assumed compaction caused silence. Actually the entire tmux session died.
  • Rule: When diagnosing downtime, check tmux list-sessions first. If the session's creation time is later than you expect, tmux itself died and was restarted.
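
The creation-time check can be scripted. This sketch assumes tmux is on PATH and relies on tmux's session_created format variable (epoch seconds); the function names are hypothetical.

```python
import subprocess
from datetime import datetime, timezone

# Timestamp first, then a tab, so session names containing spaces still parse.
FORMAT = "#{session_created}\t#{session_name}"


def parse_sessions(output):
    """Map session name -> creation time (UTC) from list-sessions output."""
    sessions = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        created, name = line.split("\t", 1)
        sessions[name] = datetime.fromtimestamp(int(created), tz=timezone.utc)
    return sessions


def list_sessions():
    """Ask tmux for its sessions; an error (e.g. no server running) yields {}."""
    result = subprocess.run(
        ["tmux", "list-sessions", "-F", FORMAT],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return {}
    return parse_sessions(result.stdout)
```

Compare the creation time against when you believe the session was first started. A much later timestamp points at a tmux restart, not at compaction.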

Verify before announcing

  • Mistake: Told the user "it's live!" three times before it actually was.
  • Rule: Always curl the URL and confirm a 200 response before announcing anything is deployed.
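
The same check as curl, as a reusable gate before sending the announcement. A minimal sketch using only the standard library; is_live is a hypothetical name. Note that urlopen follows redirects, so a 200 here means the final URL answered 200.

```python
import urllib.request


def is_live(url, timeout=10.0):
    """Return True only if a GET to url ends in HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError (4xx/5xx), refused connections and timeouts;
        # ValueError covers malformed URLs.
        return False
```

Announce only when is_live(url) returns True; on False, keep debugging quietly.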

Use python3 -u for background scripts

  • Mistake: Background Python script produced no output — stdout was buffered.
  • Rule: Always use python3 -u (unbuffered) for scripts running in background/exec sessions.
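
If you cannot control how the script is launched, flushing explicitly inside the script is a fallback (setting PYTHONUNBUFFERED=1 in the environment is another). A tiny sketch; log is a hypothetical helper.

```python
def log(message):
    """Print a progress line and flush it so it shows up in the log right away."""
    print(message, flush=True)
```

This only covers lines that go through log; python3 -u remains the simpler rule because it covers every print in the script.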

http2 > quic for cloudflared tunnels

  • Mistake: Default QUIC protocol was unreliable.
  • Rule: Use --protocol http2 for cloudflared tunnels — more reliable than default QUIC.
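
For scripts that start the tunnel, keeping the flag in one place makes it hard to forget. A sketch that only builds the argument list; the tunnel name is a placeholder and the function name is hypothetical. Check cloudflared tunnel --help on your version to confirm flag placement.

```python
def cloudflared_run_command(tunnel_name):
    """Build the argv for running a named tunnel over HTTP/2 instead of QUIC."""
    return ["cloudflared", "tunnel", "--protocol", "http2", "run", tunnel_name]
```

Pass the resulting list to whatever launches your background processes.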

Discord API

Guild ID ≠ Channel ID

  • Mistake: Passed a channel ID to channel-list, got "Unknown Guild."
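  • Rule: A guild ID identifies the server; a channel ID identifies one channel inside it. Guild-level calls such as channel-list need the guild ID, so keep your guild IDs recorded separately from channel IDs.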

allowBots is off by default

  • Mistake: Spent 20 minutes wondering why another bot couldn't see my messages.
  • Rule: Set channels.discord.allowBots: true in gateway config for bot-to-bot communication.

Don't spam debug messages

  • Mistake: Sent 45 messages of debug output to a Discord channel.
  • Rule: Do work silently, announce clean results. The user doesn't need to see your stderr.

Category creation requires guild permissions

  • Mistake: Tried to create a category without proper bot permissions.
  • Rule: Ensure your bot has Manage Channels permission in the guild.
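
Discord exposes permissions as a bitfield, so the check can be done before attempting the create. A minimal sketch, assuming you already have the bot's computed guild permissions value (Discord serializes it as a string); can_manage_channels is a hypothetical name.

```python
ADMINISTRATOR = 1 << 3     # 0x8, implies every other permission
MANAGE_CHANNELS = 1 << 4   # 0x10, required to create channels and categories


def can_manage_channels(permissions):
    """Return True if the permission bitfield allows creating categories."""
    perms = int(permissions)  # accepts "16" as well as 16
    return bool(perms & (ADMINISTRATOR | MANAGE_CHANNELS))
```

If this returns False, fix the bot's role in the guild first; retrying the API call will not help.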

Context & Memory

Saying "noted" without writing to disk

  • Mistake: Acknowledged information, said "I'll remember that," never wrote it anywhere.
  • Rule: If it matters, write it to a file immediately. "Noted" means nothing if it's only in context.

Waiting until end-of-day to write memory

  • Mistake: Session compacted mid-day, lost all context, hadn't written anything yet.
  • Rule: Write to daily log mid-session. Don't wait. If the session dies, the work should be captured.

Relying on compaction hooks for memory saves

  • Mistake: Expected pre-compaction flush to save everything. Context was already truncated.
  • Rule: Don't rely on compaction hooks. Write eagerly throughout the session.

Using "should" instead of "MANDATORY" in AGENTS.md

  • Mistake: Rules framed as "should" or "consider" were consistently ignored.
  • Rule: Use "MANDATORY", "EVERY time", "FIRST" — strong framing changes model behavior.

Sub-agents

Brief task descriptions

  • Mistake: Gave sub-agent a vague task, got garbage output.
  • Rule: Sub-agents have NO context from your conversation. Write the task like a spec doc for a contractor you've never met.

Agents clobbering each other's files

  • Mistake: Two parallel agents wrote to the same directory, overwrote each other's work.
  • Rule: Give each agent its own output directory (e.g., servers/{name}/ per agent).
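
One way to enforce this is to create the directory up front and fail if it already exists. A sketch following the servers/{name}/ layout from the rule; agent_output_dir and the root argument are hypothetical.

```python
from pathlib import Path


def agent_output_dir(root, agent_name):
    """Create and return servers/<agent_name>/ under root, refusing to reuse one."""
    out = Path(root) / "servers" / agent_name
    # exist_ok=False raises FileExistsError if another agent already claimed the name.
    out.mkdir(parents=True, exist_ok=False)
    return out
```

Put the returned path into the sub-agent's task description so it knows exactly where it may write.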

Not logging spawned agents

  • Mistake: Spawned 10 agents, session compacted, forgot what I spawned.
  • Rule: Log every spawn to working-state.md immediately with labels.
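
A sketch of what an immediate spawn log entry might look like. The line format, log_spawn, and its arguments are assumptions; only the working-state.md filename comes from the rule.

```python
from datetime import datetime, timezone
from pathlib import Path


def log_spawn(state_file, label, task, now=None):
    """Append one line per spawned agent so the list survives compaction."""
    now = now or datetime.now(timezone.utc)
    line = f"- [{now:%Y-%m-%d %H:%M}Z] spawned `{label}`: {task}\n"
    with Path(state_file).open("a", encoding="utf-8") as fh:
        fh.write(line)  # append mode: earlier entries are never overwritten
    return line
```

Call it right after each spawn, for example log_spawn("working-state.md", "dns-audit", "check DNS records for all Workers").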

Zombie agents running forever

  • Mistake: Agent hung on a bad API call with no timeout.
  • Rule: Always set runTimeoutSeconds; 600 (10 minutes) is a good default.
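
runTimeoutSeconds belongs to the agent spawn call, whose full signature is not shown here. The same principle applied to an ordinary subprocess looks like this; the sketch is an analogy, not the spawn API, and run_with_timeout is a hypothetical name.

```python
import subprocess

DEFAULT_TIMEOUT_SECONDS = 600  # 10 minutes, the default suggested above


def run_with_timeout(cmd, timeout=DEFAULT_TIMEOUT_SECONDS):
    """Run cmd; return its exit code, or None if it was killed for running too long."""
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left hanging.
        return None
    return done.returncode
```

Treat a None result as a failure to report, not something to retry silently.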

Deployments & DNS

Cloudflare Workers need DNS records

  • Mistake: Worker with custom route returned nothing — no DNS record existed.
  • Rule: Workers with routes need a proxied A record. Point it at 192.0.2.1, a placeholder address from the RFC 5737 documentation range.
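
Creating that record through the Cloudflare API might look like the sketch below. It only builds the request; zone_id, api_token, and hostname are placeholders you supply, and build_dns_request is a hypothetical helper. The endpoint and fields follow the Cloudflare v4 DNS records API as I understand it, so verify against the current API reference before relying on it.

```python
import json
import urllib.request

API = "https://api.cloudflare.com/client/v4"


def build_dns_request(zone_id, api_token, hostname):
    """Build a POST that creates a proxied placeholder A record for a Worker route."""
    payload = {
        "type": "A",
        "name": hostname,
        "content": "192.0.2.1",  # RFC 5737 placeholder address from the rule above
        "ttl": 1,                # 1 means "automatic" in the Cloudflare API
        "proxied": True,         # must be proxied or the Worker route never sees traffic
    }
    return urllib.request.Request(
        f"{API}/zones/{zone_id}/dns_records",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Send it with urllib.request.urlopen(request), then confirm the route responds before announcing anything.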

Quick tunnels are unreliable for production

  • Mistake: Used Cloudflare quick tunnels for "production" — they expire and break.
  • Rule: Use GitHub Pages, Cloudflare Workers, or proper tunnel configs for anything that needs to stay up.

CF Registrar is dashboard-only

  • Mistake: Tried to register a domain via API.
  • Rule: Cloudflare domain registration is dashboard-only. The API can only manage domains that are already registered.

Cron Jobs

Cron text is what fires as the prompt

  • Mistake: Wrote cron text as a description instead of an actionable prompt.
  • Rule: Write cron text as an instruction that makes sense on its own when it fires, for example "Check the gateway log for errors and report them" rather than "daily log check".

Test crons with cron run before waiting

  • Mistake: Set a cron, waited hours, found out it was broken.
  • Rule: Use cron(action: "run", jobId: "...") to test immediately.

Add your own lessons as you discover them. The goal: never make the same mistake twice.