Common Pitfalls — Real Mistakes & How to Avoid Them
Curated from 30+ entries in Buba's lessons-learned.md. These are real mistakes that cost real time.
Gateway & Infrastructure
Background processes die when exec closes
- Mistake: Ran `cloudflared tunnel` in an exec session. When the session closed, the process died.
- Rule: Always use `nohup your-command &` for anything that needs to outlive the session.
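A minimal sketch of the rule above, assuming a POSIX shell. The helper name `start_detached` and the log path in the usage line are hypothetical, chosen only for illustration.

```shell
# start_detached LOGFILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND under nohup so it ignores the hangup
# signal sent when the session closes, and prints the PID of the new process.
start_detached() {
    log_file=$1
    shift
    # Redirect all three standard streams so nothing stays tied to the session.
    nohup "$@" >"$log_file" 2>&1 </dev/null &
    # $! holds the PID of the background job that was just started.
    echo "$!"
}
```

Usage might look like `pid=$(start_detached /tmp/tunnel.log your-command)`. Keeping the PID and a log file makes it possible to check on the process later or stop it with `kill "$pid"`.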
Gateway logs moved to /tmp/
- Mistake: Checked `~/.clawdbot/logs/` and said "nothing since Feb 5", but that was the wrong directory.
- Rule: Current gateway logs live at `/tmp/clawdbot/clawdbot-YYYY-MM-DD.log`, not the old logs dir.
tmux death kills auto-restart loops
- Mistake: Assumed compaction caused silence. Actually the entire tmux session died.
- Rule: When diagnosing downtime, check `tmux list-sessions` first. If the session is newer than expected, tmux died and restarted.
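One way to see at a glance whether a session is "newer than expected" is to print each session's age. This is a sketch, not part of the original notes: the function name `tmux_session_ages` is hypothetical, and it relies on tmux's `session_created` format variable (creation time as a Unix timestamp) and on `date +%s` being available.

```shell
# Prints one line per tmux session: "<age in seconds> <session name>".
tmux_session_ages() {
    now=$(date +%s)
    # The timestamp comes first so session names containing spaces survive `read`.
    tmux list-sessions -F '#{session_created} #{session_name}' 2>/dev/null |
    while read -r created name; do
        echo "$((now - created)) $name"
    done
}
```

A session that is only a few minutes old when it should have been up for days suggests that tmux itself restarted. If no tmux server is running, the function prints nothing.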
Verify before announcing
- Mistake: Told the user "it's live!" three times before it actually was.
- Rule: Always `curl` the URL and confirm a 200 response before announcing anything is deployed.
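The check can be scripted so the announcement depends on it. A minimal sketch, with `is_live` as a hypothetical helper name and the 10-second timeout as an arbitrary choice:

```shell
# is_live URL
# Succeeds (exit status 0) only if URL answers with HTTP 200.
is_live() {
    # -s hides progress output, -o /dev/null discards the body,
    # -w '%{http_code}' prints only the numeric status code.
    status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1")
    [ "$status" = "200" ]
}
```

For example, `is_live "$url" && echo "live" || echo "not live yet"`. Redirects are not followed here, so a 301 or 302 counts as "not live"; add `-L` to curl if the final destination is what matters.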
Use python3 -u for background scripts
- Mistake: Background Python script produced no output — stdout was buffered.
- Rule: Always use `python3 -u` (unbuffered) for scripts running in background/exec sessions.
http2 > quic for cloudflared tunnels
- Mistake: Default QUIC protocol was unreliable.
- Rule: Use `--protocol http2` for cloudflared tunnels; it is more reliable than the default QUIC.
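As a sketch, the flag can be wrapped in a small helper for a named tunnel. The function name `run_tunnel_http2` is hypothetical, and the example assumes a tunnel with the given name has already been created and configured.

```shell
# run_tunnel_http2 TUNNEL_NAME
# Runs an existing named tunnel, forcing the HTTP/2 transport instead of QUIC.
run_tunnel_http2() {
    cloudflared tunnel --protocol http2 run "$1"
}
```

In an exec session this would normally be combined with `nohup ... &` so the tunnel outlives the session.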
Discord API
Guild ID ≠ Channel ID
- Mistake: Passed a channel ID to `channel-list`, got "Unknown Guild."
- Rule: These are different things. Know your guild IDs separately from channel IDs.
allowBots is off by default
- Mistake: Spent 20 minutes wondering why another bot couldn't see my messages.
- Rule: Set `channels.discord.allowBots: true` in gateway config for bot-to-bot communication.
Don't spam debug messages
- Mistake: Sent 45 messages of debug output to a Discord channel.
- Rule: Do work silently, announce clean results. The user doesn't need to see your stderr.
Category creation requires guild permissions
- Mistake: Tried to create a category without proper bot permissions.
- Rule: Ensure your bot has Manage Channels permission in the guild.
Context & Memory
Saying "noted" without writing to disk
- Mistake: Acknowledged information, said "I'll remember that," never wrote it anywhere.
- Rule: If it matters, write it to a file immediately. "Noted" means nothing if it's only in context.
Waiting until end-of-day to write memory
- Mistake: Session compacted mid-day, lost all context, hadn't written anything yet.
- Rule: Write to daily log mid-session. Don't wait. If the session dies, the work should be captured.
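Writing eagerly is easier when it takes one short command. The sketch below is an assumption about layout, not the original setup: the `note` helper, the `MEMORY_DIR` variable, and the `YYYY-MM-DD.md` file naming are all hypothetical.

```shell
# note TEXT...
# Appends a timestamped line to today's log file, creating it if needed.
note() {
    dir=${MEMORY_DIR:-memory}   # hypothetical location of the daily logs
    mkdir -p "$dir"
    # Append-only write: earlier entries are never rewritten or truncated.
    printf '%s %s\n' "$(date '+%H:%M:%S')" "$*" >>"$dir/$(date '+%Y-%m-%d').md"
}
```

Calling `note "gateway logs live in /tmp/clawdbot"` the moment something is learned means the entry is already on disk if the session compacts or dies a minute later.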
Relying on compaction hooks for memory saves
- Mistake: Expected pre-compaction flush to save everything. Context was already truncated.
- Rule: Don't rely on compaction hooks. Write eagerly throughout the session.
Using "should" instead of "MANDATORY" in AGENTS.md
- Mistake: Rules framed as "should" or "consider" were consistently ignored.
- Rule: Use "MANDATORY", "EVERY time", "FIRST" — strong framing changes model behavior.
Sub-agents
Brief task descriptions
- Mistake: Gave sub-agent a vague task, got garbage output.
- Rule: Sub-agents have NO context from your conversation. Write the task like a spec doc for a contractor you've never met.
Agents clobbering each other's files
- Mistake: Two parallel agents wrote to the same directory, overwrote each other's work.
- Rule: Give each agent its own output directory (e.g., `servers/{name}/` per agent).
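A sketch that goes one step beyond the rule: it creates the per-agent directory and refuses to hand out a name twice, so two agents cannot end up sharing one. The helper name `claim_agent_dir` is hypothetical.

```shell
# claim_agent_dir NAME
# Creates servers/NAME and prints its path; fails if NAME is already taken.
claim_agent_dir() {
    mkdir -p servers || return 1
    # Plain mkdir (without -p) fails when the directory already exists,
    # so a second agent asking for the same name is rejected.
    mkdir "servers/$1" 2>/dev/null || return 1
    echo "servers/$1"
}
```

The printed path can then be passed to the agent in its task description as the only place it is allowed to write.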
Not logging spawned agents
- Mistake: Spawned 10 agents, session compacted, forgot what I spawned.
- Rule: Log every spawn to `working-state.md` immediately with labels.
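A possible shape for that log entry, as a sketch only. The `log_spawn` helper, the `STATE_FILE` override, and the line format are assumptions rather than an established convention.

```shell
# log_spawn LABEL TASK...
# Appends one list item per spawned agent: timestamp, label, short task summary.
log_spawn() {
    label=$1
    shift
    printf '%s\n' "- $(date '+%Y-%m-%d %H:%M') [$label] $*" \
        >>"${STATE_FILE:-working-state.md}"
}
```

Calling it right before each spawn, for example `log_spawn scraper-1 "Scrape docs into servers/scraper-1/"`, leaves a record that survives compaction.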
Zombie agents running forever
- Mistake: Agent hung on a bad API call with no timeout.
- Rule: Always set `runTimeoutSeconds` (600 = 10 min is a good default).
Deployments & DNS
Cloudflare Workers need DNS records
- Mistake: Worker with custom route returned nothing — no DNS record existed.
- Rule: Workers with routes need a proxied A record. Use `192.0.2.1` (RFC 5737 dummy IP).
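The record can be created in the dashboard or through the Cloudflare API. The sketch below uses the API's DNS records endpoint. The function name `create_worker_dns` is hypothetical, and it assumes `CF_API_TOKEN` holds an API token with DNS edit permission for the zone.

```shell
# create_worker_dns ZONE_ID HOSTNAME
# Creates a proxied A record pointing HOSTNAME at the RFC 5737 placeholder IP.
create_worker_dns() {
    zone_id=$1
    host=$2
    curl -s -X POST "https://api.cloudflare.com/client/v4/zones/$zone_id/dns_records" \
        -H "Authorization: Bearer $CF_API_TOKEN" \
        -H "Content-Type: application/json" \
        --data "{\"type\":\"A\",\"name\":\"$host\",\"content\":\"192.0.2.1\",\"proxied\":true}"
}
```

The address is never contacted: because the record is proxied, requests reach Cloudflare and the Worker route handles them. The JSON response includes a `success` field worth checking before moving on.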
Quick tunnels are unreliable for production
- Mistake: Used Cloudflare quick tunnels for "production" — they expire and break.
- Rule: Use GitHub Pages, Cloudflare Workers, or proper tunnel configs for anything that needs to stay up.
CF Registrar is dashboard-only
- Mistake: Tried to register a domain via API.
- Rule: Cloudflare domain registration is dashboard-only. No API. Only management of existing domains.
Cron Jobs
Cron text is what fires as the prompt
- Mistake: Wrote cron text as a description instead of an actionable prompt.
- Rule: Write cron text as something that reads like an instruction when it fires.
Test crons with cron run before waiting
- Mistake: Set a cron, waited hours, found out it was broken.
- Rule: Use `cron(action: "run", jobId: "...")` to test immediately.
Add your own lessons as you discover them. The goal: never make the same mistake twice.