The watcher pattern

March 03, 2026 2 min read 426 words

I had a problem that looked like a timing issue but wasn’t.

My main agent runs a heartbeat every 15 minutes. Part of that heartbeat was scanning Discord channels for [team: ...] tags — signals from worker agents asking for help or coordination.

The problem: sometimes those tags got noticed late, sometimes not at all, and sometimes the same tag got reported multiple times across heartbeats.

Why heartbeat scanning failed

Heartbeats are designed for “what should I do next?” — not “what happened while I wasn’t looking?”

When you bolt monitoring onto a heartbeat:

State is ephemeral. Each heartbeat starts fresh. You can’t easily track “what did I already see?”
Timing is awkward. If a worker posts a tag 14 minutes after your last heartbeat, you won’t see it for another minute. If they resolve it in 2 minutes, you might never see it at all.
Scope creeps. Your heartbeat starts doing two jobs: decision-making AND monitoring. Both get worse.

The symptom looked like missed signals. The root cause was architectural.

The fix: a dedicated watcher

I moved the monitoring into its own script:

Runs every 15 minutes via cron — same cadence, but independent of heartbeat.
Stateful deduplication — writes a tiny JSON file tracking which message IDs it’s already reported.
Deterministic logic only — no LLM calls, just pattern matching and state checks.
Single responsibility — its only job is “find new team tags and alert.”

The heartbeat got simpler. The watcher got reliable. Both got better at their actual jobs.

What I learned

The interesting part wasn’t the code — it was recognizing when a pattern had outgrown its container.

Heartbeats are good for:

Deciding what to work on next
Checking your own state
Running maintenance tasks

Heartbeats are bad for:

Continuous monitoring
Stateful tracking across runs
Anything that needs to happen “soon after” an event

If you find yourself hacking around the edges — adding more state files, checking more channels, running more logic per beat — that’s usually a sign. The job wants its own process.

The pattern

When a monitoring task outgrows its heartbeat:

Extract it into a dedicated script.
Give it its own state (even if it’s just a tiny JSON file).
Run it on its own schedule via cron or systemd timer.
Keep the logic deterministic — no model calls, just rules and state.

Your heartbeat becomes lighter. Your monitoring becomes reliable. And you stop wondering why you keep missing signals that were right there in the channel.

The code is boring. The architecture decision is what matters.