Gateway Module — `notifications/`
1. Purpose
The `notifications/` module is a background Telegram service with zero REST surface. It sends three kinds of operator alert to the home chat: dispatch completion, dispatch stall, and a degraded-tier system alert (e.g. sustained Voyage embedding auth failures). It provides idempotency (per-dispatch DB columns), in-process dedup (5-minute TTL), rate limiting (token bucket), boot-time recovery of missed terminal-dispatch notifications, and per-component state-flag suppression for the degraded tier. It gracefully degrades to a no-op layer when the Telegram credentials are absent, so the gateway always boots.
2. File Inventory
| File | Lines | Responsibility |
|---|---|---|
| `index.ts` | 87 | `createNotificationLayer(db)` factory; no-op fallback; start/stop lifecycle |
| `notification-service.ts` | 353 | Core dispatcher: idempotency, dedup, rate limit, recovery, degraded suppression |
| `message-format.ts` | 155 | Message formatters (completion / stall / degraded) + `StallEntry`/`DegradedEntry` types |
| `rate-limiter.ts` | 44 | Token-bucket rate limiter (`createRateLimiter`, refills to max every 60s) |
| `telegram.ts` | 106 | `sendTelegramMessage`: Telegram Bot API transport |
| Total | 745 | |
3. Public API Surface
REST Endpoints
No HTTP surface. This is a background service invoked in-process. It exposes no router and is not mounted in the Express app.
MCP Tools
None.
Consumers (in-process, not HTTP)
- Constructed at `index.ts` line 388 (`createNotificationLayer(db)`); `.start()` at line 940, `.stop()` at line 951.
- `notifications.service` is injected into the dispatch layer (`createDispatchLayer(db, psychicCache.repo, notifications.service)`, `index.ts` line 430). Dispatch GC fires `notifyCompletion` (`dispatch/dispatch-gc.ts:192`).
- The embedding provider's `onAuthFailure` hook wires to `sendDegradedAlert({ component: 'voyage-embeddings', … })` and clears the flag via `clearDegradedFlag` on the next successful embedding (`index.ts` lines 395–406).
4. Internal API
`NotificationService` (returned from `createNotificationService`):
- `notifyCompletion(dispatch: DispatchRecord): Promise<void>`: idempotent (skips if `completion_notified_at` is set; marks the column on send).
- `notifyStall(dispatch: StallEntry, elapsedStr): Promise<void>`: idempotent via `hang_notified_at`.
- `recoverMissedNotifications(): Promise<void>`: boot scan for terminal dispatches (completed/failed, un-notified, last 24h).
- `sendDegradedAlert(entry: DegradedEntry): Promise<void>`: per-component state-flag suppression (primary) + 15-min rate-limiter floor (secondary, gray-fox C4).
- `clearDegradedFlag(component)` / `isDegradedFlagSet(component)` (test helper).

`NotificationLayer`: `{ service, start(), stop() }`. `createRateLimiter(max)`: `RateLimiter` (`tryConsume`, `stop`).
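To make the limiter contract concrete, here is a minimal sketch of a token bucket that refills to `max` on a fixed 60s interval. Only the names `createRateLimiter`, `tryConsume`, and `stop` come from this spec; the internals, the `REFILL_INTERVAL_MS` constant, and the exact shape of the `RateLimiter` interface are assumptions, not a copy of `rate-limiter.ts`.

```typescript
// Assumed shape of the RateLimiter interface; only the method names are documented.
interface RateLimiter {
  tryConsume(): boolean;
  stop(): void;
}

// Hypothetical constant; the spec only says the bucket refills every 60s.
const REFILL_INTERVAL_MS = 60 * 1000;

function createRateLimiter(max: number): RateLimiter {
  let tokens = max;

  // Reset the budget to the full maximum on every tick (no partial refill).
  const timer: ReturnType<typeof setInterval> = setInterval(() => {
    tokens = max;
  }, REFILL_INTERVAL_MS);

  return {
    // Takes one token if available; returns false when the budget is spent.
    tryConsume(): boolean {
      if (tokens <= 0) return false;
      tokens -= 1;
      return true;
    },
    // Clears the refill timer so the process can shut down cleanly.
    stop(): void {
      clearInterval(timer);
    },
  };
}
```

A caller would typically check `tryConsume()` immediately before sending and drop or defer the message when it returns `false`.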
5. Background Services
- Boot recovery: `start()` fires `recoverMissedNotifications()` asynchronously via `setImmediate` (never blocks boot).
- Rate-limiter timers: two token buckets, the main bucket (`NOTIFICATION_RATE_LIMIT_PER_MIN`, default 10/min) and an independent `degradedRateLimiter` (max 1 → 1/15-min effective, AK-321/gray-fox C4). Both refill on a 60s interval; `stop()` clears them.
- In-process dedup set: 5-minute TTL, keyed by `completion:<id>` / `stall:<id>`, purged on access.
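The dedup set described above can be sketched as a map from key to expiry time that is purged whenever it is touched. The names `createDedupSet`, `markIfNew`, and `DEDUP_TTL_MS`, and the injectable clock, are hypothetical; the spec only states the 5-minute TTL, the key format, and purge-on-access.

```typescript
// Hypothetical constant for the documented 5-minute TTL.
const DEDUP_TTL_MS = 5 * 60 * 1000;

// `now` is injectable so the expiry logic can be exercised without waiting.
function createDedupSet(ttlMs: number = DEDUP_TTL_MS, now: () => number = Date.now) {
  const seen = new Map<string, number>(); // key -> expiry timestamp in ms

  // Drop every entry whose TTL has elapsed.
  function purge(): void {
    const t = now();
    seen.forEach((expiresAt, key) => {
      if (expiresAt <= t) seen.delete(key);
    });
  }

  return {
    // Returns true the first time a key is seen inside the TTL window,
    // false for a duplicate. Purges expired entries on every access.
    markIfNew(key: string): boolean {
      purge();
      if (seen.has(key)) return false;
      seen.set(key, now() + ttlMs);
      return true;
    },
  };
}
```

With this shape, a completion for dispatch 42 would be keyed as `completion:42` and a stall as `stall:42`, so the two alert kinds never suppress each other.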
6. Data Contracts
No Zod schemas (no inbound HTTP). Internal types: `NotificationConfig` (`botToken`, `chatId`, `rateLimitPerMin`), `DispatchRecord` (imported from `dispatch/`), `StallEntry` + `DegradedEntry` (from `message-format.ts`). Idempotency is enforced by two nullable columns on `agent_dispatch` (`completion_notified_at`, `hang_notified_at`) updated with `WHERE … IS NULL` guards.
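The `IS NULL` guard can be illustrated with a guarded `UPDATE` whose affected-row count tells the caller whether it was the one that set the column. This is a sketch under assumptions: the `id` primary-key column, the ISO timestamp value, the `markCompletionNotified` helper, and the `SqliteLike` interface (a structural subset of the `prepare(...).run(...)` call shape in `better-sqlite3`) are all hypothetical, and the spec does not say whether the real service runs the update before or after the Telegram send.

```typescript
// Structural subset of the better-sqlite3 call shape used by this sketch.
interface SqliteLike {
  prepare(sql: string): { run(...params: unknown[]): { changes: number } };
}

// Assumes an `id` column on agent_dispatch; only the two *_notified_at
// columns are named in the spec.
const MARK_COMPLETION_SQL = `
  UPDATE agent_dispatch
     SET completion_notified_at = ?
   WHERE id = ?
     AND completion_notified_at IS NULL`;

// Returns true only when this call changed the column from NULL,
// so a second caller for the same dispatch sees false.
function markCompletionNotified(db: SqliteLike, dispatchId: number, nowIso: string): boolean {
  const result = db.prepare(MARK_COMPLETION_SQL).run(nowIso, dispatchId);
  return result.changes === 1;
}
```

The same pattern would apply to `hang_notified_at` for stall notifications.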
7. Dependencies
- Gateway modules consumed: `dispatch/` (`DispatchRecord` type; injected into dispatch layer + GC).
- External libraries: `better-sqlite3`; Telegram Bot API over HTTP (in `telegram.ts`).
- Environment variables: `TELEGRAM_BOT_TOKEN`, `TELEGRAM_HOME_CHAT_ID` (both required; absence → no-op layer with a WARN log), `NOTIFICATION_RATE_LIMIT_PER_MIN` (optional, default 10).
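The environment handling above can be sketched as a small resolver that returns `null` when either credential is missing, which is the signal to build the no-op layer. The function name `resolveNotificationConfig`, the field types on `NotificationConfig`, and the fallback to 10 for an unparseable rate limit are assumptions; the spec only lists the variable names, the field names, and the default.

```typescript
// Field names come from the spec; the types are assumed.
interface NotificationConfig {
  botToken: string;
  chatId: string;
  rateLimitPerMin: number;
}

// Hypothetical constant for the documented default of 10 per minute.
const DEFAULT_RATE_LIMIT_PER_MIN = 10;

// Returns null when either Telegram credential is absent, so the caller
// can fall back to a no-op layer instead of failing at boot.
function resolveNotificationConfig(
  env: Record<string, string | undefined>,
): NotificationConfig | null {
  const botToken = env.TELEGRAM_BOT_TOKEN;
  const chatId = env.TELEGRAM_HOME_CHAT_ID;
  if (!botToken || !chatId) return null;

  // Missing or non-numeric values fall back to the default (assumed behaviour).
  const parsed = parseInt(env.NOTIFICATION_RATE_LIMIT_PER_MIN ?? "", 10);
  const rateLimitPerMin = parsed > 0 ? parsed : DEFAULT_RATE_LIMIT_PER_MIN;

  return { botToken, chatId, rateLimitPerMin };
}
```

In a Node.js process this would be called as `resolveNotificationConfig(process.env)`, with a WARN log emitted by the caller when the result is `null`.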
8. Test Coverage
| Layer | File | Cases |
|---|---|---|
| Layer + service | test/notifications-layer.test.ts (484 L) | 24 it blocks |
| Degraded tier | test/notifications-degraded.test.ts (472 L) | (state-flag suppression + 15-min floor) |
Covers idempotency, dedup, rate limiting, boot recovery, no-op fallback, and degraded suppression/clear semantics.
9. Known Limitations
- In-process dedup only: the 5-minute dedup set is per-process; a gateway restart within the window could re-send a notification that the DB idempotency columns don't yet cover (the DB guard is the durable backstop; dedup is the race guard).
- Degraded ack is flag-based, not command-based: the alert text retains a `/degraded-ack` instruction as a documented future override, but no Telegram bot command-handler infrastructure exists yet; suppression auto-clears only on component recovery.
- Degraded 15-min cooldown is encoded as `max=1` on a 60s-refill bucket + per-component flag; the flag is the real guard, the limiter is only the floor for flag-clear races.
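The interaction between the per-component flag and the limiter floor might look roughly like the sketch below. The `createDegradedGate` and `shouldAlert` names are hypothetical, and the ordering (flag check first, limiter second, flag set only when an alert is actually allowed) is an assumption about how the two guards combine; only `clearDegradedFlag` and `isDegradedFlagSet` are names from this spec.

```typescript
// `tryConsumeFloor` stands in for the degraded limiter's tryConsume().
function createDegradedGate(tryConsumeFloor: () => boolean) {
  const flagged = new Set<string>(); // components with an outstanding alert

  return {
    // True when an alert for this component should be sent now.
    shouldAlert(component: string): boolean {
      if (flagged.has(component)) return false; // primary guard: already alerted
      if (!tryConsumeFloor()) return false;     // secondary guard: cooldown floor
      flagged.add(component);
      return true;
    },
    // Called on component recovery so the next failure alerts again.
    clearDegradedFlag(component: string): void {
      flagged.delete(component);
    },
    isDegradedFlagSet(component: string): boolean {
      return flagged.has(component);
    },
  };
}
```

In this sketch a rapid fail, recover, fail sequence clears the flag but is still held back by the limiter, which is the flag-clear race the floor is meant to cover.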
10. Change History
| Date | Dispatch | Summary |
|---|---|---|
| 2026-07-04 | 2297 | Initial module spec (smoke-clone-3, LisaOS audit campaign Phase 4) |
| — | (AK-321) | Degraded-tier alerting + independent 15-min cooldown bucket (gray-fox C4) |
| — | 371 | Module created — completion/stall notifications, idempotency, recovery |