Auth — the authorisation chokepoint
How the gateway authorises every request: the Bearer matrix philosophy, the two-token trust model, and the fail-closed posture. A security-architecture overview — parameters withheld by design.
This is a security-architecture overview. It publishes the concepts — the authorisation model, the trust tiers, the fail-closed posture. It deliberately withholds the parameters that would help an attacker: the exact exemption map, rate-limiter thresholds, refund semantics, and any network coordinate. The wall is described; its blueprints are not.
What it is
The auth module is the gateway's single authorisation chokepoint. It is middleware, not routes: it exposes no REST surface of its own. It is mounted globally, ahead of every route, so every request to every other module passes through it first. There is no second door.
The module was deliberately extracted, without changing behaviour, to be the single source of truth for the gateway's authorisation posture, and it is shared verbatim between the live application and its integration test. That sharing is a design choice with teeth: a future change to the authorisation logic MUST break the test rather than silently drift. The wall and the test of the wall are the same object.
The authorisation model
Every request is evaluated by an ordered set of rules that resolve it into one of three published outcomes:
| Outcome | Meaning |
|---|---|
| public (health only) | A tiny, fixed set of endpoints answers without a token: liveness/health and the static dashboard shell. They expose no data beyond "the service is up". |
| internal | The dashboard's same-origin proxy surface. The browser dashboard reads these without carrying a token; the privileged token stays server-side and is never shipped to the client. |
| Bearer | Everything else. A valid token is required — and write methods require the write token specifically. |
The exact membership of the public and internal sets — which paths, in which order, under which predicate — is withheld. What matters for a reader is the posture: the default is Bearer-required, and the exempt set is small, fixed, and data-free by construction.
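As an illustration of the ordered evaluation described above, the sketch below walks a rule list and falls through to Bearer when nothing matches. It is a minimal TypeScript sketch, not the module's code: the type names, the `classify` function, and both example paths are hypothetical placeholders, since the real exemption map is withheld.

```typescript
// Hypothetical sketch. The real rules, their order, and their paths are withheld.
type Outcome = "public" | "internal" | "bearer";

interface ExemptionRule {
  outcome: "public" | "internal";
  matches: (method: string, path: string) => boolean;
}

// Placeholder paths for illustration only; NOT the gateway's exemption map.
const exampleRules: ExemptionRule[] = [
  {
    outcome: "public",
    matches: (method, path) => method === "GET" && path === "/example-health",
  },
  {
    outcome: "internal",
    matches: (method, path) => method === "GET" && path.startsWith("/example-dashboard/"),
  },
];

// Rules are checked in order; the first match wins.
function classify(
  method: string,
  path: string,
  rules: ExemptionRule[] = exampleRules,
): Outcome {
  for (const rule of rules) {
    if (rule.matches(method, path)) return rule.outcome;
  }
  // Default posture: anything not explicitly exempt requires a Bearer token.
  return "bearer";
}
```

Under these placeholder rules, `classify("GET", "/example-health")` resolves to `"public"`, while any unlisted path, or any non-GET method on a listed one, resolves to `"bearer"`.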
Why the default flipped to Bearer-required
The gateway can be made world-routable (e.g. behind a tunnel). "It's only reachable on the private network" is therefore no longer a safe basis for any exemption. A large block of formerly exempt read routes was removed from the exempt set and now falls through to the Bearer gate. The safe default is to authenticate, and the exemptions are the rare, justified exceptions, not the rule.
Two-token trust model
Authorisation runs on two tokens, not one:
- A write token authorises reads and writes.
- A read-only token authorises GETs (and the streaming handshake) only.
The separation is load-bearing: a leaked read-only token cannot authorise writes, because write methods bypass the read-authorisation path entirely and demand the write token at the final gate. The blast radius of the lower-privilege credential is bounded to reads by construction.
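The final gate can be sketched as a small decision function. This is a hedged TypeScript illustration: the `ResolvedCredential` type and the `finalGate` name are hypothetical, the step that matches a presented token to a credential is assumed to have already happened, and the streaming handshake is left out.

```typescript
// Hypothetical sketch of the final gate. Token matching is assumed to happen
// earlier and to yield one of these three results.
type ResolvedCredential = "write" | "read-only" | "none";

function finalGate(method: string, credential: ResolvedCredential): "allow" | "deny" {
  const isRead = method.toUpperCase() === "GET";

  if (!isRead) {
    // Write methods never consult the read path: only the write token passes.
    return credential === "write" ? "allow" : "deny";
  }

  // Reads accept either token; a request with no valid token is denied.
  return credential === "write" || credential === "read-only" ? "allow" : "deny";
}
```

With this shape, `finalGate("POST", "read-only")` is `"deny"` regardless of what the read path would have decided, which is the bounded blast radius the model relies on.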
Some read endpoints that would otherwise be broadly readable are pulled back behind a mandatory Bearer even though they sit among read surfaces — because they expose session metadata, transcripts, or full conversation turns. Operator-state never rides the low bar.
Fail-closed posture
The module is built to fail closed — when anything is ambiguous, the answer is "no":
- Constant-time token comparison. Tokens are compared with a timing-safe primitive behind a length pre-check that is itself constant-time. A wrong length and a wrong value are indistinguishable to a caller — no timing oracle.
- Byte-identical rejection. An auth failure returns the same response body every time, leaking nothing about why it failed.
- Traversal sentinel first. Path normalisation rejects dot-segment trickery before any token comparison, and the ordering of the normalisation steps is itself load-bearing (a reordering was a real historical finding).
- Mandatory at boot. If the write token cannot be resolved, the gateway throws at boot. There is no "auth-disabled" mode to fall into.
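Three of these properties are easy to show in miniature. The sketch below is an assumption-laden TypeScript illustration, not the module's implementation: it uses Node's `crypto.timingSafeEqual`, and it sidesteps length differences by comparing fixed-length SHA-256 digests, which is one common technique and may differ from the module's own length pre-check. The `REJECTION` status and body, the function names, and the single decode pass in `hasDotSegment` are all hypothetical.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Compare fixed-length digests so timingSafeEqual never receives inputs of
// different lengths (it throws if they differ). A substitute technique, assumed here.
function safeEqual(presented: string, expected: string): boolean {
  const a = createHash("sha256").update(presented, "utf8").digest();
  const b = createHash("sha256").update(expected, "utf8").digest();
  return timingSafeEqual(a, b);
}

// One frozen rejection, reused for every failure. Status and body are placeholders.
const REJECTION = Object.freeze({ status: 401, body: '{"error":"unauthorized"}' });

// Reject dot segments after decoding, so "%2e%2e" cannot slip past the check.
// Malformed escapes are ambiguous, so they are rejected as well.
function hasDotSegment(rawPath: string): boolean {
  let decoded: string;
  try {
    decoded = decodeURIComponent(rawPath);
  } catch {
    return true;
  }
  return decoded
    .replace(/\\/g, "/")
    .split("/")
    .some((segment) => segment === "." || segment === "..");
}

// The traversal check runs before any token comparison. `expected` is assumed
// to be non-empty, as a boot-time guard would ensure.
function check(rawPath: string, presented: string | undefined, expected: string) {
  if (hasDotSegment(rawPath)) return REJECTION;
  const matched = safeEqual(presented ?? "", expected);
  return matched && presented !== undefined ? { status: 200, body: "ok" } : REJECTION;
}
```

A missing token, a token of the wrong length, and a token with the wrong value all return the same `REJECTION` object here, so the caller learns nothing from the response about which case applied.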
Rate-limiting the wall
Failed authorisation attempts are rate-limited to blunt brute force. The design carries three ideas worth publishing (without their numbers):
- Trust tiers. Two independent buckets run in sequence — a strict public bucket that is the world-facing wall, and a looser bucket that grants a bounded (never infinite) allowance to a single trusted egress. The two are mutually exclusive, so a request is counted by exactly one bucket, or by neither if it is exempt.
- The refund invariant. A lockout drains on schedule — throttle responses are refunded to the bucket so a legitimate caller recovers — while genuine auth failures stay counted, so the brute-force wall actually holds. Draining the wall and holding the wall are reconciled by which outcomes are refunded.
- Never trust loopback. The trusted set must never contain loopback. Under a public tunnel, loopback becomes the ingress proxy's address; trusting it would hand the loose bucket to all public traffic. This is a hard boot-time guard, not a convention.
The exact thresholds, windows, trusted coordinate, and refund status codes are withheld — they are the wall's blueprints.
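The three ideas can be sketched together. Everything concrete below is hypothetical: the limit passed to the bucket, the `203.0.113.7` documentation address, and all type and function names are placeholders, and the time window that drains a bucket is omitted for brevity. Refunding successful requests as well as throttled ones is also an assumption. The sketch only illustrates the shape of the invariants, not the gateway's limiter.

```typescript
type Tier = "public" | "trusted" | "exempt";
type Settled = "auth-failure" | "throttled" | "ok";

function isLoopback(address: string): boolean {
  return (
    address === "::1" ||
    address === "localhost" ||
    address.startsWith("127.") ||
    address.startsWith("::ffff:127.")
  );
}

// Boot-time guard: refuse to start if the trusted set contains loopback.
function buildTrustedSet(addresses: string[]): ReadonlySet<string> {
  for (const address of addresses) {
    if (isLoopback(address)) throw new Error("trusted set must not contain loopback");
  }
  return new Set(addresses);
}

// Exactly one bucket counts a request, or none if the request is exempt.
function pickTier(address: string, exempt: boolean, trusted: ReadonlySet<string>): Tier {
  if (exempt) return "exempt";
  return trusted.has(address) ? "trusted" : "public";
}

class FailureBucket {
  private readonly limit: number;
  private readonly counts = new Map<string, number>();

  constructor(limit: number) {
    this.limit = limit;
  }

  isLocked(key: string): boolean {
    return (this.counts.get(key) ?? 0) >= this.limit;
  }

  // Every request is counted on arrival...
  hit(key: string): void {
    this.counts.set(key, (this.counts.get(key) ?? 0) + 1);
  }

  // ...and refunded once settled, unless it was a genuine auth failure.
  settle(key: string, outcome: Settled): void {
    if (outcome === "auth-failure") return;
    this.counts.set(key, Math.max(0, (this.counts.get(key) ?? 0) - 1));
  }
}
```

Because throttled responses are refunded in `settle`, a locked-out caller who keeps retrying does not push the count any higher, while each genuine failure stays on the books.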
What lives here
One middleware factory builds the request gate and the failed-auth limiter, plus a handful of pure helpers (constant-time compare, path normalisation, the exemption predicate shared between the gate and the limiter's skip logic, and a minimal cookie reader for the streaming handshake). It reads no environment directly: token resolution happens upstream and is passed in. It consumes no other gateway module — it is upstream of all of them.
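The factory shape described here might look like the following. This is a hypothetical TypeScript sketch under stated assumptions: `createAuth`, `ResolvedTokens`, and `GateRequest` are invented names, the limiter is left out, and the digest-based comparison stands in for the module's own constant-time helper. The point is only that tokens arrive as arguments, and that an unresolved write token stops the process at boot.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

interface ResolvedTokens {
  writeToken?: string;
  readToken?: string;
}

interface GateRequest {
  method: string;
  authorization?: string;
}

const digest = (value: string) => createHash("sha256").update(value, "utf8").digest();

// Tokens are resolved upstream and passed in; nothing here reads process.env.
function createAuth(tokens: ResolvedTokens) {
  if (!tokens.writeToken) {
    // No auth-disabled mode: an unresolved write token is fatal at boot.
    throw new Error("write token unresolved; refusing to start");
  }
  const writeDigest = digest(tokens.writeToken);
  const readDigest = tokens.readToken ? digest(tokens.readToken) : undefined;

  const gate = (req: GateRequest): boolean => {
    const header = req.authorization ?? "";
    const presented = digest(header.startsWith("Bearer ") ? header.slice(7) : "");
    const isWrite = timingSafeEqual(presented, writeDigest);
    const isRead = readDigest !== undefined && timingSafeEqual(presented, readDigest);
    return req.method === "GET" ? isWrite || isRead : isWrite;
  };

  return { gate };
}

// Production and the integration test would both call the same factory.
const { gate } = createAuth({ writeToken: "example-write", readToken: "example-read" });
```

Passing the tokens in keeps the factory free of environment access, so a test can mount exactly the same gate as production with throwaway tokens such as the `example-write` value above.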
Test coverage
The authorisation matrix, the dual-credential read path, the constant-time compare, the traversal sentinel, the streaming handshake, and both rate-limit buckets are exercised across three suites — a unit suite over the matrix internals, an integration suite that mounts the same factory as production, and a middleware-behaviour suite. Because production and test share the object, coverage of the wall is coverage of the real wall.