fix(acl): hard-reject a lapsed token binding instead of prompting (#36) #38

Merged
padreug merged 1 commit from issue-36-expired-binding-hard-reject into dev 2026-06-21 12:57:47 +00:00
Owner

Closes #36.

Problem

checkIfPubkeyAllowed returned the same undefined for two very different situations:

  • the binding never existed for this (method, kind), and
  • the binding existed and lapsed (token expired or token-revoked).

undefined routes the caller into the admin-prompt path. For an interactive operator that's fine; for an unattended client (a bitspire ATM) there is no admin to prompt, so the request hangs until a BunkerTimeoutError.

This diverges from a KeyUser-level revoke, which exits at step 2 with false → a clean BunkerRejectedError. The Sintra smoke proved it on hardware:

operator action ACL exit spire sees
revoke (KeyUser.revokedAt) false BunkerRejectedError → "Pairing Required"
TTL expiry (Token.expiresAt) undefined BunkerTimeoutError → "Signer Unreachable"

Same operator intent ("this pairing is over"), two outcomes, one broken.

Fix

Before the final undefined, classify the no-live-grant case. If a token bound to this KeyUser would have granted the request (its policy carries a matching rule; for connect, any bound token) but is now expired or token-revoked, return false so the client re-pairs immediately. A genuinely never-granted (method, kind) still returns undefined, preserving the admin-approval path for new permission.

The (method, kind) matcher (policyMatch) is hoisted so the live-grant query and the lapsed-binding query share one predicate.

Scope note — usage-cap exhaustion stays undefined

A windowed cap (#28) is a temporary rate-limit that refills as the window rolls — not a permanent lapse — so it must not be reclassed as the re-pair signal. A dedicated rate-limit reply is a separate follow-up; this PR is strictly the expiry/revoke lifecycle lapse.

Tests

tests/acl.integration.test.ts (the #29/#33 DB harness):

  • expired token -> sign_event and token-revoked -> sign_event now assert false (were undefined).
  • expired token -> connect asserts false.
  • two distinction guards: a live token queried for a method outside its policy, and an expired token queried for a method it never covered, both stay undefined — proving the re-pair signal doesn't swallow genuinely-new permission.
test:all → 7 lifecycle + 24 integration, all pass

tsc --noEmit introduces no new errors (the 3 pre-existing authorize.ts errors are unchanged); npm run build (tsup) green.

🤖 Generated with Claude Code

Closes #36. ## Problem `checkIfPubkeyAllowed` returned the same `undefined` for two very different situations: - the binding **never existed** for this `(method, kind)`, and - the binding **existed and lapsed** (token expired or token-revoked). `undefined` routes the caller into the admin-prompt path. For an interactive operator that's fine; for an **unattended client** (a bitspire ATM) there is no admin to prompt, so the request hangs until a `BunkerTimeoutError`. This diverges from a `KeyUser`-level revoke, which exits at step 2 with `false` → a clean `BunkerRejectedError`. The Sintra smoke proved it on hardware: | operator action | ACL exit | spire sees | |---|---|---| | revoke (KeyUser.revokedAt) | `false` | `BunkerRejectedError` → "Pairing Required" ✅ | | TTL expiry (Token.expiresAt) | `undefined` | `BunkerTimeoutError` → "Signer Unreachable" ❌ | Same operator intent ("this pairing is over"), two outcomes, one broken. ## Fix Before the final `undefined`, classify the no-live-grant case. If a token bound to this `KeyUser` **would** have granted the request (its policy carries a matching rule; for `connect`, any bound token) but is now **expired or token-revoked**, return `false` so the client re-pairs immediately. A genuinely never-granted `(method, kind)` still returns `undefined`, preserving the admin-approval path for new permission. The `(method, kind)` matcher (`policyMatch`) is hoisted so the live-grant query and the lapsed-binding query share one predicate. ### Scope note — usage-cap exhaustion stays `undefined` A windowed cap (#28) is a *temporary* rate-limit that refills as the window rolls — not a permanent lapse — so it must **not** be reclassed as the re-pair signal. A dedicated rate-limit reply is a separate follow-up; this PR is strictly the expiry/revoke lifecycle lapse. ## Tests `tests/acl.integration.test.ts` (the #29/#33 DB harness): - `expired token -> sign_event` and `token-revoked -> sign_event` now assert **`false`** (were `undefined`). - `expired token -> connect` asserts **`false`**. - two distinction guards: a live token queried for a method outside its policy, and an expired token queried for a method it never covered, both stay **`undefined`** — proving the re-pair signal doesn't swallow genuinely-new permission. ``` test:all → 7 lifecycle + 24 integration, all pass ``` `tsc --noEmit` introduces no new errors (the 3 pre-existing `authorize.ts` errors are unchanged); `npm run build` (tsup) green. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
fix(acl): hard-reject a lapsed token binding instead of prompting (#36)
Some checks failed
Docker image / build-and-push-image (push) Has been cancelled
14d48ca0f9
A request that finds no live token grant exited `checkIfPubkeyAllowed`
at `undefined` regardless of *why* — whether the binding never existed
or had simply lapsed (expired / token-revoked). `undefined` routes the
caller into the admin-prompt path, which for an unattended client (an
ATM spire) means the request hangs until a BunkerTimeoutError.

The Sintra smoke proved the divergence directly: a KeyUser-level revoke
exits at step 2 with `false` and the spire sees a clean BunkerRejected
("Pairing Required"), but a TTL expiry fell through to `undefined` and
the spire saw a BunkerTimeout ("Signer Unreachable") — same operator
intent ("this pairing is over"), two different, one-broken outcomes.

Classify the no-live-grant case before returning: if a token bound to
this KeyUser *would* have granted the request (its policy carries a
matching rule; for `connect`, any bound token) but is now expired or
token-revoked, return `false` so the client re-pairs immediately. Only
a genuinely never-granted (method/kind) request stays `undefined` so an
admin can still approve new permission out-of-band.

Usage-cap exhaustion is left at `undefined` deliberately: a windowed
cap is a temporary rate-limit that refills as the window rolls, not a
permanent lapse, so it must not be reclassed as the re-pair signal. A
dedicated rate-limit reply is a separate follow-up.

Tests: the #24 expired-token and token-revoke guards now assert `false`;
added connect-lapse, and two distinction cases proving a never-granted
method (live token, or a method the lapsed token never covered) stays
`undefined`.
Author
Owner

Reviewed from the bitspire/consumer side — LGTM, approve.

The lapsed-vs-never-granted distinction is exactly right, and the two guard tests (live token, method outside policy → undefined and expired token, method it never covered → undefined) lock the crucial property that the re-pair signal doesn't swallow genuinely-new permission. The connect case (any lapsed token → false) is correct, and re-pair with a fresh client key is unaffected since it's a different KeyUser — which is what leg 9 showed live on the Sintra.

No consumer-side change needed: my existing BunkerRejectedError → "Pairing Required" path absorbs the now-false expiry. Once this is on dev and the stack rebuilds, the leg-8 case flips from "Signer Unreachable" → "Pairing Required" with zero bitspire change. Happy to re-run leg 8 against the rebuilt bunker to confirm the flip on hardware.

One non-blocking follow-up (you already scoped it out, just noting the consumer tie-in): usage-cap exhaustion staying undefined means an unattended client hitting a cap hangs the same way expiry used to. The clean answer is the dedicated rate_limited reply you mention — and the bitspire error layer already has the matching slot (LnbitsErrorCode.RateLimited → retry-long-backoff, bitspire#52 Phase D), so both ends are pre-shaped for it whenever caps get applied to spire policies (#28).

Nice fix — turns the smoke finding into a clean lifecycle symmetry.

Reviewed from the bitspire/consumer side — **LGTM, approve.** The lapsed-vs-never-granted distinction is exactly right, and the two guard tests (`live token, method outside policy → undefined` and `expired token, method it never covered → undefined`) lock the crucial property that the re-pair signal doesn't swallow genuinely-new permission. The `connect` case (any lapsed token → `false`) is correct, and re-pair with a fresh client key is unaffected since it's a different `KeyUser` — which is what leg 9 showed live on the Sintra. **No consumer-side change needed:** my existing `BunkerRejectedError` → "Pairing Required" path absorbs the now-`false` expiry. Once this is on `dev` and the stack rebuilds, the leg-8 case flips from "Signer Unreachable" → "Pairing Required" with zero bitspire change. Happy to re-run leg 8 against the rebuilt bunker to confirm the flip on hardware. One non-blocking follow-up (you already scoped it out, just noting the consumer tie-in): usage-cap exhaustion staying `undefined` means an *unattended* client hitting a cap hangs the same way expiry used to. The clean answer is the dedicated `rate_limited` reply you mention — and the bitspire error layer already has the matching slot (`LnbitsErrorCode.RateLimited` → retry-long-backoff, bitspire#52 Phase D), so both ends are pre-shaped for it whenever caps get applied to spire policies (#28). Nice fix — turns the smoke finding into a clean lifecycle symmetry.
padreug deleted branch issue-36-expired-binding-hard-reject 2026-06-21 12:57:47 +00:00
Sign in to join this conversation.
No reviewers
No labels
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
aiolabs/nsecbunkerd!38
No description provided.