Boot-time autounlock of encrypted keys from a configured passphrase source #16

Closed
opened 2026-05-31 13:27:32 +00:00 by padreug · 0 comments
Owner

Why

Today every bunker restart (container rebuild, NDK bump, host reboot, config change, OOM) leaves all encrypted keys locked on disk. Each lnbits-side RemoteBunkerSigner account, each webapp NIP-46 client, each future bunker-backed extension then needs an out-of-band unlock_key admin RPC against the bunker before any signing / encrypting / decrypting works. With one operator it's a 30-second nuisance. With N operators it's an O(N) manual step per restart and a guaranteed forgot-to-unlock-operator-7 bug.

The right place to fix this is at the bunker — keys are nsecbunkerd's domain, the lock state belongs to it, and ANY NIP-46 consumer benefits without per-consumer orchestration code. Filed as the architecturally-correct path per coord-log 2026-05-31T13:30Z (lnbits → all) where the design was ratified.

Diagnosis context: coord log 2026-05-31T13:30Z (the ask) + 2026-05-31T13:50Z (NDK 3.0.3 smoke closed, autounlock is the natural next paper-cut). Both entries are in archive/2026-05-31-pre-rotation.md; these references will need to be re-anchored once the entries roll off the active log.

Design surface

Configuration — two env vars, mutually exclusive

  • NSECBUNKER_AUTOUNLOCK_PASSPHRASE — literal passphrase string. Useful for dev / docker compose .env flows.
  • NSECBUNKER_AUTOUNLOCK_PASSPHRASE_FILE — path to a file containing the passphrase (newline-trimmed at read). Idiomatic for sops / systemd-LoadCredential / k8s-secret / external secrets-manager flows where the passphrase lives in a separate credential store.

If both are set, fail loud at boot. If neither is set, autounlock is off by default — preserves today's "explicit per-restart unlock" posture for security-conscious deployments that want crypto-capability restoration to gate on a human action.
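
The resolution rules above can be sketched as a small pure function. This is a minimal sketch under stated assumptions, not the daemon's actual code: `resolveAutounlockPassphrase`, the `Env` type, and the injected `readFile` callback are hypothetical names; only the two env var names come from this issue. Treating an empty string as "set" and stripping only trailing CR/LF are also assumptions.

```typescript
// Hypothetical helper: every name except the two env vars is illustrative.
type Env = Record<string, string | undefined>;

interface AutounlockSource {
  source: "NSECBUNKER_AUTOUNLOCK_PASSPHRASE" | "NSECBUNKER_AUTOUNLOCK_PASSPHRASE_FILE";
  passphrase: string;
}

// Returns null when autounlock is off (neither var set).
// Throws when both are set, so the caller can fail loud at boot.
function resolveAutounlockPassphrase(
  env: Env,
  readFile: (path: string) => string,
): AutounlockSource | null {
  const literal = env.NSECBUNKER_AUTOUNLOCK_PASSPHRASE;
  const file = env.NSECBUNKER_AUTOUNLOCK_PASSPHRASE_FILE;

  if (literal !== undefined && file !== undefined) {
    throw new Error(
      "autounlock: NSECBUNKER_AUTOUNLOCK_PASSPHRASE and " +
        "NSECBUNKER_AUTOUNLOCK_PASSPHRASE_FILE are mutually exclusive",
    );
  }
  if (literal !== undefined) {
    return { source: "NSECBUNKER_AUTOUNLOCK_PASSPHRASE", passphrase: literal };
  }
  if (file !== undefined) {
    // Assumption: strip trailing newlines only; other whitespace may be
    // a legitimate part of the passphrase.
    const passphrase = readFile(file).replace(/[\r\n]+$/, "");
    return { source: "NSECBUNKER_AUTOUNLOCK_PASSPHRASE_FILE", passphrase };
  }
  return null;
}
```

In the daemon this would presumably be called with `process.env` and a thin wrapper around `fs.readFileSync(path, "utf8")`; injecting the reader keeps the rule set testable without touching disk.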

Boot-sequence placement

The unlock loop has to wedge between two existing milestones:

  • After DB load + the daemon's relay subscriptions for both kind-24134 (admin) and kind-24133 (NIP-46 signing) are established
  • Before the NIP-46 backend accepts client RPCs — otherwise there's a race window where a client sees "key locked" intermittently while the loop is in flight

Cleanest gate: hold Backend.start() until the autounlock loop completes (or hold each per-key Backend.start() until that key's unlock returns). The existing Daemon.startKeys() loop in src/daemon/run.ts is the obvious place to wedge — call autounlock-per-key inline before invoking startKey(name, nsec).
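
The per-key variant of that gate might look roughly like the sketch below. It is illustrative only: `startKeysGated`, `UnlockFn`, and `StartKeyFn` are hypothetical stand-ins, and the real `Daemon.startKeys()` / `startKey(name, nsec)` signatures have not been checked here. It assumes an unlock step that yields the decrypted nsec (or null on failure) and that a key whose unlock fails is simply left locked for a later manual `unlock_key`.

```typescript
// Stand-ins for the daemon's real methods; these signatures are assumptions.
type UnlockFn = (keyName: string, passphrase: string) => Promise<string | null>;
type StartKeyFn = (keyName: string, nsec: string) => Promise<void>;

// For each key: unlock first, and only then start that key's backend.
// Returns the names of the keys whose backends were started.
async function startKeysGated(
  keyNames: string[],
  passphrase: string,
  unlock: UnlockFn,
  startKey: StartKeyFn,
): Promise<string[]> {
  const started: string[] = [];
  for (const keyName of keyNames) {
    let nsec: string | null = null;
    try {
      nsec = await unlock(keyName, passphrase);
    } catch {
      nsec = null; // a throwing unlock is treated like a failed one
    }
    if (nsec === null) continue; // key stays locked; its backend never starts

    // The backend for this key begins accepting client RPCs only after
    // its unlock has returned, so no client can observe "key locked".
    await startKey(keyName, nsec);
    started.push(keyName);
  }
  return started;
}
```

Gating per key rather than on the whole loop means one slow or failing key does not delay RPC availability for keys that have already unlocked.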

Unlock loop (pseudocode from lnbits's design)

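// Assumes this is a method on the class that owns unlockKey (presumably Daemon
// in src/daemon/run.ts); as a free function, `this` would be undefined.
async autounlockAllKeys(passphrase: string): Promise<{ success: number; total: number }> {
  // Soft-deleted rows are skipped.
  const keys = await prisma.key.findMany({ where: { deletedAt: null } });
  let success = 0;
  for (const key of keys) {
    try {
      const result = await this.unlockKey(key.keyName, passphrase);
      if (result) { logger.info(`autounlock: unlocked ${key.keyName}`); success++; }
      else        { logger.warn(`autounlock: unlockKey returned false for ${key.keyName} (wrong passphrase?)`); }
    } catch (e: unknown) {
      // Continue on error so one bad row cannot block the remaining keys.
      const message = e instanceof Error ? e.message : String(e);
      logger.error(`autounlock: ${key.keyName} failed: ${message}`);
    }
  }
  return { success, total: keys.length };
}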

Sequential is fine: log clarity matters more than parallelism here, because the unlock itself is a cheap ChaCha20 operation and per-key Backend startup is the real cost. The per-key try/catch keeps one bad row from blocking the rest of the fleet.

Single-passphrase invariant (document, don't promise multi-key)

Every create_new_key(name, passphrase) in our usage today is called with the same passphrase (LNBITS_NSEC_BUNKER_KEYSTORE_PASSPHRASE). The Key table doesn't carry per-key passphrase metadata, so a single autounlock passphrase covers every row. Document this in the config docstring + docs/AUTOUNLOCK.md so future-us doesn't accidentally promise multi-passphrase support. Per-key passphrase support is a separate feature (separate column + per-key map), explicitly out of scope here.

Idempotency of unlockKey

unlockKey(name, passphrase) should be safe to call on an already-unlocked key. Spot-check the existing impl at src/daemon/run.ts — if it throws on "already unlocked", small fix to swallow that case. Idempotency matters because:

  • Belt-and-suspenders deployments might still run a lnbits-side fallback that re-fires unlock against keys the bunker already handled
  • Future ops scripts might do periodic unlock sweeps
  • The autounlock loop's continue-on-error contract assumes one bad call doesn't poison subsequent ones
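
If the spot-check shows the existing implementation is not idempotent, one possible shape for the fix is a guard in front of it. This is a sketch under assumptions: `makeIdempotentUnlock`, `UnlockKeyFn`, and the `isUnlocked` predicate are hypothetical, and how the daemon actually tracks unlocked keys has not been verified here.

```typescript
// Hypothetical signature; the real unlockKey's return type is assumed boolean-like.
type UnlockKeyFn = (keyName: string, passphrase: string) => Promise<boolean>;

// Wraps an unlock function so that calling it on an already-unlocked key
// is a no-op that reports success instead of throwing or re-unlocking.
function makeIdempotentUnlock(
  unlockKey: UnlockKeyFn,
  isUnlocked: (keyName: string) => boolean,
): UnlockKeyFn {
  return async (keyName, passphrase) => {
    if (isUnlocked(keyName)) {
      return true; // already unlocked: nothing to do
    }
    return unlockKey(keyName, passphrase);
  };
}
```

One design consequence worth noting: on the already-unlocked path the passphrase is never checked, so a `true` result reports key state rather than passphrase validity.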

Observability

  • One boot-summary log line: autounlock: enabled (source=NSECBUNKER_AUTOUNLOCK_PASSPHRASE_FILE), unlocked N/M keys in <Xms>
  • Per-key INFO log on success (terse: autounlock: unlocked <keyName>)
  • Per-key WARN/ERROR on failure (loud — operators need to notice which key didn't unlock and why)
  • Optional: nsecbunkerd_keys_unlocked_total / nsecbunkerd_keys_locked_total metrics for any future Prometheus exporter — not blocking, just useful to design for
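
A small formatter keeps the boot-summary line stable for log scraping. The sketch below is illustrative: `formatAutounlockSummary` and `summaryLevel` are hypothetical names, and escalating the summary to WARN when any key stays locked is an assumption layered on the bullets above, not something this issue specifies.

```typescript
// Hypothetical helpers; the message shape follows the summary line above.
function formatAutounlockSummary(
  source: string,
  success: number,
  total: number,
  elapsedMs: number,
): string {
  return (
    `autounlock: enabled (source=${source}), ` +
    `unlocked ${success}/${total} keys in ${Math.round(elapsedMs)}ms`
  );
}

// Assumption: a partial unlock should be louder than a clean one.
function summaryLevel(success: number, total: number): "info" | "warn" {
  return success === total ? "info" : "warn";
}
```

The source name is logged, never the passphrase itself or the file's contents.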

Security disclaimer (docs/AUTOUNLOCK.md)

Spell out the trade explicitly so deployments choose deliberately:

  • "Enabling autounlock means whoever can read the passphrase source can recover any key from the bunker disk. The encrypt-at-rest property is preserved against cat /var/lib/nsecbunker/*.db alone, but lost if the attacker ALSO has the passphrase source."
  • "Recommended: passphrase file on a separate volume / mount with stricter access (mounted via systemd-LoadCredential, sops-decrypted at boot, etc.)."
  • "For security-conscious deployments: leave autounlock off; orchestrate unlock per-restart from an external process with a hardware-prompted passphrase or HSM-derived secret."

Acceptance criteria

  • NSECBUNKER_AUTOUNLOCK_PASSPHRASE + NSECBUNKER_AUTOUNLOCK_PASSPHRASE_FILE env vars wired
  • Both set → fail loud at boot (ambiguous config)
  • Neither set → behavior unchanged (manual unlock_key admin RPC per key per restart)
  • Either set → on boot walk Key table, call unlockKey per row, log per-key result + one summary line
  • NIP-46 channel doesn't accept client RPCs until the autounlock loop completes (no "key locked" race)
  • unlockKey() is idempotent (safe to re-call against an already-unlocked key)
  • docs/AUTOUNLOCK.md with the security trade-off
  • Smoke test: bunker container with autounlock env set + 2 encrypted keys provisioned + container restart → both keys come back unlocked without external intervention; lnbits's RemoteBunkerSigner.nip44_decrypt round-trips against both

Out of scope (separate issues if/when needed)

  • Per-key passphrase support (per-key keystorePassphraseSelector + passphrase map)
  • Passphrase rotation (admin RPC to re-encrypt every key under a new passphrase)
  • HSM/hardware-derived passphrase delivery — orthogonal to where the passphrase comes from at unlock time

Files for reference

  • src/daemon/run.ts — existing unlockKey(keyName, passphrase) method, Daemon.startKeys() loop, Daemon.start() orchestration
  • src/daemon/admin/commands/unlock_key.ts — admin RPC wrapper, reference for existing unlock-error handling
  • prisma/schema.prisma — Key table (already has deletedAt for the soft-delete skip)

refs: aiolabs/nsecbunkerd#15 (merged — NDK 3.0.3 bump, the structural fix the autounlock builds on), coord log 2026-05-31T13:30Z (lnbits design ratification, archived), coord log 2026-05-31T13:50Z (NDK 3 smoke closed via operator publish — first-ever working nip44_encrypt end-to-end)
