feat(daemon): boot-time autounlock of encrypted keys (#16)

Adds opt-in autounlock to the daemon's boot sequence. Closes the
"O(N) manual unlock_key RPC per bunker restart" paper-cut without
breaking the secure-by-default posture: deployments that want every
restart to gate crypto capability on a human action keep that
property by leaving both env vars unset.

Configuration — two mutually exclusive env vars:

  NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE       literal passphrase
  NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE  path (newline-trimmed)

Both set → fail loud at boot. Neither set → no-op (default,
behavior unchanged from pre-#16). Var names follow the bunker's
existing NSEC_BUNKER_* convention (see NSEC_BUNKER_DEBUG_TRANSPORT,
NSEC_BUNKER_DISABLE_WATCHDOG); the design issue spec'd NSECBUNKER_*
but aligning with the existing prefix matters more for operator
muscle-memory than matching the issue text verbatim.

Implementation:

  - `Daemon.maybeAutounlock()` wedged at the tail of `startKeys()`.
    Inherits the relay-subscription lifecycle (EOSE-awaited per #9)
    that the existing per-key startKey calls established, so there's
    no "client sees key locked" race window.
  - Enumeration via `prisma.key.findMany({ where: { deletedAt: null } })`
    — Key table is the canonical source of truth for what keys exist
    on the bunker; respects soft-delete.
  - Per-key call to the existing `unlockKey(keyName, passphrase)`,
    which is idempotent post-#16 — encrypted-at-rest keys get unlocked
    on first call; rows already loaded via the unencrypted-config
    passes above are no-ops.
  - Sequential loop with continue-on-error. One bad row (corrupted
    blob, key encrypted under a historical passphrase, etc.) doesn't
    block the rest of the fleet. Per-key INFO/WARN/ERROR + one
    summary line.
  - File-source error (missing path, permission denied) is fatal at
    boot — same severity as a misconfig.

Observability output:

  🔓 autounlock: unlocked <keyName>                                    (success)
  ⚠️  autounlock: unlockKey returned false for <keyName> (...)         (soft fail)
  ❌ autounlock: <keyName> failed: <message>                           (throw)
  🔓 autounlock: enabled (source=<env>), unlocked N/M keys in <Xms>    (summary)

Single-passphrase invariant: every `create_new_key(name, passphrase)`
in our usage today uses the same passphrase
(LNBITS_NSEC_BUNKER_KEYSTORE_PASSPHRASE on the lnbits side), so one
autounlock passphrase covers every encrypted key. Per-key passphrase
support is a separate feature (out of scope — see #16 "out of scope"
section + docs/AUTOUNLOCK.md "What's not in scope").

`docs/AUTOUNLOCK.md` ships alongside: usage, the security trade
spelled out by deployment shape, observability hooks, what's
deliberately not in scope. Required-reading link before any operator
flips the env var on for a production-shaped deployment.

Refs aiolabs/nsecbunkerd#16. Builds on idempotent unlockKey from the
previous commit on this branch.
Padreug 2026-05-31 15:31:25 +02:00
commit 7a3cb4f3da
2 changed files with 237 additions and 0 deletions

docs/AUTOUNLOCK.md Normal file

@ -0,0 +1,140 @@
# Boot-time autounlock

`nsecbunkerd` stores each managed key encrypted at rest in
`nsecbunker.db`. By default, every key is **locked** after the daemon
starts — clients must drive an `unlock_key` admin RPC against the
bunker before signing / encrypting / decrypting works for that key.

Autounlock is an opt-in feature that, when enabled, reads a
passphrase from a configured source at boot and unlocks every
non-soft-deleted key in the `Key` table automatically. This buys
operational simplicity at the cost of a documented security weakening;
read this whole document before enabling.

## Configuration

Two mutually-exclusive environment variables:

| Var | Meaning |
|---|---|
| `NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE` | Literal passphrase string. Useful for dev / `docker compose .env` flows. |
| `NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE` | Path to a file containing the passphrase (newline-trimmed at read). Idiomatic for sops / systemd-LoadCredential / k8s-secret / external secrets-manager flows where the passphrase comes from a separate credential store. |

**If both are set, the daemon fails loud at boot** with an explicit
error. Ambiguous config is never allowed to silently pick one.

**If neither is set, autounlock is off** — behavior is identical to
pre-#16: keys remain locked until an admin `unlock_key` RPC fires per
key per restart.
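
The selection rules above can be sketched roughly as follows. This is a minimal illustration, not the daemon's actual code: `resolveAutounlockSource`, the `AutounlockSource` shape, and the `"env"` / `"file"` labels are hypothetical names, and treating an empty-string variable as "set" is an assumption. The file reader is injected so the sketch stays free of runtime dependencies.

```typescript
// Hypothetical helper for illustration; names are not taken from the daemon.
type Env = Record<string, string | undefined>;

interface AutounlockSource {
  source: "env" | "file"; // assumed labels for where the passphrase came from
  passphrase: string;
}

function resolveAutounlockSource(
  env: Env,
  readFile: (path: string) => string,
): AutounlockSource | null {
  const literal = env.NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE;
  const file = env.NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE;

  // Both set: refuse to guess which one the operator meant.
  if (literal !== undefined && file !== undefined) {
    throw new Error(
      "NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE and " +
        "NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE are mutually exclusive",
    );
  }

  if (literal !== undefined) {
    return { source: "env", passphrase: literal };
  }

  if (file !== undefined) {
    // A read error (missing path, permission denied) propagates to the
    // caller, which is expected to treat it as fatal at boot.
    const raw = readFile(file);
    // Strip trailing newline(s) only; other whitespace is kept as-is.
    return { source: "file", passphrase: raw.replace(/(?:\r?\n)+$/, "") };
  }

  // Neither set: autounlock stays off.
  return null;
}
```

In a Node.js process this would presumably be called with `process.env` and a reader such as `(p) => fs.readFileSync(p, "utf8")`; a `null` result means the autounlock pass is skipped entirely.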

## What happens at boot when autounlock is on

After the daemon's existing key-loading passes complete (unencrypted
keys from in-process config, plain-key entries in `nsecbunker.json`),
the autounlock pass runs:

1. Read the passphrase from the configured source. Failure to read
(missing file, no permission) is fatal at boot.
2. Enumerate every row in the `Key` Prisma table where
`deletedAt IS NULL`.
3. For each row, call `unlockKey(keyName, passphrase)`. `unlockKey`
is idempotent post-#16: if the key was already unlocked by a
prior pass, it's a no-op.
4. Log per-key INFO on success, WARN on `unlockKey → false`
(typically: wrong passphrase, possibly the key was created under a
historical passphrase that differs from the current one), ERROR on
throw (typically: corrupted blob).
5. Log one summary line:
   `🔓 autounlock: enabled (source=<env>), unlocked N/M keys in <Xms>`.

The loop is sequential: log clarity matters more than parallelism, and
the unlock op itself is cheap (one ChaCha20 decrypt per key). For 100
keys it takes milliseconds. If a fleet ever reaches the thousands,
parallelize then.

The NIP-46 client channel doesn't accept RPCs that route to a key
until that key's `Backend.start()` resolves — which happens inside
`unlockKey`. So there's no race window where a freshly-restarted
bunker would say "key locked" to a client while the loop is in
flight on that key.
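
The pass described in this section can be sketched as a sequential loop with continue-on-error. This is an illustration under stated assumptions, not the code in `src/daemon/run.ts`: `autounlockAll`, `KeyRow`, `UnlockFn`, and `Logger` are hypothetical names, rows are filtered in memory rather than through the Prisma query, and counting only `true` results as "unlocked" is a guess at how the summary is tallied.

```typescript
// Hypothetical shapes for illustration only.
interface KeyRow {
  keyName: string;
  deletedAt: Date | null;
}

type UnlockFn = (keyName: string, passphrase: string) => Promise<boolean>;

interface Logger {
  info(message: string): void;
  warn(message: string): void;
  error(message: string): void;
}

async function autounlockAll(
  rows: KeyRow[],
  passphrase: string,
  unlockKey: UnlockFn,
  log: Logger,
  source: string,
): Promise<{ unlocked: number; total: number }> {
  const started = Date.now();
  // Respect soft-delete: only rows with deletedAt === null are attempted.
  const live = rows.filter((row) => row.deletedAt === null);
  let unlocked = 0;

  // Sequential on purpose: one key at a time keeps the log readable.
  for (const row of live) {
    try {
      const ok = await unlockKey(row.keyName, passphrase);
      if (ok) {
        unlocked += 1;
        log.info(`🔓 autounlock: unlocked ${row.keyName}`);
      } else {
        log.warn(`⚠️ autounlock: unlockKey returned false for ${row.keyName}`);
      }
    } catch (err) {
      // One bad row must not block the rest, so log and continue.
      const message = err instanceof Error ? err.message : String(err);
      log.error(`❌ autounlock: ${row.keyName} failed: ${message}`);
    }
  }

  const elapsed = Date.now() - started;
  log.info(
    `🔓 autounlock: enabled (source=${source}), ` +
      `unlocked ${unlocked}/${live.length} keys in ${elapsed}ms`,
  );
  return { unlocked, total: live.length };
}
```

Because each `unlockKey` call is awaited before the next one starts, a throw or a `false` result for one key only affects that key's log line and the final N/M count.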

## The security trade-off

Enabling autounlock means **whoever can read the passphrase source
can recover any key from the bunker disk.** Specifically:

- The encrypt-at-rest property of `nsecbunker.db` is *preserved*
against `cat /var/lib/nsecbunker/*.db` alone — the database holds
ciphertext + IV per key, not plaintext.
- The encrypt-at-rest property is *lost* if the attacker also has
access to the passphrase source. Anyone with read access to the
passphrase env var, the passphrase file, or the process memory at
the moment of autounlock can decrypt every key.

This is the same trade today's deployments already make when they
hold the passphrase in `lnbits`'s env to drive `unlock_key` RPCs
post-restart. Autounlock makes the trade *explicit at the bunker
level* and *visible per-deployment*, but it doesn't introduce a new
trust requirement that didn't already exist for any deployment using
external automation to drive unlocks.

### Recommendations by deployment shape

- **Dev / regtest / single-host:** literal `NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE`
in `docker compose .env` is fine. The threat model on a dev box
doesn't justify the file-source ceremony.
- **Single-tenant production:** passphrase file on a separate
volume / mount with stricter access. Mount via
`systemd-LoadCredential` so the file is only readable by the
bunker process and is materialized from a sops-decrypted source
at boot. Avoid baking the passphrase into the container image or
process env list (which leaks into `ps aux`, container labels, etc.).
- **Multi-tenant / high-security:** leave autounlock off. Orchestrate
unlock per-restart from an external process that prompts for the
passphrase out-of-band (hardware token, HSM-derived secret, human
approval). This preserves the property that bunker startup alone
doesn't restore crypto capability — a deliberate human action is
required.

## What's *not* in scope

These are deliberately out of scope for the autounlock feature.
Separate issues to file if needed:

- **Per-key passphrase support.** The current `Key` table doesn't
carry per-key passphrase metadata; every `create_new_key(name, passphrase)`
in our usage today uses the same passphrase
(`LNBITS_NSEC_BUNKER_KEYSTORE_PASSPHRASE`). The autounlock
passphrase covers every encrypted key by virtue of this
single-passphrase invariant. If a deployment ever needs per-key
passphrases, that's a separate feature (per-key passphrase-selector
column + per-key passphrase map).
- **Passphrase rotation.** Re-encrypting every key under a new
passphrase belongs in a dedicated admin RPC (`rotate_keystore`),
not in autounlock.
- **HSM / hardware-derived passphrase delivery.** Orthogonal to
where the passphrase comes from at unlock time — autounlock just
reads a string. An HSM integration would land between the
hardware and the file the bunker reads from.

## Observability hooks

The autounlock pass emits:

- `🔓 autounlock: unlocked <keyName>` (INFO, one per success)
- `⚠️ autounlock: unlockKey returned false for <keyName> ...` (WARN, one per soft failure)
- `❌ autounlock: <keyName> failed: <err.message>` (ERROR, one per throw)
- `🔓 autounlock: enabled (source=<env>), unlocked N/M keys in <Xms>` (summary, once)

When the optional Prometheus exporter lands, counters
`nsecbunkerd_keys_unlocked_total` and `nsecbunkerd_keys_locked_total`
will be reported from the autounlock summary state. The current
implementation doesn't export metrics — the log line is the
canonical signal.
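
Since the summary log line is the canonical signal for now, a log-based check could parse it and flag a boot where N is less than M. The sketch below is one way to do that; `parseAutounlockSummary` is a hypothetical helper, and it assumes `<Xms>` renders as an integer followed by `ms` (for example `12ms`) and that the source label contains no closing parenthesis.

```typescript
// Hypothetical parser for the summary line; the exact rendering of the
// elapsed-time field is an assumption.
interface AutounlockSummary {
  source: string;
  unlocked: number;
  total: number;
  elapsedMs: number;
}

function parseAutounlockSummary(line: string): AutounlockSummary | null {
  const pattern =
    /autounlock: enabled \(source=([^)]+)\), unlocked (\d+)\/(\d+) keys in (\d+)ms/;
  const match = pattern.exec(line);
  if (match === null) {
    return null; // not a summary line (e.g. a per-key line)
  }
  return {
    source: match[1],
    unlocked: Number(match[2]),
    total: Number(match[3]),
    elapsedMs: Number(match[4]),
  };
}

// True when at least one live key failed to unlock during the pass.
function hasLockedKeys(summary: AutounlockSummary): boolean {
  return summary.unlocked < summary.total;
}
```

A monitoring script might feed each daemon log line through `parseAutounlockSummary` and alert when `hasLockedKeys` returns true, then look at the per-key WARN/ERROR lines for the cause.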

## See also

- `src/daemon/run.ts:Daemon.maybeAutounlock` — implementation
- `src/daemon/run.ts:Daemon.unlockKey` — the idempotent per-key call
- `src/daemon/admin/commands/unlock_key.ts` — the admin-RPC wrapper for manual unlock
- aiolabs/nsecbunkerd#16 — issue with full design rationale + acceptance criteria
- aiolabs/nsecbunkerd#15 — NDK 3.0.3 bump (the structural fix this builds on)