From 7a3cb4f3da5d113180d88212b50888ce4de81da3 Mon Sep 17 00:00:00 2001
From: Padreug <padreug@aiolabs.dev>
Date: Sun, 31 May 2026 15:31:25 +0200
Subject: [PATCH] feat(daemon): boot-time autounlock of encrypted keys (#16)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds opt-in autounlock to the daemon's boot sequence. Closes the
"O(N) manual unlock_key RPC per bunker restart" paper-cut without
breaking the secure-by-default posture: deployments that want every
restart to gate crypto capability on a human action keep that
property by leaving both env vars unset.

Configuration — two mutually exclusive env vars:

  NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE       literal passphrase
  NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE  path (newline-trimmed)

Both set → fail loud at boot. Neither set → no-op (default,
behavior unchanged from pre-#16). Var names follow the bunker's
existing NSEC_BUNKER_* convention (see NSEC_BUNKER_DEBUG_TRANSPORT,
NSEC_BUNKER_DISABLE_WATCHDOG); the design issue spec'd NSECBUNKER_*
but aligning with the existing prefix matters more for operator
muscle-memory than matching the issue text verbatim.
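
For reviewers who want the source-resolution rules in isolation, here
is a minimal sketch as a pure function. It is illustrative only and not
part of this patch: `resolveAutounlockPassphrase`, `Env`, `ReadFile`
and `PassphraseSource` are hypothetical names, and the file reader is
injected so the rules can be exercised without touching `process.env`
or the disk.

```typescript
// Hypothetical helper (not in this patch): the env-var rules as a pure function.
type Env = Record<string, string | undefined>;
type ReadFile = (path: string) => string;

interface PassphraseSource {
    passphrase: string;
    source: string; // label used in the summary log line
}

function resolveAutounlockPassphrase(env: Env, readFile: ReadFile): PassphraseSource | null {
    const literal = env["NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE"];
    const filePath = env["NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE"];

    // Both set: ambiguous config, refuse to pick one.
    if (literal && filePath) {
        throw new Error(
            "NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE and " +
            "NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE are mutually exclusive"
        );
    }
    if (literal) {
        return { passphrase: literal, source: "NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE" };
    }
    if (filePath) {
        // Strip exactly one trailing newline (LF or CRLF), as the patch does.
        const passphrase = readFile(filePath).replace(/\r?\n$/, "");
        return { passphrase, source: `NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE=${filePath}` };
    }
    return null; // neither set: autounlock stays off
}
```

In the daemon this would be called with `process.env` and a thin
wrapper around `fs.readFileSync(path, 'utf8')`; a read error would
propagate and abort boot.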

Implementation:

  - `Daemon.maybeAutounlock()` wedged at the tail of `startKeys()`.
    Inherits the relay-subscription lifecycle (EOSE-awaited per #9)
    that the existing per-key startKey calls established, so there's
    no "client sees key locked" race window.
  - Enumeration via `prisma.key.findMany({ where: { deletedAt: null } })`
    — Key table is the canonical source of truth for what keys exist
    on the bunker; respects soft-delete.
  - Per-key call to the existing `unlockKey(keyName, passphrase)`,
    which is idempotent post-#16 — encrypted-at-rest keys get unlocked
    on first call; rows already loaded via the unencrypted-config
    passes above are no-ops.
  - Sequential loop with continue-on-error. One bad row (corrupted
    blob, key encrypted under a historical passphrase, etc.) doesn't
    block the rest of the fleet. Per-key INFO/WARN/ERROR + one
    summary line.
  - File-source error (missing path, permission denied) is fatal at
    boot — same severity as a misconfig.
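
The continue-on-error loop described above can be sketched on its own
as follows. This is an illustration under assumptions, not the code in
this patch: `unlockSequentially`, `Unlock` and `UnlockSummary` are
hypothetical names, and `unlock` stands in for `Daemon.unlockKey`.

```typescript
// Hypothetical sketch: `unlock` stands in for Daemon.unlockKey.
type Unlock = (keyName: string, passphrase: string) => Promise<boolean>;

interface UnlockSummary {
    unlocked: string[]; // unlock resolved to true
    refused: string[];  // unlock resolved to false (e.g. wrong passphrase)
    errored: string[];  // unlock threw (e.g. corrupted blob)
}

async function unlockSequentially(
    keyNames: string[],
    passphrase: string,
    unlock: Unlock
): Promise<UnlockSummary> {
    const summary: UnlockSummary = { unlocked: [], refused: [], errored: [] };
    // One key at a time: a failure is recorded and the loop moves on.
    for (const keyName of keyNames) {
        try {
            const ok = await unlock(keyName, passphrase);
            (ok ? summary.unlocked : summary.refused).push(keyName);
        } catch (e) {
            summary.errored.push(keyName);
        }
    }
    return summary;
}
```

Returning the three buckets rather than a bare count is a design choice
of the sketch; the patch itself only tracks the success count for the
summary line.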

Observability output:

  🔓 autounlock: unlocked <keyName>                                    (success)
  ⚠️  autounlock: unlockKey returned false for <keyName> (...)         (soft fail)
  ❌ autounlock: <keyName> failed: <message>                           (throw)
  🔓 autounlock: enabled (source=<env>), unlocked N/M keys in <Xms>    (summary)

Single-passphrase invariant: every `create_new_key(name, passphrase)`
in our usage today uses the same passphrase
(LNBITS_NSEC_BUNKER_KEYSTORE_PASSPHRASE on the lnbits side), so one
autounlock passphrase covers every encrypted key. Per-key passphrase
support is a separate feature (out of scope — see #16 "out of scope"
section + docs/AUTOUNLOCK.md "What's not in scope").

`docs/AUTOUNLOCK.md` ships alongside: usage, the security trade-off
spelled out by deployment shape, observability hooks, and what's
deliberately not in scope. Required reading before any operator
flips either env var on for a production-shaped deployment.

Refs aiolabs/nsecbunkerd#16. Builds on idempotent unlockKey from the
previous commit on this branch.
---
 docs/AUTOUNLOCK.md | 140 +++++++++++++++++++++++++++++++++++++++++++++
 src/daemon/run.ts  |  97 +++++++++++++++++++++++++++++++
 2 files changed, 237 insertions(+)
 create mode 100644 docs/AUTOUNLOCK.md
diff --git a/docs/AUTOUNLOCK.md b/docs/AUTOUNLOCK.md
new file mode 100644
index 0000000..6e0bd1b
--- /dev/null
+++ b/docs/AUTOUNLOCK.md
@@ -0,0 +1,140 @@
+# Boot-time autounlock
+
+`nsecbunkerd` stores each managed key encrypted at rest in
+`nsecbunker.db`. By default, every key is **locked** after the daemon
+starts — clients must drive an `unlock_key` admin RPC against the
+bunker before signing / encrypting / decrypting works for that key.
+
+Autounlock is an opt-in feature that, when enabled, reads a
+passphrase from a configured source at boot and unlocks every
+non-soft-deleted key in the `Key` table automatically. This buys
+operational simplicity at the cost of a documented security
+weakening; read this whole document before enabling.
+
+## Configuration
+
+Two mutually-exclusive environment variables:
+
+| Var | Meaning |
+|---|---|
+| `NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE` | Literal passphrase string. Useful for dev / `docker compose .env` flows. |
+| `NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE` | Path to a file containing the passphrase (one trailing newline is stripped at read). Idiomatic for sops / systemd `LoadCredential=` / k8s-secret / external secrets-manager flows where the passphrase comes from a separate credential store. |
+
+**If both are set, the daemon fails loud at boot** with an explicit
+error. Ambiguous config is never allowed to silently pick one.
+
+**If neither is set, autounlock is off** — behavior is identical to
+pre-#16: keys remain locked until an admin `unlock_key` RPC fires per
+key per restart.
+
+## What happens at boot when autounlock is on
+
+After the daemon's existing key-loading passes complete (unencrypted
+keys from in-process config, plain-key entries in `nsecbunker.json`),
+the autounlock pass runs:
+
+1. Read the passphrase from the configured source. Failure to read
+   (missing file, no permission) is fatal at boot.
+2. Enumerate every row in the `Key` Prisma table where
+   `deletedAt IS NULL`.
+3. For each row, call `unlockKey(keyName, passphrase)`. `unlockKey`
+   is idempotent post-#16: if the key was already unlocked by a
+   prior pass, it's a no-op.
+4. Log per-key INFO on success, WARN on `unlockKey → false`
+   (typically a wrong passphrase, e.g. the key was created under a
+   historical passphrase that differs from the current one), ERROR on
+   throw (typically: corrupted blob).
+5. Log one summary line:
+   `🔓 autounlock: enabled (source=<env>), unlocked N/M keys in <Xms>`.
+
+The loop is sequential: log clarity beats parallelism, and the unlock
+op itself is cheap (one ChaCha20 decrypt per key). For 100 keys it's
+milliseconds. If a fleet ever reaches thousands of keys, parallelize then.
+
+The NIP-46 client channel doesn't accept RPCs that route to a key
+until that key's `Backend.start()` resolves — which happens inside
+`unlockKey`. So there's no race window where a freshly-restarted
+bunker would say "key locked" to a client while the loop is in
+flight on that key.
+
+## The security trade-off
+
+Enabling autounlock means **whoever can read the passphrase source
+can recover any key from the bunker disk.** Specifically:
+
+- The encrypt-at-rest property of `nsecbunker.db` is *preserved*
+  against `cat /var/lib/nsecbunker/*.db` alone — the database holds
+  ciphertext + IV per key, not plaintext.
+- The encrypt-at-rest property is *lost* if the attacker also has
+  access to the passphrase source. Anyone with read access to the
+  passphrase env var, the passphrase file, or the process memory at
+  the moment of autounlock can decrypt every key.
+
+This is the same trade today's deployments already make when they
+hold the passphrase in `lnbits`'s env to drive `unlock_key` RPCs
+post-restart. Autounlock makes the trade *explicit at the bunker
+level* and *visible per-deployment*, but it doesn't introduce a new
+trust requirement that didn't already exist for any deployment using
+external automation to drive unlocks.
+
+### Recommendations by deployment shape
+
+- **Dev / regtest / single-host:** literal `NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE`
+  in `docker compose .env` is fine. The threat model on a dev box
+  doesn't justify the file-source ceremony.
+- **Single-tenant production:** passphrase file on a separate
+  volume / mount with stricter access. Deliver it via systemd's
+  `LoadCredential=` so the file is only readable by the bunker
+  service and is materialized from a sops-decrypted source at boot.
+  Avoid baking the passphrase into the container image or process
+  env (readable via `/proc/<pid>/environ`, `docker inspect`, etc.).
+- **Multi-tenant / high-security:** leave autounlock off. Orchestrate
+  unlock per-restart from an external process that prompts for the
+  passphrase out-of-band (hardware token, HSM-derived secret, human
+  approval). This preserves the property that bunker startup alone
+  doesn't restore crypto capability — a deliberate human action is
+  required.
+
+## What's *not* in scope
+
+These are deliberately out of scope for the autounlock feature.
+Separate issues to file if needed:
+
+- **Per-key passphrase support.** The current `Key` table doesn't
+  carry per-key passphrase metadata; every `create_new_key(name, passphrase)`
+  in our usage today uses the same passphrase
+  (`LNBITS_NSEC_BUNKER_KEYSTORE_PASSPHRASE`). The autounlock
+  passphrase covers every encrypted key by virtue of this
+  single-passphrase invariant. If a deployment ever needs per-key
+  passphrases, that's a separate feature (per-key passphrase-selector
+  column + per-key passphrase map).
+- **Passphrase rotation.** Re-encrypting every key under a new
+  passphrase belongs in a dedicated admin RPC (`rotate_keystore`),
+  not in autounlock.
+- **HSM / hardware-derived passphrase delivery.** Orthogonal to
+  where the passphrase comes from at unlock time — autounlock just
+  reads a string. An HSM integration would land between the
+  hardware and the file the bunker reads from.
+
+## Observability hooks
+
+The autounlock pass emits:
+
+- `🔓 autounlock: unlocked <keyName>` (INFO, one per success)
+- `⚠️  autounlock: unlockKey returned false for <keyName> ...` (WARN, one per soft failure)
+- `❌ autounlock: <keyName> failed: <err.message>` (ERROR, one per throw)
+- `🔓 autounlock: enabled (source=<env>), unlocked N/M keys in <Xms>` (summary, once)
+
+When the optional Prometheus exporter lands, counters
+`nsecbunkerd_keys_unlocked_total` and `nsecbunkerd_keys_locked_total`
+will be reported from the autounlock summary state. The current
+implementation doesn't export metrics — the log line is the
+canonical signal.
+
+## See also
+
+- `src/daemon/run.ts:Daemon.maybeAutounlock` — implementation
+- `src/daemon/run.ts:Daemon.unlockKey` — the idempotent per-key call
+- `src/daemon/admin/commands/unlock_key.ts` — the admin-RPC wrapper for manual unlock
+- aiolabs/nsecbunkerd#16 — issue with full design rationale + acceptance criteria
+- aiolabs/nsecbunkerd#15 — NDK 3.0.3 bump (the structural fix this builds on)
diff --git a/src/daemon/run.ts b/src/daemon/run.ts
index f91dbd7..7945867 100644
--- a/src/daemon/run.ts
+++ b/src/daemon/run.ts
@@ -208,6 +208,103 @@ class Daemon {
             const nsec = nip19.nsecEncode(nostrUtils.hexToBytes(settings.key));
             this.loadNsec(keyName, nsec);
         }
+
+        // Boot-time autounlock of encrypted-at-rest keys. Off by default;
+        // enabled by setting NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE or
+        // NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE. See docs/AUTOUNLOCK.md
+        // for the security trade-off and aiolabs/nsecbunkerd#16 for the
+        // design rationale.
+        await this.maybeAutounlock();
+    }
+
+    /**
+     * Boot-time autounlock for encrypted keys.
+     *
+     * Reads a passphrase from one of two mutually exclusive env vars:
+     *   - NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE       — literal passphrase
+     *   - NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE  — path to a file containing
+     *                                                the passphrase (newline-trimmed)
+     *
+     * If neither is set, this is a no-op — the deployment opted out and
+     * keys remain locked until an admin `unlock_key` RPC fires per key
+     * per restart (today's default).
+     *
+     * If both are set, throws at boot — ambiguous config.
+     *
+     * Otherwise: enumerates `Key` table rows where `deletedAt IS NULL`,
+     * calls `unlockKey(keyName, passphrase)` per row. Sequential, with
+     * continue-on-error so one bad row doesn't block the rest of the
+     * fleet. Per-key INFO/WARN/ERROR log + one summary line at the end.
+     *
+     * `unlockKey` is idempotent post-#16 — calling it against a key that
+     * was already loaded via the unencrypted paths above is safe (returns
+     * true without spawning a duplicate Backend).
+     *
+     * Single-passphrase invariant: every `create_new_key(name, passphrase)`
+     * uses the same passphrase in our usage today, so one autounlock
+     * passphrase covers every encrypted key. Per-key passphrase support
+     * is a separate feature (out of scope — see issue #16).
+     */
+    async maybeAutounlock(): Promise<void> {
+        const literal = process.env.NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE;
+        const filePath = process.env.NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE;
+
+        if (literal && filePath) {
+            throw new Error(
+                'Autounlock: NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE and ' +
+                'NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE are mutually exclusive. ' +
+                'Set exactly one (or neither, to leave autounlock off).'
+            );
+        }
+
+        if (!literal && !filePath) {
+            return; // autounlock off (default)
+        }
+
+        let passphrase: string;
+        let source: string;
+        if (literal) {
+            passphrase = literal;
+            source = 'NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE';
+        } else {
+            const fs = await import('fs');
+            try {
+                passphrase = fs.readFileSync(filePath!, 'utf8').replace(/\r?\n$/, '');
+            } catch (e: any) {
+                throw new Error(
+                    `Autounlock: failed to read passphrase file ${filePath}: ${e?.message ?? e}`
+                );
+            }
+            source = `NSEC_BUNKER_AUTOUNLOCK_PASSPHRASE_FILE=${filePath}`;
+        }
+
+        const keys = await prisma.key.findMany({ where: { deletedAt: null } });
+        const start = Date.now();
+        let success = 0;
+
+        for (const key of keys) {
+            try {
+                const ok = await this.unlockKey(key.keyName, passphrase);
+                if (ok) {
+                    console.log(`🔓 autounlock: unlocked ${key.keyName}`);
+                    success++;
+                } else {
+                    console.warn(
+                        `⚠️  autounlock: unlockKey returned false for ${key.keyName} ` +
+                        `(likely wrong passphrase — encrypted under a different secret?)`
+                    );
+                }
+            } catch (e: any) {
+                console.error(
+                    `❌ autounlock: ${key.keyName} failed: ${e?.message ?? e}`
+                );
+            }
+        }
+
+        const elapsed = Date.now() - start;
+        console.log(
+            `🔓 autounlock: enabled (source=${source}), unlocked ${success}/${keys.length} keys in ${elapsed}ms`
+        );
     }
 
     async start() {