121 commits

Author SHA1 Message Date
65a6966b9f fix(#9): close race between create_new_key and NIP-46 connect
Two-layer fix for the issue where a fresh client chaining
create_new_key + NIP-46 connect on the same target key would
time out — bunker had no subscription registered for the new
key by the time the connect event arrived at the relay.

Layer 1 — run.ts: loadNsec and unlockKey were synchronous and
fire-and-forgot the async startKey promise. create_new_key.ts:35
already awaited loadNsec, but the await was a no-op against a sync
return. Promoted both to async and properly awaited startKey, so
backend.start() at least gets a chance to run before the caller's
response goes out.
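
A minimal sketch of why the original await was a no-op, assuming a hypothetical `startKey` that takes about 10 ms. The function bodies and the `events` array are illustrative and not taken from run.ts:

```typescript
const events: string[] = [];

// Hypothetical stand-in for the async key startup.
async function startKey(name: string): Promise<void> {
    await new Promise<void>((resolve) => setTimeout(resolve, 10));
    events.push(`started:${name}`);
}

// Before: synchronous signature; the promise from startKey is dropped,
// so awaiting this function waits for nothing.
function loadNsecFireAndForget(name: string): void {
    void startKey(name);
}

// After: async signature; a caller that awaits this really waits for startKey.
async function loadNsecAwaited(name: string): Promise<void> {
    await startKey(name);
}
```

Awaiting `loadNsecFireAndForget` only yields for one microtask, so the caller continues before `startKey` finishes. Awaiting `loadNsecAwaited` resumes only after it has.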

Layer 2 — backend/index.ts: NDKNip46Backend.start() registers the
kind-24133 subscription via this.ndk.subscribe(...) but returns
immediately, before the relay's EOSE confirms it has the
subscription on file. Override start() in our Backend subclass to
await EOSE before resolving. This is the actual race-closer —
layer 1's await alone wasn't enough because start() was still
returning before the relay registered the subscription.
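
The await-EOSE idea can be sketched as follows. This is an illustration under assumptions, not the code in backend/index.ts: the subscription is modelled as a plain Node `EventEmitter` that emits `"eose"`, and the timeout is an addition the commit message does not mention.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical stand-in for a relay subscription: anything that emits "eose".
type EoseSource = Pick<EventEmitter, "once" | "off">;

// Resolve once the source emits "eose"; reject if that takes longer than timeoutMs.
function waitForEose(sub: EoseSource, timeoutMs = 5000): Promise<void> {
    return new Promise<void>((resolve, reject) => {
        const onEose = () => {
            clearTimeout(timer);
            resolve();
        };
        const timer = setTimeout(() => {
            sub.off("eose", onEose);
            reject(new Error(`no EOSE within ${timeoutMs}ms`));
        }, timeoutMs);
        sub.once("eose", onEose);
    });
}

// Sketch of a start() that resolves only after the relay has confirmed
// the subscription, instead of returning right after subscribe().
async function startAndConfirm(subscribe: () => EoseSource, timeoutMs = 5000): Promise<void> {
    const sub = subscribe();
    await waitForEose(sub, timeoutMs);
}
```

The listener is attached synchronously inside `startAndConfirm`, so an EOSE that arrives right after `subscribe()` returns is not missed.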

Surfaced by aiolabs/lnbits#33's eager-bind chain, which publishes
a NIP-46 connect event in the same HTTP round-trip as
create_new_key. Pre-fix lnbits deferred the connect to first
sign_event (minutes-to-hours after provisioning), so the race
window was hidden.

Verified end-to-end on bohm regtest: demo account creation through
the webapp now completes cleanly, with bunker logs showing
connect + sign_event for the freshly-provisioned key.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 12:25:45 +02:00
fb1c239e15 fix(#4): re-enable connection watchdog with env-flag opt-out
Calls `relayConnectionWatchdog` (introduced in the previous commit) at
the end of admin-interface connect(). Gated by NSEC_BUNKER_DISABLE_WATCHDOG=1
for operators who run external liveness checks (Prometheus probes, k8s
readiness, etc.) and don't want the daemon to self-terminate.
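
A sketch of the opt-out gate, assuming the flag counts as set only when its value is exactly `"1"`. The `startWatchdog` callback is a hypothetical stand-in for the real `relayConnectionWatchdog` call:

```typescript
type Env = Record<string, string | undefined>;

// Assumption: only the exact value "1" disables the watchdog.
function watchdogDisabled(env: Env): boolean {
    return env.NSEC_BUNKER_DISABLE_WATCHDOG === "1";
}

// Start the watchdog unless the operator opted out.
// Returns true when the watchdog was started.
function maybeStartWatchdog(env: Env, startWatchdog: () => void): boolean {
    if (watchdogDisabled(env)) return false;
    startWatchdog();
    return true;
}
```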

This restores the watchdog behavior that was commented out in commit
42dbbd7 (the emergency stopgap for the old self-echo false positives),
but on top of the now-reliable pool-status mechanism.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 20:43:12 +02:00
1792bc489c fix(#4): replace pingOrDie self-echo watchdog with pool-status check
The original watchdog published a kind-24133 event to its own pubkey
every 20s and exited if no echo arrived within 50s. On a single private
relay setup (LNbits's nostrrelay extension channel), NDK 2.8.1's outbox
model doesn't reliably route self-publishes back through the matching
subscription, so the watchdog fires false positives and exits every 50s
even though admin RPCs over the same channel still work fine. The
upstream patches we landed previously (commit 42dbbd7) commented the
call out as an emergency stopgap; this commit replaces the mechanism
with one that actually answers the right question.

Pool-status watchdog: poll `ndk.pool.connectedRelays().length` every
10s, track the most recent moment any relay was connected, exit if no
relay has been connected for 60s. Uses NDK's own connection-lifecycle
tracking which works reliably across all relay configurations — no
self-publish, no subscription dependency, no relay traffic. Same intent
as pingOrDie (detect a partition from the relay layer and let the
supervisor restart us), but with a reliable signal.
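
The polling logic described above could look roughly like this. It is a sketch under assumptions: `PoolLike` is a hypothetical interface that hides the NDK pool behind a single count, and the clock is injectable so the check can be exercised without waiting.

```typescript
// Hypothetical minimal view of a relay pool: only the connected count matters.
interface PoolLike {
    connectedCount(): number;
}

class PoolStatusWatchdog {
    private readonly pool: PoolLike;
    private readonly maxDisconnectedMs: number;
    private lastConnectedAt: number;

    constructor(pool: PoolLike, maxDisconnectedMs = 60_000, now: number = Date.now()) {
        this.pool = pool;
        this.maxDisconnectedMs = maxDisconnectedMs;
        this.lastConnectedAt = now;
    }

    // One poll. Returns true when no relay has been connected for
    // at least maxDisconnectedMs.
    check(now: number = Date.now()): boolean {
        if (this.pool.connectedCount() > 0) {
            this.lastConnectedAt = now;
            return false;
        }
        return now - this.lastConnectedAt >= this.maxDisconnectedMs;
    }

    // Poll every pollMs and call onDead once; returns a function that stops polling.
    start(onDead: () => void, pollMs = 10_000): () => void {
        const timer = setInterval(() => {
            if (this.check()) {
                clearInterval(timer);
                onDead();
            }
        }, pollMs);
        return () => clearInterval(timer);
    }
}
```

In a daemon, `onDead` would typically log and call `process.exit(1)` so the supervisor restarts the process.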

Call site re-enable + env-flag opt-out follow in the next commit.

Drops the now-unused NostrEvent import.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 20:42:43 +02:00
662dd21a60 fix(nix): include prisma CLI + scripts/, wrapper invokes start.js
Three correctness fixes to the nix derivation, mirroring the Dockerfile
fixes:

1. Drop `pnpm prune --prod --ignore-scripts` from the build phase. The
   prune step removed the prisma CLI (devDependency) from the output,
   so the runtime invocation of `prisma migrate deploy` had nothing to
   exec. Same trap the upstream Dockerfile fell into via `--prod` install.

2. Copy `scripts/` into `$out/share/nsecbunkerd/` alongside dist,
   node_modules, prisma, templates. Without it the launcher script
   (which contains the migration step) wasn't present.

3. The makeWrapper target switches from `dist/index.js` to
   `scripts/start.js`. Same change the Dockerfile ENTRYPOINT got in
   the previous commit. Also adds nodejs_20 to PATH so `npm` is
   resolvable from inside start.js, and drops `--chdir` so the caller
   (systemd, docker compose) controls cwd. start.js now resolves
   sibling paths from `__dirname` (committed separately).

The `patchNdk` substitution narrows from the old `workspace:*` form
(no longer in the package.json after fork commit 06272c8) to the
current `"2.8.1"` → `"^2.8.1"` rewrite needed to align package.json
with the lockfile under --frozen-lockfile.

Remaining known gap: nixpkgs ships prisma-engines 7.7.0 while the
JS prisma CLI in node_modules is 5.4.1, an RPC vocabulary mismatch
that breaks the migrate step at runtime (`Method not found:
listMigrationDirectories`). Either bump prisma JS to ^7.x or overlay
prisma-engines to 5.4.1. Out of scope for this commit; docker build
unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:08:42 +02:00
ccfde02d70 fix(start.js): resolve sibling paths from script location, not cwd
The launcher previously assumed cwd was the package root: `mkdir config`
in cwd, `npm run prisma:migrate` in cwd, `node ./dist/index.js`. Works
under docker (WORKDIR /app, writable) but breaks anywhere cwd differs
from the package root — e.g. a nix-built bunker invoked from a systemd
unit whose WorkingDirectory is the state dir (/var/lib/nsecbunkerd) and
not the nix store path that holds dist/, scripts/, prisma/.

Resolve sibling paths via `path.resolve(__dirname, '..')` so the
package-internal layout is robust to cwd. Use `path.join(pkgRoot, 'dist/index.js')`
for the daemon spawn and `{ cwd: pkgRoot }` for the npm migrate exec.
Switch `mkdir config` (which only works in writable cwd) to
`fs.mkdirSync(configDir, { recursive: true })` where configDir defaults
to `./config` relative to cwd, overridable via NSEC_BUNKER_CONFIG_DIR.

This lets the nix package install the launcher into the read-only store
while the systemd unit still does its config/state work in
/var/lib/nsecbunkerd with no shell wrapping.
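
The path handling above can be sketched as a pure function. The function name and the `LauncherPaths` shape are hypothetical; only the `path.resolve(__dirname, '..')` pattern, the `dist/index.js` entry, and the `NSEC_BUNKER_CONFIG_DIR` override come from the description.

```typescript
import * as path from "node:path";

interface LauncherPaths {
    pkgRoot: string;     // read-only package root (e.g. the nix store path)
    daemonEntry: string; // daemon script inside the package
    configDir: string;   // writable config dir, relative to cwd by default
}

// scriptDir plays the role of __dirname in scripts/start.js.
function resolveLauncherPaths(
    scriptDir: string,
    cwd: string,
    env: Record<string, string | undefined>,
): LauncherPaths {
    const pkgRoot = path.resolve(scriptDir, "..");
    return {
        pkgRoot,
        daemonEntry: path.join(pkgRoot, "dist/index.js"),
        // Package-internal paths follow the script; config follows the caller's cwd.
        configDir: env.NSEC_BUNKER_CONFIG_DIR ?? path.resolve(cwd, "config"),
    };
}
```

A launcher would then create the directory with `fs.mkdirSync(configDir, { recursive: true })` and spawn `daemonEntry`, passing `{ cwd: pkgRoot }` only to the migrate step.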

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:05:24 +02:00
053357899d fix(docker): entrypoint runs migrations via scripts/start.js
Upstream Dockerfile sets `ENTRYPOINT [ "node", "./dist/index.js" ]`,
which boots the daemon directly and silently bypasses `scripts/start.js`
— the only place that runs `prisma migrate deploy`. On a clean install,
the SQLite db file at $DATABASE_URL is created empty (0 bytes) and
every Policy / KeyUser / Token / SigningCondition operation throws
"table does not exist." `ping` / `get_keys` / `create_new_key` happen
to survive because they only touch the JSON config, not the db.

Two changes:

1. ENTRYPOINT switches to `node ./scripts/start.js`. The CMD arg
   (`start`) and any additional argv pass through to the daemon
   unchanged via process.argv.

2. Runtime pnpm install drops `--prod`. The prisma CLI lives in
   devDependencies; with `--prod`, `npx prisma migrate deploy` tries to
   download prisma@latest at runtime, which OOMs in modest containers.
   Including devDeps at runtime adds modest image bulk for correctness.

Validated end-to-end against the local regtest stack — after the
rebuild the SQLite db boots populated with 22 migrations, and the
lnbits-side admin spike harness passes all 9 steps including NIP-46
sign_event with Schnorr-valid signatures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:05:10 +02:00
5e77de1202 fix: convert policyId to Int before Prisma insert in create_new_token
The wire-level `create_new_token` RPC carries `policyId` as a string
(everything in NDK RPC params is string). The handler correctly
parseInts it for the `findUnique({where:{id:parseInt(policyId)}})` call
but then forwards the unparsed string straight into the Prisma
`token.create({data:{...policyId}})` payload. Prisma rejects with
"Argument `policyId`: Invalid value provided. Expected Int or Null,
provided String" because `Token.policyId` is declared `Int` per the
schema (references `Policy.id`, which is autoincrement Int).

Hoist `policyIdInt = parseInt(policyId)` and use it for both the
findUnique lookup and the create payload. Latent upstream bug — no one
would have seen it before because the wrong-kind error response (fixed
in the previous commit) made the symptom look like a transport timeout
rather than a Prisma validation error.
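
A sketch of the hoist, assuming a simplified token payload. `TokenData`, `parsePolicyId`, and `buildTokenData` are hypothetical names, and the null and NaN handling goes beyond what the commit describes:

```typescript
// Hypothetical, simplified shape of the row passed to token.create.
interface TokenData {
    keyName: string;
    policyId: number | null;
}

// RPC params arrive as strings; convert once, up front.
function parsePolicyId(raw: string | undefined): number | null {
    if (raw === undefined || raw === "") return null;
    const n = Number.parseInt(raw, 10);
    if (Number.isNaN(n)) throw new Error(`invalid policyId: ${raw}`);
    return n;
}

// Use the same parsed value for the lookup and for the insert payload,
// so the string form never reaches the database layer.
function buildTokenData(keyName: string, rawPolicyId: string | undefined): TokenData {
    const policyIdInt = parsePolicyId(rawPolicyId);
    return { keyName, policyId: policyIdInt };
}
```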

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:04:53 +02:00
0a510b7f9a fix(#7): route error responses to the request's kind
The catch block in handleRequest and both response paths in create_account
pass `NDKKind.NostrConnectAdmin` as the response kind. That constant does
NOT exist in NDK 2.8.1 — only `NostrConnect = 24133` is exported — so it
resolves to `undefined` and NDKNostrRpc.sendResponse falls through to its
own default of `NDKKind.NostrConnect = 24133`. Net effect: any error
response to an admin-channel (kind 24134) request is published on the
NIP-46 signing channel (24133) instead, which clients subscribed for
24134 never see. Looks like a transport-layer NDK-echo / silent-drop
issue from the client's perspective, but the bunker IS publishing
reliably — just on the wrong kind.

Mirror `req.event.kind` so the error response goes back on the same
channel the request came in on. Same pattern the unknown-method path
and create_account's validation-error path already used; just propagate
it to the remaining sites. Drops the now-unused NDKKind import from
create_account.ts.
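
The failure mode and the fix can be illustrated without NDK. Everything here is a hypothetical model: `sendWithDefault` mimics a sender that falls back to kind 24133 when given `undefined`, and the request and response shapes are simplified.

```typescript
const KIND_NOSTR_CONNECT = 24133;       // NIP-46 signing channel
const KIND_NOSTR_CONNECT_ADMIN = 24134; // admin channel

interface RpcRequest {
    id: string;
    event: { kind: number };
}

interface RpcResponse {
    id: string;
    kind: number;
    error?: string;
}

// Models a sender with a default kind: an undefined argument silently
// becomes 24133, which is what a missing enum member evaluates to.
function sendWithDefault(id: string, error: string, kind?: number): RpcResponse {
    return { id, kind: kind ?? KIND_NOSTR_CONNECT, error };
}

// Fixed pattern: mirror the kind the request arrived on.
function errorResponseFor(req: RpcRequest, message: string): RpcResponse {
    return sendWithDefault(req.id, message, req.event.kind);
}
```

With an admin request, passing `undefined` lands the reply on 24133, while `errorResponseFor` keeps it on 24134 where the client is listening.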

Validated end-to-end against the local bunker via the lnbits-side admin
spike harness — after this fix + the migration entrypoint fix + the
policyId type fix, all 9 spike steps including NIP-46 sign_event pass
with Schnorr-valid signatures. See coordination log entry 2026-05-27T14:30Z.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:04:31 +02:00
8caf856ab2 diag(#7): env-gated per-relay transport instrumentation
Add NSEC_BUNKER_DEBUG_TRANSPORT=1 opt-in logging that emits REQUEST_IN
on inbound NIP-46 RPCs, RESPONSE_SENT around NDKNostrRpc.sendResponse,
and PUBLISHED / PUBLISH_FAILED per-relay on the bunker's pool. Surfaces
the diagnostic signal NDKNostrRpc itself discards: sendResponse calls
`event.publish(this.relaySet)` and throws away the Set<NDKRelay> it
returns, so silent outbox-drops and wrong-kind responses are invisible
without hooking the pool's per-relay events directly.

Validated against the local bunker via the lnbits-side admin spike
harness (~/dev/lnbits/misc-aio/bunker_admin_spike.py): the instrumentation
made the 9-step harness reveal a wrong-kind error response path (separate
fix in the next commit) that had been masquerading as an NDK echo issue
for a week. With the env flag unset the daemon stays as quiet as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 16:56:27 +02:00
e39eaa632d startKey: decode bech32 nsec to hex before constructing NDKPrivateKeySigner
NDK 2.8.1's NDKPrivateKeySigner constructor forwards its arg straight
to nostr-tools getPublicKey() which requires 32-byte hex/bytes/bigint
and throws on bech32 input. Every key loaded through startKey (i.e.
every key created via create_new_key, plus boot-time reloads of any
plain-nsec entries in the config) was failing silently with the
nostr-tools type error. The try/catch caught the throw and returned
without loading the key, so the bunker would happily report
create_new_key as successful, the key would persist encrypted on
disk, but the runtime keystore would not have a signer for it.
NIP-46 connect / sign_event against any admin-provisioned target
therefore silently timed out from the client side — blocking
essentially every signing flow.

Sister bug to #5 (getKeys iterator) in a different code path. The
fix matches the existing pattern in create_new_key.ts:16:

    hexpk = nip19.decode(nsec).data as string;

Verified against the local spike harness: create_new_key now loads
the target into runtime; get_keys returns the new entry (assuming
#5 is patched separately for the iterator path).

Fixes #8.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 00:32:39 +02:00
42dbbd7536 disable pingOrDie watchdog — false-positives on non-public relays
NDK 2.8.1's outbox model doesn't reliably deliver self-published
events back through subscriptions when the configured relay set is
a single custom (non-public) relay. The pingOrDie self-watchdog
publishes a kind-24133 event to its own pubkey every 20s and exits
the bunker if it doesn't see the echo within 50s — which means on
a private relay channel (e.g. LNbits's nostrrelay extension), the
bunker exits cleanly every 50s even though admin RPCs over that
same channel are working fine.

Plain-WebSocket round-trips to the same relay echo correctly in
<1s, so the issue is on NDK's side, not the relay's.

Commenting out the watchdog is the minimum patch to keep the
daemon alive. Real fix is either an env-flag opt-out, a simpler
connectivity check that doesn't depend on self-echo, or an NDK
upgrade that fixes the outbox-vs-subscribe race.

Fixes #4. See also #7 for the underlying NDK echo investigation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 00:29:53 +02:00
960b9399e8 Dockerfile: switch from npm to pnpm + drop --frozen-lockfile
Two upstream-rot issues fixed in one commit (same root cause: the
upstream Dockerfile predates the move to pnpm and the lockfile has
drifted):

- npm install can't resolve workspace:* deps (which package.json used
  to declare for @nostr-dev-kit/ndk — see prior commit for the pin).
  Switching to pnpm@9 matches the lockfile that ships in-repo.

- pnpm-lock.yaml is out of date vs package.json (likely from
  generation-time vs commit-time drift), so --frozen-lockfile fails
  with ERR_PNPM_OUTDATED_LOCKFILE. Drop the flag in both build and
  runtime stages to let pnpm resolve fresh, at the cost of giving up
  determinism — to be restored once the lockfile is regenerated.

Also reorders the build stage to COPY lockfile + manifest before the
source, so the install layer caches across source-only edits.

Fixes #1, #2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 00:29:41 +02:00
06272c8f2c pin @nostr-dev-kit/ndk to 2.8.1 instead of workspace:*
Upstream declares the dependency as workspace:*, but the repo has no
pnpm-workspace.yaml and no sibling @nostr-dev-kit/ndk package — so
pnpm install fails with ERR_PNPM_WORKSPACE_PKG_NOT_FOUND on a clean
clone. The shipped pnpm-lock.yaml was resolving to ndk 2.8.1, so pin
to that exact version to match what the lockfile already expects.

Fixes #3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 00:29:29 +02:00
711a017e8c add nix flake with devShell and native package build
devShell: nodejs_20, pnpm_8, prisma + prisma-engines, sqlite, openssl,
plus the env wiring so prisma uses nix-provided engines instead of
fetching from binaries.prisma.sh.

packages.default: full native build via pnpm_8.fetchDeps + configHook.
Patches the workspace:* ndk spec to the lockfile-resolved ^2.8.1 so
--frozen-lockfile accepts it, then re-runs install with scripts to
trigger bcrypt's node-pre-gyp fallback-to-build (uses python311 since
node-gyp 9.4.1 bundled with pnpm 8 still imports distutils).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 23:59:31 +02:00
Pablo Fernandez
f4fd7403cc gitignore
2024-09-21 13:45:11 -04:00
Pablo Fernandez
87217f9a3f updates 2024-09-21 13:44:35 -04:00
Pablo Fernandez
ff5387b778 updates 2024-09-21 13:44:24 -04:00
Pablo Fernandez
919315bbf7 bump 2024-04-25 14:47:48 +01:00
Pablo Fernandez
919beb941c update ndk 2024-04-25 14:46:32 +01:00
Pablo Fernandez
032b67632e bump ndk 2024-03-19 14:28:33 +00:00
Pablo Fernandez
70ce3b544d absolutely no reason why the username needs to be readonly 2024-02-18 00:10:03 +00:00
Pablo Fernandez
dcb9b6695c remove stupid email example 2024-02-15 13:18:17 +00:00
Pablo Fernandez
717306a108
Merge pull request #30 from hzrd149/master
Add github action for building and publishing docker images
2024-02-15 13:07:26 +00:00
Pablo Fernandez
cbb6c66804
Merge pull request #29 from erskingardner/master
Replace auth request js method, update prettierrc to handle handlebar
2024-02-15 13:06:34 +00:00
Pablo Fernandez
6caf570866
Merge branch 'master' into master 2024-02-15 13:06:16 +00:00
Pablo Fernandez
24eb27a949
Merge pull request #31 from coracle-social/master
Bring back sendPostRequest
2024-02-15 13:04:49 +00:00
Pablo Fernandez
2fde57ff90 respond to create_account with the kind that it came with 2024-02-15 13:03:43 +00:00
Jon Staab
e2be038af7 Bring back sendPostRequest 2024-02-08 10:07:00 -08:00
hzrd149
5b37032ec1 update main branch 2024-02-04 12:50:39 +00:00
hzrd149
109cb5d972 add github action for docker image 2024-02-04 11:52:56 +00:00
Jeff Gardner
e796307f30 Add required id attribute 2024-02-01 11:48:06 +01:00
Jeff Gardner
7d3e7394ed Replace auth request js method, change prettierrc to handle handlebar templates better 2024-02-01 11:22:10 +01:00
Pablo Fernandez
b5d4694e36 bump 2024-01-31 13:58:53 +00:00
Pablo Fernandez
64a41e98ab remove default start and just document using lfg 2024-01-31 13:45:23 +00:00
Pablo Fernandez
0a130089bf remove bad check on missing domains 2024-01-31 13:45:09 +00:00
Pablo Fernandez
ca3bbf4d7d remove wrong defaults on config 2024-01-31 13:44:59 +00:00
Pablo Fernandez
529f68360d create config if it's not there 2024-01-31 13:33:01 +00:00
Pablo Fernandez
a9814fd150 start without requiring start 2024-01-31 13:32:50 +00:00
Pablo Fernandez
f7752ec016 mkdir config 2024-01-31 13:25:58 +00:00
Pablo Fernandez
ed9c130ff6
Merge pull request #28 from nourspace/nour/feat/improved-configs
feat: improved configs
2024-01-26 16:17:15 +00:00
Nour
40391c536f
fix: create connection.txt inside config folder 2024-01-26 15:16:26 +00:00
Nour
894f3c3d14
feat: allow customizing auth host 2024-01-26 15:14:40 +00:00
Nour
33fb9703f7
feat: use dynamic DATABASE_URL for Prisma db 2024-01-26 15:13:26 +00:00
Nour
d1fd2d466a
fix: correct .env file in docker-compose 2024-01-26 15:10:37 +00:00
Pablo Fernandez
f9eb2d8898
Merge pull request #25 from nQuiz/patch-1
chore: minor typos / formatting
2024-01-24 13:18:15 +00:00
Pablo Fernandez
815872fc02
Merge pull request #26 from reyamir/feat/redesign-templates
Redesign default templates
2024-01-24 13:17:53 +00:00
Pablo Fernandez
f5fa696033
Merge pull request #27 from nourspace/nour/feat/docker-cleanups
feat: docker cleanups
2024-01-24 12:14:32 +00:00
Nour
e40fafa3b5
feat: multiple fixes
- Define binaryTargets for schema.prisma so it works inside alpine
- Log adminNpubs
- Set default for config `domains`
2024-01-23 17:44:38 +00:00
Nour
1d4251c23e
feat: cleanup docker setup
- Add .dockerignore
- Replace .env with .env.example
- Add migrations service
- Cleanup Dockerfile: simpler setup, simpler copy, no migrations inside the image
- Update README to match new instruction
2024-01-23 17:43:02 +00:00
reya
13b0151b4f fix create account 2024-01-23 14:36:23 +07:00