pingOrDie self-watchdog false-positives → bunker exits every 30s on non-public relays #4
Symptom
After getting the bunker to boot (#1, #2, #3) and pointing it at a non-public relay channel (e.g. an LNbits nostrrelay/test instance running on the same host), the bunker successfully connects and reports:
…
then 30 seconds later:
The container exits with code 0 (so restart: on-failure doesn't even kick in).
Root cause
src/daemon/admin/index.ts:pingOrDie is a self-watchdog: every 20s the bunker publishes a kind-24133 event tagged to its own pubkey, and listens (via the same NDK instance) for matching events. If no echo arrives within 50s (the death timer), it calls process.exit(1).
Two separate problems make this fire spuriously on our setup:
- process.exit(1) is documented in the log message, but the actual call is process.exit(0) somewhere upstream: the container exits with 0, so restart: on-failure doesn't restart it.
So the watchdog is killing the bunker for a reason that doesn't reflect actual problems with admin RPCs (those work: ping and create_new_key over the same relay channel both succeed; see aiolabs/lnbits/issues/18 for the spike findings).
Fix we applied
Comment out the pingOrDie(this.ndk) call in src/daemon/admin/index.ts:125. The bunker stays up indefinitely afterward.
Real fix candidates
In rough order of investment:
- Env-var opt-out (DISABLE_PING_WATCHDOG=1): quick fix, lets operators turn the watchdog off when they know their relay setup is fine.
- Connection-state check (pool.connectedRelayCount() > 0) rather than the round-trip-via-self pattern. Simpler, fewer moving parts.
- Non-zero exit code when the watchdog fires, so that restart: on-failure works.
Acceptance
Cross-refs
- aiolabs/lnbits#18 phase 2 spike.
- ~/dev/lnbits/nsec-bunker-spike-findings.md ("pingOrDie watchdog disabled" section).
- … get_keys responses (#5) and possibly future client-side signing flows. Worth investigating together.
- getKeys throws on passphrase-encrypted entries: nip19.decode({iv, data}) fails #5
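Sketch of the fix candidates
To make the candidates under "Real fix candidates" concrete, here is a minimal sketch of how the DISABLE_PING_WATCHDOG=1 opt-out, a pool.connectedRelayCount() > 0 check, and a non-zero exit code might fit together. It is not the bunker's actual code: RelayPoolLike, watchdogEnabled, createConnectionWatchdog and startWatchdog are hypothetical names, the pool interface is an assumed shape rather than the real NDK type, and the 20s/50s defaults are simply borrowed from the timings described above.

```typescript
// Hypothetical sketch; only DISABLE_PING_WATCHDOG and connectedRelayCount
// are taken from the issue text. Everything else is illustrative.

/** Assumed minimal view of a relay pool (not the real NDK type). */
interface RelayPoolLike {
  connectedRelayCount(): number;
}

type Env = Record<string, string | undefined>;

/** The watchdog runs unless the operator sets DISABLE_PING_WATCHDOG=1. */
function watchdogEnabled(env: Env): boolean {
  return env.DISABLE_PING_WATCHDOG !== "1";
}

/**
 * Returns a check function. Each call records a healthy timestamp while at
 * least one relay is connected; once the pool has been disconnected for
 * maxDisconnectedMs or longer, onDead is invoked exactly once.
 */
function createConnectionWatchdog(
  pool: RelayPoolLike,
  maxDisconnectedMs: number,
  onDead: () => void,
  now: () => number = () => Date.now(),
): () => void {
  let lastHealthy = now();
  let fired = false;
  return () => {
    if (pool.connectedRelayCount() > 0) {
      lastHealthy = now();
      return;
    }
    if (!fired && now() - lastHealthy >= maxDisconnectedMs) {
      fired = true;
      onDead();
    }
  };
}

/**
 * Wires the pieces together. Returns a stop function, or null when the
 * watchdog is disabled. `exit` is injected so the exit code is explicit:
 * the watchdog always exits with 1, never 0.
 */
function startWatchdog(
  pool: RelayPoolLike,
  env: Env,
  exit: (code: number) => void,
  intervalMs: number = 20000,
  maxDisconnectedMs: number = 50000,
): (() => void) | null {
  if (!watchdogEnabled(env)) return null;
  const check = createConnectionWatchdog(pool, maxDisconnectedMs, () => exit(1));
  const timer = setInterval(check, intervalMs);
  return () => clearInterval(timer);
}
```

In the daemon this would presumably be called with process.env and a wrapper around process.exit. Injecting both keeps the exit code visible at the call site and lets the check be driven with a fake clock in tests.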