sign_and_send_to_nostr → publish_nostr_event has no relay timeout — hangs uvicorn forever on first signup if external relays unreachable #7

Closed
opened 2026-06-03 16:24:24 +00:00 by padreug · 0 comments
Owner

Symptom

provision_merchant → sign_and_send_to_nostr → nostr_client.publish_nostr_event(event) blocks indefinitely waiting for relay acks when the configured external relays are unreachable from the lnbits process. Because eager default-merchant provisioning (aiolabs/lnbits#46) is awaited inline in the POST /api/v1/auth/register handler, the entire signup HTTP request hangs forever (not just 15 s) and the uvicorn worker is stuck on it. Subsequent register attempts also hang (the worker pool is exhausted) until lnbits is restarted.

This reproduces with or without the NIP-46 bunker in the path. Even when LNBITS_NSEC_BUNKER_URL is unset and the account uses LocalSigner (so sign_event returns instantly without any network round-trip), the subsequent publish_nostr_event to external relays still hangs without bound.

Reproduction (regtest dev stack)

  1. LNBITS_NSEC_BUNKER_URL unset (LocalSigner path).
  2. LNBITS_USER_DEFAULT_EXTENSIONS includes nostrmarket so newly-created accounts get the extension assigned, which triggers _create_default_merchant inside create_user_account_no_check.
  3. Docker network has no outbound internet to public Nostr relays (or relays the user has configured in nostr_client are otherwise unreachable).
  4. POST /api/v1/auth/register for a new user.
  5. lnbits log shows:
    INFO  lnbits.core.services.users:_create_default_pay_link:377 | Successfully created default pay link for user <name>
    INFO  lnbits.core.services.users:create_user_account_no_ckeck:133 | Created default pay link for user <name>
    
  6. Then nothing — no Created default nostrmarket merchant log (success), no Failed to provision default nostrmarket merchant log (caught exception). Just silence.
  7. The account IS persisted: the accounts table in database.sqlite3 and the merchants table in ext_nostrmarket.sqlite3 both have rows for the user. Only the HTTP response never comes back.
  8. Other endpoints (/api/v1/auth, future /auth/register calls) start hanging too, because uvicorn workers are stuck. Login becomes impossible until lnbits restart.

Where the hang is

nostrmarket/services.py:198  async def sign_and_send_to_nostr(...)
                              ...
                              await nostr_client.publish_nostr_event(event)
                              # ^ no timeout, no await-with-timeout wrapper,
                              #   no per-relay deadline

The outer try: ... except Exception as ex: logger.warning(...) in provision_merchant (line 240+) only catches raised exceptions — it doesn't unblock an indefinite await. Since publish_nostr_event doesn't raise on unreachable relays, control never returns.
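The difference between a raised exception and an await that never completes can be shown in isolation. The sketch below is not nostrmarket code: publish_never_acks is a hypothetical stand-in for a publish call whose relays never answer, and the 0.05 s timeouts are arbitrary. It only illustrates that an except Exception handler never runs in that situation, while asyncio.wait_for returns control.

```python
import asyncio


async def publish_never_acks() -> None:
    """Hypothetical stand-in for a publish whose relays never answer."""
    await asyncio.Event().wait()  # never set, so this neither returns nor raises


async def provision_unbounded() -> str:
    try:
        await publish_never_acks()
    except Exception:
        return "caught"  # unreachable here: nothing is ever raised
    return "published"


async def provision_bounded(timeout: float) -> str:
    try:
        await asyncio.wait_for(publish_never_acks(), timeout=timeout)
    except asyncio.TimeoutError:
        return "timed out"
    return "published"


async def demo() -> tuple[bool, str]:
    # Run the unbounded variant as a task and observe it from outside.
    task = asyncio.create_task(provision_unbounded())
    _, pending = await asyncio.wait({task}, timeout=0.05)
    still_hung = task in pending
    # Clean up. CancelledError is a BaseException, so the
    # `except Exception` block above does not swallow it.
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return still_hung, await provision_bounded(0.05)


result = asyncio.run(demo())
print(result)
```

The unbounded variant is still pending when the observer gives up, and the bounded variant reports a timeout instead of hanging.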

Suggested fix

Wrap the publish call in asyncio.wait_for with a bounded timeout (e.g. 10 s, matching the bunker NIP-46 timeout convention), or fire-and-forget the publish via asyncio.create_task(...) so signup completes immediately while the publish retries in the background. The merchant row is already persisted in DB at this point — the kind:30017 stall event can be re-published later via the existing health-monitor / subscription-resubscribe loop.

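try:
    # Bound the publish so unreachable relays cannot stall signup.
    stall_event = await asyncio.wait_for(
        sign_and_send_to_nostr(merchant, default_stall),
        timeout=10.0,
    )
    default_stall.event_id = stall_event.id
    await update_stall(merchant.id, default_stall)
except asyncio.TimeoutError:
    # str() of a TimeoutError is empty, so log the timeout explicitly.
    logger.warning(
        f"[NOSTRMARKET] Timed out publishing default stall for "
        f"merchant {merchant.id}; will retry via background "
        f"publish loop"
    )
except Exception as ex:
    logger.warning(
        f"[NOSTRMARKET] Failed to publish default stall for "
        f"merchant {merchant.id}: {ex}; will retry via background "
        f"publish loop"
    )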

A fire-and-forget pattern is preferable since the signup endpoint doesn't actually depend on the publish result.
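A minimal sketch of what the fire-and-forget variant could look like, assuming plain asyncio and nothing from nostrmarket itself. publish_in_background, _background_tasks, slow_publish and register are hypothetical names used for illustration only; slow_publish stands in for sign_and_send_to_nostr with dead relays. The task is still given a deadline so it cannot linger forever, and a strong reference is kept because the event loop only holds weak references to tasks.

```python
import asyncio
import logging

logger = logging.getLogger("nostrmarket.sketch")

# Strong references keep unfinished tasks from being garbage-collected.
_background_tasks: set[asyncio.Task] = set()


def _on_publish_done(task: asyncio.Task) -> None:
    """Drop the reference and log any failure, including timeouts."""
    _background_tasks.discard(task)
    if task.cancelled():
        return
    exc = task.exception()
    if exc is not None:
        logger.warning("background publish failed: %r", exc)


def publish_in_background(coro, timeout: float = 10.0) -> asyncio.Task:
    """Schedule `coro` with a deadline and return immediately."""
    task = asyncio.create_task(asyncio.wait_for(coro, timeout=timeout))
    _background_tasks.add(task)
    task.add_done_callback(_on_publish_done)
    return task


async def slow_publish() -> None:
    """Hypothetical stand-in for a publish to unreachable relays."""
    await asyncio.sleep(3600)


async def register() -> str:
    publish_in_background(slow_publish(), timeout=0.05)
    return "registered"  # the response no longer waits on the relays


async def demo() -> tuple[str, int, int]:
    response = await register()
    in_flight_after_response = len(_background_tasks)
    await asyncio.sleep(0.2)  # give the deadline time to fire
    return response, in_flight_after_response, len(_background_tasks)


outcome = asyncio.run(demo())
print(outcome)
```

In the real call site, the follow-up work that depends on the publish result (setting default_stall.event_id and calling update_stall) would have to move into the background coroutine as well, since the handler no longer waits for the event.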

Context

  • Discovered 2026-06-03 while debugging a regtest signup hang. The user-facing symptom looked like a bunker bug but persisted after disabling the bunker entirely.
  • Workaround we landed in our regtest stack: drop nostrmarket from lnbits_user_default_extensions so new signups skip merchant provisioning. That's not a real fix — anyone with nostrmarket in their defaults hits this whenever their relay set is degraded.
  • Production aio-demo doesn't reproduce because either (a) the merchant call site upstream is still passing the legacy private_key argument and TypeErrors out before sign_and_send_to_nostr, or (b) configured relays are reachable. Both protective factors are fragile.

🤖 Generated with Claude Code

Reference
aiolabs/nostrmarket#7