fix: guard every machine_npub deref against unpaired machines (500 + cassette-consumer crash) #33

Merged
padreug merged 2 commits from fix/unpaired-machine-npub-guards into main 2026-06-22 14:58:04 +00:00
Owner

The bug (found on the demo after the v0.1.1 upgrade)

machine_npub became nullable in #29/m011 (register-unpaired flow), but several consumers still assumed it was non-None and called normalize_public_key(None), which crashed with AttributeError: 'NoneType' object has no attribute 'startswith'.

On lnbits.demo.aiolabs.dev (which has an unpaired machine), this surfaced as:

  • Platform-fee update → 500 (api_update_super_config), and
  • the cassette consumer spamming cassette consumer loop error … 'NoneType' … startswith every 2s (non-functional).

The #29 create/pair paths were guarded; these five sites were missed:

| Site | Symptom | Fix |
|---|---|---|
| views_api.api_update_super_config republish loop | the 500 | skip unpaired (they get fee config at pairing) |
| cassette_transport.build_state_d_tags_for_machines | cassette-consumer loop crash | skip unpaired (no state d-tag yet) |
| crud.get_machine_by_atm_pubkey_hex | cassette event-handler crash | its except (ValueError, AssertionError) didn't catch the AttributeError; skip unpaired before normalize |
| bitspire.assert_nostr_attribution | could crash the payment listener | reject with SettlementAttributionError |
| views_api cassettes/publish endpoint | could crash publish_to_atm | clean 400 "not paired" |
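
The "skip unpaired" fix in the first two rows can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Machine` dataclass, the `build_state_d_tags` helper, and the d-tag format are hypothetical stand-ins. Only the nullable `machine_npub` field comes from this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Machine:
    # Hypothetical stand-in for the real machine model.
    id: str
    machine_npub: Optional[str]  # None until the machine is paired


def build_state_d_tags(machines: list[Machine]) -> list[str]:
    """Return one d-tag per paired machine, skipping unpaired ones."""
    tags = []
    for m in machines:
        if m.machine_npub is None:
            # Unpaired: no identity yet, so there is no d-tag to build.
            continue
        # Assumed tag format, for illustration only.
        tags.append(f"state:{m.machine_npub}")
    return tags


machines = [Machine("m1", "npub1abc"), Machine("m2", None)]
print(build_state_d_tags(machines))  # ['state:npub1abc']
```

Checking for `None` before any key normalization keeps the loop alive when a single unpaired machine is present, which matches the "1 d-tag(s)" result reported under Verification.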

Verification

On the dev stack, inserted an unpaired active machine (mirroring the demo): before — cassette consumer crashed with the same AttributeError; after — it "(re)registered … with 1 d-tag(s)" (skipping the unpaired one) and runs clean. py_compile + ruff clean on the changed lines.

Deploy

This is a hotfix for the broken demo. After merge I'll tag v0.1.2 + bump the catalog so the demo can upgrade off the broken v0.1.1.

🤖 Generated with Claude Code

fix: guard every machine_npub deref against unpaired machines (None)
Some checks failed
ci.yml / fix: guard every machine_npub deref against unpaired machines (None) (pull_request) Failing after 0s
d52a3bfafe
machine_npub became nullable in #29/m011 (register-unpaired flow), but
several consumers still assumed it's non-None and crashed
`normalize_public_key(None)` with `AttributeError: 'NoneType' object has no
attribute 'startswith'`. On the demo (which had an unpaired machine) this
broke the platform-fee update (500) and spammed the cassette consumer with
errors every 2s. The #29 create/pair paths were guarded; these were missed:

- views_api `api_update_super_config`: the "republish fee to every active
  machine" loop → skip unpaired (they get their config at pairing).
- cassette_transport `build_state_d_tags_for_machines`: skip unpaired (no
  state-beacon d-tag yet) — the cassette-consumer loop crash.
- crud `get_machine_by_atm_pubkey_hex`: its `except (ValueError,
  AssertionError)` didn't catch the AttributeError; skip unpaired before
  normalize — the cassette event-handler crash.
- bitspire `assert_nostr_attribution`: reject (SettlementAttributionError) an
  unpaired machine instead of crashing the payment listener.
- views_api cassettes/publish endpoint: 400 (not paired) instead of crashing
  publish_to_atm.
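
Why the existing `except` in `get_machine_by_atm_pubkey_hex` did not help can be shown with a small sketch. `normalize_public_key` below is a hypothetical stand-in that only mimics the reported failure (it calls `.startswith()` on its argument). The `matches_*` helpers are likewise invented for illustration and are not the project's functions.

```python
from typing import Optional


def normalize_public_key(key: str) -> str:
    # Hypothetical stand-in: like the real helper, it calls .startswith()
    # on its argument, so passing None raises AttributeError.
    if key.startswith("npub1"):
        return key[len("npub1"):]  # placeholder, not real bech32 decoding
    if len(key) != 64:
        raise ValueError("not a 64-char hex key")
    return key.lower()


def matches_unguarded(machine_npub: Optional[str], atm_hex: str) -> bool:
    try:
        return normalize_public_key(machine_npub) == atm_hex
    except (ValueError, AssertionError):
        # AttributeError is not in this tuple, so a None key still escapes.
        return False


def matches_guarded(machine_npub: Optional[str], atm_hex: str) -> bool:
    if machine_npub is None:
        return False  # an unpaired machine can never match an ATM pubkey
    try:
        return normalize_public_key(machine_npub) == atm_hex
    except (ValueError, AssertionError):
        return False
```

Guarding before the call, rather than widening the `except` tuple to include `AttributeError`, keeps genuine attribute bugs visible while still treating "unpaired" as a normal non-match.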

Verified on the dev stack: with an unpaired active machine present, the
cassette consumer registers (skipping it) and runs clean — no AttributeError.
fix: complete the unpaired-machine sweep + regression test
Some checks failed
ci.yml / fix: complete the unpaired-machine sweep + regression test (pull_request) Failing after 0s
102c8eac91
Follow-up to the call-site guards: a full sweep of every machine_npub deref
found one more reachable crash — _record_rejected (tasks.py) logs
machine_npub[:12], and the assert_nostr_attribution guard now routes an
unpaired machine there, so None[:12] -> TypeError. Fall back to machine.id.
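
A minimal sketch of the logging fallback described above. The `log_label` helper is hypothetical, and whether the real code truncates `machine.id` is not stated in this PR, so the sketch leaves the id untouched.

```python
from types import SimpleNamespace


def log_label(machine) -> str:
    """Short identifier for log lines; safe for unpaired machines."""
    if machine.machine_npub:
        return machine.machine_npub[:12]
    # None[:12] would raise TypeError, so fall back to the machine id.
    return machine.id


unpaired = SimpleNamespace(id="m-42", machine_npub=None)
print(f"rejected settlement for {log_label(unpaired)}")
```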

Every other deref is safe by the attribution-gate invariant: a settlement only
flows past assert_nostr_attribution (now rejecting unpaired) for a paired
machine, so the downstream distribution / parse-path / "landed" logs can't see
None; the collision-loop display already uses `(m.machine_npub or m.id)`.

- tests/test_unpaired_machine_guards.py: regression — assert_nostr_attribution
  rejects an unpaired machine (domain error, not AttributeError) and
  build_state_d_tags skips it.
- tests/test_pair_endpoint.py: update the fake_pair mock for bunker_relay/
  keystore_passphrase (pre-existing drift from #29; was failing before this PR).

Full suite green (213 passed).
padreug force-pushed fix/unpaired-machine-npub-guards from 102c8eac91 to 8dad72a00d 2026-06-22 14:55:40 +00:00
Some checks failed
ci.yml / fix: complete the unpaired-machine sweep + regression test (pull_request) Failing after 0s
Author
Owner

Made it complete, not just whack-a-mole

Did a full sweep of every machine_npub deref. Findings:

  • Identity ops (4 sites: assert_nostr_attribution, _atm_hex_pubkey via fee + cassette transports, get_machine_by_atm_pubkey_hex) — all guarded.
  • One more reachable crash the first pass missed: _record_rejected (tasks.py) logs machine_npub[:12], and my assert_nostr_attribution guard now routes an unpaired machine there → None[:12] TypeError. Fixed (fall back to machine.id).
  • Everything else is safe by the attribution-gate invariant: a settlement only flows past assert_nostr_attribution (which now rejects unpaired) for a paired machine, so the downstream distribution.py / bitspire parse-path / "landed settlement" log derefs can't observe None. The collision-loop display already uses (m.machine_npub or m.id).
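
The attribution-gate invariant can be illustrated with a small sketch. Everything here is an assumption made for illustration: the signature of `assert_nostr_attribution`, the direct string comparison, and the `handle_settlement` wrapper are not taken from the project. Only the exception name and the "reject unpaired" behavior come from this PR.

```python
from typing import Optional


class SettlementAttributionError(Exception):
    """Stand-in for the project's domain error of the same name."""


def assert_nostr_attribution(machine_npub: Optional[str], sender_pubkey: str) -> None:
    # Hypothetical signature; the real function likely compares normalized keys.
    if machine_npub is None:
        raise SettlementAttributionError("machine is not paired")
    if machine_npub != sender_pubkey:
        raise SettlementAttributionError("sender does not match machine")


def handle_settlement(machine_npub: Optional[str], sender_pubkey: str) -> str:
    try:
        assert_nostr_attribution(machine_npub, sender_pubkey)
    except SettlementAttributionError as exc:
        return f"rejected: {exc}"
    # Past the gate, machine_npub is known to be a non-None string,
    # so downstream slicing and logging cannot observe None.
    return f"landed settlement from {machine_npub[:12]}"
```

Because the gate raises before any downstream code runs, later derefs need no individual `None` checks, which is the reasoning behind leaving the distribution and parse-path sites unchanged.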

Regression test

Added tests/test_unpaired_machine_guards.py: assert_nostr_attribution rejects an unpaired machine with the domain SettlementAttributionError (not AttributeError), and build_state_d_tags skips it. Full suite: 211 passed; the only reds are 2 pre-existing test_pair_endpoint failures (#29 drift, filed separately and out of scope here).

padreug deleted branch fix/unpaired-machine-npub-guards 2026-06-22 14:58:04 +00:00
Reference
aiolabs/spirekeeper!33