refactor(v2): cassette transport — signer.nip44_* migration (#29 v1.1 / closes #21 partial)

Migrates the cassette transport's encrypt/decrypt paths off direct
`account.prvkey` reads to `signer.nip44_encrypt` / `signer.nip44_decrypt`
on the NostrSigner ABC landed by aiolabs/lnbits PR #38 (phase 2.4). Closes
the operator-side regression flagged at coord-log 2026-05-31T06:50Z:
Greg's RemoteBunkerSigner-migrated account had `accounts.prvkey IS NULL`
post-bunker, which the old code couldn't handle — consumer was logging
WARN every poll cycle and skipping every inbound state event.

## What changed

### cassette_transport.py

- New imports: `resolve_signer`, `SignerError`, `SignerUnavailableError`,
  `NsecBunkerTimeoutError`, `NsecBunkerRpcError` from the post-#38 lnbits
  surface. (The `try: from lnbits.core.signers import SignerError` block
  in the old code was permanently failing because `SignerError` actually
  lives in `lnbits.core.signers.base`, not the package root — fixed.)
- New `_resolve_operator_signer(operator_user_id)`: single source of
  truth for "give me the operator's account + NostrSigner, or raise an
  operator-facing error." Used by both the publish path and the consumer
  task.
- New `_nip44_encrypt_via_signer(account, signer, plaintext, peer)`
  and `_nip44_decrypt_via_signer(...)`: route through `signer.nip44_*`
  first; on `SignerUnavailableError` from a LocalSigner stub (the
  post-#38 ABC has LocalSigner raise on nip44_* explicitly — bunker
  migration required for NIP-44 v2), fall back to the hand-rolled impl
  against `account.prvkey`. Transitional until every operator on the
  instance is bunker-backed (S7).
- `_sign_as_operator` simplified: now `await signer.sign_event(event)`
  (the ABC is async; the old code passed `signer.sign_event` to the
  caller without await, returning a coroutine — also broken but never
  hit because the ImportError fallback fired first).
- `publish_to_atm` flow: `_resolve_operator_signer` →
  `_nip44_encrypt_via_signer` → `_sign_as_operator` → publish. Each step
  maps bunker / signer errors to `OperatorIdentityMissing` (400) /
  `SignerUnavailable` (503) / `CassetteTransportError` (500) for the API
  handler.
- `decrypt_and_parse_state_event` now `async` and takes `(event, account,
  signer)` instead of `(event, operator_privkey_hex)`. Maps
  `NsecBunkerTimeoutError` → `CassetteEventTransientError` (caller
  should retry on next poll, NOT advance `state_event_id`).
  `NsecBunkerRpcError` / `SignerUnavailableError` / `Nip44Error` / etc.
  → `CassetteEventDecodeError` (terminal — caller logs + skips).
- New `CassetteEventTransientError` class for the bunker-timeout case.
  Distinct from `CassetteEventDecodeError` so the consumer can log at
  INFO + retry vs WARNING + advance.
- Deleted `_get_operator_privkey_hex` (no longer needed).

### tasks.py — _handle_cassette_state_event

- Resolves the signer via
  `_resolve_operator_signer(machine.operator_user_id)`. On
  `CassetteTransportError` (OperatorIdentityMissing /
  SignerUnavailable), logs + skips.
- Awaits `decrypt_and_parse_state_event(event_obj, account, signer)`.
  On `CassetteEventTransientError`, logs at INFO + returns
  (`state_event_id` NOT advanced → consumer retries on next poll cycle).
  On `CassetteEventDecodeError`, logs at WARNING + returns
  (`state_event_id` is likewise NOT advanced in v1; the WARN log
  surfaces the underlying issue for operator triage).
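
A minimal sketch of the handler behaviour described above, assuming a plain dict as a hypothetical stand-in for a `cassette_configs` row and a caller-supplied `decrypt` coroutine function. The two error classes are local stand-ins that reuse the names from this commit; the real `_handle_cassette_state_event` signature is not shown in the source and is not reproduced here.

```python
import asyncio
import logging

logger = logging.getLogger("cassette_consumer_sketch")


class CassetteEventDecodeError(Exception): ...
class CassetteEventTransientError(Exception): ...


async def handle_state_event(event_id, decrypt, config):
    """Return True only when the event was applied and the cursor advanced."""
    try:
        state = await decrypt()
    except CassetteEventTransientError as exc:
        # Retryable: leave state_event_id alone so the next poll sees it again.
        logger.info("transient failure for %s, will retry next poll: %s", event_id, exc)
        return False
    except CassetteEventDecodeError as exc:
        # Logged louder for triage; the cursor is also left alone in v1.
        logger.warning("undecodable event %s: %s", event_id, exc)
        return False
    config["state"] = state
    config["state_event_id"] = event_id
    return True


async def _ok():
    return {"cassettes": []}


async def _timeout():
    raise CassetteEventTransientError("bunker timed out")


row = {"state_event_id": None}
first = asyncio.run(handle_state_event("ev1", _timeout, row))   # cursor untouched
second = asyncio.run(handle_state_event("ev2", _ok, row))       # cursor advances
```

Keeping the two `except` branches separate is what lets the log level differ even though neither branch advances the cursor in v1.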

### tests/test_cassette_state_consumer.py — rewritten

- Three test doubles: `_FakeBunkerSigner` (working nip44_decrypt via
  hand-rolled impl), `_FakeLocalSignerStub` (raises like the post-#38
  LocalSigner stub), `_FakeRaisingSigner` (configurable exception).
- `_fake_account` helper using SimpleNamespace — the code under test
  only reads `.signer_type` + `.prvkey`.
- Five test classes covering: bunker-signer happy path (incl.
  multi-same-denom round-trip), LocalSigner transitional fallback,
  bunker-error mapping (timeout → transient, rpc reject → decode),
  payload validation (tamper / wrong-key / missing-fields / garbage
  JSON / wrong shape), d-tag construction (unchanged, kept as
  regression guard).
- Async coroutines driven via `asyncio.run` — matches the existing
  project pattern (no pytest-asyncio plugin in CI; see test_init.py
  failure mode).
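
The test-double and `asyncio.run` pattern described above might be sketched as follows. The names here are illustrative rather than the ones in `tests/test_cassette_state_consumer.py`: `decrypt_under_test` is a hypothetical stand-in for the function being tested, and the built-in `TimeoutError` stands in for the bunker timeout exception.

```python
import asyncio
from types import SimpleNamespace


class FakeRaisingSigner:
    """Test double whose nip44_decrypt raises a configurable exception."""

    def __init__(self, exc):
        self._exc = exc

    async def nip44_decrypt(self, peer_pubkey, payload):
        raise self._exc


def fake_account(signer_type="bunker", prvkey=None):
    # Only the two attributes the commit says the code under test reads.
    return SimpleNamespace(signer_type=signer_type, prvkey=prvkey)


async def decrypt_under_test(account, signer, payload):
    """Hypothetical stand-in for the coroutine being tested."""
    return await signer.nip44_decrypt("peer", payload)


def test_timeout_propagates():
    # A plain sync test: asyncio.run drives the coroutine, so no
    # pytest-asyncio plugin is needed for collection or execution.
    signer = FakeRaisingSigner(TimeoutError("bunker silent"))
    try:
        asyncio.run(decrypt_under_test(fake_account(), signer, "payload"))
    except TimeoutError as exc:
        assert "bunker silent" in str(exc)
    else:
        raise AssertionError("expected TimeoutError")
```

Because the test function itself is synchronous, pytest collects and runs it like any other test; only the body crosses into async code.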

### nip44.py — docstring update

Added a "Runtime status (post lnbits PR #38, 2026-05-31)" section
documenting that runtime usage moved to `signer.nip44_*` and this
module's role narrowed to (a) the LocalSigner transitional fallback
called from `cassette_transport`, and (b) test-only fixtures in
test_nip44_v2.py for spec-vector + bitspire cross-test validation.
"Don't add new runtime call sites here. The signer abstraction is
the path."

## Verification

- 155 passed, 1 pre-existing async-plugin failure unchanged. The 19
  consumer tests cover bunker happy path + LocalSigner fallback +
  bunker error mapping + payload validation + d-tag construction.
- Live smoke against Greg's RemoteBunkerSigner-migrated account
  on the regtest container: consumer correctly resolves the bunker
  signer, fires `NIP-46 rpc -> method=nip44_decrypt`, catches the
  resulting `NsecBunkerTimeoutError` (the local nsecbunkerd is not
  responding within 15s — separate operational concern), maps to
  `CassetteEventTransientError`, logs at INFO with "will retry next
  poll", and crucially does NOT advance `state_event_id` on the
  cassette_configs rows. Retry semantics preserved.

## Outstanding

- The bunker timeout itself is an operational issue (nsecbunkerd
  config / policy / process state for kind-less nip44_decrypt RPC) —
  not a satmachineadmin code concern; surface to the nsecbunkerd /
  lnbits sessions if it persists.
- Once every operator on the instance is on RemoteBunkerSigner (S7
  fully landed), the `_nip44_*_via_signer` helpers collapse to a
  direct `await signer.nip44_*` call, the LocalSigner fallback can
  be deleted, and `nip44.py`'s runtime exports retire (test-only).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Padreug 2026-05-31 09:21:43 +02:00
commit dcb7de0c27
4 changed files with 573 additions and 199 deletions


@@ -1,17 +1,37 @@
 """
 NIP-44 v2 versioned encrypted payloads (https://github.com/nostr-protocol/nips/blob/master/44.md).
-Hand-rolled because lnbits ships only NIP-04 (AES-CBC) in `lnbits.utils.nostr.encrypt_content`,
-and the locked design at aiolabs/satmachineadmin#29 (paired with lamassu-next#56) wires
-cassette config over kind-30078 with NIP-44 v2 encrypted content. Adding a Python NIP-44
-v2 lib dep was an option per the plan; chose the hand-roll path to stay dep-light and
-keep the impl auditable inline.
+Hand-rolled because lnbits historically shipped only NIP-04 (AES-CBC) in
+`lnbits.utils.nostr.encrypt_content`, and the locked design at
+aiolabs/satmachineadmin#29 (paired with lamassu-next#56) wires cassette config
+over kind-30078 with NIP-44 v2 encrypted content.
-Two safety nets keep this honest:
+## Runtime status (post lnbits PR #38, 2026-05-31)
+**Runtime usage has migrated to the signer abstraction** via
+`signer.nip44_encrypt` / `signer.nip44_decrypt` on `lnbits.core.signers.base.
+NostrSigner`. For RemoteBunkerSigner-backed accounts the bunker performs the
+crypto and the operator's nsec never leaves the bunker process; for the
+transitional LocalSigner path `cassette_transport._nip44_*_via_signer` falls
+back to the helpers in this module against the stored `account.prvkey`.
+This module's runtime export footprint is therefore:
+- `encrypt_for` / `decrypt_from` called by the LocalSigner fallback in
+`cassette_transport` until every operator on the instance is bunker-backed
+(S7 / aiolabs/satmachineadmin#21). Then those calls disappear too.
+- Everything else (encrypt_with_conversation_key, decrypt_with_conversation_key,
+get_conversation_key, padding helpers, error classes) is **test-only**:
+referenced by `tests/test_nip44_v2.py` to validate the wire format against
+the canonical paulmillr/nip44 reference vectors and the bitspire cross-test
+fixture posted to the coordination log.
+Don't add new runtime call sites here. The signer abstraction is the path.
+Two safety nets keep the impl honest:
 1. tests/test_nip44_v2.py runs reference vectors + round-trip + tamper-detection.
-2. bitspire posts a sample event encrypted on their nostr-tools side to the coord log;
-test_decrypts_bitspire_sample_event_from_coord_log cross-checks our impl against
-theirs by decrypting that event with a known privkey.
+2. bitspire posts a sample event encrypted on their nostr-tools side to the
+coord log; test_decrypts_bitspire_sample_event cross-checks our impl
+against theirs by decrypting that event with a known privkey.
 Wire format (per spec):
 payload = base64( 0x02 || nonce (32B) || ciphertext (var) || mac (32B) )