fix(v2)(security): wallet IDOR + settlement-processing concurrency

Closes the HIGH-severity security finding from the v2 branch review:
operator A could register a machine pointing at operator B's wallet_id
(or update their machine to do so), then drain B's wallet via the
settlement processor's pay_invoice call. LNbits' pay_invoice doesn't
enforce caller identity at the backend layer: whatever wallet_id it is
given is trusted as the wallet to pay from.

Two-layer defence:

1. **API layer.** New _assert_wallet_owned_by helper in views_api.py
   refuses any wallet_id from the request body that doesn't resolve to a
   wallet owned by the authenticated operator. Applied on
   api_create_machine and api_update_machine. Pattern lifted from the
   existing api_settle_client_balance, which already did this for
   funding_wallet_id (lines 260-265 in the original file).

2. **DB layer.** m007 adds a UNIQUE index on dca_machines.wallet_id, so
   even if a future endpoint forgets the API check, the DB rejects two
   rows claiming the same wallet. CREATE UNIQUE INDEX is portable across
   SQLite and PostgreSQL (ALTER TABLE ADD CONSTRAINT is not supported on
   SQLite).
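
A minimal sketch of the two layers, using the standard-library sqlite3
module rather than the extension's real database wrapper. The wallets
table, the index name, the WalletNotOwned exception, and the helper
signature are assumptions for illustration; only dca_machines.wallet_id
and the idea of an ownership check come from this commit.

```python
import sqlite3


class WalletNotOwned(Exception):
    """Raised when a request names a wallet the caller does not own."""


def assert_wallet_owned_by(conn, wallet_id, user_id):
    # Hypothetical stand-in for the API-layer helper: look the wallet up
    # and refuse unless it exists and belongs to the authenticated operator.
    row = conn.execute(
        "SELECT user_id FROM wallets WHERE id = ?", (wallet_id,)
    ).fetchone()
    if row is None or row[0] != user_id:
        raise WalletNotOwned(wallet_id)


def make_demo_db():
    # Assumed, simplified schema. The UNIQUE index is the DB-layer backstop:
    # a second machine row naming the same wallet raises IntegrityError.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE wallets (id TEXT PRIMARY KEY, user_id TEXT NOT NULL);
        CREATE TABLE dca_machines (
            id TEXT PRIMARY KEY,
            wallet_id TEXT NOT NULL
        );
        CREATE UNIQUE INDEX idx_dca_machines_wallet_id
            ON dca_machines (wallet_id);
        INSERT INTO wallets VALUES ('wallet-a', 'operator-a');
        INSERT INTO wallets VALUES ('wallet-b', 'operator-b');
        INSERT INTO dca_machines VALUES ('machine-b', 'wallet-b');
        """
    )
    return conn
```

In this sketch, operator-a naming wallet-b fails the ownership check, and
a direct INSERT that bypasses the check still fails on the index because
wallet-b is already attached to machine-b.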

The same commit also addresses concurrency findings H1+H2+H3 from the
architectural review (race conditions in process_settlement and
no retry path for errored settlements):

- m007 also adds processing_claim TEXT to dca_settlements.
- crud.claim_settlement_for_processing takes an optimistic lock via
  UPDATE ... SET status='processing', processing_claim=:token
  WHERE id=:id AND status='pending'  (portable; no UPDATE...RETURNING).
  A read-back compares the token; only one concurrent caller wins.
- crud.reset_settlement_for_retry voids failed legs and flips
  'errored' → 'pending' so process_settlement re-runs them. Completed
  legs are LEFT IN PLACE — we never re-pay sats that already moved.
- crud.mark_settlement_status clears processing_claim on terminal
  states so a fresh claim attempt won't see a stale token.
- distribution.process_settlement now uses the claim instead of the
  status-read-and-check pattern. Concurrent listener re-fires +
  partial-dispense recomputes can't double-pay legs.
- New endpoint:
    POST /api/v1/dca/settlements/{id}/retry  (operator-scoped)
  Refuses if status != 'errored' (400). Resets, then re-runs
  process_settlement via the claim path.
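
The claim and retry steps above can be sketched as follows, again with
the standard-library sqlite3 module. This is an illustration under
assumptions, not the code in crud.py: the function names, the legs table,
and the leg status values ('completed', 'failed', 'voided') are
hypothetical. Only the UPDATE ... WHERE status='pending' shape, the token
read-back, and the 'errored' to 'pending' flip come from this commit.

```python
import sqlite3
import uuid


def make_demo_db():
    # Assumed, simplified schema for the sketch.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dca_settlements (
            id TEXT PRIMARY KEY,
            status TEXT NOT NULL,
            processing_claim TEXT
        );
        CREATE TABLE legs (
            id INTEGER PRIMARY KEY,
            settlement_id TEXT NOT NULL,
            status TEXT NOT NULL
        );
        """
    )
    return conn


def claim_settlement(conn, settlement_id):
    """Return the claim token if this caller won the claim, else None."""
    token = uuid.uuid4().hex
    # The WHERE clause makes the flip conditional: it only matches while
    # the row is still 'pending', so at most one caller's UPDATE applies.
    conn.execute(
        "UPDATE dca_settlements "
        "SET status = 'processing', processing_claim = ? "
        "WHERE id = ? AND status = 'pending'",
        (token, settlement_id),
    )
    conn.commit()
    # Read back instead of relying on UPDATE ... RETURNING.
    row = conn.execute(
        "SELECT processing_claim FROM dca_settlements WHERE id = ?",
        (settlement_id,),
    ).fetchone()
    if row is None or row[0] != token:
        return None
    return token


def reset_for_retry(conn, settlement_id):
    """Flip 'errored' back to 'pending' and void failed legs.

    Completed legs are left untouched. Returns False when the settlement
    is not in 'errored'.
    """
    cur = conn.execute(
        "UPDATE dca_settlements "
        "SET status = 'pending', processing_claim = NULL "
        "WHERE id = ? AND status = 'errored'",
        (settlement_id,),
    )
    if cur.rowcount != 1:
        conn.rollback()
        return False
    conn.execute(
        "UPDATE legs SET status = 'voided' "
        "WHERE settlement_id = ? AND status = 'failed'",
        (settlement_id,),
    )
    conn.commit()
    return True
```

A second claim_settlement call on the same id returns None because the
row is no longer 'pending', which is the property that keeps a re-fired
listener from paying the same legs twice.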

DcaSettlement gains a processing_claim: Optional[str] field. It is visible
to operators in settlement detail; a stale claim (status='processing' for
many minutes) signals that the processor crashed mid-flight, in which case
the operator can manually mark the settlement errored and retry.

32 routes registered. 72/72 tests pass.

Refs: aiolabs/satmachineadmin#9 — closes the v2-branch security finding
and HIGH-priority concurrency findings from the internal review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Padreug 2026-05-14 17:37:58 +02:00
commit 3ede66ff92
5 changed files with 169 additions and 7 deletions


@@ -36,6 +36,7 @@ from .calculations import (
 )
 from .crud import (
     apply_partial_dispense,
+    claim_settlement_for_processing,
     count_completed_legs_for_settlement,
     create_dca_payment,
     get_client_balance_summary,
@@ -297,13 +298,22 @@ async def apply_partial_dispense_and_redistribute(
 async def process_settlement(settlement_id: str) -> None:
-    """Process a pending settlement end-to-end. Safe to invoke multiple
-    times the status='processed' guard skips already-processed rows."""
-    settlement = await get_settlement(settlement_id)
+    """Process a pending settlement end-to-end.
+
+    Concurrency-safe: an optimistic-lock claim flips the settlement to
+    'processing' atomically and tags it with a per-invocation token.
+    Concurrent invocations on the same id can't both win — losers see the
+    claim mismatch on read-back and return without writing any legs.
+    Retries land via reset_settlement_for_retry which voids failed legs
+    and flips 'errored' back to 'pending'."""
+    settlement = await claim_settlement_for_processing(settlement_id)
     if settlement is None:
-        logger.warning(f"distribution: settlement {settlement_id} not found")
-        return
-    if settlement.status != "pending":
+        # Either already claimed by a concurrent invocation, or not in a
+        # 'pending' state. Either way, nothing to do here.
+        logger.debug(
+            f"distribution: skip {settlement_id} — not claimable (already "
+            "processing or not pending)"
+        )
         return
     machine = await get_machine(settlement.machine_id)
     if machine is None: