fix(v2)(security): wallet IDOR + settlement-processing concurrency

Closes the HIGH-severity security finding from the v2 branch review: operator A could register a machine pointing at operator B's wallet_id (or update their machine to do so), then drain B's wallet via the settlement processor's pay_invoice call. LNbits' pay_invoice doesn't enforce caller identity at the backend layer; wallet_id is trusted as the source of truth for the source wallet.

Two-layer defence:

1. **API layer.** New _assert_wallet_owned_by helper in views_api.py refuses any wallet_id from the request body that doesn't resolve to a wallet owned by the authenticated operator. Applied on api_create_machine and api_update_machine. Pattern lifted from the existing api_settle_client_balance, which already did this for funding_wallet_id (lines 260-265 in the original file).
2. **DB layer.** m007 adds a UNIQUE index on dca_machines.wallet_id, so even if a future endpoint forgets the API check, the DB rejects two rows claiming the same wallet. CREATE UNIQUE INDEX is portable across SQLite and PostgreSQL (ALTER TABLE ADD CONSTRAINT is not supported on SQLite).

Same commit also addresses concurrency findings H1+H2+H3 from the architectural review (race conditions on process_settlement + no retry path for errored settlements):

- m007 also adds processing_claim TEXT to dca_settlements.
- crud.claim_settlement_for_processing does an optimistic lock via UPDATE ... SET status='processing', processing_claim=:token WHERE id=:id AND status='pending' (portable; no UPDATE...RETURNING). Read-back compares the token; only one concurrent caller wins.
- crud.reset_settlement_for_retry voids failed legs and flips 'errored' → 'pending' so process_settlement re-runs them. Completed legs are LEFT IN PLACE: we never re-pay sats that already moved.
- crud.mark_settlement_status clears processing_claim on terminal states so a fresh claim attempt won't see a stale token.
- distribution.process_settlement now uses the claim instead of the status-read-and-check pattern. Concurrent listener re-fires + partial-dispense recomputes can't double-pay legs.
- New endpoint: POST /api/v1/dca/settlements/{id}/retry (operator-scoped). Refuses if status != 'errored' (400). Resets, then re-runs process_settlement via the claim path.

DcaSettlement gains a processing_claim: Optional[str] field. Visible to operators in settlement detail; stale claims (status='processing' for many minutes) are a "processor crashed mid-flight" signal, after which the operator can manually mark errored + retry.

32 routes registered. 72/72 tests pass.

Refs: aiolabs/satmachineadmin#9 (closes the v2-branch security finding and HIGH-priority concurrency findings from the internal review).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
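For reviewers who want to see the claim pattern in isolation, here is a minimal sketch using the standard-library sqlite3 module. It is not the extension's code: the table is reduced to three columns, the schema prefix is dropped, `claim_settlement` and `demo` are hypothetical names, and `secrets.token_urlsafe` stands in for LNbits' `urlsafe_short_hash`. It only illustrates that the conditional UPDATE plus token read-back lets exactly one caller win.

```python
import secrets
import sqlite3
from typing import Optional, Tuple


def claim_settlement(conn: sqlite3.Connection, settlement_id: str) -> Optional[str]:
    """Try to claim a pending settlement; return the winning token or None."""
    # Per-invocation token (assumption: any unguessable string works here).
    token = secrets.token_urlsafe(8)
    # Conditional UPDATE: matches only while the row is still 'pending'.
    conn.execute(
        "UPDATE dca_settlements "
        "SET status = 'processing', processing_claim = :token "
        "WHERE id = :id AND status = 'pending'",
        {"id": settlement_id, "token": token},
    )
    conn.commit()
    # Read-back: we won only if the stored token is the one we just wrote.
    row = conn.execute(
        "SELECT processing_claim FROM dca_settlements WHERE id = :id",
        {"id": settlement_id},
    ).fetchone()
    if row is None or row[0] != token:
        return None
    return token


def demo() -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Run two claims against one pending row, plus one against a missing row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dca_settlements ("
        "id TEXT PRIMARY KEY, status TEXT NOT NULL, processing_claim TEXT)"
    )
    conn.execute("INSERT INTO dca_settlements VALUES ('s1', 'pending', NULL)")
    conn.commit()
    first = claim_settlement(conn, "s1")      # row is pending, so this claim wins
    second = claim_settlement(conn, "s1")     # row is already 'processing'
    missing = claim_settlement(conn, "nope")  # no such settlement
    conn.close()
    return first, second, missing


if __name__ == "__main__":
    first, second, missing = demo()
    print("first claim won:", first is not None)
    print("second claim won:", second is not None)
    print("missing id claimed:", missing is not None)
```

The read-back is what keeps the pattern independent of UPDATE ... RETURNING: the loser's UPDATE matches no rows, so the token it reads back belongs to the winner and it returns None.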
2026-05-14 17:37:58 +02:00
commit 3ede66ff92
parent d0a947b7e6
5 changed files with 169 additions and 7 deletions
--- a/crud.py
+++ b/crud.py
@@ -528,7 +528,9 @@ async def mark_settlement_status(
    status: str,
    error_message: Optional[str] = None,
 ) -> Optional[DcaSettlement]:
-    """Status: 'pending' | 'processed' | 'partial' | 'refunded' | 'errored'."""
+    """Status: 'pending' | 'processing' | 'processed' | 'partial' |
+    'refunded' | 'errored'. Clears processing_claim on terminal states so a
+    fresh claim attempt won't see a stale token."""
    await db.execute(
        """
        UPDATE satoshimachine.dca_settlements
@@ -537,6 +539,10 @@ async def mark_settlement_status(
            processed_at = CASE
                WHEN :status IN ('processed', 'partial', 'refunded')
                THEN :now ELSE processed_at
+            END,
+            processing_claim = CASE
+                WHEN :status = 'processing' THEN processing_claim
+                ELSE NULL
            END
        WHERE id = :id
        """,
@@ -550,6 +556,64 @@ async def mark_settlement_status(
    return await get_settlement(settlement_id)


+async def claim_settlement_for_processing(
+    settlement_id: str,
+) -> Optional[DcaSettlement]:
+    """Optimistic-lock claim: atomically flip a settlement to 'processing'
+    and tag it with a per-invocation token. Returns the claimed row on
+    success; None if another caller already won the claim or the settlement
+    is not in a claimable state ('pending').
+
+    Pattern is portable across SQLite + PostgreSQL (doesn't rely on
+    UPDATE ... RETURNING). Two concurrent invocations may both run the
+    UPDATE, but only one row matches the WHERE clause; the loser's UPDATE
+    is a no-op against status='processing'. The read-back check on the
+    token disambiguates."""
+    token = urlsafe_short_hash()
+    await db.execute(
+        """
+        UPDATE satoshimachine.dca_settlements
+        SET status = 'processing', processing_claim = :token
+        WHERE id = :id AND status = 'pending'
+        """,
+        {"id": settlement_id, "token": token},
+    )
+    after = await get_settlement(settlement_id)
+    if after is None:
+        return None
+    if after.processing_claim != token:
+        return None
+    return after
+
+
+async def reset_settlement_for_retry(
+    settlement_id: str,
+) -> Optional[DcaSettlement]:
+    """Operator retry path. Flips 'errored' → 'pending' and voids any
+    'failed' legs so process_settlement re-runs them fresh. Completed legs
+    are left in place — we never re-pay sats that already moved."""
+    await db.execute(
+        """
+        UPDATE satoshimachine.dca_payments
+        SET status = 'voided'
+        WHERE settlement_id = :sid AND status = 'failed'
+        """,
+        {"sid": settlement_id},
+    )
+    await db.execute(
+        """
+        UPDATE satoshimachine.dca_settlements
+        SET status = 'pending',
+            error_message = NULL,
+            processing_claim = NULL,
+            processed_at = NULL
+        WHERE id = :id AND status = 'errored'
+        """,
+        {"id": settlement_id},
+    )
+    return await get_settlement(settlement_id)
+
+
 async def apply_partial_dispense(
    settlement_id: str,
    *,