Tackler-style audit triplet: txn-set-checksum + selector-checksum on reports #26

Open
opened 2026-06-06 11:02:17 +00:00 by padreug · 0 comments
Owner

Goal

Add cryptographic audit metadata to balance reports, modeled on Tackler's triplet (https://tackler.fi/docs/tackler/latest/reference/auditing/): given a report, an auditor can independently verify (a) which version of the books fed it, (b) that the transaction set was complete, and (c) which filter/selector produced it. The combination makes a historical report regenerable byte-for-byte.

This is the property that motivated the whole "are we plain-text enough" question. Without it, a balance from last quarter is a number; with it, it's a number backed by a cryptographic proof you can hand to a collective member.

Prerequisites

Depends on #24 (reversing entries) and #25 (git-backed journal). The checksum only means what we want it to mean if:

  • Postings are immutable by construction (#24) — otherwise the checksum hashes a mutable thing.
  • The ledger lives in git with commit-per-write (#25) — supplies the commit-id leg of the triplet for free.

Both prerequisites are about ensuring the txn-set-checksum has evidentiary weight, not just computational existence. Without them, the checksum is computable but its meaning collapses back onto whatever non-cryptographic property is actually load-bearing.

The triplet

For each balance report endpoint (start with GET /api/v1/balance, GET /api/v1/balances/all), include in the response:

{
  "balance": …,
  "audit": {
    "commit_id": "abc123…",         // git HEAD at report time
    "txn_set_checksum": "sha256…",  // hash over the set of transactions fed into the calculation
    "selector": "account ~ ':User-af983632' AND flag = '*'",  // the query that produced the set
    "selector_checksum": "sha256…"  // hash of the selector string
  }
}

Computing txn-set-checksum

Tackler's contrib script (txn-set-checksum.sh) is the reference implementation: sort entries deterministically, serialize each to a canonical form, concatenate, SHA-256. Port to Python and expose as audit.compute_txn_set_checksum(entries: list[Transaction]) -> str.

The contrib script is meant to be independently runnable so an auditor can verify Libra isn't lying about its own inputs. Document the canonicalization rules carefully so a standalone Python script (or even a bash + bean-query pipeline) can recompute the same checksum from the ledger file. This is the part of the triplet that gives it independent value.
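
The sort, serialize, concatenate, hash pipeline described above could look roughly like the sketch below. The `Transaction` and `Posting` dataclasses are hypothetical stand-ins for the ledger's real types, and the canonical form (a compact JSON array per transaction, one per line) is an assumption made for illustration, not Tackler's actual rule set. Whatever rules are chosen have to be documented so a second implementation can match them.

```python
import hashlib
import json
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Posting:
    # Hypothetical posting shape, for illustration only.
    account: str
    amount: Decimal
    currency: str


@dataclass(frozen=True)
class Transaction:
    # Hypothetical stand-in for the ledger's real transaction type.
    date: str  # ISO 8601, e.g. "2026-03-31"
    flag: str
    narration: str
    postings: tuple[Posting, ...]


def canonical_form(txn: Transaction) -> str:
    """Serialize one transaction to a single line (assumed canonical format)."""
    # Amounts are normalized so 10.00 and 10.0 serialize identically;
    # postings are sorted so their order in the file does not matter.
    postings = sorted(
        [p.account, format(p.amount.normalize(), "f"), p.currency]
        for p in txn.postings
    )
    # Compact JSON escapes newlines inside strings, so "\n" is a safe terminator.
    return json.dumps(
        [txn.date, txn.flag, txn.narration, postings],
        ensure_ascii=False,
        separators=(",", ":"),
    )


def compute_txn_set_checksum(entries: list[Transaction]) -> str:
    """SHA-256 over the sorted canonical lines, each terminated by a newline."""
    digest = hashlib.sha256()
    for line in sorted(canonical_form(t) for t in entries):
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()
```

Sorting the canonical lines rather than the input list makes the result independent of the order in which entries are loaded, which is what lets a bash pipeline and the Python port agree.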

Computing selector-checksum

Just SHA-256 of the BQL string (after a deterministic normalization — whitespace collapse, lowercase keywords). Cheap.
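
A minimal sketch of that normalization follows. The keyword set is an assumed subset and would need to track the BQL grammar actually in use. The one subtlety is that single-quoted string literals must pass through untouched, since collapsing whitespace or changing case inside them would change what the selector matches.

```python
import hashlib
import re

# Assumed keyword subset; extend to match the BQL grammar in use.
KEYWORDS = {"select", "from", "where", "and", "or", "not", "group", "by", "order"}

# Split the selector into single-quoted literals and the text between them.
# The trailing "'" alternative keeps a stray, unterminated quote verbatim.
_PIECES = re.compile(r"'[^']*'|[^']+|'")
_WORD = re.compile(r"\b[A-Za-z_]+\b")


def _lower_if_keyword(match: re.Match) -> str:
    word = match.group(0)
    return word.lower() if word.lower() in KEYWORDS else word


def normalize_selector(selector: str) -> str:
    """Collapse whitespace and lowercase keywords outside string literals."""
    out = []
    for piece in _PIECES.findall(selector):
        if piece.startswith("'"):
            out.append(piece)  # string literal: never altered
            continue
        piece = re.sub(r"\s+", " ", piece)
        out.append(_WORD.sub(_lower_if_keyword, piece))
    return "".join(out).strip()


def compute_selector_checksum(selector: str) -> str:
    """SHA-256 hex digest of the normalized selector string."""
    return hashlib.sha256(normalize_selector(selector).encode("utf-8")).hexdigest()
```

With these rules, `a AND b` and `a   and  b` hash identically, while `x = 'AND'` and `x = 'and'` do not.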

Reproducibility

The full property: given commit_id, selector, and txn_set_checksum from an old report, an auditor can:

  1. git checkout <commit_id> against the ledger repo
  2. Run bean-query <selector> against the checked-out file
  3. Recompute the txn-set-checksum from the matched entries
  4. Verify it matches what the old report claimed

If the checksum recomputed from that commit and selector matches the reported txn_set_checksum, the report is provably the same one the books would produce today against that historical state.
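
The four steps above can be sketched as a single comparison. To keep the example self-contained, steps 1 and 2 (the git checkout and the bean-query run) are hidden behind a caller-supplied `load_entries` function, which is hypothetical, and entries are assumed to arrive already serialized as canonical lines.

```python
import hashlib
from typing import Callable, Iterable


def checksum_lines(canonical_lines: Iterable[str]) -> str:
    """SHA-256 over sorted canonical lines (assumed serialization)."""
    digest = hashlib.sha256()
    for line in sorted(canonical_lines):
        digest.update(line.encode("utf-8") + b"\n")
    return digest.hexdigest()


def verify_report(
    audit: dict,
    load_entries: Callable[[str, str], list[str]],
) -> bool:
    """Return True if the audit block is reproducible from the ledger history.

    load_entries(commit_id, selector) is a hypothetical hook standing in for
    "check out the commit and run the selector against the ledger file".
    """
    entries = load_entries(audit["commit_id"], audit["selector"])  # steps 1-2
    recomputed = checksum_lines(entries)  # step 3
    return recomputed == audit["txn_set_checksum"]  # step 4
```

Injecting the loader also keeps the comparison logic testable without a git repository on hand.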

Scope

  • audit.py module with compute_txn_set_checksum, compute_selector_checksum, current_commit_id.
  • Wrap balance endpoints to include the audit block in responses.
  • Standalone CLI verifier: libra-verify-audit <commit_id> <selector> <expected_checksum>, which mirrors what an external auditor would do. It is kept as a separate executable script so that verification does not require trusting Libra's own code.
  • Tests: round-trip (compute, then recompute and compare). Tamper-detection (mutate an entry, checksum changes). Independent-verifier compatibility (the Python verifier and a small bash + bean-query script produce the same number).
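
As a rough illustration of the CLI surface named in the scope list, the argument handling might be sketched as below. Only the three positional arguments come from this issue. The `build_parser` and `is_sha256_hex` helpers, and the idea of validating the checksum format up front, are assumptions.

```python
import argparse
import re


def build_parser() -> argparse.ArgumentParser:
    """Argument parser for the standalone verifier (sketch)."""
    parser = argparse.ArgumentParser(
        prog="libra-verify-audit",
        description="Recompute a report's txn-set-checksum and compare it.",
    )
    parser.add_argument("commit_id", help="commit of the ledger repo to verify against")
    parser.add_argument("selector", help="selector string copied from the report")
    parser.add_argument("expected_checksum", help="txn_set_checksum claimed by the report")
    return parser


def is_sha256_hex(value: str) -> bool:
    """True if value looks like a SHA-256 hex digest (64 hex characters)."""
    return re.fullmatch(r"[0-9a-fA-F]{64}", value) is not None
```

An entry point would call `build_parser().parse_args()`, reject a malformed `expected_checksum` early, run the recomputation, and exit non-zero on mismatch so the script composes with shell pipelines.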

Out of scope

  • Signing the audit block with a Nostr key or other identity — possible future enhancement, but the cryptographic property here is integrity (the books haven't changed) not authentication (who said this). Don't conflate.
  • A UI for browsing audit metadata — file separately when the API surface is stable.

Dependencies

  • Requires #24 (reversing entries).
  • Requires #25 (git-backed journal).
Reference
aiolabs/libra#26