fix(accounts): match Beancount's DATE grammar in duplicate detection (libra-#48)

_open_directive_exists hardcoded '^YYYY-MM-DD open ' (dash-only, 2-digit
month/day, single-space), but Beancount's DATE token (parser/lexer.l) is
(17|18|19|20)[0-9]{2}[-/][0-9]+[-/][0-9]+ and inter-token whitespace is any
[ \t\r] run. So a validly formatted existing Open written as '2024/3/5 open X'
or '2020-01-01  open  X' escaped detection → duplicate Open appended →
bean-check rejects the file. Anchor on Beancount's actual date pattern and
[ \t]+ separators. Add parametrized coverage for slash, single-digit,
multi-space, and tab variants.
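
The difference between the two anchors can be sketched as follows. The two
regexes are copied from the diff; the helper names old_exists / new_exists and
the sample ledger lines are hypothetical and exist only for this illustration,
not in the repository.

```python
import re

# Date pattern as introduced by this commit (mirrors Beancount's DATE token).
_OPEN_DATE = r"(?:17|18|19|20)\d\d[-/]\d+[-/]\d+"


def old_exists(source: str, account_name: str) -> bool:
    """Previous check: dash-only, zero-padded date and single spaces."""
    pattern = rf"^\d{{4}}-\d{{2}}-\d{{2}} open {re.escape(account_name)}(?![\w:-])"
    return bool(re.search(pattern, source, re.MULTILINE))


def new_exists(source: str, account_name: str) -> bool:
    """Updated check: Beancount-style date, runs of spaces/tabs as separators."""
    pattern = rf"^{_OPEN_DATE}[ \t]+open[ \t]+{re.escape(account_name)}(?![\w:-])"
    return bool(re.search(pattern, source, re.MULTILINE))


# Hypothetical ledger lines, one per variant named in the commit message.
SAMPLES = {
    "canonical": "2020-01-01 open Expenses:Gas",
    "slash, single-digit": "2024/3/5 open Expenses:Gas",
    "multi-space": "2020-01-01  open  Expenses:Gas",
    "tab": "2020-01-01\topen\tExpenses:Gas",
    "longer sibling": "2020-01-01 open Expenses:GasStation",
}

if __name__ == "__main__":
    for label, line in SAMPLES.items():
        print(label, old_exists(line, "Expenses:Gas"), new_exists(line, "Expenses:Gas"))
```

Only the canonical line is seen by both checks; the slash, multi-space, and tab
variants are seen by the updated check alone, and the longer sibling account is
rejected by both because of the trailing negative lookahead.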

Found in a coherence pass over the Beancount source.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Padreug 2026-06-17 10:27:18 +02:00
commit 3adb3d356a
2 changed files with 29 additions and 4 deletions


@@ -60,11 +60,21 @@ def _escape_beancount_string(value: str) -> str:
)
# Beancount's DATE token (parser/lexer.l): (17|18|19|20)[0-9]{2}[-/][0-9]+[-/][0-9]+
# — '-' OR '/' separators, 1+ digit month/day. Inter-token whitespace is any
# run of [ \t\r] (ignored by the lexer). The duplicate-detection regex must
# mirror this, or a validly-formatted existing Open (e.g. '2024/3/5 open X' or
# '2020-01-01  open  X') escapes detection and a duplicate Open is appended,
# which bean-check then rejects — breaking every later write.
_OPEN_DATE = r"(?:17|18|19|20)\d\d[-/]\d+[-/]\d+"
def _open_directive_exists(source: str, account_name: str) -> bool:
"""Return True if `source` already contains an Open directive for exactly
`account_name`.
Anchored to a real `YYYY-MM-DD open <account>` directive line (re.MULTILINE)
Anchored to a real `<date> open <account>` directive line (re.MULTILINE),
with `<date>` and the inter-token whitespace matching Beancount's grammar,
so the account name can't match text inside another account's description
metadata or a comment (false positive → spurious 409). The trailing
negative-lookahead `(?![\\w:-])` requires the next char not to be an
@@ -72,12 +82,11 @@ def _open_directive_exists(source: str, account_name: str) -> bool:
- a prefix (Expenses:Gas) does not match a longer sibling
(Expenses:GasStation / Expenses:Gas:Vehicle), and
- a real directive with an inline comment and no space
(`open Expenses:Gas;legacy`) is still detected (`;` ends the name),
which the previous `(?:\\s|$)` boundary missed → duplicate write.
(`open Expenses:Gas;legacy`) is still detected (`;` ends the name).
"""
return bool(
re.search(
rf"^\d{{4}}-\d{{2}}-\d{{2}} open {re.escape(account_name)}(?![\w:-])",
rf"^{_OPEN_DATE}[ \t]+open[ \t]+{re.escape(account_name)}(?![\w:-])",
source,
re.MULTILINE,
)