
Fix from_format crash on multi-character literal blocks #972

Open
vineethsaivs wants to merge 1 commit into python-pendulum:master from vineethsaivs:fix/from-format-literal-blocks

@vineethsaivs

Fixes #971

Formatter.parse() runs re.escape() on the format string, so a literal block such as [de] arrives as \[de\]. The token regex _FROM_FORMAT_RE used a lookbehind/lookahead that only suppressed tokens directly adjacent to the escaped brackets. Token letters in the middle of a multi-character literal were therefore still tokenized:

>>> pendulum.from_format("21 de noviembre del 2023", "DD [de] MMMM [del] YYYY", locale="es")
AttributeError: 'NoneType' object has no attribute 'values'   # the "e" in [del]
>>> pendulum.from_format("21 de noviembre de 2023", "DD [de] MMMM [de] YYYY", locale="es")
re.error: redefinition of group name 'd'
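
The failure mode can be reproduced without pendulum. The sketch below is only an illustration: `TOKENS` is a hypothetical, much-reduced token set, and `ADJACENT_ONLY` is an assumed shape for a guard that checks only the characters next to the escaped brackets, not a copy of `_FROM_FORMAT_RE`.

```python
import re

# Hypothetical, much-reduced token set (not pendulum's real token list).
TOKENS = r"YYYY|MMMM|DD|d|e"

# Assumed shape of an adjacent-only guard: a token is skipped only when it
# sits directly after an escaped "[" or directly before an escaped "]".
ADJACENT_ONLY = re.compile(r"(?<!\\\[)(" + TOKENS + r")(?!\\\])")


def tokens_seen(fmt: str) -> list[str]:
    """Return the tokens the adjacent-only guard finds in the escaped format."""
    escaped = re.escape(fmt)  # "[de]" becomes "\[de\]"
    return ADJACENT_ONLY.findall(escaped)


# Both letters of [de] touch a bracket, so neither is tokenized.
print(tokens_seen("DD [de] MMMM YYYY"))   # ['DD', 'MMMM', 'YYYY']

# The "e" in [del] touches neither bracket, so it leaks out as a token.
print(tokens_seen("DD [del] MMMM YYYY"))  # ['DD', 'e', 'MMMM', 'YYYY']
```

With this simplified guard, any token letter that is at least one character away from both brackets escapes the literal block.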

This PR matches the whole escaped literal block as a single token in _FROM_FORMAT_RE and unwraps it in _replace_tokens, so the inner text stays literal regardless of which token letters it contains. Single-character literals like [T]/[Z] keep working, and the formatting path (format()) is untouched.
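
As a rough illustration of the "match the whole block, then unwrap it" idea, here is a minimal standalone sketch. It is not the patch itself: `TOKENS`, `PATTERNS`, `TOKEN_OR_LITERAL`, and `build_pattern` are hypothetical names, and the per-token sub-patterns are simplified stand-ins for pendulum's locale-aware ones.

```python
import re

# Hypothetical, much-reduced token set and sub-patterns (assumptions).
TOKENS = r"YYYY|MMMM|DD|d|e"
PATTERNS = {
    "YYYY": r"\d{4}",
    "MMMM": r"[^\W\d_]+",  # any run of letters, standing in for month names
    "DD": r"\d{2}",
    "d": r"\d",
    "e": r"\d",
}

# The first alternative consumes an entire escaped literal block,
# "\[" ... "\]", so its inner letters are never offered to the token branch.
TOKEN_OR_LITERAL = re.compile(
    r"\\\[(?P<literal>.*?)\\\]|(?P<token>" + TOKENS + ")"
)


def build_pattern(fmt: str) -> re.Pattern[str]:
    """Turn a format string into a regex, keeping [..] blocks literal."""
    escaped = re.escape(fmt)

    def replace(match: re.Match[str]) -> str:
        literal = match.group("literal")
        if literal is not None:
            # Unwrap: drop the escaped brackets, keep the escaped inner text.
            return literal
        return "(" + PATTERNS[match.group("token")] + ")"

    return re.compile("^" + TOKEN_OR_LITERAL.sub(replace, escaped) + "$")


pattern = build_pattern("DD [de] MMMM [del] YYYY")
print(pattern.match("21 de noviembre del 2023").groups())
# ('21', 'noviembre', '2023')
```

Because the literal branch comes first in the alternation, repeated blocks such as two `[de]` literals in one format string also stay plain text and never create capture groups.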

Added a regression test (test_from_format_with_multi_character_escaped_elements). The full from_format and formatting test suites pass.

Formatter.parse() runs re.escape() on the format string first, so a
literal block like [de] arrives as \[de\]. The token regex only
suppressed the characters immediately adjacent to the brackets, so token
letters in the middle of a multi-character literal (for example the d and
e in [del]) were still tokenized, raising AttributeError or a "redefinition
of group name" re.error.

Match the whole escaped literal block as a single token in _FROM_FORMAT_RE
and unwrap it in _replace_tokens, keeping the inner text as a literal.

Fixes python-pendulum#971

Development

Successfully merging this pull request may close these issues.

from_format spanish cause error (copied from google examples)
