Replace the blib2to3 tokenizer with pytokens #4536
@JelleZijlstra with this, the test suite is fully passing. Primer is failing (mostly just because some file in |
I don't think we currently have any tests for this, but I just linked the above two issues here because they are the same bug in the parser where |
Thanks for linking this, I'll make sure these parse identically to how CPython does it.
Okay, primer is fixed, and all tests are green.
Might this fuzzer failure indicate a bug?
token: pytokens.Token, source: str, prev_token: Optional[pytokens.Token]
) -> pytokens.Token:
r"""
Black treats `\\\n` at the end of a line as a 'NL' token, while it
That doesn't sound particularly intentional, I'd be open to changing Black to remove this divergence.
Feel free to! I can give you a test case with the expected behaviour.
So, this is enough as a test case actually:
a \
b
But, the reason this probably exists is to support formatting this file:
class Plotter:
\
    pass
class AnotherCase:
\
    """Some
    \
    Docstring
    """
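For reference, the two-line `a \` / `b` case can be checked against CPython's own tokenizer using the standard-library `tokenize` module. This is only a minimal sketch: the helper name `token_names` is made up for illustration and is not part of Black or pytokens, and it says nothing about how either of them handles the case internally.

```python
import io
import tokenize


def token_names(source: str) -> list[str]:
    """Return the token type names CPython's tokenizer emits for `source`."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


# The two-line test case: `a`, a backslash continuation, then `b`.
names = token_names("a \\\nb\n")
print(names)
```

On CPython the backslash continuation does not produce a token of its own, so no `NL` entry should appear between the two `NAME` tokens.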
Yeah, but I'm pretty sure it is a bug in CPython. For now we can work it out in the tokenizer though. I'll add a flag in |
I found another issue that this should close, #2318
Description
Replaces Black's tokenizer with pytokens, a from-scratch rewrite done by me. We could vendor the code into Black itself, but my recommendation would be either pinning it or keeping it as-is, so that the tokenizer can be used by multiple tools for perfect compatibility.
Resolves #4520
Resolves #970
Resolves #3700
Tests passing so far: 381/381 (!)