-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lint: Add check-peps.py
#3275
Lint: Add check-peps.py
#3275
Conversation
LGTM. From my testing, it appears that this may place a minimum Python version of 3.9 on anyone who's going to edit PEPs; I think that's not unreasonable. Definitely in favour of doing these sorts of checks in a well-laid-out Python script, there are quite a few checks to be done. |
Yep, I used A |
Yep. Systems that don't have Python 3.9 include Debian Old-Old-Stable aka "Buster" and Ubuntu 20.04LTS "Focal", so I think it's perfectly reasonable to require it. There'll certainly be some servers out there that are on older Pythons, but editing PEPs is an interactive thing and I don't see a problem here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! I'd been thinking the same thing :)
We can make this a pre-commit hook too: https://pre-commit.com/#new-hooks (can be a follow-up).
It'd be really good to have some unit tests, so we can easily test what's valid and invalid. (Some type hints would be nice but not essential.)
Not tested yet, here's a few suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I don't have time at the moment for a full review, so just skimming this and giving some high-level feedback...
The error messages you've clearly put a lot of effort into are a massive author-friendly improvement over the original pygrep checks, and moving this to a Python file allows for much more powerful logic without long and cryptic regexes. It is also (apparently) a lot more efficient to run all the checks in a single Python invocation and a single pass through the files, as the previous regexes had a substantial amount of overhead. Additionally, the test suite is really nice to help validate that the checks are working as expected, and cover all the designed cases.
Aside from my specific suggestions here, I do have a general concern, though: without a clear forward plan to implement #2587 , this results in even more duplication and bifurcation of the header parsing (and in some cases, validation) logic than already exists (in the transform for rendered PEPs, in the PEP 0 code for the index, in the RSS generation script, in the peps.json
code, and (at least temporarily) in the existing pre-commit
config, in addition to here. On top of the specific design choices in this script itself, his leads to a significant maintenance burden to add or change anything with headers, as it potentially requires modifying things a half-dozen places, and is also a magnet for hard to spot bugs due to nonresistance in the logic.
It's not clear to me how much if any of the logic in the current design could be used elsewhere, or if there's a clear path towards eventually centralizing the core header parsing and validation in a single module/subpackage. Do you have any insight there?
As a corollary, its probably just because you're a lot better at programming than I am, but I personally found the overall imperative design with fancy generators, to be a little tricky for me to decipher and follow at first, though after studying it for a while, I think I have a decent handle on it now. This does make me a bit concerned for others maintaining the code, but I'm not sure there's much you can do about it without re-writing the entire thing, which is obviously a non-starter.
One thing I'd like to see is taking advantage of an object-oriented structure where appropriate, at least for the individual header checks themselves, putting everything for a specific header in one place, and making it easier to add a new header check or modify an existing one without having to change things many different places.
Specifically, you could have a class HeaderCheck
with a header_name
(or similar) class attribute for the header name and a run()
method with the common signature, and then have those class instances in a list, or ordered dict of name: class
. This would mean the header names and their checks would all be mapped in one place, you'd just have to add a line there to add a new check class, and dispatch to the appropriate method would be a one-liner.
You could then put common logic, like handling the line number and warning messages in the superclass, as well as handle common types of checks (value must be one of a set of fixed values, value must be a date, value must be a URL, value must an an email, etc) in common subclasses rather than duplicating that logic in each individual check.
That being said, I'm not sure how well that would interact with everything being a generator, and its also the kind of change that's probably out of scope with this PR, but something to think about for the future, and I'm curious if you think its viable.
Re: generators, I've not checked other linters, but sphinx-lint does a similar thing by https://github.com/sphinx-contrib/sphinx-lint/blob/main/sphinxlint/checkers.py Re: duplication/bifurcation, this is primarily to replace the pre-commit regexes, and we'll be deleting those soon. We might be able to further refactor to reuse code from the extension, but validation and parsing are kind of different jobs? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just some little linting things.
Also, do we want this to be executable only via python check-pep.py
or also ./check-pep.py
?
Check that shebangs are executable................................................Failed
- hook id: check-shebang-scripts-are-executable
- exit code: 1
check-pep.py: has a shebang but is not marked executable!
If it is supposed to be executable, try: `chmod +x check-pep.py`
If on Windows, you may also need to: `git add --chmod=+x check-pep.py`
If it not supposed to be executable, double-check its shebang is wanted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this!
# Conflicts: # pep-0499.txt # pep_sphinx_extensions/tests/conftest.py
Thanks @hugovk and @CAM-Gerlach for the reviews, really appreciated A |
pep-lint
represents all of the custom PEP lints in.pre-commit-config.yaml
, converted into a single Python file. This is for (hopefully) easier maintenance, as I find it quite hard to decipher the regular expressions in.pre-commit-config.yaml
, especially as they work on negative matching.I haven't changed
.pre-commit-config.yaml
for now, but when we are confident inpep-lint
, I would suggest removing the custom PEP rules such that there is a single-source of truth.A
📚 Documentation preview 📚: https://pep-previews--3275.org.readthedocs.build/