Add named groups for python #316
This is just my first draft, where I made sure that everything "works". If/once the approach is agreed, I will still need to add tests, documentation, etc., but I thought it more prudent to share the approach I am suggesting before doing these things. Feedback most welcome! I also committed quite a bit of refactoring unrelated to this PR, which I will undo and raise separate PRs for.
I think the last time we made a change this big was the abolition of the Transformer proc and the introduction of this library (cucumber-expressions) "proper", though I could be wrong here. So I'd prefer that we go down the route of releasing this all simultaneously. I'm happy for it to sit behind feature flags, but I'd rather that be simple and purely a development convenience, to avoid us needing to review a leviathan PR. In other words, by all means work in whatever way is technically easiest, but come our next full release I'd prefer it to be entirely enabled. (The major that includes this does not need to be the next one, currently v19.) I'm only one of the main contributors though, so it's not entirely my decision. When cucumber-expressions were released, they were released simultaneously, admittedly a long time ago and with many more different people at the helm. TL;DR: I'm pro this change, and anti it "sitting behind feature flags when released" (but during development, go for it).
I don’t have a particular view on the change itself, but I do agree with @luke-hill re flags: I don’t see a lot of value in flags here versus just making it a semver major change and calling attention to it in the release notes. Cucumber implementations that use this library will be pinned to at least a minor range, and other consumers should similarly assume breaking changes in a major.
Mostly looks good to me, but I do have some remarks, nothing major though. Apologies if they're a bit scattered, I'm throwing these out as I go through the PR.
Also please do add an entry to the CHANGELOG.
for item in parameter_types_and_names:
    if not isinstance(item, tuple) or len(item) != 2:
        raise CucumberExpressionError(
            f"Expected a tuple of (ParameterType, Optional[str]), but got {type(item)}: {item}"
This error is very technical. What should a user do if/when they encounter this error?
Also do most users know what a tuple is?
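One way to address both remarks is to phrase the message in terms of what was passed and what to pass instead. This is a minimal sketch only: `CucumberExpressionError` is redefined locally as a stand-in, `validate_parameter_types_and_names` is a hypothetical helper name, and the wording of the message is just a suggestion.

```python
from typing import Any, Sequence


class CucumberExpressionError(Exception):
    """Local stand-in for the library's error type (assumption)."""


def validate_parameter_types_and_names(items: Sequence[Any]) -> None:
    """Hypothetical helper: check that every entry is a (parameter type, name) pair."""
    for position, item in enumerate(items, start=1):
        if not isinstance(item, tuple) or len(item) != 2:
            # Say which entry is wrong and show what a valid entry looks like.
            raise CucumberExpressionError(
                f"Entry {position} must be a pair of a parameter type and an "
                f"optional name, for example (int_type, 'count') or "
                f"(int_type, None), but got {item!r}."
            )
```

The message names the offending position and gives a concrete example, so a reader who does not know the term "tuple" still sees what shape is expected.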
) -> Tuple[Optional[str], Optional[ParameterType]]:
    """Helper function to parse the parameter name and return group_name and parameter_type."""
    if ":" in name:
        group_name, parameter_type_name = [part.strip() for part in name.split(":")]
This allows for the empty group name, which is distinct from the None group name. Probably not what we want.
It might also be worthwhile to push this into the parser.
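To make the empty-name concern concrete, here is a minimal sketch of a split that rejects an empty group name or type name instead of passing `""` along. `split_group_name` is a hypothetical helper, not the function in this PR, and it keeps the PR's current behaviour of stripping whitespace around the colon, which is itself still under discussion.

```python
from typing import Optional, Tuple


def split_group_name(name: str) -> Tuple[Optional[str], str]:
    """Hypothetical helper: split "group_name:type_name" into its two parts.

    Returns (None, type_name) when there is no colon. An empty group name
    or an empty type name is rejected rather than returned as "".
    """
    if ":" not in name:
        return None, name.strip()
    # partition splits on the first colon only, so any extra colon stays in
    # the type name and is left for later validation to reject.
    group_name, _, type_name = name.partition(":")
    group_name, type_name = group_name.strip(), type_name.strip()
    if not group_name or not type_name:
        raise ValueError(
            f"Expected 'name:type' with both parts present, got {name!r}"
        )
    return group_name, type_name
```

With this shape, `{:int}` and `{count:}` fail early, while `{int}` still yields a `None` group name.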
def _extract_text_in_curly_brackets(string: str) -> list:
    return CURLY_BRACKET_PATTERN.findall(string)

def is_cucumber_expression(self, expression_string: str):
I don't think this check is simple enough. The primary constraint is explaining to people what is and is not a Cucumber expression. For Java I eventually settled on requiring that all regular expressions start with ^ or end with $ and that everything else is a Cucumber expression. This is both simple and unambiguous.
This helps avoid a situation where a user makes a mistake in a Cucumber expression, causing Cucumber to think it is a regular expression and then fail with a very cryptic error message because the regular expression isn't valid either.
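The rule described here fits in a couple of lines. The following is a sketch of that rule as stated in this comment, not the actual Java or Python implementation; both function names are hypothetical.

```python
def is_regular_expression(expression: str) -> bool:
    """Treat a string as a regular expression only if it is anchored."""
    return expression.startswith("^") or expression.endswith("$")


def is_cucumber_expression(expression: str) -> bool:
    """Everything that is not an anchored regular expression."""
    return not is_regular_expression(expression)
```

Under this rule a mistyped Cucumber expression such as `I have {int cukes` is still parsed as a Cucumber expression, so the user gets a Cucumber expression error rather than a regex compilation error.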
Is it worth standardising this check then across all flavours? I have no idea what we do in ruby as I've not dug into this stuff since the initial release some 4/5 years ago
if source[index + 1] != "?":
    # (X)
Please put these comments back. It's really helpful to have a reference here.
    # (?>X)
    return True
# (?<=X) or (?<!X) else (?<name>X)
return source[index + 3] in ["=", "!"]
For consistency between implementations it would be good to keep this similar too. It really helps if all the implementations are similar enough that you can reference a language implementation you do know.
Though the syntax for a named group is Python-specific so it would be good to add a separate case for that and comment on it.
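A sketch of what that could look like: the branch order and reference comments follow the fragments quoted above, with one extra, commented case for Python's `(?P<name>X)`. The function name and signature are assumptions, and slices are used instead of single-index lookups so a truncated pattern cannot raise `IndexError`.

```python
def is_non_capturing(source: str, index: int) -> bool:
    """Return True if the "(" at source[index] opens a non-capturing group."""
    if source[index + 1 : index + 2] != "?":
        # (X)
        return False
    if source[index + 2 : index + 4] == "P<":
        # (?P<name>X) is Python-specific named group syntax; it still captures
        return False
    if source[index + 2 : index + 3] != "<":
        # (?:X), (?=X), (?!X), (?>X), (?idmsux-idmsux:X)
        return True
    # (?<=X) or (?<!X) else (?<name>X)
    return source[index + 3 : index + 4] in ("=", "!")
```

Keeping the shared branches in the same order as the other implementations and isolating the Python case makes it easy to compare this function against a language you already know.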
from cucumber_expressions.argument import Argument
from cucumber_expressions.parameter_type import ParameterType
from cucumber_expressions.parameter_type_registry import ParameterTypeRegistry
from cucumber_expressions.tree_regexp import TreeRegexp

NAMED_CAPTURE_GROUP_REGEX = re.compile(r"\?P<([^>]+)>")
This should probably be non-greedy.
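For reference, a quick check of how the quantifier choices behave with the standard `re` module. The sample pattern string is made up for illustration. The negated class `[^>]+` cannot cross a `>`, so on this input it finds the same names as a lazy `.+?`; a greedy `.+` is included only for contrast.

```python
import re

# Made-up pattern source containing two Python named groups.
source = r"(?P<first>\d+) and (?P<second>\w+)"

negated_class = re.compile(r"\?P<([^>]+)>")  # pattern from the diff
lazy = re.compile(r"\?P<(.+?)>")             # non-greedy alternative
greedy = re.compile(r"\?P<(.+)>")            # greedy dot, for contrast only

negated_names = negated_class.findall(source)
lazy_names = lazy.findall(source)
greedy_names = greedy.findall(source)  # one match running to the last ">"

print(negated_names, lazy_names, greedy_names)
```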
I wasn't aware that named capture groups in Python differ from other languages. Where Java, Javascript and Ruby us I'm also missing some error handling around the And adding to the shared test set, even if failing would be good too. |
I have one request so far (Pre-review). Should we have a major release where we change cucumber-expressions to ban the Obviously this only holds if the agreed path for naming is as specified here - which I think most of us are happy with |
I'm not sure about the release strategy yet. I don't quite have time to sponsor a Java implementation, I'm currently working using the message format everywhere and technical debt that is pulling to the surface. It does make me favor feature toggles though. |
One thing I wanted to ask is whether you'd want to get both named for param types and regex out simultaneously or whether you'd want / consider doing them separately.
Purely thinking about the polyglot implementation (Unless you're volunteering to write a bunch of other flavours?)
- tree_regexp: TreeRegexp, text: str, parameter_types: List
+ tree_regexp: TreeRegexp,
+ text: str,
+ parameter_types_and_names: List[Tuple[ParameterType, Optional[str]]],
Should this maybe be parameter_types_with_names, which would then make sense because the name could often be nil (which feels "right")?
  raise CucumberExpressionError(
-     f"Group has {len(arg_groups)} capture groups, but there were {len(parameter_types)} parameter types"
+     f"Group has {len(arg_groups)} capture groups, but there were {param_count} parameter types/names"
I think the ending of this error shouldn't be amended: the issue is still that there were an incorrect number of parameter types (whether the names are present or not is irrelevant for the length issue).
""" | ||
group_name_start = index + 3
group_name_end = source.find(">", group_name_start)
return source[group_name_start:group_name_end]
My Python is basically non-existent, but above we're using a : b and here we're using a:b. Do they mean different things? If not, maybe keep things standard.
def group_builder(self):
    return self._group_builder
# If it's a named group (e.g., (?P<name>...)), it's still a capturing group
if source[index + 2] == "P" and source[index + 3] == "<":
Earlier we use a substring over a range and here we're using two different single-character matches.
In my head we should probably be using a range in all situations.
def test_documents_match_arguments_with_names_and_spaces(self):
    values = match(
        "I have { cuke_count : int } cuke(s) and {gherkin_count: int} gherkin(s)",
Is this a Python-specific interpretation? I'm 90% sure we don't permit spaced-out arguments inside the braces, but again I'd need to triple check.
🤔 What's changed?
Expression matching now returns a tuple: the value as before and an optional name if the expression has a name or the regex uses a named capture group.
The format for the Cucumber Expression when specifying name AND type is:
Where the part before the colon is the name of the arg, and after is the type we are currently using for Cucumber Expressions.
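For the regex half of this, the pairing of each value with an optional name can be sketched with the standard `re` module alone. `match_with_names` is a hypothetical helper for illustration, not the API added in this PR, and it returns raw strings without applying any parameter type transformation.

```python
import re
from typing import List, Optional, Tuple


def match_with_names(
    pattern: str, text: str
) -> Optional[List[Tuple[str, Optional[str]]]]:
    """Hypothetical helper: return (value, name) for every capture group."""
    compiled = re.compile(pattern)
    found = compiled.search(text)
    if found is None:
        return None
    # groupindex maps group name -> group number; invert it for lookup.
    names_by_number = {number: name for name, number in compiled.groupindex.items()}
    return [
        (found.group(number), names_by_number.get(number))
        for number in range(1, compiled.groups + 1)
    ]


pairs = match_with_names(
    r"^I have (?P<cuke_count>\d+) cukes and (\d+) gherkins$",
    "I have 3 cukes and 5 gherkins",
)
```

Here the first group carries its name and the unnamed second group is paired with `None`.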
The return type change is currently a breaking change, so it will probably need to go behind some feature flag, with the default returning the old, expected single value and the new tuple returned only when the flag is enabled. I have not done this yet, as I wanted to check the breadth of test cases with the feature fully enabled for the PoC before implementing it. Also, I'm not sure how best to feature flag!
To resolve #206
⚡️ What's your motivation?
Python (in particular pytest-bdd) uses other args in the step definitions, such as fixtures and reserved args for "datatable" and "docstring", so just mapping step arg values to step args in the expressions is not reliable or user-friendly. It is a blocker currently for adopting Cucumber Expressions into the pytest-bdd framework.
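To illustrate why names matter here, the sketch below binds step function parameters by name from two sources: arguments captured from the step text and everything else (fixtures, reserved arguments). `call_step` and the sample step are hypothetical and do not reflect how pytest-bdd actually resolves arguments.

```python
import inspect
from typing import Any, Callable, Dict


def call_step(
    step_function: Callable[..., Any],
    step_arguments: Dict[str, Any],
    fixtures: Dict[str, Any],
) -> Any:
    """Hypothetical binder: fill each parameter by name, step arguments first."""
    kwargs = {}
    for name in inspect.signature(step_function).parameters:
        if name in step_arguments:
            kwargs[name] = step_arguments[name]
        elif name in fixtures:
            kwargs[name] = fixtures[name]
        else:
            raise TypeError(f"No step argument or fixture named {name!r}")
    return step_function(**kwargs)


def have_cukes(datatable, cuke_count, basket):
    # Only cuke_count comes from the step text; the others are supplied elsewhere.
    return f"{cuke_count} cukes in {basket} ({len(datatable)} rows)"


result = call_step(
    have_cukes,
    step_arguments={"cuke_count": 3},
    fixtures={"datatable": [["a"]], "basket": "basket-1"},
)
```

Positional mapping would hand the captured value to `datatable` here; matching on the name puts it in `cuke_count` regardless of where that parameter sits in the signature.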
🏷️ What kind of change is this?
♻️ Anything particular you want feedback on?
Approach, public API changes, format of the named args, whether it's acceptable in general!
📋 Checklist: