-
Notifications
You must be signed in to change notification settings - Fork 507
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Val
data model, a list of strings
#2567
base: master
Are you sure you want to change the base?
Conversation
This lint was recently added in Clippy 1.84.0.
In support of: * casey#2458 * casey#1988 A `Val` is a list of strings, but at each point of usage we must decide whether to view it as a singular joined string or as a list of its parts. Previously, Just values were single strings, so in most places we have to invoke `to_joined()` in order to maintain compatibility. In particular, recipes, functions, and operators like `+` or `/` operate solely on strings. This includes logical operators like `&&`, which continue to be defined on strings. That means, the values `[]` and `['']` are currently completely equivalent. So far, this is a purely internal change without externally visible effects. Only the `Bindings`/`Scope`/`Evaluator` had API changes. No new syntax is implemented. However, in expectation of expressions that build lists, a new `evaluate_list_expression() -> Vec<String>` method is introduced that could be used to implement splat or generator expressions. It is already referenced wherever we have lists of arguments, e.g. variadic functions like `shell()` and dependency arguments. But because singular expressions are equivalent to a joined string, this is an unobservable detail for now. For experimenting with lists of strings, variadic recipe parameters like `*args` now produce a multi-part Val, and default to an empty list (not a list with an empty string). Because all operators use `to_joined()`, this is an unobservable implementation detail. However, if any operator becomes list-aware, then this detail should be reverted, or moved behind an unstable setting. For better understanding of the current behavior, I added a bunch of tests. These will help detect regressions if functions or operators become list-aware. No existing tests had to be touched. Next steps: This change is just the foundation for other work, but some ideas are mutually exclusive. Relevant aspects: * list syntax in casey#2458 * list aware operators in casey#2458 * lossless forwarding of variadics: casey#1988 * invoking dependencies multiple times: casey#558 The preparatory work like `evaluate_list_expression()` is biased towards implementing a splat operator that would enable casey#2458 list syntax and casey#1988 list forwarding, but doesn't commit us to any particular design yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
This is bikeshedding, but what do you think calling it List
instead of Val
?
I originally suggested Val
, but I'm thinking that the "everything is a list of strings" data model is somewhat exotic, so calling it a List
explicitly in the codebase might be helpful for comprehension.
On the other hand, we could also leave it a Val
for now, and consider this rename once we support user-facing changes which actually use it as a list.
There's aready a List
type, but it's a minor type used to help formatting, so renaming the existing List
to something else so we can use List
for what is now called Val
might be worth it.
Don't worry about making this change now, we can either make it before the PR goes in, or in a follow-up.
@@ -83,50 +83,63 @@ impl<'src, 'run> Evaluator<'src, 'run> { | |||
Ok(self.scope.value(name).unwrap()) | |||
} | |||
|
|||
/// A place for adding list operators in the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little confusing to me, since it adds some indirection to follow to figure out what's going on. I think we should probably remove it, and just use evaluate_expression
, and we can add it back later when we need it.
/// It can be viewed as a "joined string", or as the individual "parts". | ||
#[derive(Debug, Clone, Default)] | ||
pub(crate) struct Val { | ||
parts: Rc<[Box<str>]>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rc<[Box<str>]>
is a little exotic. Is the advantage that extra capacity is freed when compared to an Rc<Vec<String>>
?
I'd be in favor of using Rc<Vec<String>>
, just because it's easier to read, and switching to Rc<[Box<str>]
if we really need to.
} | ||
|
||
/// Construct a `Val` consisting of multiple parts. | ||
pub fn from_parts<I, S>(parts: I) -> Self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think element
is the familiar term for things in array or linked list, so maybe this should be from_elements
? and val.parts
should be val.elements
?
} | ||
|
||
/// Convert to a single joined string. | ||
pub fn to_joined(&self) -> String { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can just be join()
.
impl std::fmt::Display for Val { | ||
/// When formatted as a string, a `Val` is joined by whitespace. | ||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { | ||
f.write_str(&self.to_joined()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should do:
for (i, element) in &self.elements {
if i > 0 {
write!(f, " ")?; // or a f.write_char if that exists
}
f.write_str(element)?;
}
Ok(())
And then join
should call self.to_string()
. Doing it this way means that you don't need to allocate if you don't have to, i.e., you're printing to stdout or using it in a format string.
} | ||
|
||
/// Construct a `Val` consisting of a single string. | ||
pub fn from_str<S: AsRef<str>>(s: S) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be removed in favor of val.into()
along with the From
implementations.
} | ||
|
||
/// Convert to individual parts. | ||
pub fn to_parts(&self) -> Vec<Box<str>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this return just &[String]
, and rely on the caller to convert it to a vec? Avoids allocation if the caller doesn't need to.
} | ||
} | ||
|
||
#[cfg(test)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This this and the tests should be inside a #[cfg(test)] mod tests { … }
block.
@@ -81,3 +81,25 @@ fn nesting() { | |||
evaluate("'' || '' || '' || '' || 'foo'", "foo"); | |||
evaluate("'foo' && 'foo' && 'foo' && 'foo' && 'bar'", "bar"); | |||
} | |||
|
|||
#[test] | |||
fn empty_variadics_are_falsey() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also add test cases for comparing empty variadics with the empty string with==
, !=
, =~
, and !~
, since those are cases where I can imagine accidentally introducing a breaking change.
#[track_caller] | ||
fn eval_variadics<'a>(args: impl AsRef<[&'a str]>, expected: &'a str) { | ||
Test::new() | ||
.justfile("@foo *args:\n echo {{ args && 'true' || 'false' }}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should have separate tests for &&
and ||
. Operator precedence is always confusing, so separating them might help comprehension.
In support of:
A
Val
is a list of strings, but at each point of usage we must decide whether to view it as a singular joined string or as a list of its parts. Previously, Just values were single strings, so in most places we have to invoketo_joined()
in order to maintain compatibility. In particular, recipes, functions, and operators like+
or/
operate solely on strings. Thisincludes logical operators like
&&
, which continue to be defined on strings. That means, the values[]
and['']
are currently completely equivalent.So far, this is a purely internal change without externally visible effects. Only the
Bindings
/Scope
/Evaluator
had API changes.No new syntax is implemented. However, in expectation of expressions that build lists, a new
evaluate_list_expression() -> Vec<String>
method is introduced that could be used to implement splat or generator expressions. It is already referenced wherever we have lists of arguments, e.g. variadic functions likeshell()
and dependency arguments. But because singular expressions are equivalent to a joined string, this is an unobservable detail for now.For experimenting with lists of strings, variadic recipe parameters like
*args
now produce a multi-part Val, and default to an empty list (not a list with an empty string). Because all operators useto_joined()
, this is an unobservable implementation detail. However, if any operator becomes list-aware, then this detail should be reverted, or moved behind an unstable setting.For better understanding of the current behavior, I added a bunch of tests. These will help detect regressions if functions or operators become list-aware. No existing tests had to be touched.
Next steps: This change is just the foundation for other work, but some ideas are mutually exclusive. Relevant aspects:
The preparatory work like
evaluate_list_expression()
is biased towards implementing a splat operator that would enable #2458 list syntax and #1988 list forwarding, but doesn't commit us to any particular design yet. I intend to write an RFC-style issue comment that proposes a concrete path forward and discusses the various tradeoffs.