Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Val data model, a list of strings #2567

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

latk
Copy link

@latk latk commented Jan 11, 2025

In support of:

A Val is a list of strings, but at each point of usage we must decide whether to view it as a singular joined string or as a list of its parts. Previously, Just values were single strings, so in most places we have to invoke to_joined() in order to maintain compatibility. In particular, recipes, functions, and operators like + or / operate solely on strings. This
includes logical operators like &&, which continue to be defined on strings. That means, the values [] and [''] are currently completely equivalent.

So far, this is a purely internal change without externally visible effects. Only the Bindings/Scope/Evaluator had API changes.

No new syntax is implemented. However, in expectation of expressions that build lists, a new evaluate_list_expression() -> Vec<String> method is introduced that could be used to implement splat or generator expressions. It is already referenced wherever we have lists of arguments, e.g. variadic functions like shell() and dependency arguments. But because singular expressions are equivalent to a joined string, this is an unobservable detail for now.

For experimenting with lists of strings, variadic recipe parameters like *args now produce a multi-part Val, and default to an empty list (not a list with an empty string). Because all operators use to_joined(), this is an unobservable implementation detail. However, if any operator becomes list-aware, then this detail should be reverted, or moved behind an unstable setting.

For better understanding of the current behavior, I added a bunch of tests. These will help detect regressions if functions or operators become list-aware. No existing tests had to be touched.

Next steps: This change is just the foundation for other work, but some ideas are mutually exclusive. Relevant aspects:

The preparatory work like evaluate_list_expression() is biased towards implementing a splat operator that would enable #2458 list syntax and #1988 list forwarding, but doesn't commit us to any particular design yet. I intend to write an RFC-style issue comment that proposes a concrete path forward and discusses the various tradeoffs.

latk added 2 commits January 11, 2025 10:07
This lint was recently added in Clippy 1.84.0.
In support of:

* casey#2458
* casey#1988

A `Val` is a list of strings, but at each point of usage we must decide whether
to view it as a singular joined string or as a list of its parts. Previously,
Just values were single strings, so in most places we have to invoke
`to_joined()` in order to maintain compatibility. In particular, recipes,
functions, and operators like `+` or `/` operate solely on strings. This
includes logical operators like `&&`, which continue to be defined on strings.
That means, the values `[]` and `['']` are currently completely equivalent.

So far, this is a purely internal change without externally visible effects.
Only the `Bindings`/`Scope`/`Evaluator` had API changes.

No new syntax is implemented. However, in expectation of expressions that build
lists, a new `evaluate_list_expression() -> Vec<String>` method is introduced
that could be used to implement splat or generator expressions. It is already
referenced wherever we have lists of arguments, e.g. variadic functions like
`shell()` and dependency arguments. But because singular expressions are
equivalent to a joined string, this is an unobservable detail for now.

For experimenting with lists of strings, variadic recipe parameters like `*args`
now produce a multi-part Val, and default to an empty list (not a list with an
empty string). Because all operators use `to_joined()`, this is an unobservable
implementation detail. However, if any operator becomes list-aware, then this
detail should be reverted, or moved behind an unstable setting.

For better understanding of the current behavior, I added a bunch of tests.
These will help detect regressions if functions or operators become list-aware.
No existing tests had to be touched.

Next steps: This change is just the foundation for other work, but some ideas
are mutually exclusive. Relevant aspects:

* list syntax in casey#2458
* list aware operators in casey#2458
* lossless forwarding of variadics: casey#1988
* invoking dependencies multiple times: casey#558

The preparatory work like `evaluate_list_expression()` is biased towards
implementing a splat operator that would enable casey#2458 list syntax and casey#1988 list
forwarding, but doesn't commit us to any particular design yet.
Copy link
Owner

@casey casey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

This is bikeshedding, but what do you think calling it List instead of Val?

I originally suggested Val, but I'm thinking that the "everything is a list of strings" data model is somewhat exotic, so calling it a List explicitly in the codebase might be helpful for comprehension.

On the other hand, we could also leave it a Val for now, and consider this rename once we support user-facing changes which actually use it as a list.

There's aready a List type, but it's a minor type used to help formatting, so renaming the existing List to something else so we can use List for what is now called Val might be worth it.

Don't worry about making this change now, we can either make it before the PR goes in, or in a follow-up.

@@ -83,50 +83,63 @@ impl<'src, 'run> Evaluator<'src, 'run> {
Ok(self.scope.value(name).unwrap())
}

/// A place for adding list operators in the future.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little confusing to me, since it adds some indirection to follow to figure out what's going on. I think we should probably remove it, and just use evaluate_expression, and we can add it back later when we need it.

/// It can be viewed as a "joined string", or as the individual "parts".
#[derive(Debug, Clone, Default)]
pub(crate) struct Val {
parts: Rc<[Box<str>]>,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rc<[Box<str>]> is a little exotic. Is the advantage that extra capacity is freed when compared to an Rc<Vec<String>>?

I'd be in favor of using Rc<Vec<String>>, just because it's easier to read, and switching to Rc<[Box<str>] if we really need to.

}

/// Construct a `Val` consisting of multiple parts.
pub fn from_parts<I, S>(parts: I) -> Self
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think element is the familiar term for things in array or linked list, so maybe this should be from_elements? and val.parts should be val.elements?

}

/// Convert to a single joined string.
pub fn to_joined(&self) -> String {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can just be join().

impl std::fmt::Display for Val {
/// When formatted as a string, a `Val` is joined by whitespace.
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.write_str(&self.to_joined())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should do:

for (i, element) in &self.elements {
  if i > 0 {
    write!(f, " ")?; // or a f.write_char if that exists
  }
  f.write_str(element)?;
}

Ok(())

And then join should call self.to_string(). Doing it this way means that you don't need to allocate if you don't have to, i.e., you're printing to stdout or using it in a format string.

}

/// Construct a `Val` consisting of a single string.
pub fn from_str<S: AsRef<str>>(s: S) -> Self {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be removed in favor of val.into() along with the From implementations.

}

/// Convert to individual parts.
pub fn to_parts(&self) -> Vec<Box<str>> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this return just &[String], and rely on the caller to convert it to a vec? Avoids allocation if the caller doesn't need to.

}
}

#[cfg(test)]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This this and the tests should be inside a #[cfg(test)] mod tests { … } block.

@@ -81,3 +81,25 @@ fn nesting() {
evaluate("'' || '' || '' || '' || 'foo'", "foo");
evaluate("'foo' && 'foo' && 'foo' && 'foo' && 'bar'", "bar");
}

#[test]
fn empty_variadics_are_falsey() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also add test cases for comparing empty variadics with the empty string with==, !=, =~, and !~, since those are cases where I can imagine accidentally introducing a breaking change.

#[track_caller]
fn eval_variadics<'a>(args: impl AsRef<[&'a str]>, expected: &'a str) {
Test::new()
.justfile("@foo *args:\n echo {{ args && 'true' || 'false' }}")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should have separate tests for && and ||. Operator precedence is always confusing, so separating them might help comprehension.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants