Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add {{StringContext}} extended attribute #1392

Closed
wants to merge 13 commits into from

Conversation

lukewarlow
Copy link
Member

@lukewarlow lukewarlow commented Mar 11, 2024

Add a {{StringContext}} extended attribute that can be defined on DOMString and USVString.

This is to hook up the Trusted Types validation during the ES->IDL type
conversion to avoid issues with its default policy.

See w3c/trusted-types#248,
w3c/trusted-types#176

The actual logic of the string validation calls into HTML, and will be something similar to https://w3c.github.io/trusted-types/dist/spec/#html-validate-the-string-in-context.

(See WHATWG Working Mode: Changes for more details.)


Preview | Diff

koto and others added 3 commits March 11, 2024 18:13
(DOM|USV)String.

This is to hook up the Trusted Types validation during the ES->IDL type
conversion to avoid funky issues with its default policy.

See w3c/trusted-types#248,
w3c/trusted-types#176
@lukewarlow
Copy link
Member Author

Fork of #841 will address review comments here

@lukewarlow lukewarlow force-pushed the string-context-xattr branch from 184e827 to 49a7361 Compare March 12, 2024 12:37
@lukewarlow lukewarlow changed the title Add StringContext attribute Add `{{StringContext attribute Mar 12, 2024
@lukewarlow lukewarlow changed the title Add `{{StringContext attribute Add {{StringContext}} extended attribute Mar 12, 2024
@lukewarlow lukewarlow requested review from Ms2ger, domenic and annevk March 12, 2024 13:13
index.bs Show resolved Hide resolved
[{{StringContext}}] extended attribute, then set |V| to the result of performing
[=validate the string in context=], passing [=this=], |V|, the {{StringContext}}
extended attribute [=identifier=], and the [=identifier=]
of the [=operation=] or [=extended attribute=] of the first entry in |S|.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we only need the identifier which is the same for all overloads we don't actually need to worry about which entry in the overload set is used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what if that overload doesn't have a StringContext annotated type?

I guess what this means is that you cannot meaningfully have overloaded StringContext types (or at least not all variants one might imagine to be possible). That seems reasonable, but I wonder if we need to make that more explicit in some way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah based on my understanding you can't really have overloads. Like where you need the type to be different in a sub class we just switch back to using a union type and handling it manually rather than using this attribute. cc @koto in case I'm wrong.

Happy to add a note or something to make it clearer, not really sure what it should say though?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I would add a paragraph to where StringContext is defined that outlines the limitations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a paragraph hopefully it makes sense. Happy to adjust it if you'd like.

@lukewarlow lukewarlow requested a review from annevk March 12, 2024 15:34
@lukewarlow lukewarlow marked this pull request as ready for review March 13, 2024 11:42
Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks reasonable, although I think we probably should note the overload limitations more explicitly.

@EdgarChen @petervanderbeken @yuki3 any of you want to take a look as well?

index.bs Outdated Show resolved Hide resolved
[{{StringContext}}] extended attribute, then set |V| to the result of performing
[=validate the string in context=], passing [=this=], |V|, the {{StringContext}}
extended attribute [=identifier=], and the [=identifier=]
of the [=operation=] or [=extended attribute=] of the first entry in |S|.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what if that overload doesn't have a StringContext annotated type?

I guess what this means is that you cannot meaningfully have overloaded StringContext types (or at least not all variants one might imagine to be possible). That seems reasonable, but I wonder if we need to make that more explicit in some way.

@yuki3
Copy link

yuki3 commented Mar 14, 2024

I think this looks reasonable, although I think we probably should note the overload limitations more explicitly.

@EdgarChen @petervanderbeken @yuki3 any of you want to take a look as well?

Looks good to me, too.

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
@whatwg whatwg deleted a comment from Jhonatanbb Mar 15, 2024
@yuki3
Copy link

yuki3 commented Mar 15, 2024

cc @natechapin as he's taking over the Blink-V8 bindings work (including interop work) from me.

a [=regular operation=] argument. A type annotated with the [{{StringContext}}]
extended attribute must not appear in a [=read only=] attribute.

A type that is not {{DOMString}} or {{USVString}} must not be [=extended attributes associated with|associated with=]
Copy link
Member

@mbrodesser-Igalia mbrodesser-Igalia Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future nullable types might require support too. At least Chromium doesn't need them now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://w3c.github.io/trusted-types/dist/spec/#setting-slot-values - does require use of ScriptString? so this might need updating to account for it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the TT spec to just use a union type there rather than the extended attribute, because that solves dealing with nullable types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC @koto ^

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that section specifically is going through a rewrite, Chrome doesn't match it currently and won't in its new form either so that is one change to the chromium implementation that will be needed. (Very minor web facing changes that shouldn't impact anyone)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the TT spec to just use a union type there rather than the extended attribute, because that solves dealing with nullable types.

Where was the TT spec updated? This patch still adds StringContext extended attribute.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w3c/trusted-types#484 - in this PR the textContent one specifically was updated to use a union. I've done a follow up commit that changes them all to unions (which I can easily revert if we decide against that)

Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is mostly good now, but on closer inspection I think the "validate" aspect needs a bit more flushing out.

index.bs Show resolved Hide resolved

The algorithm returns an ECMAScript value, or [=JavaScript/throws=] a <l spec=ecmascript>{{TypeError}}</l>.

Note: The HTML Standard defines how the validation is performed. [[!HTML]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into this a bit more and this description appears insufficient.

  1. It's not clear it always ends up throwing a TypeError. Doesn't this involve userland for some bits?
  2. It seems this algorithm might also end up reporting violations per step 6.1 of https://w3c.github.io/trusted-types/dist/spec/#abstract-opdef-get-trusted-type-compliant-string.

I don't think "validate" is a sufficiently clear term given the implications. "Validate" makes me expect a side-effect-free-boolean-returning operation, not something that can result in network traffic (and events?).


Apart from the above, we'll need some test coverage to ensure that this is invoked at the correct time relative to other exceptions that can be thrown. This will be observable if this can throw non-TypeError exceptions, but even if this always throws a TypeError (and we ignore the message field) it's observable through network traffic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear it always ends up throwing a TypeError. Doesn't this involve userland for some bits?

Yeah you're correct this can throw any error type.

It seems this algorithm might also end up reporting violations per step 6.1

Also correct it can report CSP violations.

I don't think "validate" is a sufficiently clear term given the implications. "Validate" makes me expect a side-effect-free-boolean-returning operation, not something that can result in network traffic (and events?).

Maybe verify instead? I'm not sure what prior art there is for this sort of thing.

Apart from the above, we'll need some test coverage to ensure that this is invoked at the correct time relative to other exceptions that can be thrown. This will be observable if this can throw non-TypeError exceptions, but even if this always throws a TypeError (and we ignore the message field) it's observable through network traffic.

Additional tests to ensure timing wrt to other errors makes sense here. cc @mbrodesser-Igalia

As for the network traffic aspect, I'm not familiar with the details of how CSP violation reports work and whether their timing is an observable concern.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The string can also be modified, right? It says it returns an ECMAScript value, but doesn't it really return a string?

I think "verify" is okay. I guess in theory we want this to be able to be used for non-Trusted Types purposes.

There's various things that matter with respect to timing:

  1. Does it all happen at the correct point in time relative to other exceptions that may be thrown. I.e., do implementations handle the checks in the correct order.
  2. What further side effects may arise from this check happening. It seems that https://w3c.github.io/webappsec-csp/#report-violation always queues a task so we don't have to worry about that for CSP. And I think we don't have to worry in the general case because this happens before any specification algorithms are involved. Although if we add more of these utilities around the edges of IDL we'll run into conflicts at some point of course.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't it really return a string?

Doesn't it return the argument directly in step 2?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It overwrites it first in 1.2 as far as I can tell. At least judging from https://w3c.github.io/trusted-types/dist/spec/#html-validate-the-string-in-context.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't expect the code path to be used then you need to use Assert.

But given @petervanderbeken's comments I'm starting to wonder if having this as an extended attribute is a good after all. I guess I'm back at #841 (comment) where I wonder if we should put most of the complexity in the specification algorithms.

cc @domenic @koto

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also cc @mbrodesser-Igalia as it will impact his implementation in Gecko

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion in the matrix I'm on board with the idea of just using union types and updating algorithms, and closing this PR out

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@koto or @otherdaniel do you forsee chrome having any issues with this proposed change? It'd be good to get the specs and webkit implementation.

Tldr drop the IDL extended attribute and just use unions everywhere we need enforcement (updating the call site directly).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it helps the discussion I've got a draft commit to update the WebKIt implementation to use union types instead of IDL magic. This doesn't account for DOM timers (webkit impl is different to spec). But otherwise replaces all existing usages of IDL magic.

https://github.com/webkit/webkit/compare/lukewarlow:trusted-types-use-union-over-attribute

the [{{StringContext}}] extended attribute.

The [{{StringContext}}] extended attribute must not be used on only some of the [=overloaded|overloads=] of an
[=overloaded=] [=operation=].

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this sentence a bit weird, because it's used on arguments, not on the operation itself. Does this mean that in the case of overloads, either all DOMString/USVString arguments need to be annotated or none? Or just that if the overloads have a DOMString/USVString argument in the same position it should be annotated the same way in all overloads?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off the top of my head I think it's that if they're in the same position they have to always or never be annotated. It might also be the case that you can't have an overload where it's an annotated string in one branch, and a different type entirely in another.

The attribute has a side effect and that means these limitations have to be in place.

I'll be honest I'm not super familar with the IDL overloading so I might be missing the point entirely.

@annevk can probably explain it better.

Happy to reword this paragraph to make more sense :)

@petervanderbeken
Copy link

I do think that we need to make [StringContext]-annotated strings not distinguishable from object, interface-like, callback function, dictionary-like and sequence-like. So we probably need to add another letter in the table, and use it in the string types row to explain that these are only distinguishable if the string type doesn't have a [StringContext] extended attribute.

@lukewarlow
Copy link
Member Author

Based on discussions here, in Matrix and a video call we had with Koto and Daniel I'm going to go ahead and close this PR. If Daniel discovers a reason why the observable timing differences between IDL and call site aren't web compatible we can revisit this.

@lukewarlow lukewarlow closed this Apr 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

7 participants