Editorial: clarify URL validity #666

annevk · 2021-10-21T11:44:06Z

Closes #595.

url.bs

domenic · 2021-10-21T14:51:24Z

url.bs

@@ -1170,7 +1170,9 @@ unified model would be, please file an issue.

 <li><p>The <a>URL serializer</a> takes a <a for=/>URL</a> and returns an <a>ASCII string</a>. (If
 that string is then <a lt="URL parser">parsed</a>, the result will <a for=url>equal</a> the <a
- for=/>URL</a> that was <a lt="URL serializer">serialized</a>.)
+ for=/>URL</a> that was <a lt="URL serializer">serialized</a>.) The output of the
+ <a>URL serializer</a> is not always a <a>valid URL string</a>. I.e., not all <a for=/>URLs</a> are


It might be worth adding a pointer to #379, because I am still hoping we can change this.

You don't think it's useful that \ in HTTP URLs shows up as something you probably want to fix?

I mean, we can rehash that thread here if you want... yes, I think the serializer should always produce valid URLs, either by expanding the definition of valid, or changing the serializer.

Oh sorry, \ is not applicable then, I think. I confused your position with that of the people who think all inputs ought to be either valid or rejected. You don't necessarily think that; rather, you think the invalid inputs that are accepted ought to be transformed into something valid when they are serialized again.

So yeah, the reason for that is mainly encouraging RFC 3986 interop. But I'm not sure anyone is really appreciative of that.
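
To make the distinction in this thread concrete, here is a minimal sketch using the `URL` class as implemented in Node.js and browsers. The `pathLooksValid` helper and its regular expression are assumptions made for illustration: they approximate "only ASCII URL code points or `%`" and are much looser than the standard's full definition of a valid URL string. The sketch shows a `\` being normalized by the parser, while a `|` in the path survives serialization even though it is not a URL code point.

```typescript
// Assumption: a simplified stand-in for "valid URL path" that accepts only
// ASCII URL code points plus "%" (for percent-encoded bytes). It ignores
// non-ASCII code points and does not check that "%" starts a valid escape.
const ASCII_URL_UNITS = /^[A-Za-z0-9!$&'()*+,\-./:;=?@_~%]*$/;

function pathLooksValid(url: URL): boolean {
  return ASCII_URL_UNITS.test(url.pathname);
}

// Invalid input that the parser accepts and rewrites: "\" becomes "/".
const backslash = new URL("https://example.com\\foo");
const backslashHref = backslash.href; // "https://example.com/foo"

// Invalid input that the parser accepts and the serializer emits unchanged.
const pipe = new URL("https://example.com/a|b");
const pipeHref = pipe.href; // "https://example.com/a|b"

// Serializing and reparsing yields the same serialization either way.
const roundTrips = new URL(pipeHref).href === pipeHref; // true

// The serialized path still contains "|", so the simplified check rejects it.
const pipePathValid = pathLooksValid(pipe); // false
```

Under these assumptions, the `\` case ends up serialized as something a writer could have typed validly, whereas the `|` case is the kind of output the new sentence in the diff is describing.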

TimothyGu · 2021-10-25T18:14:38Z

url.bs

@@ -1160,7 +1160,7 @@ unified model would be, please file an issue.

 <ul>
 <li><p>The <a>URL parser</a> takes an arbitrary string and returns either failure or a
- <a for=/>URL</a>.
+ <a for=/>URL</a>. It might also record zero or more <a>validation errors</a>.


Do we know if "URL parser records zero validation errors" implies "input string is a valid URL string"? How about the other direction?

It would be great to clarify the purpose of these validation errors.

We do not, and I strongly suspect they are not equivalent. I think there are some open issues on it.

My preferred strategy has been to instrument whatwg-url with both modes of validation and fuzz to find examples where they mismatch. I haven't made the time to do so yet though.
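
The comparison described above could be sketched roughly as follows. This is not whatwg-url's API: `Check`, `findMismatches`, `parses`, and `looksValid` are hypothetical names, and both predicates are crude stand-ins. In particular, the `URL` constructor only reports failure versus success, so `parses` is weaker than "parses with zero validation errors", which would need an instrumented parser.

```typescript
// Hypothetical shape of a validity check: true when the input is accepted.
type Check = (input: string) => boolean;

// Return every input on which the two checks disagree.
function findMismatches(inputs: string[], a: Check, b: Check): string[] {
  return inputs.filter((input) => a(input) !== b(input));
}

// Stand-in 1: the input parses at all (no base URL supplied).
const parses: Check = (input) => {
  try {
    new URL(input);
    return true;
  } catch {
    return false;
  }
};

// Stand-in 2: a deliberately crude writer-side check that rejects
// backslashes, pipes, and whitespace. A real check would implement the
// standard's definition of a valid URL string.
const looksValid: Check = (input) => parses(input) && !/[\\|\s]/.test(input);

// A fuzzer would generate this corpus; three fixed inputs keep the sketch small.
const corpus = ["https://example.com/", "https://example.com\\foo", "not a url"];
const mismatches = findMismatches(corpus, parses, looksValid);
// mismatches contains only "https://example.com\\foo"
```

Swapping the two stand-ins for an instrumented parser and a real validity check would leave the harness itself unchanged.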

annevk requested review from domenic and rmisev October 21, 2021 11:44

domenic reviewed Oct 21, 2021

TimothyGu reviewed Oct 25, 2021

annevk added 2 commits December 9, 2022 11:21

Editorial: clarify URL validity

baf4501

Closes #595.

address nit (don't add an issue pointer for now as there's no clear p…

c3ade50

…roposal or consensus)

annevk force-pushed the annevk/valid branch from 568433f to c3ade50 December 9, 2022 10:26

annevk requested a review from domenic December 9, 2022 10:28

annevk added the "topic: validation" label (Pertaining to the rules for URL writing and validity, as opposed to parsing) Dec 9, 2022

domenic approved these changes Dec 9, 2022

annevk merged commit 2885626 into main Dec 9, 2022

annevk deleted the annevk/valid branch December 9, 2022 10:38

domenic Oct 21, 2021

annevk Oct 21, 2021

domenic Oct 21, 2021

annevk Oct 21, 2021

TimothyGu Oct 25, 2021

domenic Oct 25, 2021
