uri, iri, and url-s #784

Closed
iherman opened this issue Aug 18, 2019 · 12 comments

@iherman

iherman commented Aug 18, 2019

The current spec introduces, in 7.3.5, the uri, iri, uri-reference and iri-reference terms, referring to RFC3986 and RFC3987, respectively.

However... the web developer community is converging toward the exclusive usage of the term 'url', referring to the URL Spec. The URL, as defined in that spec, includes all kinds of schemes like URN-s and others, i.e., the syntax covers what URI-s do. Most importantly, newer W3C standards, like HTML5, refer to that spec instead of the RFC-s (look also at the Note related to URL-s).

Bottom line, I believe it would be worth adding the terms url and url-reference, referring to the WhatWG URL Spec. This would avoid possible ambiguities when schemas are used for JSON data closely related to Web specifications.
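
For readers less familiar with the terms in 7.3.5, the difference between uri and uri-reference in RFC3986 can be sketched roughly as follows. This is a minimal illustration, not taken from the spec or from any implementation: the function name `classify` is hypothetical, the splitting regex is the one given in RFC3986 Appendix B, and no character-level validation is attempted.

```python
import re

# Component-splitting regex from RFC 3986, Appendix B (group 2 is the scheme).
# DOTALL is added only so that the pattern matches any input string.
_RFC3986_SPLIT = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?$",
    re.DOTALL,
)

# Scheme grammar from RFC 3986, section 3.1: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*$")


def classify(value: str) -> str:
    """Return "uri" when a well-formed scheme is present, else "uri-reference".

    Hypothetical helper: every URI is also a URI-reference, so this reports
    the narrower of the two terms. It does not check that the remaining
    characters are legal.
    """
    scheme = _RFC3986_SPLIT.match(value).group(2)
    if scheme is not None and _SCHEME.match(scheme):
        return "uri"
    return "uri-reference"


print(classify("urn:isbn:0451450523"))   # has a scheme, no network location
print(classify("chapter1.html#sec2"))    # relative reference, no scheme
```

Note that a URN counts as a uri here even though it locates nothing, which is the identifier-versus-locator point that comes up later in the thread.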

@handrews
Contributor

As an IETF draft series, we preferentially reference IETF RFCs. I'm not particularly interested in getting into WHATWG's wars with other standards groups. Plus, WHATWG's URL spec is oriented towards the needs of browsers and HTML, which is not our primary target environment.

If people want to specify formats specific to other specifications, they can do so with extensions (this sort of thing has come up before, and nothing has changed since then).

@handrews
Contributor

For reference, this was last discussed in #233.

@handrews
Contributor

And finally, note that the next draft substantially improves the extensibility of JSON Schema, so if someone wants to add a vocabulary for WHATWG compatibility, they'll be able to do so without needing our approval.

@Relequestual
Member

Additionally, I think using URL would imply that it is network addressable (location) rather than just an identifier (URI). A key difference here from the name alone.

@Julian
Member

Julian commented Aug 19, 2019

(I agree with the closure of the ticket, but @Relequestual I think on the last thing the point is "lots of people, even somewhat informed ones, just call everything URLs, regardless of whether they're identifiers or locations", which probably matches my experience too.)

@Relequestual
Member

@Julian This is true, however when we use the term URI, people may look and go "Hum, why is that URI and not URL?" - At least I hope they might. Using URI supports the "this COULD be just a name" which is often how I see it being used, and how I often recommend using it initially.

@Julian
Member

Julian commented Aug 19, 2019 via email

@handrews
Contributor

Yeah, plus we say all over the spec that our URIs are identifiers and not locators. So for things in our spec, changing to URL would give exactly the wrong impression.

And as for a "format": "url", that's where I have no interest in dealing with WHATWG's spec wars. They can write an extension vocabulary.
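
As a rough sketch of what "write an extension" could look like on the application side, an implementation might keep a registry of format checkers and let users add their own. Everything named here (`FORMAT_CHECKERS`, `register_format`, `check_format`, `has_scheme`) is hypothetical and not part of JSON Schema or any particular validator, and the "url" check is a crude stand-in that only requires a scheme. It is not the WHATWG parsing algorithm.

```python
from typing import Any, Callable, Dict
from urllib.parse import urlsplit

# Hypothetical registry: format name -> predicate over strings.
FORMAT_CHECKERS: Dict[str, Callable[[str], bool]] = {}


def register_format(name: str):
    """Decorator that adds a checker to the hypothetical registry."""
    def decorator(func: Callable[[str], bool]) -> Callable[[str], bool]:
        FORMAT_CHECKERS[name] = func
        return func
    return decorator


@register_format("url")
def has_scheme(value: str) -> bool:
    """Crude stand-in for a "url" format: accept anything with a scheme."""
    try:
        return bool(urlsplit(value).scheme)
    except ValueError:
        return False


def check_format(schema: Dict[str, Any], instance: Any) -> bool:
    """Apply the schema's "format" only when the instance is a string and
    the format is registered; everything else passes."""
    fmt = schema.get("format")
    if not isinstance(instance, str) or fmt not in FORMAT_CHECKERS:
        return True
    return FORMAT_CHECKERS[fmt](instance)


print(check_format({"format": "url"}, "https://example.org/pub"))
print(check_format({"format": "url"}, "chapter1.html"))
```

Letting unregistered formats pass is a design choice in this sketch: a schema that uses an extension format still works with tools that do not know about it.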

@iherman
Author

iherman commented Aug 19, 2019

This is not an issue of "WHATWG's spec wars". The WHATWG URL spec has become a core reference for a number of W3C specifications, and not only for HTML. It includes cases when the term 'url' is used for identification. It is simply a matter of following the dominant usage and implementations out there.

(For example, the issue that led me to raise this came out of work we are doing on defining a manifest for digital publications in JSON-LD, using JSON-Schema to define the 'shape' of that manifest. We use URL-s to identify (publication) resources, and we hit this problem. Nothing to do with browsers or HTML.)

The issue has been closed, and I accept the consensus (even if I do not agree with it), so I do not want to pursue this.

@handrews
Contributor

@iherman if we end up publishing JSON Schema through the W3C (which has been floated given some responses from IETF JSON folks), then it would be a good idea to revisit this. I would be happy to step out of the way of the process.

Speaking purely personally, I find a standards body that describes the work of past standards authors as an "aberrant monstrosity" or puts comments like "Although we have asked them to stop doing so, the W3C also republishes some parts of this specification as separate documents" in their actual specification to be distasteful. Work your disagreements out like adults instead of complaining about them in your spec and essentially trash-talking your supposed collaborators.

But regardless of my personal opinion, should we move off of the IETF I will actively support going with whatever set of specifications our new home endorses, without any commentary in our actual specification. But as long as we're still targeting an IETF RFC, we will reference the IETF's documents preferentially.

@awwright
Member

awwright commented Aug 23, 2019

This comes up from time to time. One time I innocently gave feedback on a bug that I didn't know was so politically charged, and started a nice flame war.

In short, I don't think this is feasible for JSON Schema because WHATWG only targets Web browsers (read: Chrome and relatives of Chrome), which immediately puts JSON Schema outside of that target audience.

First, I'm curious: which specifications specifically? W3C threw in the towel on producing HTML (if you link to https://www.w3.org/TR/html5/references.html#biblio-url instead, notice where you get redirected). Older versions of HTML5 did reference RFC3986 normatively, but it simply got too complicated to maintain compatibility with WHATWG (as the HTML charter required), since WHATWG insisted WHATWG HTML use their proprietary parser for URIs. Later versions eventually referenced the WHATWG URL document, but with the notice:

"URLs can be used in numerous different manners, in many differing contexts. For the purpose of producing strict URLs one may wish to consider [RFC3986] [RFC3987]."

And besides HTML, even when updating RDF to 1.1, the W3C changed from a custom-defined format for Unicode URIs, to citing RFC3987 (IRIs).

Second, WHATWG URL serves a fundamentally different purpose than what JSON Schema uses URIs for. Namely, WHATWG URL has no concept of identifiers; and WHATWG exclusively supports Web browsers. Since we wouldn't use it in the meta-schema, it doesn't seem to reach the bar to be a core format.

Third, it also has major technical issues. Unicode/ASCII, and full/absolute/reference are all called the same thing, which is super confusing. And there's a ton of flags. Sometimes a space in a URL is legal (and transparently corrected). Sometimes it's a parse error. What would it be in JSON Schema? Who knows, there's no stable document to reference.

I could be persuaded about the utility of a "whatwg-url" custom format if there's a desire to parse URIs that might be malformed (e.g. user input, or found in HTML documents), but again, I constantly run into technical issues.
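
The question of what a space should mean can be illustrated with a small sketch contrasting a strict check with a lenient "fix it up" pass. This is an assumption-laden illustration, not the WHATWG algorithm and not a full RFC3986 validator: the function names are hypothetical, and the strict check only tests whether every character is one that RFC3986 permits somewhere in a URI.

```python
import re
from urllib.parse import quote

# Unreserved and reserved characters from RFC 3986, plus "%" for escapes.
_ALLOWED = "-._~:/?#[]@!$&'()*+,;=%"
_RFC3986_CHARS = re.compile(r"[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]*")


def is_strict(value: str) -> bool:
    """Strict reading: any character outside the RFC 3986 set is an error."""
    return _RFC3986_CHARS.fullmatch(value) is not None


def percent_encode_disallowed(value: str) -> str:
    """Lenient reading: silently percent-encode whatever is not allowed."""
    return quote(value, safe=_ALLOWED)


raw = "https://example.com/a b"
print(is_strict(raw))                  # the space is rejected outright
print(percent_encode_disallowed(raw))  # the space is rewritten instead
```

A format validator has to pick one of these behaviours, and the two give different answers for the same input string.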

@handrews
Contributor

"Who knows, there's no stable document to reference."

Yeah, this, too. "Living standards" drive me nuts. I know web browsers release new versions constantly these days so I guess they can keep up, but a tiny volunteer effort like ours would be referencing a moving target (I can't figure out how you're supposed to pin down anything).

I recognize that WHATWG solved/solves some problems with the standards process for the web browser world, and I guess they've moved beyond that at least somewhat according to the OP (although that RDF reference is a better example for us, really), but really their approach has never made sense to me on a larger scale.

Between the inability to reference something clearly stable and the badmouthing of other standards bodies, in their formal publications no less, I've always steered well clear.


5 participants