[RFC] Add Appendix A: Persisted Documents #264
Changes from 5 commits
25c6968
7863941
a1a2c25
6e9cb85
433b37f
6fbc6ed
52d56fb
@@ -0,0 +1,269 @@ | ||||||
# A. Appendix: Persisted Documents | ||||||
|
||||||
This appendix defines an optional extension to the GraphQL-over-HTTP protocol | ||||||
that allows for the usage of "persisted documents". | ||||||
|
||||||
:: A _persisted document_ is a GraphQL document (strictly: an | ||||||
[`ExecutableDocument`](https://spec.graphql.org/draft/#ExecutableDocument)) that | ||||||
has been persisted such that the server may retrieve it based on an identifier | ||||||
indicated in the HTTP request. | ||||||
|
||||||
This feature can be used as an operation allow-list, as a way of improving the
caching of GraphQL operations, or simply as a way of reducing the bandwidth
consumed by sending the full GraphQL Document to the server on each request.
|
||||||
Typically, support for the _persisted document_ feature is implemented via a | ||||||
"middleware" that sits in front of the GraphQL service and transforms a | ||||||
_persisted document request_ into a _GraphQL-over-HTTP request_. | ||||||
|
||||||
:: A _persisted operation_ is a _persisted document_ which contains only one | ||||||
GraphQL operation and all the fragments this operation references (recursively). | ||||||
|
||||||
## Identifying a Document | ||||||
|
||||||
:: A _document identifier_ is a string-based identifier that uniquely identifies | ||||||
a GraphQL Document. | ||||||
|
||||||
Note: A _document identifier_ must be unique, otherwise there is a risk of
responses confusing the client. Even when the selection sets are identical,
whitespace changes alone may change the locations from which errors are raised,
and thus should generate different document identifiers.
|
||||||
A _document identifier_ must either be a _prefixed document identifier_ or a | ||||||
_custom document identifier_. | ||||||
Comment on lines +32 to +33

Should we give a formal BNF syntax? Maybe restrict the identifiers to alpha numeric? GraphQL names maybe?

Would you like to submit a PR to my PR to add this?

Sounds like a plan. This week's quite busy but I'll aim for next week! (famous last words 😅 )

That was the longest week ever but attempt at formal syntax is here
||||||
|
||||||
### Prefixed Document Identifier | ||||||
|
||||||
:: A _prefixed document identifier_ is a document identifier that contains at | ||||||
least one colon symbol (`:`). The text before the first colon symbol is called | ||||||
the {prefix}, and the text after it is called the {payload}. The {prefix} | ||||||
identifies the method of identification used. Applications may use their own | ||||||
identification methods by ensuring that the prefix starts with `x-`; otherwise, all
prefixes are reserved for reasons of future expansion. | ||||||
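The definition above can be sketched in code. This is a minimal illustration, not part of the appendix: the function names `split_document_identifier` and `is_application_defined`, and the choice to return `None` as the prefix for an identifier with no colon, are assumptions made for this example.

```python
from typing import Optional, Tuple


def split_document_identifier(document_id: str) -> Tuple[Optional[str], str]:
    """Split a document identifier at its first colon.

    Returns (prefix, payload) for a prefixed document identifier, or
    (None, document_id) when the identifier contains no colon.
    """
    prefix, separator, payload = document_id.partition(":")
    if not separator:
        return None, document_id
    return prefix, payload


def is_application_defined(prefix: str) -> bool:
    """Prefixes starting with `x-` are left to applications; others are reserved."""
    return prefix.startswith("x-")


print(split_document_identifier("sha256:abc"))   # ('sha256', 'abc')
print(split_document_identifier("x-myapp:1:2"))  # ('x-myapp', '1:2')
print(split_document_identifier("0123abcd"))     # (None, '0123abcd')
```

Only the first colon separates {prefix} from {payload}, so any later colons stay in the payload.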
|
||||||
### SHA256 Hex Document Identifier | ||||||
I wonder if actual encryption is needed? what's the benefit of having

The aim of a specification like this is interoperability; so a client that supports persisted operations should be able to use a server that supports persisted operations without too much additional setup. Sharing details of the document identification method used out-of-band is supported (explicitly by

I expect that the benefit is mainly for APQ support, where a client can enable APQ without knowledge of the server implementation, and so a consistent implementation is essential. SHA256 can be computed natively by modern browsers with a very low collision rate, making it a good choice in this scenario. However, in a scenario where the identifiers are only known to the server, and must be registered with the server in order for the client to operate (as so far is documented in this RFC), they might as well be any opaque key returned by the query storage database. If the database stores the queries with an auto-incrementing integer as an identifier, that would work just as well. Even so, there are still some benefits to using a hash:
I'm fine with the current suggestion of

@dotansimha Indeed, that's already allowed under this spec. The issue is it requires coordination between server and client (they need to agree on how

Oh, right, but the queries still need to be transferred to the server is what you're saying. Yes, that's true, but it can be done after the client is built (but before it's deployed). That's different from having to do it during the build process; it means that clients can be built and persisted documents written even before the server exists. It also allows for arbitrary transfer of documents to the server (you can send them one-way on a pen drive through the mail if you want!).

So the client and the server must be coordinated, and we recommend to use SHA256, right?

"Recommend" and "should" are the same according to RFC2119, so I think that is saying what's already there. If the server follows this recommendation, the client doesn't need any configuration. If the server doesn't do this; then you need to coordinate between client and server. For optimal interoperability, no coordination should be necessary.

To be pedantic there, I'd argue that sharing the "sha256" method is still coordination between the client and server. Plus the documents need to be actually transferred (on a pen drive or avian carrier!). So coordination is always required but sha256 is a convenient and widespread default which we recommend (hence the should)?

Indeed, it is a very light-touch asynchronous form of coordination. Essentially the coordination boils down to two things: 1. does the server support SHA256 hashes (don't necessarily need to ask the server this, it's a fact that should be established in the development team); 2. we need a way to send the operations+hashes to the server and to know when they have been persisted. Client is informed that server supports SHA256 hashes. Since the client can't be deployed until the server has stored the queries/hashes, there is indeed coordination. The coordination is incredibly lightweight compared to alternatives where the server must generate hashes during the client build process.
||||||
|
||||||
:: A _SHA256 hex document identifier_ is a _prefixed document identifier_ where | ||||||
{prefix} is `sha256` and {payload} is 64 hexadecimal characters (in lower case). | ||||||
|
||||||
The payload of a _SHA256 hex document identifier_ must be produced via the | ||||||
lower-case hexadecimal encoding of the SHA256 hash (as specified in | ||||||
[RFC4634](https://datatracker.ietf.org/doc/html/rfc4634)) of the Source Text of | ||||||
the GraphQL Document (as specified in | ||||||
[the Language section of the GraphQL specification](https://spec.graphql.org/draft/#sec-Language)) | ||||||
encoded using the UTF-8 character set. | ||||||
|
||||||
A service which accepts a _persisted document request_ SHOULD support the | ||||||
_SHA256 hex document identifier_ for compatibility. | ||||||
|
||||||
#### Example | ||||||
|
||||||
The following GraphQL query (with no trailing newline): | ||||||
|
||||||
```graphql example | ||||||
query ($id: ID!) { | ||||||
user(id: $id) { | ||||||
name | ||||||
} | ||||||
} | ||||||
``` | ||||||
|
||||||
Would have the following _SHA256 hex document identifier_: | ||||||
|
||||||
```example | ||||||
sha256:7dba4bd717b41f10434822356a93c32b1fb4907b983e854300ad839f84cdcd6e | ||||||
``` | ||||||
|
||||||
Whereas the same query with all optional whitespace omitted: | ||||||
|
||||||
```raw graphql example | ||||||
query($id:ID!){user(id:$id){name}} | ||||||
``` | ||||||
|
||||||
Would have this different _SHA256 hex document identifier_: | ||||||
|
||||||
```example | ||||||
sha256:71f7dc5758652baac68e4a10c50be732b741c892ade2883a99358f52b555286b | ||||||
``` | ||||||
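As a rough sketch of the hashing rule above, the identifier can be derived with Python's standard `hashlib`. The function name `sha256_hex_document_identifier` is hypothetical, and the `spaced` sample string is a single-line variant invented for this illustration (it is not the multi-line example above, so its hash is not claimed to match the value shown there).

```python
import hashlib


def sha256_hex_document_identifier(source_text: str) -> str:
    """Hash the UTF-8 bytes of the document's source text, exactly as given."""
    digest = hashlib.sha256(source_text.encode("utf-8")).hexdigest()
    # hexdigest() already yields lower-case hexadecimal, as the payload requires.
    return f"sha256:{digest}"


compact = "query($id:ID!){user(id:$id){name}}"
spaced = "query ($id: ID!) { user(id: $id) { name } }"

compact_id = sha256_hex_document_identifier(compact)
spaced_id = sha256_hex_document_identifier(spaced)

print(compact_id)
print(spaced_id)
print(compact_id != spaced_id)  # whitespace differences change the identifier
```

Because the hash covers the raw source text, the client must hash exactly the bytes it persists; reformatting the document after hashing would produce a different identifier.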
|
||||||
### Custom Document Identifier | ||||||
I think it would be unwise to recommend that people do this based solely on the operation name (the risk of clashes as they iterate their queries is too high); however I would support the name being factored into the document identifier along with some hashing; e.g. something like:
||||||
|
||||||
:: A _custom document identifier_ is a document identifier that contains no | ||||||
colon symbols (`:`). The meaning of a custom document identifier is | ||||||
implementation specific. | ||||||
|
||||||
Note: A 32 character hexadecimal _custom document identifier_ is likely to be an | ||||||
MD5 hash of the GraphQL document, as traditionally used by Relay. | ||||||
|
||||||
## Persisting a Document | ||||||
|
||||||
A client that wishes to utilize persisted documents for a request must generate | ||||||
a _document identifier_ for the associated GraphQL Document and should ensure | ||||||
the server can retrieve this GraphQL Document from the document identifier. The | ||||||
method through which the client and server achieve this is implementation | ||||||
specific. | ||||||
|
||||||
Note: When used as an operation allowlist, persisted documents are typically | ||||||
stored into some kind of trusted shared key-value store at client build time | ||||||
(either directly, or indirectly via an authenticated request to the server) such | ||||||
that the server may retrieve them given the identifier at request time. This | ||||||
must be done in a secure manner (preventing untrusted third parties from adding | ||||||
their own persisted document) such that the server will be able to retrieve the | ||||||
identified document within a _persisted document request_ and know that it is | ||||||
trusted. | ||||||
|
||||||
Note: When used solely as a bandwidth optimization, an error-based mechanism | ||||||
As we hint here at APQ should we have an opinion on the runtime vs build time generation of documents where we explicitly state that build-time gets you all the security benefits while runtime does not?

I'm happy with the wording as-is to cover this. Note that Persisted Documents are not necessarily trusted documents. All trusted documents are persisted documents, but not all persisted documents are trusted documents.

Should we mention "auto persisted queries" explicitly?
|
||||||
might be used wherein the client assumes that the document has already been | ||||||
persisted, but if the request fails due to unknown _document identifier_ the | ||||||
client issues a follow-up request containing the full GraphQL Document to be | ||||||
persisted. | ||||||
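The error-based mechanism described in this note might look roughly like the following on the client side. This is a sketch under several assumptions: the appendix does not standardize how an unknown identifier is reported, so the `PERSISTED_DOCUMENT_NOT_FOUND` extension code is hypothetical; sending the full document in a `query` parameter alongside {documentId} on the follow-up is likewise an assumption; and `send` stands in for whatever transport the client uses.

```python
from typing import Any, Callable, Dict, Optional

Response = Dict[str, Any]
Send = Callable[[Dict[str, Any]], Response]


def is_unknown_document(response: Response) -> bool:
    """Hypothetical check; the error shape is not defined by this appendix."""
    errors = response.get("errors") or []
    return any(
        (error.get("extensions") or {}).get("code") == "PERSISTED_DOCUMENT_NOT_FOUND"
        for error in errors
    )


def execute_with_fallback(
    send: Send,
    document_id: str,
    document: str,
    variables: Optional[Dict[str, Any]] = None,
) -> Response:
    """Try the identifier alone; on an unknown-identifier error, resend with the document."""
    request: Dict[str, Any] = {"documentId": document_id}
    if variables is not None:
        request["variables"] = variables
    response = send(request)
    if is_unknown_document(response):
        # Follow-up request carries the full document so the server can persist it.
        response = send({**request, "query": document})
    return response
```

Documents persisted this way at runtime are a bandwidth optimization only; they do not give the allow-list guarantees described in the previous note.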
|
||||||
Note: When persisting a document it is generally good practice for the client to | ||||||
issue both the GraphQL Document and the document identifier to the server; the | ||||||
server would then regenerate the document identifier from the GraphQL Document | ||||||
independently, and check that the identifiers match before storing the Document. | ||||||
An alternative but equally valid approach has the client issue the GraphQL | ||||||
Document to the server, and the server returns an arbitrary _custom document | ||||||
identifier_ that the client would incorporate into its bundle. | ||||||
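The first approach in the note above (the server recomputes the identifier before storing) could be sketched as follows. The `DocumentStore` class is a hypothetical in-memory stand-in for a trusted key-value store, and the sketch assumes the _SHA256 hex document identifier_ scheme; authentication of the caller is out of scope here.

```python
import hashlib
from typing import Dict, Optional


class DocumentStore:
    """Hypothetical in-memory stand-in for a trusted key-value store."""

    def __init__(self) -> None:
        self._documents: Dict[str, str] = {}

    def persist(self, document_id: str, document: str) -> None:
        """Store the document only if the claimed identifier matches its hash."""
        digest = hashlib.sha256(document.encode("utf-8")).hexdigest()
        expected = f"sha256:{digest}"
        if document_id != expected:
            raise ValueError("document identifier does not match the document")
        self._documents[document_id] = document

    def retrieve(self, document_id: str) -> Optional[str]:
        """Return the stored document, or None if the identifier is unknown."""
        return self._documents.get(document_id)
```

Recomputing the identifier on the server guards against a client accidentally (or maliciously) registering a document under an identifier that belongs to different source text.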
We should write here that the operation identifier to persisted document mapping is usually a build artifact shared by the graphql client and graphql server. The server can use an external "store" as the source for resolving a document identifier to an actual document

Would you like to propose edits in the form of a PR? Perhaps you're just suggesting an extension to this paragraph, such as:
|
||||||
|
||||||
## Persisted Document Request | ||||||
|
||||||
A server MAY accept a _persisted document request_ via `GET` or `POST`. | ||||||
|
||||||
### Persisted Document Request Parameters | ||||||
|
||||||
:: A _persisted document request_ is an HTTP request that encodes the following | ||||||
parameters in one of the manners described in this specification: | ||||||
|
||||||
- {documentId} - (_Required_, string): The string identifier for the Document. | ||||||
In an unofficial APQ request based on this specification, the query should go into

What are your opinions on providing the Edit: already covered via https://github.com/graphql/graphql-over-http/pull/264/files#diff-9be5577e05ae2112d2b8f95584b162d0dec01453bf6c85df58bf5db4f2c9727aR166-R168

Yes I think this should be encouraged more; it's great for caching. I'd welcome your edits to address this, if you were so inclined.
- {operationName} - (_Optional_, string): The name of the Operation in the | ||||||
identified Document to execute. | ||||||
- {variables} - (_Optional_, map): Values for any Variables defined by the | ||||||
Operation. | ||||||
- {extensions} - (_Optional_, map): This entry is reserved for implementors to | ||||||
extend the protocol however they see fit. | ||||||
|
||||||
### GET | ||||||
|
||||||
For a _persisted document request_ using HTTP GET, parameters SHOULD be provided | ||||||
in the query component of the request URL, encoded in the | ||||||
`application/x-www-form-urlencoded` format as specified by the | ||||||
[WhatWG URLSearchParams class](https://url.spec.whatwg.org/#interface-urlsearchparams). | ||||||
|
||||||
The {documentId} parameter must be a string _document identifier_. | ||||||
|
||||||
The {operationName} parameter, if present, must be a string. | ||||||
|
||||||
Each of the {variables} and {extensions} parameters, if used, MUST be encoded as | ||||||
Do we have to define expectations in case this exceeds maximum URL size?

If we do; we should do it in the main spec: https://graphql.github.io/graphql-over-http/draft/#sec-GET The Appendix tries not to redundantly repeat statements from the main spec if it can avoid it.

Which reminds me; I heard that some people are using headers to specify variables when using GraphQL-over-GET... Apparently that works around the length limit 🤨

Yes, a lot of folks use headers and then the server adds them to the
a JSON string. | ||||||
|
||||||
Setting the value of the {operationName} parameter to the empty string is | ||||||
equivalent to omitting the {operationName} parameter. | ||||||
|
||||||
A client MAY provide the _persisted document request_ parameters in another way | ||||||
if the server supports that. | ||||||
|
||||||
Note: A common alternative pattern is to use a dedicated URL for each _persisted | ||||||
operation_ (e.g. | ||||||
`https://example.com/graphql/sha256:71f7dc5758652baac68e4a10c50be732b741c892ade2883a99358f52b555286b`). | ||||||
|
||||||
Can we add a section on i.e. returning 404 for persisted-documents we can't find and maybe even 400 if they don't leverage an allowed prefix?

I think for the URL format 404 should be encouraged. I've been writing up a change to the appendix which encourages the URL format, but haven't had time to finish it yet; I've just raised a PR for my WIP so we have something to easily reference: #305 IMO for the non-URL version (traditional), 404 should not be used - it suggests that the
GET requests MUST NOT be used for executing mutation operations. If a mutation | ||||||
operation is indicated by the value of {operationName} and the GraphQL Document | ||||||
identified by {documentId}, the server MUST respond with error status code `405` | ||||||
(Method Not Allowed) and halt execution. This restriction is necessary to | ||||||
conform with the long-established semantics of safe methods within HTTP. | ||||||
|
||||||
#### Canonical Parameters | ||||||
|
||||||
Parameters SHOULD be provided in the order given in the list above; any optional
parameters which have no value SHOULD be omitted; and parameters encoded as a
JSON string SHOULD use the most compressed form (with all optional whitespace
omitted). A server MAY reject requests where this is not adhered to.
|
||||||
Note: Ensuring that parameters are in their canonical form helps improve cache | ||||||
hit ratios. | ||||||
|
||||||
#### Example | ||||||
|
||||||
Executing the GraphQL Document identified by | ||||||
`"sha256:71f7dc5758652baac68e4a10c50be732b741c892ade2883a99358f52b555286b"` with | ||||||
the following query variables: | ||||||
|
||||||
```raw json example | ||||||
{"id":"QVBJcy5ndXJ1"} | ||||||
``` | ||||||
|
||||||
This request could be sent via an HTTP GET as follows: | ||||||
|
||||||
```url example | ||||||
https://example.com/graphql?documentId=sha256:71f7dc5758652baac68e4a10c50be732b741c892ade2883a99358f52b555286b&variables=%7B%22id%22%3A%22QVBJcy5ndXJ1%22%7D | ||||||
``` | ||||||
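A client might assemble such a URL along these lines. This is a sketch using Python's standard `urllib.parse.urlencode`; the function name `persisted_document_get_url` is hypothetical. Note that `urlencode` percent-encodes the colon in {documentId} as `%3A`, which differs textually from the example above but decodes to the same value.

```python
import json
from typing import Any, Dict, List, Optional, Tuple
from urllib.parse import urlencode


def persisted_document_get_url(
    endpoint: str,
    document_id: str,
    operation_name: Optional[str] = None,
    variables: Optional[Dict[str, Any]] = None,
    extensions: Optional[Dict[str, Any]] = None,
) -> str:
    """Build a GET URL with parameters in the order listed in this appendix."""
    params: List[Tuple[str, str]] = [("documentId", document_id)]
    # An empty operationName is equivalent to omitting it, so skip it.
    if operation_name:
        params.append(("operationName", operation_name))
    # JSON parameters use the most compact form (no optional whitespace).
    if variables is not None:
        params.append(("variables", json.dumps(variables, separators=(",", ":"))))
    if extensions is not None:
        params.append(("extensions", json.dumps(extensions, separators=(",", ":"))))
    return f"{endpoint}?{urlencode(params)}"


url = persisted_document_get_url(
    "https://example.com/graphql",
    "sha256:71f7dc5758652baac68e4a10c50be732b741c892ade2883a99358f52b555286b",
    variables={"id": "QVBJcy5ndXJ1"},
)
print(url)
```

Passing the parameters as an ordered list of pairs, rather than a dictionary built ad hoc, keeps the parameter order stable across calls, which is what makes the resulting URLs cache-friendly.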
|
||||||
### POST | ||||||
|
||||||
For a _persisted document request_ using HTTP POST, the request MUST have a body | ||||||
which contains values of the _persisted document request_ parameters encoded in | ||||||
one of the officially recognized GraphQL media types, or another media type | ||||||
supported by the server. | ||||||
|
||||||
#### JSON Encoding | ||||||
|
||||||
When encoded in JSON, a _persisted document request_ is encoded as a JSON object | ||||||
(map), with the properties specified by the persisted document request: | ||||||
|
||||||
- {documentId} - the string identifier for the Document | ||||||
- {operationName} - an optional string | ||||||
- {variables} - an optional object (map), the keys of which are the variable | ||||||
names and the values of which are the variable values | ||||||
- {extensions} - an optional object (map) | ||||||
|
||||||
#### Example | ||||||
|
||||||
If we wanted to execute the following GraphQL query: | ||||||
|
||||||
```raw graphql example | ||||||
query ($id: ID!) { | ||||||
user(id: $id) { | ||||||
name | ||||||
} | ||||||
} | ||||||
``` | ||||||
|
||||||
With the following query variables: | ||||||
|
||||||
```json example | ||||||
{ | ||||||
"id": "QVBJcy5ndXJ1" | ||||||
} | ||||||
``` | ||||||
|
||||||
This request could be sent via an HTTP POST to the relevant URL using the JSON | ||||||
encoding with the headers: | ||||||
|
||||||
```headers example | ||||||
Content-Type: application/json | ||||||
Accept: application/graphql-response+json | ||||||
``` | ||||||
|
||||||
And the body: | ||||||
|
||||||
```json example | ||||||
{ | ||||||
"documentId": "sha256:7dba4bd717b41f10434822356a93c32b1fb4907b983e854300ad839f84cdcd6e", | ||||||
"variables": { | ||||||
"id": "QVBJcy5ndXJ1" | ||||||
} | ||||||
} | ||||||
``` | ||||||
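The headers and body above could be produced programmatically as sketched below. The function name `persisted_document_post` is hypothetical; actually sending the request is left to whichever HTTP client is in use.

```python
import json
from typing import Any, Dict, Optional, Tuple


def persisted_document_post(
    document_id: str,
    operation_name: Optional[str] = None,
    variables: Optional[Dict[str, Any]] = None,
    extensions: Optional[Dict[str, Any]] = None,
) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for a JSON-encoded persisted document request."""
    payload: Dict[str, Any] = {"documentId": document_id}
    # Optional parameters are included only when they carry a value.
    if operation_name:
        payload["operationName"] = operation_name
    if variables is not None:
        payload["variables"] = variables
    if extensions is not None:
        payload["extensions"] = extensions
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/graphql-response+json",
    }
    return headers, json.dumps(payload).encode("utf-8")


headers, body = persisted_document_post(
    "sha256:7dba4bd717b41f10434822356a93c32b1fb4907b983e854300ad839f84cdcd6e",
    variables={"id": "QVBJcy5ndXJ1"},
)
print(headers)
print(body.decode("utf-8"))
```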
|
||||||
## Persisted Document Response | ||||||
|
||||||
When a server that implements _persisted documents_ receives a well-formed | ||||||
_persisted document request_, it must return a well‐formed _GraphQL response_. | ||||||
|
||||||
The server should retrieve the GraphQL Document identified by the {documentId} | ||||||
parameter. If the server fails to retrieve the document, it MUST respond with a | ||||||
well-formed _GraphQL response_ consisting of a single error. Otherwise, it | ||||||
I suggest we define the contents of the error sufficiently for the client to conclusively recognize.

There's not much scope to do that currently; the best that we can do is ensure that the error message starts with, ends with, or contains, a particular string. Error codes would need to be specified in the main GraphQL specification for us to use them here, and writing to

+1 to non-normative example. Attempt:

Was coming to ask if there was a consistent error when the document identifier is not found
should construct a _GraphQL-over-HTTP request_ using this document and the other | ||||||
parameters of the _persisted document request_, and then follow the details in | ||||||
the [Response section](#sec-Response). |
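The behaviour described in this section can be sketched as a small middleware function. Everything named here is illustrative: `retrieve` and `execute` are hypothetical callables standing in for the document store and the underlying GraphQL service, the error message text is not defined by this appendix, and the `query` parameter name is taken from the main GraphQL-over-HTTP specification rather than from this appendix.

```python
from typing import Any, Callable, Dict, Mapping, Optional

Retrieve = Callable[[str], Optional[str]]
Execute = Callable[[Dict[str, Any]], Dict[str, Any]]


def handle_persisted_document_request(
    params: Mapping[str, Any],
    retrieve: Retrieve,
    execute: Execute,
) -> Dict[str, Any]:
    """Turn a persisted document request into a GraphQL-over-HTTP request."""
    document = retrieve(params["documentId"])
    if document is None:
        # A well-formed GraphQL response consisting of a single error.
        # The message text is an illustrative assumption.
        return {"errors": [{"message": "Persisted document not found"}]}
    request: Dict[str, Any] = {"query": document}
    # Carry over the remaining parameters unchanged.
    for key in ("operationName", "variables", "extensions"):
        if params.get(key) is not None:
            request[key] = params[key]
    return execute(request)
```

Keeping the lookup separate from execution mirrors the "middleware in front of the GraphQL service" arrangement described at the start of this appendix.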