Configurable per query or per request context/dataloader batching #192

jmccaull · 2019-06-19T19:19:15Z

This is an initial work-up of how I think it should look. I'm still working on cleaning it up a little and documenting, but I want to get some eyes on it. I decided not to go with Lombok to keep this as lean as possible.
It was a little difficult keeping everything straight without a package structure, so I first organized the files a bit. There are a number of breaking changes relating to batching and the context. I removed the BatchExecutionHandler, replacing it with a pre-processor, removed the http objects from the context, and moved all of the logic handling http objects up into the AbstractServlet. The context from GraphQL's view is really just the subject and the data loader registry. The builder for the context still has access to all the http objects, since I felt they could be useful for things like headers when building the context. The servlet now deals with an interface, so consumers should be able to customize this and even add back the http objects if they desire.
The logic for executing a batch deals with an interface and is context independent. The logic for which type of batch input object to create and which dispatching instrumentation to use has been implemented on the enum for the setting to keep this all in one place. The default in the configuration for this is set to PER_QUERY, which most closely resembles the existing behavior.
As implementing the per request setting required wholly rewriting the dispatching instrumentation (and this was what was recommended by GraphQL java), I figured it made sense to make it as pluggable and reusable as possible. Further, since the GraphQL java defaults don't allow configuring their dispatching instrumentation to use the options object they provide, and probably have a bug regarding chained instrumentation, I've completely removed their default and supplied a custom instrumentation with a pluggable approach for both per query and per request settings.
Overall I think this will also make it a lot easier to refactor the abstract servlet and make a service layer that can be injected into custom controllers etc.
I also plan on updating the readme etc.
Thanks!

oliemansm

Nice job, thanks a lot!

oliemansm · 2019-06-20T09:20:40Z

src/main/java/graphql/servlet/AbstractGraphQLHttpServlet.java

-                        query(queryInvoker, graphQLObjectMapper, invocationInputFactory.createReadOnly(new GraphQLRequest(query, variables, operationName), request, response), response);
+                        query(queryInvoker, graphQLObjectMapper,
+                            invocationInputFactory.createReadOnly(new GraphQLRequest(query, variables, operationName), request, response),
+                            Optional.of(request), response);


It's recommended not to use Optional as parameter, see https://rules.sonarsource.com/java/RSPEC-3553
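The concern behind that rule is that an Optional parameter forces every caller to wrap its argument and can still be passed as null. A minimal sketch of the usual alternative, an overload pair that wraps the nullable value internally, is below. The class and method names are hypothetical and are not taken from this PR.

```java
import java.util.Optional;

public class OptionalParameterSketch {

    // Discouraged shape: callers must wrap the argument, and null can still be passed.
    static String describeWithOptional(String query, Optional<String> remoteAddress) {
        return query + " from " + remoteAddress.orElse("unknown");
    }

    // Alternative: one overload per call shape.
    static String describe(String query) {
        return describe(query, null);
    }

    // The nullable value is wrapped once, inside the method.
    static String describe(String query, String remoteAddress) {
        return query + " from " + Optional.ofNullable(remoteAddress).orElse("unknown");
    }

    public static void main(String[] args) {
        System.out.println(describe("{ hero }"));
        System.out.println(describe("{ hero }", "10.0.0.1"));
    }
}
```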

oliemansm · 2019-06-20T09:31:26Z

src/main/java/graphql/servlet/context/DefaultGraphQLContextBuilder.java

 public class DefaultGraphQLContextBuilder implements GraphQLContextBuilder {

    @Override
    public GraphQLContext build(HttpServletRequest httpServletRequest, HttpServletResponse httpServletResponse) {
-        return new GraphQLContext(httpServletRequest, httpServletResponse);
+        return new DefaultGraphQLServletContext();


Think it would be good if we would provide another interface layer in between, e.g. GraphQLServletContext and GraphQLWebSocketContext along with concrete implementations containing the respective objects. That way the DefaultGraphQLServletContext could still provide the getFileParts() for example, simplifying file uploads. And this way people could still make use of existing functionality if they simply grab a different type of context from the environment.

Do you agree? Happy to help out with this of course.

I was curious as to how all that was intended to be used, and maybe I should've posted a question about it first. I also didn't run into any examples of it being used until I tried to build the Spring Boot project locally last night. I'm still digging through the commons project and trying to work out what I think, but I definitely don't want to remove any functionality. Would it make more sense to wrap the result from the servlet and allow it to be accessed there? Ideally I would also like a way to break the dependency of the context builder on the actual http objects, but I'm at a loss. My goal for keeping the graphql objects leaner is to make refactoring the servlet into a service layer easier. Maybe we can extend the execution input to maintain the file parts?

Are you on Gitter by any chance? Might be easier to chat realtime :)

Couple of use cases I have used myself using the HttpServletRequest for the servlet and the Session for the WebSockets from the context:

Authentication: usually getting the Authorization header to parse and populate the SecurityContext with the result. We could provide an alternate means for hooking in authentication that doesn't involve exposing the raw request through the context though.

File uploads, as already mentioned, so those could perhaps be exposed differently too. You have to be able to get to them through the DataFetchingEnvironment in your method though. But perhaps a cleaner solution can be constructed for this which provides a nicer API for consumers.

WebSocket Session also used for authentication.

I use the HttpServletRequest as well to grab the remote IP address and verify that subsequent calls come from the same IP address the JWT was generated for, which is an additional security mechanism.

And these are just the ones I'm using, I obviously don't know what other use cases people might have atm...

In the spring boot commons example, it looks like the mutation is intended to do something with the file parts (upload)? Should they be passed through the DataFetchingEnvironment somehow instead?

Don't believe you can, since the DataFetchingEnvironment is part of graphql-java itself and is agnostic, and wants to remain agnostic, of the underlying technology. So they're not going to support file parts. The only thing under our control to pass along to the resolvers is in fact the GraphQLContext, which can be retrieved from the DataFetchingEnvironment. That's why they were exposed in the context, so the consumer could then grab the files in that resolver and handle them appropriately.

oliemansm

I've refactored the GraphQLContext implementation a bit to have a DefaultGraphQLContext, a DefaultGraphQLServletContext and a DefaultGraphQLWebSocketContext. Because when you use subscriptions you'll use the GraphQLWebSocketContext and when you use query/mutation you use the GraphQLServletContext.
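A rough sketch of how that layering could look from a consumer's point of view follows. It is illustrative only: the interface names mirror the ones mentioned above, but the method sets, the concrete classes and the `countUploads` helper are assumptions, and plain strings stand in for `javax.servlet.http.Part` and `javax.websocket.Session` so the example compiles on its own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class ContextLayeringSketch {

    // Transport-agnostic base: only what GraphQL execution itself needs.
    interface GraphQLContext {
        Optional<String> getSubject();
    }

    // Servlet flavour: adds data that only exists for HTTP requests.
    interface GraphQLServletContext extends GraphQLContext {
        List<String> getFileParts(); // stand-in for uploaded Part objects
    }

    // WebSocket flavour: adds data that only exists for subscriptions.
    interface GraphQLWebSocketContext extends GraphQLContext {
        String getSessionId(); // stand-in for the WebSocket Session
    }

    static final class ServletContext implements GraphQLServletContext {
        private final String subject;
        private final List<String> fileParts;

        ServletContext(String subject, List<String> fileParts) {
            this.subject = subject;
            this.fileParts = fileParts;
        }

        @Override public Optional<String> getSubject() { return Optional.ofNullable(subject); }
        @Override public List<String> getFileParts() { return fileParts; }
    }

    static final class WebSocketContext implements GraphQLWebSocketContext {
        private final String subject;
        private final String sessionId;

        WebSocketContext(String subject, String sessionId) {
            this.subject = subject;
            this.sessionId = sessionId;
        }

        @Override public Optional<String> getSubject() { return Optional.ofNullable(subject); }
        @Override public String getSessionId() { return sessionId; }
    }

    // What a resolver might do after pulling the context out of the environment:
    // narrow to the servlet flavour only when uploads are actually needed.
    static int countUploads(GraphQLContext context) {
        if (context instanceof GraphQLServletContext) {
            return ((GraphQLServletContext) context).getFileParts().size();
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(countUploads(new ServletContext("alice", Arrays.asList("a.png", "b.png"))));
        System.out.println(countUploads(new WebSocketContext("alice", "session-1")));
    }
}
```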

oliemansm · 2019-06-21T08:33:48Z

src/test/groovy/graphql/servlet/DataLoaderDispatchingSpec.groovy

+        return new GraphQLContextBuilder() {
+            @Override
+            GraphQLContext build(HttpServletRequest httpServletRequest, HttpServletResponse httpServletResponse, Map<String, List<Part>> fileParts) {
+                new DefaultGraphQLServletContext(registry(), null)


These tests are currently the only reason why DefaultGraphQLServletContext needs to be public. Preferably we can make that implementation class package private. Also, it seems the context currently cannot be built using the builder to pass in the registry and subject. I'm guessing that part is still work in progress?

I initially planned just the one static method that required the registry/subject, but added the other and hadn't yet added the with method. Will update. Previously the context objects were public. I could see it being useful to extend one without having to reimplement the base interface. Do we not want to allow that?

Ah right, but then we need to make the constructor protected instead of private. I usually try to keep things package private so they're not part of the public API and I'm free to mess around with them without breaking anything for consumers. But in this case they're in fact very simple POJOs, so we might as well make them public.
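To make the trade-off concrete, here is a small sketch of an immutable context created through a static factory and builder, with a protected constructor so that consumers can still subclass it. All names (`SimpleContext`, `createContext`, `withSubject`, `withRegistryName`, `TenantContext`) are hypothetical, and a string stands in for the data loader registry.

```java
import java.util.Optional;

public class ImmutableContextSketch {

    public static class SimpleContext {
        private final String subject;
        private final String registryName; // stand-in for a data loader registry

        // Protected so that subclasses outside this file can call it.
        protected SimpleContext(String subject, String registryName) {
            this.subject = subject;
            this.registryName = registryName;
        }

        // Static entry point that hands out a builder.
        public static Builder createContext() {
            return new Builder();
        }

        public Optional<String> getSubject() { return Optional.ofNullable(subject); }
        public Optional<String> getRegistryName() { return Optional.ofNullable(registryName); }

        public static class Builder {
            private String subject;
            private String registryName;

            public Builder withSubject(String subject) { this.subject = subject; return this; }
            public Builder withRegistryName(String registryName) { this.registryName = registryName; return this; }
            public SimpleContext build() { return new SimpleContext(subject, registryName); }
        }
    }

    // A consumer-side extension that reuses the base fields and adds one of its own.
    public static class TenantContext extends SimpleContext {
        private final String tenant;

        public TenantContext(String subject, String registryName, String tenant) {
            super(subject, registryName);
            this.tenant = tenant;
        }

        public String getTenant() { return tenant; }
    }

    public static void main(String[] args) {
        SimpleContext context = SimpleContext.createContext().withSubject("alice").build();
        System.out.println(context.getSubject().orElse("anonymous"));
        System.out.println(new TenantContext("bob", "loaders", "acme").getTenant());
    }
}
```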

oliemansm · 2019-06-21T08:37:10Z

src/main/java/graphql/servlet/context/DefaultGraphQLContextBuilder.java

+public class DefaultGraphQLContextBuilder implements GraphQLContextBuilder {
+
+    @Override
+    public GraphQLContext build(HttpServletRequest httpServletRequest, HttpServletResponse httpServletResponse, Map<String, List<Part>> fileParts) {


I see we're passing this fileParts all the way down to this builder after we've extracted them with some heavy logic within the AbstractGraphQLHttpServlet. We should extract that part into its own class for starters, but also: we might as well push the extraction all the way down to the builder where we construct the actual context. Seeing that the HttpServletRequest that we retrieve the parts from is passed down as well anyway. That would clean up a lot of intermediate methods that now have to pass along that fileParts parameter.

jmccaull · 2019-06-21T14:26:17Z

I've also found a bug with the dispatching instrumentation where it isn't dispatching as expected past the first level of data fetching. Not sure if this is something I introduced or already existed. Working on enhancing the specification test to catch this.

jmccaull · 2019-06-21T21:27:51Z

So I've pushed a rough fix for the dispatching behavior. Need to clean it up and have yet to get to work on the other issues.
While performance testing this, I realized there needed to be some sort of check for whether each request was at the same "dispatching level" as the others. Otherwise one would get fully processed, then as the others were getting there they would prematurely signal they were ready, and dispatching essentially went back to one at a time. I can't seem to replicate this behavior in a simple test, as I think everything simply gets resolved too quickly. I also realized that this check needed to take the shape of the queries into consideration, otherwise one that had less depth would cause others to hang. The best-case solution I could come up with perfectly dispatches queries that exactly match, or up until they differ. I achieved this by keeping track of the selection set with the dispatch state, and as soon as the selection set differs that comparison can no longer disqualify dispatching. This was fairly easy to create a test for and seems to be working.
A side effect of this is that the instrumentation is now a bit beefier. In our system we are heavily I/O limited, so this still ends up being a huge throughput gain. I could see some not needing this though, and I plan on implementing two more context options, one for each context type, that stop the servlet from adding this dispatching instrumentation (or some similar mechanism to avoid adding the instrumentation).
Please look over the instrumentation/tests and let me know what you think. Thanks!

Edit: Actually, removing the completed executions and re-evaluating the dispatches makes tracking the selection set unnecessary. I'm not sure what I was thinking. I still think an option to disable the instrumentation altogether is worthwhile.
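The idea in the edit above can be sketched roughly as follows: track the level each active request has reached, dispatch only when all active requests agree, and drop a request as soon as it completes so that a shallower query cannot block a deeper one. This is an illustration under simplifying assumptions, not the instrumentation in this PR. The class and method names are made up, a single integer stands in for the real call stack state, and thread safety is ignored.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DispatchTrackerSketch {

    // Highest level whose fetches have all been issued, per active request.
    private final Map<String, Integer> levelByRequest = new LinkedHashMap<>();

    void start(String requestId) {
        levelByRequest.put(requestId, 0);
    }

    void levelReady(String requestId, int level) {
        levelByRequest.put(requestId, level);
    }

    // Completed requests are removed so they no longer take part in the comparison.
    void complete(String requestId) {
        levelByRequest.remove(requestId);
    }

    // Dispatch only when every request still in flight has reached the same level.
    boolean allReady() {
        Integer expected = null;
        for (Integer level : levelByRequest.values()) {
            if (expected == null) {
                expected = level;
            } else if (!expected.equals(level)) {
                return false;
            }
        }
        return expected != null;
    }

    public static void main(String[] args) {
        DispatchTrackerSketch tracker = new DispatchTrackerSketch();
        tracker.start("shallow");
        tracker.start("deep");
        tracker.levelReady("shallow", 1);
        tracker.levelReady("deep", 1);
        System.out.println("level 1 ready: " + tracker.allReady());
        tracker.levelReady("deep", 2);
        System.out.println("deep at level 2, shallow still active: " + tracker.allReady());
        tracker.complete("shallow");
        System.out.println("after shallow completes: " + tracker.allReady());
    }
}
```

Without the `complete` call, the deeper query in this scenario would wait forever for the shallow one to reach level 2.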

hanjb · 2019-06-25T18:11:24Z

src/main/java/graphql/servlet/context/ContextSetting.java

+            case PER_REQUEST_WITHOUT_INSTRUMENTATION:
+                //Intentional fallthrough
+            case PER_REQUEST_WITH_INSTRUMENTATION:
+                return new PerRequestBatchedInvocationInput(requests, schema, contextSupplier.get(), root);


I think I get the difference between PerQueryBatchedInvocationInput and PerRequestBatchedInvocationInput, but I think it would be difficult for contextSupplier to provide a different context for PerQueryBatchedInvocationInput, since the contextSupplier does not have visibility into which query the context is required for.

It would be up to the person customizing the ContextBuilder to keep in mind which setting they are going to be using. Do you think they need some part of the input object that they can't get from the http objects?

Also, the intent is not for the context to depend on any property of the query, so I'm not sure it would be an issue.
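The difference between the two modes comes down to how often the supplier is invoked. A minimal sketch, assuming a simplified two-value enum rather than the four settings in this PR, could look like this. `Setting` and `contextsFor` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

public class ContextSettingSketch {

    enum Setting {
        // Each query in the batch gets its own freshly built context.
        PER_QUERY {
            @Override
            <C> List<C> contextsFor(int queryCount, Supplier<C> contextSupplier) {
                List<C> contexts = new ArrayList<>();
                for (int i = 0; i < queryCount; i++) {
                    contexts.add(contextSupplier.get());
                }
                return contexts;
            }
        },
        // The supplier is called once and the result is shared by every query in the batch.
        PER_REQUEST {
            @Override
            <C> List<C> contextsFor(int queryCount, Supplier<C> contextSupplier) {
                C shared = contextSupplier.get();
                return Collections.nCopies(queryCount, shared);
            }
        };

        abstract <C> List<C> contextsFor(int queryCount, Supplier<C> contextSupplier);
    }

    public static void main(String[] args) {
        List<Object> perQuery = Setting.PER_QUERY.contextsFor(2, Object::new);
        List<Object> perRequest = Setting.PER_REQUEST.contextsFor(2, Object::new);
        System.out.println("per query shares context: " + (perQuery.get(0) == perQuery.get(1)));
        System.out.println("per request shares context: " + (perRequest.get(0) == perRequest.get(1)));
    }
}
```

In both cases the supplier receives no information about the individual query, which matches the point made in the replies above.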

jmccaull · 2019-06-25T19:21:54Z

Change log/migration instructions for this PR:

Introduced a more defined package structure; import statements will need to be updated.
BatchExecutionHandler removed and replaced with the simpler BatchInputPreProcessor interface.
- BatchInputPreProcessor removes the dependency on http objects and no longer requires implementations to handle writing the result.
- BatchInputPreProcessResult wraps either the batch to execute or the status code/message to return to the client.
GraphQLContext has been split into three interfaces: GraphQLContext, GraphQLServletContext and GraphQLWebSocketContext.
- The default context implementations have been made immutable and can be created using builders obtained with static methods. See DefaultGraphQLContextBuilder for examples.
Four new context settings introduced to align more closely with Apollo-link-batch. "PER_QUERY_WITH_INSTRUMENTATION" is the default because it most closely resembles previous behavior. See readme.md for details.
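As an illustration of the "batch or status code/message" shape described in the change log, the sketch below shows a hypothetical pre-processor that rejects oversized batches. The `Result` class, its factory methods and `limitBatchSize` are assumptions made for this example and are not the library's actual API. A list of strings stands in for the batched invocation input.

```java
import java.util.Arrays;
import java.util.List;

public class PreProcessResultSketch {

    // Holds either the batch to execute or the status code/message to send back.
    static final class Result {
        private final List<String> batch;
        private final boolean executable;
        private final int statusCode;
        private final String message;

        private Result(List<String> batch, boolean executable, int statusCode, String message) {
            this.batch = batch;
            this.executable = executable;
            this.statusCode = statusCode;
            this.message = message;
        }

        static Result execute(List<String> batch) {
            return new Result(batch, true, 200, null);
        }

        static Result reject(int statusCode, String message) {
            return new Result(null, false, statusCode, message);
        }

        boolean isExecutable() { return executable; }
        List<String> getBatch() { return batch; }
        int getStatusCode() { return statusCode; }
        String getMessage() { return message; }
    }

    // A pre-processor only decides; writing the response is left to the caller.
    static Result limitBatchSize(List<String> queries, int maxQueries) {
        if (queries.size() > maxQueries) {
            return Result.reject(400, "Batch of " + queries.size() + " exceeds limit of " + maxQueries);
        }
        return Result.execute(queries);
    }

    public static void main(String[] args) {
        Result result = limitBatchSize(Arrays.asList("{ a }", "{ b }", "{ c }"), 2);
        System.out.println(result.isExecutable() + " " + result.getStatusCode() + " " + result.getMessage());
    }
}
```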

hanjb · 2019-06-25T20:04:40Z

src/main/java/graphql/servlet/instrumentation/RequestStack.java

+     * @return if all managed executions are ready to be dispatched.
+     */
+    public boolean allReady() {
+        List<Integer> dispatchStack = activeRequests.values().stream().findFirst().map(CallStack::getDispatchedLevels).orElse(Collections.emptyList());


not quite sure about the logic here, if activeRequests contains multiple elements, would findFirst() still make sense ?

Each time a level of the query hierarchy is started, one of the active stacks will get incremented above the others. Because the futures are created serially, once that one finishes resolving they will all evaluate as ready according to the first part of the if. To make sure we only dispatch once each active query has created its load calls for the current level, we just need to make sure that they all evaluate to ready and also have the same dispatch level completed. So it doesn't matter which one we start with; they just all need to be equal.

I guess in this context the first part of the if means there are no fetch calls that are unfinished, and the second part is that each query has been processed for this step of the hierarchy. The instrumentation removes active requests when it instruments the execution result, which prevents deeper queries from hanging while they wait on a request that will never execute to that depth. Does that help?

I guess I could do something like .stream().distinct().count() <= 1 and it would be clearer
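A small sketch of that simplification: if the dispatched-level lists of all active requests are collected into one stream, `distinct().count() <= 1` is true exactly when they are all equal (or when there are none). The helper name `sameDispatchedLevels` is hypothetical, and plain lists of integers stand in for the values returned by `CallStack::getDispatchedLevels`.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class AllReadySketch {

    // True when every active request reports the same dispatched levels.
    // List.equals compares element by element, which is what distinct() relies on.
    static boolean sameDispatchedLevels(Collection<List<Integer>> dispatchedLevelsPerRequest) {
        return dispatchedLevelsPerRequest.stream().distinct().count() <= 1;
    }

    public static void main(String[] args) {
        List<List<Integer>> aligned = Arrays.asList(Arrays.asList(1, 2), Arrays.asList(1, 2));
        List<List<Integer>> skewed = Arrays.asList(Arrays.asList(1, 2), Arrays.asList(1));
        System.out.println(sameDispatchedLevels(aligned));
        System.out.println(sameDispatchedLevels(skewed));
        System.out.println(sameDispatchedLevels(Collections.emptyList()));
    }
}
```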

jmccaull added 2 commits June 17, 2019 16:04

initial work up of per query/per request scoped context and dispatching instrumentation

bbc3e43

refactor of Abstract classes, context and packages with a little clean up

eda1e89

jmccaull mentioned this pull request Jun 19, 2019

Batch query performance and context options #190

Closed

jmccaull added 4 commits June 19, 2019 16:37

some javadoc

bf6bca8

simple code analysis

3f47a79

Merge branch 'master' into query_invoker_refactor

6499056

readme update for settings

83384ec

oliemansm reviewed Jun 20, 2019

jmccaull and others added 4 commits June 20, 2019 16:16

Added a web context interface, removed optional

894476f

Merge branch 'master' into query_invoker_refactor

e60bc5d

merge conflicts

a2eb676

Refactored contexts specifically for servlet and websocket

8ed10e0

oliemansm reviewed Jun 21, 2019

Initial work up of fix for dispatching instrumentation

d186d8f

jmccaull added 3 commits June 21, 2019 19:16

removed selection set tracking

8f5dad2

added two new context settings that do not add dispatching instrumentation

0678722

Renamed some static methods, removed file parts from context builder

a23067a

jmccaull commented Jun 25, 2019

src/main/java/graphql/servlet/AbstractGraphQLHttpServlet.java

renamed test class

4e20b75

hanjb reviewed Jun 25, 2019

moved supplier get call into PerRequestBatchedInvocationInput, added some context setting instrumentation tests

b855d63

jmccaull changed the title ~~WIP: Configurable per query or per request context/dataloader batching~~ Configurable per query or per request context/dataloader batching Jun 25, 2019

hanjb reviewed Jun 25, 2019

made deafult contexts public classes, simplified stream logic in request stack

c5f7792

jmccaull mentioned this pull request Jun 27, 2019

Update to version of servlet with context settings graphql-java-kickstart/graphql-spring-boot#263

Merged

Bumped version number to 6.0.0 prior to release

e2d590e

oliemansm merged commit 373869b into graphql-java-kickstart:master Jun 27, 2019
