Oscar Funes

Starting with GraphQL: Caveats - Part 1

April 04, 2018

GraphQL is a query language for your API, and a server-side runtime for executing queries by using a type system you define for your data.

I recently started playing with GraphQL to interact with the different microservices available at the company I work for. I’ve been going through the documentation, videos, and the spec. It’s certainly an interesting technology to work with, especially because it fits what we’re trying to achieve: aggregating different resources and creating specific payloads for our current front-end clients (web and mobile).

So, what’s so special about GraphQL? For me, it’s a new way to request JSON from the backend without worrying too much about what shape the response will take. In essence, you get what you asked for. Take a look at the hello world.

How you would write a query in your client:

{
  hello
}

And the response you would obtain:

{
  "data": {
    "hello": "Hello world!"
  }
}

The reality is that if you’re tying together an application more complicated than a hello world, there are some things to take note of before diving into building it. The GraphQL team has left some concerns for user-land to implement and think about. That makes sense, since they state that GraphQL is just a query language, which means it is only a protocol for exchanging data.

Authentication - Authorization

Most probably, you’re already using some method for authenticating and authorizing users within your system, and you can reuse it. The GraphQL server libraries for Node.js allow you to dynamically create the GraphQL configuration per request. For example, with express-graphql:

app.use('/graphql', graphqlHTTP(async (request, response, graphQLParams) => ({
  schema: MyGraphQLSchema,
  // Build the root value per request, e.g. from credentials set by earlier middleware
  rootValue: await someFunctionToGetRootValue(request),
  graphiql: true
})));

If you have already defined your preferred method for authentication (OAuth, JWT, cookies, etc.), that middleware should run before the GraphQL endpoint and do the authentication. You can then pass the credentials along so that each resolver can check whether the user is authorized.
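The per-resolver check can be sketched in plain JavaScript. This is only an illustration under assumptions: the helper name requireRole, the context.user object, and its roles array are all hypothetical, and the resolvers are called by hand instead of by a GraphQL executor.

```javascript
// Hypothetical helper: wraps a resolver so it only runs for users with a given role.
// The (parent, args, context, info) argument order follows the graphql-js resolver convention.
function requireRole(role, resolver) {
  return (parent, args, context, info) => {
    // Assumption: earlier authentication middleware put the user on the context.
    const user = context && context.user;
    if (!user) {
      throw new Error('Not authenticated');
    }
    if (!user.roles.includes(role)) {
      throw new Error('Not authorized');
    }
    return resolver(parent, args, context, info);
  };
}

// Example resolver map: `salary` is only visible to users with the "hr" role.
const resolvers = {
  Employee: {
    name: (employee) => employee.name,
    salary: requireRole('hr', (employee) => employee.salary),
  },
};

// Simulated calls; in a real server the GraphQL executor would invoke the resolvers.
const employee = { name: 'Ada', salary: 1000 };
const hrContext = { user: { id: 1, roles: ['hr'] } };
const guestContext = { user: { id: 2, roles: [] } };

console.log(resolvers.Employee.salary(employee, {}, hrContext)); // 1000
try {
  resolvers.Employee.salary(employee, {}, guestContext);
} catch (err) {
  console.log(err.message); // Not authorized
}
```

Wrapping resolvers keeps the authorization rule next to the field it protects, while authentication itself stays in the middleware you already have.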

Pagination

For pagination, Facebook prefers using cursors, as specified in their Relay specification. Much of the community treats Relay as the default reference for concerns outside the scope of GraphQL itself, such as this one. Relay expects a query to return the cursor information, like this:

{
  user {
    id
    name
    friends(first: 10, after: "opaqueCursor") {
      edges {
        cursor
        node {
          id
          name
        }
      }
      pageInfo {
        hasNextPage
      }
    }
  }
}

Let’s walk through this. The first argument is how many items to return, and after is an opaque cursor marking where to start; in this example, the query asks for the ten friends that follow that cursor. Without after, you would get the first ten elements of the list.

The friends field is a wrapper over your “real” friends. It contains edges, which is another list; each edge wraps a friend in its node field, where you make available the fields you can access for your users. Each edge also has a cursor (an id), so you can refer to that position later. The pageInfo field is a container for metadata, in this case to know whether you’re at the end of the list.

I think Facebook prefers cursors because they’re a better fit for the infinite scroll experience they offer in their applications. In principle, you can implement pagination however you see fit, but you’ll often see people referring to Relay for “best” practices around GraphQL.
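To make the edges, cursor, and pageInfo shape concrete, here is a minimal sketch that paginates an in-memory array. The paginate helper and the cursor encoding (a base64-encoded id) are assumptions for illustration; a real backend would usually translate the cursor into a database query.

```javascript
// Cursors are opaque to the client; here they are just base64-encoded ids (an assumption).
const encodeCursor = (id) => Buffer.from(`cursor:${id}`).toString('base64');
const decodeCursor = (cursor) =>
  Buffer.from(cursor, 'base64').toString('utf8').replace(/^cursor:/, '');

// Hypothetical helper: builds a Relay-style connection from an in-memory array.
function paginate(items, { first, after }) {
  // Start right after the item the cursor points to, or at the beginning of the list.
  let start = 0;
  if (after) {
    const afterId = decodeCursor(after);
    const index = items.findIndex((item) => String(item.id) === afterId);
    if (index === -1) {
      throw new Error('Unknown cursor');
    }
    start = index + 1;
  }
  const slice = items.slice(start, start + first);
  return {
    edges: slice.map((node) => ({ cursor: encodeCursor(node.id), node })),
    pageInfo: { hasNextPage: start + slice.length < items.length },
  };
}

const friends = [
  { id: 1, name: 'Ana' },
  { id: 2, name: 'Ben' },
  { id: 3, name: 'Cy' },
];

// First page, then a second page starting after the last cursor of the first.
const page1 = paginate(friends, { first: 2 });
const lastCursor = page1.edges[page1.edges.length - 1].cursor;
const page2 = paginate(friends, { first: 2, after: lastCursor });

console.log(page1.edges.map((edge) => edge.node.name), page1.pageInfo); // [ 'Ana', 'Ben' ] { hasNextPage: true }
console.log(page2.edges.map((edge) => edge.node.name), page2.pageInfo); // [ 'Cy' ] { hasNextPage: false }
```

Because the client only ever passes back a cursor it received, the server is free to change how cursors are encoded without breaking clients.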

Caching

Caching is a “sensitive” subject for HTTP/RESTful advocates, because under the hood queries are typically sent through a POST, which is not cacheable by default.

$ curl -XPOST http://localhost:8080/graphql \
  -H 'Content-Type: application/graphql' \
  -d 'query Root{ hello }'

The HTTP/1.1 specification says of POST: “Responses to this method are not cacheable, unless the response includes appropriate Cache-Control or Expires header fields.”

Let’s look first at caching from the client side. I think one of the most popular clients is Apollo Client, which implements a very effective caching strategy. The general idea is that it caches the graph: if you request the same query (a graph path), you are likely requesting the same data, so you get it from the cache and no request goes out.

A more interesting form of caching is caching by node. Since you have to think of your data as a graph, when the response of a query comes back, each node has an id assigned. That way, even if you make a different query, if it references the same node you’ll get that node from the cache. This also works for mutations: when the response comes back, it references the same id and updates the client cache.

As with pagination, Relay also has excellent information about client caching. In essence, it works similarly to the Apollo Client idea of caching the graph.
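The idea of caching by node can be sketched with a small normalized store. This is not how Apollo Client or Relay are actually implemented; the NodeCache class and the "Type:id" key format are assumptions for illustration, and the sketch only stores scalar fields instead of replacing nested objects with references.

```javascript
// Hypothetical normalized cache: every object with an id is stored once, keyed by "Type:id".
class NodeCache {
  constructor() {
    this.nodes = new Map();
  }

  // Walk a response and merge every identifiable object into the store.
  write(value) {
    if (Array.isArray(value)) {
      value.forEach((item) => this.write(item));
    } else if (value && typeof value === 'object') {
      if (value.__typename && value.id !== undefined) {
        const key = `${value.__typename}:${value.id}`;
        // Keep scalar fields only; nested objects are stored under their own keys.
        const scalars = Object.fromEntries(
          Object.entries(value).filter(([, v]) => v === null || typeof v !== 'object')
        );
        this.nodes.set(key, { ...this.nodes.get(key), ...scalars });
      }
      Object.values(value).forEach((child) => this.write(child));
    }
  }

  read(typename, id) {
    return this.nodes.get(`${typename}:${id}`);
  }
}

const cache = new NodeCache();

// Response of a first query: the user and a nested friend are both cached.
cache.write({
  user: {
    __typename: 'User',
    id: 1,
    name: 'Ana',
    friends: [{ __typename: 'User', id: 2, name: 'Ben' }],
  },
});

// A different query touching the same node adds fields to the same entry.
cache.write({ node: { __typename: 'User', id: 2, email: 'ben@example.com' } });

// A mutation response with the same id updates the cached node.
cache.write({ renameUser: { __typename: 'User', id: 2, name: 'Benjamin' } });

console.log(cache.read('User', 2));
```

After the three writes, the entry for user 2 holds the email from the second query and the name from the mutation, even though no single response contained both.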

REST

I think this discussion shouldn’t happen, or maybe it should (?). Each approach has its place. You can achieve the same benefits with REST, and some will even say that REST has the advantage of caching at the HTTP layer. I’ve seen people argue that JSON:API is as good as (or better than) GraphQL; there’s also the Ion Hypermedia Type, and your company probably has its own standard for dealing with REST services.

That, I think, highlights the advantage GraphQL has over REST: there are not infinite ways to define your API. You just have strongly typed queries, and the response will match the query. It’s not like designing an API and later deciding to add HATEOAS or other metadata. The design specs mentioned above have the benefit of an append-only strategy, which avoids breaking changes going forward, but it doesn’t limit the payload size, which can grow without bound while response times creep up without your clients noticing.

Conclusion

While I agree that GraphQL has drawbacks, its benefits are especially valuable for organizations that need to “stitch” together downstream services so that front-end clients can build interfaces that better match the data. That holds even more if your data is hierarchical by nature and benefits from modeling all these relationships in a single query. Are you using GraphQL already? Have you had problems with the things I mentioned above? Please let me know in the comments!


I'm a software architect who enjoys helping people, building platforms, and working in distributed systems at the intersection between people and software.