Zombie Zen

Apollo Client Caching

By Ross Light

Since I started working at Clutter, I’ve grown to enjoy the Apollo React Client library. However, the client’s caching behavior can be difficult to understand and can cause surprising issues, so I’ve collected my findings into one easily digestible post. This post assumes a basic understanding of GraphQL.

The major hidden component of Apollo Client is the cache, which sits between every query and the backend server.

Basics

Let’s assume we have a schema like this:

type Query {
  me: User
  me2: User
}

type User {
  id: ID!
  name: String
}

First, whenever you write a query like this:

query {
  me {
    id
    name
  }
}

Apollo translates it into a query like this:

query {
  me {
    __typename # << added by Apollo
    id
    name
  }
}

Apollo’s cache is a key-value store that is keyed by object type and ID. Gathering the __typename field is crucial for Apollo to cache the object. If a query doesn’t specify an ID field, then the object is stored in the cache as part of the object’s parent. So once this query gets data back from the server, the cache will look like this:

The me field is stored on the ROOT_QUERY and the User object is stored as a separate key in the cache.
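The keying scheme described above can be sketched as a small function. This is a simplified model rather than Apollo's actual code: the name `toCacheKey` is hypothetical, and it assumes the identifier field is always called `id`.

```javascript
// Hypothetical helper: a simplified model of cache key derivation.
// Assumes the identifier field is named `id`.
function toCacheKey(object) {
  const { __typename, id } = object;
  if (__typename === undefined || id === undefined || id === null) {
    // No usable identity: the object would be stored inline,
    // as part of its parent, instead of under its own key.
    return null;
  }
  return `${__typename}:${id}`;
}

console.log(toCacheKey({ __typename: "User", id: "123", name: "Alice" })); // prints User:123
console.log(toCacheKey({ __typename: "User", name: "Alice" })); // prints null
```

The second call shows why leaving the identifier out of a query matters: without it, there is no key under which the object can be shared.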

If another query ends up reading the same value, then the cache entry is linked. For example, if we queried a different field:

query {
  me2 {  # << different field
    id
    name
  }
}

…and the server returned a User with the same ID:

{
  "me2": {
    "__typename": "User",
    "id": "123",
    "name": "Alice Doe"
  }
}

Then the cache would link to and update the existing cache entry:

Both the me and me2 fields link to the same User cache entry.

This automatic linking is super cool, because if another React component is on screen and reading from me.name, it will automatically trigger an update to show the new value! You can even teach Apollo to look at the cache first for ID lookups. In general, if an object you’re querying for has an identifier, you should always include the identifier field in the query.
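To see why linking makes updates propagate, here is a toy model of a normalized cache. It is a sketch and not Apollo's implementation: `writeQueryResult` and `readField` are hypothetical names, and the model assumes every top-level field returns an object with `__typename` and `id`.

```javascript
// Toy model of a normalized cache. Not Apollo's implementation.
// Assumes every top-level field value has __typename and id.
function writeQueryResult(cache, data) {
  for (const [field, value] of Object.entries(data)) {
    const key = `${value.__typename}:${value.id}`;
    // Merge the new fields into the existing entry, if there is one.
    cache[key] = { ...cache[key], ...value };
    // The root field stores only a reference to the entry.
    cache.ROOT_QUERY[field] = { ref: key };
  }
}

// Follow the reference stored on the root field.
function readField(cache, field) {
  return cache[cache.ROOT_QUERY[field].ref];
}

const cache = { ROOT_QUERY: {} };
writeQueryResult(cache, {
  me: { __typename: "User", id: "123", name: "Alice" },
});
writeQueryResult(cache, {
  me2: { __typename: "User", id: "123", name: "Alice Doe" },
});

// Reading through `me` sees the value written through `me2`.
console.log(readField(cache, "me").name); // prints Alice Doe
```

Because both root fields hold a reference to the same entry, a write through either one is visible through the other.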

In-Place Mutations

For mutations that update an existing object, this linking mechanism makes most cache updates very straightforward to write. If the mutation response includes the object’s identifier along with the fields that changed, then Apollo will use the identifier to update the cache. For example, if we had the following mutation:

type Mutation {
  updateMyName(name: String!): User
}

…and we executed a mutation like this:

mutation {
  updateMyName(name: "Alice Doe!") {
    id
    name
  }
}

Apollo will transparently update every other query that reads this User, since the returned object has the same identifier.

Mutating Lists

Adding and removing from a list is where using Apollo Client becomes more difficult. So let’s say we had a list of all users and a mutation to register a user:

type Query {
  # ...

  users: [User!]!
}

type Mutation {
  # ...

  registerUser(name: String!): User
}

If we execute a mutation like this:

mutation RegisterUserMutation {
  registerUser(name: "Carol") {
    id
    name
  }
}

Apollo has no way of knowing that this new user should appear in the Query.users field. So how do we tell it to do that? There are two ways:

  1. Specify the queries to reload with refetchQueries. This requires the least amount of work, but will always require another round-trip to the server.
  2. Provide an update function. This is harder, but allows you to change the cache without hitting the server, thus improving your application’s performance and client-side data usage.

Both of these approaches involve crafting queries that fetch the data you want to invalidate in the cache, but the latter also requires you to perform the update yourself.

Here’s what the refetchQueries approach might look like:

const [mutate] = useMutation(
  RegisterUserMutation,
  {
    refetchQueries: [{
      query: gql`
        query {
          users {
            id
            name
          }
        }
      `,
    }],
  },
);
// ...
mutate();

And here’s what the update approach might look like:

const UsersQuery = gql`
  query UsersQuery {
    users {
      id
    }
  }
`;

const [mutate] = useMutation(
  RegisterUserMutation,
  {
    update: (store, { data }) => {
      // Check that the server successfully created the user.
      if (!data || !data.registerUser) {
        return;
      }
      const newUser = data.registerUser;

      // Read the existing Query.users field from the cache.
      let users;
      try {
        users = store.readQuery({query: UsersQuery}).users;
      } catch {
        // readQuery throws an error if this data isn't in
        // the cache already. If it hasn't been read, then
        // we don't need to update it.
        return;
      }

      // Write the updated list into the cache.
      store.writeQuery({
        query: UsersQuery,
        data: {
          users: [...users, newUser],
        },
      });
    },
  },
);
// ...
mutate();

As you can see, there’s quite a bit more code in the update approach, and some of the server’s logic (here, where a new user lands in the list) ends up replicated on the client. However, this effort results in an incredibly smooth user experience, even with very decoupled UI components.
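One way to keep the replicated logic manageable is to pull the list manipulation out into a pure function that can be tested without Apollo. The sketch below is an assumption on my part, not something Apollo provides: `appendUnique` is a hypothetical helper, and the duplicate check is an extra guard in case the cached list already contains the new user.

```javascript
// Hypothetical helper, not part of Apollo Client.
// Returns a new list with `newUser` appended, unless a user with
// the same id is already present.
function appendUnique(users, newUser) {
  if (users.some((user) => user.id === newUser.id)) {
    return users;
  }
  return [...users, newUser];
}

const existing = [{ id: "1" }, { id: "2" }];
console.log(appendUnique(existing, { id: "3" }).length); // prints 3
console.log(appendUnique(existing, { id: "2" }).length); // prints 2
```

Inside the update function, the data written back would then be `{ users: appendUnique(users, newUser) }` in place of the inline spread.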

Mutating Paginated Lists

Paginated lists are where things get really difficult. Let’s say that our users must be fetched in batches instead of in one go:

type Query {
  # ...

  users(offset: Int = 0): [User!]!
}

(There are better ways to paginate, but I’m keeping it simple for demonstration purposes.)

If we were to try to apply the same techniques as we saw in the previous section, we’d immediately hit a snag: we need to update the users field for all possible offsets that it would appear in. To get around this, Apollo provides a @connection directive that collapses all of the individual page requests into a single cache entry:

query($offset: Int!) {
  users(offset: $offset) @connection(key: "users") {
    id
    name
  }
}
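To illustrate what "collapsing" means here, the following sketch models how a field's cache key might be built with and without the directive. It is a simplification and not Apollo's actual code: `storeFieldName` and its `connectionKey` parameter are hypothetical names.

```javascript
// Simplified model of how a field's cache key could be built.
// `connectionKey` stands in for the key passed to @connection.
function storeFieldName(fieldName, args, connectionKey) {
  if (connectionKey !== undefined) {
    // With @connection, the arguments no longer take part in the key.
    return connectionKey;
  }
  // Without it, every distinct set of arguments gets its own key.
  return `${fieldName}(${JSON.stringify(args)})`;
}

console.log(storeFieldName("users", { offset: 0 })); // prints users({"offset":0})
console.log(storeFieldName("users", { offset: 20 })); // prints users({"offset":20})
console.log(storeFieldName("users", { offset: 20 }, "users")); // prints users
```

With a single key, an update function only has one cache entry to read and write, regardless of which offsets have been fetched.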

For list fields whose parameters are low-cardinality, this works fine. However, imagine that we had a query field like this:

type Query {
  # ...

  usersRegisteredAfterDate(date: Date, offset: Int = 0): [User!]!
}

Unfortunately, I have not found a good way to invalidate the affected cache entries. Even more unfortunately, this appears to be an issue dating back to 2017. See apollographql/react-apollo#708 and apollographql/apollo-client#2991 for details. apollo-link-watched-mutation appears to be a package that can work around this by creating update functions that apply for each (mutation, query) pair. However, I’ve not used it yet because I haven’t found much documentation for it, and it doesn’t seem to have much usage.

Instead, what I’ve done in these situations is mark the query as fetchPolicy: 'network-only' and pass the query’s operation name to the mutation’s refetchQueries. For example:

const { data } = useQuery(
  gql`
    query NewUsersQuery($date: Date, $offset: Int!) {
      usersRegisteredAfterDate(date: $date, offset: $offset) {
        id
        name
      }
    }
  `,
  {
    variables: {
      date: today,
      offset: 0,
    },
    // Always read from the network, never from the cache.
    fetchPolicy: 'network-only',
  },
);

const [mutate] = useMutation(
  RegisterUserMutation,
  {
    refetchQueries: ['NewUsersQuery'],
  },
);

While this has the downside of making a network request on every component mount, it has the advantage that your users won’t see inconsistent results across pages. The fetchPolicy forces initial loads to ignore the potentially inconsistent cache and refetchQueries triggers any live queries to reload their data from the server.
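The reason both settings are needed can be shown with a toy model of refetching by name. This is only a sketch under my own assumptions: the `LiveQueries` class and its methods are hypothetical and do not correspond to Apollo APIs.

```javascript
// Toy model of refetching live queries by operation name.
// Hypothetical names throughout; not Apollo's implementation.
class LiveQueries {
  constructor() {
    this.byName = new Map();
  }

  // Register a mounted query and the function that reloads it.
  watch(name, refetch) {
    const list = this.byName.get(name) || [];
    list.push(refetch);
    this.byName.set(name, list);
  }

  // Reload every mounted query with one of the given names.
  // Returns how many queries were reloaded.
  refetchByName(names) {
    let count = 0;
    for (const name of names) {
      for (const refetch of this.byName.get(name) || []) {
        refetch();
        count += 1;
      }
    }
    return count;
  }
}

const registry = new LiveQueries();
registry.watch("NewUsersQuery", () => console.log("reloading NewUsersQuery"));

// Only queries that are currently mounted get reloaded.
console.log(registry.refetchByName(["NewUsersQuery"])); // prints 1
console.log(registry.refetchByName(["SomeUnmountedQuery"])); // prints 0
```

In this model, a query that is not mounted when the mutation runs is never reloaded, which is the gap that the network-only fetch policy covers on the next mount.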

Wrapping Up

We’ve seen how to use Apollo’s cache layer effectively to provide a consistent view of your application’s data while minimizing network round-trips. While it requires some extra thought to update the cache correctly, the effort is well worth it for the performance and consistency improvements to your application. Understanding the fundamentals helps avoid confusing bugs.