GraphQL with NestJS

Integrating GraphQL into a NestJS application using Apollo Server or other GraphQL libraries.


Data Loaders for Optimized Data Fetching in NestJS and GraphQL

What are Data Loaders?

Data Loaders are a technique for optimizing data fetching in GraphQL applications, including those built with NestJS. They help prevent the notorious N+1 problem, which can severely degrade performance when object graphs are deep and relationships are numerous. Essentially, a Data Loader is a caching and batching mechanism: it defers individual lookups until all the requests made in the current tick of execution have been collected, and then issues them together as a single batched query.

Think of it like ordering coffee for a group. Instead of each person individually going to the barista and ordering (N+1), a Data Loader acts like a designated coffee runner. People tell the runner what they want (request data), the runner waits for everyone to decide, then places a single, larger order (batches requests) from the barista, significantly reducing the overall waiting time and barista workload.

Understanding and Implementing Data Loaders to Avoid the N+1 Problem

The N+1 problem occurs when you fetch a list of N items with one query and then, for *each* item in that list, issue another query to fetch related data, for a total of N+1 database queries. Without a Data Loader, a GraphQL resolver might issue one query to fetch a list of users and then a separate query per user to fetch that user's posts: 100 users means 101 queries. This quickly becomes inefficient as lists grow.
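
The effect is easy to reproduce without a real database. The sketch below uses hypothetical in-memory tables and a `queryCount` counter (all names are illustrative and not part of NestJS or any library) to count the round trips a naive resolver would make:

```typescript
// In-memory stand-ins for two tables; `queryCount` tracks simulated round trips.
interface User { id: number; name: string }
interface Post { id: number; userId: number; title: string }

const users: User[] = [
  { id: 1, name: 'Ada' },
  { id: 2, name: 'Linus' },
  { id: 3, name: 'Grace' },
];
const posts: Post[] = [
  { id: 10, userId: 1, title: 'Hello' },
  { id: 11, userId: 1, title: 'Again' },
  { id: 12, userId: 3, title: 'Compilers' },
];

let queryCount = 0;

async function findAllUsers(): Promise<User[]> {
  queryCount++; // 1 query for the list
  return users;
}

async function findPostsByUserId(userId: number): Promise<Post[]> {
  queryCount++; // +1 query for every user in the list
  return posts.filter(post => post.userId === userId);
}

// Naive resolution: one query for the list, then one more per item.
async function resolveUsersWithPosts(): Promise<Array<User & { posts: Post[] }>> {
  const list = await findAllUsers();
  return Promise.all(
    list.map(async user => ({ ...user, posts: await findPostsByUserId(user.id) })),
  );
}
```

With the three users above, one call to `resolveUsersWithPosts()` leaves `queryCount` at 4 (1 + 3).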

How Data Loaders Solve the N+1 Problem

Data Loaders solve this by:

  • Batching: They collect all individual requests for the same type of data into a single batch.
  • Caching: They cache the results of the batched query, so subsequent requests for the same data are served from the cache, eliminating redundant database calls.
  • Deduplication: If the same data is requested multiple times within the same request cycle, the Data Loader only fetches it once.
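
These three behaviours can be observed with a very small stand-in loader. The following is a minimal sketch, not the `dataloader` package: `TinyLoader`, `squareLoader`, and `batchCalls` are hypothetical names, and the microtask-based scheduling is a simplifying assumption.

```typescript
type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>;

class TinyLoader<K, V> {
  private cache = new Map<K, Promise<V>>();
  private queue: Array<{ key: K; resolve: (value: V) => void; reject: (error: unknown) => void }> = [];

  constructor(private readonly batchFn: BatchFn<K, V>) {}

  load(key: K): Promise<V> {
    const cached = this.cache.get(key);
    if (cached) return cached; // caching and deduplication: reuse the stored promise

    const promise = new Promise<V>((resolve, reject) => {
      // The first queued key schedules one flush as a microtask,
      // which runs after the current synchronous code has finished.
      if (this.queue.length === 0) {
        Promise.resolve().then(() => this.flush());
      }
      this.queue.push({ key, resolve, reject });
    });
    this.cache.set(key, promise);
    return promise;
  }

  private async flush(): Promise<void> {
    const batch = this.queue;
    this.queue = [];
    try {
      // Batching: one call for every key collected so far.
      const values = await this.batchFn(batch.map(item => item.key));
      batch.forEach((item, index) => item.resolve(values[index]));
    } catch (error) {
      batch.forEach(item => {
        this.cache.delete(item.key); // do not keep failed loads in the cache
        item.reject(error);
      });
    }
  }
}

// Hypothetical batch function that records the keys of every call it receives.
const batchCalls: number[][] = [];
const squareLoader = new TinyLoader<number, number>(async keys => {
  batchCalls.push([...keys]);
  return keys.map(key => key * key);
});

async function demo(): Promise<number[]> {
  // Three loads in the same tick, one of them a duplicate.
  const first = await Promise.all([
    squareLoader.load(2),
    squareLoader.load(3),
    squareLoader.load(2),
  ]);
  // A later load of an already known key is answered from the cache.
  const again = await squareLoader.load(3);
  return [...first, again];
}
```

Calling `demo()` once resolves to `[4, 9, 4, 9]`, while `batchCalls` ends up as `[[2, 3]]`: four loads, one batch, and no repeated key.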

Implementing a Data Loader

Here's a simplified example of how a Data Loader works internally (plain JavaScript; the `dataloader` package used later in this guide is more thorough):

class DataLoader {
    constructor(batchLoadFn) {
        this.batchLoadFn = batchLoadFn;
        this.cache = new Map();   // key -> promise of the value
        this.queue = [];          // requests waiting for the next batch
        this.batching = false;    // true while a batch is scheduled
    }

    load(key) {
        if (this.cache.has(key)) {
            return this.cache.get(key); // Serve from cache (also deduplicates pending loads)
        }

        const promise = new Promise((resolve, reject) => {
            this.queue.push({ key, resolve, reject });

            // Schedule one batch for all keys requested in the current tick
            if (!this.batching) {
                this.batching = true;
                setTimeout(() => this.executeBatch(), 0); // Runs once the current call stack has cleared
            }
        });

        // Cache the promise itself so repeated loads of the same key share one request
        this.cache.set(key, promise);
        return promise;
    }

    async executeBatch() {
        this.batching = false;
        const batch = this.queue;
        this.queue = [];

        const keys = batch.map(item => item.key);

        try {
            const results = await this.batchLoadFn(keys);

            // The results must be in the same order as the keys
            batch.forEach((item, index) => item.resolve(results[index]));
        } catch (error) {
            batch.forEach(item => {
                this.cache.delete(item.key); // Do not cache failed loads
                item.reject(error);
            });
        }
    }
}

// Example batchLoadFn (replace with your actual database query).
// `db` stands in for your database client; whether `IN (?)` expands an
// array parameter depends on the driver you use.
async function batchLoadUsers(userIds) {
    // Fetch users from the database in a single query based on userIds
    const users = await db.query('SELECT * FROM users WHERE id IN (?)', [userIds]);

    // Important:  Ensure the order of the returned users matches the order of userIds.
    // This usually involves creating a map to index users by ID, then iterating through userIds.
    const userMap = new Map(users.map(user => [user.id, user]));
    return userIds.map(userId => userMap.get(userId) || null); // Returns array of users or null if not found
}

// Usage
const userLoader = new DataLoader(batchLoadUsers);

// Example usage within a resolver:
async function getUser(userId) {
    return await userLoader.load(userId);
}  

Key points in the above example:

  • The DataLoader class takes a batchLoadFn as a constructor argument. This function is responsible for fetching the data in batches.
  • The load(key) method either returns the cached entry for the key or queues a request for the next batch.
  • The executeBatch() method gathers all queued requests, calls the batchLoadFn once with their keys, and then resolves (or rejects) the promise associated with each request.
  • The crucial part is ensuring that the `batchLoadFn` *returns the results in the same order as the input keys*. This allows the Data Loader to correctly associate the retrieved data with the original requests.
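
Because this ordering rule applies to every batch function, it can help to factor it out. The helper below is one possible sketch; `alignToKeys` and the sample rows are hypothetical names chosen for illustration.

```typescript
// Re-align rows returned by a batched query with the keys that were requested.
// `getKey` reads the key from a row; keys without a matching row yield `null`.
function alignToKeys<K, V>(
  keys: readonly K[],
  rows: readonly V[],
  getKey: (row: V) => K,
): Array<V | null> {
  const byKey = new Map<K, V>();
  rows.forEach(row => byKey.set(getKey(row), row));
  return keys.map(key => byKey.get(key) ?? null);
}

// The rows arrive in arbitrary order, and no row exists for id 7.
const sampleRows = [
  { id: 3, name: 'Grace' },
  { id: 1, name: 'Ada' },
];
const aligned = alignToKeys([1, 7, 3], sampleRows, row => row.id);
// aligned is [{ id: 1, name: 'Ada' }, null, { id: 3, name: 'Grace' }]
```

Note that `Map` compares keys strictly, so a driver that returns IDs as strings while the loader is called with numbers would produce `null` for every key; normalise the key type in `getKey` if that applies to your setup.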

Integrating Data Loaders with NestJS and GraphQL Resolvers

1. Install Required Packages

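Make sure you have the necessary packages installed. The examples below also rely on TypeORM with SQLite, so those packages are included here. With `@nestjs/graphql` v10 or later you additionally need `@nestjs/apollo` and must pass `driver: ApolloDriver` to `GraphQLModule.forRoot`.

npm install --save @nestjs/graphql graphql apollo-server-express dataloader @nestjs/typeorm typeorm sqlite3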

2. Create a Data Loader Service

Create a NestJS service to manage your Data Loaders. This service can be injected into your resolvers.

// src/dataloader/dataloader.service.ts
import { Injectable, Scope } from '@nestjs/common';
import { InjectRepository } from '@nestjs/typeorm';
import { In, Repository } from 'typeorm';
import * as DataLoader from 'dataloader';
import { User } from '../user/user.entity';
import { Post } from '../post/post.entity';

@Injectable({ scope: Scope.REQUEST }) // Important: Scope to request to avoid data leaking
export class DataloaderService {
    constructor(
        @InjectRepository(User)
        private readonly userRepository: Repository<User>,
        @InjectRepository(Post)
        private readonly postRepository: Repository<Post>,
    ) {}

    // One-to-one: user ID -> user (null if no such user exists)
    createUserLoader(): DataLoader<number, User | null> {
        return new DataLoader<number, User | null>(async (userIds: readonly number[]) => {
            const users = await this.userRepository.find({ where: { id: In([...userIds]) } });
            const userMap = new Map<number, User>(users.map(user => [user.id, user]));
            return userIds.map(userId => userMap.get(userId) || null);
        });
    }

    // One-to-one: post ID -> post (null if no such post exists)
    createPostLoader(): DataLoader<number, Post | null> {
        return new DataLoader<number, Post | null>(async (postIds: readonly number[]) => {
            const posts = await this.postRepository.find({ where: { id: In([...postIds]) } });
            const postMap = new Map<number, Post>(posts.map(post => [post.id, post]));
            return postIds.map(postId => postMap.get(postId) || null);
        });
    }

    // One-to-many: user ID -> all posts written by that user
    createPostsByUserIdLoader(): DataLoader<number, Post[]> {
        return new DataLoader<number, Post[]>(async (userIds: readonly number[]) => {
            const posts = await this.postRepository.find({ where: { userId: In([...userIds]) } });

            // Group posts by user ID.  Crucial step!
            const userPostsMap: { [userId: number]: Post[] } = {};
            posts.forEach(post => {
                if (!userPostsMap[post.userId]) {
                    userPostsMap[post.userId] = [];
                }
                userPostsMap[post.userId].push(post);
            });

            // Return posts in the same order as userIds.  Crucial step!
            return userIds.map(userId => userPostsMap[userId] || []);
        });
    }
}
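
The group-then-reorder step in the one-to-many loader is independent of TypeORM, so it can be tried out in isolation. Here is a minimal sketch with a hypothetical `groupByKeys` helper and made-up rows; it uses a `Map` rather than a plain object so that keys are not coerced to strings.

```typescript
// Group rows by key and return one array per requested key, in key order.
// Keys without any matching row yield an empty array.
function groupByKeys<K, V>(
  keys: readonly K[],
  rows: readonly V[],
  getKey: (row: V) => K,
): V[][] {
  const groups = new Map<K, V[]>();
  rows.forEach(row => {
    const key = getKey(row);
    const group = groups.get(key);
    if (group) {
      group.push(row);
    } else {
      groups.set(key, [row]);
    }
  });
  return keys.map(key => groups.get(key) ?? []);
}

interface PostRow { id: number; userId: number }

const postRows: PostRow[] = [
  { id: 10, userId: 1 },
  { id: 12, userId: 3 },
  { id: 11, userId: 1 },
];

// Requested order is user 3, user 2 (no posts), user 1.
const grouped = groupByKeys([3, 2, 1], postRows, post => post.userId);
const groupedIds = grouped.map(group => group.map(post => post.id));
// groupedIds is [[12], [], [10, 11]]
```

A batch function for a posts-by-user loader could then end with `return groupByKeys(userIds, posts, post => post.userId);`.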

3. Module Setup (Register the Data Loader Service)

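Import and provide the `DataloaderService` in your module. Because the service injects repositories, the module that provides it must also import `TypeOrmModule.forFeature([User, Post])`. Nest providers are only visible inside the module that declares them, so if `UserResolver` lives in `UserModule`, provide the service there (or export it from a shared module that `UserModule` imports):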

// src/app.module.ts
import { Module } from '@nestjs/common';
import { GraphQLModule } from '@nestjs/graphql';
import { TypeOrmModule } from '@nestjs/typeorm';
import { UserModule } from './user/user.module';
import { PostModule } from './post/post.module';
import { DataloaderService } from './dataloader/dataloader.service';
import { User } from './user/user.entity';
import { Post } from './post/post.entity';
import { AppController } from './app.controller';
import { AppService } from './app.service';

@Module({
  imports: [
    TypeOrmModule.forRoot({
      type: 'sqlite',
      database: 'db.sqlite',
      entities: [User, Post],
      synchronize: true, // only in development
    }),
    TypeOrmModule.forFeature([User, Post]), // Makes the repositories injectable into DataloaderService
    GraphQLModule.forRoot({
      // With @nestjs/graphql v10+, also pass `driver: ApolloDriver` from '@nestjs/apollo'
      autoSchemaFile: 'schema.gql',
      playground: true,
      debug: true,
      context: ({ req }) => ({ req }), // Important: Make the request available in the context!
    }),
    UserModule,
    PostModule,
  ],
  controllers: [AppController],
  providers: [AppService, DataloaderService], // Provide the DataloaderService
})
export class AppModule {}

4. Inject the Data Loader Service into your Resolvers

Inject the `DataloaderService` into your resolvers and create a new instance of the Data Loader per request. The GraphQL context is the perfect place to store these request-scoped instances.

// src/user/user.resolver.ts
import { Resolver, Query, ResolveField, Parent, Context } from '@nestjs/graphql';
import { UserService } from './user.service';
import { User } from './user.entity';
import { Post } from '../post/post.entity';
import { DataloaderService } from '../dataloader/dataloader.service';

@Resolver(() => User)
export class UserResolver {
    constructor(
        private readonly userService: UserService,
        private readonly dataloaderService: DataloaderService,
    ) {}

    @Query(() => [User])
    async users(): Promise<User[]> {
        return this.userService.findAll();
    }

    @ResolveField(() => [Post])
    async posts(
        @Parent() user: User,
        @Context('req') req: any,
    ): Promise<Post[]> {
        // Lazily create the per-request registry of loaders
        if (!req.dataLoaders) {
            req.dataLoaders = {};
        }

        // Create the loader once per request, then reuse it for every user
        if (!req.dataLoaders.postsByUserId) {
            req.dataLoaders.postsByUserId = this.dataloaderService.createPostsByUserIdLoader();
        }
        return req.dataLoaders.postsByUserId.load(user.id);
    }
}

5. Explanation of the Code

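  • Scope.REQUEST: The `DataloaderService` is scoped to `Scope.REQUEST`. This is *crucial*. Each GraphQL request gets its own instance of the service and therefore its own Data Loaders. This prevents data from leaking between requests, which can happen if a single Data Loader instance (and its cache) is shared across multiple requests. Keep in mind that request scope propagates up the injection chain, so any provider that injects the service, such as `UserResolver`, is instantiated per request as well.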
  • GraphQL Context: We make the request object available in the GraphQL context through `context: ({ req }) => ({ req })` in the AppModule. The loaders are stored on that object, so they live exactly as long as the request does.
  • Creating DataLoaders in the Context: Inside the posts resolver, we check whether `req.dataLoaders` exists and create the postsByUserId loader only if it is missing. The loader is therefore created at most once per request and shared by every `posts` field resolved during that request.
  • Using the Data Loader: We call `load(user.id)` on that loader to fetch the posts for a given user ID. The Data Loader handles batching and caching.
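
As more loaders are added, the create-if-missing check tends to get repeated in every field resolver. One way to centralise it is sketched below, assuming loaders are kept on a per-request object as in the resolver above; `LoaderHost` and `getOrCreateLoader` are hypothetical names and not part of NestJS or `dataloader`.

```typescript
// Hypothetical shape of the per-request object that carries the loaders.
interface LoaderHost {
  dataLoaders?: Record<string, unknown>;
}

// Return the loader stored under `name`, creating it on first use for this request.
function getOrCreateLoader<T>(host: LoaderHost, name: string, create: () => T): T {
  if (!host.dataLoaders) {
    host.dataLoaders = {};
  }
  const registry = host.dataLoaders;
  if (!(name in registry)) {
    registry[name] = create();
  }
  return registry[name] as T;
}

// Stand-in factory that counts how many loaders were actually created.
let created = 0;
const makeLoader = () => ({ serial: ++created });

const requestA: LoaderHost = {};
const requestB: LoaderHost = {};

const a1 = getOrCreateLoader(requestA, 'postsByUserId', makeLoader);
const a2 = getOrCreateLoader(requestA, 'postsByUserId', makeLoader);
const b1 = getOrCreateLoader(requestB, 'postsByUserId', makeLoader);
// a1 and a2 are the same object, b1 is a different one, and created is 2
```

In the resolver, the body of `posts` could then shrink to a single call such as `getOrCreateLoader(req, 'postsByUserId', () => this.dataloaderService.createPostsByUserIdLoader()).load(user.id)`.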

By following these steps, you can effectively integrate Data Loaders into your NestJS and GraphQL application, eliminating the N+1 problem and significantly improving performance. Remember to adapt the code to your specific data model and database access patterns.