Aggregation Framework

Introduction to the Aggregation Framework and its pipeline operators for performing complex data transformations and analysis.


MongoDB Essentials: Aggregation Framework

Practical Aggregation Examples

The MongoDB Aggregation Framework processes documents and returns computed results. It is built around a pipeline of stages: each stage transforms the documents it receives and passes its output to the next stage. Here, we'll explore practical examples demonstrating common aggregation operations.
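As a rough mental model (not a description of how the server actually executes a pipeline), each stage can be pictured as a function from a list of documents to a new list, with the pipeline applying those functions in order. The sketch below is plain JavaScript with no MongoDB connection; `runPipeline`, `matchStage`, and `limitStage` are hypothetical helper names used only for illustration.

```javascript
// Mental model only: a stage is a function that takes documents and returns documents.
// runPipeline, matchStage, and limitStage are hypothetical helpers, not MongoDB APIs.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

// Keeps only the documents that satisfy the predicate (loosely analogous to $match).
const matchStage = (predicate) => (docs) => docs.filter(predicate);

// Keeps only the first n documents (loosely analogous to $limit).
const limitStage = (n) => (docs) => docs.slice(0, n);

const sampleDocs = [
  { customerId: "A123", amount: 50 },
  { customerId: "B456", amount: 100 },
  { customerId: "A123", amount: 75 },
];

// The output of the first stage becomes the input of the second.
const pipelineOutput = runPipeline(sampleDocs, [
  matchStage((doc) => doc.amount > 60),
  limitStage(1),
]);

console.log(pipelineOutput); // [ { customerId: 'B456', amount: 100 } ]
```

Because every stage only sees the output of the previous one, the order of stages matters: filtering early reduces the work done by the stages that follow.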

Example 1: Grouping and Counting

Imagine a collection of documents representing orders. We want to find the number of orders placed by each customer.

Collection: orders

    [
      { "customerId": "A123", "orderDate": "2023-10-26", "amount": 50 },
      { "customerId": "B456", "orderDate": "2023-10-26", "amount": 100 },
      { "customerId": "A123", "orderDate": "2023-10-27", "amount": 75 },
      { "customerId": "C789", "orderDate": "2023-10-27", "amount": 25 },
      { "customerId": "B456", "orderDate": "2023-10-28", "amount": 120 }
    ]

Aggregation Pipeline:

    db.orders.aggregate([
      {
        $group: {
          _id: "$customerId",
          totalOrders: { $sum: 1 }
        }
      }
    ])

Explanation:

  • The $group stage groups documents by the customerId field.
  • _id: "$customerId" specifies the field to group by.
  • totalOrders: { $sum: 1 } adds 1 for every document in a group, which counts the number of orders per customer.

Result (without a $sort stage, the order of the groups is not guaranteed):

    [
      { "_id": "A123", "totalOrders": 2 },
      { "_id": "B456", "totalOrders": 2 },
      { "_id": "C789", "totalOrders": 1 }
    ]
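To make the counting logic concrete, here is a minimal in-memory sketch of what the $group stage with { $sum: 1 } computes. It runs in plain JavaScript without a database; `countByCustomer` is a hypothetical helper, and the real server-side implementation may differ.

```javascript
// In-memory sketch of { $group: { _id: "$customerId", totalOrders: { $sum: 1 } } }.
// countByCustomer is a hypothetical helper, not part of MongoDB.
const orders = [
  { customerId: "A123", orderDate: "2023-10-26", amount: 50 },
  { customerId: "B456", orderDate: "2023-10-26", amount: 100 },
  { customerId: "A123", orderDate: "2023-10-27", amount: 75 },
  { customerId: "C789", orderDate: "2023-10-27", amount: 25 },
  { customerId: "B456", orderDate: "2023-10-28", amount: 120 },
];

function countByCustomer(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // Equivalent of { $sum: 1 }: add 1 to the running total for this group key.
    counts.set(doc.customerId, (counts.get(doc.customerId) ?? 0) + 1);
  }
  // Shape each entry like a $group output document.
  return [...counts].map(([_id, totalOrders]) => ({ _id, totalOrders }));
}

const orderCounts = countByCustomer(orders);
console.log(orderCounts);
```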

Example 2: Calculating Averages

Using the same orders collection, let's find the average order amount for each customer.

Aggregation Pipeline:

    db.orders.aggregate([
      {
        $group: {
          _id: "$customerId",
          averageOrderAmount: { $avg: "$amount" }
        }
      }
    ])

Explanation:

  • The $group stage groups documents by the customerId field, same as before.
  • averageOrderAmount: { $avg: "$amount" } calculates the average of the amount field for each group (customer).

Result:

    [
      { "_id": "A123", "averageOrderAmount": 62.5 },
      { "_id": "B456", "averageOrderAmount": 110 },
      { "_id": "C789", "averageOrderAmount": 25 }
    ]
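One detail worth knowing about $avg is that it ignores missing and non-numeric values instead of counting them as zero, and it yields null for a group with no numeric values. The sketch below mimics that behavior for a single group in plain JavaScript; `averageAmount` is a hypothetical helper and the sample documents are made up for illustration.

```javascript
// In-memory sketch of { $avg: "$amount" } for the documents of one group.
// averageAmount is a hypothetical helper, not part of MongoDB.
function averageAmount(docs) {
  let sum = 0;
  let count = 0;
  for (const doc of docs) {
    // Like $avg, skip missing or non-numeric values instead of treating them as 0.
    if (typeof doc.amount === "number") {
      sum += doc.amount;
      count += 1;
    }
  }
  // No numeric values in the group: return null, as $avg does.
  return count === 0 ? null : sum / count;
}

console.log(averageAmount([{ amount: 50 }, { amount: 75 }])); // 62.5
console.log(averageAmount([{ amount: 100 }, { amount: "n/a" }, {}])); // 100
console.log(averageAmount([{}])); // null
```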

Example 3: Using $match for Filtering

Now, let's find, for each customer, the total amount of the orders placed after "2023-10-26".

Aggregation Pipeline:

    db.orders.aggregate([
      {
        $match: {
          orderDate: { $gt: "2023-10-26" }
        }
      },
      {
        $group: {
          _id: "$customerId",
          totalAmount: { $sum: "$amount" }
        }
      }
    ])

Explanation:

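  • The $match stage filters the documents, keeping only those whose orderDate is greater than "2023-10-26". Because orderDate is stored as a string, this is a string comparison; it matches chronological order only because every date uses the zero-padded YYYY-MM-DD format.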
  • The $group stage then groups the filtered documents by customerId and calculates the totalAmount using the $sum operator.

Result:

    [
      { "_id": "A123", "totalAmount": 75 },
      { "_id": "B456", "totalAmount": 120 },
      { "_id": "C789", "totalAmount": 25 }
    ]
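The filter in this example compares date strings rather than date values. The following in-memory sketch applies the same comparison in plain JavaScript and then sums the amounts per customer; `totalAmountAfter` is a hypothetical helper. It assumes that JavaScript's string comparison and MongoDB's default (collation-free) string comparison agree for these ASCII date strings.

```javascript
// In-memory sketch of $match (orderDate > cutoff) followed by $group with { $sum: "$amount" }.
// totalAmountAfter is a hypothetical helper, not part of MongoDB.
const orderDocs = [
  { customerId: "A123", orderDate: "2023-10-26", amount: 50 },
  { customerId: "B456", orderDate: "2023-10-26", amount: 100 },
  { customerId: "A123", orderDate: "2023-10-27", amount: 75 },
  { customerId: "C789", orderDate: "2023-10-27", amount: 25 },
  { customerId: "B456", orderDate: "2023-10-28", amount: 120 },
];

function totalAmountAfter(docs, cutoff) {
  const totals = new Map();
  for (const doc of docs) {
    // String comparison: chronological only for zero-padded YYYY-MM-DD values.
    if (doc.orderDate > cutoff) {
      totals.set(doc.customerId, (totals.get(doc.customerId) ?? 0) + doc.amount);
    }
  }
  return [...totals].map(([_id, totalAmount]) => ({ _id, totalAmount }));
}

const totalsAfter = totalAmountAfter(orderDocs, "2023-10-26");
console.log(totalsAfter);

// Without zero padding, string order no longer matches calendar order.
console.log("2023-9-30" > "2023-10-26"); // true, although September precedes October
```

If the dates were stored as BSON dates instead of strings, the same $gt filter would compare actual points in time and would not depend on the string format.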

Real-world Examples

The Aggregation Framework is not just for simple calculations. It shines when dealing with more complex scenarios. Here are some examples of how it is applied in real-world applications:

E-commerce: Analyzing Product Sales Performance

An e-commerce company can use the Aggregation Framework to analyze product sales performance. They can group sales data by product category, region, or time period to identify top-selling products, understand sales trends, and optimize inventory management.

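Example Aggregation: Group sales by product category, calculate the total revenue for each category, and sort the categories from highest to lowest revenue. The pipeline assumes that each sales document stores the revenue of one sale in its price field.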

    db.sales.aggregate([
      {
        $group: {
          _id: "$productCategory",
          totalRevenue: { $sum: "$price" }
        }
      },
      {
        $sort: { totalRevenue: -1 } // Sort by revenue in descending order
      }
    ])
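The pipeline above treats price as the full revenue of each sale. If each sales document also recorded the number of units sold, the accumulator could multiply the two fields. The sketch below assumes a hypothetical quantity field that the example collection does not show; it defines the pipeline as a plain array so it can be inspected here and passed to db.sales.aggregate(...) in the shell.

```javascript
// Assumption: each sales document has a numeric `quantity` field alongside `price`.
// The field name is hypothetical; adjust it to your own schema.
const revenueByCategoryPipeline = [
  {
    $group: {
      _id: "$productCategory",
      // Revenue of one sale = price * quantity, summed over the category.
      totalRevenue: { $sum: { $multiply: ["$price", "$quantity"] } },
    },
  },
  { $sort: { totalRevenue: -1 } }, // Highest revenue first
];

// In mongosh: db.sales.aggregate(revenueByCategoryPipeline)
console.log(JSON.stringify(revenueByCategoryPipeline, null, 2));
```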

Social Media: Identifying Trending Topics

A social media platform can use the Aggregation Framework to identify trending topics. They can group posts by hashtag, count the number of posts for each hashtag within a specific timeframe, and identify the hashtags with the highest counts.

Example Aggregation: Count the posts created within the last hour per hashtag and return the ten most frequent hashtags.

    db.posts.aggregate([
      {
        $match: {
          createdAt: { $gt: new Date(Date.now() - 60 * 60 * 1000) } // Last hour
        }
      },
      {
        $unwind: "$hashtags" // Deconstruct the array of hashtags
      },
      {
        $group: {
          _id: "$hashtags",
          count: { $sum: 1 }
        }
      },
      {
        $sort: { count: -1 } // Sort by count in descending order
      },
      {
        $limit: 10 // Get the top 10 trending hashtags
      }
    ])
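The $unwind stage is the least intuitive step in this pipeline, so here is a minimal in-memory sketch of it, followed by the same group, sort, and limit logic. It is plain JavaScript with made-up sample posts; `unwind` and `topHashtags` are hypothetical helpers that approximate the default $unwind behavior (documents whose array is missing, null, or empty produce no output).

```javascript
// In-memory sketch of $unwind: "$hashtags" followed by $group, $sort, and $limit.
// unwind and topHashtags are hypothetical helpers, not part of MongoDB.
const posts = [
  { _id: 1, hashtags: ["mongodb", "nosql"] },
  { _id: 2, hashtags: ["mongodb"] },
  { _id: 3, hashtags: [] }, // Produces no output, like $unwind's default behavior
];

// Emits one copy of the document per array element, with the array replaced by that element.
const unwind = (docs, field) =>
  docs.flatMap((doc) => {
    const value = doc[field];
    if (value === undefined || value === null) return []; // Missing or null: dropped
    const items = Array.isArray(value) ? value : [value]; // Non-array: single element
    return items.map((item) => ({ ...doc, [field]: item }));
  });

function topHashtags(docs, limit) {
  const counts = new Map();
  for (const doc of unwind(docs, "hashtags")) {
    counts.set(doc.hashtags, (counts.get(doc.hashtags) ?? 0) + 1);
  }
  return [...counts]
    .map(([_id, count]) => ({ _id, count }))
    .sort((a, b) => b.count - a.count) // Descending, like $sort: { count: -1 }
    .slice(0, limit); // Like $limit
}

const unwoundPosts = unwind(posts, "hashtags");
console.log(unwoundPosts.length); // 3

const trending = topHashtags(posts, 10);
console.log(trending);
```

A post with two hashtags contributes one document to each hashtag's group, which is why the array has to be unwound before grouping.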

Financial Services: Detecting Fraudulent Transactions

A financial institution can use the Aggregation Framework to detect fraudulent transactions. They can analyze transaction patterns, group transactions by user account, and identify accounts with unusual transaction activity, such as a sudden increase in the number or amount of transactions.

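Example Aggregation: Group transactions by user account, calculate the average transaction amount and the transaction count for each account, keep only the accounts with more than 10 transactions, and list the accounts with the highest averages first.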

    db.transactions.aggregate([
      {
        $group: {
          _id: "$accountId",
          averageTransactionAmount: { $avg: "$amount" },
          transactionCount: { $sum: 1 }
        }
      },
      {
        // Keep accounts with more than 10 transactions. A plain query
        // operator suffices because the field is compared to a constant;
        // $expr is only needed to compare fields with each other.
        $match: { transactionCount: { $gt: 10 } }
      },
      {
        $sort: { averageTransactionAmount: -1 } // Highest average first
      }
    ])