Aggregation Framework
Introduction to the Aggregation Framework and its pipeline operators for performing complex data transformations and analysis.
MongoDB Essentials: Aggregation Framework
Practical Aggregation Examples
The MongoDB Aggregation Framework is a powerful tool for processing data and returning computed results. It consists of a pipeline of stages, each transforming the documents as they pass through. Here, we'll explore practical examples demonstrating common aggregation operations.
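To make the idea of a pipeline concrete, here is a minimal plain-JavaScript sketch in which each stage is a function that receives the documents produced by the previous one. The helper runPipeline and the two toy stages are hypothetical names used only for illustration; this mirrors the concept and is not how MongoDB executes a pipeline.

```javascript
// Hypothetical helper: feeds the output of each stage into the next one.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Two toy stages, loosely standing in for $match and $limit.
const onlyLargeAmounts = (docs) => docs.filter((doc) => doc.amount >= 75);
const firstTwo = (docs) => docs.slice(0, 2);

// Made-up sample documents for this sketch.
const sampleDocs = [{ amount: 50 }, { amount: 100 }, { amount: 75 }, { amount: 120 }];

const pipelineOutput = runPipeline(sampleDocs, [onlyLargeAmounts, firstTwo]);
console.log(pipelineOutput); // [ { amount: 100 }, { amount: 75 } ]
```

The order of stages matters: swapping the two toy stages would first keep the first two documents and only then filter them, which gives a different result.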
Example 1: Grouping and Counting
Imagine a collection of documents representing orders. We want to find the number of orders placed by each customer.
Collection: Orders
[
{ "customerId": "A123", "orderDate": "2023-10-26", "amount": 50 },
{ "customerId": "B456", "orderDate": "2023-10-26", "amount": 100 },
{ "customerId": "A123", "orderDate": "2023-10-27", "amount": 75 },
{ "customerId": "C789", "orderDate": "2023-10-27", "amount": 25 },
{ "customerId": "B456", "orderDate": "2023-10-28", "amount": 120 }
]
Aggregation Pipeline:
db.orders.aggregate([
  {
    $group: {
      _id: "$customerId",
      totalOrders: { $sum: 1 }
    }
  }
])
Explanation:
- The $group stage groups documents by the customerId field.
- _id: "$customerId" specifies the field to group by.
- totalOrders: { $sum: 1 } calculates the sum of 1 for each document in each group, effectively counting the number of orders per customer.
Result (the order of the output documents is not guaranteed unless the pipeline includes a $sort stage):
[
{ "_id": "A123", "totalOrders": 2 },
{ "_id": "B456", "totalOrders": 2 },
{ "_id": "C789", "totalOrders": 1 }
]
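For readers who think more easily in application code, the following is a minimal plain-JavaScript sketch of what this $group stage computes. It only mirrors the semantics on an in-memory array and is not how MongoDB executes the stage; the helper name countByCustomer is hypothetical.

```javascript
// Sample documents copied from the Orders collection above.
const orders = [
  { customerId: "A123", orderDate: "2023-10-26", amount: 50 },
  { customerId: "B456", orderDate: "2023-10-26", amount: 100 },
  { customerId: "A123", orderDate: "2023-10-27", amount: 75 },
  { customerId: "C789", orderDate: "2023-10-27", amount: 25 },
  { customerId: "B456", orderDate: "2023-10-28", amount: 120 },
];

// Hypothetical helper emulating:
// { $group: { _id: "$customerId", totalOrders: { $sum: 1 } } }
function countByCustomer(docs) {
  const counts = new Map(); // customerId -> number of orders seen so far
  for (const doc of docs) {
    counts.set(doc.customerId, (counts.get(doc.customerId) ?? 0) + 1);
  }
  // Turn each Map entry into a document shaped like the $group output.
  return Array.from(counts, ([_id, totalOrders]) => ({ _id, totalOrders }));
}

const orderCounts = countByCustomer(orders);
console.log(orderCounts);
```

Adding 1 per document is exactly what { $sum: 1 } does, which is why it acts as a counter.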
Example 2: Calculating Averages
Using the same Orders collection, let's find the average order amount for each customer.
Aggregation Pipeline:
db.orders.aggregate([
  {
    $group: {
      _id: "$customerId",
      averageOrderAmount: { $avg: "$amount" }
    }
  }
])
Explanation:
- The $group stage groups documents by the customerId field, same as before.
- averageOrderAmount: { $avg: "$amount" } calculates the average of the amount field for each group (customer).
Result:
[
{ "_id": "A123", "averageOrderAmount": 62.5 },
{ "_id": "B456", "averageOrderAmount": 110 },
{ "_id": "C789", "averageOrderAmount": 25 }
]
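As a rough illustration of what $avg does per group, the sketch below keeps a running sum and count for each customer and divides at the end. It is plain JavaScript over an in-memory copy of the sample data, with the hypothetical helper name averageByCustomer, and it assumes every amount is a number (MongoDB's $avg ignores non-numeric values, which this sketch does not model).

```javascript
// Sample data from the Orders collection above (only the fields needed here).
const orders = [
  { customerId: "A123", amount: 50 },
  { customerId: "B456", amount: 100 },
  { customerId: "A123", amount: 75 },
  { customerId: "C789", amount: 25 },
  { customerId: "B456", amount: 120 },
];

// Hypothetical helper emulating:
// { $group: { _id: "$customerId", averageOrderAmount: { $avg: "$amount" } } }
function averageByCustomer(docs) {
  const totals = new Map(); // customerId -> { sum, count }
  for (const { customerId, amount } of docs) {
    const entry = totals.get(customerId) ?? { sum: 0, count: 0 };
    entry.sum += amount;
    entry.count += 1;
    totals.set(customerId, entry);
  }
  return Array.from(totals, ([_id, { sum, count }]) => ({
    _id,
    averageOrderAmount: sum / count,
  }));
}

const averages = averageByCustomer(orders);
console.log(averages);
```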
Example 3: Using $match for Filtering
Now, let's find the total order amount per customer, counting only orders placed after "2023-10-26".
Aggregation Pipeline:
db.orders.aggregate([
  {
    $match: {
      orderDate: { $gt: "2023-10-26" }
    }
  },
  {
    $group: {
      _id: "$customerId",
      totalAmount: { $sum: "$amount" }
    }
  }
])
Explanation:
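- The $match stage filters the documents, keeping only those where the orderDate is greater than "2023-10-26". Because orderDate is stored as a string in this collection, the comparison is lexicographic; it matches chronological order only when every date uses the zero-padded YYYY-MM-DD format.
- The $group stage then groups the filtered documents by customerId and calculates the totalAmount using the $sum operator.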
Result:
[
{ "_id": "A123", "totalAmount": 75 },
{ "_id": "B456", "totalAmount": 120 },
{ "_id": "C789", "totalAmount": 25 }
]
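The sketch below mirrors the filter-then-group logic in plain JavaScript and also shows why string dates need care. It runs on an in-memory copy of the sample data rather than on MongoDB, and it assumes JavaScript's string comparison behaves like MongoDB's default comparison for these ASCII date strings.

```javascript
// Sample documents copied from the Orders collection above.
const orders = [
  { customerId: "A123", orderDate: "2023-10-26", amount: 50 },
  { customerId: "B456", orderDate: "2023-10-26", amount: 100 },
  { customerId: "A123", orderDate: "2023-10-27", amount: 75 },
  { customerId: "C789", orderDate: "2023-10-27", amount: 25 },
  { customerId: "B456", orderDate: "2023-10-28", amount: 120 },
];

const cutoff = "2023-10-26";
const totals = new Map(); // customerId -> summed amount

for (const { customerId, orderDate, amount } of orders) {
  // Emulates $match: only orders strictly after the cutoff reach the grouping step.
  if (orderDate > cutoff) {
    // Emulates $group with { $sum: "$amount" }.
    totals.set(customerId, (totals.get(customerId) ?? 0) + amount);
  }
}

// Sorted by _id only so the output lines up with the result listed above.
const totalsAfterCutoff = Array.from(totals, ([_id, totalAmount]) => ({ _id, totalAmount }))
  .sort((a, b) => a._id.localeCompare(b._id));
console.log(totalsAfterCutoff);

// Pitfall: an unpadded string date compares as "later" even though it is earlier.
const unpaddedLooksLater = "2023-9-30" > "2023-10-26"; // true ("9" sorts after "1")
const realDateIsLater = new Date("2023-09-30") > new Date("2023-10-26"); // false
console.log(unpaddedLooksLater, realDateIsLater);
```

Storing dates as real date values instead of strings avoids this class of problem.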
Real-world Examples
The Aggregation Framework is not just for simple calculations. It shines when dealing with more complex scenarios. Here are some examples of how it is applied in real-world applications:
E-commerce: Analyzing Product Sales Performance
An e-commerce company can use the Aggregation Framework to analyze product sales performance. They can group sales data by product category, region, or time period to identify top-selling products, understand sales trends, and optimize inventory management.
Example Aggregation: Group sales by product category and calculate the total revenue for each category. Summing price alone treats each sales document as a single unit sold.
db.sales.aggregate([
  {
    $group: {
      _id: "$productCategory",
      totalRevenue: { $sum: "$price" }
    }
  },
  {
    $sort: { totalRevenue: -1 } // Sort by revenue in descending order
  }
])
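If a sales document can cover more than one unit, revenue has to account for the number of units. The variant below is a sketch under the assumption that each document carries a numeric quantity field next to price; the quantity field and the variable name revenueByCategoryPipeline are not part of the original example.

```javascript
// Assumption: every sales document has numeric "price" and "quantity" fields.
const revenueByCategoryPipeline = [
  {
    $group: {
      _id: "$productCategory",
      // Revenue per document is price * quantity; $sum adds it up per category.
      totalRevenue: { $sum: { $multiply: ["$price", "$quantity"] } },
    },
  },
  {
    $sort: { totalRevenue: -1 }, // Highest revenue first
  },
];

// In mongosh, the pipeline would be run as:
// db.sales.aggregate(revenueByCategoryPipeline)
```

Keeping the pipeline in a variable also makes it easy to reuse or to append stages such as $limit.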
Social Media: Identifying Trending Topics
A social media platform can use the Aggregation Framework to identify trending topics. They can group posts by hashtag, count the number of posts for each hashtag within a specific timeframe, and identify the hashtags with the highest counts.
Example Aggregation: Count posts by hashtag within the last hour.
db.posts.aggregate([
  {
    $match: {
      createdAt: { $gt: new Date(Date.now() - 60 * 60 * 1000) } // Last hour
    }
  },
  {
    $unwind: "$hashtags" // Deconstruct the array of hashtags
  },
  {
    $group: {
      _id: "$hashtags",
      count: { $sum: 1 }
    }
  },
  {
    $sort: { count: -1 } // Sort by count in descending order
  },
  {
    $limit: 10 // Get the top 10 trending hashtags
  }
])
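The $unwind stage is often the least intuitive step, so here is a small plain-JavaScript sketch of its effect followed by the count, sort, and limit steps. The sample posts are made up for illustration, and the sketch assumes hashtags is either an array or absent; it models the default behavior in which documents with a missing or empty array produce no output.

```javascript
// Hypothetical sample posts (not from a real collection).
const posts = [
  { _id: 1, hashtags: ["mongodb", "nosql"] },
  { _id: 2, hashtags: ["mongodb"] },
  { _id: 3, hashtags: [] }, // empty array: dropped by the unwind step
  { _id: 4 },               // missing field: also dropped
];

// Emulates { $unwind: "$hashtags" }: one output document per array element.
const unwound = posts.flatMap((post) =>
  (post.hashtags ?? []).map((tag) => ({ ...post, hashtags: tag }))
);

// Emulates the $group / $sort / $limit stages.
const tagCounts = new Map();
for (const { hashtags } of unwound) {
  tagCounts.set(hashtags, (tagCounts.get(hashtags) ?? 0) + 1);
}
const trending = Array.from(tagCounts, ([_id, count]) => ({ _id, count }))
  .sort((a, b) => b.count - a.count)
  .slice(0, 10);

console.log(unwound.length); // 3
console.log(trending); // [ { _id: 'mongodb', count: 2 }, { _id: 'nosql', count: 1 } ]
```

A post with two hashtags contributes to two groups, which is why the array has to be unwound before grouping.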
Financial Services: Detecting Fraudulent Transactions
A financial institution can use the Aggregation Framework to detect fraudulent transactions. They can analyze transaction patterns, group transactions by user account, and identify accounts with unusual transaction activity, such as a sudden increase in the number or amount of transactions.
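Example Aggregation: Group transactions by user account, calculate the average transaction amount and the transaction count for each account, keep only accounts with more than 10 transactions, and sort them by average amount in descending order.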
db.transactions.aggregate([
  {
    $group: {
      _id: "$accountId",
      averageTransactionAmount: { $avg: "$amount" },
      transactionCount: { $sum: 1 }
    }
  },
  {
    // Keep only accounts with more than 10 transactions.
    // A plain query condition is enough here; $expr is only needed to compare two fields.
    $match: { transactionCount: { $gt: 10 } }
  },
  {
    $sort: { averageTransactionAmount: -1 } // Highest average amount first
  }
])
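Spotting a sudden increase usually means looking at a recent time window rather than at all transactions. The sketch below builds such a pipeline from two parameters. It assumes each transaction has a date-typed timestamp field, which the original example does not show, and the function name buildHighActivityPipeline is hypothetical.

```javascript
// Hypothetical builder: accounts with more than `minTransactions`
// transactions since `since`, ordered by average amount.
function buildHighActivityPipeline(since, minTransactions) {
  return [
    // Assumption: "timestamp" is a date field on each transaction.
    { $match: { timestamp: { $gte: since } } },
    {
      $group: {
        _id: "$accountId",
        averageTransactionAmount: { $avg: "$amount" },
        transactionCount: { $sum: 1 },
      },
    },
    // Filter on the computed count after grouping.
    { $match: { transactionCount: { $gt: minTransactions } } },
    { $sort: { averageTransactionAmount: -1 } },
  ];
}

const oneDayAgo = new Date(Date.now() - 24 * 60 * 60 * 1000);
const highActivityPipeline = buildHighActivityPipeline(oneDayAgo, 10);

// In mongosh, the pipeline would be run as:
// db.transactions.aggregate(highActivityPipeline)
```

Placing the time filter first reduces the number of documents that reach the $group stage.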