Lecture Notes Of Day 11:
Aggregation Framework – Part 2 (Advanced)
Objective:
In this lesson, we will dive
deeper into MongoDB’s aggregation framework by focusing on more advanced
aggregation stages. By the end of the lesson, you will be able to apply complex
aggregation operations, including $match, $project, and $sort, to manipulate
and query MongoDB data.
Outcome:
Students will be able to use
aggregation pipeline stages like $match, $project, and $sort to perform
advanced data transformations and filtering, enhancing their ability to work
with large datasets in MongoDB.
Introduction
to the Aggregation Pipeline
The aggregation pipeline
is a powerful feature in MongoDB that allows you to transform, filter, and
summarize data. It is an essential tool for complex queries, such as filtering
documents, reshaping the data, and performing calculations. Aggregation
operations are performed on data in stages, where the output of one stage
becomes the input of the next stage.
MongoDB provides a series of
aggregation stages that can be used to manipulate data in various ways. In this
lesson, we will focus on the following stages:
1. $match
2. $project
3. $sort
1. $match
Stage
The $match stage filters
documents from the collection based on specified criteria, similar to a WHERE
clause in SQL.
Usage:
- It
allows you to specify query criteria for documents, reducing the number of
documents that are passed to the next stage in the pipeline.
- $match
can be used for equality comparisons, ranges, regular expressions, etc.
Syntax:
javascript
CopyEdit
{
$match: {
field: value
}
}
Examples:
- Basic
$match:
javascript
CopyEdit
db.orders.aggregate([
{
$match: { status: "shipped" }
}
])
This query filters documents from
the orders collection where the status field equals "shipped".
- Range
Matching:
javascript
CopyEdit
db.orders.aggregate([
{
$match: {
totalAmount: { $gt: 100 }
}
}
])
This query filters orders where totalAmount
is greater than 100.
- Multiple
Conditions:
javascript
CopyEdit
db.orders.aggregate([
{
$match: {
status: "shipped",
totalAmount: { $gt: 100 }
}
}
])
This query matches orders that
are both "shipped" and have a totalAmount greater than 100.
2. $project
Stage
The $project stage is used to
include, exclude, or add computed fields to the documents passed through the
aggregation pipeline.
Usage:
- It
allows you to shape the output of your query by including only certain
fields or adding new fields.
- You
can exclude fields by setting their value to 0 and include them by setting
their value to 1.
Syntax:
javascript
CopyEdit
{
$project: {
field1: 1, // Include
field2: 0, //
Exclude
newField: { $operation: "$field"
} // Computed field
}
}
Examples:
- Basic
$project:
javascript
CopyEdit
db.orders.aggregate([
{
$project: {
orderId: 1,
status: 1
}
}
])
This query returns only the orderId
and status fields from each document.
- Excluding
a Field:
javascript
CopyEdit
db.orders.aggregate([
{
$project: {
status: 1,
totalAmount: 1,
customerName: 0
}
}
])
This query includes status and totalAmount
fields while excluding the customerName field.
- Adding
Computed Fields:
javascript
CopyEdit
db.orders.aggregate([
{
$project: {
orderId: 1,
totalAmount: 1,
discountedPrice: { $multiply: ["$totalAmount",
0.9] }
}
}
])
This query computes a new field discountedPrice,
which is 90% of the totalAmount.
3. $sort
Stage
The $sort stage sorts the
documents in the pipeline based on the specified fields. Sorting can be done in
ascending (1) or descending (-1) order.
Usage:
- You
can sort the documents by one or more fields.
- Sorting
is applied after filtering and projecting the documents, ensuring that the
data is in the correct order for further stages or output.
Syntax:
javascript
CopyEdit
{
$sort: {
field1: 1, // Ascending order
field2: -1 // Descending order
}
}
Examples:
- Basic
$sort:
javascript
CopyEdit
db.orders.aggregate([
{
$sort: {
totalAmount: -1
}
}
])
This query sorts the orders
collection by totalAmount in descending order.
- Sorting
by Multiple Fields:
javascript
CopyEdit
db.orders.aggregate([
{
$sort: {
status: 1, // Ascending order
totalAmount: -1 // Descending order
}
}
])
This query first sorts by status
in ascending order, and within each status, it sorts by totalAmount in
descending order.
Combining
$match, $project, and $sort
You can combine multiple stages
in a single aggregation pipeline to achieve complex data transformations.
Example:
javascript
CopyEdit
db.orders.aggregate([
{
$match: { status: "shipped" }
},
{
$project: {
orderId: 1,
totalAmount: 1,
discountedPrice: { $multiply: ["$totalAmount",
0.9] }
}
},
{
$sort: { discountedPrice: -1 }
}
])
This pipeline first filters
orders where status is "shipped", then calculates a discountedPrice,
and finally sorts the results by discountedPrice in descending order.
Conclusion
In this lesson, we have explored
advanced MongoDB aggregation pipeline stages, including:
- $match:
Filtering documents based on specified criteria.
- $project:
Shaping documents by including/excluding fields and adding computed
fields.
- $sort:
Sorting documents in a specified order.
By combining these stages, you
can create powerful aggregation queries to manipulate and analyze data
efficiently in MongoDB.
Practice
Exercise:
1. Create an
aggregation query that retrieves all orders where the status is "delivered",
includes the orderId, totalAmount, and calculates a taxAmount (10% of totalAmount),
and sorts by taxAmount in descending order.
2. Write a
query that groups employees by department and calculates the average salary per
department, then sorts the results by average salary in descending order.
1. Aggregation Query to Retrieve Orders with Tax Amount
Objective:
Create an aggregation query that retrieves all orders where the status is "delivered",
includes orderId, totalAmount, calculates a taxAmount (10% of totalAmount), and
sorts by taxAmount in descending order.
Aggregation
Query:
javascript
CopyEdit
db.orders.aggregate([
{
$match: { status: "delivered"
} // Filters orders with status
"delivered"
},
{
$project: {
orderId: 1, // Includes the orderId
field
totalAmount: 1, // Includes the totalAmount
field
taxAmount: { $multiply: ["$totalAmount",
0.1] } // Calculates 10% of totalAmount as taxAmount
}
},
{
$sort: { taxAmount: -1 } // Sorts the results by taxAmount in
descending order
}
])
Explanation:
1. $match: Filters
the documents to include only those where the status is "delivered".
2. $project:
o
Includes the orderId and totalAmount fields.
o
Creates a new field taxAmount, which is calculated
as 10% of the totalAmount using the $multiply operator.
3. $sort: Sorts
the documents by the newly calculated taxAmount in descending order (-1).
2.
Aggregation Query to Group Employees by Department and Calculate Average Salary
Objective:
Write a query that groups employees by department, calculates the average
salary per department, and sorts the results by average salary in descending
order.
Aggregation
Query:
javascript
CopyEdit
db.employees.aggregate([
{
$group: {
_id: "$department", // Groups by the department field
averageSalary: { $avg: "$salary"
} // Calculates the average salary for each department
}
},
{
$sort: { averageSalary: -1 } // Sorts the departments by averageSalary
in descending order
}
])
Explanation:
1. $group:
o
Groups the documents by the department field
(stored as _id in the output).
o
Calculates the average salary for each department
using the $avg operator.
2. $sort: Sorts
the result by the calculated averageSalary in descending order (-1).
Summary
of Results:
- The
first query retrieves all "delivered" orders, calculates the taxAmount
(10% of totalAmount), and sorts them by taxAmount in descending order.
- The
second query groups employees by their department, calculates the average
salary in each department, and sorts the departments by average salary in
descending order.
