
Learn MongoDB Aggregation with a Real-World Example

In this article, we will see what aggregation is in MongoDB and how to build MongoDB aggregation pipelines. I assume that you have some experience with MongoDB.

When you first start using MongoDB, you often write queries just to do CRUD (Create, Read, Update, and Delete) operations.

But when an application gets more complex, you may need to perform several operations on the data before sending it as a response.

For example, consider that you are building an analytics dashboard where you need to show the tasks for each user. Here, the server should send all the tasks for each user, grouped into an array.

Can you guess how to achieve this with our data? This is where MongoDB aggregation comes in.
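
To give you a taste of what's coming, here is a minimal sketch of such a pipeline; the tasks collection and its userId field are hypothetical stand-ins for the dashboard's data:

// Hypothetical "tasks" collection with documents like { userId, title, status, ... }
db.tasks.aggregate([
  {
    $group: {
      _id: "$userId",              // one output document per user
      tasks: { $push: "$$ROOT" },  // collect that user's tasks into an array
    },
  },
])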

Moreover, this is a simple example; you may face requirements a lot more complex than this. Let's see how MongoDB solves this kind of problem using the aggregation pipeline.

MongoDB Aggregation Pipeline

First of all, a MongoDB aggregation pipeline processes data in a series of stages. Each stage produces output that becomes the input for the next stage in the pipeline.

(Figure: documents flowing through the stages of a MongoDB aggregation pipeline)

Here, data passes through each stage, which filters, groups, and sorts it before returning the result.

Let's look at the aggregation operators that are widely used in real-world applications.

Here's the general shape of the MongoDB aggregation pipelines we are going to build:

pipeline = [
  { $match: { ... } },
  { $group: { ... } },
  { $sort: { ... } },
  ...
]
db.collectionName.aggregate(pipeline, options)

Before going further: to practice the aggregations along with this article, you can import the sample dataset from this site.

Once you download the data, you can import the dataset into MongoDB using the mongoimport command:


mongoimport --db aggrsample --collection test --file sample.json --jsonArray

$match

The $match operator is similar to the find() method in MongoDB, except that $match works inside an aggregation pipeline. The $match stage passes along all the documents that satisfy the given condition.

On the above dataset, let's try to match all the documents that have MA as the state.

db.test
  .aggregate([
    {
      $match: {
        state: "MA",
      },
    },
  ])
  .pretty()

As a result, it will return all the documents with MA as the state.
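
For comparison, the same filter expressed with find() outside the aggregation framework:

db.test.find({ state: "MA" }).pretty()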


$group

As the name suggests, $group groups the documents based on a particular field. It can be _id or any other field.

On top of the $match stage, let us group the documents by city.

db.test.aggregate([
  {
    $match: {
      state: "MA",
    },
  },
  {
    $group: {
      _id: "$city",
    },
  },
])

Running this, you will see one document per city, containing only the _id field.


But wait, we want to retrieve all the fields in the document. Why does it only return the grouping field?

Well, there is a reason for it. First, we group the documents on the _id field (the _id field can contain any grouping expression).

Then, we need to provide an accumulator expression to pull data into the grouping result. Popular accumulator expressions are listed below, with a combined example after the list:

  • $first - returns the first document of each group.
  • $push - pushes all the documents of each group into an array.
  • $max - returns the highest value within each group.
  • $sum - returns the sum of a numerical value across each group.
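
For instance, here is a sketch that combines a few of these accumulators on the same dataset (city and pop are fields from the sample data):

db.test.aggregate([
  { $match: { state: "MA" } },
  {
    $group: {
      _id: "$city",
      totalPop: { $sum: "$pop" },  // sum of the pop field per city
      maxPop: { $max: "$pop" },    // highest single pop value per city
      docCount: { $sum: 1 },       // number of documents per city
    },
  },
])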

Here, we will see how to use the $push expression along with $group.

db.test
  .aggregate([
    {
      $match: {
        state: "MA",
      },
    },
    {
      $group: {
        _id: "$city",
        data: {
          $push: "$$ROOT",
        },
      },
    },
  ])
  .pretty()

So, it will return one document per city, with a data array holding every matching document ($$ROOT refers to the entire current document).


$project

Sometimes, you may not need all the fields in a document. You can retrieve only specific fields using the $project operator.

db.test
  .aggregate([
    {
      $match: {
        state: "MA",
      },
    },
    {
      $group: {
        _id: "$city",
        data: {
          $push: "$$ROOT",
        },
      },
    },
    {
      $project: {
        _id: 0,
        "data.loc": 1,
      },
    },
  ])
  .pretty()


Here, you specify 1 to include a field and 0 to exclude it. By default, _id is always included; you can specify _id as 0 to exclude it.
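
For example, to keep only the city and pop fields and drop _id:

db.test.aggregate([
  { $match: { state: "MA" } },
  { $project: { _id: 0, city: 1, pop: 1 } },  // include city and pop, exclude _id
])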

$sort

The $sort operator sorts the documents in either ascending or descending order.

db.test.aggregate([
  {
    $match: {
      state: "MA",
    },
  },
  {
    $sort: {
      pop: 1,
    },
  },
])

This sorts the documents by the pop field in ascending order (1 for ascending, -1 for descending).
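
You can also sort on multiple fields at once; for example, by city ascending and then by pop descending within each city:

db.test.aggregate([
  { $match: { state: "MA" } },
  { $sort: { city: 1, pop: -1 } },  // city A-Z, then highest pop first
])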


$limit

The $limit operator limits the number of documents that the pipeline returns.

db.test.aggregate([
  {
    $match: {
      state: "MA",
    },
  },
  {
    $sort: {
      pop: 1,
    },
  },
  {
    $limit: 5,
  },
])

So, the above query returns only five documents; combined with the $sort stage, these are the five least-populated entries.
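
Combined with $skip (another built-in stage), this is a common pagination pattern. A sketch for fetching the second page of five documents:

db.test.aggregate([
  { $match: { state: "MA" } },
  { $sort: { pop: 1 } },
  { $skip: 5 },   // skip the first page of five
  { $limit: 5 },  // return the next five documents
])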


$addFields

Meanwhile, sometimes you need to create a custom field to hold derived or constant data. You can achieve this using the $addFields operator.

db.test.aggregate([
  {
    $match: {
      state: "MA",
    },
  },
  {
    $addFields: {
      stateAlias: "MAS",
    },
  },
])

As a result, it will return the matched documents, each with an extra stateAlias field set to "MAS".
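
$addFields also accepts computed expressions, not just constants. For example, a derived field using the $divide expression:

db.test.aggregate([
  { $match: { state: "MA" } },
  {
    $addFields: {
      popInThousands: { $divide: ["$pop", 1000] },  // pop expressed in thousands
    },
  },
])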


$lookup

After that, $lookup is one of the most popular aggregation operators in MongoDB. If you are from a SQL background, you can relate it to a JOIN query in an RDBMS (specifically, a left outer join).

db.universities
  .aggregate([
    { $match: { name: "USAL" } },
    { $project: { _id: 0, name: 1 } },
    {
      $lookup: {
        from: "courses",
        localField: "name",
        foreignField: "university",
        as: "courses",
      },
    },
  ])
  .pretty()
  • from - the collection to join with.
  • localField - the field from the input documents. Here, it is the name field from the universities collection.
  • foreignField - the field from the joined collection. Here, it is the university field from the courses collection.
  • as - the name of the output array field that will hold the matched documents (see the $unwind sketch below).
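
The join result embeds the matched courses as an array on each university document. If you would rather have one output document per course, you can flatten that array with the $unwind stage; a sketch reusing the same collections:

db.universities.aggregate([
  { $match: { name: "USAL" } },
  { $project: { _id: 0, name: 1 } },
  {
    $lookup: {
      from: "courses",
      localField: "name",
      foreignField: "university",
      as: "courses",
    },
  },
  { $unwind: "$courses" },  // one output document per joined course
])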

Summary

To sum up, these are the most common and popular aggregation operators in MongoDB. We will cover more in-depth MongoDB aggregation concepts in upcoming articles.
