Aggregation Queries
Aggregation is primarily used to process data (such as calculating averages, summing, etc.) and return the computed results.
Pipeline/Pipe
An aggregation is a pipelined batch operation: the input documents pass through a sequence of pipeline stages, each of which transforms the output of the previous stage, to produce the final aggregation result.
Assume there is a collection books containing records in the following format:
[
  {
    "_id": "xxx",
    "category": "Novel",
    "name": "The Catcher in the Rye",
    "onSale": true,
    "sales": 80
  }
]
A sample aggregation is shown below:
// Cloud Function Example
const cloudbase = require("@cloudbase/node-sdk");
const app = cloudbase.init();
const db = app.database();
const $ = db.command.aggregate;

exports.main = async (event, context) => {
  const res = await db
    .collection("books")
    .aggregate()
    .match({
      onSale: true, // Whether on sale
    })
    .group({
      // Group by the category field
      _id: "$category",
      // Give each output record an avgSales field whose value is the
      // average of the sales field across all records in the group
      avgSales: $.avg("$sales"),
    })
    .end();
  return {
    res,
  };
};
Stage 1: The match stage filters the documents in the collection (onSale: true selects the books that are on sale) and passes them to the next stage.
Stage 2: The group stage groups the documents by the category field and calculates the average of the sales field across all records in each group.
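To make the two stages concrete, the following plain JavaScript sketch reproduces the same computation in memory, without the SDK. The sample records and the helper name averageSalesByCategory are illustrative assumptions and not part of the original example.

```javascript
// Hypothetical sample data in the same shape as the books collection.
const books = [
  { _id: "1", category: "Novel", name: "The Catcher in the Rye", onSale: true, sales: 80 },
  { _id: "2", category: "Novel", name: "Book B", onSale: true, sales: 40 },
  { _id: "3", category: "Science", name: "Book C", onSale: true, sales: 30 },
  { _id: "4", category: "Science", name: "Book D", onSale: false, sales: 90 },
];

function averageSalesByCategory(records) {
  // Stage 1 (match): keep only the records that are on sale.
  const matched = records.filter((r) => r.onSale === true);

  // Stage 2 (group): accumulate a sum and a count per category.
  const groups = new Map();
  for (const r of matched) {
    const g = groups.get(r.category) || { sum: 0, count: 0 };
    g.sum += r.sales;
    g.count += 1;
    groups.set(r.category, g);
  }

  // Emit one record per group, shaped like the output of the group stage.
  return [...groups.entries()].map(([category, g]) => ({
    _id: category,
    avgSales: g.sum / g.count,
  }));
}

console.log(averageSalesByCategory(books));
// [ { _id: 'Novel', avgSales: 60 }, { _id: 'Science', avgSales: 30 } ]
```

Book D is excluded by the match step, so it does not affect the Science average. The in-memory version returns groups in insertion order; a real pipeline should not be assumed to return them in any particular order unless a sort stage is added.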
API and Operators
Refer to the Node.js SDK API documentation for a complete list of aggregation stage APIs and operators.
Optimizing Execution
Using Indexes
The match and sort stages can use indexes when they appear at the beginning of the pipeline. The geoNear stage can also use a geospatial index, but geoNear must be the first stage in the pipeline.
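As a minimal sketch of this ordering, the function below places match and sort ahead of every other stage. It assumes that db is the handle returned by app.database() and that the books collection has indexes covering category and sales; the function name and the indexes are hypothetical.

```javascript
// Sketch only: `db` is assumed to be the handle returned by app.database().
// Assumption: the books collection has indexes covering `category` and `sales`.
function topNovelsBySales(db) {
  return db
    .collection("books")
    .aggregate()
    // match is the first stage, so it is eligible to use an index.
    .match({ category: "Novel" })
    // sort follows directly, before any stage that reshapes the documents.
    .sort({ sales: -1 })
    .limit(10)
    .end();
}
```

Inside a cloud function this could be called as const res = await topNovelsBySales(db). Whether an index is actually used depends on the indexes defined on the collection.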
Reduce the dataset as early as possible
When the entire collection is not needed, apply the match, limit, and skip stages as early as possible to reduce the number of records that later stages must process.
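One way to apply this is a paged query that filters and trims the record set before any field selection takes place. The sketch below assumes that db is the handle returned by app.database(); the function name and the page and pageSize parameters are hypothetical.

```javascript
// Sketch only: `db` is assumed to be the handle returned by app.database().
// `page` is 1-based; `pageSize` is the number of records per page.
function pageOfOnSaleBooks(db, page, pageSize) {
  return db
    .collection("books")
    .aggregate()
    // Filter first so that later stages see only on-sale books.
    .match({ onSale: true })
    // A fixed order keeps the pages stable between calls.
    .sort({ _id: 1 })
    // Drop the earlier pages, then keep a single page of records.
    .skip((page - 1) * pageSize)
    .limit(pageSize)
    // Field selection now runs on at most pageSize records.
    .project({ name: 1, sales: 1 })
    .end();
}
```

For example, pageOfOnSaleBooks(db, 3, 10) skips the first 20 matching records and passes at most 10 records to the project stage.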
Notes
Except for the match stage, the operators used in the objects passed to the aggregation stages are aggregation operators. The match stage performs query matching, so its syntax is the same as that of a regular where query and it uses the standard query operators.
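The distinction can be sketched as follows: the match stage takes query operators from db.command, while the group stage takes aggregation operators from db.command.aggregate. The sketch assumes that db is the handle returned by app.database(); the function name and the threshold of 50 are hypothetical.

```javascript
// Sketch only: `db` is assumed to be the handle returned by app.database().
function salesTotalsForPopularBooks(db) {
  const _ = db.command; // query operators, as used with where
  const $ = db.command.aggregate; // aggregation operators

  return db
    .collection("books")
    .aggregate()
    // match uses query syntax: a query operator applied to a field name.
    .match({ onSale: true, sales: _.gt(50) })
    // group uses aggregation operators and "$field" references.
    .group({ _id: "$category", totalSales: $.sum("$sales") })
    .end();
}
```

Note that fields are named by key in match (sales), but referenced with a "$" prefix inside aggregation operators ("$sales").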
FAQ
Sort exceeded memory limit of 104857600 bytes
{"Error":{"Code":"FailedOperation","Message":"(Location16820) Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting. Aborting operation. Pass allowDiskUse:true to opt in."},"RequestId":"1728459320973_0.7685102267589137_33591173-19270342dd4_15"}
This error is raised by MongoDB when a sort exceeds its in-memory limit of 104857600 bytes (100 MB).
Solutions
- Use the project stage to drop unneeded fields so that the sort runs on a smaller dataset.
- Sort on as few fields as possible in the sort stage.
- Filter with the match stage first, and place the sort after the match stage or as late in the pipeline as possible.
- Where possible, sort on indexed fields so that the sort can use an index.
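The first three suggestions can be combined in one pipeline, sketched below under the assumption that db is the handle returned by app.database(). The function name, the selected fields, and the limit of 20 are hypothetical.

```javascript
// Sketch only: `db` is assumed to be the handle returned by app.database().
function topSellersWithSmallSort(db) {
  return db
    .collection("books")
    .aggregate()
    // Filter first so that fewer records reach the sort.
    .match({ onSale: true })
    // Keep only the fields that are needed, so each sorted record is smaller.
    .project({ name: 1, sales: 1 })
    // Sort on a single field.
    .sort({ sales: -1 })
    .limit(20)
    .end();
}
```

This ordering trades index use for smaller sort input, because a sort placed after project is no longer at the beginning of the pipeline. If sales is indexed, placing the sort directly after match may be the better choice.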