Aggregate.bucketAuto
1. Interface Description
Function: Aggregation stage. Divides input records into groups based on the given conditions; each group is a bucket.
Declaration:
bucketAuto({ groupBy: <expression>, buckets: <number>, granularity: <string>, output: { <output1>: <accumulator expr>, ... <outputN>: <accumulator expr> } })
Notes:
Unlike bucket, bucketAuto does not require boundaries to be specified; it automatically attempts to distribute the records as evenly as possible across the groups.
Each group is output as one record containing an _id field, whose value is an object holding the min and max values of the group, and a count field holding the number of records in the group. count is output by default when output is not specified.
2. Input Parameters
Parameter | Type | Required | Description |
---|---|---|---|
groupBy | expression | Yes | Expression used to determine grouping; see the details below. |
buckets | number | Yes | A positive integer specifying the number of groups. |
granularity | string | No | Numeric series that the calculated boundaries must follow; see the details below. |
output | Object | No | Fields to include in each output record; see the details below. |
groupBy is an expression used to determine grouping, applied to each input record. To group by a field, use a $ prefix followed by the field path as the expression. Unless a default value is specified with default, each record must contain the specified field, and the field value must be within the range specified by boundaries.
granularity is an optional enum string used to ensure that the automatically calculated boundaries conform to a given rule. It can only be used when all groupBy values are numbers and none of them is NaN. Enum values include: R5, R10, R20, R40, R80, 1-2-5, E6, E12, E24, E48, E96, E192, POWERSOF2.
output is optional and determines which fields, in addition to _id, the output records contain. The value of each field must be specified with an accumulator expression. When output is specified, count is no longer output by default and must be specified manually:
output: {
  count: $.sum(1),
  ...
  <outputN>: <accumulator expr>
}
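As an illustration of a custom output, the sketch below requests a record count and an average price per group. It assumes a db handle obtained from app.database() and an items collection with a numeric price field, as in the sample later on this page; the helper name bucketAutoWithOutput and the avgPrice field are hypothetical.

```javascript
// Hypothetical helper: `db` is assumed to be the handle returned by app.database().
function bucketAutoWithOutput(db) {
  const $ = db.command.aggregate;
  return db
    .collection("items")
    .aggregate()
    .bucketAuto({
      groupBy: "$price",
      buckets: 3,
      output: {
        // count must be requested explicitly once output is given
        count: $.sum(1),
        // avgPrice is an illustrative field name, not part of the API
        avgPrice: $.avg("$price"),
      },
    })
    .end();
}
```

Inside a cloud function this could be called as const res = await bucketAutoWithOutput(db), after which each record in res.data would be expected to carry _id, count, and avgPrice.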
In the following cases, the number of output groups may be less than the given number:
- The number of input records is less than the number of groups.
- The number of unique values produced by the groupBy expression is less than the number of groups.
- The granularity has fewer intervals than the number of groups.
- The granularity is not fine enough to distribute the records evenly across the given number of groups.
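The first two cases can be reproduced with a rough local simulation. This is a simplified sketch, not the service's actual algorithm: the function simulateBucketAuto is hypothetical, ignores granularity, and assumes plain numeric values.

```javascript
// Hypothetical local simulation of even bucketing (no granularity support).
// Sorts the values, fills groups of roughly equal size, and keeps equal
// values together so that each value belongs to exactly one group.
function simulateBucketAuto(values, buckets) {
  const sorted = [...values].sort((a, b) => a - b);
  const size = Math.ceil(sorted.length / buckets);
  const groups = [];
  let i = 0;
  while (i < sorted.length) {
    let end = Math.min(i + size, sorted.length);
    // Extend the group while the next value equals the last one in it.
    while (end < sorted.length && sorted[end] === sorted[end - 1]) end++;
    groups.push({ min: sorted[i], count: end - i });
    i = end;
  }
  // max is the min of the next group, or the largest value for the last group.
  return groups.map((g, k) => ({
    _id: {
      min: g.min,
      max: k + 1 < groups.length ? groups[k + 1].min : sorted[sorted.length - 1],
    },
    count: g.count,
  }));
}
```

For the five prices used in the sample on this page, simulateBucketAuto([10.5, 50.3, 20.8, 80.2, 200.3], 3) yields three groups with counts 2, 2, and 1. simulateBucketAuto([1, 2], 3) yields only two groups because there are fewer records than groups, and simulateBucketAuto([5, 5, 5, 5], 3) collapses into a single group because there is only one unique value.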
granularity Details
granularity ensures that boundary values conform to a given numeric sequence.
Renard Series
The Renard series are number sequences derived from the 5th, 10th, 20th, 40th, or 80th root of 10, with values ranging from 1.0 to 10.0 (10.3 for R80).
Setting granularity to R5, R10, R20, R40, or R80 constrains boundary values to the chosen series. If a groupBy value falls outside the 1.0 to 10.0 range (10.3 for R80), the series values are scaled by a power of 10.
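To make the root-of-10 construction concrete, the sketch below computes approximate Renard values as 10^(i/n). The function name renardApprox is hypothetical, and the results are only approximations: the published Renard tables use conventionally rounded values that can differ slightly from the computed ones.

```javascript
// Hypothetical helper: approximate the Rn series as 10^(i/n) for i = 0..n,
// rounded to the given number of decimal places.
function renardApprox(n, decimals = 1) {
  const scale = Math.pow(10, decimals);
  const values = [];
  for (let i = 0; i <= n; i++) {
    values.push(Math.round(Math.pow(10, i / n) * scale) / scale);
  }
  return values;
}

console.log(renardApprox(5)); // [ 1, 1.6, 2.5, 4, 6.3, 10 ]
```

Multiplying every value by 10 gives the next decade (10, 16, 25, 40, 63, 100), which is how a series covers groupBy values outside the 1.0 to 10.0 range.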
E Series
The E series are number sequences derived from the 6th, 12th, 24th, 48th, 96th, or 192nd root of 10 with a specific tolerance, with values ranging from 1.0 to 10.0.
1-2-5 Series
The 1-2-5 series behaves like a hypothetical three-value Renard series: boundary values are taken from 1, 2, and 5, scaled by powers of 10.
Powers of Two Series
A series of numbers composed of powers of two.
3. Response
Parameter | Type | Required | Description |
---|---|---|---|
- | Aggregate | Yes | Aggregation object |
4. Sample Code
Suppose the collection items contains the following records:
{
  _id: "1",
  price: 10.5
}
{
  _id: "2",
  price: 50.3
}
{
  _id: "3",
  price: 20.8
}
{
  _id: "4",
  price: 80.2
}
{
  _id: "5",
  price: 200.3
}
Automatically group the above records into three groups:
const tcb = require("@cloudbase/node-sdk");

// Initialize the SDK; replace "xxx" with your environment ID.
const app = tcb.init({
  env: "xxx",
});
const db = app.database();
const $ = db.command.aggregate;

exports.main = async (event, context) => {
  // Let bucketAuto choose the boundaries for three groups based on price.
  const res = await db
    .collection("items")
    .aggregate()
    .bucketAuto({
      groupBy: "$price",
      buckets: 3,
    })
    .end();
  console.log(res.data);
};
The returned result is as follows:
{
  "_id": {
    "min": 10.5,
    "max": 50.3
  },
  "count": 2
}
{
  "_id": {
    "min": 50.3,
    "max": 200.3
  },
  "count": 2
}
{
  "_id": {
    "min": 200.3,
    "max": 200.3
  },
  "count": 1
}