Bucket

💡 Categorizes incoming documents into groups, called buckets, based on a specified expression and bucket boundaries, and outputs one document per bucket. It can also compute statistics for each bucket.

Let's build a $bucket stage. It sorts documents into categories that we define and computes values for each category.

Boundaries define the ranges (levels), such as 0-18, 18-30, 30-50, 50-80, and 80-120. Each range includes its first value but not its last: 18-30 means 18 is in the range but 30 is not. No document falls in the 0-18 or 80-120 range, so the following query returns only three buckets.

{ "_id" : 18, "numPersons" : 868, "average" : 25.101382488479263 } => 18 to less than 30
{ "_id" : 30, "numPersons" : 1828, "average" : 39.4917943107221 } => 30 to less than 50
{ "_id" : 50, "numPersons" : 2304, "average" : 61.46440972222222 } => 50 to less than 80

> db.persons.aggregate([
    {
        $bucket: {
            groupBy: '$dob.age',
            boundaries: [0, 18, 30, 50, 80, 120],
            output: {
                numPersons: { $sum: 1 },
                average: { $avg: '$dob.age' },
            }
        }
    }
]).pretty()

**Output**
{ "_id" : 18, "numPersons" : 868, "average" : 25.101382488479263 } 
{ "_id" : 30, "numPersons" : 1828, "average" : 39.4917943107221 }
{ "_id" : 50, "numPersons" : 2304, "average" : 61.46440972222222 }
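
The half-open boundary rule can be checked outside the database. The sketch below is plain JavaScript, not MongoDB's implementation: `bucketFor` and `bucketCounts` are hypothetical helper names, and the sample ages are made up for illustration rather than taken from the persons collection.

```javascript
// Hypothetical helper: returns the lower boundary of the bucket that `value`
// falls into, or null when it lies outside all boundaries.
// Each bucket is half-open: lower <= value < upper.
function bucketFor(value, boundaries) {
  for (let i = 0; i < boundaries.length - 1; i++) {
    if (value >= boundaries[i] && value < boundaries[i + 1]) {
      return boundaries[i];
    }
  }
  return null;
}

// Hypothetical helper: counts values per bucket, keyed by lower boundary.
// Empty buckets never get a key, which mirrors how $bucket omits them.
// Note: the real $bucket stage raises an error for values outside the
// boundaries unless a `default` is given; this sketch simply skips them.
function bucketCounts(values, boundaries) {
  const counts = {};
  for (const v of values) {
    const id = bucketFor(v, boundaries);
    if (id !== null) counts[id] = (counts[id] || 0) + 1;
  }
  return counts;
}

const boundaries = [0, 18, 30, 50, 80, 120];
const sampleAges = [18, 29, 30, 49, 50, 79]; // made-up ages for illustration

console.log(bucketFor(30, boundaries)); // 30: the lower bound is inclusive
console.log(bucketFor(29, boundaries)); // 18: the upper bound is exclusive
console.log(bucketCounts(sampleAges, boundaries)); // { '18': 2, '30': 2, '50': 2 }
```

The 0-18 and 80-120 ranges get no key because no sample age falls in them, which is why the query above returns three buckets instead of five.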

There are no people younger than 18, older than 80, or exactly 80 years old. The following queries return nothing because no document satisfies those conditions.

> db.persons.find({'dob.age': {$lt: 18}})
**Output**

> db.persons.find({'dob.age': {$gt: 80}})
**Output**

> db.persons.find({'dob.age': 80})
**Output**

Querying each bucket range directly gives the same counts as numPersons:

> db.persons.find({'dob.age': {$gte: 18, $lt: 30}}).count()
868

> db.persons.find({'dob.age': {$gte: 30, $lt: 50}}).count()
1828

> db.persons.find({'dob.age': {$gte: 50, $lt: 80}}).count()
2304
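
The same cross-check can be generated for any boundaries array by building one filter per adjacent pair of boundaries. This is a minimal sketch in plain JavaScript; `rangeFilters` is a hypothetical helper, and it uses `$gte`/`$lt` so that each filter matches the bucket's inclusive lower bound and exclusive upper bound.

```javascript
// Hypothetical helper: builds one query filter per bucket range.
function rangeFilters(field, boundaries) {
  const filters = [];
  for (let i = 0; i < boundaries.length - 1; i++) {
    // Computed key so the field name (e.g. 'dob.age') can be passed in.
    filters.push({ [field]: { $gte: boundaries[i], $lt: boundaries[i + 1] } });
  }
  return filters;
}

const filters = rangeFilters('dob.age', [0, 18, 30, 50, 80, 120]);
console.log(JSON.stringify(filters[1])); // {"dob.age":{"$gte":18,"$lt":30}}
```

In the shell, each generated filter could then be passed to `db.persons.find(filter).count()` and compared with the matching bucket's `numPersons`.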

Adding more levels

{ _id: 18, numPersons: 868, average: 25.101382488479263 }, => 18 to less than 30
{ _id: 30, numPersons: 910, average: 34.51758241758242 }, => 30 to less than 40
{ _id: 40, numPersons: 918, average: 44.42265795206972 }, => 40 to less than 50
{ _id: 50, numPersons: 976, average: 54.533811475409834 }, => 50 to less than 60
{ _id: 60, numPersons: 1328, average: 66.55798192771084 } => 60 to less than 120

> db.persons.aggregate([
    {
        $bucket: {
            groupBy: '$dob.age',
            boundaries: [18, 30, 40, 50, 60, 120],
            output: {
                numPersons: { $sum: 1 },
                average: { $avg: '$dob.age' },
            }
        }
    }
]).pretty()

**Output**
[
  { _id: 18, numPersons: 868, average: 25.101382488479263 },
  { _id: 30, numPersons: 910, average: 34.51758241758242 },
  { _id: 40, numPersons: 918, average: 44.42265795206972 },
  { _id: 50, numPersons: 976, average: 54.533811475409834 },
  { _id: 60, numPersons: 1328, average: 66.55798192771084 }
]

We can also create buckets automatically with $bucketAuto by specifying how many buckets we want. MongoDB then picks the boundaries so that the documents are distributed almost evenly across the buckets.

> db.persons.aggregate([
    {
        $bucketAuto: {
            groupBy: '$dob.age',
            buckets: 5,
            output: {
                numPersons: { $sum: 1 },
                average: { $avg: '$dob.age' },
            }
        }
    }
]).pretty()

**Output**

{
	"_id" : {
		"min" : 21,
		"max" : 32
	},
	"numPersons" : 1042,
	"average" : 25.99616122840691
}
{
	"_id" : {
		"min" : 32,
		"max" : 43
	},
	"numPersons" : 1010,
	"average" : 36.97722772277228
}
{
	"_id" : {
		"min" : 43,
		"max" : 54
	},
	"numPersons" : 1033,
	"average" : 47.98838334946757
}
{
	"_id" : {
		"min" : 54,
		"max" : 65
	},
	"numPersons" : 1064,
	"average" : 58.99342105263158
}
{
	"_id" : {
		"min" : 65,
		"max" : 74
	},
	"numPersons" : 851,
	"average" : 69.11515863689776
}
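
To get an intuition for how near-equal buckets arise, the sketch below sorts the values and splits them into chunks of near-equal length. It is a simplified illustration, not MongoDB's actual algorithm: `autoBuckets` is a hypothetical helper, the ages are made up, and it ignores the `granularity` option as well as the server's handling of duplicate values at a chunk edge.

```javascript
// Hypothetical helper: equal-frequency bucketing over a list of numbers.
function autoBuckets(values, numBuckets) {
  const sorted = [...values].sort((a, b) => a - b);
  const n = Math.min(numBuckets, sorted.length);
  const chunks = [];
  for (let i = 0; i < n; i++) {
    // Slice indices chosen so chunk sizes differ by at most one.
    const start = Math.floor((i * sorted.length) / n);
    const end = Math.floor(((i + 1) * sorted.length) / n);
    chunks.push(sorted.slice(start, end));
  }
  return chunks.map((chunk, i) => ({
    _id: {
      min: chunk[0],
      // max is the next bucket's min (exclusive), except for the last
      // bucket, where it is the largest value seen (inclusive).
      max: i + 1 < chunks.length ? chunks[i + 1][0] : chunk[chunk.length - 1],
    },
    count: chunk.length,
  }));
}

const demoAges = [21, 22, 25, 31, 33, 40, 47, 52, 60, 74]; // made-up ages
console.log(autoBuckets(demoAges, 5));
```

As in the $bucketAuto output above, each bucket's max equals the next bucket's min, and only the last bucket ends at the largest value in the data.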
