MongoDB中的⼀个分组统计的查询(Distinct)数据很简单,如下:
> db.t2.find();
{ "country" : "china", "province" : "sh", "userid" : "a" }
{ "country" : "china", "province" : "sh", "userid" : "b" }
{ "country" : "china", "province" : "sh", "userid" : "a" }
{ "country" : "china", "province" : "sh", "userid" : "c" }
{ "country" : "china", "province" : "bj", "userid" : "da" }
{ "country" : "china", "province" : "bj", "userid" : "fa" }
需求是统计出每个country/province下的userid的数量(同⼀个userid只统计⼀次)
过程如下。
⾸先试着这样来统计:
> db.t2.aggregate([ { $group: {"_id": { "country" : "$country", "prov": "$province"} , "number":{$sum:1}} } ])
但是这样的结果是错误的:
{ "_id" : { "country" : "china", "prov" : "bj" }, "number" : 2 }
{ "_id" : { "country" : "china", "prov" : "sh" }, "number" : 4 }
原因是,这样来统计不能区分userid相同的情况 (上⾯的数据中sh有两个 userid = a)
为了解决这个问题,⾸先执⾏⼀个group,其id 是 country, province, userid三个field:
> db.t2.aggregate([ { $group: {"_id": { "country" : "$country", "province": "$province" , "uid" : "$userid" } } } ])
结果为
{ "_id" : { "country" : "china", "province" : "bj", "uid" : "fa" } }
{ "_id" : { "country" : "china", "province" : "bj", "uid" : "da" } }
{ "_id" : { "country" : "china", "province" : "sh", "uid" : "c" } }
{ "_id" : { "country" : "china", "province" : "sh", "uid" : "b" } }
{ "_id" : { "country" : "china", "province" : "sh", "uid" : "a" } }
可以看出,这步的⽬的是把相同的userid只剩下⼀个。
然后第⼆步,再第⼀步的结果之上再执⾏统计:
>db.t2.aggregate([
{ $group: {"_id": { "country" : "$country", "province": "$province" , "uid" : "$userid" } } } ,
{ $group: {"_id": { "country" : "$_id.country", "province": "$_id.province" }, count : { $sum : 1 } } }
])
这回就对了:
{ "_id" : { "country" : "china", "province" : "sh" }, "count" : 3 }distinct查询
{ "_id" : { "country" : "china", "province" : "bj" }, "count" : 2 }
为了让结果好看点,加⼊⼀个$project操作符,把_id分开:
>db.t2.aggregate([ { $group: {"_id": { "country" : "$country", "province": "$province" , "uid" : "$userid" } } } , { $group: {"_id": { "country" : "$_id.country", "province": "$_id.province" }, count: { $sum : 1 } } },
{ $project : {"_id": 0, "country" : "$_id.country", "province" : "$_id.province", "count" : 1}}
])
{ "count" : 3, "country" : "china", "province" : "sh" }
{ "count" : 2, "country" : "china", "province" : "bj" }
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系QQ:729038198,我们将在24小时内删除。
发表评论