Understanding Elasticsearch aggregations

2016-03-23 17:22
Where aggregations came from:

Aggregations grew out of the facets module and the long experience of how users use it (and would like to use it) for real-time data analytics purposes.
As such, it serves as the next generation replacement for the functionality we currently refer to as "faceting".
Facets provide a great way to aggregate data within a document set context.
This context is defined by the executed query in combination with the different levels of filters that can be defined (filtered queries, top-level filters, and facet level filters).
While powerful, their implementation is not designed from the ground up to support complex aggregations and is thus limited.


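Facets are powerful, but they do not support complex aggregations.

The strength of aggregations is that they compose: a sub-aggregation runs on the documents in each bucket produced by its parent, so one request can chain several steps and express quite complex analyses.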

Query:

The following request nests three aggregations, each one inside the previous:

{
  "aggs": {
    "1111111": {
      "filter": {},
      "aggs": {
        "2222222": {
          "date_histogram": {
            "field": "基本.上报时间",
            "interval": "1M"
          },
          "aggs": {
            "333333": {
              "cardinality": {
                "field": "基本.网关"
              }
            }
          }
        }
      }
    }
  }
}
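
As a rough illustration of how the nesting is put together, the same request body can be built as a plain Python dict. The helper name `nested_aggs_body` and its parameters are made up for this sketch; only the aggregation names, field names, and options come from the request above. Note that newer Elasticsearch releases deprecate `interval` in favor of `calendar_interval`, so the option name may need adjusting for your version.

```python
import json


def nested_aggs_body(time_field, distinct_field, interval="1M"):
    """Build the three-level aggregation body from the article as a dict.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    return {
        "aggs": {
            "1111111": {
                # Empty filter, copied as-is from the original request.
                "filter": {},
                "aggs": {
                    "2222222": {
                        # Level 2: one bucket per calendar interval.
                        "date_histogram": {
                            "field": time_field,
                            "interval": interval,
                        },
                        "aggs": {
                            # Level 3: distinct count inside each bucket.
                            "333333": {
                                "cardinality": {"field": distinct_field}
                            }
                        },
                    }
                },
            }
        }
    }


body = nested_aggs_body("基本.上报时间", "基本.网关")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Passing `interval="1y"` gives the yearly variant of the same request.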


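Search result:

Explanation:

Because 1111111 does no real filtering (it contains only an empty filter and the sub-aggregation 2222222), it leaves the original data untouched and hands it straight to the second aggregation.

Because 2222222 is a date_histogram aggregation, it groups the documents into one bucket per month, and each bucket also carries a document count: "doc_count": 3

Because 333333 is a cardinality aggregation, it runs on each bucket produced by 2222222 (how do we know it runs per bucket? Setting the date_histogram interval to 1y makes the difference visible), de-duplicates the field values in that bucket, and reports the number of distinct values: "value": 1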

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 26,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "1111111": {
      "2222222": {
        "buckets": [
          {
            "333333": {
              "value": 1
            },
            "key_as_string": "2016/03/01 00:00:00",
            "key": 1456790400000,
            "doc_count": 3
          },
          ......
          {
            "333333": {
              "value": 1
            },
            "key_as_string": "2016/12/01 00:00:00",
            "key": 1480550400000,
            "doc_count": 3
          }
        ]
      },
      "doc_count": 26
    }
  }
}
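
To read such a response in code, one can walk the same path the JSON shows: aggregations, then the outer name, then the histogram name, then buckets. The sketch below assumes the response has already been parsed into a Python dict. The function name `bucket_rows` is hypothetical, and the sample dict contains only the two buckets printed above (the elided ones are left out).

```python
def bucket_rows(response, outer="1111111", histogram="2222222", metric="333333"):
    """Yield (key_as_string, doc_count, distinct_count) per histogram bucket."""
    buckets = response["aggregations"][outer][histogram]["buckets"]
    for bucket in buckets:
        yield bucket["key_as_string"], bucket["doc_count"], bucket[metric]["value"]


# Trimmed copy of the response above: only the two buckets that are shown.
response = {
    "aggregations": {
        "1111111": {
            "2222222": {
                "buckets": [
                    {
                        "333333": {"value": 1},
                        "key_as_string": "2016/03/01 00:00:00",
                        "key": 1456790400000,
                        "doc_count": 3,
                    },
                    {
                        "333333": {"value": 1},
                        "key_as_string": "2016/12/01 00:00:00",
                        "key": 1480550400000,
                        "doc_count": 3,
                    },
                ]
            },
            "doc_count": 26,
        }
    }
}

for row in bucket_rows(response):
    print(row)
```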


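With the interval set to 1y, the result is as follows:

The single yearly bucket of 2222222 has "doc_count": 26

333333 has "value": 10

The yearly bucket holds 10 distinct values, against 1 in each monthly bucket shown earlier, which confirms that the cardinality is computed separately for each bucket.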

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 26,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "1111111": {
      "2222222": {
        "buckets": [
          {
            "333333": {
              "value": 10
            },
            "key_as_string": "2016/01/01 00:00:00",
            "key": 1451606400000,
            "doc_count": 26
          }
        ]
      },
      "doc_count": 26
    }
  }
}
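
The per-bucket behavior can be imitated locally to see why the monthly and yearly numbers differ. The sketch below is not how Elasticsearch computes anything: it counts exactly with Python sets, whereas the real cardinality aggregation returns an approximate count. The sample documents, the gateway names, and the helper `distinct_per_bucket` are all invented for illustration.

```python
from collections import defaultdict
from datetime import date


def distinct_per_bucket(docs, bucket_key):
    """Map each bucket to (doc_count, distinct_gateway_count).

    bucket_key turns a document's date into its bucket, playing the role
    of the date_histogram interval.
    """
    groups = defaultdict(list)
    for doc in docs:
        groups[bucket_key(doc["time"])].append(doc["gateway"])
    return {
        key: (len(values), len(set(values)))
        for key, values in sorted(groups.items())
    }


# Invented sample data: in each month a single gateway reports twice.
docs = [
    {"time": date(2016, 3, 1), "gateway": "gw-a"},
    {"time": date(2016, 3, 9), "gateway": "gw-a"},
    {"time": date(2016, 4, 2), "gateway": "gw-b"},
    {"time": date(2016, 4, 20), "gateway": "gw-b"},
    {"time": date(2016, 5, 5), "gateway": "gw-c"},
    {"time": date(2016, 5, 6), "gateway": "gw-c"},
]

monthly = distinct_per_bucket(docs, lambda d: (d.year, d.month))
yearly = distinct_per_bucket(docs, lambda d: d.year)
print(monthly)
print(yearly)
```

With this data every monthly bucket has one distinct gateway, while the single yearly bucket has three, the same pattern as the 1 versus 10 seen in the two responses.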