
Handling Deep Pagination in Elasticsearch with search_after

2017-07-14 10:58
Original source: https://www.elastic.co/guide/en/elasticsearch/reference/5.5/search-request-search-after.html


Search After

Results can be paginated with from and size, but the cost becomes prohibitive once deep pagination is reached. The index.max_result_window setting, which defaults to 10,000, is a safeguard: search requests take heap memory and time proportional to from + size. The Scroll API is recommended for efficient deep scrolling, but scroll contexts are costly and it is not recommended for real-time user requests. The search_after parameter circumvents this problem by providing a live cursor. The idea is to use the results from the previous page to help retrieve the next page.
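
To make the cost of from + size concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a simplified model that is not spelled out above: every shard collects from + size sorted entries and the coordinating node merges the entries from all shards. The function name and the shard count of 5 are illustrative only.

```python
def hits_to_collect(from_, size, shards=5):
    """Rough count of sorted entries gathered for one from/size request.

    Assumption (simplified model): each shard collects from_ + size
    entries and the coordinating node merges all of them.
    """
    per_shard = from_ + size
    return per_shard * shards


# Same page size, first page versus a page deep in the result set.
first_page = hits_to_collect(from_=0, size=10)
deep_page = hits_to_collect(from_=9990, size=10)
print(first_page, deep_page)
```

Under this model the work grows with the offset even though the page size stays the same, which is the behavior search_after is meant to avoid.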

Suppose that the query to retrieve the first page looks like this:

GET twitter/tweet/_search
{
    "size": 10,
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    },
    "sort": [
        {"date": "asc"},
        {"_uid": "desc"}
    ]
}





A field with one unique value per document should be used as the tiebreaker of the sort specification. Otherwise the sort order for documents that have the same sort values would be undefined. The recommended way is to use the field _uid, which is certain to contain one unique value for each document.
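
The effect of a missing tiebreaker can be illustrated with a small in-memory simulation in Python. This is only a sketch: the documents, the page_after helper and the ascending order on _uid are made up for the illustration and do not call Elasticsearch.

```python
# Hypothetical documents; two of them share the same date.
docs = [
    {"_uid": "tweet#1", "date": 100},
    {"_uid": "tweet#2", "date": 200},
    {"_uid": "tweet#3", "date": 200},
    {"_uid": "tweet#4", "date": 300},
]


def page_after(docs, key, cursor, size):
    """Return the first `size` docs whose sort key is strictly after `cursor`."""
    ordered = sorted(docs, key=key)
    if cursor is not None:
        ordered = [d for d in ordered if key(d) > cursor]
    return ordered[:size]


def by_date(doc):
    return doc["date"]


def by_date_and_uid(doc):
    return (doc["date"], doc["_uid"])


# Sorting on date alone: the cursor 200 cannot tell tweet#2 from tweet#3,
# so tweet#3 is skipped on the second page.
first = page_after(docs, by_date, None, 2)
second = page_after(docs, by_date, by_date(first[-1]), 2)

# Adding the unique _uid as tiebreaker makes every cursor unambiguous.
first_tb = page_after(docs, by_date_and_uid, None, 2)
second_tb = page_after(docs, by_date_and_uid, by_date_and_uid(first_tb[-1]), 2)

print([d["_uid"] for d in second])
print([d["_uid"] for d in second_tb])
```

Without the tiebreaker the second page silently loses a document that shares its date with the last hit of the first page; with it, the walk continues exactly where the previous page stopped.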

The result from the above request includes an array of sort values for each document. These sort values can be used in conjunction with the search_after parameter to start returning results "after" any document in the result list. For instance, we can take the sort values of the last document and pass them to search_after to retrieve the next page of results:

GET twitter/tweet/_search
{
    "size": 10,
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    },
    "search_after": [1463538857, "tweet#654323"],
    "sort": [
        {"date": "asc"},
        {"_uid": "desc"}
    ]
}
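
Building each follow-up request by hand gets tedious, so a client would typically loop. The Python sketch below shows one possible shape under stated assumptions: next_page_body and iter_hits are hypothetical helper names, and search is any callable that takes a request body and returns a response dict laid out like an Elasticsearch search response (hits.hits, each hit carrying its sort array).

```python
import copy


def next_page_body(body, last_hit):
    """Copy the request body and point search_after at the last hit's sort values."""
    nxt = copy.deepcopy(body)
    nxt.pop("from", None)  # leave from at its default when search_after is used
    nxt["search_after"] = list(last_hit["sort"])
    return nxt


def iter_hits(search, body):
    """Yield every hit page by page until a page comes back empty.

    `search` is a caller-supplied function: request body -> response dict.
    """
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        body = next_page_body(body, hits[-1])
```

With the official Python client, search might be wired up as something like lambda b: es.search(index="twitter", body=b); the exact call depends on the client version and is an assumption here, not part of the original text.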





The parameter from must be set to 0 (or -1) when search_after is used.

search_after is not a solution for jumping freely to a random page, but rather for scrolling many queries in parallel. It is very similar to the scroll API, but unlike scroll, the search_after parameter is stateless: it is always resolved against the latest version of the searcher. For this reason the sort order may change during a walk, depending on the updates and deletes of your index.
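
The consequence of this statelessness can be shown with a toy simulation in Python. It is only a sketch: the search_after function below is a stand-in that re-sorts an in-memory list on every call, and the (date, uid) tuples are made-up sort keys.

```python
def search_after(index, cursor, size):
    """Toy stand-in for a search_after request against the current index state."""
    ordered = sorted(index)
    if cursor is not None:
        ordered = [key for key in ordered if key > cursor]
    return ordered[:size]


# Sort keys are (date, uid) tuples; the list plays the role of the live index.
index = [(100, "a"), (200, "b"), (300, "c"), (400, "d")]

page1 = search_after(index, None, 2)

# The index changes between the two requests.
index.remove((300, "c"))   # deleted before it was ever returned
index.append((250, "e"))   # new document that sorts after the cursor
index.append((150, "f"))   # new document that sorts before the cursor

page2 = search_after(index, page1[-1], 2)
print(page1, page2)
```

The second page reflects the index as it is at request time: the deleted entry never appears, the new entry past the cursor does, and the new entry before the cursor is missed by this walk. A scroll, by contrast, keeps returning results from the point-in-time view it started with.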