Issue with elasticsearch pagination; index.max_result_window #83

Open
nazrulworld opened this issue May 20, 2020 · 0 comments
Comments

@nazrulworld (Contributor) commented May 20, 2020

Background

I am doing a portal catalog search that returns more than 100K brains. When I iterate over all of the brains, I get the error below:

2020-05-18 17:45:57,857 INFO    [elasticsearch:83][waitress] POST http://127.0.0.1:9200/danbioapp-backend-portal_catalog/portal_catalog/_bulk [status:200 request:0.024s]
2020-05-18 17:45:58,861 INFO    [elasticsearch:83][waitress] GET http://127.0.0.1:9200/_nodes/_all/http [status:200 request:0.002s]
2020-05-18 17:45:58,912 WARNING [elasticsearch:97][waitress] GET http://127.0.0.1:9200/danbioapp-backend-portal_catalog/portal_catalog/_search?from=10000&stored_fields=path.path&size=50 [status:500 request:0.050s]
2020-05-18 17:45:59,971 ERROR   [Zope.SiteErrorLog:251][waitress] 1589816759.920.727241015445 http://localhost:9090/danbioapp-backend/f....
Traceback (innermost last):
  Module ZPublisher.WSGIPublisher, line 156, in transaction_pubevents
  Module ZPublisher.WSGIPublisher, line 338, in publish_module
  Module ZPublisher.WSGIPublisher, line 256, in publish
  Module ZPublisher.mapply, line 85, in mapply
  Module ZPublisher.WSGIPublisher, line 62, in call_object
  Module Products.ExternalMethod.ExternalMethod, line 230, in __call__
   - __traceback_info__: ((<PloneSite at /danbioapp-backend>,), {}, None)
  Module <string>, line 32, in main
  Module ZTUtils.Lazy, line 201, in __getitem__
  Module collective.elasticsearch.es, line 104, in __getitem__
  Module collective.elasticsearch.es, line 170, in _search
  Module elasticsearch.client.utils, line 76, in _wrapped
  Module elasticsearch.client, line 660, in search
  Module elasticsearch.transport, line 318, in perform_request
  Module elasticsearch.connection.http_urllib3, line 186, in perform_request
  Module elasticsearch.connection.base, line 125, in _raise_error
TransportError: TransportError(500, u'search_phase_execution_exception',
u'Result window is too large, from + size must be less than or equal to: [10000]
but was [10050]. See the scroll api for a more efficient way to request large data sets.
This limit can be set by changing the [index.max_result_window] index level setting.')

I found a related discussion: https://stackoverflow.com/questions/35206409/elasticsearch-2-1-result-window-is-too-large-index-max-result-window

I know it is possible to increase index.max_result_window, but that comes at the cost of memory, since deep from/size paging still collects from + size hits on every shard.
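
For reference, a rough sketch of what raising the limit would look like with the elasticsearch-py client (the host and index name are copied from the log above, the 200000 value is arbitrary, and this only illustrates the trade-off, it is not a recommendation):

    # Sketch, not a recommendation: raise index.max_result_window for one index.
    # Deep from/size paging would still collect (from + size) hits per shard,
    # so the memory cost grows the further you page.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://127.0.0.1:9200"])
    es.indices.put_settings(
        index="danbioapp-backend-portal_catalog",
        body={"index": {"max_result_window": 200000}},
    )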

My idea: would it be possible to use the Elasticsearch scroll API here: https://github.com/collective/collective.elasticsearch/blob/master/src/collective/elasticsearch/es.py#L48 ?
I am not sure whether that would solve the problem; your expert opinion is requested. The kind of iteration I have in mind is sketched below.
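
A minimal sketch of the scroll-based iteration I mean, using elasticsearch.helpers.scan (the host, index name, and stored field are copied from the failing request above; process() is just a placeholder, and none of this is the actual collective.elasticsearch code):

    # Sketch: iterate every hit with the scroll API instead of from/size paging.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch(["http://127.0.0.1:9200"])
    hits = helpers.scan(
        es,
        index="danbioapp-backend-portal_catalog",
        query={"query": {"match_all": {}}, "stored_fields": ["path.path"]},
        size=50,      # batch size per scroll request
        scroll="5m",  # how long the scroll context stays alive between batches
    )
    for hit in hits:
        process(hit)  # placeholder for whatever is done with each brain/path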
Thanks, @vangheem
