Limitation of processes in result lists #4331
Comments
The 10,000-hit limit comes from the Elasticsearch API call currently used, Search Request. Elasticsearch 5.6 offers two other search calls that can retrieve more than 10,000 hits or be used to paginate through the hits: the scroll variant and the "search after" variant. The call currently used and the scroll variant are deprecated in newer Elasticsearch versions, and only the "search after" variant should be used for large numbers of hits.
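For illustration, here is a minimal sketch of how "search after" pagination could walk past the 10,000-hit window. It is not Kitodo code. The helper name `iter_all_hits`, the sort field `id`, and the in-memory `fake_search` stand-in are assumptions made for this example; only the request keys `size`, `query`, `sort` and `search_after` and the per-hit `sort` values in the response are taken from the Elasticsearch search API. The sort field is assumed to be unique per document, otherwise hits could be skipped at page boundaries.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Any callable that takes a search request body and returns the response dict.
SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_all_hits(search: SearchFn, query: Dict[str, Any],
                  sort_field: str = "id",
                  page_size: int = 1000) -> Iterator[Dict[str, Any]]:
    """Yield every hit, page by page, using the sort values of the last hit
    of each page as the `search_after` cursor for the next request."""
    search_after: Optional[List[Any]] = None
    while True:
        body: Dict[str, Any] = {
            "size": page_size,
            "query": query,
            # Assumption: `sort_field` is unique per document.
            "sort": [{sort_field: "asc"}],
        }
        if search_after is not None:
            body["search_after"] = search_after
        hits = search(body)["hits"]["hits"]
        if not hits:
            return  # an empty page means everything has been read
        yield from hits
        search_after = hits[-1]["sort"]


def fake_search(body: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical in-memory stand-in for the search endpoint (25 documents)."""
    docs = [{"_source": {"id": i}, "sort": [i]} for i in range(1, 26)]
    after = body.get("search_after")
    if after is not None:
        docs = [d for d in docs if d["sort"][0] > after[0]]
    return {"hits": {"hits": docs[: body["size"]]}}


all_hits = list(iter_all_hits(fake_search, {"match_all": {}}, page_size=10))
```

In a real setup, `search` would be a thin wrapper around whatever client call sends the body to the index's `_search` endpoint. Because each request only asks for the next `page_size` hits after the cursor, no single request needs a result window larger than the page size.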
Do you really want to look through more than 10,000 hits? Perhaps at this point you should rephrase your search query. If you really want to display 10,000 or more hits, this is typically not a search engine task but a database task, and querying the search engine index is the wrong approach for it.
Your assumption might be correct, and we do have the demand: in 2.x we query the database for the following use cases, among others: (1) the number of images in the processes of a newspaper title, and (2) analysis of missing metadata. It would be possible to split the query by year or decade and then add up the results. However, that is complicated, and I would be afraid of missing some results. Furthermore, Kitodo 3.x would then need to offer a usable query language; for example, the "does not contain" query does not work (#3523).
In both cases, as I understand it, an external program that looks directly into the database and file system would be the better alternative.
I cannot assess the issue technically. If there is a better solution, I'd be glad if it were implemented.
It is now possible to create Excel lists with the results of more than 10,000 processes. Thus, I will close the issue as not planned.
Problem
In Kitodo.Production 3.x, the number of processes in the result list seems to be limited to 10,000. The cause is a limit in Elasticsearch (#4277).
In some cases more hits have to be retrieved, for example:
This also affects the completeness of the generated Excel files (#4099).
Solution
The result list should include all retrieved processes. This should be taken into account if #4208 is accepted.