[opencms-dev] Wrong Solr numfound when using setCheckPermissions

Malu malupolo at gmail.com
Mon Dec 18 13:55:43 CET 2023


Hi,



We have a question about behaviour we have seen when calling
setCheckPermissions(false) on a Solr index in OpenCms 12.



Our index holds about 24k documents, and we offer about half of them to
users through the search on our site. The search page also has a pagination
form that lets you jump to the last page of results.



We noticed that when the permission check is on with
setCheckPermissions(true), which we believe is the default in OpenCms, a
request for the last page (with our default of 12 rows per page) takes a
very long time to resolve. We know that deep paging can be a performance
problem in Solr (see
<https://solr.apache.org/guide/6_6/pagination-of-results.html#performance-problems-with-deep-paging>),
but we would expect that with millions of documents, not with a small index
like ours of 24k.



After looking through the core code in the OpenCms GitHub repository, we
found the method setCheckPermissions, which let us resolve requests for the
last page quickly and fixed our speed problem.

We use it this way:

OpenCms.getSearchManager().getIndexSolr(ourIndexName).setCheckPermissions(false);



Everything works and we gained speed, but we noticed two strange behaviours
in the numFound returned from Solr and in the pagination:



   - With the permission check set to false, any request to Solr returns a
   numFound equal to the *total number of documents* minus the *rows
   parameter*, so it looks as if there are fewer documents than with the
   check set to true. Here are screenshots of the same index, requesting all
   documents with the check set to true and to false, to show what I mean:

This image is with permissions set to *TRUE* and the simple query
*handleSolrSelect?q=*:*&rows=12*
[image: imagen.png]


The next one is the same query with permissions set to *FALSE*. As you can
see, it *subtracts* the *rows from numFound* (24631 – 12 = 24619):
[image: imagen.png]

Here is a more descriptive example with 500 rows requested (and the fl
parameter set to “path” so the response does not take too long 😊):
[image: imagen.png]
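
To make the arithmetic concrete, here is a minimal Java sketch. It assumes
the value we receive is simply the true hit count minus the requested rows,
which is only our reading of the screenshots and not confirmed OpenCms
behaviour. The class and method names are hypothetical.

```java
// Hypothetical helper: models only what we observe, not OpenCms internals.
class NumFoundOffsetSketch {

    /** numFound as we appear to receive it with setCheckPermissions(false). */
    static long observedNumFound(long trueNumFound, int rows) {
        return trueNumFound - rows;
    }

    public static void main(String[] args) {
        long trueNumFound = 24631L; // numFound with permission checks on
        int rows = 12;              // rows parameter of the query
        System.out.println(observedNumFound(trueNumFound, rows)); // prints 24619
    }
}
```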

   - On the pagination side we have a similar issue, because the JSTL
   variable controllers.pagination seems to calculate the number of pages
   from numFound – rows, the value returned when permissions are set to false.

As a result, a user who requests the last page in our front-end form, say
page 893, appears to get the last page. In fact page 894 also exists, but it
is not shown because for controllers.pagination it doesn’t exist, so we
can’t print it in the form.
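
The missing page follows directly from the reduced numFound. The sketch
below assumes the page count is derived as ceil(numFound / rows); we have
not checked how controllers.pagination actually computes it. The hit count
10720 is an illustrative value chosen to match our 893/894 pages, and the
class and method names are hypothetical.

```java
// Hypothetical sketch: shows why subtracting one page of rows hides the last page.
class PageCountSketch {

    /** Number of pages needed to show numFound hits, rows per page (ceiling division). */
    static long pageCount(long numFound, int rows) {
        if (rows <= 0) {
            throw new IllegalArgumentException("rows must be positive");
        }
        if (numFound <= 0) {
            return 0;
        }
        return (numFound + rows - 1) / rows;
    }

    public static void main(String[] args) {
        int rows = 12;
        long trueNumFound = 10720L;               // assumed example value
        long reducedNumFound = trueNumFound - rows; // value seen with checks off

        System.out.println(pageCount(trueNumFound, rows));    // prints 894
        System.out.println(pageCount(reducedNumFound, rows)); // prints 893
    }
}
```

Because exactly one page worth of rows is subtracted, the computed page
count is always one less than the real one, whatever the true numFound is.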


With all this info, we have a couple of questions:

   - Is this a bug? If not, why is numFound reduced by rows in the Solr
   response when permissions are set to false?
   - Is there any problem with *always* setting permissions to false on the
   index to get the speed we need on the last page?


Thanks for your help.