[opencms-dev] Solr and limited rows

Rüdiger Kurz r.kurz at alkacon.com
Wed Oct 2 21:02:21 CEST 2013


Hi,

OpenCms performs a permission check on every document in the result, 
discards those the current user is not allowed to read, and refills the 
result with the next best matching documents on the fly. This security 
check is very expensive.

For this reason the results are/were limited to 50. The algorithm 
multiplies this limit by a maximum factor (in your case 5, which gives 
the rows=250 you see in the log), so that even if we throw away 200 
documents during the permission check, 50 docs can still be returned. 
Afterwards we rebuild the response completely. You can argue against 
this approach, and moving the permission check into the Solr index 
itself is also up for discussion. In the regular case where you have 
pagination this does not matter, since displaying more than 50 docs per 
page is really uncomfortable for the end user.
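
To make the arithmetic concrete, here is a minimal, self-contained 
sketch of the over-fetch-and-filter idea. It is only an illustration 
and not the actual OpenCms code: the class name, the method names and 
the "every fifth document is readable" rule are all made up for the 
example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustrative sketch only; not the OpenCms implementation.
class OverFetchSketch {

    // Rows to request from the index: the wanted page size times a safety factor.
    static int rowsToFetch(int requestedRows, int factor) {
        return requestedRows * factor;
    }

    // Walks the over-fetched candidates in ranking order, drops the ones the
    // user may not read, and stops once enough readable documents are collected.
    static <T> List<T> filterVisible(List<T> candidates, Predicate<T> mayRead, int requestedRows) {
        List<T> visible = new ArrayList<>();
        for (T doc : candidates) {
            if (visible.size() >= requestedRows) {
                break;
            }
            if (mayRead.test(doc)) {
                visible.add(doc);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        int fetch = rowsToFetch(50, 5); // 50 * 5 = 250 candidates
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < fetch; i++) {
            candidates.add(i);
        }
        // Hypothetical permission rule: only every fifth document is readable,
        // so 200 of the 250 candidates are thrown away.
        List<Integer> page = filterVisible(candidates, id -> id % 5 == 0, 50);
        System.out.println(fetch + " fetched, " + page.size() + " returned");
    }
}
```

With a factor of 5, a full page of 50 survives as long as no more than 
200 of the 250 candidates are rejected.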

In build version 8.5.0, which introduced Solr, the method 
https://github.com/alkacon/opencms-core/blob/build_8_5_0/src/org/opencms/search/solr/CmsSolrIndex.java#L393 
offers the optional ignoreMaxRows parameter. So even though the 
SolrSelectHandler does not call this method by default, you can query 
OpenCms to get exactly the requested number of documents.

However, since version 8.5.1 the SolrSelectHandler calls the search 
method with ignoreMaxRows=true by default 
(https://github.com/alkacon/opencms-core/blob/branch_9_0_x/src/org/opencms/main/OpenCmsSolrHandler.java#L155). 
This is done on the assumption that the developer knows what they are 
doing and will not raise the rows parameter to a nearly unlimited size. 
The permission check still costs a lot of time.
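
If you build the query string yourself, one way to stay on the safe 
side is to cap the rows value before it goes into the query. The 
following is just a sketch in plain Java: the helper class, the method 
names and the bound of 500 are hypothetical and not part of OpenCms.

```java
// Illustrative helper; names and the upper bound are assumptions, not OpenCms API.
class RowsClampSketch {

    // Hypothetical upper bound; choose one that fits your permission-check budget.
    static final int MAX_ROWS = 500;

    // Keeps the requested row count within [0, MAX_ROWS].
    static int clampRows(int requested) {
        if (requested < 0) {
            return 0;
        }
        return Math.min(requested, MAX_ROWS);
    }

    // Builds a query string in the same style as the code in the original mail.
    static String buildQuery(String typeFilter, int requestedRows) {
        return "fq=type:(" + typeFilter + ")&rows=" + clampRows(requestedRows);
    }
}
```

Calling RowsClampSketch.buildQuery("typeAlpha OR typeBeta", 999999) 
then yields a query with rows=500 instead of rows=999999.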

I hope this answers your question.
-Rüdiger

On 17.09.2013 16:01, Le Nouveau wrote:
> Hi everyone,
>
> I'm having a little trouble setting the number of results with Solr.
> Every time I specify "rows=XXX" (XXX can be any number), my parsed
> query shows "rows=250" according to the logs.
> If I don't specify the number of rows, the logs show "rows=50".
>
> Here is part of my code :
>          // Request to build
>          StringBuilder query = new StringBuilder();
>
>          // Defining requested types
>          query.append("fq=type:(typeAlpha OR typeBeta)");
>
>          // Defining rows limit
>          query.append("&rows=999999");
>
>          // Defining parent folder
>          query.append("&fq=parent-folders:("
>              + getCmsObject().getRequestContext().getSiteRoot()
>              + area.getPath() + ".content/)");
>
>          // Let's search
>          CmsSearchManager searchManager = OpenCms.getSearchManager();
>          CmsSolrResultList results = null;
>          results = searchManager.getIndexSolr("Solr Online")
>              .search(cms, query.toString());
>
> Here is what I get in the logs:
>          [Solr Online] webapp=null path=/select
> params={q=*:*&fl=*,score&qt=edismax&rows=250&fq=type:(typeAlpha OR
> typeBeta)&fq=parent-folders:(/sites/default/bla/.content/)&start=0}
> hits=29 status=0 QTime=1
>
>
> For the moment I know I'm under the limit, but what would happen once I
> get 250+ results? Would the limit grow on its own? (would be fun ^^)
>
> Thanks for help !
> Regards,
> Le Nouveau

-- 
Rüdiger Kurz

-------------------

Alkacon Software GmbH - The OpenCms Experts
Rüdiger Kurz
An der Wachsfabrik 13
50996 Koeln, DE


