[opencms-dev] CmsCollector performance

Sebastian Himberger sebastian.himberger at gmx.de
Wed Mar 5 20:48:15 CET 2008


Hi again,

Haven't I just completely ignored your simple question and given a possibly
useless explanation?! :D To answer it directly: no, OpenCms will not
always hit the database if you use FlexCache. It will serve the output
from the cache.
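
To make the idea concrete, here is a minimal sketch of an output cache in
plain Java. It is not the FlexCache API; the class OutputCacheSketch and its
methods are hypothetical names, and the renderer only stands in for a page
element whose rendering would normally run a collector query.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative output cache; a hypothetical class, not the OpenCms FlexCache API. */
class OutputCacheSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> renderer;
    private int rendererCalls = 0;

    OutputCacheSketch(Function<String, String> renderer) {
        this.renderer = renderer;
    }

    /** Returns the cached output for the key and renders only on a cache miss. */
    String get(String key) {
        String cached = cache.get(key);
        if (cached == null) {
            rendererCalls++;                 // only a miss reaches the expensive path
            cached = renderer.apply(key);
            cache.put(key, cached);
        }
        return cached;
    }

    int rendererCalls() {
        return rendererCalls;
    }

    public static void main(String[] args) {
        // The lambda stands in for rendering that would query the database.
        OutputCacheSketch flex =
            new OutputCacheSketch(uri -> "<ul>articles for " + uri + "</ul>");
        flex.get("/Categories/index.html");
        flex.get("/Categories/index.html");  // served from the cache
        System.out.println("renderer calls: " + flex.rendererCalls());
    }
}
```

The second request never reaches the renderer, which is why a cached page
element does not trigger the query again.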

best regards,
sebastian




Sebastian Himberger schrieb:
> Hi,
> 
> Flex is more about caching views (page elements, output) than about
> caching result sets. They sit more or less on top of each other. There's
> a dedicated db cache for queries.
> 
> I hope this helps.
> 
> sebastian
> 
> 
> marcio.camurati schrieb:
>> Oh, no problem. The source that you sent gives me the big picture of
>> what is needed. Looking at it, I think it really helps performance,
>> because with it we fetch only what is actually needed for display instead
>> of fetching everything and filtering afterwards.
>>
>> My other question is: with this new custom query, will OpenCms use the
>> FlexCache to cache the query? Or will every request go directly to the
>> database?
>>
>> Do you know?
>>
>> Best regards,
>> Marcio Camurati
>>
>>
>>
>> Shi Yusen wrote:
>>> Sorry for not opening up the performance module. We have to make a living.
>>>
>>> Anyway, here are the steps to build your own module:
>>> 1. Add some new filters in CmsResourceFilter. For example, I added two
>>> new filter methods: addTopLatest(int top) and addPagedLatest(int
>>> startRow, int rowsInPage).
>>>
>>> 2. Add new filter branches in
>>> org.opencms.db.CmsDriverManager.readResources(...)
>>>
>>> 3. Add new functions to implement the new filter branches in
>>> org.opencms.db.oracle.CmsVfsDriver
>>>
>>> 4. Add new query strings to /org/opencms/db/oracle/query.properties. For
>>> example, you can use
>>> # patterns for statements to select resources/folders (= selections
>>> # without content)
>>> # THINGS TO KNOW: don't select the project-ID attrib. of the structure
>>> # table per default!
>>> # There are cases, where the project-ID attrib. of the project-resources
>>> # tab. is used
>>> # as the project-ID!
>>> C_ORACLE_RESOURCES_SELECT_ATTRIBS_LEVEL1=\
>>>     STRUCTURE_ID,\
>>> 	RESOURCE_ID,\
>>> 	RESOURCE_PATH,\
>>> 	STRUCTURE_STATE,\
>>> 	DATE_RELEASED,\
>>> 	DATE_EXPIRED,\
>>> 	STRUCTURE_VERSION,\
>>> 	RESOURCE_ID_2,\
>>> 	RESOURCE_TYPE,\
>>> 	RESOURCE_FLAGS,\
>>> 	RESOURCE_STATE,\
>>> 	DATE_CREATED,\
>>> 	DATE_LASTMODIFIED,\
>>> 	USER_CREATED,\
>>> 	USER_LASTMODIFIED,\
>>> 	LOCKED_IN_PROJECT,\
>>> 	RESOURCE_SIZE,\
>>> 	DATE_CONTENT,\
>>> 	SIBLING_COUNT,\
>>> 	RESOURCE_VERSION
>>>
>>> # patterns for statements to select resources/folders (= selections
>>> # without content)
>>> # THINGS TO KNOW: don't select the project-ID attrib. of the structure
>>> # table per default!
>>> # There are cases, where the project-ID attrib. of the project-resources
>>> # tab. is used
>>> # as the project-ID!
>>> C_ORACLE_RESOURCES_SELECT_ATTRIBS_LEVEL2=\
>>>     CMS_${PROJECT}_STRUCTURE.STRUCTURE_ID AS STRUCTURE_ID,\
>>> 	CMS_${PROJECT}_STRUCTURE.RESOURCE_ID AS RESOURCE_ID,\
>>> 	CMS_${PROJECT}_STRUCTURE.RESOURCE_PATH AS RESOURCE_PATH,\
>>> 	CMS_${PROJECT}_STRUCTURE.STRUCTURE_STATE AS STRUCTURE_STATE,\
>>> 	CMS_${PROJECT}_STRUCTURE.DATE_RELEASED AS DATE_RELEASED,\
>>> 	CMS_${PROJECT}_STRUCTURE.DATE_EXPIRED AS DATE_EXPIRED,\
>>> 	CMS_${PROJECT}_STRUCTURE.STRUCTURE_VERSION AS STRUCTURE_VERSION,\
>>> 	CMS_${PROJECT}_RESOURCES.RESOURCE_ID AS RESOURCE_ID_2,\
>>> 	CMS_${PROJECT}_RESOURCES.RESOURCE_TYPE AS RESOURCE_TYPE,\
>>> 	CMS_${PROJECT}_RESOURCES.RESOURCE_FLAGS AS RESOURCE_FLAGS,\
>>> 	CMS_${PROJECT}_RESOURCES.RESOURCE_STATE AS RESOURCE_STATE,\
>>> 	CMS_${PROJECT}_RESOURCES.DATE_CREATED AS DATE_CREATED,\
>>> 	CMS_${PROJECT}_RESOURCES.DATE_LASTMODIFIED AS DATE_LASTMODIFIED,\
>>> 	CMS_${PROJECT}_RESOURCES.USER_CREATED AS USER_CREATED,\
>>> 	CMS_${PROJECT}_RESOURCES.USER_LASTMODIFIED AS USER_LASTMODIFIED,\
>>> 	CMS_${PROJECT}_RESOURCES.PROJECT_LASTMODIFIED AS LOCKED_IN_PROJECT,\
>>> 	CMS_${PROJECT}_RESOURCES.RESOURCE_SIZE AS RESOURCE_SIZE,\
>>> 	CMS_${PROJECT}_RESOURCES.DATE_CONTENT AS DATE_CONTENT,\
>>> 	CMS_${PROJECT}_RESOURCES.SIBLING_COUNT AS SIBLING_COUNT,\
>>> 	CMS_${PROJECT}_RESOURCES.RESOURCE_VERSION AS RESOURCE_VERSION
>>>
>>> #
>>> # General subtree selection statement
>>> #
>>> C_ORACLE_RESOURCES_READ_TREE_PAGED=\
>>> SELECT \
>>>     ${C_ORACLE_RESOURCES_SELECT_ATTRIBS_LEVEL1} \
>>> FROM (\
>>>     SELECT \
>>>         ${C_ORACLE_RESOURCES_SELECT_ATTRIBS_LEVEL1},\
>>>         ROWNUM AS ROW_NUMBER \
>>>     FROM (\
>>>         SELECT \
>>>             ${C_ORACLE_RESOURCES_SELECT_ATTRIBS_LEVEL2},\
>>>             CMS_${PROJECT}_RESOURCES.PROJECT_LASTMODIFIED \
>>>         FROM \
>>> 	        ${C_RESOURCES_SELECT_TABLES} \
>>>         WHERE \
>>> 	        ${C_JOIN_RESOURCE_STRUCTURE}
>>>
>>> #
>>> # Resources order by DATE_LASTMODIFIED AND ROWNUM
>>> #
>>> C_ORACLE_RESOURCES_PAGED_ORDER_BY_DATELASTMODIFIED=\
>>> 	        ORDER BY CMS_${PROJECT}_RESOURCES.DATE_LASTMODIFIED DESC\
>>> 	    )\
>>> 	) \
>>> WHERE ROW_NUMBER >=? AND ROW_NUMBER <?
>>>
>>> to get paged latest resources.
>>>
>>> Use
>>> #
>>> # General subtree selection statement
>>> #
>>> C_ORACLE_RESOURCES_READ_TREE=\
>>> SELECT \
>>>     ${C_ORACLE_RESOURCES_SELECT_ATTRIBS_LEVEL1} \
>>> FROM (\
>>>     SELECT \
>>>         ${C_ORACLE_RESOURCES_SELECT_ATTRIBS_LEVEL2},\
>>>         CMS_${PROJECT}_RESOURCES.PROJECT_LASTMODIFIED \
>>>     FROM \
>>> 	    ${C_RESOURCES_SELECT_TABLES} \
>>>     WHERE \
>>> 	    ${C_JOIN_RESOURCE_STRUCTURE}
>>>
>>> #
>>> # Resources order by DATE_LASTMODIFIED
>>> #
>>> C_ORACLE_RESOURCES_ORDER_BY_DATELASTMODIFIED=\
>>> 	    ORDER BY CMS_${PROJECT}_RESOURCES.DATE_LASTMODIFIED DESC\
>>> 	) \
>>> WHERE ROWNUM<=?
>>>
>>> to get top latest resources.
>>>
>>> The trick is to filter the resources in SQL rather than in Java.
>>> Too simple, right?
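
The paged query above ends in "WHERE ROW_NUMBER >=? AND ROW_NUMBER <?", so
it needs two bind values. A minimal sketch of how they could be derived from
a page index and page size follows. The class and method names are
hypothetical and not part of OpenCms, and the sketch assumes a 1-based page
index, which the thread does not state.

```java
/** Hypothetical helper for the two bind values; not part of OpenCms. */
class PagedWindowSketch {

    /** Inclusive lower bound for "ROW_NUMBER >= ?", assuming a 1-based pageIndex. */
    static int lowerBound(int pageIndex, int pageSize) {
        if (pageIndex < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageIndex and pageSize must be >= 1");
        }
        // Oracle's ROWNUM starts at 1, so the first page starts at row 1.
        return (pageIndex - 1) * pageSize + 1;
    }

    /** Exclusive upper bound for "ROW_NUMBER < ?". */
    static int upperBound(int pageIndex, int pageSize) {
        return lowerBound(pageIndex, pageSize) + pageSize;
    }

    public static void main(String[] args) {
        int pageSize = 20;
        for (int pageIndex = 1; pageIndex <= 3; pageIndex++) {
            System.out.println("page " + pageIndex
                + ": ROW_NUMBER >= " + lowerBound(pageIndex, pageSize)
                + " AND ROW_NUMBER < " + upperBound(pageIndex, pageSize));
        }
    }
}
```

With these bounds the database returns at most one page of rows, so only
that many resource objects have to be created.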
>>>
>>> Good luck,
>>>
>>> Shi Yusen/Beijing Langhua Ltd.
>>>
>>>
>>> On Wed, 2008-03-05 at 05:28 -0800, marcio.camurati wrote:
>>>> Shi Yusen,
>>>>
>>>> The speed is really very nice!
>>>>
>>>> Can you give more information about this project? Do you use only simple
>>>> collectors like allInFolderDateReleasedDesc or
>>>> allInSubTreeDateReleasedDesc
>>>> to get this information from the OpenCms resources?
>>>>
>>>> At the end you said not to worry about the SQL problem. Does this problem
>>>> really exist if we use the OpenCms core as is? Is it really necessary to
>>>> modify the SQL operations to resolve the performance problem that
>>>> Martin posted about?
>>>>
>>>> Best regards,
>>>> Marcio Camurati
>>>>
>>>>
>>>> Shi Yusen wrote:
>>>>> Here is a website we have almost completed, with more than 50k pages now
>>>>> and about 15k-20k more annually. You'll find it's quite fast.
>>>>> http://www2.scnjw.com/scnjw/index.html
>>>>>
>>>>> CentOS + Squid + Apache + Tomcat + OpenCms 7.0.3 + Oracle.
>>>>>
>>>>> Don't worry about OpenCms performance. For the SQL problem, you can
>>>>> write your own performance module to improve the SQL operations. It's
>>>>> not difficult.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Shi Yusen/Beijing Langhua Ltd.
>>>>>
>>>>>
>>>>> On Wed, 2008-03-05 at 03:47 -0800, marcio.camurati wrote:
>>>>>> Hi Martin,
>>>>>>
>>>>>> I read your post and recognize your scenario in our upcoming project,
>>>>>> which is still at an early stage. Performance is one of our concerns
>>>>>> about using OpenCms to manage the content of the site. Do you have any
>>>>>> good news about your performance problem, or does it continue?
>>>>>>
>>>>>> It would be great if you could say more about it, or about the solution
>>>>>> you found for it.
>>>>>>
>>>>>> Best regards,
>>>>>> Marcio Camurati
>>>>>>
>>>>>>
>>>>>> Martin Bednář wrote:
>>>>>>> I have problems with very poor performance of collectors, especially
>>>>>>> with allInFolderDateReleasedDesc and allInSubTreeDateReleasedDesc. I
>>>>>>> have a site with more than 10000 articles categorized in folders, so I
>>>>>>> have this structure:
>>>>>>>
>>>>>>> /Categories
>>>>>>> /Categories/Cat1
>>>>>>> /Categories/Cat2
>>>>>>> /Categories/CatX
>>>>>>> ...
>>>>>>>
>>>>>>> In Cat1 ... CatX I have a page which shows the articles of the whole
>>>>>>> category (paged by 20 articles).
>>>>>>> On the homepage I have the 20 newest articles from all categories.
>>>>>>> I use something like:
>>>>>>> <cms:contentload
>>>>>>>    collector="allInSubTreeDateReleasedDesc"
>>>>>>>    param="/Categories/|magArticle|20"
>>>>>>>    editable="true">
>>>>>>>
>>>>>>> and in Cat1, respectively:
>>>>>>> <cms:contentload
>>>>>>>    collector="allInSubTreeDateReleasedDesc"
>>>>>>>    param="/Categories/Cat1/|magArticle"
>>>>>>>    pageSize="20" pageIndex="1" editable="true">
>>>>>>>
>>>>>>> Performance is very poor. I looked at the source code and saw that the
>>>>>>> CMS works (for my homepage, for example) in this way:
>>>>>>> 1. Load the /Categories resource from the DB and create a CmsResource
>>>>>>> 2. Load all 10000 resources under /Categories from the DB and create
>>>>>>>    CmsResource objects for them
>>>>>>> 3. Sort all 10000 CmsResources by release date
>>>>>>> 4. Throw away 9980 unneeded objects!!!
>>>>>>> 5. Return 20 CmsResources
>>>>>>>
>>>>>>> It's really crazy.
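
A minimal sketch of the load-everything-then-sort pattern described above,
in plain Java. The Resource class and the method newestBySortingAll are
hypothetical stand-ins, not OpenCms classes, and the release dates are made
up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class CollectorCostSketch {

    /** Simplified stand-in for a resource; not the OpenCms CmsResource class. */
    static final class Resource {
        final String path;
        final long dateReleased;

        Resource(String path, long dateReleased) {
            this.path = path;
            this.dateReleased = dateReleased;
        }
    }

    /** Mirrors the described behaviour: sort every object, then keep the first n. */
    static List<Resource> newestBySortingAll(List<Resource> all, int n) {
        List<Resource> sorted = new ArrayList<>(all);
        // Newest first, i.e. descending by release date.
        sorted.sort((a, b) -> Long.compare(b.dateReleased, a.dateReleased));
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    public static void main(String[] args) {
        List<Resource> all = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            // Every one of these objects is created even though most are dropped.
            all.add(new Resource("/Categories/Cat1/article" + i + ".html", i));
        }
        List<Resource> newest = newestBySortingAll(all, 20);
        System.out.println("kept " + newest.size() + " of " + all.size()
            + ", discarded " + (all.size() - newest.size()));
    }
}
```

The work grows with the size of the whole subtree rather than with the page
size, which matches the behaviour reported here.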
>>>>>>>
>>>>>>> Is there a way to optimize it?
>>>>>>>
>>>>>>> Why is the data not sorted on the SQL server, with only 20 items
>>>>>>> returned in the recordset?
>>>>>>>
>>>>>>> I think it's a real performance problem, a waste of CPU and memory.
>>>>>>> This operation takes about 2,3 min on my server (CPU 2x quad-core
>>>>>>> 2.4 GHz, 4 GB RAM, 8x HDD on a 3ware card in RAID 6); on my old server
>>>>>>> with an AMD64 Opteron and 4 HDDs in software RAID it takes about
>>>>>>> 14 minutes!
>>>>>>>
>>>>>>> Martin
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> This mail is sent to you from the opencms-dev mailing list
>>>>>>> To change your list options, or to unsubscribe from the list, please
>>>>>> visit
>>>>>>> http://lists.opencms.org/mailman/listinfo/opencms-dev
>>>>>>>
>>>>>> -- 
>>>>>> View this message in context:
>>>>>>
>>>> http://www.nabble.com/CmsCollector-performance-tp14931100p15848425.html
>>>>>> Sent from the OpenCMS - Dev mailing list archive at Nabble.com.
>>>>>>
>>>>>>
>>>>>
>>>> -- 
>>>> View this message in context:
>>>> http://www.nabble.com/CmsCollector-performance-tp14931100p15850039.html
>>>> Sent from the OpenCMS - Dev mailing list archive at Nabble.com.
>>>>
>>>>
>>>
> 
> 



