[opencms-dev] opencms6 - postgresql8 and performance

Andreas Haumer andreas at xss.co.at
Thu Sep 22 17:00:40 CEST 2005


Hi Gerard,

Lambert Gerard schrieb:
> Hello,
> Sorry for the mix in the thread.
> I send a updated version of the script.

Thanks.
I can confirm that with your new script a fresh installation of
OpenCMS with PostgreSQL 8.0.3 works fine and without errors.
I also did import some resources into OpenCMS, work in online
and offline mode and found no obvious error.

> These script is only based on the mysql script and not on a analysis of explain in postgres.

I did some analysis of the queries OpenCMS does with the
database and I think there is at least one index we should add.

It's a query which is executed really often when OpenCMS answers a
request. OpenCMS tries to load data from table CMS_ONLINE_STRUCTURE
joined with table CMS_ONLINE_RESOURCES and a WHERE clause looking
for a specific value in column CMS_ONLINE_STRUCTURE.RESOURCE_PATH

Look at the following example:

opencms=# explain SELECT CMS_ONLINE_STRUCTURE.STRUCTURE_ID,
                          CMS_ONLINE_STRUCTURE.RESOURCE_ID,
                          CMS_ONLINE_STRUCTURE.RESOURCE_PATH,
                          CMS_ONLINE_STRUCTURE.STRUCTURE_STATE,
                          CMS_ONLINE_STRUCTURE.DATE_RELEASED,
                          CMS_ONLINE_STRUCTURE.DATE_EXPIRED,
                          CMS_ONLINE_RESOURCES.RESOURCE_ID,
                          CMS_ONLINE_RESOURCES.RESOURCE_TYPE,
                          CMS_ONLINE_RESOURCES.RESOURCE_FLAGS,
                          CMS_ONLINE_RESOURCES.RESOURCE_STATE,
                          CMS_ONLINE_RESOURCES.DATE_CREATED,
                          CMS_ONLINE_RESOURCES.DATE_LASTMODIFIED,
                          CMS_ONLINE_RESOURCES.USER_CREATED,
                          CMS_ONLINE_RESOURCES.USER_LASTMODIFIED,
                          CMS_ONLINE_RESOURCES.PROJECT_LASTMODIFIED AS LOCKED_IN_PROJECT,
                          CMS_ONLINE_RESOURCES.RESOURCE_SIZE,
                          CMS_ONLINE_RESOURCES.SIBLING_COUNT,
                          CMS_ONLINE_RESOURCES.PROJECT_LASTMODIFIED
                     FROM CMS_ONLINE_STRUCTURE INNER JOIN CMS_ONLINE_RESOURCES ON CMS_ONLINE_STRUCTURE.RESOURCE_ID=CMS_ONLINE_RESOURCES.RESOURCE_ID
                    WHERE CMS_ONLINE_STRUCTURE.RESOURCE_PATH='/'
                 ORDER BY CMS_ONLINE_STRUCTURE.STRUCTURE_STATE ASC;
                                                      QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
  Sort  (cost=264.00..264.00 rows=1 width=322)
    Sort Key: cms_online_structure.structure_state
    ->  Nested Loop  (cost=0.00..263.99 rows=1 width=322)
          ->  Seq Scan on cms_online_structure  (cost=0.00..257.98 rows=1 width=166)
                Filter: (resource_path = '/'::text)
          ->  Index Scan using cms_online_resources_pkey on cms_online_resources  (cost=0.00..6.00 rows=1 width=156)
                Index Cond: (("outer".resource_id)::text = (cms_online_resources.resource_id)::text)
(7 Zeilen)

opencms=# select count(*) from cms_online_structure;
  count
-------
   3598
(1 Zeile)

As you can see, PostgreSQL does a sequential scan on table
CMS_ONLINE_STRUCTURE filtering on column RESOURCE_PATH.
In this example, there are 3598 rows in this table, which
makes the sequential scan quite costly, and the query is
executed a _lot_ (with different WHERE clauses of course)!

I'd suggest to add the following index:

opencms=# create index IDX_ONLINE_STRUCTURE_6 ON CMS_ONLINE_STRUCTURE (RESOURCE_PATH);

With this index, the query costs are reduced quite significant:

opencms=# explain SELECT CMS_ONLINE_STRUCTURE.STRUCTURE_ID,
                          CMS_ONLINE_STRUCTURE.RESOURCE_ID,
                          CMS_ONLINE_STRUCTURE.RESOURCE_PATH,
                          CMS_ONLINE_STRUCTURE.STRUCTURE_STATE,
                          CMS_ONLINE_STRUCTURE.DATE_RELEASED,
                          CMS_ONLINE_STRUCTURE.DATE_EXPIRED,
                          CMS_ONLINE_RESOURCES.RESOURCE_ID,
                          CMS_ONLINE_RESOURCES.RESOURCE_TYPE,
                          CMS_ONLINE_RESOURCES.RESOURCE_FLAGS,
                          CMS_ONLINE_RESOURCES.RESOURCE_STATE,
                          CMS_ONLINE_RESOURCES.DATE_CREATED,
                          CMS_ONLINE_RESOURCES.DATE_LASTMODIFIED,
                          CMS_ONLINE_RESOURCES.USER_CREATED,
                          CMS_ONLINE_RESOURCES.USER_LASTMODIFIED,
                          CMS_ONLINE_RESOURCES.PROJECT_LASTMODIFIED AS LOCKED_IN_PROJECT,
                          CMS_ONLINE_RESOURCES.RESOURCE_SIZE,
                          CMS_ONLINE_RESOURCES.SIBLING_COUNT,
                          CMS_ONLINE_RESOURCES.PROJECT_LASTMODIFIED
                     FROM CMS_ONLINE_STRUCTURE INNER JOIN CMS_ONLINE_RESOURCES ON CMS_ONLINE_STRUCTURE.RESOURCE_ID=CMS_ONLINE_RESOURCES.RESOURCE_ID
                    WHERE CMS_ONLINE_STRUCTURE.RESOURCE_PATH='/'
                 ORDER BY CMS_ONLINE_STRUCTURE.STRUCTURE_STATE ASC;
                                                      QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
  Sort  (cost=11.04..11.05 rows=1 width=322)
    Sort Key: cms_online_structure.structure_state
    ->  Nested Loop  (cost=0.00..11.03 rows=1 width=322)
          ->  Index Scan using idx_online_structure_6 on cms_online_structure  (cost=0.00..5.02 rows=1 width=166)
                Index Cond: (resource_path = '/'::text)
          ->  Index Scan using cms_online_resources_pkey on cms_online_resources  (cost=0.00..6.00 rows=1 width=156)
                Index Cond: (("outer".resource_id)::text = (cms_online_resources.resource_id)::text)
(7 Zeilen)

See?
IMHO this index would be a good thing to add.

Of course, the same is true for the "offline" tables, so I'd
suggest to add the following index too:

create index IDX_OFFLINE_STRUCTURE_6 ON CMS_OFFLINE_STRUCTURE (RESOURCE_PATH);

(btw: both indices are created if you use the Oracle script)

Comments?

- andreas

PS: I'll try if I can find another index which we should add.
(I want to learn more about OpenCMS data model, anyway... ;-)

-- 
Andreas Haumer                     | mailto:andreas at xss.co.at
*x Software + Systeme              | http://www.xss.co.at/
Karmarschgasse 51/2/20             | Tel: +43-1-6060114-0
A-1100 Vienna, Austria             | Fax: +43-1-6060114-71



More information about the opencms-dev mailing list