[opencms-dev] opencms6 - postgresql8 and performance
Andreas Haumer
andreas at xss.co.at
Thu Sep 22 17:00:40 CEST 2005
Hi Gerard,
Lambert Gerard schrieb:
> Hello,
> Sorry for the mix in the thread.
> I send a updated version of the script.
Thanks.
I can confirm that with your new script a fresh installation of
OpenCMS with PostgreSQL 8.0.3 works fine and without errors.
I also did import some resources into OpenCMS, work in online
and offline mode and found no obvious error.
> These script is only based on the mysql script and not on a analysis of explain in postgres.
I did some analysis of the queries OpenCMS does with the
database and I think there is at least one index we should add.
It's a query which is executed really often when OpenCMS answers a
request. OpenCMS tries to load data from table CMS_ONLINE_STRUCTURE
joined with table CMS_ONLINE_RESOURCES and a WHERE clause looking
for a specific value in column CMS_ONLINE_STRUCTURE.RESOURCE_PATH
Look at the following example:
opencms=# explain SELECT CMS_ONLINE_STRUCTURE.STRUCTURE_ID,
CMS_ONLINE_STRUCTURE.RESOURCE_ID,
CMS_ONLINE_STRUCTURE.RESOURCE_PATH,
CMS_ONLINE_STRUCTURE.STRUCTURE_STATE,
CMS_ONLINE_STRUCTURE.DATE_RELEASED,
CMS_ONLINE_STRUCTURE.DATE_EXPIRED,
CMS_ONLINE_RESOURCES.RESOURCE_ID,
CMS_ONLINE_RESOURCES.RESOURCE_TYPE,
CMS_ONLINE_RESOURCES.RESOURCE_FLAGS,
CMS_ONLINE_RESOURCES.RESOURCE_STATE,
CMS_ONLINE_RESOURCES.DATE_CREATED,
CMS_ONLINE_RESOURCES.DATE_LASTMODIFIED,
CMS_ONLINE_RESOURCES.USER_CREATED,
CMS_ONLINE_RESOURCES.USER_LASTMODIFIED,
CMS_ONLINE_RESOURCES.PROJECT_LASTMODIFIED AS LOCKED_IN_PROJECT,
CMS_ONLINE_RESOURCES.RESOURCE_SIZE,
CMS_ONLINE_RESOURCES.SIBLING_COUNT,
CMS_ONLINE_RESOURCES.PROJECT_LASTMODIFIED
FROM CMS_ONLINE_STRUCTURE INNER JOIN CMS_ONLINE_RESOURCES ON CMS_ONLINE_STRUCTURE.RESOURCE_ID=CMS_ONLINE_RESOURCES.RESOURCE_ID
WHERE CMS_ONLINE_STRUCTURE.RESOURCE_PATH='/'
ORDER BY CMS_ONLINE_STRUCTURE.STRUCTURE_STATE ASC;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Sort (cost=264.00..264.00 rows=1 width=322)
Sort Key: cms_online_structure.structure_state
-> Nested Loop (cost=0.00..263.99 rows=1 width=322)
-> Seq Scan on cms_online_structure (cost=0.00..257.98 rows=1 width=166)
Filter: (resource_path = '/'::text)
-> Index Scan using cms_online_resources_pkey on cms_online_resources (cost=0.00..6.00 rows=1 width=156)
Index Cond: (("outer".resource_id)::text = (cms_online_resources.resource_id)::text)
(7 Zeilen)
opencms=# select count(*) from cms_online_structure;
count
-------
3598
(1 Zeile)
As you can see, PostgreSQL does a sequential scan on table
CMS_ONLINE_STRUCTURE filtering on column RESOURCE_PATH.
In this example, there are 3598 rows in this table, which
makes the sequential scan quite costly, and the query is
executed a _lot_ (with different WHERE clauses of course)!
I'd suggest to add the following index:
opencms=# create index IDX_ONLINE_STRUCTURE_6 ON CMS_ONLINE_STRUCTURE (RESOURCE_PATH);
With this index, the query costs are reduced quite significant:
opencms=# explain SELECT CMS_ONLINE_STRUCTURE.STRUCTURE_ID,
CMS_ONLINE_STRUCTURE.RESOURCE_ID,
CMS_ONLINE_STRUCTURE.RESOURCE_PATH,
CMS_ONLINE_STRUCTURE.STRUCTURE_STATE,
CMS_ONLINE_STRUCTURE.DATE_RELEASED,
CMS_ONLINE_STRUCTURE.DATE_EXPIRED,
CMS_ONLINE_RESOURCES.RESOURCE_ID,
CMS_ONLINE_RESOURCES.RESOURCE_TYPE,
CMS_ONLINE_RESOURCES.RESOURCE_FLAGS,
CMS_ONLINE_RESOURCES.RESOURCE_STATE,
CMS_ONLINE_RESOURCES.DATE_CREATED,
CMS_ONLINE_RESOURCES.DATE_LASTMODIFIED,
CMS_ONLINE_RESOURCES.USER_CREATED,
CMS_ONLINE_RESOURCES.USER_LASTMODIFIED,
CMS_ONLINE_RESOURCES.PROJECT_LASTMODIFIED AS LOCKED_IN_PROJECT,
CMS_ONLINE_RESOURCES.RESOURCE_SIZE,
CMS_ONLINE_RESOURCES.SIBLING_COUNT,
CMS_ONLINE_RESOURCES.PROJECT_LASTMODIFIED
FROM CMS_ONLINE_STRUCTURE INNER JOIN CMS_ONLINE_RESOURCES ON CMS_ONLINE_STRUCTURE.RESOURCE_ID=CMS_ONLINE_RESOURCES.RESOURCE_ID
WHERE CMS_ONLINE_STRUCTURE.RESOURCE_PATH='/'
ORDER BY CMS_ONLINE_STRUCTURE.STRUCTURE_STATE ASC;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Sort (cost=11.04..11.05 rows=1 width=322)
Sort Key: cms_online_structure.structure_state
-> Nested Loop (cost=0.00..11.03 rows=1 width=322)
-> Index Scan using idx_online_structure_6 on cms_online_structure (cost=0.00..5.02 rows=1 width=166)
Index Cond: (resource_path = '/'::text)
-> Index Scan using cms_online_resources_pkey on cms_online_resources (cost=0.00..6.00 rows=1 width=156)
Index Cond: (("outer".resource_id)::text = (cms_online_resources.resource_id)::text)
(7 Zeilen)
See?
IMHO this index would be a good thing to add.
Of course, the same is true for the "offline" tables, so I'd
suggest to add the following index too:
create index IDX_OFFLINE_STRUCTURE_6 ON CMS_OFFLINE_STRUCTURE (RESOURCE_PATH);
(btw: both indices are created if you use the Oracle script)
Comments?
- andreas
PS: I'll try if I can find another index which we should add.
(I want to learn more about OpenCMS data model, anyway... ;-)
--
Andreas Haumer | mailto:andreas at xss.co.at
*x Software + Systeme | http://www.xss.co.at/
Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0
A-1100 Vienna, Austria | Fax: +43-1-6060114-71
More information about the opencms-dev
mailing list