[opencms-dev] Adding checksums to OpenCms 8

Christian Steinert christian_steinert at web.de
Wed Sep 30 13:26:30 CEST 2009


Generally this sounds like a very good idea, and I agree that it seems 
best to add this at the driver level, although higher layers should have 
some way of requesting checksum validity information (not necessarily 
the checksum values themselves, since these are arguably too much of an 
implementation detail and, for example, the checksum algorithm might 
change over time).
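
To make that distinction concrete, here is a minimal sketch of a
driver-level check that exposes only a validity status to higher layers.
All names (ChecksumSketch, ChecksumStatus, computeHash, verify) are
hypothetical and not existing OpenCms API, and SHA-256 is merely an
assumed algorithm choice. A null stored hash stands for a NULL
FILE_CONTENT_HASH column, i.e. "no checksum available".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch; none of these names exist in OpenCms.
class ChecksumSketch {

    /** What higher layers see: a status, never the raw checksum value. */
    enum ChecksumStatus { VALID, INVALID, NOT_AVAILABLE }

    /** Computed on write/update; SHA-256 is an assumption, not a requirement. */
    static byte[] computeHash(byte[] content) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(content);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandated for every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** Checked on read; storedHash == null models a NULL hash column. */
    static ChecksumStatus verify(byte[] content, byte[] storedHash) {
        if (storedHash == null) {
            return ChecksumStatus.NOT_AVAILABLE;
        }
        return MessageDigest.isEqual(storedHash, computeHash(content))
                ? ChecksumStatus.VALID
                : ChecksumStatus.INVALID;
    }

    public static void main(String[] args) {
        byte[] content = "example file content".getBytes(StandardCharsets.UTF_8);
        byte[] stored = computeHash(content);   // done at write time
        System.out.println(verify(content, stored));

        content[0] ^= 0x01;                     // simulate a flipped bit on disk
        System.out.println(verify(content, stored));
        System.out.println(verify(content, null));
    }
}
```

Because callers only ever see the status, the algorithm behind
computeHash could be swapped later without touching the higher layers.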

But would the checksums be restricted to file content, or are there also 
considerations to add checksums for properties and/or general file 
system structures? I find it hard to estimate the possible performance 
impact of checksumming this kind of information as well, so I don't know 
whether that is a good idea. Some OS filesystems are of course capable 
of checksumming all metadata too, but they have highly optimized 
data structures and I/O behavior, which might be impossible to achieve when 
sitting inside a Java VM + servlet container and on top of various 
different databases.
Nonetheless, I at least wanted to raise the point.

Best regards
Christian
> Hi List,
>
> I recently had a customer who stored about 10000 JPEGs inside OpenCms
> (with MySQL). Due to hard disk degradation in a RAID1-Array some of the
> data became invalid (slowly over time of course) resulting in corrupt
> images. Although backups were in place (with checksums to verify
> everything) the slow degradation made it extremely difficult to find the
> corrupt images. The only way was to read backups from various stages and
> compare checksums and last modification dates. I've read a lot about
> data integrity and since OpenCms stores all the binary data in the DB I
> think it might be worth it to add additional features to the database
> structure.
>
> I would suggest adding at least a field FILE_CONTENT_HASH to the
> CMS_CONTENTS table which is filled in during file writes and updates.
> The field could be NULLable indicating that no checksum is available.
> This would also allow checksum generation to be disabled in favor of
> write performance. Maybe we could implement a hook in the driver
> structure to perform validations on read (using a Java interface).
> Additional checks could be performed using a scheduled task or custom
> modules. Eventually it would be nice to have the checksum available in
> the CmsFile objects but I don't think this is a requirement for a first
> step. I don't know if this should also be applied to properties. More
> security is of course always good but I really would want to keep the
> changes to a minimum at first.
>
> What's your take on it?
>
> Best regards,
> Sebastian