[opencms-dev] speed of CmsExport / ZipOutputStream bug

Ruben Malchow ruben at khm.de
Wed Aug 9 19:07:19 CEST 2006



hello list,

i can't stop wondering: this time, i wonder how exporting 10 files of 
around 3mb each can take more than 15 minutes. this is a 3ghz p4 with 
2gb ram: no swapping going on, and - this is weird - not even a lot of 
disk activity, yet the system is at 100% ... what's taking so long?

i dug into this and did a little test: the ZipOutputStream is 
EXTREMELY inefficient when it's confronted with LARGE byte arrays.

reading from a file and writing to a ZipEntry in 1024-byte chunks took 
<1sec. by contrast, reading the entire ~3mb into a byte array took 60ms, 
and writing that array to the zip entry in a single call then took MORE 
THAN ONE HUNDRED SECONDS.

now, i don't use capital letters very often, but the ZipOutputStream 
seems to be incredibly inefficient. so basically, i propose to replace 
this code in CmsExport.java:

	getExportZipStream().write(file.getContents());

with this code:

	// write the contents in small chunks instead of one large write() call
	byte[] buff=file.getContents();
	int offset=0;
	int chunkSize=1024;
	while(offset<buff.length) {
		getExportZipStream().write(
			buff,
			offset,
			Math.min(chunkSize,(buff.length-offset))
		);
		offset+=chunkSize;
	}

to speed the export up by an incredible 10000 percent - especially if 
you have lots of large files. this is a serious bug in java, verified to 
exist in 1.5.0_07 but not in 1.4.2_08 on win32, and working around it in 
CmsExport wouldn't hurt very much. in an example on java.sun.com, they 
even used a bigger buffer (4096 bytes) ...
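
for anyone who wants to check their own jvm, here is a minimal 
standalone sketch of the kind of test described above. it is not the 
opencms code: the class name ZipWriteTiming, the helpers zip() and 
unzip(), the entry name "test.bin" and the random 3mb payload are all 
made up for illustration, and it zips into memory rather than to a file. 
it times one big write() against 1024- and 4096-byte chunks and checks 
that the data survives the round trip. on a jvm without the problem i'd 
expect the three timings to be in the same ballpark.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical test harness, not part of OpenCms.
public class ZipWriteTiming {

    // Zips data into a single in-memory entry.
    // chunkSize <= 0 means one write() call for the whole array.
    public static byte[] zip(byte[] data, int chunkSize) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(sink)) {
            zip.putNextEntry(new ZipEntry("test.bin"));
            if (chunkSize <= 0) {
                zip.write(data);
            } else {
                for (int offset = 0; offset < data.length; offset += chunkSize) {
                    zip.write(data, offset, Math.min(chunkSize, data.length - offset));
                }
            }
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Reads the first entry back so the result can be compared with the input.
    public static byte[] unzip(byte[] zipped) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipped))) {
            in.getNextEntry();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Assumption: ~3mb of random bytes stands in for the exported files.
        byte[] data = new byte[3 * 1024 * 1024];
        new Random(42).nextBytes(data);
        for (int chunkSize : new int[] {0, 1024, 4096}) {
            long start = System.nanoTime();
            byte[] zipped = zip(data, chunkSize);
            long millis = (System.nanoTime() - start) / 1000000L;
            boolean intact = Arrays.equals(data, unzip(zipped));
            System.out.println("chunkSize=" + chunkSize + ": " + millis
                    + " ms, intact=" + intact);
        }
    }
}
```

random bytes barely compress, so swap in a real file if you want numbers 
closer to an actual export.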


.rm




