[opencms-dev] Importing problem

Martin Kuba makub at ics.muni.cz
Thu Mar 25 12:00:02 CET 2004


David Currie wrote:
> The difference between the two sets of pages is simply that the body of 
> the page in /system/bodies/... has all the accented characters changed 
> to '?'.  In the generated .zip file, the characters are there, so I 
> don't think the exporting is the problem, it's an import problem.
> 
> I'm assuming that there is no native OpenCMS code that makes these 
> changes, what are the external libraries used during the import 
> process?  Could there be a codepage problem there?

Accented characters changed to '?' characters are a typical
sign of non-matching encodings. The '?' characters are
created by methods like String.getBytes(encoding) or
OutputStreamWriter.write() when characters cannot
be represented in the target encoding.
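
A minimal sketch of that replacement behaviour, not taken from OpenCms: the class name QuestionMarkDemo and the helper roundTrip are made up for illustration, and the sample text is an assumption. It uses only String.getBytes(Charset) and the matching String constructor from the standard library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {
    // Hypothetical helper: encode the text into bytes of the given charset,
    // then decode those bytes again with the same charset.
    // String.getBytes silently substitutes the charset's replacement byte
    // ('?' for ISO-8859-1) for every character it cannot represent.
    public static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+010D (c with caron) does not exist in ISO-8859-1,
        // U+00FD (y with acute) does.
        String sample = "\u010Desk\u00FD";

        String latin1 = roundTrip(sample, StandardCharsets.ISO_8859_1);
        String utf8 = roundTrip(sample, StandardCharsets.UTF_8);

        // Compare against the original instead of printing the text itself,
        // so the result does not depend on the console encoding.
        System.out.println("ISO-8859-1 round trip lossless: " + latin1.equals(sample));
        System.out.println("UTF-8 round trip lossless: " + utf8.equals(sample));
    }
}
```

Only the character missing from the target encoding is lost. The rest of the string survives, which matches the symptom of accented characters alone turning into '?'.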

I don't think that the Tomcat or JSDK version has anything
to do with it. If the exported files are properly encoded
in ISO-8859-1, the problem is most likely that they are imported
using a different encoding. Decoding them that way produces wrong
Unicode characters, and when these characters are stored in the
database or written to output, they cannot be represented in a
single-byte encoding and are replaced with '?'.
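
The chain described above can be sketched as follows. This is an illustration under stated assumptions, not the OpenCms import code: the class ImportMismatchDemo and the helper transcode are hypothetical, and UTF-8 is only assumed as the wrong import encoding because the thread does not say which one is actually used.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ImportMismatchDemo {
    // Hypothetical helper: decode the bytes as one charset (the "import"),
    // then encode the resulting string into another charset (the "store").
    public static byte[] transcode(byte[] input, Charset readAs, Charset writeAs) {
        String decoded = new String(input, readAs);
        return decoded.getBytes(writeAs);
    }

    public static void main(String[] args) {
        // "cafe" with e-acute (U+00E9), exported correctly as ISO-8859-1.
        byte[] exported = "caf\u00E9".getBytes(StandardCharsets.ISO_8859_1);

        // Assumed mismatch: the importer reads the bytes as UTF-8. The lone
        // byte 0xE9 is not valid UTF-8 and decodes to U+FFFD, which in turn
        // cannot be encoded in ISO-8859-1 and becomes '?'.
        byte[] wrong = transcode(exported, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        // Matching encodings on both sides keep the bytes unchanged.
        byte[] right = transcode(exported, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);

        System.out.println("mismatched import ends in '?': " + (wrong[wrong.length - 1] == '?'));
        System.out.println("matching import is unchanged: " + java.util.Arrays.equals(exported, right));
    }
}
```

If something like this is what happens, the place to look is wherever the import reads the file content into a String without naming the encoding explicitly.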

Martin
-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Supercomputing Center Brno             Martin Kuba
Institute of Computer Science    email: makub at ics.muni.cz
Masaryk University             http://www.ics.muni.cz/~makub/
Botanicka 68a, 60200 Brno, CZ     mobil: +420-603-533775
--------------------------------------------------------------