[opencms-dev] Jtidy bug (was: Re: CmsXmlException when pasting news from word to HtmlWidget)

Christian Steinert christian_steinert at web.de
Sun Jun 11 16:39:29 CEST 2006


Huang Zhibin wrote:
> Hi, list,
> Today I copied a news item from Word and pasted it into the Content
> (<xsd:element name="Content" type="OpenCmsHtml" /> ... <layout
> element="Content" widget="HtmlWidget"
> configuration="source,link,anchor,formatselect,imagegallery,downloadgallery,linkgallery,htmlgallery,tablegallery,height:400px"
> />), and ran into a problem. I do not know how to solve it. Can
> anyone give me some suggestions?
> Thanks for your kindness. (BTW: I included that news as an attachment
> for you to test.)
> Regards,
> Zhibin
>

Hi Zhibin (and hi Jonathan),

Try the JTidy version I emailed you. This is exactly the same error that
I had.


But please note the following:
I finally remembered something I *DID* change before recompiling the
JTidy library. There is some code that tries to convert special
characters from their binary representation into the corresponding HTML
entity (like "&ndash;"). I commented out this call. The bug was
probably there.


So my version of JTidy now
1. converts almost all HTML entities into their binary code (this seems
to be how tidy works),
2. does the HTML cleaning,
3. and then does NOT convert any special characters back into
their entity representation.

So anything like umlauts, Chinese characters, and so on is expressed
in binary form after cleaning, not as an HTML entity.
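
To make the three steps above concrete, here is a rough sketch in plain
Java. It does not use JTidy or reproduce its internals; the class and
method names (EntityPipelineSketch, decodeEntities, encodeNonAscii,
process) and the tiny entity table are made up for illustration, and the
"cleaning" step is only a placeholder. The reencode flag stands in for
the call I commented out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is NOT JTidy code.
public class EntityPipelineSketch {

    // Tiny sample table of named entities (assumption: real tables are much larger).
    private static final Map<String, String> NAMED = new HashMap<>();
    static {
        NAMED.put("ndash", "\u2013");
        NAMED.put("auml", "\u00e4");
        NAMED.put("ouml", "\u00f6");
        NAMED.put("uuml", "\u00fc");
    }

    // Matches decimal references like &#8211; and named references like &ndash;
    private static final Pattern ENTITY =
            Pattern.compile("&(#(\\d{1,6})|([A-Za-z]+));");

    // Step 1: turn entities into the characters they stand for.
    static String decodeEntities(String html) {
        Matcher m = ENTITY.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement;
            if (m.group(2) != null) {
                int codePoint = Integer.parseInt(m.group(2));
                replacement = new String(Character.toChars(codePoint));
            } else {
                String known = NAMED.get(m.group(3));
                // Unknown names are left untouched.
                replacement = (known != null) ? known : m.group();
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Step 3 (the part that is switched off in my build):
    // write every non-ASCII character as a decimal reference.
    static String encodeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp > 127) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    // Steps 1 to 3 together; reencode=false mimics skipping the final conversion.
    static String process(String html, boolean reencode) {
        String decoded = decodeEntities(html);
        String cleaned = decoded.trim(); // placeholder for the real HTML cleaning
        return reencode ? encodeNonAscii(cleaned) : cleaned;
    }

    public static void main(String[] args) {
        String input = "M&uuml;nchen &ndash; K&ouml;ln";
        System.out.println(process(input, false)); // characters stay in binary form
        System.out.println(process(input, true));  // characters become numeric entities
    }
}
```

With reencode set to false, the output contains the raw characters, so
whatever stores or serves the content afterwards has to use an encoding
that can represent them (UTF-8, for example).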


I thought this might be of interest.
Christian


