[opencms-dev] V 10.5.0 - Extended HTML Import

Gerd Schrick mail at schrick-design.de
Wed Feb 22 00:23:05 CET 2017


Hi Alex,

thanks for your reply.

Yes, I already read about these options, but I still don't know how to prepare the data to be imported; binary files are straightforward, but what about the "content" (container pages, content elements, ...)?

I assume that the mentioned "OpenCms native XML based import" is what is created/used by the Database Export/Import.
I looked into that, but generating this does not seem that easy to me (creating suitable UUIDs, correctly referencing content elements and formatters in container pages, link resolving, generating the manifest for all the properties, ...).
[BTW: is there a description of the format anywhere?]
Thinking about the setup (ANT/XSLT) needed to generate something like this, I believe it makes more sense to put my effort into developing a suitable importer (module) that can be shared and help others too.

I have not found anything about import experiences yet ... how do/did others migrate existing content to OpenCms? Is there some "easy" way that I simply can't see?


Meanwhile I'll go with XmlPages and the Extended HTML Import.
Studying the source code of the import, I found out that it's possible and easy to set arbitrary properties by using meta tags in the HTML files - such a cool feature should be more obvious! :-)
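
To illustrate the idea, here is a minimal Python sketch of how meta tags in an HTML file can be read as name/value pairs that an importer could then write as properties. It is not the OpenCms implementation (that lives in the Java class CmsHtmlImport); the names MetaPropertyCollector and collect_meta_properties and the sample property names are made up for this example.

```python
from html.parser import HTMLParser


class MetaPropertyCollector(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        content = attr_map.get("content")
        # Ignore meta tags without a name (e.g. charset or http-equiv).
        if name and content is not None:
            self.properties[name] = content


def collect_meta_properties(html_text):
    """Return a dict of property name -> value found in the meta tags."""
    collector = MetaPropertyCollector()
    collector.feed(html_text)
    return collector.properties


# Hypothetical input file; the property names are illustrative only.
sample = """<html><head>
<title>Sample</title>
<meta charset="utf-8">
<meta name="Description" content="A legal document">
<meta name="doc.number" content="4711">
</head><body><p>Text</p></body></html>"""

properties = collect_meta_properties(sample)
for key, value in properties.items():
    print(key, "=", value)
```

Which property names the real import accepts, and how it maps them, has to be checked in the OpenCms source.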
What I still have to find a solution for is how to get the content of the other locales into the XmlPage (SingleTree multi-language site). I think I'll try to customize the import with an overwrite option "update" that works non-destructively: it just updates existing elements/properties and adds missing ones, so with a subsequent import the other locales' content could simply be added too.
(If I'm successful with this, I'll be happy to share it in case of interest.)
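
The intended "update" behaviour could be sketched roughly like this in Python. This is only a model of the merge rule (update what exists, add what is missing, never delete); the page structure as a dict with "properties" and per-locale "elements", and the function name merge_non_destructive, are assumptions for the example and not OpenCms data structures.

```python
def merge_non_destructive(existing, incoming):
    """Merge an incoming import into an existing page without deleting anything.

    Both arguments are dicts of the (assumed) form:
        {"properties": {name: value},
         "elements": {locale: {element_name: html}}}
    """
    # Copy the existing data so the original page object is left untouched.
    merged = {
        "properties": dict(existing.get("properties", {})),
        "elements": {
            locale: dict(elements)
            for locale, elements in existing.get("elements", {}).items()
        },
    }
    # Existing properties are updated, new ones are added.
    merged["properties"].update(incoming.get("properties", {}))
    # Same rule per locale: update existing elements, add missing ones.
    for locale, elements in incoming.get("elements", {}).items():
        merged["elements"].setdefault(locale, {}).update(elements)
    return merged


# First import run created the German content ...
page = {
    "properties": {"Title": "Vertrag"},
    "elements": {"de": {"body": "<p>Deutscher Text</p>"}},
}
# ... a subsequent run adds the English content for the same page.
second_run = {
    "properties": {"Keywords": "contract"},
    "elements": {"en": {"body": "<p>English text</p>"}},
}

result = merge_non_destructive(page, second_run)
print(sorted(result["elements"]))
print(result["properties"])
```

With this rule, the import can be run once per locale against the same target folder, and each run only adds to or refreshes what is already there.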

Comment about XmlPages:
Hopefully OpenCms will support XmlPages forever (in addition to container pages; it's great to have the choice), as they're perfect for a use case like one of ours that is really "document-centric" [large sets (10,000+) of legal documents that simply are, will and shall (always) be just large chunks of HTML; page, content and properties can be found in one place / on one object, which is really intuitive when using the Explorer; the docs are provided/prepared externally, so editing is not central; "flexible page design" by editors, as is possible with container pages (which undoubtedly is a great feature), is an absolute no-go here].


Best regards,
Gerd

On 15 February 2017 08:18:21 CET, Alex Kandzior <alex at opencms.org> wrote:
>Gerd,
>
>regarding your question about how to import data in OpenCms.
>
>There are multiple ways to import data to OpenCms, e.g.:
>* OpenCms native XML based import
>* Local file synchronization
>* CMIS access to VFS
>* WebDav access to VFS
>* Network share access to VFS
>
>Of these, the network share access is certainly the most modern and
>powerful option.
>How to set this up is documented here:
>
>http://documentation.opencms.org/opencms-documentation/more-opencms-features/mounting-the-vfs/network-share-access/index.html
>
>Once you have this working, you can access the OpenCms VFS like it is
>part of the native file system of your server.
>You could then export your content from your old CMS, process it on the
>shell level, and just copy the transformed XML to OpenCms also via
>shell commands.
>
>Kind regards,
>Alex.
>
>-------------------
>Alexander Kandzior
>
>Alkacon Software - The OpenCms Experts
>http://www.alkacon.com - http://www.opencms.org
>
>
>
>> On 13.02.2017 at 17:55, Gerd Schrick <mail at schrick-design.de> wrote:
>> 
>> Hi Alex,
>> 
>> thanks a lot for your reply.
>> 
>> Yes, Lenya's content is in xml and the content I'd like to import as
>well.
>> I already did many batch conversions/cleanups of xml/html with ANT
>and its possibilities (with heavy use of XSLT transformations).
>> So this would be no "big deal" for me to somehow "redesign" and
>prepare the data to be imported in OpenCms.
>> 
>> But how can I then get the data into OpenCms?
>> I assume the only way is through some kind of import procedure that
>ensures the correct handling (e.g. assigning uuids, link processing,
>add system related properties, storing in the right places etc.) - at
>least that was how I did / have to do it in Lenya.
>> 
>> What other options are there in OpenCms to "import"?
>> Directly to the VFS?
>> What is the correct format that I have to prepare for that?
>> Is there some documentation on this?
>> 
>> Sorry to bother you with all these questions, but I have not found
>anything helpful on the web yet - although I assume that this topic
>is/was relevant for most of the customers.
>> Maybe it's just so easy and obvious that I've overlooked it already
>;-)
>> 
>> 
>> Thanks for your help!
>> 
>> Gerd
>> 
>> On 13 February 2017 08:51:24 CET, Alex Kandzior
><alex at opencms.org> wrote:
>> Gerd,
>> 
>> the extended HTML import function you are trying to use is - in its
>current implementation - compatible only with XmlPages.
>> 
>> The XmlPage technology has been replaced by the ContainerPage
>technology between OpenCms 7 and 8.
>> 
>> XmlPages were using large chunks of (quite unstructured) HTML blobs
>to create a page. These blobs were stored inside of the page itself.
>> 
>> Container pages today contain just pointers to a set of content
>elements. Each content element is a structured XML content stored in a
>separate file.
>> 
>> For backward compatibility, XmlPages are still supported in OpenCms
>today BUT the templates and demos all use container pages.
>> You need to enable support for XmlPages first by adding the
>configuration for this resource type in opencms-vfs.xml and also
>opencms-workplace.xml.
>> 
>> You would then need to create an XmlPage compatible template in order
>to see the imported XmlPages displayed.
>> 
>> 
>> Maybe another approach for content migration is more suitable:
>> 
>> I have no idea about how your content is structured in Lenya, but
>IIRC Lenya also was XML based with XSLT transformation. 
>> 
>> Maybe you can try to transform your existing contents directly to
>OpenCms XML content types by XSLT?
>> 
>> Kind regards,
>> Alex.
>> 
>> -------------------
>> Alexander Kandzior
>> 
>> Alkacon Software - The OpenCms Experts
>> http://www.alkacon.com - http://www.opencms.org
>> 
>> 
>> 
>>> On 12.02.2017 at 23:48, Gerd Schrick <mail at schrick-design.de> wrote:
>>> 
>>> Dear List,
>>> 
>>> a short introduction:
>>> after about 10 years of intensive development and maintenance work for a
>large website + mobile and some smaller (still productive) sites with the
>Open Source CMS "Apache Lenya" (and Cocoon 2.1), I re-discovered OpenCms
>(used it about 13 yrs ago for some smaller sites) as a potential
>replacement for Lenya.
>>> I have already evaluated some others in theory and "quickly" tested some of
>them (Hippo amongst others).
>>> I really was impressed by what I read about OpenCms (partly
>unbelievable ;-) and about 2 weeks ago I finally installed it (Arch
>Linux, MariaDB, Java 8, Tomcat 8) ... and the more I play with it the
>more I love it :-)
>>> 
>>> My question or "problem" regards to the "Extended HTML Import":
>>> 
>>> I installed the official 10.5.0 release with the default settings
>including the Apollo Example.
>>> With help of the good documentation I've set up an additional site
>(my evaluation prototype) for three languages (de, en, fr; SingleTree)
>with just a simple Template (+ Model) with only one Container.
>>> This works very well.
>>> 
>>> To kind of "finalize" my prototype and show it to the customer I
>need to fill it with content (about 1,500 documents, each as HTML in 3
>languages + a PDF per doc).
>>> 
>>> With a small number of test pages (6 folders + 5 docs + 5 PDFs) I
>tried the Extended HTML Importer.
>>> After adding "head|Head Element,body|Body Element,foot|Foot Element"
>to the template's "template-elements" property (found this hint
>somewhere on the Internet) the import run was successful w/o errors.
>>> 
>>> BUT:
>>> 
>>> - the imported content/page is only shown when I use <cms:include
>element="body" editable="true"/> instead of <cms:container .../> (else
>I get an error) but this way the created (not imported) content is not
>shown (obviously)
>>> 
>>> - no edit option in the page editor available
>>> 
>>> - in EXPLORER I can't navigate to the imported data; when expanding
>the parent folder (subfolder in my prototype site) the arrow in front
>of the folder icon rotates to point downward but nothing happens;
>on hover (over the folder) a kind of tooltip shows an exception in red.
>In the OpenCms log I found:
>>> ERROR [din.server.DefaultErrorHandler:  58]
>>> java.lang.NullPointerException
>>>     at
>org.opencms.ui.components.CmsResourceIcon.getSmallTypeIconURI(CmsResourceIcon.java:393)
>>>     at
>org.opencms.ui.components.CmsResourceIcon.getIconInnerHTML(CmsResourceIcon.java:300)
>>> :
>>> and
>>> ERROR [encms.ui.CmsVaadinErrorHandler:  80]
>>> Invocation of method itemClick in
>org.opencms.ui.apps.CmsFileExplorer$9 failed.
>>> com.vaadin.event.ListenerMethod$MethodException: Invocation of
>method itemClick in org.opencms.ui.apps.CmsFileExplorer$9 failed.
>>>     at
>com.vaadin.event.ListenerMethod.receiveEvent(ListenerMethod.java:533)
>>> :
>>> 
>>> - in the SITEMAP I can navigate to the imported data (an icon
>with'?' is shown for the imported pages) and in "Resources" view I see
>all the folders, pages and PDFs in the correct structure, and can
>access/edit (props) the folders and PDFs BUT NOT the imported HTML
>pages; when accessing their PROPERTIES or INFO:
>>> the dialog shows: "java.lang.NullPointerException: No description
>available."
>>> and the log:
>>> ERROR [ org.opencms.gwt.CmsGwtService: 183] 
>>> java.lang.NullPointerException
>>>     at
>org.opencms.gwt.CmsVfsService.addPageInfo(CmsVfsService.java:345)
>>> :
>>> and
>>> ERROR [ org.opencms.gwt.CmsLogService:  66] Client LOG (Host
>192.168.178.41, Address 192.168.178.41, Ticket 1486935332797): null
>>> org.opencms.gwt.CmsVfsService.addPageInfo(CmsVfsService.java:345)
>>> :
>>> 
>>> For testing, I imported the 7.0.0 documentation module + the
>documentation for the Extended HTML Import and XMLContent, which shows
>the same behaviour in Explorer and Sitemap.
>>> 
>>> 
>>> 
>>> Have I missed something to add/configure?
>>> Is this import feature incompatible with version 10.5.0?
>>> 
>>> Stumbled over the class
>>>
>https://github.com/alkacon/opencms-core/blob/branch_10_5_x/src-modules/org/opencms/workplace/tools/database/CmsHtmlImport.java
>>> which I guess handles the import (is this correct?) ...
>>> there (line 1137) I noticed that a "CmsXmlPage" page is created ...
>>> and it seems to me as if there is no (more) support for the
>"xmlpage" resource type in 10.5.0, which causes the issue (I'm just
>guessing, as I didn't find a way to create an "XmlPage" in the system,
>only a ContainerPage with XmlContent).
>>> 
>>> 
>>> 
>>> Is there a way to get this working somehow? maybe some kind of
>"hack" afterwards in the DB (changing some values)?
>>> For the prototype it does not have to be perfect.
>>> Also thought about a custom build (already set up the environment as
>described and git-cloned the core :-) or to create an import module
>based on the Extended HTML Import code ...
>>> 
>>> 
>>> 
>>> Sorry for the long but hopefully explanatory text.
>>> 
>>> Thank you very much for your help on this!
>>> 
>>> Best regards,
>>> Gerd
>>> 
>>> _______________________________________________
>>> This mail is sent to you from the opencms-dev mailing list
>>> To change your list options, or to unsubscribe from the list, please
>visit
>>> http://lists.opencms.org/cgi-bin/mailman/listinfo/opencms-dev