[opencms-dev] OpenCMS / Google Sitemap generation script

Tim Howland thowland at gmail.com
Thu Oct 6 18:40:17 CEST 2005


Hey folks-

I wrote the following jsp page to generate XML sitemaps that Google can 
read (see http://www.google.com/webmasters/sitemaps/docs/en/faq.html ).

Basically, it recurses through your site tree and pulls out any jsp, 
html, or pdf files it finds, generating an entry in Google's format for 
each one so that Google can locate the file.
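
To make the output concrete, here is a rough standalone sketch of the per-file step, with no OpenCms dependency. The class and method names (SitemapEntry, isIndexable, urlEntry) are made up for illustration, and pinning the date format to UTC is an assumption of the sketch, not something the script below does:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper class; not part of OpenCms or of the script below.
class SitemapEntry {

    // True for the three file types the script picks up.
    // Like the script, this checks only the trailing characters of the path.
    static boolean isIndexable(String path) {
        return path.endsWith("html") || path.endsWith("jsp") || path.endsWith("pdf");
    }

    // Builds one <url> element in the same shape the script emits.
    static String urlEntry(String baseUrl, String path, long lastModifiedMillis,
                           String changeFreq, String priority) {
        // A fresh formatter per call avoids sharing a non-thread-safe instance.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: report dates in UTC

        StringBuilder sb = new StringBuilder();
        sb.append("<url>\n");
        sb.append("<loc>").append(baseUrl).append(path).append("</loc>\n");
        sb.append("<lastmod>").append(fmt.format(new Date(lastModifiedMillis))).append("</lastmod>\n");
        sb.append("<changefreq>").append(changeFreq).append("</changefreq>\n");
        sb.append("<priority>").append(priority).append("</priority>\n");
        sb.append("</url>\n");
        return sb.toString();
    }
}
```

Calling SitemapEntry.urlEntry("http://example.com", "/index.html", 0L, "weekly", "1") yields a single url element whose lastmod is 1970-01-01.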

It also looks for three custom properties. They can be set at the 
folder level, in which case they cascade to all files and subfolders below.

    * sitemap_hidden (default value: false): set this to "true" to hide
      something from this script. I use it to hide client extranets and
      include files.
    * sitemap_change_frequency (default value: weekly): controls how
      often Google should check for an update of this element. Legal
      values are: always, hourly, daily, weekly, monthly, yearly, never.
    * sitemap_priority (default value: 1): the relative priority of this
      element. Google's sitemap format expects a number from 0.0 to 1.0.
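
The script below writes these property values into the XML as-is, so a typo in a property ends up in the sitemap. As a hedged sketch, the values could be normalized first along these lines. The class and method names (SitemapProperties, isHidden, changeFrequency, priority) are hypothetical, the fallbacks are the defaults listed above, and the 0.0 to 1.0 range check follows the sitemap format rather than anything in the script:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class; not part of OpenCms or of the script below.
class SitemapProperties {

    static final List<String> CHANGE_FREQUENCIES = Arrays.asList(
        "always", "hourly", "daily", "weekly", "monthly", "yearly", "never");

    // Mirrors the script: unset means visible, and any value other than
    // "false" hides the resource.
    static boolean isHidden(String value) {
        return value != null && !"false".equals(value);
    }

    // Falls back to "weekly" when the value is unset or not a legal frequency.
    static String changeFrequency(String value) {
        if (value == null) {
            return "weekly";
        }
        String v = value.trim().toLowerCase(Locale.ROOT);
        return CHANGE_FREQUENCIES.contains(v) ? v : "weekly";
    }

    // Falls back to "1" when the value is unset, not a number,
    // or outside the 0.0 to 1.0 range.
    static String priority(String value) {
        if (value == null) {
            return "1";
        }
        String v = value.trim();
        try {
            double p = Double.parseDouble(v);
            return (p >= 0.0 && p <= 1.0) ? v : "1";
        } catch (NumberFormatException e) {
            return "1";
        }
    }
}
```

Falling back to the default instead of failing keeps one bad property from breaking the whole sitemap; whether silently correcting values is what you want is a judgment call.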


Here's the script- improvements and comments are welcome, of course:

<%@ page session="false" %>
<%@ page import="java.util.*,org.opencms.jsp.*,org.opencms.file.*,java.text.DateFormat, java.text.SimpleDateFormat, org.opencms.main.*" %>
<%@ taglib prefix="cms" uri="http://www.opencms.org/taglib/cms" %>
<%@ taglib prefix="c" uri="http://java.sun.com/jstl/core" %>
<?xml version="1.0" encoding="UTF-8" ?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<%!
protected String BASE_URL="";
public static SimpleDateFormat ISO8601FORMAT = new SimpleDateFormat("yyyy-MM-dd");
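// Emits a <url> entry for each visible jsp/html/pdf file in 'path', then recurses into its subfolders.
private String recurseTree(CmsObject cmso,CmsJspActionElement jsp, String path)  {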
	StringBuffer sb = new StringBuffer();
	try {
		ArrayList files = (ArrayList) cmso.getFilesInFolder(path);
		Iterator i = files.iterator();
		while (i.hasNext()) {
			CmsFile f = (CmsFile) i.next();	
			String thispath = jsp.link(cmso.getSitePath(f));
			CmsProperty secret = cmso.readPropertyObject(f,"sitemap_hidden",true);
			CmsProperty changeFreqProperty = cmso.readPropertyObject(f,"sitemap_change_frequency",true);
			CmsProperty priorityProperty = cmso.readPropertyObject(f,"sitemap_priority",true);
			String changeFrequency = changeFreqProperty.getValue("weekly");
			String priority = priorityProperty.getValue("1");
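			// Use equals(): == compares references and fails when the property is explicitly set to "false".
			if ("false".equals(secret.getValue("false")) 
				&& (thispath.endsWith("html")||thispath.endsWith("jsp")||thispath.endsWith("pdf"))) {		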
				sb.append("<url>\n");			
				sb.append("<loc>"+BASE_URL+thispath+"</loc>\n");
				// SimpleDateFormat is not thread-safe, so lock the shared instance while formatting.
				String niceDate;
				synchronized (ISO8601FORMAT) { niceDate = ISO8601FORMAT.format(new Date(f.getDateLastModified())); }
				sb.append("<lastmod>"+niceDate+"</lastmod>\n");
				sb.append("<changefreq>"+changeFrequency+"</changefreq>\n");
				sb.append("<priority>"+priority+"</priority>\n");
				sb.append("</url>\n");
			}
		}
		ArrayList folders = (ArrayList) cmso.getSubFolders(path);
		Iterator j = folders.iterator();	
		while (j.hasNext()) {
			CmsFolder f = (CmsFolder) j.next();
			sb.append( recurseTree(cmso,jsp, cmso.getSitePath(f) ) );
		}
	}
	catch (CmsException cmsException) {
		sb.append("A CMS exception occurred: "+cmsException.toString());
	
	}
	return sb.toString();

}

%>
<%

CmsJspActionElement cms = new org.opencms.jsp.CmsJspActionElement(pageContext, request, response);
CmsObject cmso = cms.getCmsObject();
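// Keep only the scheme and host of the OpenCms URL. Searching from index 8
// skips the "//" in "http://" or "https://" and finds the slash after the host.
String url = cms.info("opencms.url");
int hostEnd = url.indexOf("/",8);
BASE_URL = (hostEnd < 0) ? url : url.substring(0,hostEnd);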
out.println (recurseTree(cmso,cms, "/"));
%> 
</urlset>

You can download it (plus a readme file with slightly more info) from 
http://www.wdogsystems.com/opencms/opencms/downloads/index.jsp .
Tim

-- 
 Tim Howland
 Watchdog Systems, LLC
 (978) 225-8494
 http://wdogsystems.com
