Cocoon Plumber
Sunday 4 February 2007 @ 5:22 pm

Cocoon’s pipeline-driven service style is ideal for serving structured data on the web. It reduces every data problem to an XML issue, and every processing and viewing problem to an XSLT problem. The same tools can be brought to bear on all aspects of the service; as your skills deepen, the benefits are exponential.

As a Cocoon project becomes more complex, though, it’s helpful to have tools that let you see what’s going on inside the pipelines. Cocoon provides views, which allow you to see the output of each step in the pipeline, and profiling, which adds timings. Getting at these can be a little tedious when working intensively, so it would be nice to have a menu that shows you the pipeline that generated the current page and lets you see the views, the components, and the timings easily.

This project, called “Cocoon Plumber”, provides such a menu in the form of a Firefox sidebar. It has two components: a Firefox extension that generates the sidebar, and a Cocoon subproject that can be dropped into any existing Cocoon sitemap to provide the information to populate the sidebar.
[Screenshot: Cocoon Plumber]


Download and unzip the code. To try out the demo, add the cpdemo directory to the mount-table of a recent Cocoon installation, like this:

<mount uri-prefix="cpdemo" src="file://C:/Documents and Settings/Peter/My Documents/projects/cocoon-plumber/cpdemo/"/>

Install the Firefox extension cocoonplumber.xpi in the usual way. Restart Firefox. If your Cocoon isn’t running at http://localhost:8888/, open the options menu for the Cocoon-Plumber extension and change the base url as needed.

Now visit http://localhost:8888/cpdemo/blah1/blah2/test.html. The page you see is generated by a very simple pipeline: it starts with a simple xml file containing a “step” element, and a couple of stylesheets add two more steps, and then a final stylesheet turns it into html. To see all this, hit Ctrl-Shift-C to open the sidebar. You can inspect the xml and xsl files by clicking the links in the src attributes, and you can view the intermediate cocoon-views of the output of each step by clicking the label link.
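
For orientation, a pipeline of the kind just described might look roughly like the fragment below. This is only a sketch, not the sitemap shipped in cpdemo: the file names, the match pattern, and the label and view names are all assumptions chosen for illustration.

```xml
<!-- Hypothetical sketch: file, label and view names are illustrative,
     not taken from the cpdemo sitemap. -->

<!-- Inside <map:views>: one view per labelled step, each serialized as raw XML. -->
<map:view name="step1" from-label="step1">
  <map:serialize type="xml"/>
</map:view>
<map:view name="step2" from-label="step2">
  <map:serialize type="xml"/>
</map:view>
<map:view name="step3" from-label="step3">
  <map:serialize type="xml"/>
</map:view>

<!-- Inside <map:pipelines>: the pipeline that builds the page. -->
<map:pipeline>
  <map:match pattern="**/test.html">
    <!-- source file containing the first "step" element -->
    <map:generate src="test.xml" label="step1"/>
    <!-- two stylesheets, each adding one more step -->
    <map:transform src="add-step2.xsl" label="step2"/>
    <map:transform src="add-step3.xsl" label="step3"/>
    <!-- final stylesheet turns the result into html -->
    <map:transform src="to-html.xsl"/>
    <map:serialize type="html"/>
  </map:match>
</map:pipeline>
```

With views declared this way, appending ?cocoon-view=step2 to the page url would return the XML as it stands after the second step, which is the kind of link the sidebar labels open.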

Plumber is easy to add to an existing Cocoon project. Just copy the “plumber” directory from cpdemo into the base directory of the project (where the project sitemap.xmap is found). Add this pipeline to the project sitemap, at the top of the <map:pipelines> element:

  <map:pipeline>
    <map:match pattern="plumber/**">
      <map:mount check-reload="yes" uri-prefix="plumber" src="plumber/sitemap.xmap"/>
    </map:match>
    <map:match pattern="profile.xml">
      <map:generate type="profiler"/>
      <map:serialize type="xml"/>
    </map:match>
  </map:pipeline>

The second map:match is optional, and depends on your having configured profiling in your project. If profiling is not available, plumber will still work; it will simply not display timings.

Finally, you need to activate the CInclude and Source-Writing transformers, if they aren’t already used in your project:

<map:transformers default="xslt">
  <map:transformer name="cinclude" src="org.apache.cocoon.transformation.CIncludeTransformer"/>
  <map:transformer name="write-source" src="org.apache.cocoon.transformation.SourceWritingTransformer"/>
</map:transformers>

How it works

Firefox component:

The Firefox extension needs to be given the base url of the Cocoon project. With that, it can populate the sidebar with a menu appropriate to any page generated by the Cocoon project. The sidebar will contain a view of the pipeline, with all the stylesheets linked to allow the user to view them. The component labels are also linked to allow you to see the current page through that view. All sidebar links cause new tabs to open.

Cocoon component:

The sidebar gets its data from a sub-sitemap that generates HTML from the main sitemap, so that it can present a view of the current pipeline. This involves using the sitemap.xmap as the source xml in the generator, but the first problem is to identify the pipeline that is generating the current page. Since pipelines can use all sorts of different matchers, there’s no easy way to do this. I owe to Art Rhyno an ingenious solution: we apply a stylesheet to the base sitemap.xmap to produce a new plumber.xmap, which “hollows out” each pipeline and applies explicit ids. Since the plumber.xmap uses the same matchers as the base sitemap.xmap, it reliably identifies the pipeline that a given page is using. Deploying this new sitemap is a two-step process: we mount the plumber sub-project’s sitemap as a sub-sitemap, and it in turn generates the plumber sitemap from the base sitemap and mounts it to serve sidebar requests.
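
To make the stylesheet idea concrete, here is a minimal sketch of a transformation along those lines. It is not the stylesheet that ships with Cocoon Plumber; the numbering scheme and the pipeline-N.xml source names are assumptions made up for illustration. It copies the sitemap unchanged except that every matcher directly containing a generator keeps its attributes but has its contents replaced by a stub carrying a numeric id.

```xml
<?xml version="1.0"?>
<!-- Sketch only: NOT the stylesheet shipped with Cocoon Plumber.
     The id scheme and the pipeline-N.xml names are hypothetical. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:map="http://apache.org/cocoon/sitemap/1.0">

  <!-- Identity template: anything not handled below is copied as-is,
       so component declarations and enclosing matchers are preserved. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A matcher that directly holds a generator: keep the matcher and its
       attributes, but replace its contents with a stub that names the id. -->
  <xsl:template match="map:match[map:generate]">
    <!-- Number such matchers in document order: 1, 2, 3, ... -->
    <xsl:variable name="id">
      <xsl:number level="any" count="map:match[map:generate]"/>
    </xsl:variable>
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Hypothetical stub: a real version would serve a description of
           the original pipeline that has this id. -->
      <map:generate src="pipeline-{$id}.xml"/>
      <map:serialize type="xml"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because the matchers themselves are copied untouched, a request routed through the hollowed-out sitemap lands on the stub whose id corresponds to the pipeline that would have served it in the base sitemap. The sketch ignores matchers that use map:read, map:aggregate, redirects, or mounts; those are simply copied through by the identity template.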

This allows the Firefox extension to derive the plumber url from any given url from the base project: if (as in the supplied demo) the base project url is


then the plumber url will be


The “plumber” level points to the plumber sub-project sitemap, and the “sitemap” element points to the generated plumber sitemap.

To do

There’s plenty that could usefully be done to improve Cocoon Plumber. I’ve taken it as far as I have time to for the moment, but I’d welcome fixes or enhancements from any source.

  • the sidebar keeps showing the hourglass cursor even after the page has fully loaded; I can’t figure out why
  • mapping the profiling data to the components of the pipeline isn’t entirely reliable

Zotero – A First Glance
Friday 6 October 2006 @ 7:04 am

Well, Z-Day has come and gone and it’s time to tidy up the wrapping paper. My first impression of Zotero is positive: the integration of citation capture into the browser is looking good.

There is a range of generic options (create a new citation from scratch, capture current web page, capture link, create snapshot). More importantly, there are specific options when Zotero recognizes a page it knows how to scrape from: you see an icon in the location bar, like the RSS icon when Firefox discovers a feed. Click it to capture the citation. It did a decent job of grabbing book citations from our OPAC and from Amazon, and quite a nice job of grabbing article citations from JSTOR, with one important caveat, discussed below. By default it takes a snapshot when it creates a record, but this seems a little slow and I turned it off. Beyond individual books and articles, you see a folder icon when you’re on the search results screen in JSTOR: you can save the whole set of 25 records with one operation (though you need to check them off one by one: it needs a “check all” option). It takes a while, but it’s very cool. But turn off the snapshots: not only were they excruciatingly slow, they ended up as empty PDFs.

Once you’ve captured some stuff, you can use the “Locate” button to find them using an OpenURL pointed at your institutional resolver. Here a problem in the capture of JSTOR citations shows up: Zotero captures the date in an unstructured format (“Dec., 1931”), and then omits it from the OpenURL, presumably because it didn’t have a clean year and month. Without dates, SFX (for one) does a lousy job of managing thresholds, and so many of the OpenURLs failed to resolve back to JSTOR, although they would if they included the year. If I edit the date down to just the year “1931”, it gets included in the OpenURL and all is well. The problems with date handling pointed out by Bruce D’Arcus therefore have very practical consequences.

I’ve only played with the capture mechanisms so far; there are also functions for exporting and managing citations. Zotero’s beta 1.0 release has come so far, and has such strong backing, and such attractive plans, that I’m confident we can look forward to a really useful tool.

Zotero hiring
Friday 22 September 2006 @ 9:35 am

As if Zotero weren’t already looking cool enough, it now appears they’re hiring a senior programmer and a “Technology Evangelist” on two-year contracts. (This according to an email forwarded to the Code4Lib list by Raymond Yee; I can’t find a posting online). Since the programmer needs PHP and MySQL on top of Firefox skills, we can assume they’ll be getting serious about the collaborative possibilities of Zotero. Dan Cohen tantalizes us:

What if you could share a folder of references and notes with a colleague across the country? What if you could receive a feed of new resources in your area of interest? What if you could synchronize your Zotero library with a server and access it from anywhere? What if you could send your personal collection to other web services, e.g., a mapping service or text analyzer or translation engine?

All that on top of what looks like a very slick browser-based interface to capture metadata as you browse. I can’t wait.

[Update: Raymond’s message is now in the Code4Lib archive.]

