Java – an XSLT custom function that returns a node set or an XML fragment (not a simple data type)
I'm trying to develop an XSLT custom function that can return node sets or XML fragments. Let's say:
Input file:
<root>
  <!--
    author: blablabla
    usage: more blablabla
    labelC: [in=2] <b>formatted</b> blablabla
  -->
  <tag1 name="first">
    <tag2>content a</tag2>
    <tag2>content b</tag2>
    <tag3 attrib="val">content c</tag3>
  </tag1>
  <!--
    author: blebleble
    usage: more blebleble
    labelC: blebleble
  -->
  <tag1 name="second">
    <tag2>content x</tag2>
    <tag2>content y</tag2>
    <tag3 attrib="val">content z</tag3>
  </tag1>
</root>
An XSLT template such as:
<xsl:template match="//tag1/preceding::comment()[1]" xmlns:d="java:com.dummy.func">
  <section>
    <para>
      <xsl:value-of select="d:genDoc(.)"/>
    </para>
  </section>
</xsl:template>
should produce:
<section>
  <para>
    <author>blablabla</author>
    <usage>more blablabla</usage>
    <labelC in="2"><b>formatted</b> blablabla</labelC>
  </para>
</section>
when matching the comment before the first occurrence of tag1, and
<section>
  <para>
    <author>blebleble</author>
    <usage>more blebleble</usage>
    <labelC>blebleble</labelC>
  </para>
</section>
when matching the comment before the second occurrence.
Basically, I want this custom function to parse some metadata held in the comment and use it to generate XML.
I found some examples on the Internet. One is: http://cafeconleche.org/books/xmljava/chapters/ch17s03.html
According to that example, my function should return one of the following types:
org.w3c.dom.traversal.NodeIterator
org.apache.xml.dtm.DTM
org.apache.xml.dtm.DTMAxisIterator
org.apache.xml.dtm.DTMIterator
org.w3c.dom.Node and its subtypes (Element, Attr, etc.)
org.w3c.dom.DocumentFragment
I can implement a function that returns the XML as a simple string. However, that causes other problems, mainly that markup characters are escaped when the string is inserted into the output XML.
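The escaping problem can be reproduced with a minimal sketch. It assumes the XSLT processor bundled with the JDK (javax.xml.transform); the class name EscapingDemo and its run method are made up for illustration. The stylesheet inserts the comment text with xsl:value-of, which is what happens to any string returned by a function.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, not part of the original post.
public class EscapingDemo {

    // Stylesheet that writes the string value of the comment into <para>.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/'>"
        + "<para><xsl:value-of select='string(/root/comment())'/></para>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the stylesheet against the given XML and returns the serialized result.
    public static String run(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

Calling EscapingDemo.run("<root><!--<b>formatted</b>--></root>") returns a para element whose content is escaped text: the angle brackets of the b tag come out as character entities rather than as an element.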
Is there an example of how to implement such a function? What interests me most is how to return a proper set of XML nodes to the calling template.
Solution
The following may get you where you want to go. Note that it requires XSLT 2.0 (the same can be done in XSLT 1.0 if a replacement for the tokenize function is provided). Note also that it assumes a specific structure for the comment content.
Edit: an XSLT version that uses a function is shown at the bottom.
XSLT
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <output>
      <xsl:apply-templates/>
    </output>
  </xsl:template>

  <!-- Suppress the children of tag1 -->
  <xsl:template match="tag1/*"/>

  <xsl:template match="comment()">
    <section>
      <para>
        <!-- One token per non-empty line of the comment; &#xA; is the newline character -->
        <xsl:for-each select="tokenize(.,'&#xA;')[string-length() != 0]">
          <!-- Split "name: value" on the colon -->
          <xsl:variable name="splitup" select="tokenize(normalize-space(current()),':')"/>
          <xsl:choose>
            <xsl:when test="$splitup[1]='author'">
              <author><xsl:value-of select="normalize-space($splitup[2])"/></author>
            </xsl:when>
            <xsl:when test="$splitup[1]='usage'">
              <usage><xsl:value-of select="normalize-space($splitup[2])"/></usage>
            </xsl:when>
            <xsl:when test="$splitup[1]='labelC'">
              <labelC>
                <xsl:for-each select="tokenize($splitup[2],'] ')[string-length() != 0]">
                  <xsl:variable name="labelCpart" select="normalize-space(current())"/>
                  <xsl:choose>
                    <!-- A part starting with "[" is an attribute, e.g. [in=2] -->
                    <xsl:when test="substring($labelCpart,1,1) = '['">
                      <xsl:variable name="attr" select="tokenize(substring($labelCpart,2),'=')"/>
                      <xsl:attribute name="{$attr[1]}"><xsl:value-of select="$attr[2]"/></xsl:attribute>
                    </xsl:when>
                    <xsl:otherwise>
                      <xsl:value-of select="$labelCpart"/>
                    </xsl:otherwise>
                  </xsl:choose>
                </xsl:for-each>
              </labelC>
            </xsl:when>
          </xsl:choose>
        </xsl:for-each>
      </para>
    </section>
  </xsl:template>
</xsl:stylesheet>
When applied to the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <!--
    author: blablabla
    usage: more blablabla
    labelC: [in=2] <b>formatted</b> blablabla
  -->
  <tag1 name="first">
    <tag2>content a</tag2>
    <tag2>content b</tag2>
    <tag3 attrib="val">content c</tag3>
  </tag1>
  <!--
    author: blebleble
    usage: more blebleble
    labelC: blebleble
  -->
  <tag1 name="second">
    <tag2>content x</tag2>
    <tag2>content y</tag2>
    <tag3 attrib="val">content z</tag3>
  </tag1>
</root>
the following output is produced:
<?xml version="1.0" encoding="UTF-8"?>
<output>
  <section>
    <para>
      <author>blablabla</author>
      <usage>more blablabla</usage>
      <labelC in="2"><b>formatted</b> blablabla</labelC>
    </para>
  </section>
  <section>
    <para>
      <author>blebleble</author>
      <usage>more blebleble</usage>
      <labelC>blebleble</labelC>
    </para>
  </section>
</output>
Edited XSLT that calls a function (giving the same output):
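<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:d="java:com.dummy.func" exclude-result-prefixes="d">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <output>
      <xsl:apply-templates/>
    </output>
  </xsl:template>

  <xsl:template match="tag1/*"/>

  <!-- Same logic as the comment() template of the first stylesheet, applied to the parameter -->
  <xsl:function name="d:section">
    <xsl:param name="comm"/>
    <section>
      <para>
        <xsl:for-each select="tokenize($comm,'&#xA;')[string-length() != 0]">
          <xsl:variable name="splitup" select="tokenize(normalize-space(current()),':')"/>
          <xsl:choose>
            <xsl:when test="$splitup[1]='author'">
              <author><xsl:value-of select="normalize-space($splitup[2])"/></author>
            </xsl:when>
            <xsl:when test="$splitup[1]='usage'">
              <usage><xsl:value-of select="normalize-space($splitup[2])"/></usage>
            </xsl:when>
            <xsl:when test="$splitup[1]='labelC'">
              <labelC>
                <xsl:for-each select="tokenize($splitup[2],'] ')[string-length() != 0]">
                  <xsl:variable name="labelCpart" select="normalize-space(current())"/>
                  <xsl:choose>
                    <xsl:when test="substring($labelCpart,1,1) = '['">
                      <xsl:variable name="attr" select="tokenize(substring($labelCpart,2),'=')"/>
                      <xsl:attribute name="{$attr[1]}"><xsl:value-of select="$attr[2]"/></xsl:attribute>
                    </xsl:when>
                    <xsl:otherwise>
                      <xsl:value-of select="$labelCpart"/>
                    </xsl:otherwise>
                  </xsl:choose>
                </xsl:for-each>
              </labelC>
            </xsl:when>
          </xsl:choose>
        </xsl:for-each>
      </para>
    </section>
  </xsl:function>

  <!-- copy-of keeps the elements returned by the function -->
  <xsl:template match="comment()">
    <xsl:copy-of select="d:section(.)"/>
  </xsl:template>
</xsl:stylesheet>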
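For the Java route the question asks about, here is a minimal sketch of an extension function that returns an org.w3c.dom.DocumentFragment, one of the return types listed in the question. It rests on several assumptions: the class name CommentDoc is hypothetical, the processor is assumed to pass the comment to the function as a string, each comment line is assumed to have the form "name: value" with an optional leading "[attr=value]", and the value is assumed to be well-formed XML content.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical class name; the question only names the function genDoc.
public class CommentDoc {

    // Optional leading attribute such as "[in=2]", followed by the element content.
    private static final Pattern ATTR =
            Pattern.compile("^\\[(\\w+)=([^\\]]*)\\]\\s*(.*)$");

    // Builds one element per "name: value" line and returns them as a fragment.
    public static DocumentFragment genDoc(String comment) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            DocumentFragment fragment = doc.createDocumentFragment();

            for (String line : comment.split("\\R")) {
                int colon = line.indexOf(':');
                if (colon < 0) {
                    continue; // blank line or no "name: value" pair
                }
                String name = line.substring(0, colon).trim();
                String value = line.substring(colon + 1).trim();
                if (name.isEmpty()) {
                    continue;
                }

                String attrName = null;
                String attrValue = null;
                Matcher m = ATTR.matcher(value);
                if (m.matches()) {
                    attrName = m.group(1);
                    attrValue = m.group(2);
                    value = m.group(3);
                }

                // Parse the value as XML so inline markup such as <b> becomes real nodes.
                String xml = "<" + name + ">" + value + "</" + name + ">";
                Document parsed = builder.parse(new InputSource(new StringReader(xml)));
                Element element = (Element) doc.importNode(parsed.getDocumentElement(), true);
                if (attrName != null) {
                    element.setAttribute(attrName, attrValue);
                }
                fragment.appendChild(element);
            }
            return fragment;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse comment: " + comment, e);
        }
    }
}
```

In the stylesheet the result would then be inserted with xsl:copy-of, for example select="d:genDoc(string(.))", rather than xsl:value-of, because value-of outputs only the string value of the returned nodes. How the d prefix is bound to the class depends on the processor.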