Wednesday, November 08, 2006

Using XML, XPath, and XSLT with JavaScript

Several JavaScript libraries try to provide a cross-browser XML API. I've previously cited zXML in my Inline SVG article; there's also Sarissa and XML for <Script>, and I'm sure there are others. However, when I started using them I discovered various problems and limitations for what I needed to do, so I decided to write my own cross-browser XML routines. I wanted an XML library that was as lightweight as possible, doing just what I needed and using as much of the built-in browser capabilities as possible. In this article, I will first present a simple XML class which just holds the XML DOM object returned from an AJAX request and provides functions to get and change node values based on XPaths. Secondly, I will give a simple XSLT class to perform XSL Transformations. The complete code for the XML and XSL library is available here.

XML

First off is the constructor, which determines whether we will be using IE (ActiveX) functions or W3C standard functions for XML manipulation, and either stores the XML DOM returned as the responseXML from an XMLHttpRequest (XHR) or creates an empty DOM object for later manipulation.
//
// XML
//
function XML(xmlDom) {
this.isIE = window.ActiveXObject;
if (xmlDom != null) {
 this.xmlDom = xmlDom;
} else {
 // create an empty document
 if (this.isIE) {
    // Try.these (from Prototype) returns the result of the first function that does not throw
    this.xmlDom = Try.these (
       function() { return new ActiveXObject("MSXML2.DOMDocument.5.0"); },
       function() { return new ActiveXObject("MSXML2.DOMDocument.4.0"); },
       function() { return new ActiveXObject("MSXML2.DOMDocument.3.0"); },
       function() { return new ActiveXObject("MSXML2.DOMDocument"); },
       function() { return new ActiveXObject("Microsoft.XmlDom"); }
    );
 } else {
    this.xmlDom = document.implementation.createDocument("", "", null);
 }
}
};
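The constructor decides between the two code paths by checking window.ActiveXObject. If you would rather test for the W3C API first and only then fall back to ActiveX, the document-creation step could be sketched as below. This is only an illustration: createEmptyXmlDocument and its env parameter are hypothetical names that are not part of the library, and the rest of the class would still branch on isIE.

```javascript
// Hypothetical helper (not part of the library): create an empty XML document,
// preferring the W3C API and falling back to ActiveX.
// `env` stands in for the browser's global object so the logic can be exercised
// outside a browser; in a page you would pass `window`.
function createEmptyXmlDocument(env) {
  var doc = env.document;
  if (doc && doc.implementation && doc.implementation.createDocument) {
    return doc.implementation.createDocument("", "", null);
  }
  if (env.ActiveXObject) {
    // same ProgIDs, in the same order, as the constructor above
    var progIds = [
      "MSXML2.DOMDocument.5.0",
      "MSXML2.DOMDocument.4.0",
      "MSXML2.DOMDocument.3.0",
      "MSXML2.DOMDocument",
      "Microsoft.XmlDom"
    ];
    for (var i = 0; i < progIds.length; i++) {
      try {
        return new env.ActiveXObject(progIds[i]);
      } catch (e) {
        // this ProgID is not available; try the next one
      }
    }
  }
  // neither API is available
  return null;
}
```

Passing the global object in as a parameter is just a convenience for trying the function with stand-in objects; it is not something the original code does.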
In case you're not getting the XML data from an XHR response, I also provided a load function to load an XML file from a URL.
// load
//
// Loads an XML file from a URL
XML.prototype.load = function(url) {
 this.xmlDom.async = false;
 this.xmlDom.load(url);
};
Next, I want to be able to find a single node in the XML DOM based on an XPath. The getNode function will do that for me.
// getNode
//
// get a single node from the XML DOM using the XPath
XML.prototype.getNode = function(xpath) {
 if (this.isIE) {
    var result = this.xmlDom.selectSingleNode(xpath);
 } else {
    var evaluator = new XPathEvaluator();
    var result = evaluator.evaluate(xpath, this.xmlDom, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null);
 }
 return result;
};
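Note that the two branches of getNode return different kinds of object. In IE, selectSingleNode returns the node itself (or null). On the W3C path, evaluate returns an XPathResult, and the node is held in its singleNodeValue property. That is why the functions in the rest of this article check node.singleNodeValue outside IE. If you need the real DOM node, for example to call setAttribute on it, a small helper can hide the difference. The name unwrapNode is hypothetical and not part of the library.

```javascript
// Hypothetical helper (not part of the library): return the underlying DOM node
// from whatever XML.prototype.getNode() produced, or null if nothing matched.
function unwrapNode(result, isIE) {
  if (!result) {
    return null;
  }
  // IE: selectSingleNode() already returned the node itself.
  // W3C: evaluate() returned an XPathResult; the node is in singleNodeValue.
  return isIE ? result : (result.singleNodeValue || null);
}

// Intended usage in a page (assumes `xml` is an instance of the XML class):
//   var el = unwrapNode(xml.getNode("/users/user[1]"), xml.isIE);
//   if (el) { el.setAttribute("active", "true"); }
```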
More often, I don't need the actual DOM node object but the value of that node, so I provide the getNodeValue function to get the value of a node.
// getNodeValue
//
// get the value of the element specified by the XPath in the XML DOM
XML.prototype.getNodeValue = function(xpath) {
 var value = null;
 try {
    var node = this.getNode(xpath);

    if (this.isIE && node) {
       value = node.text;
    } else if (!this.isIE && node.singleNodeValue) {
       value = node.singleNodeValue.textContent;
    }
 } catch (e) {}

 return value;
};
If you're just looking for a particular node or value, the previous functions are fine, but more often you need a group of nodes or node values (e.g. all the book titles). So, the full JavaScript file also contains corresponding functions to get multiple nodes (getNodes) and node values as an array (getNodeValues). At some point, you'll probably want to access this XML DOM as XML data (a string), so we need a function to serialize the XML from the DOM to a string. The getNodeAsXml function will do that for you. If you pass in the root XPath ("/"), it will serialize the complete XML file. Also, the included prettyPrintXml function can be used to escape the HTML characters.
// getNodeAsXml
//
// get the XML contents of a node specified by the XPath
XML.prototype.getNodeAsXml = function(xpath) {
 var str = null;
 var aNode = this.getNode(xpath);
 try {
    if (this.isIE) {
       str = aNode.xml;
    } else {
       var serializer = new XMLSerializer();
       str = serializer.serializeToString(aNode.singleNodeValue);
    }
 } catch (e) {
    str = "ERROR: No such node in XML";
 }
 return str;
};
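The multi-node functions (getNodes and getNodeValues) are in the full file and are not listed in this article. To give a rough idea of what the W3C side of such a function involves, and without claiming this is how the library does it: evaluating an XPath with XPathResult.ORDERED_NODE_SNAPSHOT_TYPE yields a result with a snapshotLength property and a snapshotItem(i) method, which can be walked into an array. The name snapshotToValues is hypothetical.

```javascript
// Hypothetical helper (not the library's getNodeValues): collect the text of
// every node in an XPath snapshot result.
// `snapshot` is expected to look like an XPathResult obtained with
// XPathResult.ORDERED_NODE_SNAPSHOT_TYPE (snapshotLength + snapshotItem(i)).
function snapshotToValues(snapshot) {
  var values = [];
  for (var i = 0; i < snapshot.snapshotLength; i++) {
    values.push(snapshot.snapshotItem(i).textContent);
  }
  return values;
}

// In a browser, the snapshot would come from something like:
//   var snapshot = new XPathEvaluator().evaluate("//book/title", xmlDom, null,
//       XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
//   var titles = snapshotToValues(snapshot);
```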
So far, we've only read the XML data we received. But we may also want to change the contents of the XML. So, here are the functions necessary to change existing node values (updateNodeValue), add new nodes/values (insertNode), and delete existing nodes (removeNode). Note: the XPaths used for node insertions cannot be overly complex as I wanted to keep this library simple.
// updateNodeValue
//
// update a specific element value in the XML DOM
XML.prototype.updateNodeValue = function(xpath, newvalue) {
 var node = this.getNode(xpath);
 var changeMade = false;
 newvalue = newvalue.trim();

 if (this.isIE && node) {
    if (node.text != newvalue) {
       node.text = newvalue;
       changeMade = true;
    }
 } else if (!this.isIE && node.singleNodeValue) {
    if (node.singleNodeValue.textContent != newvalue) {
       node.singleNodeValue.textContent = newvalue;
       changeMade = true;
    }
 } else {
    if (newvalue.length > 0) {
       this.insertNode(xpath);
       changeMade = this.updateNodeValue(xpath, newvalue);
    }
 }

 return changeMade;
};

// insertNode
//
// insert a new element (node) into the XML document based on the XPath
XML.prototype.insertNode = function(xpath) {
 var xpathComponents = xpath.split("/");
 var newChildName = xpathComponents.last();
 var parentPath = xpath.substr(0, xpath.length - newChildName.length - 1);
 var qualifierLoc = newChildName.indexOf("[");
 // remove qualifier for node being added
 if (qualifierLoc != -1) {
    newChildName = newChildName.substr(0, qualifierLoc);
 }
 var node = this.getNode(parentPath);
 var newChild = null;
 if (this.isIE && node) {
    newChild = this.xmlDom.createElement(newChildName);
    node.appendChild(newChild);
 } else if ((!this.isIE) && node.singleNodeValue) {
    newChild = this.xmlDom.createElement(newChildName);
    node.singleNodeValue.appendChild(newChild);
 } else {
    // add the parent, then re-try to add this child
    var parentNode = this.insertNode(parentPath);
    newChild = this.xmlDom.createElement(newChildName);
    parentNode.appendChild(newChild);
 }
 return newChild;
};

// removeNode
//
// remove an element (node) from the XML document based on the xpath
XML.prototype.removeNode = function(xpath) {
 var node = this.getNode(xpath);
 var changed = false;
 if (this.isIE && node) {
    node.parentNode.removeChild(node);
    changed = true;
 } else if ((!this.isIE) && node.singleNodeValue) {
    node.singleNodeValue.parentNode.removeChild(node.singleNodeValue);
    changed = true;
 }
 return changed;
};
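To make the restriction on insertion XPaths concrete: insertNode splits the XPath on "/", treats the last segment as the element to create, and strips any [...] qualifier from it. The string handling can be tried on its own with the sketch below. The name splitInsertPath is hypothetical, and it uses plain array indexing and slice where the library uses Prototype's last() and substr.

```javascript
// Hypothetical helper mirroring the string handling at the top of insertNode:
// split an XPath into the parent path and the name of the element to create.
function splitInsertPath(xpath) {
  var components = xpath.split("/");
  var childName = components[components.length - 1];
  // everything before the final "/" is the parent path
  var parentPath = xpath.slice(0, xpath.length - childName.length - 1);
  // remove a qualifier such as [2] from the element being added
  var qualifierLoc = childName.indexOf("[");
  if (qualifierLoc != -1) {
    childName = childName.slice(0, qualifierLoc);
  }
  return { parentPath: parentPath, childName: childName };
}
```

With "/library/book[2]/title" this gives the parent path "/library/book[2]" and the child name "title". A predicate that itself contains a slash, such as "/library/book[author/name='X']", is split in the wrong place: the parent path becomes "/library/book[author" and the child name "name='X']", which is why such XPaths cannot be used for insertion.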
You should now be able to fully read and manipulate XML using XPaths. For any significant reformatting or processing of the XML, you'll probably want to use XSLT. So, next I'll provide a simple XSLT class. It preloads an XSL file into a DOM object in its constructor. This provides rapid transformations of new XML data passed in to its transform function (including parameters). First, the constructor, which requires a URL pointing to the XSL file to be used for transformations. It does not accept a DOM object like the XML class did, as it is assumed that the XSL (layout information) is more-or-less static and stored in a file, whereas the XML data would most likely be dynamic and received via an XHR request.
//
// XSLT Processor
//
function XSLT(xslUrl) {
 this.isIE = window.ActiveXObject;
 if (this.isIE) {
    var xslDom = new ActiveXObject("MSXML2.FreeThreadedDOMDocument");
    xslDom.async = false;
    xslDom.load(xslUrl);
    if (xslDom.parseError.errorCode != 0) {
       var strErrMsg = "Problem Parsing Style Sheet:\n" +
          " Error #: " + xslDom.parseError.errorCode + "\n" +
          " Description: " + xslDom.parseError.reason + "\n" +
          " In file: " + xslDom.parseError.url + "\n" +
          " Line #: " + xslDom.parseError.line + "\n" +
          " Character # in line: " + xslDom.parseError.linepos + "\n" +
          " Character # in file: " + xslDom.parseError.filepos + "\n" +
          " Source line: " + xslDom.parseError.srcText;
       alert(strErrMsg);
       return false;
    }
    var xslTemplate = new ActiveXObject("MSXML2.XSLTemplate");
    xslTemplate.stylesheet = xslDom;
    this.xslProcessor = xslTemplate.createProcessor();
 } else {
    var xslDom = document.implementation.createDocument("", "", null);
    xslDom.async = false;
    xslDom.load(xslUrl);
    this.xslProcessor = new XSLTProcessor();
    this.xslProcessor.importStylesheet(xslDom);
 }
};
Finally, the transform function will perform the XSL transformation and return the result. It accepts parameters to be passed to the XSL as an associative array of parameter name and value pairs.
// transform
//
// Transform an XML document
XSLT.prototype.transform = function(xml, params) {
 // set stylesheet parameters
 for (var param in params) {
    if (typeof params[param] != 'function') {
       if (this.isIE) {
          this.xslProcessor.addParameter(param, params[param]);
       } else {
          this.xslProcessor.setParameter(null, param, params[param]);
       }
    }
 }

 if (this.isIE) {
    this.xslProcessor.input = xml.xmlDom;
    this.xslProcessor.transform();
    var output = this.xslProcessor.output;
 } else {
    var resultDOM = this.xslProcessor.transformToDocument(xml.xmlDom);
    var serializer = new XMLSerializer();
    var output = serializer.serializeToString(resultDOM);
 }
 return output;
};
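One detail of the parameter loop is worth spelling out. On the W3C path, values set with setParameter stay on the XSLTProcessor until they are removed, so a parameter passed in one transform call is still in effect on the next call unless it is overwritten. If that matters for your stylesheet, the parameters can be cleared first with clearParameters(). The sketch below covers only the W3C branch; applyParams is a hypothetical name, not part of the library.

```javascript
// Hypothetical helper for the W3C branch: reset parameters, then apply new ones.
// `processor` is expected to offer clearParameters() and setParameter(),
// as XSLTProcessor does.
function applyParams(processor, params) {
  processor.clearParameters();
  for (var name in params) {
    // skip function-valued entries, as the loop in transform does
    if (typeof params[name] != "function") {
      processor.setParameter(null, name, params[name]);
    }
  }
}

// Intended usage inside the non-IE branch of transform:
//   applyParams(this.xslProcessor, params);
```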
You should now be able to easily utilize XML and XSL on your web pages. An example usage of this XML/XSLT library would be to first build an XSLT object during the page initialization (onload).
xslProcessor = new XSLT(xslUrl);
Then, when XML responses are received from an AJAX request, process them. For example,
// get the XML data
try {
 var xmlData = new XML(request.responseXML);
} catch (e) {
 alert(request.responseText);
}

// get the username out of the XML
var userName = xmlData.getNodeValue('//userName');

// transform the XML to some HTML content
var newData = xslProcessor.transform(xmlData, {'param':'value'});
document.getElementById('someDiv').innerHTML = newData;
Update (February 27, 2007): I previously had a function in my XML class to generate valid HTML for the XML, but I've now modified it to use Oliver Becker's XML to HTML Verbatim Formatter with Syntax Highlighting stylesheet to format the XML. If that is not available, it falls back to simply serializing the XML and converting the special characters. The code for this looks like:
// Define a class variable which can be used to apply a style sheet to
// the XML to format it to HTML
XML.htmlFormatter = new XSLT("xsl/xmlverbatim.xsl");

// toHTML
//
// Transform the XML into formatted HTML
//
XML.prototype.toHTML = function() {
 var html = null;
 if (XML.htmlFormatter) {
    html = XML.htmlFormatter.transform(this);
 } else {
    html = this.getNodeAsXml('/');
    if (html != null) {
       // escape "&" first so the entities added below are not escaped again
       html = html.replace(/&/g, "&amp;");
       html = html.replace(/</g, "&lt;");
       html = html.replace(/>/g, "&gt;<br/>");
    }
 }
 return html;
};
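The fallback branch is plain string escaping, and the order of the replacements matters: the ampersand has to be handled first, otherwise the "&" in the freshly inserted "&lt;" and "&gt;" entities would be escaped a second time. As a standalone sketch (escapeXmlForHtml is a hypothetical name, not a function in the library):

```javascript
// Hypothetical helper: escape serialized XML so it displays as text in HTML,
// adding a line break after each closing angle bracket as toHTML does.
function escapeXmlForHtml(xmlString) {
  return xmlString
    .replace(/&/g, "&amp;")      // must run first
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;<br/>");
}
```

Each replace call works on the string returned by the previous one, but since the third pattern only matches a literal ">" and the inserted "<br/>" is added by that same final pass, the inserted markup is not escaped again.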
As before, the complete package is available to download.
Update (March 7, 2007): I have updated this package to use Google's AJAXSLT XSL-T implementation when a native JavaScript XSLT function is not available. I need to test it more before releasing it, however.

8 comments:

Anonymous said...

Very useful!!!

Dinesh Sharma said...

its really very usefull.
can you provide the test page....

Dinesh Sharma said...

I am not able to load my xml file using the below code.....
var obj=new XML(new XMLHttpRequest());
obj.load(../app_data/xmlData.xml);
var node=obj.getNode("test/NewUser");

pls help me if I am wrong.

Steven Pothoven said...

Dinesh, you're mixing two conflicting approaches. If you want to pass the XML retrieved from an XMLHttpRequest, you need to give it the responseXML attribute of the response from the XMLHttpRequest (see http://blog.pothoven.net/2006/01/ajax-in-practice.html for an example of sending and processing an XMLHttpRequest object, but I'd recommend using one of the many JavaScript libraries that do it for you -- prototype.js, jquery, etc.). In this case you'd do:

var xml = new XML(response.responseXML);
var node = xml.getNode("test/NewUser");

(assuming "response" is the result of the XMLHttpRequest)

The load method of my XML class is an alternative way to retrieve the XML data, which will fall back to an XMLHttpRequest for you if the browser can't load it directly. In your case, that might be the easier way to go. Try:

var xml = new XML();
xml.load('../app_data/xmlData.xml');
var node = xml.getNode('test/NewUser');

Anonymous said...

Hi!!! blog.pothoven.net is one of the most outstanding informational websites of its kind. I take advantage of reading it every day. I will be back.

Quantum said...

very useful, but do you have a cloneNode or importNode method? it would be very useful.
I also have a problem setting attributes in some cases. for example when i insert a node using
var myNode = myXml.insertNode(myXpath);
i can write
myNode.setAttribute("myAttrName", myAttrValue)

but when i'm using
var myNode = myXml.getNode(myXpath);
i can't set the attribute.
please help me!
thanks

Unknown said...

The beginning of the script seems to start with the difficult and work towards the easy; the opposite of my usual approach. I suggest starting with if (document.implementation && document.implementation.createDocument) { // since this is the W3C approved approach...

Then as part of the else clause, put in the IE way of doing things.

Gilles1395 said...

hi,
i'm trying to move my application, which uses your xml.js, from firefox 3.5.3 to firefox 17 esr.
it doesn't work, and it seems the problem is in the getNode function. the XPathEvaluator evaluate call is throwing an error.
do you have any solution?
thanks for your answer.