CoverStory
LANGUAGES: C#
ASP.NET VERSIONS: 2.0
New XML Features in .NET Version 2: Part I
The XmlReader and XmlWriter Classes
By Dan Wahlin
It's fascinating how fast XML technologies have advanced since the initial release of the XML 1.0 specification in 1998. Prior to XML, flat files were the norm (actually, they still are in many of today's industries ... but I digress), which meant that developers had to write custom parsing, validation, and transformation mechanisms to work with data. XML introduced the concept of describing data through simple begin and end tags and brought several other vital components to the table, including parsers and validators. With the release of the XSLT 1.0 specification in 1999, XML data could even be transformed to a variety of formats quickly and easily with a minimal amount of programming code.
Version 1.1 of the .NET Framework provided excellent support for a variety of mainstream XML technologies, including DTDs, XML Schemas, XSLT 1.0, XPath 1.0, SOAP 1.1, DOM Level 2, and XML namespaces. All of this functionality was available out of the box; that is, without having to install additional add-ons. The release of .NET version 2 continues the tradition of excellent support for XML technologies. In fact, just to name a few of the enhancements, this new release adds significantly more performant XSLT transformation capabilities and enhanced XmlReader and XmlWriter APIs plus a more powerful XPathNavigator class.
In this article series I'll discuss some of the new XML features found in .NET version 2, and demonstrate how they can be used to work with XML data in a variety of ways. In Part I, I'll focus on the XmlReader and XmlWriter classes; in Part II, I'll examine new enhancements to the XmlDocument and XPathNavigator classes and take a look at new classes, such as XslCompiledTransform and XmlDataSource.
What's New with XmlReader?
The .NET Framework offers several different classes within the System.Xml namespace (and associated sub-namespaces) for parsing XML data, including XmlReader, XmlDocument, XPathNavigator, and XmlSerializer. The fastest, most scalable, and most memory-efficient API for parsing XML data is found in the XmlReader class. XmlReader provides a forward-only parsing API that allows large amounts of XML data to be parsed quickly. Version 2's XmlReader boasts up to a 100% increase in performance as compared to version 1.1.
Version 2 of .NET introduces a new API for creating and using an XmlReader object. Previously, in version 1.1, developers would write code similar to the following example to parse XML data and locate an element named <Address>:
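XmlTextReader reader = new XmlTextReader(xmlFilePath);
while (reader.Read()) {
  //Check the node type so the closing </Address> tag is not matched
  if (reader.NodeType == XmlNodeType.Element &&
      reader.Name == "Address") {
    string city = reader.GetAttribute("City");
  }
}
reader.Close();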
In version 2, instantiating the XmlTextReader class directly is no longer the recommended approach, and the abstract XmlReader class has been enhanced significantly. Although you still can't create an instance of an XmlReader directly using the new keyword, you can call its new static Create method:
XmlReader reader = XmlReader.Create(xmlFilePath);
while (reader.Read()) {
  //Parse XML data
}
reader.Close();
The Create method returns an XmlReader instance that can be used much like the XmlTextReader class in version 1.1. However, the XmlReader class found in .NET version 2 contains an abundance of new methods that make it easier to locate data within an XML document and convert it to various data types. Some of my favorite new methods include ReadToDescendant, ReadToNextSibling, and ReadSubtree, as well as other methods, such as ReadContentAsDateTime, which allow XML data to be converted automatically to CLR data types without writing conversion code.
ReadToDescendant provides a simple way to locate a specific descendant node within an XML document. For example, if one or more elements named <Address> exist within a document, ReadToDescendant can be used to move to the first <Address> element:
//using keyword automatically closes the XmlReader in C#
using (XmlReader reader = XmlReader.Create(xmlFilePath)) {
  //Move to first <Address> element
  reader.ReadToDescendant("Address");
}
ReadToNextSibling can be used to move from a node on which an XmlReader is currently positioned (such as <Address>) to the next sibling with the same name in the document:
using (XmlReader reader = XmlReader.Create(xmlFilePath)) {
  //Move to first <Address> element and then loop
  //through all siblings
  if (reader.ReadToDescendant("Address")) {
    do {
      string city = reader.GetAttribute("City");
      //Access other <Address> element data
    } while (reader.ReadToNextSibling("Address"));
  }
}
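The typed read methods mentioned earlier are not shown in the snippets above, so here is a minimal sketch of how they might be combined with ReadSubtree. The <Orders> document and the TypedReadSketch class are hypothetical and exist only for illustration. Note that ReadElementContentAsDateTime and ReadElementContentAsInt advance the reader past the element they consume, which is why the loop calls Read only when neither element matched.

```csharp
using System;
using System.IO;
using System.Xml;

public static class TypedReadSketch
{
    // Hypothetical sample data; the element names are illustrative only.
    public const string SampleXml =
        "<Orders>" +
        "<Order><Placed>2005-06-01T09:30:00</Placed><Quantity>3</Quantity></Order>" +
        "<Order><Placed>2005-06-02T14:00:00</Placed><Quantity>5</Quantity></Order>" +
        "</Orders>";

    // Sums the <Quantity> values and reports the most recent <Placed> date.
    public static int Summarize(string xml, out DateTime latest)
    {
        int total = 0;
        latest = DateTime.MinValue;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent(); // position on the root element
            if (!reader.ReadToDescendant("Order"))
            {
                return total;
            }
            do
            {
                // ReadSubtree scopes a second reader to the current <Order> only.
                using (XmlReader order = reader.ReadSubtree())
                {
                    while (!order.EOF)
                    {
                        if (order.NodeType == XmlNodeType.Element && order.Name == "Placed")
                        {
                            // Typed read: no manual parsing of the date string.
                            DateTime placed = order.ReadElementContentAsDateTime();
                            if (placed > latest) latest = placed;
                        }
                        else if (order.NodeType == XmlNodeType.Element && order.Name == "Quantity")
                        {
                            total += order.ReadElementContentAsInt();
                        }
                        else
                        {
                            order.Read();
                        }
                    }
                }
            } while (reader.ReadToNextSibling("Order"));
        }
        return total;
    }

    public static void Main()
    {
        DateTime latest;
        int total = Summarize(SampleXml, out latest);
        Console.WriteLine("Total quantity: " + total);
        Console.WriteLine("Latest order: " + latest.ToString("s"));
    }
}
```

Closing the subtree reader leaves the outer reader on the end tag of the current <Order>, so ReadToNextSibling can continue from there.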
New methods aren't the only enhancements that have been added to the XmlReader class. It's also capable of performing validation in version 2. In version 1.1, an XmlValidatingReader had to be instantiated to validate XML against a DTD or XSD schema. In version 2, however, XmlReader can be used to both parse and validate data. Validation is turned on by using XmlReaderSettings, a new class available in the framework.
XmlReaderSettings allows a variety of features to be set as a reader parses XML data. For example, white space, comments, and processing instructions can be ignored by setting the XmlReaderSettings IgnoreWhitespace, IgnoreComments, and IgnoreProcessingInstructions properties to true. By setting the CheckCharacters property to true, character checking can be performed to ensure that only legal XML characters are used in the document. Validation can be performed by adding one or more schemas to the XmlReaderSettings Schemas collection and by setting the ValidationType property to ValidationType.Schema.
Once properties have been set on an XmlReaderSettings object, it can be passed to the XmlReader's Create method. Figure 1 shows an example of using the XmlReaderSettings class, along with the XmlReader class, to validate an XML document against an XSD schema.
//Requires the System.Xml and System.Xml.Schema namespaces
//Assume success until the validation handler reports an error
private bool status = true;

public void btnSubmit_Click(object sender, EventArgs e) {
  string xmlPath = Server.MapPath("~/XML/MSDN.xml");
  string schemaPath = Server.MapPath("~/Schemas/MSDN.xsd");
  //Load schema used to validate
  XmlSchemaSet schemaSet = new XmlSchemaSet();
  schemaSet.Add(String.Empty, schemaPath);
  schemaSet.Compile();
  //Define XmlReader settings
  XmlReaderSettings readerSettings = new XmlReaderSettings();
  readerSettings.ValidationType = ValidationType.Schema;
  readerSettings.Schemas = schemaSet;
  //Hook up event handler to handle any validation errors
  readerSettings.ValidationEventHandler +=
    new ValidationEventHandler(ValidationEventHandler);
  //Create XmlReader instance and pass in XmlReaderSettings
  using (XmlReader reader =
    XmlReader.Create(xmlPath, readerSettings)) {
    //Reading through the document triggers validation
    while (reader.Read()) { }
  }
  this.lblOutput.Text =
    (status) ? "Validation Succeeded!" : "Validation Failed!";
}

private void ValidationEventHandler(object sender,
  ValidationEventArgs e) {
  status = false;
}
Figure 1: The XmlReaderSettings class allows various XmlReader features to be turned on or off easily. This example demonstrates how to validate an XML document against an XSD schema that is loaded into the new XmlSchemaSet class.
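Validation is only one of the features XmlReaderSettings controls. As a rough sketch of the filtering properties described earlier, the following example counts the nodes a reader reports with default settings and again with white space, comments, and processing instructions ignored. The sample document and the ReaderSettingsSketch class are hypothetical and exist only for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ReaderSettingsSketch
{
    // Hypothetical sample document containing a comment, a processing
    // instruction, and insignificant white space.
    public const string SampleXml =
        "<!-- header -->\n<Customers>\n  <?hint cache?>\n" +
        "  <Customer Name=\"Ann\" />\n</Customers>";

    // Settings that drop everything except elements, attributes, and text.
    public static XmlReaderSettings CreateLeanSettings()
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = true;
        settings.IgnoreComments = true;
        settings.IgnoreProcessingInstructions = true;
        return settings;
    }

    // Counts every node the reader reports for the supplied settings.
    public static int CountNodes(string xml, XmlReaderSettings settings)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        int all = CountNodes(SampleXml, new XmlReaderSettings());
        int lean = CountNodes(SampleXml, CreateLeanSettings());
        Console.WriteLine("Nodes with default settings: " + all);
        Console.WriteLine("Nodes with filtering enabled: " + lean);
    }
}
```

With filtering enabled, only the <Customers> start tag, the empty <Customer> element, and the </Customers> end tag are reported, which keeps parsing loops free of node-type checks for content the application does not care about.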
What's New with XmlWriter?
The XmlReader class provides an efficient way to parse XML documents, but it can't be used to generate XML data. That's where .NET's XmlWriter class comes into play. XmlWriter allows XML data to be written to a variety of output sources, including Streams, TextWriters, files, and other XmlWriters.
Although the XmlWriter API hasn't changed as substantially as the XmlReader API in version 2, there are several changes that you can leverage to generate dynamic RSS feeds or other custom XML documents. One of the most important changes is in the way it is instantiated. In version 1.1, the following code would typically be written to generate XML:
XmlTextWriter writer = new XmlTextWriter(
  filePath, Encoding.UTF8);
writer.WriteStartElement("golfers");
writer.WriteStartElement("golfer");
writer.WriteAttributeString("name", "Mike");
writer.WriteAttributeString("handicap", "14");
writer.WriteEndElement();
writer.WriteEndElement();
writer.Close();
Version 1.1 contained an XmlWriter class, but it couldn't be instantiated using the new keyword because it was abstract. In version 2, the XmlWriter class is still abstract, but it now exposes a Create method that accepts a variety of parameters that control where the XML output goes:
//using keyword automatically closes the XmlWriter in C#
using (XmlWriter writer = XmlWriter.Create(path)) {
  writer.WriteStartElement("golfers");
  writer.WriteStartElement("golfer");
  writer.WriteAttributeString("name", "Simon");
  writer.WriteAttributeString("handicap", "2");
  writer.WriteEndElement();
  writer.WriteEndElement();
}
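Because Create accepts different output targets, the same writing code can be reused regardless of where the XML ends up. The following is a minimal sketch that sends the golfers document to a StringBuilder and to a MemoryStream; the WriterTargetsSketch class and its method names are hypothetical and exist only for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class WriterTargetsSketch
{
    // Writes the same small document to whatever target the writer wraps.
    private static void WriteGolfers(XmlWriter writer)
    {
        writer.WriteStartElement("golfers");
        writer.WriteStartElement("golfer");
        writer.WriteAttributeString("name", "Simon");
        writer.WriteAttributeString("handicap", "2");
        writer.WriteEndElement(); // close golfer
        writer.WriteEndElement(); // close golfers
    }

    // Target 1: an in-memory string, handy for display or logging.
    public static string WriteToString()
    {
        StringBuilder sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb))
        {
            WriteGolfers(writer);
        }
        return sb.ToString();
    }

    // Target 2: a stream, such as a file, network, or memory stream.
    public static byte[] WriteToBytes()
    {
        using (MemoryStream ms = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(ms))
            {
                WriteGolfers(writer);
            }
            return ms.ToArray();
        }
    }

    public static void Main()
    {
        Console.WriteLine(WriteToString());
        Console.WriteLine("Bytes written to stream: " + WriteToBytes().Length);
    }
}
```

Disposing the writer flushes any buffered output, so the string and byte array are read only after the inner using block has ended.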
The XmlWriter's Create method can also accept an instance of a new class named XmlWriterSettings that can be used to apply indentation to XML output or cause attributes to be placed on separate lines. It can also be used to control the output encoding or prevent the XML declaration from being written out. Figure 2 demonstrates how the XmlWriterSettings class can be used. The output is shown in Figure 3.
XmlWriterSettings ws = new XmlWriterSettings();
ws.Indent = true;
ws.CheckCharacters = true;
ws.NewLineOnAttributes = true;
StringWriter sw = new StringWriter();
using (XmlWriter writer = XmlWriter.Create(sw, ws)) {
  writer.WriteStartElement("customers");
  writer.WriteStartElement("customer");
  writer.WriteAttributeString(
    "id", Guid.NewGuid().ToString());
  writer.WriteAttributeString("fname", "John");
  writer.WriteAttributeString("lname", "Doe");
  writer.WriteEndElement();
  writer.WriteStartElement("customer");
  writer.WriteAttributeString(
    "id", Guid.NewGuid().ToString());
  writer.WriteAttributeString("fname", "Jane");
  writer.WriteAttributeString("lname", "Doe");
  writer.WriteEndElement();
  writer.WriteEndElement(); //close customers
}
this.txtXml.Text = sw.GetStringBuilder().ToString();
Figure 2: The XmlWriterSettings class can be used to control several different features of XmlWriter. The code shown here uses XmlWriterSettings to ensure that valid XML characters are used, indentation is applied, and attributes are placed on their own line in the output.
<customers>
  <customer
    id="83ef3265-e94d-4587-ac00-faaee5ca4f03"
    fname="John"
    lname="Doe" />
  <customer
    id="74be7621-632d-4f54-be1d-9f77a3ddf83b"
    fname="Jane"
    lname="Doe" />
</customers>
Figure 3: XML output generated from running the code shown in Figure 2. Notice that each attribute is placed on its own line because the XmlWriterSettings NewLineOnAttributes property was set to true.
Displaying RSS Feeds with XmlReader and XmlWriter
Now that you've seen a few of the new features associated with the XmlReader and XmlWriter classes in .NET version 2, let's finish up with an example of using these classes together to parse and display data from an RSS feed. Although several different options exist for parsing RSS feeds in the .NET Framework (XmlDataSource, DataSet, XPathNavigator, and others I'll discuss in Part II of this series), using the XmlReader and XmlWriter classes together provides a highly efficient and scalable solution because both work with streams of data.
To display an RSS feed, the XML data must first be parsed; the XmlReader class performs this job quite well. Each <item> element in the feed is located with ReadToDescendant and ReadToNextSibling, and its subtree is handed to a helper method that writes the title and link values to an XmlWriter (see Figure 4).
public partial class ParsingRss : System.Web.UI.Page {
  protected void Page_Load(object sender, EventArgs e) {
    StringWriter sw = new StringWriter();
    XmlWriterSettings ws = new XmlWriterSettings();
    ws.Indent = true;
    ws.OmitXmlDeclaration = true;
    using (XmlWriter writer = XmlWriter.Create(sw, ws)) {
      writer.WriteStartElement("ul");
      string xmlPath = Server.MapPath("~/XML/MSDN.xml");
      using (XmlReader reader = XmlReader.Create(xmlPath)) {
        reader.ReadToDescendant("item");
        //Read each <item>
        do {
          ReadSubTree(reader.ReadSubtree(), writer);
        } while (reader.ReadToNextSibling("item"));
      }
      writer.WriteEndElement();
    }
    this.lblOutput.Text = sw.ToString();
  }

  //Read child nodes of <item>
  private void ReadSubTree(XmlReader subReader, XmlWriter writer) {
    subReader.Read();
    string link = null;
    string title = null;
    while (subReader.Read()) {
      if (subReader.Name == "title") {
        title = subReader.ReadElementString();
      }
      if (subReader.Name == "link") {
        link = subReader.ReadElementString();
      }
    }
    //Write out title and link node values to XmlWriter
    writer.WriteStartElement("li");
    writer.WriteStartElement("a");
    writer.WriteAttributeString("href", link);
    writer.WriteString(title);
    writer.WriteEndElement(); //close <a>
    writer.WriteEndElement(); //close <li>
    subReader.Close();
  }
}
Figure 4: Parsing RSS feeds using the XmlReader and XmlWriter classes provides a highly scalable solution that minimizes memory consumption.
Figure 5: The output generated by using the XmlReader and XmlWriter classes to parse and display an RSS document.
Conclusion
Version 2 of the .NET Framework offers many new features that simplify working with XML data. In this article you've seen some of the new features associated with the XmlReader and XmlWriter classes, as well as their helper classes. These classes provide the most efficient APIs available in the framework to parse and create XML data. In Part II of this article series I'll cover additional XML functionality built into .NET version 2, and demonstrate how, in some cases, you can consume and display XML data without writing a single line of C# or VB.NET code!
The sample code accompanying this article is available for download.
Dan Wahlin (Microsoft MVP for ASP.NET and XML Web services) is the president of Wahlin Consulting and founded the XML for ASP.NET Developers Web site (http://www.XMLforASP.NET), which focuses on using XML, ADO.NET, and Web services in Microsoft's .NET platform. He's also a corporate trainer and speaker, and teaches XML and .NET training courses around the US. Dan coauthored Professional Windows DNA (Wrox), ASP.NET: Tips, Tutorials and Code (SAMS), and ASP.NET 1.1 Insider Solutions, and authored XML for ASP.NET Developers (SAMS).