.NET XML Readers

XML plays a key role in the .NET Framework, both because XML is tightly integrated with ADO.NET and because many .NET classes support XML as a data format, in addition to the classes that you use to read, write, and navigate XML documents. The .NET Framework provides two distinct APIs to access data in XML documents. It provides classes that expose the content through the well-known XML Document Object Model (XMLDOM), and it also provides a new approach to XML documents, XML readers, that falls somewhere between XMLDOM and the Simple API for XML (SAX). (Of course, readers only take care of getting data out of an XML file. To save data to XML documents, you have XML writers.) XML readers and writers share the same access model, which resembles database cursors.

XML Classes


In the .NET Framework, XML classes are usually defined in the System.Xml namespace. .NET XML classes are built on key industry standards, such as Document Object Model (DOM) Level 2, XPath 1.0, Extensible Stylesheet Language Transformations (XSLT) 1.0, XML Schema Definition (XSD), and Simple Object Access Protocol (SOAP). However, XML classes also feature several improvements to the overall programming model that go beyond the specification of World Wide Web Consortium (W3C) standards.

In addition to providing two fairly distinct XML APIs for content access, .NET unifies the XMLDOM with the data access services that ADO.NET provides. As a result, you can switch from ADO.NET objects to XML strings and vice versa. XmlDataDocument is the XMLDOM class that bridges between XMLDOM classes and classes for relational access to data. XmlDataDocument holds a reference to a DataSet object that is built on the fly if the XML content lends itself to rendering as a tabular structure.

With this XML-to-data bridge, you can exploit a dual and interchangeable model for navigating through data and choose the one—relational or XPath-driven—that best suits your needs.
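The switch between ADO.NET objects and XML strings can be sketched as a round trip. This is a minimal illustration that uses the DataSet methods GetXml and ReadXml directly rather than XmlDataDocument; the class name DataSetXmlSketch and the Shop/Customer sample data are made up for the example.

```csharp
using System;
using System.Data;
using System.IO;

static class DataSetXmlSketch
{
    // Build a small in-memory DataSet (hypothetical sample data).
    public static DataSet BuildSample()
    {
        DataSet ds = new DataSet("Shop");
        DataTable customers = ds.Tables.Add("Customer");
        customers.Columns.Add("Id", typeof(int));
        customers.Columns.Add("Name", typeof(string));
        customers.Rows.Add(1, "Alice");
        customers.Rows.Add(2, "Bob");
        return ds;
    }

    // ADO.NET object -> XML string.
    public static string ToXml(DataSet ds)
    {
        return ds.GetXml();
    }

    // XML string -> ADO.NET object. No schema is supplied, so the
    // tables and columns are inferred from the shape of the XML.
    public static DataSet FromXml(string xml)
    {
        DataSet ds = new DataSet();
        ds.ReadXml(new StringReader(xml));
        return ds;
    }
}
```

Because the schema is inferred on the way back, the reloaded columns come back as strings; a real application would probably carry an XSD schema along with the data.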

Reading XML Data


With an XML reader, you access XML documents through classes derived from the abstract class XmlReader, which provides fast, noncached, forward-only access to XML sources. XML readers are stream-based, like SAX, and far more lightweight than DOM because they never build an in-memory tree of the document.

XML readers don't push data to callers through predefined interfaces but leave callers free to pull only the data they need. Callers decide where and how to read the data and how to store it.
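The pull model can be illustrated with a short sketch in which the caller drives the loop and keeps only the nodes it cares about. The class name PullSketch and the title element are assumptions made for the example, not part of any real document format.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class PullSketch
{
    // Returns the text of every <title> element and ignores all other nodes.
    public static List<string> ReadTitles(string xml)
    {
        List<string> titles = new List<string>();
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            // The caller decides when to advance and what to keep.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "title")
                {
                    // ReadString returns the element's text and leaves
                    // the reader on the closing </title> tag.
                    titles.Add(reader.ReadString());
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return titles;
    }
}
```

Nothing outside the title elements is ever stored, which is the practical difference from a push parser that hands every event to your handlers.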

The XmlReader class is the core of all XML reading you can do in .NET, including the reading that the XMLDOM classes perform on your behalf. (The .NET Framework does not ship a SAX parser of its own.) In other words, no matter which API you use to access XML data in .NET, behind it, an XML reader or an XML writer does much of the dirty work.
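One place where this relationship is visible in the public API is XmlDocument.Load, which has an overload that accepts an XmlReader. The sketch below feeds a reader to the DOM; the class name DomOverReaderSketch and the sample markup are invented for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

static class DomOverReaderSketch
{
    // Builds a DOM tree by letting XmlDocument pull nodes from a reader.
    public static XmlDocument LoadFromReader(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            XmlDocument doc = new XmlDocument();
            doc.Load(reader);
            return doc;
        }
        finally
        {
            reader.Close();
        }
    }
}
```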

Below is a simple console application that reads in an XML file:

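using System;
using System.Xml;

class MyXmlApp
{
  public static void Main(String[] args)
  {
    XmlTextReader xtr = null;
    try {
      String fileName = args[0];
      xtr = new XmlTextReader(fileName);
      // Open the stream and move to the first node
      xtr.Read();
      // Skip the XML declaration, comments, and whitespace, if any,
      // so that the reader sits on the root element
      xtr.MoveToContent();

      // Save the name now: ReadInnerXml moves the reader past the root
      String rootName = xtr.Name;
      Console.WriteLine("<{0}>", rootName);
      // Read the entire content of the node
      Console.WriteLine(xtr.ReadInnerXml());
      Console.WriteLine("</{0}>", rootName);
    }
    catch (Exception e) {
      Console.WriteLine("Error:\t{0}", e.Message);
    }
    finally {
      // Release the file even if reading fails
      if (xtr != null) xtr.Close();
    }
  }
}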

This application opens the XML file using the XmlTextReader class—a class derived from XmlReader—then moves the pointer to the root node:

XmlTextReader xtr = new XmlTextReader(fileName);
xtr.Read();

Next, the application outputs the root tag's name and writes down the entire XML text that forms the document—any markup text that falls between the root node's opening and the closing tags:

Console.WriteLine(xtr.ReadInnerXml());
xtr.Close();

XmlTextReader also provides several methods to jump from one node to another in much the same way you do within a cursor in a database recordset. In my next column, I'll examine these methods.
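As a small preview of that cursor-like movement, here is a hedged sketch that uses MoveToContent and Skip to list the root's direct children without descending into them. The class name NavigationSketch is hypothetical, and this is only one of several ways to walk a document.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class NavigationSketch
{
    // Collects the names of the root's direct child elements.
    public static List<string> ChildNames(string xml)
    {
        List<string> names = new List<string>();
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            reader.MoveToContent();          // jump to the root element
            if (reader.IsEmptyElement) return names;
            reader.Read();                   // step to the first node inside the root
            while (!reader.EOF && reader.NodeType != XmlNodeType.EndElement)
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    names.Add(reader.Name);
                    reader.Skip();           // jump over the child's whole subtree
                }
                else
                {
                    reader.Read();           // text, whitespace, comments
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return names;
    }
}
```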
