XML Transformations

Learn the Basics of XSLT

ControlFreak

LANGUAGES: VB.NET | C#

ASP.NET VERSIONS: 2.0+

 

 

By Steve C. Orr

 

The .NET Framework contains rich support for XML, including the ability to parse and create XML documents. The simplicity of XML certainly makes it useful for storing and transporting data, but its formatting is still too complex for the average user to be able to digest directly. Therefore, the data contained within an XML document should be transformed into a more attractive form before being fed to end users. The question then becomes: What is the best way to reformat XML data?

 

XML Data Formatting Techniques

The .NET Framework provides a variety of ways to transform the data contained within XML documents into visually attractive output. For example, the XmlDataSource control lets you treat an XML document much like a database table loaded into a DataSet, binding it directly to standard ASP.NET controls such as the GridView and Repeater.

 

The System.Xml namespace provides a variety of classes for reading and writing XML data in a slightly more manual fashion. The XmlDocument class can load an entire XML document into memory and manipulate it programmatically. However, large XML files may be parsed more efficiently with the XmlReader class, because it needn't load the entire file into potentially precious memory all at once.
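To make the memory trade-off concrete, here is a minimal C# sketch of the streaming approach. It assumes an RSS-style document in which each item element contains a title; the class and method names (FeedTitles, ReadItemTitles) and the sample feed are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class FeedTitles
{
    // Walks the XML forward-only and collects the <title> of every <item>.
    // No in-memory tree is built, so memory use stays roughly constant.
    public static List<string> ReadItemTitles(TextReader source)
    {
        var titles = new List<string>();
        using (XmlReader reader = XmlReader.Create(source))
        {
            while (reader.ReadToFollowing("item"))
            {
                // Restrict the search to the current <item> so that a
                // <title> elsewhere in the document is never picked up.
                using (XmlReader item = reader.ReadSubtree())
                {
                    if (item.ReadToFollowing("title"))
                    {
                        titles.Add(item.ReadElementContentAsString());
                    }
                }
            }
        }
        return titles;
    }

    public static void Main()
    {
        // Hypothetical sample feed, shaped like a small RSS file.
        const string xml =
            "<rss version=\"2.0\"><channel><title>Sample Feed</title>" +
            "<item><title>Threat Analysis</title></item>" +
            "<item><title>Process Explorer</title></item>" +
            "</channel></rss>";

        foreach (string title in ReadItemTitles(new StringReader(xml)))
        {
            Console.WriteLine(title);
        }
    }
}
```

For a small feed like this one the difference is negligible, and XmlDocument or XPathDocument is usually more convenient; the streaming style starts to pay off when the file is too large to hold comfortably in memory.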

 

All of these techniques can be good choices under the right circumstances, but there is another useful solution to consider, even though it doesn't necessarily require .NET at all. The open standard Extensible Stylesheet Language Transformations (XSLT) is often the fastest and most efficient way to transform XML data into a visually attractive, readable format. Most often the XML is transformed into HTML, although XSLT can transform XML data into almost any imaginable text-based format.

 

The XML Web Control

The XML Web control provided by ASP.NET is one of the simplest tools available to transform XML data into HTML. As you can see by the example declaration here, it has only two unique properties:

 

<asp:Xml ID="Xml1" runat="server"
 DocumentSource="data.xml"
 TransformSource="formatting.xslt">
</asp:Xml>

 

The DocumentSource property is used to specify the local XML file that contains the data. The TransformSource property requires a reference to an XSLT file that describes how to display the XML data. Remote XML data files can also be retrieved and manipulated with a bit of code, as shown in Figure 1.

 

'Imports System.Xml.XPath
Dim loc As String = "http://rss.news.yahoo.com/rss/topstories"
Dim xp As XPathDocument = New XPathDocument(loc)
Dim xpn As XPathNavigator = xp.CreateNavigator()
Xml1.XPathNavigator = xpn
Xml1.TransformSource = "~/format.xslt"

Figure 1A: The XML Web control can be configured at run time to handle XSL transformations (VB.NET).

 

//using System.Xml.XPath
string loc = "http://rss.news.yahoo.com/rss/topstories";
XPathDocument xp = new XPathDocument(loc);
XPathNavigator xpn = xp.CreateNavigator();
Xml1.XPathNavigator = xpn;
Xml1.TransformSource = "~/format.xslt";

Figure 1B: The XML Web control can be configured at run time to handle XSL transformations (C#).

 

The code in Figure 1 first loads into an XPathDocument object an external XML document (in the form of an RSS news feed). From this object, an XPathNavigator object is created. The XPathNavigator object is then assigned as the XML data source for the XML Web control. Finally, an XSLT document is referenced, which contains instructions for manipulating the XML data. The formatted HTML will be rendered into the parent ASPX Web page.

 

Exploring RSS

Really Simple Syndication (RSS) is an XML-based format used for general publishing. Figure 2 lists a basic RSS file that is used for the examples in this article.

 

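<rss version="2.0">
<channel>
 <title>The Art of Coding by Steve C. Orr</title>
 <link>http://SteveOrr.net/</link>
 <description>Free stuff</description>
 <language>en-US</language>
 <webMaster>[email protected]</webMaster>
 <item>
   <title>Threat Analysis</title>
   <description>Secure Your Apps</description>
   <author>[email protected]</author>
   <category>Articles</category>
   <pubDate>Wed, 30 Jul 2008 15:46:43 GMT</pubDate>
   <link>
     http://SteveOrr.net/articles/Threat-Analysis.aspx
   </link>
 </item>
 <item>
   <title>Process Explorer</title>
   <description>Free Useful Utility</description>
   <author>[email protected]</author>
   <category>Articles</category>
   <pubDate>Wed, 30 Jul 2008 15:46:40 GMT</pubDate>
   <link>
     http://SteveOrr.net/articles/Process-Explorer.aspx
   </link>
 </item>
</channel>
</rss>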

Figure 2: RSS files like this utilize XML to publish news feeds.

 

An RSS 2.0 file wraps its content in a single channel element, which carries metadata about the feed such as its title, link, and description. Each content item is then defined by an item element, which may contain many optional sub-elements that further describe the content item.
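The channel and item structure maps naturally onto XPath queries. The following C# sketch uses the same XPathDocument and XPathNavigator classes as Figure 1 to walk a feed; the FeedOutline class, its Describe method, and the sample feed are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.XPath;

public static class FeedOutline
{
    // Returns the channel title followed by one line per item,
    // each showing the item's title and link.
    public static string Describe(string rssXml)
    {
        var doc = new XPathDocument(new StringReader(rssXml));
        XPathNavigator nav = doc.CreateNavigator();

        var sb = new StringBuilder();
        // string(...) yields an empty string if the element is missing.
        sb.Append((string)nav.Evaluate("string(/rss/channel/title)"));
        sb.Append('\n');

        XPathNodeIterator items = nav.Select("/rss/channel/item");
        while (items.MoveNext())
        {
            // Expressions evaluated on the current item are relative to it.
            XPathNavigator item = items.Current;
            string title = (string)item.Evaluate("string(title)");
            // normalize-space trims whitespace around the URL.
            string link = (string)item.Evaluate("normalize-space(link)");
            sb.Append("- ").Append(title).Append(" -> ").Append(link);
            sb.Append('\n');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Hypothetical sample feed for illustration only.
        const string xml =
            "<rss version=\"2.0\"><channel><title>Sample Feed</title>" +
            "<item><title>First</title>" +
            "<link> http://example.com/first </link></item>" +
            "<item><title>Second</title>" +
            "<link>http://example.com/second</link></item>" +
            "</channel></rss>";

        Console.Write(Describe(xml));
    }
}
```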

 

See the RSS Specification at http://validator.w3.org/feed/docs/rss2.html to learn more about the RSS format. To learn more about using RSS from ASP.NET, see Feed RSS to Your Users.

 

Exploring XSLT

By now you might be wondering what the instructions contained within an XSLT file look like. XSLT is essentially a light XML-based programming language. This language is used to transform XML data into other formats, such as HTML. Figure 3 lists the XSLT used to manipulate the XML data from Figure 2; the result is shown in Figure 4.

 

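<xsl:stylesheet
    version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:template match="rss">
     <html>
       <head>
         <title><xsl:value-of select="channel/title" /></title>
         <link rel="stylesheet"
             type="text/css" href="rss.css" />
       </head>
       <body>
         <xsl:apply-templates select="channel/title" />
         <xsl:apply-templates select="channel/item" />
       </body>
     </html>
   </xsl:template>

   <xsl:template match="channel/title">
     <div>
       <a href="{../link}"><xsl:value-of select="." /></a>
     </div>
   </xsl:template>

   <xsl:template match="channel/item">
     <div>
       <a href="{link}"><xsl:value-of select="title" /></a>
     </div>
     <div>
       <xsl:value-of select="category" /> -
       <xsl:value-of select="pubDate" />
     </div>
     <div>
       <xsl:value-of select="description" />
     </div>
   </xsl:template>

</xsl:stylesheet>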

Figure 3: This XSLT file contains instructions for transforming an RSS feed into the HTML output shown in Figure 4.

 

The first couple of lines are boilerplate XML, defining this content as an XSLT file. The next section defines a template to be applied to the top level of the source RSS/XML document. It starts by outputting standard HTML sections such as the header and title. The title value that gets applied to the resulting HTML document is extracted from the RSS channel's title element. A Cascading Style Sheet (CSS) file is referenced to apply additional formatting to the resulting HTML document. The body of the HTML document is then defined by two additional XSL template sections that are referenced in place, but defined further down.

 


Figure 4: The XSLT in Figure 3 transforms the data from Figure 2 into this HTML output.

 

The second template section (channel/title) displays the channel's title as a hyperlink. This content is displayed inside a standard HTML div tag.

 

The third and final template section (channel/item) is repeated for each article item listed in the source RSS/XML file. Once again, data is extracted from the source XML and inserted into the defined bits of HTML before it all gets rendered as a single HTML document. The article's title is displayed as a hyperlink. The hyperlink's address is extracted from the item's link element, and the hyperlink's text is extracted from its title element. The category and publication date are extracted and displayed in the following HTML div tag. The final div tag is filled with the content item's description.

 

Because (in this case) the entire HTML document is defined by the XSLT file, the parent ASPX page doesn't need any HTML at all. In this example the ASPX file is completely blank other than the declaration of the XML Web control that's used to execute the server-side XSL transformation:

<asp:Xml ID="Xml1" runat="server" />

The code contained in the code-behind file (listed in Figure 1) sets the control's XML and XSLT files. Alternatively, in most cases you could ditch the code completely and instead declare these files within the XML control's ASPX declaration:

<asp:Xml ID="Xml1" runat="server"
 DocumentSource="data.xml"
 TransformSource="formatting.xslt">
</asp:Xml>

 

XSLT also offers other standard programming constructs that can be useful from time to time, such as conditional tests (xsl:if) and loops (xsl:for-each). For syntax details, visit http://www.w3schools.com/xsl/.
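As a rough illustration of those two constructs, the C# sketch below embeds a small stylesheet that loops over every item with xsl:for-each and uses xsl:if to keep only items whose category is Articles. The stylesheet, the ConditionalTransform class, and the sample feed (including its News item) are assumptions made for this example; they are not the files used elsewhere in this article.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public static class ConditionalTransform
{
    // Illustrative stylesheet: lists the titles of items in the
    // "Articles" category as an HTML bulleted list.
    const string Stylesheet = @"
<xsl:stylesheet version=""1.0""
    xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""html"" indent=""no"" />
  <xsl:template match=""/"">
    <ul>
      <xsl:for-each select=""rss/channel/item"">
        <xsl:if test=""category = 'Articles'"">
          <li><xsl:value-of select=""title"" /></li>
        </xsl:if>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>";

    public static string Run(string rssXml)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader sheet =
            XmlReader.Create(new StringReader(Stylesheet.Trim())))
        {
            xslt.Load(sheet);
        }

        var input = new XPathDocument(new StringReader(rssXml));
        using (var output = new StringWriter())
        {
            // No stylesheet parameters are needed, so pass null.
            xslt.Transform(input, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        // Hypothetical feed; the "News" item should be filtered out.
        const string xml =
            "<rss version=\"2.0\"><channel><title>Sample Feed</title>" +
            "<item><title>Threat Analysis</title>" +
            "<category>Articles</category></item>" +
            "<item><title>Site Update</title>" +
            "<category>News</category></item>" +
            "<item><title>Process Explorer</title>" +
            "<category>Articles</category></item>" +
            "</channel></rss>";

        Console.WriteLine(Run(xml));
    }
}
```

In many stylesheets a separate template with a predicate, such as match="item[category='Articles']", does the same job more declaratively; the explicit loop and test are shown here only because they mirror familiar procedural code.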

 

Manual Transformation

While the XML Web control can be a nice time saver, it's not strictly necessary. With a little extra code, you can execute XSL transformations manually with standard .NET classes. The syntax may be a bit different, but the process is essentially the same. To begin, both the source XML data document and the XSLT document must be loaded. This is shown by the first two code blocks listed in Figure 5.

 

'Imports System.IO, System.Text, System.Xml
'Imports System.Xml.XPath, System.Xml.Xsl

'Load the XML data document
Dim doc As XPathDocument = _
   New XPathDocument(Server.MapPath("data.xml"))

'Load the XSLT file
Dim sXslFile As String = _
    Server.MapPath("formatter.xslt")
Dim xslt As XslCompiledTransform = _
   New XslCompiledTransform()
xslt.Load(sXslFile)

'Do the transformation
Dim ms As MemoryStream = New MemoryStream()
Dim writer As XmlTextWriter = _
   New XmlTextWriter(ms, Encoding.ASCII)
Dim rd As StreamReader = New StreamReader(ms)
xslt.Transform(doc, writer)
'Push any buffered output into the stream before reading it back
writer.Flush()
ms.Position = 0
Dim sHtml As String
sHtml = rd.ReadToEnd()
Response.Write(sHtml)

'cleanup
rd.Close()
ms.Close()
Response.End()

Figure 5A: With a little extra coding, XSL transformations can be done manually without the help of the XML Web control (VB.NET).

 

//using System.IO; using System.Text; using System.Xml;
//using System.Xml.XPath; using System.Xml.Xsl;

//Load the XML data document
XPathDocument doc = new
      XPathDocument(Server.MapPath("data.xml"));

//Load the XSLT file
string sXslFile = Server.MapPath("formatter.xslt");
XslCompiledTransform xslt = new XslCompiledTransform();
xslt.Load(sXslFile);

//Do the transformation
MemoryStream ms = new MemoryStream();
XmlTextWriter writer =
      new XmlTextWriter(ms, Encoding.ASCII);
StreamReader rd = new StreamReader(ms);
xslt.Transform(doc, writer);
//Push any buffered output into the stream before reading it back
writer.Flush();
ms.Position = 0;
string sHtml = rd.ReadToEnd();
Response.Write(sHtml);

//cleanup
rd.Close();
ms.Close();
Response.End();

Figure 5B: With a little extra coding, XSL transformations can be done manually without the help of the XML Web control (C#).

 

Then the transformation must be executed manually, because the XML Web control is no longer doing it. The Transform method of the XslCompiledTransform class does most of the heavy lifting in this example, streaming the resulting HTML into an XmlTextWriter object. This XML stream is then rendered to the page by a call to the Response.Write method.
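One variation worth knowing about is letting the stylesheet itself decide how the output is written. The sketch below is a console-style example under stated assumptions, not the code from Figure 5: it creates the writer from the transform's OutputSettings property, so that an xsl:output element in the stylesheet (method, encoding, and so on) is honored. The OutputSettingsDemo class, its Transform method, and the inline XML and XSLT are all hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

public static class OutputSettingsDemo
{
    public static string Transform(string xml, string stylesheet)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader sheet =
            XmlReader.Create(new StringReader(stylesheet)))
        {
            xslt.Load(sheet);
        }

        var doc = new XPathDocument(new StringReader(xml));

        byte[] bytes;
        using (var ms = new MemoryStream())
        {
            // OutputSettings reflects the stylesheet's xsl:output element.
            using (XmlWriter writer =
                XmlWriter.Create(ms, xslt.OutputSettings))
            {
                xslt.Transform(doc, writer);
            } // disposing the writer flushes it into the stream

            bytes = ms.ToArray();
        }

        // Decode the bytes, letting StreamReader detect any byte order mark.
        using (var reader = new StreamReader(new MemoryStream(bytes), true))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Hypothetical input and stylesheet for illustration only.
        const string xml = "<greeting>Hello</greeting>";
        const string sheet =
            "<xsl:stylesheet version=\"1.0\" " +
            "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">" +
            "<xsl:output method=\"html\" encoding=\"utf-8\" />" +
            "<xsl:template match=\"/\">" +
            "<p><xsl:value-of select=\"greeting\" /></p>" +
            "</xsl:template></xsl:stylesheet>";

        Console.WriteLine(Transform(xml, sheet));
    }
}
```

Inside an ASP.NET page the intermediate buffer may not be needed at all, since XslCompiledTransform also has a Transform overload that accepts a TextWriter such as Response.Output.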

 

Conclusion

As Service Oriented Architectures continue to take over the software development world, XML is becoming steadily more important. The ubiquity of XML makes it a worthy investment of your development research time. By mastering the art of XML manipulation, you'll be able to work productively on a seemingly infinite number of software systems that use it at their core.

 

The separation of data and presentation layers is a design pattern that shows no sign of diminishing popularity. XSL Transformations are often the most efficient way to manipulate XML data into a more user-friendly format. The syntax of XSLT is not particularly complicated, especially if you're already comfortable working with XML. For those who wish to learn more about the subjects discussed in this article, check out these useful resources:

 

Sample code accompanying this article is available for download.

 

Steve C. Orr is an ASPInsider, MCSD, Certified ScrumMaster, Microsoft MVP in ASP.NET, and author of Beginning ASP.NET 2.0 AJAX (Wrox). He's been developing software solutions for leading companies in the Seattle area for more than a decade. When he's not busy designing software systems or writing about them, he often can be found loitering at local user groups and habitually lurking in the ASP.NET newsgroup. Find out more about him at http://SteveOrr.net or e-mail him at mailto:[email protected].

 

Why Bother?

In a world where Service Oriented Architectures (SOA) are beginning to dominate software design, the separation of data from the display of that data has never been more important. This kind of separation is a popular and well established design pattern. For example, ASP.NET code-behind files are a common way to separate the format of a Web page from the logic that retrieves and manipulates its underlying data. Similarly, Cascading Style Sheets (CSS) are yet another example of separating data from its formatting.

 

This philosophy allows data experts to experiment with the most efficient ways to retrieve data without adversely affecting the efforts of their user interface design experts. Because these development tasks are defined in separate files, they can be individually edited by varying developers or departments.

 

Additionally, this separation allows many user interfaces to potentially work on top of the same data. For example, an application could easily allow users to select their own individual color schemes, or even radically different layouts. Alternatively, many different applications could operate on the same data using different front ends. For example, users could potentially have the choice between an HTML front end, a Silverlight front end, or a Windows Presentation Foundation (WPF) front end. These applications could all live in harmony, or maybe only the most popular choices manage to retain long-term development resources. In any case, the result is happy users with plenty of options from which to choose.

 

 

 
