ControlFreak
LANGUAGES: VB.NET | C#
ASP.NET VERSIONS: 2.0+
XML Transformations
Learn the Basics of XSLT
By Steve C. Orr
The .NET Framework contains rich support for XML, including the ability to parse and create XML documents. The simplicity of XML certainly makes it useful for storing and transporting data, but its formatting is still too complex for the average user to be able to digest directly. Therefore, the data contained within an XML document should be transformed into a more attractive form before being fed to end users. The question then becomes: What is the best way to reformat XML data?
XML Data Formatting Techniques
The .NET Framework provides a variety of ways to transform the data contained within XML documents into visually attractive output. For example, the XmlDataSource control allows an XML document to be used much like a database table loaded into a DataSet, and to be bound directly to standard ASP.NET controls such as the GridView and Repeater.
The System.Xml namespace provides a variety of classes that can be used for reading and writing XML data in a slightly more manual fashion. The XmlDocument object can be used to load an entire XML document into memory and manipulate it programmatically. However, large XML files may be more efficiently parsed with the XmlReader class because it needn't load the entire file into potentially precious memory all at once.
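As a rough illustration of the streaming approach, the following sketch walks an XML source with XmlReader and collects the text of every title element without building an in-memory tree. The class name RssTitleReader and the choice of the title element are assumptions made for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the .NET Framework.
public static class RssTitleReader
{
    // Streams through the XML one node at a time and returns the text
    // of each <title> element, in document order.
    public static List<string> ReadTitles(TextReader source)
    {
        List<string> titles = new List<string>();
        using (XmlReader reader = XmlReader.Create(source))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "title")
                {
                    // Reads the text and moves past </title>, so the loop
                    // must not call Read() again on this pass.
                    // Assumes <title> holds text only (no child elements).
                    titles.Add(reader.ReadElementContentAsString());
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return titles;
    }
}
```

Passing a StreamReader opened on a large file keeps memory use roughly constant, whereas XmlDocument.Load would hold the whole tree in memory.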
All of these techniques can be good choices under the right circumstances, but there is another useful solution that should be considered even though it doesn't necessarily require .NET at all. The open standard eXtensible Stylesheet Language Transformations (XSLT) is often the fastest and most efficient way to transform XML data into a visually attractive reading format. Most often the XML is transformed into HTML, although XSLT can be used to transform XML data into almost any imaginable text-based format.
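To make the "almost any text-based format" point concrete, here is a minimal sketch that turns RSS items into comma-separated lines rather than HTML. The stylesheet, the class name TextOutputSketch, and the title,link layout are all assumptions made for illustration; the .NET calls used (XslCompiledTransform.Load and Transform, XPathDocument) are the standard ones.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

// Hypothetical example class.
public static class TextOutputSketch
{
    // Illustrative stylesheet: method="text" suppresses markup, so the
    // result is one "title,link" line per RSS item.
    public const string Stylesheet =
@"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""text"" />
  <xsl:template match=""/"">
    <xsl:for-each select=""rss/channel/item"">
      <xsl:value-of select=""title"" />
      <xsl:text>,</xsl:text>
      <xsl:value-of select=""link"" />
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>";

    // Applies the stylesheet to an RSS document held in a string.
    // Note: values are not CSV-escaped; a title containing a comma
    // would need quoting in real use.
    public static string ToCsv(string rssXml)
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(styleReader);
        }

        XPathDocument doc = new XPathDocument(new StringReader(rssXml));
        using (StringWriter output = new StringWriter())
        {
            xslt.Transform(doc, null, output);
            return output.ToString();
        }
    }
}
```

Only the stylesheet decides the output format here; the same C# plumbing would serve for HTML or any other text layout.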
The XML Web Control
The XML Web control provided by ASP.NET is one of the simplest tools available to transform XML data into HTML. As you can see by the example declaration here, it has only two unique properties:
<asp:Xml ID="Xml1" runat="server"
    DocumentSource="data.xml"
    TransformSource="formatting.xslt">
</asp:Xml>
The DocumentSource property is used to specify the local XML file that contains the data. The TransformSource property requires a reference to an XSLT file that describes how to display the XML data. Remote XML data files can also be retrieved and manipulated with a bit of code, as shown in Figure 1.
'Imports System.Xml.XPath
Dim loc As String = "http://rss.news.yahoo.com/rss/topstories"
Dim xp As XPathDocument = New XPathDocument(loc)
Dim xpn As XPathNavigator = xp.CreateNavigator()
Xml1.XPathNavigator = xpn
Xml1.TransformSource = "~/format.xslt"
Figure 1A: The XML Web control can be configured at run time to handle XSL transformations (VB.NET).
//using System.Xml.XPath
string loc = "http://rss.news.yahoo.com/rss/topstories";
XPathDocument xp = new XPathDocument(loc);
XPathNavigator xpn = xp.CreateNavigator();
Xml1.XPathNavigator = xpn;
Xml1.TransformSource = "~/format.xslt";
Figure 1B: The XML Web control can be configured at run time to handle XSL transformations (C#).
The code in Figure 1 first loads an external XML document (in the form of an RSS news feed) into an XPathDocument object. From this object, an XPathNavigator object is created. The XPathNavigator object is then assigned as the XML data source for the XML Web control. Finally, an XSLT document is referenced, which contains instructions for manipulating the XML data. The formatted HTML will be rendered into the parent ASPX Web page.
Really Simple Syndication (RSS) is an XML-based format used for general publishing. Figure 2 lists a basic RSS file that is used for the examples in this article.
Exploring RSS
http://SteveOrr.net/
http://SteveOrr.net/articles/Threat-Analysis.aspx
http://SteveOrr.net/articles/Process-Explorer.aspx
Figure 2: RSS files like this utilize XML to publish news feeds.
An RSS file can optionally define multiple channels to group related content. Each content item is then defined by Item elements, which may contain many optional sub-elements that further describe the content item.
See the RSS Specification at http://validator.w3.org/feed/docs/rss2.html to learn more about the RSS format. To learn more about using RSS from ASP.NET, see Feed RSS to Your Users.
Exploring XSLT
By now you might be wondering what the instructions contained within an XSLT file look like. XSLT is essentially a light XML-based programming language. This language is used to transform XML data into other formats, such as HTML. Figure 3 lists the XSLT used to manipulate the XML data from Figure 2; the result is shown in Figure 4.
version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
type="text/css" href="rss.css" />
Figure 3: This XSLT file contains instructions for transforming an RSS feed into the HTML output shown in Figure 4.
The first couple of lines are boilerplate XML, defining this content as an XSLT file. The next section defines a template to be applied to the top level of the source RSS/XML document. It starts by outputting standard HTML sections such as the header and title. The title value that gets applied to the resulting HTML document is extracted from the RSS channel's title element. A Cascading Style Sheet (CSS) file is referenced to apply additional formatting beauty to the resulting HTML document. The body of the HTML document is then defined by two additional XSL template sections that are referenced in place, but defined further down.
Figure 4: The XSLT in Figure 3 transforms the data from Figure 2 into this HTML output.
The second template section (channel/title) displays the channel's title as a hyperlink. This content is displayed inside a standard HTML div tag.
The third and final template section (channel/item) is repeated for each article item listed in the source RSS/XML file. Once again, data is extracted from the source XML data and inserted into the defined bits of HTML before it all gets rendered as a single HTML document. The article's title is displayed as a hyperlink. The hyperlink's address is extracted from the link elements of the RSS file, and the hyperlink's text is extracted from the title elements. The category and publication date are extracted and displayed in the following HTML div tag. The final div tag is filled with the content item's description.
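The three template sections described above can be sketched as follows. This is a reconstruction from the prose description, not the original Figure 3 listing: the CSS class names, the sample feed, and the class name RssToHtmlSketch are assumptions, and the stylesheet is embedded in a C# string only so the example is self-contained.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

// Hypothetical example class.
public static class RssToHtmlSketch
{
    // Made-up feed used only to exercise the stylesheet.
    public const string SampleFeed =
@"<rss version=""2.0""><channel>
  <title>Sample Feed</title>
  <link>http://example.com/</link>
  <item>
    <title>First article</title>
    <link>http://example.com/articles/one.aspx</link>
    <category>News</category>
    <pubDate>Mon, 01 Jan 2007 00:00:00 GMT</pubDate>
    <description>Summary of the first article.</description>
  </item>
</channel></rss>";

    public const string Stylesheet =
@"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""html"" />
  <!-- Top level: page skeleton, title taken from the channel -->
  <xsl:template match=""/"">
    <html>
      <head>
        <title><xsl:value-of select=""rss/channel/title"" /></title>
        <link rel=""stylesheet"" type=""text/css"" href=""rss.css"" />
      </head>
      <body>
        <xsl:apply-templates select=""rss/channel/title"" />
        <xsl:apply-templates select=""rss/channel/item"" />
      </body>
    </html>
  </xsl:template>
  <!-- Channel title rendered as a hyperlink inside a div -->
  <xsl:template match=""channel/title"">
    <div class=""channelTitle"">
      <a href=""{../link}""><xsl:value-of select=""."" /></a>
    </div>
  </xsl:template>
  <!-- Applied once per item: title link, category and date, description -->
  <xsl:template match=""channel/item"">
    <div class=""itemTitle"">
      <a href=""{link}""><xsl:value-of select=""title"" /></a>
    </div>
    <div class=""itemInfo"">
      <xsl:value-of select=""category"" />
      <xsl:text> | </xsl:text>
      <xsl:value-of select=""pubDate"" />
    </div>
    <div class=""itemDescription"">
      <xsl:value-of select=""description"" />
    </div>
  </xsl:template>
</xsl:stylesheet>";

    // Runs the stylesheet against an RSS document and returns the HTML.
    public static string Transform(string rssXml)
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(styleReader);
        }

        XPathDocument doc = new XPathDocument(new StringReader(rssXml));
        using (StringWriter html = new StringWriter())
        {
            xslt.Transform(doc, null, html);
            return html.ToString();
        }
    }
}
```

Calling RssToHtmlSketch.Transform(RssToHtmlSketch.SampleFeed) returns a complete HTML page. The curly braces in href="{link}" are XSLT attribute value templates, which is how a value from the source XML ends up inside an HTML attribute.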
Because (in this case) the entire HTML document is defined by the XSLT file, the parent ASPX page doesn't need any HTML at all. In this example the ASPX file is completely blank other than the declaration of the XML Web control that's used to execute the server-side XSL transformation:
<asp:Xml ID="Xml1" runat="server" />
The code contained in the code-behind file (listed in Figure 1) sets the control's XML and XSLT files. Alternatively, in most cases you could ditch the code completely and instead declare these files with the XML control's ASPX declaration:
<asp:Xml ID="Xml1" runat="server"
    DocumentSource="data.xml"
    TransformSource="formatting.xslt">
</asp:Xml>
The XSLT language has several other standard programming constructs that can be useful from time to time, as well, such as If statements (xsl:if) and For..Each iterations (xsl:for-each). For syntax details, visit http://www.w3schools.com/xsl/.
Manual Transformation
While the XML Web control can be a nice time saver, it's not strictly necessary. With a little extra code, you can execute XSL transformations manually with standard .NET classes. The syntax may be a bit different, but the process is essentially the same. To begin, the source XML data document must be loaded, and the XSLT document must be loaded. This is shown by the first two code blocks listed in Figure 5.
'Imports System.IO, System.Text, System.Xml, System.Xml.XPath, System.Xml.Xsl
'Load the XML data document
Dim doc As XPathDocument = _
    New XPathDocument(Server.MapPath("data.xml"))
'Load the XSLT file
Dim sXslFile As String = _
    Server.MapPath("formatter.xslt")
Dim xslt As XslCompiledTransform = _
    New XslCompiledTransform()
xslt.Load(sXslFile)
'Do the transformation
Dim ms As MemoryStream = New MemoryStream()
Dim writer As XmlTextWriter = _
    New XmlTextWriter(ms, Encoding.ASCII)
Dim rd As StreamReader = New StreamReader(ms)
xslt.Transform(doc, writer)
writer.Flush() 'make sure buffered output reaches the stream
ms.Position = 0
Dim sHtml As String
sHtml = rd.ReadToEnd()
Response.Write(sHtml)
'cleanup
rd.Close()
ms.Close()
Response.End()
Figure 5A: With a little extra coding, XSL transformations can be done manually without the help of the XML Web control (VB.NET).
//using System.IO; System.Text; System.Xml; System.Xml.XPath; System.Xml.Xsl
//Load the XML data document
XPathDocument doc = new XPathDocument(Server.MapPath("data.xml"));
//Load the XSLT file
string sXslFile = Server.MapPath("formatter.xslt");
XslCompiledTransform xslt = new XslCompiledTransform();
xslt.Load(sXslFile);
//Do the transformation
MemoryStream ms = new MemoryStream();
XmlTextWriter writer = new XmlTextWriter(ms, Encoding.ASCII);
StreamReader rd = new StreamReader(ms);
xslt.Transform(doc, writer);
writer.Flush(); //make sure buffered output reaches the stream
ms.Position = 0;
string sHtml = rd.ReadToEnd();
Response.Write(sHtml);
//cleanup
rd.Close();
ms.Close();
Response.End();
Figure 5B: With a little extra coding, XSL transformations can be done manually without the help of the XML Web control (C#).
Then the transformation must be executed manually, because the XML Web control is no longer doing it. The Transform method of the XslCompiledTransform class does most of the heavy lifting in this example, streaming the resulting HTML into an XmlTextWriter object. This XML stream is then rendered to the page by a call to the Response.Write method.
Conclusion
As Service Oriented Architectures continue to take over the software development world, XML is becoming continually more important. The ubiquity of XML makes it a worthy investment of your development research time. By mastering the art of XML manipulation, you'll be able to work productively on a seemingly infinite number of software systems that use it at their core.
The separation of data and presentation layers is a design pattern that shows no sign of diminishing popularity. XSL Transformations are often the most efficient way to manipulate XML data into a more user-friendly format. The syntax of XSLT is not particularly complicated, especially if you're already comfortable working with XML.
For those who wish to learn more about the subjects discussed in this article, check out these useful resources:
Sample code accompanying this article is available for download.
Steve C. Orr is an ASPInsider, MCSD, Certified ScrumMaster, Microsoft MVP in ASP.NET, and author of Beginning ASP.NET 2.0 AJAX (Wrox). He's been developing software solutions for leading companies in the Seattle area for more than a decade. When he's not busy designing software systems or writing about them, he often can be found loitering at local user groups and habitually lurking in the ASP.NET newsgroup. Find out more about him at http://SteveOrr.net or e-mail him at mailto:[email protected].
Why Bother?
In a world where Service Oriented Architectures (SOA) are beginning to dominate software design, the separation of data from the display of that data has never been more important. This kind of separation is a popular and well established design pattern. For example, ASP.NET code-behind files are a common way to separate the format of a Web page from the logic that retrieves and manipulates its underlying data. Similarly, Cascading Style Sheets (CSS) are yet another example of separating data from its formatting.
This philosophy allows data experts to experiment with the most efficient ways to retrieve data without adversely affecting the efforts of their user interface design experts. Because these development tasks are defined in separate files, they can be individually edited by varying developers or departments.
Additionally, this separation allows many user interfaces to potentially work on top of the same data. For example, an application could easily allow users to select their own individual color schemes, or even radically different layouts. Alternatively, many different applications could operate on the same data using different front ends. For example, users could potentially have the choice between an HTML front end, a Silverlight front end, or a Windows Presentation Foundation (WPF) front end. These applications could all live in harmony, or maybe only the most popular choices manage to retain long-term development resources. In any case, the result is happy users with plenty of options from which to choose.