Integrate SQL Server 2000 With XML

Learn how to combine the FOR XML T-SQL extension with ADO.NET to improve your Web server’s performance.





Integrate SQL Server 2000 With XML

Learn how to combine the FOR XML T-SQL extension with ADO.NET to improve your Web server's performance.


By Asli Bilgin


XML comprises a large chapter of the .NET Framework's data-access story. In fact, XML serves as the only persistence format for data using ADO.NET. Quite fittingly, SQL Server, as a .NET Enterprise Server, makes great strides with native support for querying and updating data using XML. Like almost all .NET technologies, it uses the Web as a fortifier, leveraging Internet standards. SQL Server supports the HTTP protocol for data transfer, XML for storing data, and XSD for data schemas.


Microsoft provides many ways to work with XML data using SQL Server. You could use satellite XML technologies such as XPath queries and XSL templates, but in this article I'll show you how to leverage FOR XML T-SQL extensions, in conjunction with the SELECT statement, to retrieve publishers and their titles from a sample pubs database. You'll use the FOR XML EXPLICIT mode to hand-craft your own custom XML schema, and you'll see how you can pass an XML stream to Visual Studio .NET using a stored procedure.


Additionally, I'll demonstrate how to draw on the classes within the System.Data and System.Xml namespaces to write and read XML from SQL Server. You also will learn how to stream XML data from SQL Server using the XmlTextReader class.


Leveraging cached data is a great way to improve your Web server's performance. In this article, you'll see how to push the XmlTextReader stream into an ADO.NET DataSet object. Because ADO.NET DataSets persist as XML, they make ideal caching mechanisms for the XML data retrieved from SQL Server.


Finally, you'll use ASP.NET Web Forms to display the cached XML file in master and detail DataGrid controls by binding to DataSet and DataView objects.


Retrieve XML Data

One way to retrieve XML data from SQL Server is to use the FOR XML extension, which you append to the SELECT statement. Traditionally, SQL Server returns relational data in a tabular format. When you append the FOR XML clause to the end of your SELECT statement, SQL Server returns data as XML. You then can take this XML stream and persist it as an XML document. Alternately, you can store the XML results in memory as an ADO.NET DataSet.


Why would you want to use the FOR XML extension? Representing your data hierarchically rather than relationally has many advantages. First, data packaging is cleaner. XML lets you display your data with hierarchy-friendly controls, such as TreeView, and it makes data transport easier thanks to its text-based nature with different platforms and firewalls. XML also enjoys wide industry acceptance within the technology and business sectors, making it a logical choice for interoperability.


In addition, XML not only works well with hierarchical controls, but it is the only persistence data format within ADO.NET, which enables DataSets and DataReaders to grab results quickly and readily with little or no conversion. Those of you familiar with classic ADO's shaped RecordSets will see that the FOR XML statement is a great replacement for classic ADO's SHAPE command, which uses a proprietary format for displaying parent and child data. Finally, XML is great for caching. Caching XML for your ASP.NET pages can improve the performance of your Web application significantly. You can combine XML data caches with ASP.NET's new page and fragment caching. For example, consider a situation where you are retrieving data from SQL Server to populate navigation bars. Most likely, that data is not going to change too much over time. You can use XML files to hold the data on the Web server rather than calling the database every time. Drop-down lists also are likely candidates for caching.


You append the FOR XML extension to a SELECT statement using this syntax:



                [ , XMLDATA ]

                [ , ELEMENTS ]

                [ , BINARY BASE64 ]


You can't use the FOR XML clause without specifying one of these modes: RAW, AUTO, or EXPLICIT. The latter three keywords - XMLDATA, ELEMENTS, and BINARY BASE64 - are optional. The XMLDATA keyword returns both schema and data as your XML results; ELEMENTS is useful for generating XML data that uses nested XML elements; and BINARY BASE64 lets you encode binary information, such as images, for text-based transport. Newcomers to XML often hold the common misconception that because XML is a text-based format, it is unable to package binary information such as Word documents. Using BASE64 encoding, however, you can provide the means to transport binary data.


Next, let's explore different queries that retrieve XML data using T-SQL extensions. We will use the Northwind and pubs database, which ship with SQL Server 2000.


RAW Mode

You can't specify element names using the RAW mode. Instead, you get XML data using generic tags. With tags, you have no control over the naming and you also can't control the structure. The tag forces the column names and values into attributes rather than making them child elements within the element.


Additionally, the RAW mode flattens your data. Because all your data rows are tagged with the tag, you can't take advantage of the hierarchical nature of XML. Consider this statement:


SELECT customers.customerid, customers.companyname,

orders.orderid, orders.orderdate

FROM customers, orders

WHERE customers.customerid = orders.customerid



The previous statement returns the data from the Northwind database, shown here:


orderid="10643" orderdate="1997-08-25T00:00:00"/>

orderid="10692" orderdate="1997-10-03T00:00:00"/>


As you can see, your customer and their order(s) don't share a parent-child relationship.


There's not much value in flattening data that's meant to be hierarchical. You can think of this data as a classic ADO RecordSet converted to XML using the adPersistXml option. It's relational, not hierarchical, XML data.


If you use the Query Analyzer to test your queries, be sure to display the results in text rather than a grid because the grid view truncates large result sets in an unintelligible fashion. Additionally, be sure to increase the characters displayed by the Query Analyzer: Select Tools > Options from the menu, navigate to the Results tab, and adjust the Maximum Characters Per Column setting to a larger value; I like to use a value of 3,000.



As its name indicates, the AUTO mode is intelligent enough to generate XML data automatically based on the structure of your SELECT statement and the joins you use. The AUTO mode, along with the ELEMENTS keyword, improves your XML's readability because it enables you to use nested elements. This is nice for simple queries, but if you want to get more complex, you will be confronted with a few limitations.


The AUTO mode doesn't enable you to have more than one child for an element. For example, you can't create a structure like this:





Instead, experiment with the AUTO mode using the following query. You will have to insert some data into the CustomerDemographics table first (or you can download and run the script that I have provided to do this):


SELECT customers.customerid, customers.companyname,

orders.orderid, orders.orderdate ,


FROM customers, orders, customerdemographics

WHERE customers.customerid = 'ALFKI'  AND

customers.customerid = orders.customerid AND

customers.customerid = customerdemographics.customertypeid



Here is the XML obtained when running the previous query:



  Alfreds Futterkiste









You can see by the results that SQL Server doesn't interpret the CustomerDemographics table as a child of the Customers table and instead tucks it under the Orders table. This happens because SQL Server relies on the order of your tables in the FROM clause. In this case, the CustomerDemographics table follows the Orders table, causing the nesting you see in the query.


Although the AUTO mode doesn't work correctly with more complex queries, its support for single child nesting is sufficient for now. Consider this statement:


SELECT customers.customerid, customers.companyname,

orders.orderid, orders.orderdate, [order details].productid

FROM customers, orders, [order details]

WHERE customers.customerid = 'ALFKI'  AND

customers.customerid = orders.customerid AND

orders.orderid = [order details].orderid



See Figure 1 for the results of running this query.



  Alfreds Futterkiste














Figure 1. Using the ELEMENTS keyword, the AUTO mode is useful for working with single-level element nesting.


Using the AUTO mode does have other caveats. Although you have some control over the structure, you can't combine attributes and elements for column data. You have two choices to circumvent this limitation: You can use all attributes (which is the default), which contains the column data in attributes as in the RAW mode; or you can use all elements using the ELEMENTS keyword. In addition, AUTO does not support aggregate functions and grouping within your SELECT statement.



Although EXPLICIT mode is the most complicated of the three modes, it is your best bet when producing an XML document for consumption by an ADO.NET DataSet. It employs a proprietary schema management model that's difficult to understand at first, but the extra pain comes with a reward because EXPLICIT gives you complete control over the layout of your final XML document.


Using VB .NET, create a new ASP.NET Web Application named EmployeeList. From the Server Explorer in VS .NET, drill down to the node that represents the SQL Server you wish to work with. Right-click on the Stored Procedures node under the Pubs Database node and select the New Stored Procedure option (alternately you can create the stored procedure using SQL Server Enterprise Manager). Replace the code in the code designer window with the code in Figure 2.





SELECT 1 AS Tag, NULL AS Parent,

NULL AS [PubEmp!1!RootNode!element],

NULL AS   [Publisher!2!PublishingHouse!element],

NULL AS [Publisher!2!PublisherID!element],

NULL AS [Employee!3!EmployeeID!element],

NULL AS [Employee!3!EmployeeName!element],

NULL AS [Employee!3!EmpPublisherID!element]



SELECT 2 AS Tag, 1 AS Parent,

NULL AS [PubEmp!1!RootNode!element],

p.pub_name  + ' (' + + ')' AS


p.pub_id AS [Publisher!2!PublisherID!element],

NULL AS [Employee!3!EmployeeID!element],

NULL AS [Employee!3!EmployeeName!element],

NULL AS [Employee!3!EmpPublisherID!element]

FROM publishers p



SELECT 3 AS Tag, 2 AS Parent,

NULL AS [PubEmp!1!RootNode!element],

p.pub_name  + ' (' + + ')' AS PublishingHouse,

p.pub_id AS PublisherID,

e.emp_id As EmployeeID,

e.fname + ' ' + e.lname AS EmployeeName ,

e.pub_id AS EmpPublisherID

FROM employee e, publishers p

WHERE e.pub_id = p.pub_id

ORDER BY [Publisher!2!PublishingHouse!element],



Figure 2. The FOR XML EXPLICIT clause is constructed as a series of SELECT statements, which should be designed carefully to ensure the XML data is returned appropriately.


Make sure you set the appropriate execute permissions on the GetEmployees stored procedure. (Note: You must have the Enterprise Architect version of VS .NET to create stored procedures. If you don't, you can use the SQL Server Query Analyzer to create your stored procedure. If you are using integrated Windows Authentication to connect to your database, you'll have to set a couple security settings. Be sure the MACHINE_NAME/ASPNET user has a valid login to your SQL Server database, where the MACHINE_NAME is the name of the machine on which you are running VS .NET.)


Although the code in Figure 2 looks intimidating, the results are quite basic: The query simply retrieves a list of employees and the publishing houses they work for.


The first SELECT statement sets up the root node, padding the columns with null values. The SELECT statement must have a Tag value that uniquely identifies the results of SELECT block. The Tag column must be a number, and it's generally easier to read if you use consecutive numbers. The Parent column is null, indicating that it is the root node.


The second SELECT statement sets up the Publishers node. It references the root using the Parent column, which references the Tag column of the parent result set. As you might guess, the third SELECT statement contains the employee information.


With the EXPLICIT mode, you can keep stringing on nodes to the XML document without being limited to the linear single parent-child structure you have with the AUTO mode. It's easy to get creative, using the Parent references to create multiple children nodes under a single parent.


Why the exclamation points? As I mentioned earlier, the EXPLICIT mode uses a proprietary schema-creation engine. One of its conventions is to use exclamation points as delimiters for creating column names. Here is the basic syntax:




Take a look at one of the column names from Figure 2:




The ElementName variable indicates the name of the element node in the final XML document. In this case, you have an Employee element. TagNumber references the SELECT block that retrieves the data for that value. It makes sense that you use the SELECT block tagged as 3 because this is the one that retrieves employee data. AttributeName indicates the name of the attribute (or subelement) for the column data. The fourth variable, element, serves as a directive. This indicates that you want your column data as subelements, which are named according to the value you specify in AttributeName. If you don't specify a directive, the column data would appear as attributes of the element you specified in the ElementName variable.


It would be nice if FOR XML enabled you to map your schema to an XSD file; after all, it is the W3C standard. Unfortunately, however, you must create the schema manually. Later in this article, you'll discover other ways to work with mapping schemas using SQL Server.


Work With ADO.NET and ASP.NET

There are advantages to using FOR XML to cache data on the Web server, and I'll show you how to bind this cache to a server-side ASP.NET control such as a DataGrid.


Continuing with the EmployeeList example, now add a button control to a Web form in the project. Change the button's Text property to Cache Data. This button will contain the code that generates the cached XML file. Then, from Server Explorer, drag the GetEmployees stored procedure and drop it onto the Web form. This creates the relevant SqlConnection and SqlCommand objects. Next, open the code-behind file for the Web form and add this line:


Imports System.Xml


Now, find the button's click event handler that uses the XmlTextReader object of the System.Xml library to read the stream of XML data from the stored procedure, and add this code:



Dim xrEmployees As XmlTextReader = _


Dim ds As New DataSet()



' RetrieveAppBinPath is a custom function to

' retrieve the local bin directory.

ds.WriteXml(RetrieveAppBinPath() & "CachedEmployees.xml")


This code uses the ExecuteXmlReader method to create an XmlTextReader object. The Read method pulls the XML data off the wire. You then load the data into a DataSet object. The DataSet has methods that enable you to persist the XML data to a file. Make sure to set permissions on the bin directory for your machine's ASP.NET user. Then execute the project.


The page should create a new XML file on the Web server. You can view this file from the Solution Explorer by selecting the Show All Files icon and drilling into the bin directory. Open the CachedEmployees.xml file and examine the contents. The structure of this data will help you better understand the schema-creation mechanism of the FOR XML EXPLICIT extension. You can toggle between the relational and hierarchical view of the data by clicking on the XML and Data tabs, respectively. Figure 3 shows you the Data view of the XML file. You always should specify the file path of the final application explicitly, otherwise your file will be stored in the default Windows System directory. In this example, I use the private method RetrieveAppBinPath() to get a handle on this project's application directory.


Figure 3. When you examine the resulting XML file in VS .NET, you can get a good sense of how the FOR XML EXPLICIT clause works.


You use the DataSet's ReadXml method to bind the master DataGrid control to the cache: Set the DataSource property of the DataGrid to the Publisher table.


Bind a second DataGrid to the Employee table using the DataNavigateUrlField to relate the two DataGrids together. Use the RowFilter property of a DataView object to capture the PublisherId from the query string. The DataView enables you to bind the second DataGrid to the employees belonging to the publishing house selected by the user. Figure 4 and Figure 5 show screenshots of the ASP.NET application generated by this article's downloadable code.


Figure 4. The master DataGrid is generated from a cached XML file.


Figure 5. The detail DataGrid is generated from a cached XML file.


The FOR XML extension is a great way to retrieve relational data quickly as an XML document. Unfortunately, this has some limitations. For example, you can't use it with a recursive schema. You wouldn't be able to use the employee table to show a hierarchy of employees and their managers because the ReportsTo column in the Employee table points back to itself.


But the lack of XSD support is the FOR XML extension's biggest limitation. The proprietary schema mapping of the EXPLICIT mode is powerful, but it's cumbersome and tedious. Ideally, the EXPLICIT mode should enable you to point to an XSD file to map the schema rather than crafting it by hand.


One more T-SQL extension is the OPENXML function. This enables you to insert XML data into a SQL Server database. It's great for batch inserts, but unfortunately, it doesn't support updates and deletions.


There are other, impressive ways of retrieving and updating XML data with SQL Server. The latest Web Releases of SQL Server include these new features. Updategrams leverage the XSD rather than a proprietary mechanism. This new feature, along with XPath queries, templates, and HTTP access to SQL data are fodder for future articles. Once you understand the variety of ways you can work with SQL Server data using XML technologies, the better you will be able to find a solution that best meets your needs.


The sample code in this article is available for download.


Asli Bilgin is an MCSD serving as a .NET Technical Evangelist for Dell Professional Services ( She consults for Fortune 100 companies in architecting distributed enterprise solutions using Microsoft technologies such as SQL Server, ASP.NET, and VB.NET, and she is on the charter speaker board for INETA. She also speaks at Microsoft conferences and recently co-authored Mastering Visual Basic .NET Database Programming (Sybex). E-mail her at mailto:[email protected].


Tell us what you think! Please send any comments about this article to [email protected]. Please include the article title and author.




Hide comments


  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.