XML Q&A - 26 Apr 2001


Editor's Note: Send your XML questions to Rich Rollman at [email protected]

Using a self-referential column to expand a hierarchy that a table represents—such as parts in a parts list—seems to be a natural fit for XML. Why can't I get self-referencing columns to expand a table-based hierarchy?

The basic problem lies in how to execute recursive queries, which SQL Server doesn't support. However, you can use a number of workarounds to achieve the desired results. The workarounds show some useful techniques that you can use in other scenarios where you need to generate XML from the database.

One workaround is to write a FOR XML EXPLICIT query with fixed hierarchy depth. For example, if you have a table that contains a hierarchical organization list (such as the organization chart that Table 1 shows), you can write a FOR XML EXPLICIT query, like the one that Listing 1 shows, to output a hierarchy three levels deep. Figure 1 shows this query's output.

You can add more columns and another SELECT statement, which includes the next hierarchy level, to increase the number of levels this query produces. Note that this query returns a trivial amount of data. Even so, the query requires six columns (data columns * levels of hierarchy). A real-world query might return many columns, further complicating the query, but this workaround produces a straightforward XML result.

Using ID and IDREF attributes is another solution that doesn't force you to limit the number of hierarchy levels. In "XML Q&A," March 2001, I talked about using ID and IDREF relationships to express many-to-many (M:N) XML relationships. You can use the same query mechanism to produce links within XML from an organization to its parent organization. The code in Listing 2, page 60, shows a much simpler query that uses ID and IDREF to express the hierarchical relationship.

ID and IDREF let you handle arbitrary depth and multiple hierarchies in the same XML document (note that the FOR XML EXPLICIT query can produce an ill-formed XML document because multiple entries might have no parent). However, ID and IDREF don't nest tags to represent the hierarchy. Therefore, the resulting representation is less readable and more difficult to navigate and query.

Another alternative is to output the list directly, then apply an Extensible Style Language Transformations (XSLT) stylesheet to build the hierarchy. You can do this by using a FOR XML AUTO query like the following, which uses the AS clause to rename the columns to be more consistent with the output from the previous queries:

SELECT Org_Id AS id, Title AS title, Parent_Org_Id AS parent
FROM Organizations AS Org

You then add an enclosing root tag to wrap the results to make the XML well formed, as Figure 2 shows.

To transform the results, you apply the XSLT stylesheet that Web Listing 1 shows. (For download instructions, see the More on the Web box.) The code uses XSLT's recursive-processing model to build the nested output. The code selects all root directory Org nodes without a parent attribute (i.e., the top-level organizations). For each of those top-level organization nodes, the code finds all the Org nodes that have a parent attribute matching the ID of the node; then the code repeats the process recursively for each Org node. The advantages of using XSLT are that you get a simple query and a simple XSLT transformation. The disadvantages of using XSLT are the increased CPU load and memory consumption that the XSLT transformation requires.

When I use XML to store a decimal value in a float or real column, then use a FOR XML query to examine the value, why do I get an approximate decimal value instead of the exact decimal value I stored?

This problem relates to floating-point precision, rather than to SQL Server's XML support. When you store a decimal number in a float or real column—a real number is equal to a float(24) number—you might be storing a close approximation of the actual value because floating-point numbers don't precisely represent all possible values. If you increase the mantissa size, you might solve the problem.

When performing floating-point operations, you also need to ensure that the operands have the same degree of precision. However, you still can't accurately represent some numbers. In those cases, your best bet is to use an int, decimal, money, or smallmoney data type instead of float because those four types maintain the specified precision.

If your application requires a float type, your best solution is to perform explicit type conversion, using the STR() function to specify the exact precision and rounding semantics you want. STR() converts numeric data to character data and works well within a FOR XML query. Web Listing 2 shows the schema for a table containing numeric types. And Web Listing 3 shows the T-SQL code you can use to take a floating-point number from an XML document and store the number in the FloatConversion table by using OpenXML (note that the OpenXML example stores the same number in all columns). Table 2 shows the data as Query Analyzer represents it. Note that the floating-point column stores an approximation of the original number and that the decimal and money columns round appropriately to the number of digits that the schema specifies.

The following FOR XML AUTO query uses the STR() function to retrieve a floating-point number from the FloatConversion table. Note that in addition to using the STR() function, the following FOR XML AUTO query also uses the LTRIM() function to remove any leading space characters:

SELECT LTRIM(STR(myFloat,15,5)) AS myFloat FROM FloatConversion
FOR XML AUTO, elements

Finally, note that data access APIs such as OLE DB and ODBC also provide a set of data types for operating on database data. When you use floating-point numbers, implicit conversion routines in these libraries can affect how your query displays a floating-point number. You're always wise to explicitly control the formatting of floating-point numbers instead of letting the system do it.

Hide comments


  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.