ADO.NET and System.Xml v.2.0: The Beta Version
An In-depth Book Review
By Craig Murphy
There's no doubt about it, XML as a means of representing data has arrived. Whilst Microsoft's .NET Framework 1.1 did a good job of bringing XML and data representation/access together, it's fair to say that the .NET Framework 2.0 brings with it significantly more integration and ease of use.
UK software legends Dave Sussman and Alex Homer teamed up with Microsoft's Mark Fussell to produce this 528-page coverage of ADO.NET, the System.Xml v.2.0 namespace, and XML integration with SQL Server 2005.
When Microsoft introduced the .NET programming framework, many people were surprised at the fundamental changes to data access techniques that it encompassed. Not only was there a new model for both connected and disconnected relational database management, but it presented a raft of new ways to work with XML. And now version 2.0 of the framework is on its way, bringing with it fresh opportunities and new techniques that not only extend the reach of the technology, but also make many common tasks much easier to accomplish than ever before. This book previews and explains the features of the new versions of ADO.NET and System.Xml, based on the Beta release. It includes asynchronous commands and multiple active result sets, SQL Server 2005 integration, and the universal query architecture plus the enhancements to the DataSet, XPathDocument, the new XQuery support, and more.
There are 12 chapters, each of which is a good mix of narrative and code. In fact, there's nothing in the way of reference-style content, i.e. page after page of method and property listings that merely regurgitate a help file.
Two authors contribute to a well-balanced Foreword, kept down to a shade over four pages; both are Microsoft employees working on the WebData Team: Michael Pizzo and Soumitra Sengupta. The WebData team goes back a long way, as far as I know, at least to the year 2000 (and probably earlier). These guys have been so close to XML and data access, it positions them perfectly as either budding authors of this book or as readers/reviewers. I think it's safe to say that this book has been read by the very people who made ADO.NET and System.Xml v.2.0 a reality, and it was written by the three people most qualified to turn a technical topic into readable narrative.
New Concepts in Data Access
This first chapter wastes no time in setting the scene, both historically (.NET 1.0) and currently (.NET 2.0). On the first page, mention of the fact that the Common Language Runtime (CLR) can now be hosted inside SQL Server 2005 suggests that what we are about to read is going to be a factual and straight-to-the-point account of .NET 2.0.
The chapter then goes on to provide a summary of the evolution of data management in .NET, information about the .NET 2.0 Beta release, and new concepts in version 2.0. It also provides a summary of new features in ADO.NET and System.Xml; this is an excellent section that draws a neat map (textually) of the remainder of the book, covering such areas as ADO.NET enhancements, SQL Server 2005 integration, and XML enhancements. Lastly, this chapter discusses the new data source controls and data binding features in ASP.NET.
ADO.NET Data Management Enhancements
ADO.NET v.1.0 heralded the introduction of disconnected data access and sported a number of classes that allow us to work with data in the absence of a physical connection to the database. This chapter discusses ADO.NET v.2.0's ability to access data asynchronously, its integration with the .NET System.Transactions namespace (allowing local transactions to be promoted to a distributed transaction), batch updating using a DataAdapter, and the new classes available to managed code that allow the bulk copying of data (similar to SQL Server's Bulk Copy Program, BCP).
This is the first chapter to mention MARS (Multiple Active Result Sets). MARS is a SQL Server 2005 feature that allows us to open multiple result sets over a single connection (you'll need ADO.NET v.2.0 to make use of it). When would MARS be useful? Well, if you use a single table to hold your entire application's collection of drop-down or pop-up menu items, you may well need to fire off multiple queries at the same table in the same database.
The narrative in this chapter explains the need for synchronous and asynchronous access to data and ties it in nicely with its discussion about MARS. Clearly there are issues with MARS (what it does) and asynchronous data access; this chapter explains those issues succinctly.
Provider Factories, Schema Discovery, and Security
Today's modern applications don't limit themselves to a single database. In fact, from a corporate perspective, it pays to select a vendor whose application runs on a choice of database platforms. This chapter discusses the ADO.NET 2.0 APIs that we can use to help make our applications database-agnostic, or database-neutral. Such neutrality is achieved by introducing the concept of layering or interface/protocol stacking.
This chapter provides a fairly in-depth and technical look at the Provider Factory classes, the DbConnectionStringBuilder class, the Schema Discovery API, Security, and Performance. Whilst this is required reading, particularly before an application is architected and a database platform is firmed up, I did think that this chapter arrived rather early in the book.
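The database-neutral idea is not specific to .NET, so it can be illustrated with a small analogy. The Python sketch below is not the ADO.NET API and is not taken from the book: the registry, the `get_factory` helper, and the provider name `"sqlite-memory"` are hypothetical names invented for illustration. The point is only that application code asks for a factory by name and never mentions a vendor-specific class.

```python
import sqlite3
from typing import Any, Callable, Dict

# Hypothetical registry: provider name -> function that opens a connection.
# A real application would read the provider name from configuration.
_FACTORIES: Dict[str, Callable[[], Any]] = {
    "sqlite-memory": lambda: sqlite3.connect(":memory:"),
}

def get_factory(provider_name: str) -> Callable[[], Any]:
    """Look up a connection factory by name (KeyError if the name is unknown)."""
    return _FACTORIES[provider_name]

def count_menu_items(provider_name: str) -> int:
    """Application code: only the provider name varies between databases."""
    conn = get_factory(provider_name)()
    try:
        conn.execute("CREATE TABLE menu_items (label TEXT)")
        conn.executemany(
            "INSERT INTO menu_items VALUES (?)", [("File",), ("Edit",)]
        )
        return conn.execute("SELECT COUNT(*) FROM menu_items").fetchone()[0]
    finally:
        conn.close()
```

Supporting a second database in this sketch would mean adding one entry to the registry; `count_menu_items` itself would not change.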
The DataSet and DataTable Classes
The performance enhancements between the DataSet class in ADO.NET 1.0 and the same class in 2.0 are presented in an eye-catching graph; the reader could hardly miss it. A comparison of the 1.0 DataSet and the 2.0 DataSet during an insert rows benchmark demonstrates that large numbers of inserts in a 2.0 application should take less than half as long as a 1.0 application.
Coverage of the new features found in the DataSet class follows. With references to future chapters, this only whets our appetite for an XML datatype which we'll learn more about later.
Again, this chapter is rather in-depth and appears quite early in the book. However, because we cannot avoid working with the DataSet and DataTable classes, such coverage is welcomed.
ADO.NET and SQL Server 2005
This chapter covers the merging of two separate items of functionality: SQL Server 2005 and ADO.NET. It does so by focusing on three areas of functionality: MARS, SQL Server Query Notifications, and SQL Server User-Defined Types (UDTs).
MARS was first mentioned in chapter two, albeit at a rather high level. This chapter goes into much more detail (nine pages) regarding the use of MARS.
SQL Server Query Notifications essentially invert the processing logic: instead of a client application periodically polling the database for changes, Query Notifications allow the client to register their interest and be told whenever a particular piece of data (obtained via a query) has been changed or invalidated. The authors spend some 15 pages covering Query Notifications and make good use of seven small examples.
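The inversion described above, from polling to being told, can be sketched in a few lines. This is a conceptual Python analogy only, not the SQL Server or ADO.NET API and not one of the book's examples; `WatchedStore`, `register_interest`, and the key `"price:42"` are hypothetical names.

```python
from typing import Callable, Dict, List

class WatchedStore:
    """Toy in-memory store that tells registered clients when data changes."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._subscribers: List[Callable[[str], None]] = []

    def register_interest(self, callback: Callable[[str], None]) -> None:
        """A client registers once instead of polling repeatedly."""
        self._subscribers.append(callback)

    def update(self, key: str, value: str) -> None:
        """Store the value, then notify every registered client of the changed key."""
        self._data[key] = value
        for callback in self._subscribers:
            callback(key)

store = WatchedStore()
invalidated: List[str] = []                   # keys the client must refresh
store.register_interest(invalidated.append)  # client-side registration
store.update("price:42", "9.99")             # the store pushes the change
```

After the update, the client's `invalidated` list holds the changed key without the client ever having asked the store whether anything changed.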
Given that we've already learned that SQL Server can host CLR-compliant languages, it should come as no surprise to learn that this chapter also discusses UDTs and how we can extend SQL Server's existing data types with those from a managed code environment. Some 12 pages and five code examples explain when and how UDTs should be used.
SQL Server 2005 CLR Hosting
This chapter's key takeaway is the ability to write SQL Server 2005 stored procedures using a CLR-compliant language such as Visual Basic.NET or C#. The <SqlProcedure> attribute allows us to mark or decorate a piece of code as being a stored procedure.
XML in SQL Server 2005
This chapter starts making reference to SQL Server 2005 as an XML database. This is excellent news. The new XML datatype allows us to store XML in a typed form such that it conforms to an XML Schema. This kind of functionality is something we had to do manually prior to SQL Server 2005. The XML datatype also allows us to use column names in SQL statements and stored procedures.
A good discussion about the XML Schema Repository in SQL Server 2005 then follows. XML Schema is worthy of a book in its own right. The authors carefully recognise this and make excellent use of a simple example that covers the salient points, thus keeping this discussion focused on the Schema Repository.
Now that structure and schema have been covered, the authors move on to discussing inserts and selects using the XML datatype. It should come as no surprise to learn that XML has its own query language: XQuery. XQuery uses the XPath language to query an XML document that is held in a SQL Server 2005 table. Again, XQuery and XPath are topics in their own right and, again, the authors have noticed this and do not try to explain these subjects in depth. Instead, they chose to focus on the common questions/pitfalls that you and I might come across when trying to work with XML-based querying, such as XML namespaces and the rather interesting subject of binding relational data inside XML.
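The namespace pitfall mentioned above is easy to reproduce in any XPath-capable library. The following Python sketch is an illustration only: it uses the standard library's ElementTree, which supports a limited subset of XPath and no XQuery, and the sample document and the `urn:example:books` namespace are made up for the purpose.

```python
import xml.etree.ElementTree as ET

# Made-up document whose elements all live in a default namespace.
DOC = """<catalog xmlns="urn:example:books">
  <book><title>First</title></book>
  <book><title>Second</title></book>
</catalog>"""

root = ET.fromstring(DOC)

# Pitfall: an unqualified path matches nothing, because "book" here means
# "book in no namespace", while the document's elements are namespaced.
unqualified = root.findall("book/title")

# Fix: bind a prefix to the namespace URI and qualify each step of the path.
ns = {"b": "urn:example:books"}
qualified = [title.text for title in root.findall("b:book/b:title", ns)]
```

The prefix `b` is arbitrary; what matters is that it is bound to the same namespace URI the document declares.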
The XML datatype is also implemented in ADO.NET. A brief discussion about the changes to the underlying namespaces, classes, and methods they expose follows. I was pleased to see the authors highlight the fact that certain field types and property names might change between the beta release and the final release of .NET v.2.0. Working with SQL Server 2005's XML datatype via ADO.NET includes coverage of: reading XML via a DataReader, updating an XML column with a Command, updating an XML column using an XML DML Statement, reading and updating the XML datatype with a DataSet or DataTable, loading a DataTable containing an XML-typed column, and updating a DataTable containing an XML-typed column.
The remainder of this chapter, seven pages and five screenshots, covers how to use the XML classes in the SQL Server CLR.
XML in the .NET Framework
This is by far the easiest to read and most succinct introduction to XML and its associated technologies that I have seen to date. It covers the importance of XML, why you need it, a look at the System.Xml v.1.0 namespace, what's new in the System.Xml v.2.0 namespace, XML support in Visual Studio 2005, and an introduction to XQuery.
This book's authors position XML within the context of the XML specifications, such as XML 1.0, DOM 1.0, XPath 1.0, XSLT 1.0, SOAP 1.2, XML Schema, and the XML Information Set. Whilst these specifications don't make easy reading, this chapter covers their salient points, giving new readers a good overall grounding in the history of XML.
There is good coverage of XML's relationship (no pun intended) to databases, its use in Web applications, description of data via schemas, transformation and presentation via XSLT, and querying via XQuery. Mention of its ability to form part of a content publishing framework is touched on, but the focus quickly moves on to XML in .NET.
Through effective use of diagrams, XML's position within the .NET Framework becomes clear. The boundaries between the XML world and the relational world are evident, even to the point that the notion of serialising a class as XML for dissemination over a network is touched upon.
Whilst not very detailed, a reasonable discussion of the System.Xml v.1.0 namespaces follows. It covers such topics as: XML reading and writing, XML document editing, XML validation and content checking, XML querying, and XSL transformation. I see this chapter as a directional chapter; it provides the pointers and it's up to you to perform the in-depth research. This is an admirable approach, and one that makes the chapter a bit easier to read.
System.Xml v.2.0 is then explained, using the same headings as noted above for v.1.0. XML support in Visual Studio 2005 is discussed, intermingled with a handful of screenshots. Importantly, with XSLT now being compiled, the fact we can now debug XSLT using the Visual Studio 2005 debugger is a gem not missed by these authors! Indeed, a screenshot of the XML Schema editor, which looks rather similar to the XML Schema editor in earlier versions of Visual Studio, is also included.
This chapter closes with 10 pages explaining XQuery and how it can be used for querying XML data.
Reading and Writing XML
I like the way the authors present scenarios at the start of most chapters. This is an ideal mechanism for communicating what the chapter is about to cover and, importantly, why we might need the chapter itself.
Although this chapter runs to nearly 90 pages, it really only covers reading (loading) and writing (creating) XML documents and fragments. However, it does go into reasonable detail covering how XML can be validated (for type and structure) using Document Type Definitions and XML Schema.
Good coverage of how to deal with XML fragments, i.e. small parts of a larger XML document, and how to deal with them using code, makes this chapter a good read. Similarly, there are a few good examples that cover most XML developers' bane: XML namespaces. Kudos to the authors for providing these examples. Further coverage borrows from the other chapters where we learned that System.Xml v.2.0 now supports Typed Content Accessor methods, thus allowing us to work with XML as if it were typed.
I was pleased to see code examples of the ReadTo methods. These allow us to open an XML document/datastore and navigate to a particular element with just one line of code. Essentially, the ReadTo methods map on to XPath axes.
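To give a feel for the "jump straight to an element" style of forward-only reading, here is a rough Python analogy. It is not the XmlReader API and not one of the book's examples: `read_to_following` is a hypothetical helper built on the standard library's streaming `iterparse`, and the sample order document is invented.

```python
import io
import xml.etree.ElementTree as ET
from typing import Optional

def read_to_following(xml_text: str, name: str) -> Optional[str]:
    """Stream forward through the document and return the text of the first
    element called `name` whose end tag is reached, or None if there is none."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == name:
            return elem.text
    return None

# Invented sample document.
ORDER = "<order><id>7</id><customer>Ada</customer><total>12.50</total></order>"

customer = read_to_following(ORDER, "customer")
missing = read_to_following(ORDER, "discount")
```

As with a forward-only reader, the helper stops consuming input as soon as the wanted element has been read; it never builds a view of the rest of the document.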
Creating XML is the job of the XmlWriter class. Of course, the authors cover creation of an untyped XML document and a typed XML document. They also demonstrate how to integrate the XmlReader and XmlWriter classes to create an HTML page from an RSS feed: the code was simple and evidence enough; no screenshot was required and none was given.
Tagged on toward the end of this chapter are six pages about Security and XML. With XML's popularity, it's only a matter of time before somebody finds a means of causing havoc using XML, and that would throw the entire industry into mayhem. Restricting DTD parsing and XSLT processing are two such security mechanisms that the authors cover, both with narrative and code examples.
The chapter wraps up with a look at how we might create an XML Schema from an existing XML Document, a process known as inferring an XML Schema from an XML Document. We need this kind of functionality to make good use of XML Documents with SQL Server 2005, which requires an XML Schema before it will look at XML. Whilst XML Schema inference is embedded within the Visual Studio IDE, it is also surfaced via the XmlSchemaInference class; thus, we can access it programmatically.
XML Serialization Enhancements
As the authors rightly point out in this chapter's opening salvo, serialisation isn't something you might find yourself doing much of. Or so you thought. If you've been building Web service-based applications using either Visual Studio or Delphi, under the hood, the Web service architecture has been serialising your data and classes on your behalf.
Pre-generation of serialisation assemblies is covered, albeit rather lightly. The command-line syntax for the Serialisation Generation Tool, sgen, is given, as is a working example that covers two pages. An additional three pages explain the enhanced operation of the IXmlSerializable interface, using two pages of code to demonstrate it in practice. This chapter, whilst useful, has a high code-to-narrative ratio, which isn't necessarily a bad thing. Serialisation is important, so perhaps it's best demonstrated via a reasonable-sized code example.
XML Document Stores
I'm surprised that this chapter appears so late in the book: I would have thought it would have been presented earlier. It spends much of its 45 pages setting the scene for XML, using the XmlDocument class, limitations of the XML DOM, design guidelines for exposing XML from your classes, the XPathNavigator with cursor-editing model, using XPath queries, inserting attributes and elements using the XPathNavigator, and adding and removing elements using the XPathNavigator. I think we can conclude that the XPathNavigator plays a major part in the System.Xml v.2.0 namespace. Indeed, it was read-only in the v.1.0 namespace; now it has been opened up for editing.
The XPathNavigator API is covered using a series of small examples, which demonstrate the salient points without too much noise. The XmlDocument class is now capable of understanding XML Schema (previously we had to use a validating XmlReader). This has the advantage of making our XmlDocument instances type-aware. Obviously, some sort of mapping between XML Schema types (XSDs) and CLR types needs to be defined. This is a subject the authors choose to explain via a mapping table and a code example making use of the CLR type Double.
A further eight pages cover the ins and outs of validation against an XML Schema, making good use of short examples to convey the salient points. The remainder of the chapter covers XPath queries, drawing upon the authors' knowledge and experience to answer many frequently asked questions (FAQs from newsgroups).
Transforming XML Documents
Chapter 12's great revelation is the introduction of compiled XSL transforms, provided by the XslCompiledTransform class. Naturally, the authors waste no time in benchmarking the performance of the new XslCompiledTransform class against the .NET 1.1 XslTransform class. You will be interested to note that the new XslCompiledTransform class offers 200% to 400% performance gains, depending upon your scenario. How is this performance gain realised? Well, XSLT is now compiled into .NET's intermediate language: MSIL. If XSLT can be compiled into MSIL, you can perhaps imagine the incredible side-effect that this has: creation of a program database (PDB) file is now possible, as is debugging via an XSLT debugger!
This chapter also provides a few words revolving around the subject of when to use the eXtensible Stylesheet Language for Transformation (XSLT). This is a much-welcomed addition.
Lastly, this chapter discusses XSLT security. It's perhaps something not a lot of us think about, but with script hacking becoming increasingly popular, it is worthy of a mention. Whilst basic .xsl transformations are reasonably harmless (knock on wood!), XSLT's scripting capabilities introduce the need for security. Two settings are discussed: XsltSettings.EnableScript and UnmanagedCodePermission. Like most Microsoft offerings these days, the EnableScript property is set to false by default. I would like to have seen a little more written about the XSLT security concerns and solutions ("go and write it yourself", I hear you cry!). However, I'm sure that .NET v.2.0 has been designed with security in mind; besides, security sometimes has to be retrospective: until there's an attack, we don't know what to secure!
I have only two gripes about this book: the code samples are written using Visual Basic.NET and the book does assume some prior knowledge. Luckily, the authors have been kind enough to provide C# versions of the code on their Web site! Prior knowledge shouldn't put you off; .NET has been with us publicly since 2000. Unless you've been living under a rock for the last five years, it's very likely that you've found yourself reading some .NET 1.0 and 1.1 material.
This is a good read; it is straight to the point and hits the nail on the head for a number of potentially scary subjects. If you are thinking of buying a book with the title ADO.NET and System.Xml v.2.0, I think it's fair to say that you're coming from a reasonably technical background and that you have an interest in where the technology's come from and where it's going.
Whilst the code samples in the book have been edited down, full versions are available from Dave and Al's Web site (http://www.daveandal.net). I recommend having the full code samples available to you; even if you don't compile/run them, it's good to use them for reference.
If you have enjoyed some success with .NET 1.1 and you are now planning a .NET/XML project and intend to use .NET 2.0, I strongly recommend this book as a great introduction to the salient points: the narrative is there, but has not been padded out. This book is clearly targeted at a developer audience, particularly an audience with some .NET 1.0 or 1.1 exposure and a moderate amount of XML awareness.
Title: ADO.NET and System.Xml v.2.0: The Beta Version
Authors: Alex Homer, Dave Sussman, Mark Fussell
Page Count: 528 pages
VBUG member Craig Murphy is an author, blogger, community evangelist, developer, speaker, project manager, and Microsoft MVP (XML Web services). He specialises in all things XML, particularly SOAP/Web services and XSLT. Craig is also evangelical about C#, Test-Driven Development, and Extreme Programming. He can be reached via his Web site: http://www.craigmurphy.com.