ADO.NET for the ADO Programmer

ADO.NET is the latest in a long line of database access technologies that began with the Open Database Connectivity (ODBC) application programming interface (API) several years ago. Along the way, a number of interesting things happened. For example, COM landed at the database territory and started a colonization process that culminated with OLE DB. Next, ActiveX Data Objects (ADO)-roughly an automation version of OLE DB-has been elected to govern the Visual Basic and ASP community of Windows-based database developers.

With .NET, Microsoft is offering a general-purpose framework-the Framework Class Library-that will span to cover all the existing Windows API and more. In particular, it will include a number of frequently used libraries now available through separate COM objects. Among these, you find the XML and ADO object models that have been integrated in a subtree of classes called ADO.NET.

ADO.NET turns out to be the substrate that will form the foundation of data-aware .NET applications. Unlike ADO, ADO.NET was purposely designed in observance of more general, and less database-oriented, guidelines. ADO.NET gathers all the classes that allow data handling. Such classes represent data container objects that feature typical database capabilities-indexing, sorting, views. While ADO.NET is the definitive solution for .NET database applications, it shows off an overall design that is not as database-centric as the ADO model.

ADO.NET is quite different from ADO. It is a new data access programming model that requires full understanding, commitment, and a different mindset. However, once you get started with ADO.NET, you'll realize that any ADO skills you have are a tremendous help in building effective applications and solving old issues in a different, but more elegant and consistent manner.

For the remainder of this article, I'll focus on the ADO.NET way of accomplishing basic database operations. I'll make it apparent when ADO.NET is a better choice than ADO, as well as when you're better off sticking with ADO. ADO.NET is not ADO adapted to fit in the .NET infrastructure. This is apparent when you look at ADO.NET in terms of syntax, code design, and migration.

Data Access After .NET

Access to data sources in ADO.NET is ruled by managed providers. Functionally speaking, a managed provider is much like an OLE DB provider with two important differences. First, they work in the .NET environment and retrieve and expose data through .NET classes like DataReader and DataTable. Second, their architecture is simpler because it's optimized for .NET.

At this time, ADO.NET comes in two flavors of managed providers: one for SQL Server 7.0 and higher, and one for all the other OLE DB providers you may have installed. The classes you use in both cases are different, but follow a similar naming convention. The names are the same except for the prefixes. The prefix is SQL in former case and ADO in the latter.

You should use SQL classes to access SQL Server tables because they go straight to the internal API of the database server, skipping the intermediate level represented by the OLE DB provider. ADO classes are a .NET interface on top of OLE DB providers and use the COM Interop bridge to do the job.

For a primer on ADO.NET objects, see Omri Gazitt's article, Introducing ADO+: Data Access Services for the Microsoft .NET Framework, and my article ADO+ Drives the Evolution of Data Species. The former is more technical and provides a high-level, annotated overview of the ADO.NET programming model. The latter is aimed at explaining the motivation for ADO.NET and how it relates to XML, script, and other technologies.

Reading Data

An ADO.NET application that needs to read some data out of a data source should start by creating a connection object. It can be SQLConnection or ADOConnection depending on the target provider. Bear in mind that, although not recommended, you can use ADO.NET classes to connect to a SQL Server database. The only drawback is that your code passes through an unnecessary extra layer of code. It calls into the managed provider of the ADO, which in turn calls the SQL Server OLE DB provider. The SQL Server managed provider, instead, goes straight to the data as an OLE DB provider would do.

A significant difference between ADO and ADO.NET connection objects lies in the fact that the ADO.NET connection doesn't support the CursorLocation property. Note that this is not a documentation bug, but rather a debatable design issue. To enforce its basic data-centric vision of the world, ADO.NET has no explicit implementation of cursors.

In ADO, you were used to employing cursors to pull records out of a database, or any other OLE DB-compliant data source. You might choose between client or server cursors and, depending on this choice, a few predefined cursor types. ADO.NET tends to abstract from the data source and provides a new programming interface for reading and analyzing data.

In ADO, you create a Recordset object by specifying a connection and a command text. The Recordset has certain policies for cursor location and type. To read data you can do any of the following:
Create a static, in-memory copy of the selected records and process them at will while being disconnected from the data source. ADO calls this a static cursor.

Scroll data through a fast, forward-only, read-only cursor that works over a static snapshot of the records. ADO calls this a read-only cursor.

Access data through two flavors of server-side cursors that need an outstanding connection but allow you to detect, at different levels, incoming changes made by other connected users. ADO calls these keyset and dynamic cursors.

The first two options are similar in that they both work on disconnected recordsets and read information from a client-side cache. Furthermore, these two options proved to be the most frequently used in Web-oriented contexts and for new n-tier systems.

In ADO, all of the options above map to a different type of cursor. As you'll see later in this paper, ADO.NET is quite different, but you won't lose any of the capabilities you had with ADO. By contrast, your code will abstract from the actual data source and its physical storage medium and format.

ADO.NET makes available two objects to manipulate data pulled out of a data source. They are the DataSet and the DataReader objects. The former is an in-memory cache of records that you can visit in any direction and modify at will. The latter is a highly optimized object aimed to scroll read-only records in a forward-only manner. Note that the DataSet looks like a static cursor while the DataReader object is the direct .NET counterpart of the ADO's read-only cursor.

In ADO.NET there's no support for server-side cursors. However, this doesn't mean that you cannot use them. What you need to do is import the ADO type library in .NET. This is as easy as right-clicking on the References node in the project window. Once you do this, you can start using native ADO objects in your applications.

While I personally recommend that you plan to rewrite your existing applications with .NET in mind, I recognize that porting to .NET is not an easy decision to make. A complete importation of ADO might be the first practical step with .NET without investing too much time and resources. However, bear in mind that this is only the first step of a longer path. In no way should this be the only step you take towards .NET. The real added value of .NET stems from a uniform and consistent programming interface and the broad use of native classes. Importing COM type libraries is supported, but not encouraged and is reasonable only as a short-term solution or intermediate step.

When thinking of ADO.NET, take into careful account the fact that ADO.NET unifies the programming interface for data container classes. Whatever type of application you're going to write-Windows Form, Web Form, or Web Service-you handle data through the same set of classes. Whether the data source in the back-end is a SQL Server database, an OLE DB provider, an XML file, or an array, you scroll and handle their content through the same methods and properties.

If you stick to ADO in the .NET world, be prepared to face some side-effects like the extra code you need to make recordsets usable from data-bound controls.

DataSet, DataTable, and Recordset

ADO.NET has no direct counterpart for the Recordset object. The closest is the DataTable object. Although they have a nearly identical set of functionalities, they play a different role in the respective frameworks.

The Recordset is a huge object that has many of the ADO capabilities, but still lacks in some areas. It is good at many things-it is creatable, it works disconnected, it's feature-rich-but could be improved in some areas. For example, it's heavy to serialize over a network due to its inherent COM nature, it's a binary object that is difficult to share between modules running on different platforms, and it cannot penetrate firewalls. In addition, it represents a single table of records. If this table comes as the result of one or more JOINs, it can be difficult to update the original data sources. When you attempt to reconcile your disconnected recordset with the original data source, it works as long as the data source understands SQL. However, your recordset could have easily been created by a non-SQL provider.

In ADO.NET, all the functionality of the ADO Recordset has been split into a few simpler objects-one of which is the DataReader. The DataReader mimics the behavior of a fast, read-only and forward-only cursor.

The DataTable is a simple object that represents a data source. You can manually fabricate a DataTable, or you can also have it automatically filled by DataSet commands. The DataTable doesn't know anything about the origin of the data it contains. It allows you to manipulate data in memory and do things like navigation, sorting, editing, applying filters, creating views, and more.

The DataSet object is something for which ADO has no counterpart. It is a data container class and is the key object to realize the ADO.NET data abstraction. The DataSet groups one or more DataTable objects. The DataTable exposes its content through generic collections like rows and columns. When you attempt to read from a data table, you might pass through two different layers of objects: DataTableMapping and DataView.

The DataTableMapping object contains the description of a mapping between columns of data in a data source and a DataTable object. This class is used by the DataSetCommand object when populating a DataSet. It maintains the link between abstract columns in the data set and physical columns in the data source.

A view of the table is rendered through the DataView object. It represents a customized view of the DataTable and can be bound to specialized controls like the datagrid in Windows Forms and Web Forms. This object is the in-memory equivalent of the SQL CREATE VIEW statement.

All the tables in a DataSet can be put into relationship through a common field. The relationship is managed by a DataRelation object. This looks much the same as ADO data shaping but with one important difference. There's no need to cope with the data shaping language and you end up having an extremely flexible architecture. The ADO.NET navigation model allows you to easily move from the master row in one table to all of its child rows.

The DataRelation object is an in-memory counterpart of the JOIN statement and is helpful with implementing parent/child relations involving columns that have the same data type. Once the relationship has been set, any change that would break the relationship is disallowed and originates a runtime exception. Views and relations are two methods to implement master/detail schemas. Bear in mind that a view is just a mask put on the records, whereas a relation is a dynamic link set between one or more columns of two tables, and with relations you have no way to change the order or set conditions.

If your code needs 1:1 foreign-key relationships, and you don't change the data, then you are likely better off using plain JOIN commands. If you need extra filtering capabilities, you should go for ADO.NET custom views.

Translating Existing Code

There are plenty of ASP pages out there using ADO objects to pull data. Let's review a few typical scenarios you might be facing in the near future while dealing with the porting or the adaptation of your code.

If you have ASP pages that generate reports from a single recordset, the DataReader object is your best friend.

You navigate through it and it outputs to the page.

String strConn, strCmd;

strConn = "DATABASE=MyAgenda;SERVER=localhost;UID=sa;PWD=;";

strCmd = "Select * From Names where ID=" + contactID.Text;

SQLConnection oCN = new SQLConnection(strConn);

SQLCommand oCMD = new SQLCommand(strCmd, oCN);

oCN.Open();

SQLDataReader dr;

oCMD.Execute(out dr);
while (dr.Read()) {

// Use dr.GetString(index) or

// dr["field name"] to Response.Write data

}

You can also use the HasMoreRows property to quickly check if the DataReader is empty. If you only need to walk through a series of records, nothing is better and faster than the DataReader. This holds true if you need to query for one single record. The content of the DataReader is not editable, but you can always move that content out of it into a more manageable object-for example the DataTable or one or more DataRow objects.

The DataReader stops being the proper tool to use when you need to handle complex relationships between tables and records. In ADO, you always end up working with recordsets. The more articulated your data model, the more complex the SQL commands become. The navigation model remains sequential and you often end up caching more data than needed. DataSet and DataRelation objects are the underpinning of this sort of table-relationship model.

To manage parent/child relationships, ADO also marshals the data-shaping engine. Functionally speaking, data shaping and ADO.NET relations are the same thing. However, in terms of design, they have very little in common. Shaped recordsets embed all the information in a single tabular object. ADO.NET relations are dynamic links you can establish at any time between two tables of data. ADO relies on the Shaping OLE DB service provider and features a specific SQL-like language to create a hierarchical recordset within the execution of a single ADO command.

In ADO.NET, each object involved in the relationship is always seen as an individual. The relation itself is exposed as an object and given certain behavior guidelines. For example, a DataRelation object can cascade changes from parent rows down to child rows. You do this by adding a ForeignKeyConstraint object to the DataTable's Constraints collection. The ForeignKeyConstraint object represents a restriction enforced on a set of columns coupled through a foreign key relationship when a value or a row is deleted or updated. As mentioned earlier, once the relationship has been set, and until it is programmatically terminated, you cannot enter changes that would break it up.

In addition, relations are not transitive. You can set two different relations between, say, Customers and Orders, and Orders and Products. However, while navigating the orders for a certain customer you cannot jump from one order to the related set of products rows. Instead, you must open the Orders/Products relation separately, locate the order you need, and then ask for the related rows. That's why sometimes 1:1 relationships are better off rendered through plain old SQL JOIN statements.

Need to store records in the ASP Session object? With ADO.NET and the DataSet object, you can do that rather safely without incurring the nasty effects discussed in "Storing an ADO Recordset in GIT Might Cause An Access Violation"

http://support.microsoft.com/support/kb/articles/Q249/1/75.ASP), as well as struggling with thread affinity.

Updating Data

Web applications generally update their data using plain SQL statements or, better yet, using parameterized stored procedures. However, when it comes to using disconnected data, you might want to exploit built-in services to update all the records that need to be revised. To accomplish this, ADO provides the batch update mechanism.

The UpdateBatch method is used to send Recordset changes held in the copy buffer to the server to update the data source. It makes use of an optimistic type of lock and allows all of the pending local changes. It also sends all the changes to the data source in a single operation. An optimistic lock occurs when the data source locks the records being changed only when the changes are committed. As a result, two users can access the same record at the same time and enter changes that are soon overwritten by the other. Of course, this approach works as long as the data source is capable of detecting and rejecting data conflicts. It also assumes that the whole data source is not extremely volatile and subject to frequent changes. Otherwise, the cost of the inferred reconciliation soon outweighs the savings of a few full, pessimistic locks. In fact, with the UpdateBatch method, an error is returned should any of the changes fail. Then, you access the error using the Errors collection and Error object.

Understanding how optimistic locks work in ADO is a key element in understanding how the ADO.NET model for updating data is significantly more powerful. From the ADO code, you call UpdateBatch and the rest that happens is beyond your control. That is, updates are performed on the server by scrolling the rows that have been changed, comparing the original value with the current value in the corresponding record of the data source. If all coincide, then the proper SQL statement (INSERT, UPDATE or DELETE) executes against the table.

The issue is that you can't control the SQL statement that actually applies the changes for you. The server-side update code is neither better than the one you write, nor does it work if you target a non-SQL provider. At the beginning of this section, I stated that Web applications typically update their data through parameterized stored procedures. However, this isn't what happens if you use a batch update.

In ADO.NET, this model has been expanded quite a bit. Now it follows a more generic schema that lets you specify your own commands for basic operations like insertion, deletion, update, and selection. It's easy to see the intention to abstract from the data source and provide the same support, regardless of the data source's nature. Batch updating in ADO.NET requires you to create a DataSetCommand object-either SQLDataSetCommand or ADODataSetCommand.
Note: In Beta 2, DataSetCommand objects will be called DataAdapter objects.

Once you hold a DataSetCommand object in your hands, you call its Update method. DataSetCommand exposes properties like InsertCommand, DeleteCommand, UpdateCommand, and SelectCommand. They're Command objects but you don't have to set them unless the default behavior doesn't fulfill your expectations. This is the same as in ADO. During Update, if any of the xxxCommand properties is not set but primary key information is present, the Command object will be automatically generated. Notice that for this to work, having a primary key set for the involved data tables is mandatory.

The following code shows how to set a primary key for the EmployeesList table of a DataSet:

DataColumn[] keys = new DataColumn[1];

keys[0] = m_oDS.Tables["EmployeesList"].Columns["EmployeeID"];

m_oDS.Tables["EmployeesList"].PrimaryKey = keys;

A primary key is basically an array of DataColumn objects.

If you want to use a stored procedure to update a table, or if you're working against a proprietary, non-SQL data provider, you will be making frequent use of these command properties.

The Extended Support for XML

In ADO, XML was nothing more than a mere input and output format. However, in ADO.NET, XML is the data format that provides you with the means of manipulating, reorganizing, sharing, and transferring your data. Any chunk of data you bring into a DataSet, regardless of origin, can be manipulated through a double-faced programming model. You can interchangeably access pieces of information sequentially, row after row, as well as following a nonsequential, hierarchical path driven by the XML document object model.

A DataSet reads and writes data and schema as XML documents. Both data and schema are transportable through HTTP and can be used on any platform that understands XML. The same data can be rendered through different schemas (and XSLT finds its way out) at different times. You use the ReadXmlSchema method to write schemas. A XML schema includes a description of the tables in the data set, as well as their relations and constraints. You should do this prior to calling into the ReadXmlData method that populates the DataSet.

The following code sample shows a minimal ASP.NET page that displays an updateable table of data.






As Figure 2 demonstrates, you can add new rows to the table. However, there's no SQL Server or Access table behind it. It's only an XML file, but in the code that runs against it, there's no trace of XML nodes or XMLDOM methods. You use the same intuitive interface of data tables to read and update XML records. You could do roughly the same in ADO, but the model here is deeper and larger and has a stronger potential for you to exploit.

Summary

The success of Web applications changed the face of the typical distributed system. Now most distribution systems are n-tier systems characterized by a high, and still growing, demand of scalability and interoperability. As a result, data disconnection and XML became best practices and gained a wide acceptance from the industry.

ADO.NET tries to unify some of the today's best practices under the umbrella of .NET. The overall programming model for data access is comprehensive and incredibly powerful. While the model may not meet all of your individual needs, it is a big step forward in designing a model for the future. However, please keep in mind that ADO.NET is a beta and comes with limited documentation support.
ADO programmers will benefit the most from this beta because they are already familiar with many aspects of ADO.NET, including the highest level of abstraction-the inspiring model. ADO.NET code is not compatible with existing ADO code, but the set of functionalities is similar. To take full advantage of ADO.NET, you should put some effort into understanding the concept itself, rather than simply figuring out the fastest way to port your code. Whatever .NET programming model you choose-Windows Forms, Web Forms, or Web Services-ADO.NET will be there to help you with data access issues.

http://msdn.microsoft.com/library/default.asp?URL=/library/techart/exchangeanchor.htm

Hide comments

Comments

  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.
Publish