Break Through the Data-Access Maze

Choose the smartest path for your data-access challenges.

asp:cover story




Break Through the Data-Access Maze

Choose the smartest path for your data-access challenges.


By Jeffrey Hasan and Kenneth Tu


ASP.NET applications present unique challenges for data access, particularly when they include or access XML Web Services. ADO.NET is a data-access technology that's, by now, familiar to most .NET developers. Its usage is well documented in numerous books and articles, and there's little new to add as far as introducing the technology. What's less documented, though, is how to choose the best data-access option among the often confusing array of choices ADO.NET provides. The best choice usually depends on two factors: the type of data being retrieved (or updated), and the required transport mechanism for returning query (or update) results back to the calling application.


For example, consider one simple scenario: code behind a Web page that queries a table and binds the results to a DataGrid on the page. In this case, the query result may be retrieved in any way conducive to binding to a DataGrid, including as a DataSet, DataReader, or XML. The query result doesn't need to be transported or serialized, so the data-access options are flexible.


Now consider a more complex scenario: A Web method queries a table and returns the results over the wire as XML. In this case, the result must be serialized to XML, so the data-access options are more limited; a DataSet is a good choice because it serializes easily to XML. But a DataReader does not, so it's not an option.


At the simplest level, you can optimize data access by considering how data needs to be retrieved, used, and transported. Later in this article, we'll provide a decision-support flow diagram for data access, which can help you choose the best data-access approach for a given scenario.


ASP.NET applications - especially XML Web Services - present challenging data-access scenarios that require difficult designs. In this article, we'll discuss the best ways to access data from SQL Server databases and present a decision-support system for choosing optimal data-access approaches.


Connect to SQL Server 2000

The .NET Framework provides several data providers optimized for specific data sources. The SQL Server .NET Data Provider is best for SQL Server 7.0 and later because it communicates using SQL Server's native Tabular Data Stream protocol. This data provider is 30 to 40 percent faster than a comparable OLE DB data provider because it avoids using an intermediate OLE DB layer. The SQL Server data provider is included in the System.Data.SqlClient namespace. (For earlier versions of SQL Server you must use .NET's standard OLE DB data provider, which is included in the System.Data.OleDbClient namespace.)


Before you can access data, you must open an optimized connection to the data source. For SQL Server data sources, you can do this in two ways. You could set connection-string parameters that specify the data source address and the transaction context, or you could implement connection pooling to reuse existing connections. Here's an example of an optimized connection string for the Northwind database in a SQL Server instance named MySQLDB:



    database=northwind; enlist=false;

     packet size=8192;connect timeout=300;


Figure 1 provides optimization details for several important connection-string parameters.







The name or network address of the SQL Server instance to which you connect. Use an IP address instead of a DNS name; this avoids name resolution, which increases the time necessary to make the connection.



When 'true', the connection pooler enlists the connection automatically in the creation thread's current transaction context. If the application is not using transactions, set to 'false' for better performance.

Packet Size


Size, in bytes, of the network packets used to communicate with an instance of SQL Server. When used with SQL Server, you can fine-tune this parameter for communication performance by adjusting this value based on the size of the data packets being transferred between client and server. Range is 512-32767 bytes.

Connect Timeout


This parameter limits the wait time when a connection attempt fails. Also improves overall performance by avoiding deadlocked connections that are unavailable to participate in a connection pool.

Figure 1. Here's a summary of optimized parameter settings for a SQL Server database connection string.


Some debate exists about the best location for storing database connection strings in ASP.NET applications. Our preference is to store them as custom application settings in the web.config file:



        value=" server=\MySQLDB;uid=sa;


                enlist=false;packet size=8192;

                connect timeout=300;" />


Then you can retrieve the connection string from the code behind:


Imports System.Data.SqlClient


Dim strConn As String = _


Dim sqlConn As SqlConnection = New SqlConnection(strConn)



This approach provides the best flexibility because you can customize the connection string for separate deployments of the same application without rebuilds. For example, the staging copy of an application can point to a development database, whereas the production copy of an application can point to the production database. ASP.NET caches the web.config file, which minimizes lookup times on the connection string.


Pool Your Connections

ASP.NET applications use database connections inherently for small durations of time. These applications typically open a connection, execute a query, return a response quickly, and release the connection back to the pool for other instances to use as soon as possible. The default number of connections is 100, so if the connections are released back to the pool properly after usage, the default setting should be more than enough for an ASP.NET application. You can adjust this number to meet the needs of the application. Be forewarned - setting this number too low could cause an empty connection pool, such that further requests for a connection will hang your application until one becomes available in the pool. To release a connection properly, invoke the Connection object's Close method, which releases the resources used by the connection and returns the connection back to the pool. Figure 2 provides optimization details for several important connection-string parameters related to connection pooling.





Connection Lifetime


The connection lifetime value for each connection returned to the pool. A value of zero (0) causes pooled connections to have the maximum timeout.

Connection Reset


When set to 'true', the database connection is reset when removed from the pool.

Max Pool Size


The maximum number of connections allowed in the pool.

Min Pool Size


The minimum number of connections maintained in the pool.



When 'true', the connection is drawn from the appropriate pool or, if necessary, created and added to the appropriate pool.

Figure 2. Here's a summary of optimized connection-string parameter settings for SQL Server database-connection pooling.


Now that you can optimize your database connections, let's look at the two main data-access objects the .NET Framework provides: the DataReader and DataSet objects. We'll discuss their relative advantages and how to pick the right object for a specific ASP.NET data-access scenario.


DataReader vs. DataSet

The DataReader object provides an unbuffered, sequential data stream for fast, efficient, read-only access to a set of records. The DataSet object takes a virtual snapshot of a database and stores it as a disconnected, in-memory representation of a collection of data tables, relations, and constraints. Most importantly, the DataSet is integrated tightly with XML, and it can serialize relational data to and from XML while preserving both data and schema information.


The DataSet object handles more complex data-access scenarios and provides more data-transport options compared to a DataReader object. A big advantage of the DataSet is it may be disconnected from the underlying data source and operated on without a persistent database connection. Also, it can perform complex operations on data - such as sorting, filtering, and tracking updates - without requiring custom code. Unlike the DataReader, the DataSet may be cached in memory and retrieved at any time. It also can provide multiple, alternative views of the same data, where each alternate view is stored as a subset of records in separate DataView objects. Finally, the DataSet offers the distinct advantage of serializing its relational data and structures to XML automatically. This lets a client application receive serialized relational data from a Web service method as XML. The data then can be transported across the wire and hydrated on the client into a new DataSet object.


The main disadvantage of the DataSet object is its relatively heavy use of server resources, including memory and processing time, compared to the DataReader. The DataSet can't compete with the efficiency of the DataReader for read-only operations, but it does provide multiple capabilities that the DataReader does not.


The DataReader and DataSet objects provide distinct advantages for different data-access scenarios. Figure 3 highlights the three most common data-access scenarios in ASP.NET applications: read-only, update, and data delivery. The DataReader object is appropriate for the first scenario, which requires fast, read-only data access without caching or serialization. DataReader objects aren't designed for transport, nor can they operate without a persistent database connection. The DataSet object is appropriate for the second and third scenarios because it supports update data access and enables relational data to be represented and transported as XML.


Figure 3. Here are the three most common data-access scenarios in ASP.NET applications: read-only data access, update data access, and data delivery from a Web Service method. This diagram shows the interactions between database, Web server, and client.


XML Support in the DataSet Object

Three important scenarios might require you to serialize relational data from a DataSet to XML:


     Exchanging relational data with XML Web Services, which includes both receiving XML from a Web method as well as passing XML back to a Web method;

     validating data against a specific XSD schema;

     and generating typed DataSets.


The DataSet object provides support for all of these scenarios. We'll briefly consider each in turn.


Serialize a DataSet as XML: The .NET Framework makes it trivial to exchange a DataSet between a Web service and a client application because it serializes the DataSet to XML automatically. The Web method's return argument simply needs to be a DataSet type:


  Public Function GetProductList() As DataSet

            ' Code to generate a hydrated DataSet

End Function


This approach works if you don't need to work directly with the DataSet's XML. But if you do, the DataSet provides good support for serializing its data out to XML. The DataSet provides a method named GetXml, which generates an XML representation of the relational data automatically in a hydrated DataSet. In addition, the DataSet object provides a method named GetXmlSchema, which generates schema-only information without the actual data. This listing shows an example of how to write a DataSet's schema information to a file:


Imports System.IO


Sub GenerateSchemaFile(sqlDS As DataSet)


        Dim xsdFile As File

        If xsdFile.Exists("C:\temp\products.xsd") Then _


         Dim sw As StreamWriter

        sw = New StreamWriter("C:\temp\products.xsd")

        sqlDS.Namespace = "urn:products-schema"




    End Try

End Sub


Validate a DataSet against a specific XSD schema: An XSD schema file describes the structure of a relational data set fully, including the data types it contains. XSD schema files are excellent validation tools to ensure a DataSet contains valid relational data. This is especially useful when you are adding additional records to a DataSet and need to verify that both the data types and field names are valid. The DataSet provides a method named ReadXmlSchema for importing an XSD schema file. If an invalid data type or field name is assigned, the DataSet will raise a specific SqlException. Figure 4 shows an example based on the Products table in the Northwind database.


Imports System.Data

Imports System.Data.SqlClient

Imports System.Xml

Imports System.IO


Sub ValidateSchema()

    ' Purpose: Validate a DataSet using an XSD schema file

    Dim sqlDS As DataSet



         ' Step 1: Create a blank DataSet

        sqlDS = New DataSet()


        ' Step 2: Import an XSD schema file into the DataSet

        Dim myStreamReader As StreamReader = New _




         ' Step 3: Manually add a product correctly

        Dim dr As DataRow = sqlDS.Tables(0).NewRow()

        dr("ProductID") = 200

        dr("ProductName") = "Red Hot Salsa"

        dr("Discontinued") = False



         ' Step 4: Manually add a product incorrectly

         ' by seting ProductID to a string (expects Int32)

        Dim dr As DataRow = sqlDS.Tables(0).NewRow()

        dr("ProductID") = "XJ8" ' Invalid assignment

        dr("ProductName") = "Red Hot Salsa"

        dr("Discontinued") = False



    Catch sqlError As SqlException


     End Try


End Sub

Figure 4. This listing shows how you can validate a DataSet using an XSD schema file.


Step 4 in Figure 4 generates a SqlException because the data type of the Product ID is set incorrectly, based on the schema definition. The exception message will be:


Couldn't store in ProductID Column.

Expected type is Int32.


Generate typed DataSets: A typed DataSet is a class that inherits from the DataSet object and incorporates a specific XSD schema. It provides strongly typed accessor methods that correspond to the fields in the underlying recordset. Typed DataSets reduce programming errors because they provide specific field references and enforce the data types for each field. For example, typed DataSets prevent you from assigning a string value to a date field.


The sample ASP.NET project that accompanies this article provides code listings for each of the data-access scenarios outlined in this article, including an example of a typed DataSet.


Use Typed Accessors

We've focused most of our attention on the DataSet object, but the DataReader object deserves one more mention so we can highlight a specific optimization for reading data from SQL server. If you know the underlying data type for a column, you can achieve a performance gain by using the DataReader's typed accessor methods. These methods - GetSqlDateTime, GetSqlInt32, GetSqlString, to name a few - access the column values in their native data types to offer the best performance because they reduce the amount of type-data conversion required to retrieve a column value. Also, they prevent common type-conversion errors and help keep the data precise. Here is a code example of how to use the GetSqlInt32 method:


While (sqlDR.Read())

    strResults = sqlDR.GetSqlInt32(0).ToString()

End While


For a complete list of typed accessor methods, refer to the SqlDataReader Members documentation in the .NET Framework Class Library Documentation or in MSDN Library | .NET Framework SDK | .NET Framework | Reference | Class Library | System.Data.SqlClient | SqlDataReader Class | SqlDataReader Members.


In closing, see Figure 5 for a decision-flow diagram that captures a broad range of data-access scenarios. It breaks down the decision-making process for choosing the appropriate data-access approach. Design decisions largely are dictated by the format in which data must be exchanged (that is, XML vs. non-XML), as well as whether the data access must be read-only or requires updates. This diagram can't capture all factors, but it does give some guidance when you make a final data-access design decision.


Figure 5. This decision-flow diagram for data access suggests primary and alternate data-access methods based on a particular scenario.


We've reviewed the most relevant ASP.NET data-access scenarios and demonstrated how to choose the best approach. The best way to optimize data access is to recognize the nature of the scenario at hand and have a full understanding of the relative advantages and disadvantages of a given approach. For more information, we recommend you reference the chapter on Data Access in our recent book, Performance Tuning and Optimizing ASP.NET Applications (Apress).


The sample code in this article is available for download.


Jeffrey Hasan and Kenneth Tu are technical architects and software developers who specialize in Microsoft technologies at InfoQuest Systems (, a leading provider of business intelligence applications and services for the telecommunications and broadband industries. Their primary expertise is in .NET enterprise application development, with a special focus on developing enterprise Web applications using ASP.NET. They recently co-authored Performance Tuning and Optimizing ASP.NET Applications (Apress). Read more about this and other publications at E-mail Jeffrey Hasan at mailto:[email protected].


Free Bonus Content

You can learn more about optimizing your data-access code in Chapter 3 of Jeffrey Hasan and Kenneth Tu's Performance Tuning and Optimizing ASP.NET Applications. This chapter is available free online (Adobe Acrobat format).




Hide comments


  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.