asp:cover story
LANGUAGES: VB .NET
ASP.NET VERSIONS: 1.0 | 1.1
Break Through the Data-Access Maze
Choose the smartest path for your data-access challenges.
By Jeffrey Hasan and Kenneth Tu
ASP.NET applications present unique challenges for data access, particularly when they include or access XML Web Services. ADO.NET is a data-access technology that's, by now, familiar to most .NET developers. Its usage is well documented in numerous books and articles, and there's little new to add as far as introducing the technology. What's less documented, though, is how to choose the best data-access option among the often confusing array of choices ADO.NET provides. The best choice usually depends on two factors: the type of data being retrieved (or updated), and the required transport mechanism for returning query (or update) results back to the calling application.
For example, consider one simple scenario: code behind a Web page that queries a table and binds the results to a DataGrid on the page. In this case, the query result may be retrieved in any way conducive to binding to a DataGrid, including as a DataSet, DataReader, or XML. The query result doesn't need to be transported or serialized, so the data-access options are flexible.
Now consider a more complex scenario: A Web method queries a table and returns the results over the wire as XML. In this case, the result must be serialized to XML, so the data-access options are more limited; a DataSet is a good choice because it serializes easily to XML. But a DataReader does not, so it's not an option.
At the simplest level, you can optimize data access by considering how data needs to be retrieved, used, and transported. Later in this article, we'll provide a decision-support flow diagram for data access, which can help you choose the best data-access approach for a given scenario.
ASP.NET applications - especially XML Web Services - present challenging data-access scenarios that require difficult designs. In this article, we'll discuss the best ways to access data from SQL Server databases and present a decision-support system for choosing optimal data-access approaches.
Connect to SQL Server 2000
The .NET Framework provides several data providers optimized for specific data sources. The SQL Server .NET Data Provider is best for SQL Server 7.0 and later because it communicates using SQL Server's native Tabular Data Stream protocol. This data provider is 30 to 40 percent faster than a comparable OLE DB data provider because it avoids using an intermediate OLE DB layer. The SQL Server data provider is included in the System.Data.SqlClient namespace. (For earlier versions of SQL Server you must use .NET's standard OLE DB data provider, which is included in the System.Data.OleDbClient namespace.)
Before you can access data, you must open an optimized connection to the data source. For SQL Server data sources, you can do this in two ways. You could set connection-string parameters that specify the data source address and the transaction context, or you could implement connection pooling to reuse existing connections. Here's an example of an optimized connection string for the Northwind database in a SQL Server instance named MySQLDB:
server=192.168.1.1\MySQLDB;uid=sa;pwd=jy87s5;
database=northwind; enlist=false;
packet size=8192;connect timeout=300;
Figure 1 provides optimization details for several important connection-string parameters.
Name |
Default |
Description |
Server |
- |
The name or network address of the SQL Server instance to which you connect. Use an IP address instead of a DNS name; this avoids name resolution, which increases the time necessary to make the connection. |
Enlist |
'true' |
When 'true', the connection pooler enlists the connection automatically in the creation thread's current transaction context. If the application is not using transactions, set to 'false' for better performance. |
Packet Size |
8192 |
Size, in bytes, of the network packets used to communicate with an instance of SQL Server. When used with SQL Server, you can fine-tune this parameter for communication performance by adjusting this value based on the size of the data packets being transferred between client and server. Range is 512-32767 bytes. |
Connect Timeout |
- |
This parameter limits the wait time when a connection attempt fails. Also improves overall performance by avoiding deadlocked connections that are unavailable to participate in a connection pool. |
Figure 1. Here's a summary of optimized parameter settings for a SQL Server database connection string.
Some debate exists about the best location for storing database connection strings in ASP.NET applications. Our preference is to store them as custom application settings in the web.config file:
value="
server=192.168.1.1\MySQLDB;uid=sa; pwd=jy87s5;database=northwind; enlist=false;packet size=8192; connect
timeout=300;" /> Then you can retrieve the connection string from the code
behind: Imports System.Data.SqlClient Dim strConn As String = _ ConfigurationSettings.AppSettings("ConnectionString") Dim sqlConn As SqlConnection = New SqlConnection(strConn) sqlConn.Open() This approach provides the best flexibility because you
can customize the connection string for separate deployments of the same
application without rebuilds. For example, the staging copy of an application
can point to a development database, whereas the production copy of an
application can point to the production database. ASP.NET caches the web.config
file, which minimizes lookup times on the connection string. Pool Your Connections ASP.NET applications use database connections inherently
for small durations of time. These applications typically open a connection,
execute a query, return a response quickly, and release the connection back to
the pool for other instances to use as soon as possible. The default number of
connections is 100, so if the connections are released back to the pool
properly after usage, the default setting should be more than enough for an
ASP.NET application. You can adjust this number to meet the needs of the
application. Be forewarned - setting this number too low could cause an empty
connection pool, such that further requests for a connection will hang your
application until one becomes available in the pool. To release a connection
properly, invoke the Connection object's Close method, which releases the
resources used by the connection and returns the connection back to the pool.
Figure 2 provides optimization details for several important connection-string
parameters related to connection pooling. Name Default Description Connection Lifetime 0 The connection lifetime value for each connection
returned to the pool. A value of zero (0) causes pooled connections to have
the maximum timeout. Connection Reset 'true' When set to 'true', the database connection is reset
when removed from the pool. Max Pool Size 100 The maximum number of connections allowed in the pool. Min Pool Size 0 The minimum number of connections maintained in the
pool. Pooling 'true' When 'true', the connection is drawn from the
appropriate pool or, if necessary, created and added to the appropriate pool. Figure 2. Here's a summary of optimized
connection-string parameter settings for SQL Server database-connection
pooling. Now that you can optimize your database connections, let's
look at the two main data-access objects the .NET Framework provides: the
DataReader and DataSet objects. We'll discuss their relative advantages and how
to pick the right object for a specific ASP.NET data-access scenario. DataReader vs. DataSet The DataReader object provides an unbuffered, sequential
data stream for fast, efficient, read-only access to a set of records. The
DataSet object takes a virtual snapshot of a database and stores it as a
disconnected, in-memory representation of a collection of data tables,
relations, and constraints. Most importantly, the DataSet is integrated tightly
with XML, and it can serialize relational data to and from XML while preserving
both data and schema information. The DataSet object handles more complex data-access
scenarios and provides more data-transport options compared to a DataReader
object. A big advantage of the DataSet is it may be disconnected from the
underlying data source and operated on without a persistent database
connection. Also, it can perform complex operations on data - such as sorting,
filtering, and tracking updates - without requiring custom code. Unlike the
DataReader, the DataSet may be cached in memory and retrieved at any time. It
also can provide multiple, alternative views of the same data, where each
alternate view is stored as a subset of records in separate DataView objects.
Finally, the DataSet offers the distinct advantage of serializing its
relational data and structures to XML automatically. This lets a client
application receive serialized relational data from a Web service method as
XML. The data then can be transported across the wire and hydrated on the
client into a new DataSet object. The main disadvantage of the DataSet object is its
relatively heavy use of server resources, including memory and processing time,
compared to the DataReader. The DataSet can't compete with the efficiency of
the DataReader for read-only operations, but it does provide multiple
capabilities that the DataReader does not. The DataReader and DataSet objects provide distinct
advantages for different data-access scenarios. Figure 3 highlights the three
most common data-access scenarios in ASP.NET applications: read-only, update,
and data delivery. The DataReader object is appropriate for the first scenario,
which requires fast, read-only data access without caching or serialization.
DataReader objects aren't designed for transport, nor can they operate without a
persistent database connection. The DataSet object is appropriate for the
second and third scenarios because it supports update data access and enables
relational data to be represented and transported as XML. XML Support in the DataSet Object Three important scenarios might require you to serialize
relational data from a DataSet to XML:
Exchanging relational data with XML Web Services, which
includes both receiving XML from a Web method as well as passing XML back to a
Web method;
validating data against a specific XSD schema;
and generating typed DataSets. The DataSet object provides
support for all of these scenarios. We'll briefly consider each in turn. Serialize a DataSet as XML: The .NET Framework
makes it trivial to exchange a DataSet between a Web service and a client
application because it serializes the DataSet to XML automatically. The Web
method's return argument simply needs to be a DataSet type: ' Code to generate a hydrated DataSet End Function This approach works if you don't need to work directly
with the DataSet's XML. But if you do, the DataSet provides good support for
serializing its data out to XML. The DataSet provides a method named GetXml,
which generates an XML representation of the relational data automatically in a
hydrated DataSet. In addition, the DataSet object provides a method named
GetXmlSchema, which generates schema-only information without the actual data.
This listing shows an example of how to write a DataSet's schema information to
a file: Imports System.IO Sub GenerateSchemaFile(sqlDS As DataSet) Try Dim xsdFile As
File If
xsdFile.Exists("C:\temp\products.xsd") Then _ xsdFile.Delete("C:\temp\products.xsd") Dim sw As StreamWriter sw = New
StreamWriter("C:\temp\products.xsd") sqlDS.Namespace =
"urn:products-schema" sw.Write(sqlDS.GetXmlSchema) sw.Close Finally End Try End Sub Validate a DataSet against a specific XSD schema:
An XSD schema file describes the structure of a relational data set fully,
including the data types it contains. XSD schema files are excellent validation
tools to ensure a DataSet contains valid relational data. This is especially
useful when you are adding additional records to a DataSet and need to verify
that both the data types and field names are valid. The DataSet provides a
method named ReadXmlSchema for importing an XSD schema file. If an invalid data
type or field name is assigned, the DataSet will raise a specific SqlException.
Figure 4 shows an example based on the Products table in the Northwind
database. Imports System.Data Imports System.Data.SqlClient Imports System.Xml Imports System.IO Sub ValidateSchema() ' Purpose: Validate a
DataSet using an XSD schema file Dim sqlDS As DataSet Try ' Step 1: Create
a blank DataSet sqlDS = New
DataSet() ' Step 2: Import an XSD schema file into the DataSet Dim myStreamReader As StreamReader = New _ StreamReader("c:\temp\products.xsd") sqlDS.ReadXmlSchema(myStreamReader) ' Step 3: Manually add a product correctly Dim dr As DataRow
= sqlDS.Tables(0).NewRow() dr("ProductID") = 200 dr("ProductName") = "Red Hot Salsa" dr("Discontinued") = False sqlDS.Tables(0).Rows.Add(dr) ' Step 4: Manually add a product incorrectly ' by seting
ProductID to a string (expects Int32) Dim dr As DataRow
= sqlDS.Tables(0).NewRow() dr("ProductID") = "XJ8" ' Invalid assignment dr("ProductName") = "Red Hot Salsa" dr("Discontinued") = False sqlDS.Tables(0).Rows.Add(dr) Catch sqlError As SqlException Response.Write(sqlError.Message) End Try End Sub Figure 4. This listing shows how you can validate a
DataSet using an XSD schema file. Step 4 in Figure 4 generates a SqlException because the
data type of the Product ID is set incorrectly, based on the schema definition.
The exception message will be: Couldn't store Expected type is Int32. Generate typed DataSets: A typed DataSet is a class
that inherits from the DataSet object and incorporates a specific XSD schema.
It provides strongly typed accessor methods that correspond to the fields in
the underlying recordset. Typed DataSets reduce programming errors because they
provide specific field references and enforce the data types for each field.
For example, typed DataSets prevent you from assigning a string value to a date
field. The sample ASP.NET project that accompanies this article
provides code listings for each of the data-access scenarios outlined in this
article, including an example of a typed DataSet. Use Typed Accessors We've focused most of our attention on the DataSet object,
but the DataReader object deserves one more mention so we can highlight a
specific optimization for reading data from SQL server. If you know the
underlying data type for a column, you can achieve a performance gain by using
the DataReader's typed accessor methods. These methods - GetSqlDateTime,
GetSqlInt32, GetSqlString, to name a few - access the column values in their
native data types to offer the best performance because they reduce the amount
of type-data conversion required to retrieve a column value. Also, they prevent
common type-conversion errors and help keep the data precise. Here is a code
example of how to use the GetSqlInt32 method: While (sqlDR.Read()) strResults =
sqlDR.GetSqlInt32(0).ToString() End While For a complete list of typed accessor methods, refer to
the SqlDataReader Members documentation in the .NET Framework Class Library
Documentation or in MSDN Library | .NET Framework SDK | .NET Framework |
Reference | Class Library | System.Data.SqlClient | SqlDataReader Class |
SqlDataReader Members. In closing, see Figure 5 for a decision-flow diagram that
captures a broad range of data-access scenarios. It breaks down the
decision-making process for choosing the appropriate data-access approach. Design
decisions largely are dictated by the format in which data must be exchanged
(that is, XML vs. non-XML), as well as whether the data access must be
read-only or requires updates. This diagram can't capture all factors, but it
does give some guidance when you make a final data-access design decision. We've reviewed the most relevant ASP.NET data-access
scenarios and demonstrated how to choose the best approach. The best way to
optimize data access is to recognize the nature of the scenario at hand and
have a full understanding of the relative advantages and disadvantages of a
given approach. For more information, we recommend you reference the chapter on
Data Access in our recent book, Performance Tuning and Optimizing ASP.NET
Applications (Apress). The sample code in this
article is available for download. Jeffrey Hasan and Kenneth Tu are technical
architects and software developers who specialize in Microsoft technologies at
InfoQuest Systems (http://www.infoquest.tv),
a leading provider of business intelligence applications and services for the
telecommunications and broadband industries. Their primary expertise is in .NET
enterprise application development, with a special focus on developing
enterprise Web applications using ASP.NET. They recently co-authored Performance Tuning and Optimizing ASP.NET
Applications (Apress). Read more about this and other publications at http://www.asptechnology.net. E-mail Jeffrey Hasan at mailto:[email protected]. You can learn more about optimizing your data-access code
in Chapter 3 of Jeffrey Hasan and Kenneth Tu's Performance Tuning and
Optimizing ASP.NET Applications. This
chapter is available free online (Adobe Acrobat format).
Figure 3. Here are the three most common data-access scenarios in
ASP.NET applications: read-only data access, update data access, and data
delivery from a Web Service method. This diagram shows the interactions between
database, Web server, and client.
Figure 5. This decision-flow diagram for data access suggests primary
and alternate data-access methods based on a particular scenario.Free Bonus Content