A View to Automation

Downloads
43672.zip

Think about the last time you had to update a table that contained a large number of columns. Now, imagine that the table is under construction and you're responsible for adding or deleting columns or modifying columns' data specifications. And, of course, this situation occurs when you're on a tight schedule and have other, more important problems to solve—what a headache! But you can automate a solution to this problem by combining the basic capabilities of SQL Server stored procedures with SQL Server's underlying data dictionary.

A stored procedure is a group of T-SQL statements compiled into one execution plan. Using stored procedures has several benefits. First, stored procedures let a developer efficiently reuse code. Once you create a stored procedure and store it in the database, anyone who has access to it can call the procedure any number of times, so you don't have to keep rewriting the same piece of code. The result is more productive use of a programmer's time. Second, stored procedures enable faster query execution. Because a stored procedure is precompiled and SQL Server reuses the resulting execution plan, when you use stored procedures, you get significant savings in time and overall CPU utilization. If an application repeatedly calls the same stored procedures, the application will retain those stored procedures in memory, thus saving additional time—especially when compared to the time required to call the equivalent T-SQL code many times. Third, using stored procedures reduces overall network traffic. Instead of transmitting many lines of T-SQL code, SQL Server transmits only the name of the stored procedure it's calling, thus optimizing the use of network bandwidth. Finally, using stored procedures enhances security controls. Even if users don't have permissions to access specific tables, you can grant them permissions to execute a stored procedure that references those tables. This method of limiting table access can save administrative effort.

Even though the benefits of using stored procedures are valuable, you still have to expend time and effort to write the procedures. Writing stored procedures for UPDATE, INSERT, and DELETE operations is time-consuming, and the process is full of opportunities to introduce errors. For example, a typical stored procedure for an UPDATE operation contains several parts: a CREATE PROCEDURE statement; a parameter list, which can be lengthy, depending on the number of columns in the table; an UPDATE statement; a SET list, which identifies the list of columns the stored procedure will update; and a WHERE clause to list the criteria that specify which rows the procedure will update. Similarly, a stored procedure for inserting rows into a table consists of the following parts: a CREATE PROCEDURE statement; a parameter list; an INSERT statement; a list of columns that identifies the columns the procedure will insert; and a VALUES clause to list the criteria that specify which rows the procedure will insert. Creating a stored procedure for deleting rows from a table is relatively simple: It requires only selection criteria as parameters and a WHERE clause to limit the number of rows the procedure will delete.

Now, think about the time you could save by automatically generating parameter and SET lists that contain all the information required for each column. You can make this automatic generation happen by taking advantage of the data dictionary or metadata that SQL Server stores in information schema views in the master database.

Information Schema Views

A database's information schema view contains metadata for all data objects in that database. In SQL Server 7.0, Microsoft added information schema views to the list of system objects and thus made SQL Server compliant with the ANSI SQL-92 standard. Figure 1 shows a list of the SQL Server 7.0 information schema views. These views, which are available in the master database only, are the ones that we refer to in this article. The COLUMNS view contains information about table columns and data types. The CONSTRAINT_COLUMN_USAGE and TABLE_CONSTRAINTS views contain information about constraints, constraint type, and the columns involved in a particular constraint. The constraint we're interested in is a primary key constraint. The primary key constraint columns require building the WHERE clause for UPDATE and DELETE stored procedures.

We make two important assumptions in the program that Listing 1, page 34, shows, which we use to generate stored procedures. First, we define the stored procedure name as p_<Function>_<Table Name> where the function is UPDATE, INSERT, or DELETE. If the table name contains a space, the code will replace the space with an underscore character (_). In the example code, a space in a table name such as My Table causes problems when you already have a table called My_Table because you end up with two tables that have the same name. Second, the code defines the parameter name as a concatenation of the at character (@) and column name; the code replaces spaces in the column name with underscore characters.

Walking Through the Code

The solution that Listing 1 shows generates a stored procedure that updates a table. To create a stored procedure that performs an INSERT or DELETE against the table, you use the same basic logic. You can download code to create all three types of stored procedures at http://www.sqlmag.com, InstantDoc ID 43672.

The T-SQL code at callout A in Listing 1 begins by determining whether the stored procedure we're creating already exists. If the stored procedure already exists, the code drops it. The code uses the CREATE PROCEDURE command at callout B to generate the sp_CreateUpdateProc stored procedure, which contains two parameters: the table name and the table's owner. The name of the stored procedure and the parameters it passes are commented so that you can select this commented text with a mouse rather than rekeying it, as we explain later. The block of code at callout C declares all required variables and concludes by setting the table owner to DBO if the owner was null.

Next, the code has to determine whether the object with the requested name exists and make sure the object is a table. The query at callout D accomplishes this task by retrieving the object type from the INFORMATION_SCHEMA.TABLES view. The code uses the system function Object_ID with @objectname as a parameter to determine whether the object ID exists (i.e., is not null). If the object ID doesn't exist, the code issues the error message "OBJECT @objectname DOES NOT EXIST." Similarly, the code at callout E references the system function Object_ID to determine whether @objectname is a table. If @objectname isn't a table, the code generates the error message "OBJECT @objectname IS NOT A TABLE."

In the next few steps, the code extracts column information from the INFORMATION_SCHEMA.COLUMNS view. To begin this process, the code at callout F creates a temporary table, #TB_SCHEMA, to keep this column information. The code scans this table three times to build a parameter list, a SET statement, and search arguments. Table 1 shows the structure of this temporary table.

Now, the code at callout G populates the #TB_SCHEMA table by using a SELECT statement. Note that the column PK_Flag is initialized to 0. The UPDATE statement that callout H shows sets PK_Flag to 1 for all primary key columns in the #TB_SCHEMA table. The code identifies primary key columns by querying two views: INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE and INFORMATION_SCHEMA.TABLE_CONSTRAINTS. Let's examine these two views. INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE contains constraint names and cross-referenced column names. INFORMATION_SCHEMA.TABLE_CONSTRAINTS contains information about constraint properties, including constraint type. By joining these two views to the #TB_SCHEMA table, you can identify which columns of #TB_SCHEMA belong to the primary key constraint.

The code at callout I determines whether the primary key constraint exists. If the constraint doesn't exist, the code issues the error message "Object "table_name" DOES NOT HAVE A PRIMARY KEY" The next step is to remove the columns that the code can't reference in the UPDATE statement, as the code at callout J shows. These column data types are text, ntext, image, timestamp, rowversion, and the identity column, which isn't a primary key constraint. Note that the code can reference an identity column in an UPDATE statement as a search argument, but not in a SET clause. At this point, the #TB_SCHEMA table contains all the required information for the code-generation process. In the CREATE PROCEDURE T-SQL statement that callout K shows, the generated stored procedure name is based on the table name. First, the code drops the stored procedure if it already exists. If the table name contains spaces, the code replaces those spaces with an underscore character.

Next, the code at callout L, page 36, generates the parameter list. To obtain data for the parameter list, the code extracts data from the #TB_SCHEMA temporary table it built earlier. Each row in this table corresponds to a separate parameter. You can use one of two methods to perform row-by-row processing. One approach is to declare a cursor. The second approach, which we believe is more elegant and efficient, is to generate columns that contain unique values.

As you've probably noticed, the temporary table has an identity column called RowNo. To generate those unique-value columns, callout L's code first uses the MAX() and MIN() functions to retrieve the maximum and minimum values of the RowNo column. These values will become the last and first rows in the temporary table. Then, the code retrieves row data by using the @RowNo value it just obtained for that row as a key. The code starts building the parameter list based on the data type of each column involved. The code uses the @pRowNo variable to store the current row it's processing. The final SELECT statement in callout L's code retrieves @RowNo using the MIN() function again and the current @RowNo value as the search argument where the next row number is greater than the current row number. This section of code also checks that it isn't processing the last row, and if it isn't, the code appends a comma to each line to separate the items in the list. The code continues to process the BEGIN...END block until it has no more rows to retrieve. The final steps in callout L's code generate the definition for @ERRORCODE and generate a comment line showing where to put any additional validation logic you need.

The block of code at callout M builds the UPDATE statement. Building this statement requires traversing the #TB_SCHEMA table one more time to build the WHERE clause and SET list. To build the WHERE clause, the code uses the primary key constraint as a search argument. If the column that you're updating belongs to the primary key constraint, the code appends to the WHERE clause a string in the format <column name> = @<column name>; if the column doesn't belong to the primary key constraint, the code prints the string as a part of the SET clause.

The final section of Listing 1, which callout N shows, is for error handling. This basic code contains one interesting feature: the use of the OBJECT_NAME() system function. Passing @@PROCID as a parameter to this function will return the name of the stored procedure that Listing 1 created. To provide a fully descriptive error message while still keeping the code generic, you need to display the stored procedure's name in the error message. Note that you can easily modify the error-handling code and the WHERE clause to meet your application's requirements.

Using the Application

Now, you're ready to use the application in Listing 1 to create a stored procedure that updates a table. To use the application against the Pubs database, you perform six simple steps. First, launch Query Analyzer and paste Listing 1's T-SQL code into the query window. Click the Execute Query icon or type F5 or Ctrl-E to run the code. Next, select the commented text sp_CreateUpdateProc 'authors'and right-click to execute the stored procedure that the code generates. Then, copy the results of the sp_CreateUpdateProc stored procedure from the output window into a new query window and execute the stored procedure. Under the Pubs tree view, view the stored procedures that are in Pubs and press F5 to refresh the view. The screen will display the generated dbo_p_Update_Authors stored procedure. You can now execute the generated stored procedure to perform the updates you require.

Using this T-SQL application to create stored procedures can save you a lot of time and eliminate errors when you're updating, inserting, or deleting rows in a table that has many columns. An added advantage is that when you use the generated stored procedure, SQL Server can reuse the execution plan and thus speed up your queries' execution time.

Comments

Plain text