
Storing Computations

Downloads: 22091.zip

Using computed columns makes your code more efficient

Editor's Note: Send your experts-only T-SQL tips to Itzik Ben-Gan at [email protected]. If we use your tip in the magazine, you'll receive $100 and an exclusive T-SQL Black Belt shirt.

Using expressions in queries is a common practice. But did you know that you can base a column in a query's result set on a computation rather than simply returning a base column from a table? If you use the same computation in many queries against the same table, storing the computation in the base table as a computed column shortens your queries and reduces the amount of code you have to maintain. For example, suppose you need to calculate the gross and net values of each order row in the Northwind sample database's Order Details table. You could perform the computations as part of your query as follows:

SELECT
  OrderID,
  ProductID,
  Quantity,
  Quantity * UnitPrice AS GrossValue,
  Quantity * UnitPrice * (1 - Discount) AS NetValue
FROM [Order Details]

Or, if you have many different queries that retrieve the gross and net values, you might prefer to add GrossValue and NetValue computed columns to the Order Details table:

ALTER TABLE [Order Details]
  ADD GrossValue AS Quantity * UnitPrice,
      NetValue AS Quantity * UnitPrice * (1 - Discount)

Now you can refer to the GrossValue and NetValue columns in the same way that you'd refer to any other columns in your table:

SELECT OrderID, ProductID, Quantity, GrossValue, NetValue
FROM [Order Details]

Note that adding a computed column to a table doesn't necessarily mean that SQL Server stores the computation result on disk. By default, SQL Server evaluates the computation at runtime, when the query that refers to the computed column is invoked. SQL Server 7.0 introduced computed columns; SQL Server 2000 added the ability to create indexes on computed columns, so the results can be stored on disk. An index on a computed column saves the CPU time needed to perform the calculations and allows efficient filtering and sorting where appropriate.
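
For example, if many of your queries filter or sort on the gross value, you could index the GrossValue computed column. The statement below is a sketch rather than part of the article's code; the index name is mine, and indexing a computed column also requires the expression to be deterministic and precise and certain session options (such as ARITHABORT ON) to be in effect:

CREATE INDEX idx_OrderDetails_GrossValue
  ON [Order Details](GrossValue)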

A client once presented me with an interesting problem related to computations. (I've seen similar requests in the public technical newsgroups.) The client needed to perform a different computation for each row of a certain table, with each computation involving several columns. Computed columns aren't adequate for this scenario because you can't store a different computation for each row in one computed column. To see how to solve this problem, let's look at a generic table called Computations, which you can create by running the script that Listing 1 shows.

The table's computation column, which holds a Unicode character string, will store computations that refer to any of the columns arg1, arg2, or arg3. Some examples of possible values you can store in the computation column are

N'@arg1 + @arg2 + @arg3'
N'@arg1 * @arg2 - @arg3'
N'CEILING(1. * @arg2 / @arg1)'

(The N preceding each string stands for National, denoting a Unicode literal.)

For reasons that will become apparent, I'm using variables—not just the column names—to represent the columns involved in the computations. I haven't found a way to calculate the computation results as part of a single SELECT statement without using cursors. Instead, I added a column called result to the table and wrote a trigger that, upon an INSERT or UPDATE, performs the computation and stores the result in the result column. Listing 2 shows the first part of the trigger, including the header, a couple of validity checks, and a variable-declaration section.

First, the trigger checks whether the INSERT or UPDATE operation that activated it affected any rows; if not, the trigger doesn't need to do anything more. If the operation did affect a row, the COLUMNS_UPDATED() function checks whether the arg1, arg2, arg3, or computation column was modified. The function returns a binary string in which each bit represents a column in the table. The bits representing modified columns are turned on. The integer value 120 has the fourth, fifth, sixth, and seventh bits turned on, representing the fourth through seventh columns in the table. A bitwise AND (&) computation between COLUMNS_UPDATED() and 120 results in a value greater than 0 only if at least one of these columns was modified. If neither the columns involved in the computation nor the computation itself was modified, the trigger doesn't need to proceed. The variable-declaration section includes variables to hold the values of the following columns: the key_col column, each column that might participate as an argument in the computation, the column that holds the computation itself, and the column that holds the computation result.
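
Listing 2 itself is in the downloadable code; a minimal sketch of this opening section might look like the following. The trigger name and the integer data types are my assumptions, and the mask 120 matches the column positions just described:

CREATE TRIGGER trg_Computations_ins_upd ON Computations
FOR INSERT, UPDATE
AS
-- The triggering INSERT or UPDATE affected no rows: nothing to do.
IF @@ROWCOUNT = 0
  RETURN

-- 120 covers the fourth through seventh columns (arg1, arg2, arg3,
-- computation); if none of them changed, the stored result is still valid.
IF (COLUMNS_UPDATED() & 120) = 0
  RETURN

DECLARE
  @key        AS int,            -- key_col of the current row
  @arg1       AS int,            -- argument columns (types assumed)
  @arg2       AS int,
  @arg3       AS int,
  @comp       AS nvarchar(4000), -- the computation to execute
  @out_result AS int             -- receives the computation result

-- The loop over Inserted (sketched below) goes here.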

Next, the trigger performs a loop that iterates through all rows in the Inserted table, which holds all rows that were inserted or modified in the base table, Computations. Listing 3 shows the loop's code (excluding the body, which I discuss in the next paragraph). The code first initializes the variable @key with the minimum key_col value in the Inserted table. The loop iterates through all keys by setting the value of @key to the Inserted table's next minimum key_col value that's greater than the current value of @key.
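
Listing 3 is also in the downloadable code; a sketch of that iteration pattern, continuing the trigger fragment above, might look like this:

-- Start with the lowest key in Inserted, then advance to the next-higher key
-- on each pass; the loop ends when no higher key remains.
SET @key = (SELECT MIN(key_col) FROM Inserted)

WHILE @key IS NOT NULL
BEGIN
  -- body of the loop (sketched below)

  SET @key = (SELECT MIN(key_col) FROM Inserted WHERE key_col > @key)
END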

The body of the loop, which Listing 4 shows, performs the main computational activity. The code fetches the values of the input arguments and the computation from the current row in Inserted and stores these values in local variables. Note that the computation is prefixed with 'SET @result = ', meaning that the trigger will store the computation result in a variable that's used as an output parameter.
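
A sketch of this first part of the loop body (again, not the actual Listing 4) might look like this:

-- Fetch the current row's arguments and computation from Inserted, prefixing
-- the computation so that executing it assigns the value to @result.
SELECT
  @arg1 = arg1,
  @arg2 = arg2,
  @arg3 = arg3,
  @comp = N'SET @result = ' + computation
FROM Inserted
WHERE key_col = @key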

Now the code needs to dynamically perform the computation stored in the @comp variable and pass the value from the @result variable inside the computation to the trigger's @out_result variable. To achieve this result, you can use an undocumented feature of the sp_executesql system stored procedure, which lets you use output parameters. SQL Server MVP Umachandar Jayachandran discovered this undocumented feature by looking at the procedure's code in the master database. The script that Listing 5 shows demonstrates how sp_executesql dynamically performs a computation and returns an output parameter. The script's simple example uses sp_executesql to calculate the product of two input arguments and stores the result in an output parameter called @r.

The first argument that sp_executesql accepts is the statement that you want to execute dynamically. The second argument is a string that holds the list of input and output parameters that the statement uses. All the other arguments are assignments of values to those parameters. Note that the output parameter inside the statement is called @r, and the variable that the calling batch binds to it is called @r_out. You have to specify the OUTPUT keyword in both places, when you declare @r in the parameter list and when you assign @r_out to it in the call, to indicate an output parameter.
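
Listing 5 is in the downloadable code as well; a stand-alone sketch of the technique it demonstrates follows. The input parameter names (@p1, @p2) and the literal values are mine; only @r and @r_out come from the description above:

DECLARE @r_out AS int

-- Execute the statement dynamically; @r is declared as an OUTPUT parameter in
-- the parameter list, and the calling batch's @r_out is bound to it with OUTPUT.
EXEC sp_executesql
  N'SET @r = @p1 * @p2',
  N'@p1 int, @p2 int, @r int OUTPUT',
  @p1 = 6, @p2 = 7, @r = @r_out OUTPUT

SELECT @r_out AS r  -- returns 42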

In Listing 4's code, the trigger uses sp_executesql to dynamically execute the statement stored in the @comp variable. The trigger assigns to the statement's arguments the values fetched from the Inserted table and stores the results in the variable @out_result. Then the trigger code simply has to update the result column in the relevant row in the Computations table with the value stored in @out_result.
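
A sketch of that remaining part of the loop body, continuing the earlier fragments and assuming the same integer types, might look like this:

-- Evaluate the computation dynamically; @result inside the statement is bound
-- to the trigger's @out_result variable through the OUTPUT parameter.
EXEC sp_executesql
  @comp,
  N'@arg1 int, @arg2 int, @arg3 int, @result int OUTPUT',
  @arg1 = @arg1, @arg2 = @arg2, @arg3 = @arg3, @result = @out_result OUTPUT

-- Write the result back to the row that the current iteration handles.
UPDATE Computations
  SET result = @out_result
WHERE key_col = @key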

After creating the trigger, you can test this solution by running the script that Listing 6 shows. This script inserts three rows into the Computations table and displays the table's contents, which Table 1 shows. If you update the computation or argument columns, the trigger updates the result column accordingly. For example, the following code adds 10 to the arg1 column in all rows, then displays the contents of the Computations table:

UPDATE Computations
  SET arg1 = arg1 + 10
  
SELECT * FROM Computations

Table 2 shows this query's output.

The sp_executesql system stored procedure lets you perform computations dynamically and even use output parameters. Without the ability to perform the computations dynamically, maintaining a different computation for each row in the table would be unnecessarily complicated. You'd have to write your own parser as a user-defined function (UDF) or in the client application, which isn't a simple task. One great benefit of using sp_executesql as in the above example is that you can store any valid T-SQL expression as a computation, including those that use built-in T-SQL functions.
