In "Using Information Schema Views," February 2002, I showed you a partial solution for using information schema views to obtain a report of all the foreign keys in a database when you want to see all the relationships between your tables. I assumed I should easily be able to do this with one or more of the information schema views. My original solution used three different views and a five-way join. However, this solution wasn't the best solution nor was it correct in all situations.
In last month's column, I used the code that Listing 1 shows to create two tables: tab1 and tab2. Tab1 has a composite primary key made up of columns a and b, and tab2 has a composite foreign key made up of columns c and d. Then, I used the query that Listing 2 shows to find all the foreign keys and their corresponding primary keys. Listing 2 qualifies the query by adding a WHERE condition so that the query accesses only foreign keys in which the referencing table is tab2. Figure 1 shows the results of the query.
However, because the keys are composite, this solution doesn't work. The solution shows each column of tab2's foreign key with each column of tab1's primary key, as if I were doing a cross join. I want to match the first column in tab1's primary key with the first column in tab2's foreign key and match the second column in tab1's primary key with the second column in tab2's foreign key. Figure 2 shows the result that I want.
To obtain this desired result, I need another condition that will return only rows in which the column positions are the same. In the tables, columns c and a are in position 1, and columns d and b are in position 2. When I wrote last month's column, I could think of no way to produce the desired result by using only information schema views. But contributing editor Itzik Ben-Gan, a colleague and technical reviewer for T-SQL Solutions, supplied a solution, which Listing 3 shows. Ben-Gan's solution uses the KEY_COLUMN_USAGE information schema view, which contains a column called ORDINAL_POSITION and also contains the table name and the column name that the key constraint is attached to. Without this information schema view, I had to join the REFERENTIAL_CONSTRAINTS view to the CONSTRAINT_TABLE_USAGE view to get the table name, then join REFERENTIAL_CONSTRAINTS to the CONSTRAINT_COLUMN_USAGE view to get the column name. With Ben-Gan's solution, one join produces both pieces of information for a table. In my solution, I had to join each of the CONSTRAINT views to the referencing table and then to the referenced table. Ben-Gan's solution joins REFERENTIAL_CONSTRAINTS to KEY_COLUMN_USAGE twice, which produces a straightforward three-way join. Thus, Ben-Gan's solution reduces the number of joins by two.
Although Ben-Gan's solution is designed to work for composite foreign keys, it also works for single-column keys because if a key contains only one column, the ordinal position of the key will be 1 in both tables.
Looking for an Impossible Example
Although I was glad to find a more efficient solution to my problem, I was worried. In this column, I had planned to show you a solution to the compositeforeign key problem that uses the sysforeignkeys system table; I thought this solution was an example of a situation in which you have to directly access a system table. I had a couple of other possible examples in mind also, but I started to wonder whether I could solve all of my example problems by using information schema views. However, I've found one problem that I'm reasonably confident I can't solve by using a Microsoft-supplied method: obtaining the definition of a computed column.
In SQL Server 2000 and 7.0, you can define a column in a table as a computed column. The column definition is based on a computation and is recomputed each time a query accesses that column. For example, in the Pubs database, the titles table contains price and sales information for each book in the database. If you want to store that information—along with the gross sales for each book—in a separate table, you can create the table and populate it as Listing 4 shows.
Although the priceinfo table in Listing 4 is defined with four columns, the INSERT statement inserts values only for three of those columns. The fourth column, gross, is the computed column and has no value stored in the table. However, every time you select from the priceinfo table, SQL Server computes the values in the gross column and makes those values available to your query.
When you use the sp_help system stored procedure to look at this table's definition, you'll see that this column is a computed column, but you can't see what the definition is. You can get the definition by using Enterprise Manager's Design Table window—but what if you want to get the definition programmatically? The code that makes up the column's definition is stored in the syscomments system table. Syscomments has an ID column that shows the ID of the object it's storing definition information for. The number column usually refers to a version of a stored procedure or trigger, but if the object is a table, the number column represents the ID of the column that has a definition. You can use the syscolumns system table or the COLUMNS information schema view to find the name of the column that has that column ID. The syscolumns system table also has a column called IsComputed, which has a value of either 0 or 1 that denotes whether the column is computed.
I did my best to verify that I couldn't use an information schema view to find the definition of a computed column. Because the syscomments table also contains the definitions of stored procedures, functions, and views, I looked for an information schema view that contained these definitions and found one called ROUTINES. However, when I examined ROUTINES, I found that it contains nothing but stored procedures and functions—it contains no computed column definitions. An information schema view called VIEWS contains the definitions of all views, but I couldn't find any computed columns in VIEWS either. Then, I widened my search. Because the columns containing the definitions of functions, stored procedures, and views are all defined as data type nvarchar(4000)—the largest possible nvarchar column—I used the following query to search the COLUMNS information schema view for all columns defined as data type nvarchar(4000):
SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE DATA_TYPE = 'nvarchar' AND CHARACTER_MAXIMUM_LENGTH = 4000
I found view and routine definitions, and I saw that column defaults and check constraints were defined as nvarchar(4000); but I found no information about computed columns. I finally realized that because computed columns aren't ANSI standard, their definitions might not have a place in the ANSI-defined information schema views. Using the syscomments system table might be the only way to retrieve the definition of a computed column. In support of my conclusion, the COLUMNS information schema view contains no information about whether a column is a computed column, but the system table syscolumns does contain that information.
The code that Listing 5 shows returns the names of all a database's computed columns and their definitions. The output from this query shows that the syscolumns.IsComputed column is actually a computed column. The column definition shows that SQL Server computes the column's value by examining a bit in the syscolumns.colstat column. The following expression returns a 1 if colstat has the 4 bit on and a 0 if the 4 bit is off:
(CONVERT(int,sign((\[colstat\] & 4))))
You could use the code in Listing 5 to create a stored procedure that returns a computed column, then you could pass in values for the table name and column name. The procedure would look similar to the code that Listing 6 shows. But this code is only an example; to use this procedure in production code, you need to add error checking to verify that only real table and column names were passed in.
Information schema views in SQL Server 2000 are powerful; by using them, you can get most of the information you want without accessing the system tables, a practice that Microsoft discourages. When you learn to properly use system and property functions and system stored procedures, you reduce your need to access the system tables even further. However, I don't feel comfortable telling you to ignore the system tables entirely; a few kinds of information are still available only through the system tables. Even if you can write most of the code you need to access system information and metadata without using the system tables, the system tables still contain a wealth of information about how SQL Server keeps track of metadata. Also, some clever and useful T-SQL coding examples are in the system stored procedures, and many of these examples access the system tables. To understand how system stored procedures perform their magic, you must understand the system tables.
If you want to be the best SQL Server administrator you can be, learn all that you can about how SQL Server stores its system information. Explore all the system procedures, system functions, and information schema views, and become familiar with the information in the system tables. Such essential, in-depth knowledge might be what you need someday to solve a baffling puzzle in your SQL Server system.