Imagine this scenario: You're the DBA for your company's transactional production database, and you're preparing to extract data from the database so that you can create a report for one of your company's project managers. However, you can't decide whether you should use the data from table PMBPMSRC or from table PMBPMSTRT because they contain the same two fields: OBJT and STRN. When you ask the manager who requisitioned the report which table to use, she doesn't know, either. But in the end, knowing how the data is stored in the database isn't her job—it's yours. And you realize that for assignments like this, you need a business metadata repository that contains information such as definitions and descriptions of each table and column in your database.
A metadata repository is like a Webster's Dictionary for your database. But instead of buying it at the bookstore, you have to build it yourself as you create and populate the tables in your database. If a proper business metadata repository were in place, determining the difference between the PMBPMSRC and PMBPMSTRT tables would be simple and you would be spared long hours working on your report.
Above and Beyond the Call of Data
Metadata is a familiar commodity; you've probably heard it described as "data about data." SQL Server Books Online (BOL) defines metadata as "information about the properties of data, such as the type of data in a column (e.g., numeric, text) or the length of a column. It can also be information about the structure of data or information that specifies the design of objects such as cubes or dimensions."
The BOL definition addresses technical metadata, which doesn't help you determine the difference between PMBPMSRC and PMBPMSTRT. You need a business metadata repository, which, as far as I know, isn't part of the SQL Server package.
Your business metadata repository should be a full description of the data in your database. It should describe where the data came from (its source or lineage), how it entered the database, what processes captured and stored the data, what the data represents, how and by whom the data will be used, and the relationships between data items in the database.
Figure 1, page 47, shows SQL Server's Meta Data Services SQL Repository, an object-oriented repository technology that you can integrate with enterprise information systems or with applications that process metadata. The SQL Repository contains information about data types and lengths, primary keys, foreign key relationships, and object properties and methods, but it isn't a business metadata repository. When you look at the Bin field (child object of the CELLAR table) in the properties window that Figure 1 shows, you can see that this level of technical information is valuable to a programmer. However, this type of metadata isn't useful for a nontechnical business user. The data is accessible only through Enterprise Manager or through a similar custom solution that your programming staff creates. And even if the data were readily accessible, the object terminology and cryptic nature of the hierarchical layout would test the patience of even the most organized business user.
Let's look at an alternative to using the Meta Data Services SQL Repository for business metadata. This solution uses a new feature of SQL Server 2000 called extended properties. By creating a set of extended properties for each table and for each column within a table, you can document criteria such as where, when, and how data was captured; where the data came from and whether any processing was done to it before it was put into storage in the database; and whether a data item is still active or another data item has replaced it. You can also add field captions in multiple languages for forms and reports.
If you use extended properties, the metadata possibilities are endless. For more information about extended properties, see Bill Vaughn's "Managing Extended Properties," July 2001, InstantDoc ID 20886, and Brian Moran's "Tip: Managing Data Dictionaries Based on SQL Server 2000 Extended Properties," November 2001, InstantDoc ID 22399. Moran's tip includes code written by SQL Server MVP Bob Pfeiff that creates extended-property fields for each table or column object in the database so that developers can fill in the extended-property field values as they need them. For this article's solution, I altered Pfeiff's code by adding extra extended properties.
Web Listing 1 creates the sample database for this article. (To download this article's Web listings, enter InstantDoc ID 26273 at http://www.sqlmag.com and click Download the code.) Web Listing 2 creates the usp_CreateExtendedProperties stored procedure, which appends extended properties to each table and column of the database that it is executed in, standardizes the extended-property names that are in use, and creates a business metadata repository. You can use this modified version of Pfeiff's code to create your own business metadata repository for the database that you created in Web Listing 1.
The code in Web Listing 2 retrieves a list of all user table names and column names from the user tables, then attaches to each user table and each user column the following 11 extended properties:
- CreationDate: the date the database object was added
- CreatedBy: the name of the person who added the database object
- MS_Description: the Description property from Enterprise Manager's Table Design window
- UsedFor: the main reason the data value is in the database
- DataSource: the origin of the data value
- SourceApplication: the name of the program that captured the data
- ProcNeeded4Storage: the name of the program that processes the data after capture but before storage
- StillActive: the database object's activity status
- ReplacedBy: the name of the database object that replaced an inactive database object noted by StillActive
- EnglishLabel and FrenchLabel: examples of how to store field labels in multiple languages
When you execute Web Listing 2's stored procedure, all the tables and columns in the database acquire the 11 extended properties. To see the extended properties, you need to open Query Analyzer and expand the tree until you can see the database that you've modified. Then, just place the cursor over a table or column name, right-click, and choose the Extended Properties option. Figure 2 shows the extended properties for the EMPLOYEE table.
You can add values to the extended properties in three ways. You can use Figure 2's GUI display, you can access the extended properties through Visual Basic (VB) code (your developers will probably use this method), or you can use the code that Listing 1 shows to make global changes to the extended-property values.
Note that Listing 1 inserts a data value (the first argument) into every occurrence of the extended property (the second argument) in the database. And the stored procedure that the listing creates overwrites a preexisting extended-property value. Listing 2 shows two examples of how you can use the fn_listextendedproperty() function to view the extended-property values you created. Listing 2's third example, a code statement from BOL, contains a bug and doesn't work.
Metadata for All
Business metadata offers a vital resource to everyone in the office, but simply storing the metadata in the database isn't a sufficient business solution. You need to develop a scheme to distribute the metadata to all your users, technical and nontechnical alike. You should also document your transactional database metadata as frequently as you perform backups and performance-tuning checks. If you follow this easy plan, the next time you're asked to create a report for one of your company's project managers, you'll know that you should use the data from the PMBPMSRC table.