Every once in a while, I feel the urge to retreat from my career life and spend time on projects closer to home. So a few months ago, I decided to create a database for my music CD collection. During the course of the task, I discovered that design limitations are everywhere. Let me share with you a few of my experiences.
Finding a Sample Database
Inventory-CD day started out bright and early. I was full of energy and optimism, thinking that if I worked diligently at cataloging my CD collection, I would be able to finish the project in 2 days. I jumped into the job, searching first for a ready-made database that would help speed up the inventory process. The Microsoft Access sample databases in the Office 97 installation package fit my needs. I selected Music Collection (one of the sample databases), then sat back and waited while the Create Database wizard generated the database and all the forms and reports for me. What a life, I thought. Now all I have to do is start loading data into the database.
Figure 1 shows the Main Switchboard window that opened at the end of the database-generation sequence. The Switchboard is a form that contains command buttons for each major function of an Access application; it's equivalent to the top level of an application's user menu. I clicked Enter/View Recordings to open the main form, Recordings, and started entering my data. Interestingly, even though you might think that a recording is a single track of an album, the template database uses the term "recording" to mean the entire album or CD. For each CD, I entered such information as recording title, artist, music category, and release year, as Figure 2 shows. The form even includes fields for individual track number, title, and duration. Recordings-to-tracks, a simple one-to-many (1:M) relationship—what more could I want? I was about to find out.
Hitting a Snag
I zipped along, cataloging CD after CD, then encountered a problem that brought me to a screeching halt. As I began to enter information for a new CD, I realized that it was the one CD in a hundred that challenged the Music Collection's database design. It was a compilation CD, a CD that had a different recording artist for each track.
This column is about database design, so it has to include a lesson and a happy ending. And sure enough, it does. Let's do the lesson first.
Learning the Lesson
Microsoft designed the Music Collection database for a typical CD that showcases one artist, one of the artist's many recordings, and one recording's many tracks. Figure 3 shows the physical data model that represents the two 1:M relationships. (If you're following along with your own copy of Access, from the menu bar choose Tools, Relationships to open this relationship window.) Microsoft's design resembles my own quick-and-dirty designs that help me get moving on a project. However, as this article's problem illustrates, I usually have to go back and restructure the physical data model to accommodate an atypical case—in this example, a compilation CD that has one recording, many tracks, and a different artist for each track. Now I have to modify the database before I can get back to data entry.
Let's start first with requirements, which you should collect and analyze before you begin to design your database. A music collection database should be able to completely inventory your music so that you can search by artist, by title, or by category. Then, you can assemble a playlist from your findings. For example, if you want to construct a playlist of all Alasdair Fraser tracks, you should be able to do that, even though the tracks might be scattered over many CDs, including compilation CDs.
Now that I know what I want the database to do, how can I make it happen? Fortunately, the Music Collection database is easy to reengineer. Take another look at Figure 3, the physical model of the original design. This model shows relationships that represent one artist to many recordings and one recording to many tracks. But I need a model that shows one recording to many tracks and one track to one artist, so I have to rework the model until it incorporates these relationships.
Redesigning the Database
Figure 4 shows the reengineered physical model that gives me the flexibility to conduct the searches that I want. In the revised design, Recordings and Recording Artists are in a many-to-many (M:N) relationship that intersects at Tracks. The revised model's components tell me that one recording has many (one or more) tracks, and that one artist sings or plays many tracks. But wait—do the model relationships represent the one-track, one-artist condition that I needed? Yes, because if I read the design from Tracks to Recording Artists, the design reads that one artist performs on one track.
As you analyze most database designs, you find design problems that prevent you from doing something that you need to do, that restrict what your company can do, or that restrain your organization from moving in a potentially profitable direction. Then, as you resolve these discrepancies, you often find that you need to analyze and redesign your database repeatedly until it's so complex that you must rewrite all the production applications that use it. Case in point: I could analyze and redesign the sample music database to address yet another complication—songs that duos or trios have recorded. I'm not talking about groups, but about such combinations as Nat King Cole and Natalie Cole, Barbra Streisand and Kris Kristofferson, James Galway and the Chieftains, and the trio of Linda Ronstadt, Dolly Parton, and Emmylou Harris. If you were a purist, you would argue that my redesign doesn't go far enough—after all, more than one artist can perform on a single track. The ultimate solution would be to make Tracks-to-Recording Artists a M:N relationship. But the ultimate solution is too detailed for the purposes of this article, so I'm going to stop at the redesign that Figure 4 shows. Now I need to make a few adjustments to the forms that accompanied the sample database so that I can continue entering data.
Figure 5 shows the new version of the Recordings form. Making the necessary changes didn't take long—less than 30 minutes. Compare Figure 5 to Figure 2, the original Recordings form. The biggest change is in the Recording Artist's position on the form. In Figure 2, Recording Artist was part of the main form. In Figure 5, RecordingArtistID and Artist Name are part of the Track subform. And, despite appearances, I haven't introduced redundancy into the Track table. The Recording Artist table populates the Artist Name field based on whichever RecordingArtistID value I choose. For readers who are familiar with Access code and combo boxes, the control source for the Artist Name field looks like this:
When I enter track data into the subform, I use the combo box to select the artist (the drop-down list displays both Recording-ArtistID and Artist Name). The Track table stores the RecordingArtistID, and the Recording Artist table produces the Artist Name.
If you think that I'm being paranoid about careful design, consider this: The sample Music Collection database that wouldn't work for my own personal needs is now part of the Windows 2000 collection of applets, proving that inadequate database design will find its way into production systems everywhere. If you have a Win2K computer and a compilation CD, place the CD in your CD-ROM drive. The Win2K CD Player reads the unique identifier on your CD and automatically connects your computer to Microsoft's Internet-based index of CD and track information. The player downloads CD and track information onto your hard disk and into an .mdb file that looks amazingly like the sample Music Collection database. When I place my compilation CD in my CD-ROM drive, the Recording Artist entry for the CD appears as Various Artists. If you're interested in checking out the .mdb file, it's located in C:\documents and settings\%username%\application data\Microsoft\CD player\deluxeCD.mdb.
Benefitting from Good Design
Rest assured that every time you take a shortcut in database design to save time, you guarantee the need to redesign the database in the future. The most important lesson you can learn from this article is that no matter how small or how personal a database might be, it's never too small nor too personal to benefit from good design. You're much better off taking a few minutes up front to do the job right than to end up spending much time and many dollars trying to correct design failings later on.