Have you seriously considered what data mining could do for your company? I've read SQL Server Books Online (BOL), walked through a few demos, and perused some Web sites about data mining, but I've never built an end-to-end data-mining solution. If you're like me, data mining sounds cool, but you're not exactly sure what it is—let alone what to do with it.
I think of data mining as one of two types of data analysis under the umbrella of business intelligence (BI). The most common type of BI today is online analytical processing (OLAP), which is a presentation and data-aggregation technology that lets you visualize and interact with your data in ways that you can't in traditional SQL reporting environments. An effective OLAP tool connected to a world-class OLAP cube lets you browse your data, drill down and around in flexible ways, and ask questions about what the data means. But an OLAP tool doesn't automatically find the valuable but hidden highlights in your data. You still need to know what you're looking for.
To me, the Holy Grail of data mining is the ability to discover information and patterns you didn't know existed in your data so that you can make better business decisions. In data mining, you set pattern-seeking algorithms loose on your data, and the algorithms do the work, bringing to light interesting and important relationships in your data. Once you know about those relationships, an OLAP tool can help you analyze them.
Sounds great! Unfortunately, data-mining technology in today's market tends to be too hard to use and too expensive for most companies. Most serious data-mining environments require users to have a firm foundation in advanced statistical techniques just to make heads or tails of the results. However, Microsoft has taken a different approach to data mining, simplifying the process and making it affordable for the masses.
With SQL Server 7.0 OLAP Services, Microsoft became the first top database vendor to include OLAP technology in the database system itself at no additional cost. In many ways, this approach to delivering data-analysis tools has been responsible for the growth in OLAP awareness over the past few years. Whether or not you use Microsoft OLAP tools, the availability of free OLAP functionality in SQL Server has motivated other vendors to offer more-manageable, more cost-competitive OLAP solutions. With SQL Server 2000, Microsoft added a set of integrated data-mining technologies. Although I don't know many people who are using this technology in production systems today, the release of data-mining technology in SQL Server 2000 Analysis Services was a watershed event: the first time a major vendor made a concerted effort to bring data mining to the masses.
SQL Server's data-mining technology still needs to be easier to use and more functional. I can't share details about new data-mining functionality planned for the Yukon release of SQL Server, but suffice it to say, the SQL Server team is working on some practical and cool enhancements. And a new project I'm working on will give me the chance to dive into real-world data-mining issues and to share those experiences with you.
In the meantime, I encourage you to start soaking up any available information about data mining. This technology won't change the nature of your business overnight, but it will have a profound effect on the way you interact with your data in the future. A great place to start is the 149-page guide "Preparing and Mining Data with Microsoft SQL Server 2000 and Analysis Services." This guide also offers a 35MB download of sample code that you can dig into. SQL Server's in-the-box data-mining functions give database professionals a unique opportunity to add significant value for customers and employers, and to advance their own careers by being among the first people in the market to understand how to apply data mining to a wide range of needs.