Recently I've heard some interesting statements from people who are trying to decide whether OLAP or a relational-reporting tool is best for supporting their business intelligence (BI) needs. One person believes that because relational-reporting tools are a mature technology, they provide all the data he needs without specialized analytics. Another person wonders whether he can structure his OLAP cube to replicate the same reports that he previously generated from a relational-reporting tool. It's easy to get caught up in a particular technology and forget when it's appropriate to use that tool. OLAP and relational-reporting tools are both useful, but they work differently. You need to understand the strengths of each technology so that you can decide which one is best for your specific requirements.
Most people recognize that relational reporting is a more mature, well-understood, and widely used technology than OLAP. In addition, relational reporting scales well to large numbers of users and has a lot of flexibility. Relational reporting focuses on the problem of making a fixed number of reports available to large numbers of users. This solution provides advanced formats for online or hard-copy reports and lets you schedule and cache reports. Formatting, scheduling, and caching all make sense when you want a finite set of reports that will be reused many times.
But relational reporting isn't a slam-dunk solution because since relational reporting was originally created, the data needs of decision makers have become increasingly complex. Leaders must make decisions more quickly and take more information into account for every decision. Generally, more people are involved in making decisions today than in the past. An organization that had 25 decision makers 50 years ago might have 2500 decision makers today.
Relational reporting is a great answer to the problem of delivering data to your users, but sometimes, simply delivering data isn't enough. OLAP goes beyond simple delivery and helps you present large amounts of data to many users in ways that let them manipulate the data so that they can see trends and relationships more quickly. Can you imagine what an online search would be like if Internet search engines served up only a small fixed number of reports like most relational-reporting tools do? Search engines would be virtually worthless because the Internet contains too much information to include in a finite number of reports, and the contents and structure of the information change too frequently. Most business problems that have these characteristics are best solved when you use an OLAP application.
Figure 1 illustrates the process that decision makers go through to make a decision. The decision-making process is divided into two major phases: the data phase and the human phase. The data phase consists of all the steps in data delivery, and the human phase includes the steps that the decision maker goes through after receiving the data.
To better understand these steps, imagine you're a corporate department head and you receive a quarter-end expense report. On the report, you notice a huge jump in travel expenses for your department over the past quarter. You request another report that shows line-item details about travel expenses, and you figure out the problem is that employees are using multiple methods to book travel. After consulting with the accounting department, you institute a company policy that requires travelers to book their trips through the company travel agent to help ensure that travel costs return to normal levels.
Several things happened on the way to this decision. First, the company captured travel-expense data near the time the expenses occurred. Then, a period of time passed before the quarterly expense report was generated and distributed to you. This is the extraction, transformation, and loading (ETL) lag that Figure 1 shows. After you received the report, more time passed while you reviewed and discovered the problem; this time is the discovery lag. Finally, you needed more time to retrieve a line-item report and arrive at a decision to change the travel policy. This last time period is the decision-making lag.
In this example, relational reporting addresses only the data-delivery part of the process. Fortunately, to make your travel-policy decision, you needed only a limited amount of data; so two reports provided the information you needed to discover the root cause of the problem. For a more complex problem—say your gross margins in Europe have been slowly eroding over the past five quarters—you might require an enormous amount of data to find the root cause of the problem.
The part of decision making that relational reporting doesn't address is the human phase. Three primary drivers—problem abstraction, business-logic handling, and the discovery process—determine how long the human phase of the process will take. Let's see how OLAP applications address each of these drivers.
Problem abstraction is the biggest advantage that OLAP applications have over other technologies such as relational reporting. What I mean by abstraction is that the terminology of the application is appropriate for the problem at hand. Abstraction lets business decision makers navigate data without the help of a data expert. When you use a relational-reporting tool to retrieve data, you deal with relational tables, columns, rows, joins between tables, foreign keys, and other basic concepts. These concepts are appropriate for a database designer, not a business analyst. When an analyst retrieves data in an OLAP application, the application presents data as concepts such as geographies, fiscal time periods, customers, product lines, and financial metrics. These are business concepts appropriate for solving business problems. Just think how ineffective an architect would be if she were forced to design a new building by thinking only in terms of boards, drywall, nails, bricks, and the like. An architect needs to describe the problem by using architectural concepts such as floors, walls, doorways, windows, and elevations.
The second concept that OLAP handles well is business logic—the rules for how you view data specific to your business or business segment. What metrics do you use to determine the health of your business or your functional department? Examples might be revenue per sales rep, revenue per head count, or if you're in retail, maybe sales per square foot of retail space. These metrics are what the BI industry calls key performance indicators (KPIs). A KPI is one type of business logic. Another type of business logic is selection or grouping. For example, how do you define your top customers? Is it by revenue, growth, profit, or perhaps some combination of the three? Selection logic encompasses the steps you perform to select items from your data that you want to treat as a group. The group could be problem products, risky customers, or second-tier customers. You define the names and the logic behind each group.
The business logic your company uses today is probably embedded in SQL statements that generate the reports you distribute to decision makers. So what happens when a decision maker needs a new report that uses the currently accepted definition of the top 10 customers? Typically, the DBA who's generating the report finds an old report that contains the 50 lines of SQL that define the top 10 customers, and she pastes that SQL fragment into the definition of the new report. If business logic isn't embedded in SQL, it's probably embedded in a few hundred Excel spreadsheets on decision makers' desktops.
At my company, we describe such a situation by using the term spreadsheet anarchy. Spreadsheet anarchy refers to the problem that arises when business logic is created and recreated hundreds of times on decision makers' desktops for analysis. Spreadsheet anarchy fights against the need to have a single version of the truth. How can you be sure that decisions are being made consistently throughout your organization if you don't know that everyone has the same definition of your key performance metrics or selection criteria (e.g., top customers)? Modern OLAP applications let you centrally define and reuse your business logic. You can imagine the benefits of having just one definition of top 10 customers and reusing that definition across multiple databases and analytic applications. Then, when the situation warrants it, you can change the definition of the top 10 customers in one location and be certain that all the uses of that logic have been updated.
The third area in which OLAP applications are different from relational-reporting applications is in addressing the discovery process of the human phase of decision making. OLAP provides three crucial ways to facilitate the discovery process: organization, visualization, and navigation. A user of an analytic application can organize, visualize, and navigate data in ways that let her quickly discover problems and find solutions.
OLAP addresses users' organization needs by presenting large amounts of data in multiple dimensions where each dimension is structured in one or more hierarchies. Users can focus on the dimensions (e.g., products, time, customers) that are most relevant to the problem at hand, then view each dimension at the appropriate level of summarization. For example, if I want to understand how a product's profitability changed over the past few months, I need to view only the time dimension (by months) and individual products. I exclude all other dimensions and dimension levels to eliminate any information that isn't relevant to what I want to understand—in other words, to shorten my discovery process.
Visualization is a primary differentiator between relational reporting and OLAP. Most relational reports are distributed as a grid of numbers. A good OLAP tool provides a full suite of business charts as well as other visualization tools so that you can present data in the most appropriate manner. For example, what's the best way to view your sales-force locations relative to the size and location of sales opportunities? It's a geographical map, not a grid of numbers. Of course, you can take the grid of numbers from your relational reporting tool and drop them into PowerPoint and create whatever visualization you want. But that kind of visualization is static, not dynamic.
OLAP is about navigation through lots of data. The assumption in an OLAP application is that no matter what data you're looking at, you'll always have more questions. You need to be able to select a piece of data from a chart or geographic map and drill down to view more details. Drilling down doesn't just mean the next layer of data in the dimension you're viewing; drilling down can mean getting more detail from another dimension. For example, if you're looking at the past four quarters, you might want to drill down and see the top customers in the third quarter. If you're viewing your data in a static PowerPoint slide, you can't explore.
I could discuss many more ways in which relational reporting and OLAP are different, but this article gives you some of the most important differences. Now, when someone tells you he's building an OLAP cube that uses a reporting tool to generate reports, you can tell him he's missing the point. Using an OLAP cube to generate traditional reports helps to optimize the data phase of the process—the same phase that relational reporting has been working on for the last 30 years. But OLAP goes beyond simple data delivery to facilitate the human phase of the decision-making process. Likewise, if someone says the new version of her relational-reporting tool has all the analytic features you'll ever need, ask her how well the tool addresses abstraction, centralized business logic, and the discovery process. Those areas are the design center for OLAP tools.
In most organizations, decision makers aren't interactively exploring data. They're either delegating the job or still relying on static reports about their business. So what does that mean for the future of OLAP? Remember that 10 years ago, most people had never used the Internet and had no idea what a search engine was. I assert that we're in the same situation with analytics today. In 10 years, most business people will be using analytics every day, and they'll have a tough time imagining how they performed their jobs effectively back in 2003.
I'm sure you can tell that I prefer OLAP applications to relational-reporting tools, but even I recognize that the world has a place for relational reporting. When a problem calls for delivery of small amounts of data to large numbers of users, no solution is more appropriate than relational reporting. But these days, problems that require small amounts of data are rare. I'd love to hear your opinions or experiences regarding the topic of relational versus OLAP, so please email me at [email protected]