XML for Analysis


Jump into .NET analysis, but bring your SOAP

If you don't already know, let me be the first to tell you: SQL Server 2000 Analysis Services is based on a client/server architecture. That fact is important because client/server architectures are designed for LAN environments, not Internet applications. Client/server architectures are quickly going out of style because most midsized to large companies require enterprise applications that are easy to deploy and maintain and that support a mixture of Internet and intranet users. This requirement, among others, is driving Microsoft's overall product direction toward the componentized Web architectures of .NET. (For details about Microsoft's .NET strategy, see Michael Otey, "Untangling Microsoft .NET, " November 2000.) So, what happens to Analysis Services in the .NET world? Microsoft recently released the XML for Analysis specification, which is available through a software development kit (SDK) and will bridge the two worlds of client/server and .NET. Let's look at why you need the XML for Analysis SDK and explore an example of how to use it.

First, let's examine Analysis Services' client/server architecture so that we understand its strengths and limitations. Analysis Services has two primary components—the OLAP Server and the client software, PivotTable Service—which you can deploy on PCs that are connected across a network. PivotTable Service offers two programming interfaces, ADO MD and OLE DB for OLAP, to the front-end application, which typically resides on the client computer. As with other two-tier applications, these interfaces' strength is their performance. When used with these interfaces, Analysis Services can do a great deal of processing and caching on the client computer, thereby reducing network traffic and the burden on the server. Although the burden on the server isn't heavy, Analysis Services' scalability is limited because each client requires a constant connection to the server, and each server connection eats up valuable server resources. PivotTable Service's deployment and maintenance requirements are also drawbacks. PivotTable Service requires Microsoft Data Access Components (MDAC) 2.6, so most client PCs must install roughly 14MB of client software, not including the analysis application.

Microsoft designed XML for Analysis for an n-tier application. Figure 1 shows the usual distribution of components in such an application. PivotTable Service usually resides on a Web server along with Microsoft IIS; the client applications access the XML for Analysis objects through an Active Server Pages (ASP) page that resides on IIS. The client application uses Simple Object Access Protocol (SOAP) to communicate with XML for Analysis. SOAP is a COM-like mechanism that lets distributed software components discover each other's capabilities and use XML messages to interact through HTTP across the Internet.

The XML for Analysis architecture has several advantages over the standard two-tier Analysis Services architecture. One big advantage is that the client-to-Web server protocol is usually stateless. This means that the Web server doesn't have to remember clients between requests and therefore doesn't need a database connection for each client. Instead, the Web server pools a collection of database connections and reuses them for each client request. Fewer active sessions means less overhead for handling large numbers of clients. Another big advantage is that the client computer requires no special software. The client application just needs to be able to use SOAP to communicate with XML; no ADO or OLE DB database drivers are necessary. An application as simple as an HTML page with embedded script can use SOAP to communicate with XML for Analysis.

XML for Analysis' biggest disadvantage is probably its slower performance. In the two-tier Analysis Services configuration, PivotTable Service caches cube metadata and can respond quickly to numerous application requests for metadata. Such requests might be "What are the dimensions in this cube?" or "What children does this dimension member have?" In the XML for Analysis configuration, each of these requests requires a round-trip to the server. In addition, the XML messages that pass between the client and the Web server are very large. By my estimates, they're probably 10 times as large as the equivalent binary messages that OLE DB for OLAP uses. The messages are so large not only because the information is packaged as a string (similar to HTML) but also because each part of the information is wrapped in opening and closing XML tags. When an application is communicating through COM to PivotTable Service, metadata information is tightly packaged in binary buffers.

However, these limitations of XML for Analysis aren't prohibitive. If you consider them when designing an application, they're manageable. For example, to reduce the amount of traffic to the server, the application might cache some cube metadata. Also, any enterprise analysis application contains a variety of user profiles to which you can tailor your analysis solution. Typically, the largest group of users in an enterprise requires the least from an analysis application. You can meet these users' needs with simple static or parameterized reports. You might also have a small group of sophisticated users who require high performance and powerful analysis capabilities. You can best meet these sophisticated users' needs with a traditional client/server application running within an intranet (or VPN) environment. Finally, an intermediate group usually requires some exploratory capabilities but might also want the freedom of Internet access. You can best serve these users through XML for Analysis.

Using the XML for Analysis SDK

Now let's look at how to set up and use the XML for Analysis SDK. I started with Windows 2000 Professional and Analysis Services (server and client) installed. The next piece you need is the Microsoft XML Parser (MSXML) 3.0, which you can find at http://msdn.microsoft.com/xml/general/xmlparser.asp. Next, install IIS, which is an optional component of Win2K that you can find in Control Panel, Add/Remove Programs under Add/Remove Windows Components. Finally, you need the XML for Analysis SDK. I received an early copy, but you can download the final version at http://www.microsoft.com/data.

My copy of the SDK includes four directories. One contains presentations, another contains samples, a third contains the setup for XML for Analysis, and the final directory contains the programming specification. Running the setup program in the setup directory creates another directory that contains the ASP and DLL files that XML for Analysis requires. This runtime portion of the SDK is installed by default into C:\Program Files\XML for Analysis SDK. As Figure 2 shows, after installing the ASP and DLL files, you need to create a virtual directory in IIS named XOLAP that points to the XML for Analysis SDK directory. You now have everything you need to create an XML for Analysis application.

If you develop an application in Visual Basic (VB) 6.0 that uses XML for Analysis, you'll probably want to use the SDK's sample DLL, called SoapClient. This DLL has two published methods—Discover and Execute—that match the methods in the XML for Analysis specification. With these two methods, you can access all of XML for Analysis' features. You use the Discover method to retrieve OLAP metadata or information about the OLAP provider (e.g., PivotTable Service). You use Execute for executing MDX statements such as a query or a cube write-back.

To start a new VB 6.0 project, be sure to register the SoapClient.dll component in the Samples directory. Then in VB, add project references to the libraries Microsoft XML, v3.0 and SoapClientLibrary. You're now ready to write VB code to issue Discover and Execute methods and build your analysis application.

Remember that every Discover or Execute method you issue is self-contained; XML for Analysis doesn't maintain connection or state information between method calls. Therefore, you must provide all the connection information with every XML for Analysis method call, unlike when you're programming with the ADO interface. With ADO, you establish a connection to the server and reuse it with many commands before releasing it. Because XML for Analysis is stateless, it can share resources among clients and scale to support many clients. However, in some situations, maintaining a session is important—for example, when you want to create a calculated member or set, then use the member or set with subsequent MDX queries. Perhaps an even more compelling scenario happens during a write-back to a cube. Several write-back commands might together make one transaction, so you need to group them together with one commit operation. XML for Analysis supports stateful operation to handle these situations. The BeginSession command returns a SessionID, which the client can use to issue subsequent commands before finally issuing an EndSession command.

Let's look at a sample metadata request from the DemoSoap program that comes with the XML for Analysis SDK. This example uses the Discover method to request a list of the dimension names for the FoodMart 2000 Sales cube:

strResult = Discover("MDSCHEMA_DIMENSIONS", strRestrictions, strProperties)

The Discover method has three input parameters: Request-Type, Restrictions, and Properties. ADO programmers might see some similarities between this Discover call and the ADO Connection.OpenSchema method. The primary difference is that the Discover call's parameters are packaged inside XML tags. The first parameter, RequestType, identifies the type of discovery operation you're requesting. In this case, MDSCHEMA_DIMENSIONS means you're requesting the dimension schema rowset. The Restrictions parameter uses a list of XML name and value pairs to restrict the list of dimensions to be returned. To restrict the list of dimensions to the FoodMart 2000 catalog and the Sales cube, you use the XML code that Listing 1 shows.

The final parameter, Properties, identifies the context under which the application will make the Discover call. Because XML for Analysis is usually stateless, this parameter passes information that the server would otherwise establish and maintain with the connection. Listing 2 contains an example of the XML that the application passes for the Properties parameter.

Figure 3 shows an excerpt from the result of the MDSCHEMA_DIMENSIONS request. Like all XML that XML for Analysis returns, the first part of the response is a schema definition for what will follow. This schema definition helps the client determine which information to include in the result and with what organization. In Figure 3, I included only one of the dimension rows (the Customers dimension) for brevity.

Using the Execute method to issue commands to XML for Analysis is much like issuing Discover method calls. Execute has two input parameters, Command and Properties. The Command parameter is just an MDX string, and the Properties parameter is the same information that you passed in the Discover method. This Execute call is the equivalent of the ADO method cellset.Open. The result of any MDX query is also XML, but it's significantly more complex than the dimension rowset example.

Here's an important thing to note about executing MDX queries that return large resultsets. When you're using ADO in a client/server environment, PivotTable Service calculates only the portion of the result that the application requests. So if your application displays only a portion of the result on the screen, you need to request only that portion. This factor can save a lot of time when you're executing large queries with ADO. With XML for Analysis, this kind of optimization is possible but slightly more difficult. If you know when you initially execute a query that you want only a screenful of information, you can use a property set called BEGIN_RANGE and END_RANGE. These properties specify by ordinal number which cells you want to have returned. Later, if a user scrolls down to see more of the result, your application can issue the same query again and simply adjust the range properties. As you can imagine, if an MDX query returned a 100,000-row result, this approach could save you a lot in performance and memory usage.

XML for Analysis is an important part of Analysis Services that lets you use Analysis Services in a scalable n-tier environment designed for Internet applications. XML for Analysis can also help you use Analysis Services to build .NET Web Services. For more information about XML for Analysis, see the press release at http://www.microsoft.com/presspass/press/2000/dec00/xmlbasedprotocolpr.asp. You can also join the public newsgroup on the topic: microsoft.public.data.xmlanalysis.

Hide comments


  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.