Out-of-Process Caching in ASP.NET

Since its inception, ASP.NET has had a native Cache object, which is designed to make it easy for developers to store global data accessible from any present and future sessions. The Cache represented a quantum leap from the old-fashioned Application object, which is still supported in ASP.NET 4 for backward compatibility. The Application object is a plain global dictionary with minimal locking capabilities; the Cache object lets you prioritize cached items, supports various types of dependencies, and lets you periodically scavenge the memory and get rid of unused items. The Cache object is capable of automatically reducing the required amount of memory in case of pressure. Furthermore, it supports the same simple dictionary-based programming interface as Session state.
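As a reminder of what the native Cache offers, here is a minimal sketch of storing an item with a file dependency, a sliding expiration, and a priority. The class name, method name, and parameters are hypothetical; the call itself is the standard Cache.Insert overload from System.Web.Caching.

```csharp
using System;
using System.Web;
using System.Web.Caching;

// Hypothetical helper, for illustration only.
public static class NativeCacheSample
{
    public static void Store(string key, object data, string dependencyFile)
    {
        HttpRuntime.Cache.Insert(
            key,
            data,
            new CacheDependency(dependencyFile), // item is evicted when the file changes
            Cache.NoAbsoluteExpiration,          // no fixed expiration time
            TimeSpan.FromMinutes(10),            // evicted after 10 minutes without access
            CacheItemPriority.High,              // scavenged later than lower-priority items
            null);                               // no removal callback
    }
}
```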

So what’s wrong with the Cache object? The Cache object is ideal only for applications hosted in a single-server environment. An instance of the Cache class is created on a per-AppDomain basis and remains valid only as long as that AppDomain is up and running.

If you’re looking for a global repository that works across a Web farm architecture, the Cache object isn't for you. It's interesting to note how the sentiment about this point has changed in the past few years. If you read ASP.NET books published three or more years ago, you'll likely find that the Cache object's limitations were clearly pointed out but no solution was offered or hinted at. This is because the number of Web applications requiring large caching across a Web farm was negligible up until a few years ago. Today, we live and code in a radically different world. Caching is a constant requirement and more often than not it's required by applications hosted in a farm. How would you replace the Cache object?

The first option to consider is Microsoft’s AppFabric Caching Service. If this approach doesn’t work for you, you can consider other valid options—both commercial (ScaleOut, NCache) and open-source (Memcached, SharedCache). Let’s find out more about the AppFabric Caching Service.

Beyond the Native ASP.NET Cache

Because the native ASP.NET Cache works in the context of the worker process, there’s no guarantee that in a farm scenario two successive requests are served by the same machine and therefore can access the "same" cache. As mentioned, until recently this fact didn’t look like a real problem for the ASP.NET team to address. However, Web applications today need an out-of-process cache that, like the Session state, multiple worker processes can share.

The out-of-process cache must also be easy and quick to scale out and offer a unified view to the client application. In other words, extending the native ASP.NET Cache object is not simply a matter of hosting it in an external process. An effective distributed cache offers advanced capabilities such as high availability and replication that depend on a specific network topology. To go beyond the ASP.NET native Cache object, you need a new infrastructure with both new hardware and software elements.

Windows Server AppFabric offers extensions to Windows Server that improve the application infrastructure and make it possible to run applications that are easier to scale and manage. One of these extensions is the AppFabric Caching Service, which provides Microsoft’s long-awaited distributed cache solution.

Introducing the AppFabric Caching Service

The AppFabric Caching Service is an out-of-process cache that combines a simple programming interface with a clustered architecture. It's organized in two levels: the client cache and the distributed cache, as Figure 1 shows.

The client cache, a component that you install on the Web server machine, represents the gateway used by ASP.NET applications to read and write through caching services. The distributed cache includes some cache server machines, each running an instance of the AppFabric Caching service and each storing data according to the configured topology. In addition, the client cache can optionally implement a local, server-specific cache that makes access to selected data even faster. Note that the data found in this local cache is not kept in sync with the data in the cluster.

The configuration of the cluster is saved in a single store, which could be an XML file located on a shared network folder, a SQL Server database, or a custom store such as another database system. If you’re using an XML file, it will be named clusterconfig.xml and it will have content similar to what Figure 2 shows.

The configuration script for the servers in the cache cluster contains the name of the cluster and general settings such as size and data eviction policies. In addition, the configuration contains the list of servers with their names and ports. You create and configure the cluster during the installation of AppFabric on the first server machine. The size of the cluster is expressed using relative indicators such as small, medium, or large. The size of the cluster doesn’t limit the number of cache servers you can run, but it sets some internal parameters that control the overall performance of the cluster. Small means up to 5 computers; large targets clusters with more than 15 computers. You can’t change this setting once in place; if your cluster grows, or your estimate was well off, the only alternatives are reinstalling the services or running a possibly poorly optimized service.

Creating Data Caches

Each server in an AppFabric configuration can have one or multiple caches of data, each of which can be replicated for high-availability purposes. A data cache is simply a logical way of grouping data. Each enabled server is expected to have at least one default data cache. Make sure one exists before you start using the service; if it doesn't, it's your responsibility to create it.

You can configure a data cache individually through the cache section you see in Figure 2. This is useful for replicating the content of that data cache across the cluster's servers. With AppFabric, you don’t have to worry about which server plays the role of the primary server and which ones are backup servers for a data cache. You simply specify whether a given cache has to be replicated in the cluster. You enable replication through the secondaries attribute, as shown below:

<caches>
   <cache secondaries="1" name="Dino">
      :
   </cache>
</caches>

Creating a data cache is a manual operation that requires access to the PowerShell console. You use the New-Cache command. The full syntax is shown below:

New-Cache [-CacheName] <String>
          [-Eviction <String>]
          [-Expirable <String>]
          [-Force [<SwitchParameter>]]
          [-NotificationsEnabled <String>]
          [-Secondaries <Int32>]
          [-TimeToLive <Int64>]
          [<CommonParameters>]

Once you've successfully installed AppFabric, you’re halfway home because adapting your code to it takes little work. Let’s look at the AppFabric Caching API.

Using the Caching API

To use AppFabric in your ASP.NET pages, you need to configure the Web server environment first. This requires adding a section to the application’s web.config file:

<dataCacheClient>
   <localCache isEnabled="false" />
   <hosts>
      <host name="YourMachine" cachePort="22233" />
      :
   </hosts>
</dataCacheClient>

You also need a few AppFabric assemblies referenced by the Web application so that you can safely declare the dataCacheClient section in the configuration. For some reason, the AppFabric assemblies don’t show up in the Visual Studio 2010 list of available assemblies. You have to manually pick them up from the %System%\AppFabric directory. You need two assemblies: Microsoft.ApplicationServer.Caching.Core and Microsoft.ApplicationServer.Caching.Client.

An AppFabric client can connect to all listed hosts. The AppFabric infrastructure tracks the placement of cached objects across all hosts and routes your client straight to the right host when a request for a particular cached object is made. With the client settings script entered in the application’s configuration file, here’s all that you need to grab a given data cache:

var factory = new DataCacheFactory();
var dinoCache = factory.GetCache("Dino");
var defaultCache = factory.GetDefaultCache();

From this point, you can use the object returned by GetCache in much the same way you would use the native Cache object of ASP.NET, with one difference: you can now access information stored across an expandable cluster of servers:

dinoCache[key] = value;
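The calls shown so far could be wrapped in a small helper so that the factory and the cache object are created only once and then reused. What follows is a minimal sketch, not a definitive implementation: the CacheGateway class and its GetOrLoad method are hypothetical names, and the code assumes that the two AppFabric client assemblies are referenced, that the "Dino" cache exists, and that the cached objects are serializable.

```csharp
using System;
using Microsoft.ApplicationServer.Caching;

// Hypothetical helper: one factory and one cache object for the whole application.
public static class CacheGateway
{
    // Lazy<T> defers the costly factory creation until first use and
    // runs it only once, even under concurrent requests.
    private static readonly Lazy<DataCacheFactory> Factory =
        new Lazy<DataCacheFactory>(() => new DataCacheFactory());

    // "Dino" is the named cache used in this article.
    private static readonly Lazy<DataCache> DinoCache =
        new Lazy<DataCache>(() => Factory.Value.GetCache("Dino"));

    // Cache-aside read: return the cached value, or load it and store it.
    public static T GetOrLoad<T>(string key, Func<T> loader) where T : class
    {
        var cached = DinoCache.Value.Get(key) as T;   // Get returns null on a miss
        if (cached != null)
            return cached;

        var fresh = loader();
        if (fresh != null)
            DinoCache.Value.Put(key, fresh);          // Put adds or replaces the item
        return fresh;
    }
}
```

A page could then call something like CacheGateway.GetOrLoad("customers", LoadCustomers), where LoadCustomers stands for whatever slow data access method the application already has.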

You might want to call the factory and get cache objects only once in the application, preferably at startup. Compared to the ASP.NET Cache, AppFabric doesn’t offer dependencies, either on other cached items or on files or time. However, AppFabric offers you other capabilities that you can use to simulate dependencies on cached items. First, you have regions.

A data cache region is a logical way of grouping data within a given data cache. A region can only be created programmatically so that each module of the application can reasonably manage its own segment of the data cache with more ease. As you might know, a region is primarily a way to restrict the space where a cached piece of data can be retrieved. Related to regions are the search capabilities of AppFabric solutions. Data stored in a region can be tagged and ad hoc methods exist for you to perform tag-based queries.
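Creating a region, tagging items, and running a tag-based query might look like the following sketch. The region name, keys, values, and tag are made up for illustration; CreateRegion, the Put overload that takes tags and a region, and GetObjectsByTag are members of the DataCache class.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ApplicationServer.Caching;

// Hypothetical sample class, for illustration only.
public static class RegionSample
{
    public static void Run(DataCache cache)
    {
        const string region = "Orders";             // hypothetical region name

        // Returns false if the region already exists.
        cache.CreateRegion(region);

        // Tags can only be attached to items stored in a region.
        var pending = new DataCacheTag("pending");  // hypothetical tag
        var tags = new[] { pending };
        cache.Put("order-1", "First order", tags, region);
        cache.Put("order-2", "Second order", tags, region);

        // Tag-based query, scoped to the region.
        foreach (KeyValuePair<string, object> item in
                 cache.GetObjectsByTag(pending, region))
        {
            Console.WriteLine("{0} = {1}", item.Key, item.Value);
        }
    }
}
```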

A region doesn’t take up much space per se so it doesn’t cause any significant application overhead. Consider, however, that a region is bound to one particular host; consequently, a region that is particularly large and frequently accessed soon becomes a bottleneck because its content can’t be spread around the cluster to increase scalability. Regions do support high availability: secondary copies of a region are available on different hosts, but the primary content of each region is entirely contained on a single host.

Furthermore, you can configure a cache to support asynchronous notification of some cache events. You enable this feature through the <serverNotification> element in Figure 2. In addition, each client application should set the poll interval, in seconds, that determines how often it checks for notifications:

<dataCacheClient>
   <clientNotification pollInterval="10" />
   :
</dataCacheClient>

A server notification can be sent for a variety of cache operations including insertion, deletion, and update of regions and individual items. Cache notifications also provide the extra service of automatically invalidating locally cached objects. When you receive a change notification your code can take some initiative and perhaps remove some related items—this offers an effective workaround for the lack of dependencies in AppFabric. You can restrict the scope of notifications you want by registering a proper callback.
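The dependency workaround described above could be sketched as follows. The keys "customer-42" and "customer-42-orders" are hypothetical, and the code assumes that the cache was created with notifications enabled and that the client is configured to poll for them. AddItemLevelCallback and the DataCacheOperations filter are part of the AppFabric client API; region-level and cache-level variants exist as well.

```csharp
using Microsoft.ApplicationServer.Caching;

// Hypothetical sample class, for illustration only.
public static class NotificationSample
{
    public static DataCacheNotificationDescriptor Watch(DataCache cache)
    {
        // Restrict notifications to replacements and removals of one item.
        var filter = DataCacheOperations.ReplaceItem |
                     DataCacheOperations.RemoveItem;

        // When the watched item changes, drop the item that depends on it.
        return cache.AddItemLevelCallback(
            "customer-42",
            filter,
            (cacheName, regionName, key, version, operation, descriptor) =>
            {
                cache.Remove("customer-42-orders");
            });
    }
}
```

The returned descriptor can later be passed to the cache's RemoveCallback method to stop receiving notifications.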

Pick the Right Caching Layer

When you have a very slow data access layer that assembles data from various sources (e.g., relational databases, SAP, mainframes, documents) and when you have an application deployed on a Web farm, you definitely need a caching layer to mediate access to data and distribute the load of data retrieval on an array of servers. The Cache object of ASP.NET is simply inadequate and, in such scenarios, I encourage you to use the AppFabric Caching Service instead.

In this article, I just scratched the surface of the AppFabric Caching Service. In future articles, I’ll investigate in more detail the notification mechanisms and, in particular, how to design an effective caching layer for your ASP.NET applications.
