Optimizing Use of ASP.NET Cache Functionality

ASP.NET's intrinsic cache functionality has long been one of my favorite aspects of ASP.NET. With a little bit of planning and organization, you can use it to create highly scalable applications that still serve accurate, up-to-date data.


The Upside of Using the Cache

ASP.NET's cache can be used in a variety of ways, too many to cover in a single article. So I'm just going to address the caching of business objects, which can be problematic in some cases because it makes repositories and factories difficult to unit test. Another caveat is that it's possible to shoot yourself in the foot by allowing cached items to trump the actual data as it should exist within your application.

For the sake of simplicity, suppose that I have an application that tracks products that can be sold online. Furthermore, assume that while the underlying data storage mechanism for all of these products is a SQL Server database, the only way that products can be updated or added to the system is through my application. But let's also assume that this little application gets tons of traffic, and that products are added or updated fairly infrequently. And, when they're updated, it's usually just to reflect something critical, such as a change in inventory status.

Assume also that I'm using LINQ to SQL to pull these products in and out of the database and into my application. With such a scenario I could easily create a repository that would put all of my products into memory on the web server when requested like so:

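// Requires: using System.Linq; using System.Web; using System.Web.Caching;
public List<Product> GetAllProducts()
{
    // Try the cache first; the "as" cast yields null on a miss.
    List<Product> output =
        HttpContext.Current.Cache["AllProducts"] as List<Product>;
    if (output == null)
    {
        // Cache miss: load every product through LINQ to SQL.
        output = db.Products.ToList();
        HttpContext.Current.Cache.Add(
            "AllProducts", output, null, Cache.NoAbsoluteExpiration,
            Cache.NoSlidingExpiration, CacheItemPriority.AboveNormal, null);
    }
    return output;
}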

With this approach, I'm then free to slice and dice my products for consumption by various other methods within my repository, such as the ability to get all products from a certain product family:

public List<Product> GetProductsByProdLine(string prodLine)
{
    return this.GetAllProducts()
        .Where(prod => prod.ProductLine == prodLine)
        .ToList();
}

And, as you can see, I'm able to do that all from within memory using LINQ to Objects.

Assuming that I don't want to iterate over my product list each time I need a specific product, I can also cache individual products as they're requested:

public Product GetProductById(int id)
{
    Product output =
        HttpContext.Current.Cache["Product" + id] as Product;
    if (output == null)
    {
        // Cache miss: find the product in the (cached) full list.
        output = this.GetAllProducts()
            .SingleOrDefault(p => p.ProductId == id);
        // Cache.Add throws on a null value, so only cache real matches.
        if (output != null)
        {
            HttpContext.Current.Cache.Add(
                "Product" + id, output, null, Cache.NoAbsoluteExpiration,
                Cache.NoSlidingExpiration, CacheItemPriority.BelowNormal, null);
        }
    }
    return output;
}

Doing so can make data retrieval much faster, at the cost of some RAM on the web server. Most business objects are small, so that cost is usually modest. If your objects are huge and you don't have gobs of RAM at hand, you might want to rethink this pattern of usage. In most cases, though, keeping business objects in RAM is preferable to recreating them each time they're needed.

And even in cases where it doesn't make sense to pull a single entity out of a cached collection, remember that the pattern of caching a single object by ID can still provide a huge boost in performance if that object is requested on a regular basis.

The Downside of Extensive Caching

Taking this approach also adds two big problems. First, unit testing a repository that reaches into the cache can be a total beast, because the repository now depends on HttpContext.Current, which is null outside of a running web request and can't easily be substituted in a test.

Second, it's pretty easy to let cached items “trump” real data. This is because you now have a number of cached objects that need to be removed should you change any of your underlying data. Common approaches to dealing with this include setting sliding or absolute expiration times that are good enough, or using cache invalidation callbacks from SQL Server. Setting expiration times can provide a great boost even in highly volatile environments by setting cache timeouts for just a few seconds and accepting that data can/will be a few seconds old in some cases. Using cache invalidation callbacks is also a great way to keep volatile data up-to-date while enjoying the benefits of cache. But it requires a bit of extra work to set up.
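The "accept a few seconds of staleness" trade-off can be sketched without ASP.NET at all. The snippet below is only an illustration: ExpiringCache, GetOrLoad, and the injected clock are hypothetical names of my own, not ASP.NET APIs. With the real cache, the comparable move is passing a short absolute expiration (for example DateTime.UtcNow.AddSeconds(5)) to Cache.Insert.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a cache with a short absolute expiration.
// Not thread-safe; a real web cache would need locking.
public class ExpiringCache
{
    // Each entry stores its expiry time alongside the cached value.
    private readonly Dictionary<string, KeyValuePair<DateTime, object>> entries =
        new Dictionary<string, KeyValuePair<DateTime, object>>();
    private readonly TimeSpan timeToLive;
    private readonly Func<DateTime> clock; // injected so a test can control time

    public ExpiringCache(TimeSpan timeToLive, Func<DateTime> clock)
    {
        this.timeToLive = timeToLive;
        this.clock = clock;
    }

    // Returns the cached value, or loads and caches it when missing or expired.
    public T GetOrLoad<T>(string key, Func<T> load)
    {
        KeyValuePair<DateTime, object> entry;
        if (entries.TryGetValue(key, out entry) && clock() < entry.Key)
        {
            return (T)entry.Value; // still fresh enough
        }
        T value = load();
        entries[key] = new KeyValuePair<DateTime, object>(clock() + timeToLive, value);
        return value;
    }
}
```

With a five-second time to live, a value that changes in the database is served stale for at most five seconds, after which the next request reloads it.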

Convention Can Provide the Best of Both Worlds

A solution I've been using of late doesn't involve either approach, but it nets me the best benefits with only a tiny extra bit of code. I use a simple wrapper or proxy around my access to the cache. The first benefit of this approach is that I'm creating my own API.

Yes, I know that doesn't sound like a benefit. But it's actually a huge benefit because any good API requires an interface. Interfaces, in turn, provide additional flexibility. In my case, I'm currently busy proffering objects and collections from my repository (or presentation models) for a solution that's web-based. Without the wrapper, if I ever needed to expose these presentation models to a WinForms or Windows Presentation Foundation (WPF) application, I would have tons of code hard-wired to ASP.NET's cache, which isn't going to fly.

By using an interface, and then a concrete implementation, I'm able to pass an instance of the implementation into my repository as it's created (this is a form of dependency injection known as constructor injection). This way, when my web apps call into my repository, they use a caching proxy implementation that's just a simple wrapper around ASP.NET's caching functionality. But if I needed to expose these models to a WPF application, it could create an instance of my repository with an implementation based on the Enterprise Library Caching Application Block, for example.
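As a rough sketch of what such an interface and one non-web implementation might look like, consider the following. The member names (Get, Add, Remove) and the InMemoryCacheProvider class are assumptions made for illustration; the actual interface in my solution has more to it.

```csharp
using System.Collections.Generic;

// Hypothetical minimal contract for the cache proxy.
public interface ICacheProvider
{
    object Get(string key);            // returns null on a miss
    void Add(string key, object value);
    void Remove(string key);
}

// Dictionary-backed implementation with no dependency on System.Web,
// usable from a desktop application or a unit test.
public class InMemoryCacheProvider : ICacheProvider
{
    private readonly Dictionary<string, object> items =
        new Dictionary<string, object>();
    private readonly object sync = new object();

    public object Get(string key)
    {
        lock (sync)
        {
            object value;
            return items.TryGetValue(key, out value) ? value : null;
        }
    }

    public void Add(string key, object value)
    {
        lock (sync) { items[key] = value; }
    }

    public void Remove(string key)
    {
        lock (sync) { items.Remove(key); }
    }
}
```

A web-facing implementation would expose the same three members and forward them to ASP.NET's cache, so a repository that only knows about ICacheProvider never needs to change.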

Another great benefit of this approach is that unit testing (and some integration testing) now becomes much easier to handle, because I'm able to mock calls into my repository by specifying exactly what should occur when a repository method is called. This “mocking” lets me simulate exactly what I would expect to occur when interacting with my repository. Similarly, a huge benefit of mocking (using Rhino Mocks, for example) is that I'm also able to test what happens when my repository can't connect to the underlying data store, or encounters an error, by merely mocking those kinds of scenarios (which ends up being so much easier than trying to code or simulate those conditions in other ways).
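To make that concrete, here is a small sketch that uses hand-written fakes in place of Rhino Mocks, so it stays self-contained. Everything in it is hypothetical: the two-member ICacheProvider, the FakeCacheProvider, and a pared-down repository whose data store is just an injected delegate.

```csharp
using System;
using System.Collections.Generic;

public interface ICacheProvider
{
    object Get(string key);
    void Add(string key, object value);
}

// Hand-written fake that records calls so a test can assert on them.
public class FakeCacheProvider : ICacheProvider
{
    private readonly Dictionary<string, object> items =
        new Dictionary<string, object>();
    public int Hits { get; private set; }
    public int Adds { get; private set; }

    public object Get(string key)
    {
        object value;
        if (items.TryGetValue(key, out value)) { Hits++; return value; }
        return null;
    }

    public void Add(string key, object value) { Adds++; items[key] = value; }
}

public class ProductRepository
{
    private const string AllKey = "Products::All->";
    private readonly ICacheProvider cache;
    private readonly Func<List<string>> loadProductNames; // stands in for the database query

    public ProductRepository(ICacheProvider cache, Func<List<string>> loadProductNames)
    {
        this.cache = cache;
        this.loadProductNames = loadProductNames;
    }

    public List<string> GetAllProductNames()
    {
        List<string> output = cache.Get(AllKey) as List<string>;
        if (output == null)
        {
            output = loadProductNames(); // throws if the data store is down
            cache.Add(AllKey, output);   // only reached after a successful load
        }
        return output;
    }
}
```

A test can now pass in a delegate that counts its invocations to confirm the second call is served from cache, or one that throws to confirm that a failed load leaves nothing behind in the cache.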

Standardizing Cache Access

The final benefit of this approach is that it lets me standardize the creation of cache keys. If you look at the examples above, I'm just creating keys for my cached objects on the fly. And, since these keys are strings, IntelliSense won't help call out typos or other errors that would cause stupid caching problems.

So, with my interface, or cache proxy, I also added a couple of additional helper methods. One helps me generate cache keys in a standardized format that takes, more or less, a hierarchical approach to naming (more on that in a second). This particular approach is highly coupled to the model hierarchy within the site/solution I'm currently working on, but the pattern is pretty simple:

public string GetCacheKey(string repoName, ObjectType type, string id)
{
    // Builds a hierarchical key: repository, then object type, then id.
    return string.Format("{0}::{1}->{2}", repoName, type, id);
}

First, I just create a simple ObjectType Enum to list the three (in this case) different object types I'm caching: “Single” objects (where the id is the id of the object itself), “Sets” of objects (where the id is the id of the parent), and “All” where what's being cached represents an entire collection of objects, as is the case with the GetAllProducts() method, above.
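A minimal sketch of that enum, together with the key helper as a static method, shows what the generated keys look like. The member name Set (for "Sets") and the CacheKeys class are assumptions made for illustration.

```csharp
// The three kinds of cached items described above.
public enum ObjectType
{
    Single, // one object; id is the object's own id
    Set,    // a group of objects; id is the parent's id
    All     // the entire collection; id is left empty
}

public static class CacheKeys
{
    public static string GetCacheKey(string repoName, ObjectType type, string id)
    {
        // The enum value is formatted by name, e.g. "Single".
        return string.Format("{0}::{1}->{2}", repoName, type, id);
    }
}
```

For example, GetCacheKey("Products", ObjectType.Single, "42") produces "Products::Single->42", while GetCacheKey("Products", ObjectType.All, "") produces "Products::All->".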

The use of a constant to designate the repository name, along with wrapper methods to handle cache input and output (as well as to make a cache provider neutral mapping for cache retention priorities), yields an approach like this:

ICacheProvider CacheProxy;
const string REPO_NAME = "Products";
public ProductRepository(ICacheProvider provider)
{
    // Constructor injection: the caller decides which cache to use.
    this.CacheProxy = provider;
}
public List<Product> GetAllProducts()
{
    string cacheKey =
        CacheProxy.GetCacheKey(REPO_NAME, ObjectType.All, "");
    List<Product> output =
        CacheProxy.Get(cacheKey) as List<Product>;
    if (output == null)
    {
        // Cache miss: load from the database, then cache the result.
        output = db.Products.ToList();
        CacheProxy.Add(cacheKey, output, CacheType.Keep);
    }
    return output;
}

Where standardized cache keys get really cool, though, is when it comes to adding, updating, or removing products or other objects. Because my cache keys are standardized, if I add a new product, the method that handles that addition can simply build the cache key for the repository name with an ObjectType of All and kick out just the collection of all products. Individual products remain cached because they use different cache keys. They can be removed just as easily when needed: generate the matching cache key and drop the entry. So, for example, a repository method that updates a given product would handle the update as normal, then create one cache key for all products and one for the product ID in question, and use those keys to kick both entries out of the cache.
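The update scenario might look roughly like this. It is a sketch under stated assumptions: DictionaryCache is a hypothetical stand-in for the cache proxy, and a plain dictionary stands in for the database.

```csharp
using System.Collections.Generic;
using System.Linq;

public enum ObjectType { Single, Set, All }

public class Product
{
    public int ProductId { get; set; }
    public string Name { get; set; }
}

// Hypothetical dictionary-backed stand-in for the cache proxy.
public class DictionaryCache
{
    private readonly Dictionary<string, object> items = new Dictionary<string, object>();

    public string GetCacheKey(string repoName, ObjectType type, string id)
    {
        return string.Format("{0}::{1}->{2}", repoName, type, id);
    }
    public object Get(string key)
    {
        object value;
        return items.TryGetValue(key, out value) ? value : null;
    }
    public void Add(string key, object value) { items[key] = value; }
    public void Remove(string key) { items.Remove(key); }
}

public class ProductRepository
{
    private const string REPO_NAME = "Products";
    private readonly DictionaryCache cache;
    private readonly Dictionary<int, Product> store; // stands in for the database

    public ProductRepository(DictionaryCache cache, Dictionary<int, Product> store)
    {
        this.cache = cache;
        this.store = store;
    }

    public List<Product> GetAllProducts()
    {
        string key = cache.GetCacheKey(REPO_NAME, ObjectType.All, "");
        List<Product> output = cache.Get(key) as List<Product>;
        if (output == null)
        {
            // Copy each row so cached objects are snapshots, as a real query returns.
            output = store.Values
                .Select(p => new Product { ProductId = p.ProductId, Name = p.Name })
                .ToList();
            cache.Add(key, output);
        }
        return output;
    }

    public Product GetProductById(int id)
    {
        string key = cache.GetCacheKey(REPO_NAME, ObjectType.Single, id.ToString());
        Product output = cache.Get(key) as Product;
        if (output == null)
        {
            output = GetAllProducts().SingleOrDefault(p => p.ProductId == id);
            if (output != null) cache.Add(key, output);
        }
        return output;
    }

    public void UpdateProduct(Product changed)
    {
        store[changed.ProductId] = changed; // handle the update as normal
        // Evict the whole-collection entry and the entry for this one product.
        cache.Remove(cache.GetCacheKey(REPO_NAME, ObjectType.All, ""));
        cache.Remove(cache.GetCacheKey(REPO_NAME, ObjectType.Single,
            changed.ProductId.ToString()));
    }
}
```

After UpdateProduct runs, the next GetProductById call for that product misses the cache and picks up the new data, while cached entries for other individual products are left alone.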

This approach doesn't require much more coding than normal, but it allows much more robust caching without the need to accept a certain threshold of stale data, nor does it require incurring the overhead of SQL Server cache invalidation.

Finally, since this approach to creating cache keys results in hierarchical names, it becomes fairly easy to remove entire blocks of cache entries: create a partial cache key, then remove every entry in the underlying cache provider's set of keys whose key .StartsWith(partialKey).
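Prefix-based removal can be sketched against a plain dictionary. PrefixCache and RemoveByPrefix are hypothetical names; against a real cache provider you would enumerate that provider's own keys instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical dictionary-backed cache that supports removal by key prefix.
public class PrefixCache
{
    private readonly Dictionary<string, object> items =
        new Dictionary<string, object>();

    public void Add(string key, object value) { items[key] = value; }
    public bool Contains(string key) { return items.ContainsKey(key); }

    // Removes every entry whose key begins with partialKey.
    // Returns the number of entries removed.
    public int RemoveByPrefix(string partialKey)
    {
        // Snapshot the matching keys first: a dictionary cannot be
        // modified while it is being enumerated.
        List<string> matches = items.Keys
            .Where(k => k.StartsWith(partialKey, StringComparison.Ordinal))
            .ToList();
        foreach (string key in matches)
        {
            items.Remove(key);
        }
        return matches.Count;
    }
}
```

Passing "Products::Single->" drops every individually cached product but keeps "Products::All->", whereas passing "Products::" clears everything the product repository has cached.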

The point is that a bit of planning and effort can take an already powerful component of ASP.NET and make it even better: easier to unit test and mock, and more dependable to work with, all through the use of a simple wrapper.
