ASP.NET MVC, SEO, and NotFoundResults: A Better Way to Handle Missing Content

One of the great things about getting older is you also become wiser. I'm convinced that I'm not nearly as wise as I should be by this point in my life, but I am still learning some new tricks. To that end, I wanted to cover some improvements that I've implemented in the way my ASP.NET MVC projects address SEO considerations for missing content.

MVC and SEO Redux

As I covered two years ago, one of the greatest benefits of ASP.NET MVC (in my mind) is how much more SEO-friendly it is than Web Forms. More specifically, what I really appreciate about ASP.NET MVC is how much easier it makes serving both "bot-friendly" and "user-friendly" errors when they arise from mangled links, changes in URL schemes, or any of the other forms of link rot that can degrade search engine ranking.

Only, with the progression of time, I've been able to see that I was initially using a hammer where I could actually have been using something more like a scalpel. Figuring this out has not only allowed me to create a more streamlined approach to handling missing content on my sites, but it has also made it much easier to add some intelligence into the ways in which I'm processing requests for content that can't be found.

The NotFoundResult in Context

As part of my previous approach to addressing cases where content (notice I didn't say files) was not found, moved, moved permanently, or removed, I created my own custom ActionResult called a NotFoundResult. This NotFoundResult was then leveraged, in turn, in two primary use cases. The first was in scenarios where a Controller Action was trying to look up an object in the database or elsewhere, and wasn't able to find a corresponding match, as follows:

public ActionResult WidgetDetail(string productId)
{
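    // string.Equals tolerates a null productId (e.g., a missing route value)
    if(!string.Equals(productId, "cucumbers", StringComparison.OrdinalIgnoreCase))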
        return new NotFoundResult();
    return View("Cucumbers");
}

And the second was by means of setting up a catch-all route that would map to a Controller Action that would, in turn, just throw out a new NotFoundResult of its own:

routes.MapRoute(
    "CatchAll",
    "{*catchall}",
    new { Controller = "Home", Action = "NotFound" }
);

The benefits of this approach were simple: Whenever I needed to report on content that wasn't found (404), had moved (301 or 302), or had been removed (410), I could do so by throwing out the appropriate HTTP Response Codes to allow bots to update indexes accordingly—while still spitting out perfectly "skinned" error pages that were user-friendly and readable.

A Smarter, More Efficient NotFoundResult

The problem with this approach, however, was that I didn't need to go as "deep" as implementing a full-blown ActionResult (where I then had to tackle the prospect of wiring up "user-friendly" views on my own). Instead, I later found out that I could much more easily achieve the same results by merely extending an MVC ViewResult. Taking this approach made my code less brittle and tons easier to maintain.

Of course, gracefully handling 404s, 301s, and 410s for both end users and bots is one thing—but being able to proactively address issues with changes to your URL scheme or other perennial problems is another. To that end, I discovered that throwing a bit of logic into the mix to be able to better handle the distinction between, say, a 410 and a 301, was a big win from an SEO perspective.

For example, when making major changes to the URL schemes for some of my sites, I found that the ability to "map" those changes by means of distinctive 301 "handlers" was a great boon. So, when changing a route from something like /products/product/cucumbers/ to /product/cucumbers/, I needed a way to ensure that search engines would pick up on the changes, especially for inbound links from other sites. One way to accomplish this, of course, is to use the "Webmaster tools" section of the search engine you want to notify of the changes, but that's not guaranteed to work. As such, a better way to handle "moved" content is to return HTTP 301 (Moved Permanently) responses.

To handle this requirement (and some others), I ended up with a simple .xml file that mapped old request paths (or URLs) to their new URLs—or to suggested URLs if the content was retired. In this way, when processing a NotFoundResult, I was then able to compare currently requested (and not found) URLs against a list of defined URLs that I knew had either been moved (301) or retired (410). Sample "syntax" for my XML was as follows:

<!-- Tom is no longer with SampleCompany -->
<descriptor path="/AboutTom" type="Removed" redirect="/AboutUs/" />
<!-- cleaner/revised sample url/redirect -->
<descriptor path="/Store/Widgets/" type="RedirectPermanent" redirect="/Widgets/" />

These XML values are then pulled into the site upon startup (or when the .xml file changes) and routed through a simple repository that attempts to match the currently requested URL against the older URL (or path) defined in the previous URL scheme.
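As a rough illustration of that repository, here is a minimal sketch of loading the descriptors and matching a requested path. The type and method names (NotFoundDescriptor, NotFoundType, NotFoundDescriptorByPath) come from the code later in this article; everything else is an assumption made for the example: the NotFoundRepository class name, a single root element wrapping the descriptor elements, the mapping of the XML type strings onto the enum (including a hypothetical "Redirect" value for 302s), and the normalization rules. File watching and reloading are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Values mirror the NotFoundType cases used in the article's switch statement.
public enum NotFoundType { Moved, MovedPermanently, Removed }

public sealed class NotFoundDescriptor
{
    public string Path { get; set; }
    public NotFoundType NotFoundType { get; set; }
    public string NewLocation { get; set; }
}

// Hypothetical repository; the class name and Parse method are illustrative.
public sealed class NotFoundRepository
{
    private readonly Dictionary<string, NotFoundDescriptor> _byPath =
        new Dictionary<string, NotFoundDescriptor>(StringComparer.OrdinalIgnoreCase);

    // Assumption: the <descriptor> elements sit under a single root element.
    public static NotFoundRepository Parse(string xml)
    {
        var repository = new NotFoundRepository();
        foreach (XElement element in XDocument.Parse(xml).Descendants("descriptor"))
        {
            var descriptor = new NotFoundDescriptor
            {
                Path = Normalize((string)element.Attribute("path")),
                NotFoundType = ParseType((string)element.Attribute("type")),
                NewLocation = (string)element.Attribute("redirect")
            };
            repository._byPath[descriptor.Path] = descriptor;
        }
        return repository;
    }

    // Returns null when nothing matches, so the caller can fall through to a 404.
    public NotFoundDescriptor NotFoundDescriptorByPath(string path)
    {
        NotFoundDescriptor descriptor;
        return _byPath.TryGetValue(Normalize(path), out descriptor) ? descriptor : null;
    }

    // Assumption: ignore the query string and any trailing slash when comparing.
    private static string Normalize(string path)
    {
        if (string.IsNullOrEmpty(path)) return "/";
        int query = path.IndexOf('?');
        if (query >= 0) path = path.Substring(0, query);
        string trimmed = path.TrimEnd('/');
        return trimmed.Length == 0 ? "/" : trimmed;
    }

    // Assumption: this is how the XML type names map onto the enum.
    private static NotFoundType ParseType(string type)
    {
        switch (type)
        {
            case "Redirect": return NotFoundType.Moved; // hypothetical 302 name
            case "RedirectPermanent": return NotFoundType.MovedPermanently;
            case "Removed": return NotFoundType.Removed;
            default: throw new FormatException("Unknown descriptor type: " + type);
        }
    }
}
```

Keying the dictionary on a normalized, case-insensitive path means /Store/Widgets/, /store/widgets, and /store/widgets?ref=old all resolve to the same descriptor. Whether that is the right matching rule depends on the site.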

Then, if there's a match, I can execute a switch against the NotFoundType, and process 301s, 302s, and 410s as needed:

public override void ExecuteResult(ControllerContext context)
{
    // store references for use in helper methods if needed
    this.Response = context.HttpContext.Response;
    this._controllerContext = context;
    // look for a matching 'descriptor' for the request:
    string path = context.HttpContext.Request.RawUrl;
    NotFoundDescriptor descriptor = this.LookupRepository.NotFoundDescriptorByPath(path);
    if(descriptor != null)
    {
        // NOTE: all 'matches' terminate execution
        // of this helper with a return; statement.
        switch(descriptor.NotFoundType)
        {
            case NotFoundType.Moved: // simple HTTP 302
                Response.Redirect(descriptor.NewLocation);
                return;
            case NotFoundType.MovedPermanently: // HTTP 301
                // .NET 4.0 feature .RedirectPermanent()
                Response.RedirectPermanent(descriptor.NewLocation);
                return;
            case NotFoundType.Removed: // custom implementation/'handler'
                this.ExecuteRemoved(descriptor.NewLocation);
                return;
        }
    }
    // if still here, then 404 (no matching content)
    Response.StatusCode = 404;
    Response.TrySkipIisCustomErrors = true;
    base.ViewName = "NotFound";
    // optionally: report on 404 path/etc.
    // otherwise, let ViewResult spit out
    //     a view (\Shared\NotFound.cshtml)
    //     while preserving HTTP 404 headers.
    base.ExecuteResult(context);
}

And, if there's not a match, it's just assumed that the NotFoundResult is due to mis-requested content or whatever else would cause a normal HTTP 404 to be returned.
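If you want to unit test the branching in the switch above without spinning up an HttpContext, one option is to pull the decision out into a small framework-free helper. The sketch below is only an illustration of that idea, not the code from my sites: NotFoundOutcome, NotFoundPolicy, and the "Removed" view name are hypothetical, while the status codes and the "NotFound" view name follow the listing above.

```csharp
using System;

// Values mirror the NotFoundType cases used in the article's switch statement.
public enum NotFoundType { Moved, MovedPermanently, Removed }

// Hypothetical value object describing what the result should send back.
public sealed class NotFoundOutcome
{
    public NotFoundOutcome(int statusCode, string location, string viewName)
    {
        StatusCode = statusCode;
        Location = location;
        ViewName = viewName;
    }

    public int StatusCode { get; private set; }
    public string Location { get; private set; } // redirect target or suggested link
    public string ViewName { get; private set; } // null when the response is a redirect
}

public static class NotFoundPolicy
{
    // Pass null for type when no descriptor matched the requested path.
    public static NotFoundOutcome Decide(NotFoundType? type, string newLocation)
    {
        switch (type)
        {
            case NotFoundType.Moved:
                return new NotFoundOutcome(302, newLocation, null);
            case NotFoundType.MovedPermanently:
                return new NotFoundOutcome(301, newLocation, null);
            case NotFoundType.Removed:
                // Assumption: a "Removed" view renders the suggested link.
                return new NotFoundOutcome(410, newLocation, "Removed");
            default:
                return new NotFoundOutcome(404, null, "NotFound");
        }
    }
}
```

ExecuteResult would then only need to translate the outcome into a redirect or a status code plus view, which keeps the part that touches the response very thin.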

In all cases, though, I'm still able to both spit out proper HTTP response codes and give human visitors user-friendly error pages that are styled just like the rest of the site. (Or, in the case of 301s and 302s, users are transparently redirected.)

Benefits Beyond SEO

I've found that the benefits of my "smarter" and "simpler" approach to handling NotFoundResults extend above and beyond mere SEO concerns for missing or moved content. For example, I've found that it provides a very powerful way to provide marketing or vanity URLs that don't require me to go into my site and recompile it after hacking away at my routes. (I do have to make sure I'm throwing HTTP 301s to avoid cases where search engines think I have duplicate content though.)

Likewise, since I'm routing all my unmatched routes through a single piece of code, I've found it very easy to add in some alerting routines that let me know when 404s are being thrown on some of my sites. (Obviously, on sites that see a steady stream of bogus requests, alerting on every 404 would be a losing proposition.)
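To give a sense of what such an alerting routine might look like, here is a minimal sketch that only signals once a given path has produced a certain number of 404s within a time window. The NotFoundAlertMonitor class, its threshold, and its window are all hypothetical and chosen for illustration. How the alert is delivered (email, log entry, and so on) is left to the caller.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: decides when repeated 404s for a path merit an alert.
public sealed class NotFoundAlertMonitor
{
    private readonly int _threshold;
    private readonly TimeSpan _window;
    private readonly Dictionary<string, Queue<DateTime>> _hits =
        new Dictionary<string, Queue<DateTime>>(StringComparer.OrdinalIgnoreCase);

    public NotFoundAlertMonitor(int threshold, TimeSpan window)
    {
        if (threshold < 1) throw new ArgumentOutOfRangeException("threshold");
        _threshold = threshold;
        _window = window;
    }

    // Records one 404 and returns true only when this hit brings the path
    // to exactly the threshold within the window, so a burst alerts once.
    public bool Record(string path, DateTime utcNow)
    {
        lock (_hits)
        {
            Queue<DateTime> times;
            if (!_hits.TryGetValue(path, out times))
            {
                times = new Queue<DateTime>();
                _hits[path] = times;
            }

            // Discard hits that have aged out of the window.
            while (times.Count > 0 && utcNow - times.Peek() > _window)
                times.Dequeue();

            times.Enqueue(utcNow);
            return times.Count == _threshold;
        }
    }
}
```

A call such as monitor.Record(path, DateTime.UtcNow) could sit just before the 404 branch hands off to the view. Note that this sketch keeps an entry for every distinct path it has seen, so a real version would want some form of cleanup.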

At any rate, if you're interested in a sample application that provides some working examples of how this process works, just drop me an email at [email protected], and I'd be happy to share.
