Putting the IIS SEO Toolkit to work with ASP.NET MVC URL Canonicalization

 

It's no secret that Microsoft has been putting a huge amount of emphasis on search over the past few years. But I think the $100M they just spent to advertise Bing won't make as much difference as some of the great new dev tools they've released.

Microsoft, the Web, and Search
Bing is a gimmick. By spending $100M on a very highly polished ad campaign, Microsoft isn't just trying to create a new brand and image for their search engine, but for themselves as a company. They're sending the message that they “get” the web and that they “get” search. Only, I'm not sure who they're trying to convince: themselves or us.
That probably sounds a bit harsh. But for all of Microsoft's great strengths, they still haven't quite gotten the web. At least, not as a company. Certain units within Microsoft definitely “get” the web. Others are patently clueless.
For example, key portions of Microsoft.com frequently just disappear without leaving any sort of forwarding address. I typically bump into this a few times every week, and I'm sure I'm not alone. Each time I encounter this problem where Microsoft.com has either moved or deleted content, the result is always the same: Microsoft.com has no CLUE about what I'm looking for or where the resource can be found. Take, for example, a recent journey to Microsoft.com via a search engine to find a download for Microsoft IntelliMouse. The URL that Microsoft used to use made fairly decent sense:
But if you follow that link, you'll get a generic error page that tries to use search results to figure out what you're looking for, rather than just using an HTTP 301 to redirect (because it's not like that download no longer exists on the site).
My point, however, is that it's a pretty big deal for Microsoft to try and convince me that they 'get' the web and 'search' when they can't even address the most basic problem of link rot on their own site.
The IIS SEO Toolkit
My complaints aside, some groups within Microsoft do “get” the web and search. To prove that, the IIS Team recently released the IIS SEO Toolkit. If you haven't already heard of it, it's a simple yet powerful extension that you can add into IIS that will basically act as a bot or spider to crawl your own sites and give you an analysis of how well your site is doing from a search engine optimization (SEO) perspective. Better yet, not only is this toolkit insanely easy to install, but it actually works very well and does a really great job, overall, of helping point out any SEO flaws, weaknesses, or problems that your site (or sites) might have.
More than anything, this tool helps show how segments of Microsoft get the web and search because here's a tool that helps them help their customers to fix any issues they might be having with their own sites. Too bad Microsoft doesn't use this tool internally though. Forcing the departments that don't get the web to use this tool and wade through the pages of errors it would generate for parts of Microsoft.com would be a great experience. It would really help the company get the web as a whole, instead of just spending money to tell us that they get it.
Putting the Toolkit to Work
Using the SEO Toolkit couldn't be easier. All you need to do is download the toolkit, which you can do either by installing the Web Platform Installer, or just by clicking on the download type (x64 or x86) listed below the Web Platform Installer link on the SEO Toolkit homepage. Once it's installed (it's less than 1MB in size), you'll have a new set of tools that you can use from directly within the Internet Information Services Manager.
The tools are very intuitive and easy to use and provide some great analysis. I decided to put them to the test against one of my own sites, SQLServerVideos.com, to see how it would fare against some of the problems that I knew that I had. Happily, the tools did a spectacular job of detecting and outlining those problems, and picked up on a couple of other issues that I was blissfully unaware of.
URL Canonicalization
One area where I knew I had problems was with URL canonicalization. Mostly that was because some of the places on my site use links to other parts of the site with trailing slashes, and in other parts of the site I don't use trailing slashes in my links. Both of the following URLs would lead to the same page on my server:
However, they're different enough from each other that a search engine would have a hard time with them. Or, at least, it would only be able to consider one of those links as being the authoritative, or canonicalized, link to the content in question. Therefore, a common SEO problem that many sites have is that they dilute the applicability of their own content to search engines by providing multiple ways to access their content. This becomes even more problematic when other sites begin linking to your content as well - because it can further dilute the relevance of your content.
What's cool about the IIS SEO Toolkit though is that it can totally show you where you're running into problems like this. And, in my case, not only did it point out that I had set up some problems for myself with trailing slashes, but I had also managed to mix and match case in some places (e.g., using /transcripts/ and /Transcripts/ interchangeably), which has an equally negative effect in terms of SEO.
URL Canonicalization with ASP.NET MVC
To address issues of URL Canonicalization, you just need to pick a scheme and stick to it throughout your site. (Of course, when it comes to SEO, there are other things you need to worry about with your URLs but you can find out about some of them by using the SEO Toolkit.) In my case, I decided that I'd just go with all lower-case URLs followed by a trailing slash. And while I'm still not certain that's the perfect choice, it's still a much better choice than doing nothing and letting my non-canonicalized URLs frolic in the wild without any order or control.
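As a rough sketch of that scheme (lower-case path, trailing slash), the rule itself fits in a few lines of C#. The UrlRules class and Canonicalize method are hypothetical names made up for illustration, and leaving the query string untouched is an assumption of this sketch rather than something the scheme above spells out:

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class UrlRules
{
    // Returns the canonical form of a request path under the scheme above:
    // all lower-case, ending in a trailing slash. Any query string is kept as-is.
    public static string Canonicalize(string pathAndQuery)
    {
        if (string.IsNullOrEmpty(pathAndQuery))
            return "/";

        // Split off the query string so the rules only touch the path.
        int q = pathAndQuery.IndexOf('?');
        string path = q >= 0 ? pathAndQuery.Substring(0, q) : pathAndQuery;
        string query = q >= 0 ? pathAndQuery.Substring(q) : string.Empty;

        path = path.ToLowerInvariant();
        if (!path.EndsWith("/", StringComparison.Ordinal))
            path += "/";

        return path + query;
    }
}
```

With this in place, UrlRules.Canonicalize("/Transcripts") and UrlRules.Canonicalize("/transcripts/") both come back as "/transcripts/", so every variant of a link collapses to a single authoritative URL.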
More importantly though, putting this into play with ASP.NET MVC is not only easy, but highly configurable and easy to control. And there are a couple of reasons for that. First, by using a custom ActionFilter, I can define rules that can easily be applied site-wide. This means that if I need to change my rules later on, I can do it from a single location. Second, because my rules will be defined in an ActionFilter, I have complete control over which parts of my site I want to add them to. So, for example, if a browser is requesting /favicon.ico, I probably don't want to have my rule of forcing my URL to end in a trailing slash in play. Same thing goes for my images, CSS files, JavaScript files, and so on.
But at this point, MVC ActionFilters don't really provide much more than what you could get by using a classic ASP.NET HttpModule because you could just as easily define canonicalization rules in a module as well (and by default it wouldn't intercept requests for images, CSS, JS, and other resources). ASP.NET MVC ActionFilters really shine when you have segments of your site where applying your canonicalization rules really doesn’t work - yet where the content is still dynamically generated or handled by ASP.NET. In cases like that, you'd have to bind additional logic into your HttpModule using classic ASP.NET that told it to ignore those “chunks” of your site. But with ASP.NET MVC ActionFilters, you simply leave off the Canonicalization ActionFilter, and you're good to go because the ActionFilter applies only to the Controller Actions or Controllers that you decorate with the filter.
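For comparison, here is a minimal sketch of what the HttpModule route might look like, including the extra "ignore these chunks" logic described above. Everything in it is an assumption for illustration: the CanonicalizeModule name, the excluded prefixes, and the premise that static files never reach the module (it would still have to be registered in web.config):

```csharp
using System;
using System.Web;

// Hypothetical module, shown only to illustrate the exclusion logic
// an HttpModule would need; the excluded prefixes are made-up examples.
public class CanonicalizeModule : IHttpModule
{
    // Dynamic sections of the site where the rules should NOT apply.
    private static readonly string[] ExcludedPrefixes = { "/feeds/", "/api/" };

    public void Init(HttpApplication application)
    {
        application.BeginRequest += OnBeginRequest;
    }

    private static void OnBeginRequest(object sender, EventArgs e)
    {
        HttpApplication application = (HttpApplication)sender;
        Uri url = application.Request.Url;
        string path = url.AbsolutePath;

        if (IsExcluded(path))
            return;

        // Same rules as before: lower-case path with a trailing slash.
        string canonical = path.ToLowerInvariant();
        if (!canonical.EndsWith("/", StringComparison.Ordinal))
            canonical += "/";

        if (canonical != path)
        {
            HttpResponse response = application.Response;
            response.StatusCode = 301;
            response.AppendHeader("Location", canonical + url.Query);
            application.CompleteRequest(); // skip the rest of the pipeline
        }
    }

    // True when the path falls inside a section that opts out of the rules.
    public static bool IsExcluded(string path)
    {
        foreach (string prefix in ExcludedPrefixes)
        {
            if (path.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }

    public void Dispose() { }
}
```

The point of the sketch is the ExcludedPrefixes list: with a module, the opt-outs live in one central list that has to be kept in sync with the site, whereas with an ActionFilter each controller or action opts in simply by carrying the attribute.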
So, in my case, I created a simple action filter that uses brute force to determine if requested URLs match my rules. If they don't, then I force an HTTP 301 (which I should really refactor into its own method) to where the canonicalized, or authoritative, resource should be.
using System;
using System.Web;
using System.Web.Mvc;

public class SEOCanonicalize : ActionFilterAttribute
{
    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        // grab the URL, keeping the path and the query string separate
        // so that the rules below only ever touch the path:
        HttpContextBase Current = filterContext.HttpContext;
        string path = Current.Request.Url.AbsolutePath;
        string query = Current.Request.Url.Query; // "" or "?key=value"

        // check for any upper-case letters:
        if (path != path.ToLowerInvariant())
        {
            string newLocation = path.ToLowerInvariant() + query;

            Current.Response.StatusCode = 301;
            Current.Response.TrySkipIisCustomErrors = true;
            Current.Response.Status = "301 Moved Permanently";
            Current.Response.AppendHeader("Location", newLocation);

            // setting a Result short-circuits the request so that the
            // action itself doesn't run and write a body after the redirect:
            filterContext.Result = new EmptyResult();
            return;
        }

        // make sure that the path ends in a "/"
        //      (doesn't apply in some cases... but those methods/etc won't
        //          be explicitly decorated with this attribute)
        if (!path.EndsWith("/"))
        {
            string newLocation = path + "/" + query;

            Current.Response.StatusCode = 301;
            Current.Response.TrySkipIisCustomErrors = true;
            Current.Response.Status = "301 Moved Permanently";
            Current.Response.AppendHeader("Location", newLocation);

            filterContext.Result = new EmptyResult();
            return;
        }

        base.OnActionExecuting(filterContext);
    }
}
With those rules defined in a single location, I'm then free to apply them to any Controllers or Controller Actions where I want my URLs canonicalized by merely decorating my code with the appropriate attribute like so:
    [SSVError]
    [Location]
    [SEOCanonicalize] // force url standardization
    public class VideoController : Controller
    {
        // because the ActionFilter is specified
        // at the controller level, it will be
        // applied to all Controller Actions.
 
        // Otherwise, I could have just applied
        // it to the one or two (for example)
        // actions that I wanted it added to.
 
    }
And, as the comments in that code indicate, ActionFilters can be applied either at the Controller level, or down on individual controller Actions. This, in turn, gives me perfect flexibility, perfect code reuse, and makes it very easy to quickly enforce canonicalization of my URLs on my site.
Conclusion
Of course, as great as ASP.NET MVC is in helping me enforce these rules (primarily when other users or sites are linking to my site), and as great as the SEO Toolkit was in pointing out these issues, I still had to go back into my site and correct my root problems: namely sloppily-defined URLs within my own markup.
Interestingly enough, I think Microsoft is in the same boat because they need to figure out some way of fixing their problems with link rot. But with the SEO Toolkit, they've got a great resource to help them see just how big the problem is. Then, maybe once they do that, the rest of the world will be able to take them a bit more seriously when they say that they “get” both the web and search.
 