SharePoint: Garbage and Governance - 14 Dec 2009

The last few days, I was feeling a bit under the weather, which is unfortunate because the weather here in Hawaii has been unusually gorgeous the last few weeks. All I wanted to do was be outside watching surfers, snorkeling with seals and turtles, and whale watching—it’s been a great few weeks here to be sure. Instead, I was in my office having trouble focusing on anything and finding myself deluged with a mountain of digital garbage. My business hit the “tipping point” as several of our storage options—servers, NAS devices, and cloud-based storage—all managed to fill up at just about the same time. Most problematically, one of my SharePoint content databases was much bigger than I would want it to be for manageable backup and restore.

I also discovered that one of our web applications (actually our most important one) had managed to get three additional content databases attached to it without the correct settings, and so site collections had managed to get created in the wrong content DB. What I was reminded of was the importance of content promotion and demotion strategies for SharePoint and other digital information in an enterprise. Let me explain…

SharePoint makes it easy for users to generate and store content. That’s its beauty. But all that content goes into content databases in SQL Server, where it’s not exactly easy to manage. Of course, the object of interest in this discussion is the content database, which can store one or more site collections.

The content database is the object you should be managing carefully, because it’s the most straightforward level of backup and restore, database protection, etc. And when it comes to managing content storage—particularly “throwing out” the digital garbage—nothing is easier than simply decommissioning site collections that live in a content database and detaching or deleting that database.

That’s why it’s particularly important that you consider the mapping of your site collections (and therefore sites) to content databases. A well-designed governance plan should ensure that your content databases support your service level agreements or objectives for performance, backup, and restore.

The classic example here is that if you have both Finance and Marketing intranet collaboration sites in the same content database, and Marketing does something to blow up their site that requires a restore, you end up “rolling back” Finance as well if you’re using SharePoint’s out-of-box backup and restore capabilities. This story changes with the addition of most third-party backup utilities and with SharePoint 2010.

Similarly, if you have a temporary collaboration project that is going to generate a sizeable amount of content—perhaps it includes media assets (videos, photos, audio) or lots of documents—when that project ends, how do you clean out the content database if that project shares a content database with a longer term collaboration effort, such as the Finance collaboration site?

So in addition to supporting performance, backup and restore SLAs, your content databases should—as much as possible—support content lifecycle requirements. Help make it easy for yourself to decommission content, whether that means deleting or archiving the content.
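
To make that mapping concrete, here is a minimal Python sketch of one possible governance rule. Everything in it is an assumption for illustration: the `SiteCollection` record, the `lifecycle` labels, and the `WSS_Content_...` naming scheme are hypothetical, not anything SharePoint prescribes. The idea is simply that long-lived sites share a database per owning department, while each temporary project gets a database of its own that can be detached when the project ends.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCollection:
    url: str
    owner: str      # owning department, e.g. "Finance" (hypothetical field)
    lifecycle: str  # "long-term" or "temporary" (hypothetical labels)


def database_for(site: SiteCollection) -> str:
    """Pick a content database name for a site collection.

    Assumed rule: temporary projects get their own database so they can be
    decommissioned without touching anyone else; long-term sites are grouped
    by owner so a restore for one department does not roll back another.
    """
    if site.lifecycle == "temporary":
        # Use the last URL segment as a database-friendly suffix.
        slug = site.url.rstrip("/").rsplit("/", 1)[-1].replace("-", "_")
        return f"WSS_Content_Temp_{slug}"
    return f"WSS_Content_{site.owner}"


finance = SiteCollection("http://intranet/sites/finance", "Finance", "long-term")
marketing = SiteCollection("http://intranet/sites/marketing", "Marketing", "long-term")
launch = SiteCollection("http://intranet/sites/launch-video", "Marketing", "temporary")

for site in (finance, marketing, launch):
    print(site.url, "->", database_for(site))
```

With a rule like this written down, Finance and Marketing never share a database, and the media-heavy launch project can be dropped with its database when it wraps up. Your own rule will likely be driven by size limits and SLAs rather than just department and lifecycle.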

Along the same line of thought, you should consider content promotion strategies. What I mean by that is, many collaborative projects produce both short-term and long-term content. If a new product is being developed, the team’s meetings, discussions, and development documents are important at the time. When the project is finished, the final design is what needs to “live on” while most of the content that led to that final product does not need to persist.

The mistake many organizations make is to keep the entire project collaboration site alive in perpetuity, which leads to just the kind of digital garbage scenario I faced last week. Instead, you should have a plan and a process for moving the long term content to another “place” in your SharePoint implementation, a place where content lives longer; and another process for demoting, archiving, or deleting the “garbage” content.

In the process of fixing my own digital garbage problem, I used STSADM’s “mergecontentdbs” operation to move site collections between content databases, so that all of my “expired” site collections were together in content databases that I was then able to take offline. I had to do this because the site collections had been (unbeknownst to me) created in the wrong content databases to begin with.

A shout out to Todd Klindt for his 2007-but-still-excellent blog entry, Move Site Collections in a Single Bound, which does a great job of stepping you through the use of this command. This is also documented (if more succinctly) on TechNet.
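
Before moving anything, I find it helps to write the plan down. The following is a minimal Python sketch of that planning step only; it does not call STSADM, and the URLs, database names, and the `plan_moves` helper are all hypothetical. It takes an inventory of which site collection currently lives in which content database, plus the set of site collections you consider expired, and lists the moves needed to gather them into one database.

```python
def plan_moves(current_db, expired_urls, archive_db):
    """Return (url, source_db, destination_db) tuples for expired site
    collections that are not already in the archive database.

    current_db maps a site collection URL to the name of the content
    database it lives in today.
    """
    moves = []
    for url in sorted(expired_urls):
        source = current_db[url]
        if source != archive_db:
            moves.append((url, source, archive_db))
    return moves


# Hypothetical inventory: site collection URL -> current content database.
current_db = {
    "http://intranet/sites/finance": "WSS_Content_Main",
    "http://intranet/sites/old-project-a": "WSS_Content_Main",
    "http://intranet/sites/old-project-b": "WSS_Content_Extra",
    "http://intranet/sites/old-project-c": "WSS_Content_Expired",
}
expired = {
    "http://intranet/sites/old-project-a",
    "http://intranet/sites/old-project-b",
    "http://intranet/sites/old-project-c",
}

for url, source, destination in plan_moves(current_db, expired, "WSS_Content_Expired"):
    print(f"move {url}: {source} -> {destination}")
```

Each line of the resulting plan corresponds to one site collection to hand to the STSADM merge operation, following the steps in Todd’s post or the TechNet documentation.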

A couple of other interesting blog entries I came across while waiting for my content databases to reorganize themselves include the following:

Cory Burns’ SharePoint Site Moves, Database Moves and Balancing Growth, which takes a more performance-oriented perspective on content databases.

Gary Lapointe’s super-useful “Create site in database” command, which is a tool I use regularly; it should be part of every SharePoint Pro’s toolkit.

So, let's enumerate the takeaways:
1. Your governance plan should address what types of site collections go into which content databases based on SLAs and content lifecycle requirements.
2. Create site collections in the correct content databases using a tool such as Gary’s STSADM extension.
3. Promote content out of limited-life collaboration sites into longer term stores.
4. Get rid of collaboration sites when the project or effort concludes, based on your content lifecycle governance.
5. Don’t get buried by digital garbage during periods of excellent weather or holidays. Trust me, it’s not worth it.

Dan
