A significant number of organizations that use Microsoft Office SharePoint Server (MOSS) are failing to leverage some easy ways to improve both the quality of their data and the quality of their search results and associated user experience.
Companies typically face common SharePoint search problems when they attempt to implement useful metadata options and quick and easy customizations. Others are constantly seeking small ways to move their intranets, collaboration data, and portals in the right direction for growth and maintenance. I want to provide help in those directions. I won't promise a huge lesson in enterprise information architecture or any grand scheme for overhauling the governance of your data or SharePoint environment, but I can suggest a number of free/inexpensive tools and useful ideas to help you improve the structure and content of your data.
The most common scenario I see among SharePoint-using clients is a lack of design and planning at the data level. Many organizations have spent considerable time and IT dollars building a hardware and farm infrastructure, but they've spent little or no time working on the design of the actual data. A significant number of these implementations include a Help desk, site-provisioning tools, custom site definitions and templates, and a formal process for managing the farm, but they don't have a single custom content type, and during the analysis and design phase they haven't created any customized search results. Site administrators typically let the site owners use the available out-of-the-box SharePoint tools to organize their own data. This approach leads to either very little data management or inconsistent architecture and design across sites and search results.
The problem grows over time as users add, version, and collaborate on larger and larger amounts of data in sites that have little or no metadata or classification. Users continue to upload documents into the pile and rely on SharePoint's search engine to index the content and return the right results. Eventually, this system breaks down: the volume of documents becomes so large that search results are littered with items that technically match the query but aren't what the user wanted. SharePoint's out-of-the-box search cries out for some options to filter the data into usable compartments. These filters can be standard metadata items such as the author, content type, and language; however, additional options available via search scopes and limitations based on location or custom properties can greatly increase relevancy.
IT pros within the organization face a daily challenge. They generally need to understand enough about all the disparate data sources within the corporate firewall to locate the information pertinent to their job functions. They often ask to search multiple locations from a single location instead of logging on to remote applications or websites, searching each one, and tallying the results manually. They want more options to sort and drill down on the data returned. They also might need to manage the data, either by asking for and receiving additional metadata within their results, gaining access to custom search applications, or modifying components of the actual data as required and allowed.
Start with Legwork
You need to realize that in a large organization it's almost impossible to perform a complete analysis and formulate a master plan in advance. Convincing budget makers, stakeholders, users, and a committee to take on months of meetings and design sessions is rarely feasible. The risk starts to become too visible. Although the rewards can seem empowering, they can also be very difficult to achieve. My opinion is that a waterfall approach to this process is a setup for failure.
Instead, I recommend tackling the first small problem you want to solve. Such problems will be different for each department and user, but you're probably considering this project because you're already aware of a few data, organizational, or search concerns based on community complaints. Those with the loudest complaint will be the most likely to help formulate a solution, creating the perfect opportunity to start solving specific, incremental problems.
This article is about tools and options for correcting such problems, but you need to understand the importance of advance legwork. Forming a small committee of decision makers and users willing to meet quickly every week can be beneficial. This group can help communicate requirements from different aspects of the organization and can evaluate potential tools and solutions in a testing environment. Those involved also serve to evangelize your options within the organization—key to getting the word out about any changes and to soliciting feedback.
Before we jump into your options, remember to stay focused without losing sight of the big picture. Keep your cycles short, and get some small wins, but understand that each small win adds another component to your overall solution. You'll gradually gain knowledge about the data in your organization while also solving specific problems. With proper attention to the big picture, you should end up with a relatively stable solution and a significant understanding of how your architecture is pieced together (as well as what gaps remain).
Let's begin with basic options available to everyone using at least MOSS. Following are simple descriptions of the options and how you can use them.
Content Sources. Content Sources define the locations that SharePoint's crawling engine reads and builds a searchable index for. Keep in mind that you can split them up, even among internal SharePoint locations, so that each has its own schedule and crawl rules, which helps with time management and with handling large data stores.
Managed Properties. Managed Properties are the searchable metadata items built from what the crawling engine finds when viewing your data (the crawled properties). The crawler even picks up custom columns in SharePoint lists. You can map these crawled properties to custom managed properties and use them as rules and filters in search scopes and advanced searching techniques.
Search Scopes. You can tell SharePoint to limit the scope of a search based on managed property (equal or not equal to), by locations, and by content class. The content class is a little-known property in SharePoint that represents an item's internal classification—for example, List Type and Item Type. You can use these properties to create scopes that will return only web pages instead of list items or documents. A significant number of other classification options are also available.
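To make the rule types concrete, here is a minimal Python sketch of how scope rules might be evaluated against indexed items. It is only a model for reasoning about scopes, not SharePoint's implementation: the IndexedItem and ScopeRule classes are hypothetical, the URLs are made up, and treating every include rule as required is a simplifying assumption. The content class strings are shown for illustration; check the values in your own index.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedItem:
    url: str
    content_class: str  # illustrative values: "STS_Web", "STS_ListItem_DocumentLibrary"
    properties: dict = field(default_factory=dict)


@dataclass
class ScopeRule:
    kind: str             # "location", "contentclass", or "property"
    name: str = ""        # managed property name, used only by "property" rules
    value: str = ""
    include: bool = True  # False means "exclude items that match"

    def matches(self, item):
        if self.kind == "location":
            return item.url.lower().startswith(self.value.lower())
        if self.kind == "contentclass":
            return item.content_class == self.value
        if self.kind == "property":
            return item.properties.get(self.name) == self.value
        raise ValueError(f"unknown rule kind: {self.kind}")


def in_scope(item, rules):
    """Simplified model: match every include rule and no exclude rule."""
    return all(rule.matches(item) == rule.include for rule in rules)


# Hypothetical index contents.
items = [
    IndexedItem("http://intranet/hr", "STS_Web"),
    IndexedItem("http://intranet/hr/docs/policy.doc",
                "STS_ListItem_DocumentLibrary", {"Department": "HR"}),
    IndexedItem("http://intranet/finance", "STS_Web"),
]

# A scope limited to site-level entries under the HR location.
hr_sites_only = [
    ScopeRule("location", value="http://intranet/hr"),
    ScopeRule("contentclass", value="STS_Web"),
]
print([i.url for i in items if in_scope(i, hr_sites_only)])
```

Swapping the second rule for an exclude rule on the document content class would instead keep everything except documents.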
Thesaurus. Many administrators aren't aware of the SharePoint thesaurus, a system-level XML file that lets you create global replacements for common terms. This feature removes the burden on site-collection administrators of creating custom keywords for each new site collection when common domain-level terms need to have synonyms in their searches.
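As a rough illustration of what a global synonym list does to a query, the following Python sketch expands each query term into its synonym group. The function name and the sample terms are hypothetical, and the real thesaurus is an XML file maintained on the server rather than Python code; the sketch only models the effect.

```python
def expand_query(query, synonym_sets):
    """Return, for each query term, the sorted list of terms to search for."""
    expanded = []
    for term in query.lower().split():
        # Use the first synonym set containing the term, else the term alone.
        group = next((s for s in synonym_sets if term in s), {term})
        expanded.append(sorted(group))
    return expanded


# Hypothetical domain-level synonyms shared by every site collection.
synonyms = [{"pto", "vacation", "leave"}]
print(expand_query("PTO policy", synonyms))
```

Because the list lives in one place, every site collection gets the same expansion without any per-site keyword setup.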
Keywords and Best Bets. Critical and often overlooked, these items give the individual site collection the power to create keywords with any number of synonyms as a search replacement. The real benefit is the ability to create Best Bets, which let users add links to any content that will automatically appear at the top of search results when a keyword or synonym appears in the search terms.
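The behavior described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not SharePoint's matching logic: the Keyword class, the whole-word matching rule, and the URLs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Keyword:
    phrase: str
    synonyms: list = field(default_factory=list)
    best_bets: list = field(default_factory=list)  # URLs to pin at the top


def with_best_bets(query, organic_results, keywords):
    """Put Best Bet links ahead of the ordinary results for matching keywords."""
    # Pad with spaces so that only whole words or phrases match.
    normalized = " " + " ".join(query.lower().split()) + " "
    pinned = []
    for kw in keywords:
        triggers = [kw.phrase, *kw.synonyms]
        if any(f" {t.lower()} " in normalized for t in triggers):
            pinned.extend(u for u in kw.best_bets if u not in pinned)
    return pinned + [u for u in organic_results if u not in pinned]


keywords = [
    Keyword("expense report", ["expenses"],
            ["http://intranet/finance/expenses.aspx"]),
]
print(with_best_bets(
    "submit expenses",
    ["http://intranet/a.doc", "http://intranet/finance/expenses.aspx"],
    keywords,
))
```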
Custom search pages. The standard Search Results page is made up of a significant number of web parts that offer the end user a large number of options. I'll discuss just the core Search Results web part. This web part is essentially one large Extensible Stylesheet Language (XSL) transform code block. It takes the Extensible Markup Language (XML) search results and transforms them into whatever layout you, as the designer, put in place for the end user.
One of the best ways to evaluate your options is to look at the raw XML returned from your search to see what data is actually available for designing a custom search page. You'll see that many properties are included—specifically, the custom properties you've defined in your Shared Services Provider (SSP). This step lets you include additional data, group the data and add custom links based on If statements, and so on. Depending on the type of data you expect to return in your results, you can create very specific views of this data.
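Once you have captured the raw XML, a short script can summarize which properties are actually populated. The Python sketch below assumes a results document with a root element containing Result elements whose children are the returned properties; the sample XML, the element names, and the department property are illustrative assumptions, so compare them with what your own farm returns.

```python
import xml.etree.ElementTree as ET

# Hypothetical raw search results; real output will carry more properties.
SAMPLE_RESULTS_XML = """
<All_Results>
  <Result>
    <id>1</id>
    <title>Q3 Budget</title>
    <url>http://intranet/finance/q3-budget.doc</url>
    <author>J. Smith</author>
    <department>Finance</department>
  </Result>
  <Result>
    <id>2</id>
    <title>Travel Policy</title>
    <url>http://intranet/hr/travel-policy.doc</url>
    <author>A. Jones</author>
  </Result>
</All_Results>
"""


def available_properties(raw_xml):
    """Count how many results carry a non-empty value for each property."""
    root = ET.fromstring(raw_xml.strip())
    counts = {}
    for result in root.iter("Result"):
        for child in result:
            if (child.text or "").strip():
                counts[child.tag] = counts.get(child.tag, 0) + 1
    return counts


for name, count in available_properties(SAMPLE_RESULTS_XML).items():
    print(f"{name}: present in {count} result(s)")
```

A property that shows up in only some results, like the hypothetical department value here, is a good candidate for conditional display in the XSL.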
Understand that you aren't limited to the single Search Results page offered by SharePoint's out-of-the-box search center. You can create as many custom pages as you want, with very specific criteria and results layouts. Simply linking to them from appropriate locations within your organization can direct people to more focused search locations.
Third-Party or In-House Tool Options
Although there are many additional out-of-the-box enterprise search-management approaches, you need to be aware of additional tools and third-party components. The tools below are in no particular order, and many are open-source.
User ratings. With the advancement of Web 2.0—and its focus on socialization, networking, and data-interaction freedom—SharePoint content needs a boost to handle some of the requirements of this new environment. Thankfully, SharePoint is an easy platform to work with from a development perspective, and there are some free and inexpensive third-party web solutions that can do most of the work for you.
Rating content has become crucial to the interactive style of modern technical communication. Not all enterprise data needs or warrants rating from the user community, but a large amount does. You can download and install functionality to provide a common star-rating column for any SharePoint list or library. The tools are intelligent enough to permit only a single rating by each user account, and they also support comments. Site owners can add this feature to only the lists and libraries they choose. The ratings are simple and easy to add to search results pages, with filtering based on minimum star rating.
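The rating rules described above (one rating per user, filtering on a minimum star rating) can be modeled in a few lines of Python. This is a hypothetical sketch, not the code of any particular rating tool; in particular, letting a user's new rating overwrite the old one is an assumption about how "a single rating" is enforced.

```python
class RatingStore:
    """Keeps at most one star rating per user for each item."""

    def __init__(self):
        self._ratings = {}  # item URL -> {user account: stars}

    def rate(self, item_url, user, stars):
        if not 1 <= stars <= 5:
            raise ValueError("stars must be between 1 and 5")
        # Assumption: rating again replaces the user's earlier rating.
        self._ratings.setdefault(item_url, {})[user] = stars

    def average(self, item_url):
        votes = self._ratings.get(item_url, {})
        return sum(votes.values()) / len(votes) if votes else None


def filter_by_min_rating(urls, store, min_stars):
    """Keep only results whose average rating meets the minimum."""
    return [u for u in urls if (store.average(u) or 0) >= min_stars]


store = RatingStore()
store.rate("http://intranet/a.doc", "alice", 5)
store.rate("http://intranet/a.doc", "alice", 3)  # replaces alice's first rating
store.rate("http://intranet/a.doc", "bob", 4)
store.rate("http://intranet/b.doc", "bob", 2)
print(filter_by_min_rating(
    ["http://intranet/a.doc", "http://intranet/b.doc", "http://intranet/c.doc"],
    store, 3))
```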
Facets. There are free open-source tools available that allow dynamic pivoting on properties returned in the Search Results XML file. You can customize the tools to add or limit the specific properties that are available for pivoting in each search result. They are 100 percent UI-based, letting you select links to continue drilling down and filtering on as many properties as you want. The software shows you which filters you've applied and lets you remove them individually at any time. You can add these web parts to any Search Results page.
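To show what such a tool computes behind the UI, here is a minimal Python sketch of faceting: count the values of selected properties across the current results, then narrow the results as filters are applied or removed. The function names, property names, and sample results are hypothetical.

```python
from collections import Counter


def facet_counts(results, facet_properties):
    """Count each value of the chosen properties across the results."""
    counts = {prop: Counter() for prop in facet_properties}
    for result in results:
        for prop in facet_properties:
            value = result.get(prop)
            if value:
                counts[prop][value] += 1
    return counts


def apply_filters(results, filters):
    """Keep results that match every active filter (property -> value)."""
    return [r for r in results
            if all(r.get(p) == v for p, v in filters.items())]


results = [
    {"title": "Q3 Budget", "author": "Smith", "filetype": "doc"},
    {"title": "Q3 Forecast", "author": "Smith", "filetype": "pdf"},
    {"title": "Travel Policy", "author": "Jones", "filetype": "doc"},
]

active = {"author": "Smith"}          # the user clicked one facet link
narrowed = apply_filters(results, active)
print(facet_counts(narrowed, ["author", "filetype"]))
```

Removing a filter is just deleting its key from the active dictionary and recomputing.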
Federation. In the summer of 2008, Microsoft released its Infrastructure Update for MOSS, which included the ability to call external (or internal) search locations and return the results to a web part. You can use almost any search engine to run queries in real time and return the results to SharePoint. This becomes a powerful tool when you're creating single-location search centers that can simultaneously search all internal search engines and return the results from a single query. With Federation, you can even search SharePoint search scopes specifically by using a federated location, thus querying multiple SharePoint search scopes in a single query but segmenting their results into usable buckets. These federated search web parts can also search external search engines if you need to include results from public locations. (Be aware that your users will be broadcasting search terms to public locations.)
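The pattern behind a single-location search center can be sketched in Python: send one query to several locations at once and keep each location's results in its own bucket. This models the idea only. The location names and the stand-in search functions are hypothetical, and real federated locations are configured in SharePoint rather than coded this way.

```python
from concurrent.futures import ThreadPoolExecutor


def federated_search(query, locations):
    """Run one query against every location and bucket results by location.

    `locations` maps a display name to a callable that takes the query
    string and returns a list of results.
    """
    buckets = {}
    with ThreadPoolExecutor(max_workers=max(1, len(locations))) as pool:
        futures = {name: pool.submit(search, query)
                   for name, search in locations.items()}
        for name, future in futures.items():
            try:
                buckets[name] = future.result()
            except Exception:
                # One failing location should not break the whole page.
                buckets[name] = []
    return buckets


# Stand-ins for real search back ends.
def search_intranet(query):
    return [f"intranet hit for '{query}'"]


def search_wiki(query):
    raise RuntimeError("wiki search is offline")


print(federated_search("travel policy",
                       {"Intranet": search_intranet, "Wiki": search_wiki}))
```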
Export data into SharePoint. Although this capability might seem backwards from a search perspective, consider exporting data from other line-of-business applications into HTML pages and importing them into SharePoint at regular intervals. Think about some potential wins: Owners of other applications get to choose what data to query and export, they can design a metadata scheme to apply to their data as it's imported into SharePoint, they choose the intervals at which data is exported, and they control the layout and structure of how their data is viewed. Using some of the SharePoint web services or relatively simple programming can accomplish the import tasks. Therefore, SharePoint can have native content added to lists and libraries and crawl it as local content instead of using the Business Data Catalog or Federated Search to query external data held within other applications.
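As a sketch of the export side, the Python function below renders one record as a small HTML page and writes the owner's chosen metadata as meta tags. Everything here is illustrative: the function name, field names, and sample record are hypothetical, and whether the crawler surfaces HTML meta tags as properties you can map is an assumption to verify in a test environment. The upload step (for example, through the SharePoint web services) is not shown.

```python
from html import escape


def render_export_page(title, body_text, metadata):
    """Render one exported record as a small, crawlable HTML page."""
    meta_tags = "\n".join(
        f'    <meta name="{escape(name)}" content="{escape(str(value))}">'
        for name, value in sorted(metadata.items())
    )
    return (
        "<html>\n"
        "  <head>\n"
        f"    <title>{escape(title)}</title>\n"
        f"{meta_tags}\n"
        "  </head>\n"
        "  <body>\n"
        f"    <h1>{escape(title)}</h1>\n"
        f"    <p>{escape(body_text)}</p>\n"
        "  </body>\n"
        "</html>\n"
    )


# Hypothetical record from a line-of-business application.
page = render_export_page(
    "Order 1001",
    "Shipped to customer; invoice pending.",
    {"Customer": "Acme & Co", "Region": "West"},
)
print(page)
```

Escaping every value keeps characters such as ampersands in the source data from producing malformed pages.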
Custom web parts. With four or five days of development, you can build a custom web part that queries an internal database, looks up metadata for common document details within your organization, and uploads a document with routing rules. If you have proprietary business data that would be beneficial to apply as metadata to SharePoint documents, a custom tool can be powerful. Essentially, the project queries other internal databases to look up pertinent data that you want to apply to documents being uploaded to SharePoint. With some basic business logic, this web part can look up linked data based on user selections, then upload and route the document based on the applied metadata. This solution lets you apply important properties to your SharePoint content without requiring your users to enter all the data by hand.
BDC. Although entire books have been written about the Business Data Catalog (BDC), it's worth mentioning the power that it can hold from a data-querying and -retrieval perspective. The BDC data can be quite interactive and used in various web parts, can connect to other web parts for filtering, and can be added to lists as custom columns. Ultimately, it can be read in a very similar fashion to list data in SharePoint. What we care about is how it can be searched. The BDC can connect to internal applications that can be accessed via an ADO.NET provider or web services. Data can be set up as a content source in SharePoint (Enterprise version) for crawling and indexing and can then be searched and returned via standard SharePoint searching capabilities. This can all be done without coding, yet a significant amount of XML must be written. A few excellent tools in the marketplace can help you create these XML definition files.
I want to call out some tools that are available as free downloads. Most of the options outlined above are available in some form at CodePlex. There are various versions, each with strengths and weaknesses. I encourage you to set up a test environment and test them. The site contains almost all the tools and utilities you'll need to help with search; I use them regularly. Here are some additional items and ideas to consider:
- Viewing tool—This tool lets you load and view all your SharePoint sites from a tree view, starting at a web application and drilling down to properties on a list item.
- Search tool—This tool lets you query the engine directly via an external UI.
- Tool for modifying relevancy rankings and testing the results.
- XSL samples from other people in the community—These can show you what others are building for search results pages.
- Adding wildcard searching options.
- Better management of searches based on custom properties.
- Regular Expression searching tools—These let you create custom regular expressions to search content in SharePoint. They're ideal for uncovering specific formats of data, such as credit card numbers, telephone numbers, and social security numbers.
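To give a feel for the kind of expressions such a tool might use, here is a minimal Python sketch that scans plain text for the three formats mentioned in the last item. The patterns are illustrative assumptions covering common U.S. formats only, not the patterns of any particular tool, and the Luhn checksum is added here simply to cut down on false credit card matches.

```python
import re

# Illustrative patterns; real-world data will need broader coverage.
PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
    "us_phone": re.compile(
        r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan(text):
    """Return (label, matched text) pairs for every pattern hit."""
    findings = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            if label == "credit_card" and not luhn_valid(match.group()):
                continue
            findings.append((label, match.group()))
    return findings


sample = "Card 4111 1111 1111 1111, SSN 123-45-6789, call 555-867-5309."
for label, value in scan(sample):
    print(label, value)
```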