To access all figures and tables for this article, please download the PDF version here.
In a previous article, I focused on the process of Planning Search Services for Microsoft Office SharePoint Server (MOSS) 2007. In this article, I am going to take what we learned from that article and apply it to Configuring Search Services.
When you initially install MOSS 2007 and perform the post-install configuration, you are guided through starting the Search Service. This article does not cover that process; it assumes the Search Service has been configured with the appropriate account and is running.
To start the configuration process, open the SharePoint 3.0 Central Administration utility. Once it is open, select the Shared Services Provider (SSP) you created during the post-installation process. On the Shared Services Administration home page, the options are grouped into six sections. Locate the Search Settings option and select it. From the Configure Search Settings page you can configure all of the search settings except Keywords and Best Bets, which are covered later in this article.
Default Content Access Account
The account used for authentication during the Content Source crawl process was configured when you initially set up and started the Shared Services Provider (SSP) Search Service. This account must have read access to all configured Content Sources; any content it cannot read will not be available to users in search results.
Configuring Content Sources
Use the Content Source documentation you created during the planning process to start configuring the server. From the Configure Search Settings page, click the Content sources and crawl schedules link. You will be taken to the Manage Content Sources page, where you can add, edit, and delete Content Sources. In Figure 1, I have four Content Sources configured. To add your first Content Source, click the New Content Source option on the blue menu bar. You will be taken to the Add Content Source page, where you specify the Content Source Name, Content Source Type, Starting Addresses (except for Business Data), Crawl Settings, Crawl Schedules, and whether a full crawl should start immediately after the configuration process is complete. In Figure 2, I demonstrate how to set up a File Shares Content Source Type.
In the example File Shares Content Source, Figure 2, I have given it a name of Marketing Files to represent the type of information being crawled. I have then provided the file share I would like the crawler to scan and parse. In addition, I have told the crawler to scan and parse all files, including subfolders. I have set up both a Full Crawl and Incremental Crawl. The Full Crawl will run every morning at 4am and the Incremental Crawl will run hourly. Lastly, since this is a new Content Source, I have indicated I would like a Full Crawl to start once I have completed the configuration steps.
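The settings chosen for the Marketing Files example can also be written down as a small record, which is handy for keeping the planning documentation in a checkable form. This is only a sketch in Python: the ContentSource class and its field names are my own invention, not a SharePoint API, and the share path is a made-up placeholder because the real one appears only in Figure 2.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Optional


@dataclass
class ContentSource:
    """Illustrative record of one Content Source; not a SharePoint type."""
    name: str
    source_type: str                      # e.g. "File Shares", "Web Sites"
    start_addresses: List[str]
    crawl_subfolders: bool = True
    full_crawl_daily_at: Optional[time] = None
    incremental_every_minutes: Optional[int] = None
    start_full_crawl_now: bool = False

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means nothing is missing."""
        problems = []
        if not self.name.strip():
            problems.append("name is required")
        # Business Data sources do not take starting addresses.
        if self.source_type != "Business Data" and not self.start_addresses:
            problems.append("at least one start address is required")
        if (self.incremental_every_minutes is not None
                and self.incremental_every_minutes <= 0):
            problems.append("incremental interval must be positive")
        return problems


marketing_files = ContentSource(
    name="Marketing Files",
    source_type="File Shares",
    start_addresses=[r"\\fileserver\marketing"],  # hypothetical share path
    crawl_subfolders=True,
    full_crawl_daily_at=time(4, 0),               # Full Crawl every morning at 4am
    incremental_every_minutes=60,                 # Incremental Crawl hourly
    start_full_crawl_now=True,
)
print(marketing_files.validate())
```

Running validate() on each planned Content Source before opening Central Administration is one way to catch a missing address or schedule early.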
More on Crawl Schedules
Crawl Schedules define how often Full and Incremental crawls are executed. The crawl process can consume significant server resources and, depending on the Content Source location, network bandwidth. Define your crawl schedules based on how often the content changes and how up to date the search results need to be. For example, if your Content Source is an external web site, the source information may be updated only once each week. In that case, there is no need to schedule a crawl more often than once a week.
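The rule of thumb above can be expressed as a one-line calculation. The sketch below is my own simplification, not something MOSS computes for you: it assumes you can estimate how often a source changes and how stale users can tolerate results being, and it suggests crawling no more often than the slower of the two.

```python
from datetime import timedelta


def suggested_crawl_interval(change_interval: timedelta,
                             max_staleness: timedelta) -> timedelta:
    """Suggest a crawl interval (illustrative heuristic only).

    Crawling more often than the content changes wastes server resources,
    and crawling more often than users need fresh results does the same,
    so the larger of the two intervals is returned.
    """
    return max(change_interval, max_staleness)


# An external web site updated weekly, even if users would like hourly freshness:
external_site = suggested_crawl_interval(timedelta(weeks=1), timedelta(hours=1))

# A busy file share that changes every few minutes, where hourly freshness is enough:
file_share = suggested_crawl_interval(timedelta(minutes=5), timedelta(hours=1))

print(external_site, file_share)
```

Treat the result as a starting point and adjust it after watching how long real crawls take on your servers.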
Content Source Configuration Information
Table 1 provides information you will be configuring for each of your Content Sources.
Configuring Crawl Rules
To configure Crawl Rules, go to the Configure Search Settings page and click the Crawl Rules link. Here you can specify rules that include or exclude specific paths. If a specific source has special authentication needs, they can also be configured here. In the example in Figure 3, I have configured an exclusion rule. All content from the address http://library.findlaw.com/wills-trusts-and-estate-planning/* will be excluded from the search index.
Continue to add inclusion and exclusion rules as defined during your planning process. The Manage Crawl Rules page also supports a feature that allows you to test your rules. You simply type an address and MOSS will tell you if it matches a rule.
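To get a feel for how a wildcard rule such as the one in Figure 3 behaves, the matching can be imitated outside of SharePoint. The sketch below uses Python's fnmatch module as a stand-in for the MOSS matcher, which is an assumption on my part; it also assumes rules are checked from top to bottom with the first match winning, and that an address matching no rule is crawled. The test addresses are made up for illustration.

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple

# Each rule is (pattern, include?). Order matters: first match wins (assumed).
CrawlRule = Tuple[str, bool]

RULES: List[CrawlRule] = [
    ("http://library.findlaw.com/wills-trusts-and-estate-planning/*", False),
]


def match_rule(url: str, rules: List[CrawlRule]) -> Optional[CrawlRule]:
    """Return the first rule whose pattern matches the address, if any.

    Note: fnmatch also treats '?' and '[' as wildcards, which is fine for
    this pattern but differs from a plain '*'-only matcher.
    """
    for pattern, include in rules:
        if fnmatchcase(url.lower(), pattern.lower()):
            return (pattern, include)
    return None


def is_crawled(url: str, rules: List[CrawlRule]) -> bool:
    """An address is crawled unless the first matching rule excludes it."""
    rule = match_rule(url, rules)
    return True if rule is None else rule[1]


# Hypothetical addresses used only to exercise the rule:
print(is_crawled("http://library.findlaw.com/wills-trusts-and-estate-planning/example.html", RULES))
print(is_crawled("http://library.findlaw.com/index.html", RULES))
```

This mirrors what the test feature on the Manage Crawl Rules page does for a single address, and it can be useful for checking a long list of addresses from your planning documents in one pass.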
Configuring File Types
MOSS 2007 Search Services is initially configured to index many types of files, including all Microsoft Office files, web pages, text files, XML, and even image (TIFF) files. You can configure which file types are indexed by clicking the File Types link on the Configure Search Settings page.
To index a file's content, MOSS 2007 requires a filter (IFilter) to be installed for each configured File Type. For example, a default MOSS 2007 installation does not include an IFilter for Adobe PDF files. In most cases you will want these files included in your index, so you will need to install a PDF IFilter.
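The relationship between the File Types list and installed IFilters can be pictured as two sets: a file's content is indexable only when its extension appears in both. The sketch below is a simplified model with made-up sets, not a query against a real server; the extension lists are illustrative and do not reproduce the actual MOSS defaults.

```python
import os
from typing import List, Set

# Illustrative sets only; check the File Types page and your servers for real values.
FILE_TYPES: Set[str] = {"doc", "xls", "ppt", "htm", "html", "txt", "xml", "tif", "tiff", "pdf"}
INSTALLED_IFILTERS: Set[str] = FILE_TYPES - {"pdf"}   # no PDF IFilter installed yet


def extension_of(filename: str) -> str:
    """Return the lowercase extension without the leading dot."""
    return os.path.splitext(filename)[1].lstrip(".").lower()


def can_index_content(filename: str, file_types: Set[str], ifilters: Set[str]) -> bool:
    """True when the type is on the File Types list and a filter exists for it."""
    ext = extension_of(filename)
    return ext in file_types and ext in ifilters


def missing_filters(file_types: Set[str], ifilters: Set[str]) -> List[str]:
    """List configured file types that have no installed filter."""
    return sorted(file_types - ifilters)


print(missing_filters(FILE_TYPES, INSTALLED_IFILTERS))
print(can_index_content("Brochure.PDF", FILE_TYPES, INSTALLED_IFILTERS))
```

Comparing the two lists this way makes it obvious which filters still have to be installed after you add a new File Type.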
Configuring Search Scopes
Configuring each Search Scope takes two steps: creating the Search Scope itself, then applying Scope Properties and Rules. As you learned during the Planning Search Services process, Search Scopes let you include or exclude content in a logical grouping. The content you choose to include or exclude can be a specific web address, a specific property, a Content Source, or all content.
To get started, first create a Search Scope. Click the View Scopes link on the Configure Search Settings page, then click the New Scope link on the menu bar. In the example in Figure 4, I am creating a new Search Scope named Sales. This Search Scope will contain only Sales-related sites and the Marketing files. When you have finished entering your Search Scope information, click OK.
The next step in our process is to edit the new Search Scope properties and rules. To access the Search Scope Properties and Rules, place your mouse over the item, open the drop-down menu and click the Edit Properties and Rules link; see Figure 5.
From the Scope Properties and Rules page click the New Rule link to add a new rule. You will then be taken to the Add Scope Rule page. In Figure 6, I am configuring the Marketing Files Content Source to be included. You will follow these same steps to configure each rule.
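It can help to think of a Search Scope as a list of include and exclude rules evaluated against each indexed item. The sketch below models that idea for the Sales scope. It is my own simplification: the class names are invented, property rules are left out, the site address is a placeholder, and I assume an item belongs to the scope when at least one include rule matches and no exclude rule does.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    """A crawled item, reduced to the two facts the rules below look at."""
    url: str
    content_source: str


@dataclass
class ScopeRule:
    kind: str             # "content_source", "web_address", or "all_content"
    value: str = ""
    include: bool = True  # False means the rule excludes matching items

    def matches(self, item: Item) -> bool:
        if self.kind == "all_content":
            return True
        if self.kind == "content_source":
            return item.content_source == self.value
        if self.kind == "web_address":
            return item.url.lower().startswith(self.value.lower())
        raise ValueError(f"unknown rule kind: {self.kind}")


def in_scope(item: Item, rules: List[ScopeRule]) -> bool:
    """Included by at least one rule and excluded by none (assumed semantics)."""
    included = any(r.include and r.matches(item) for r in rules)
    excluded = any(not r.include and r.matches(item) for r in rules)
    return included and not excluded


sales_scope = [
    ScopeRule("content_source", "Marketing Files"),
    ScopeRule("web_address", "http://portal/sites/sales"),  # hypothetical site address
]

brochure = Item(r"\\fileserver\marketing\brochure.doc", "Marketing Files")  # hypothetical
hr_page = Item("http://portal/sites/hr/policies.aspx", "Local Sites")       # hypothetical
print(in_scope(brochure, sales_scope), in_scope(hr_page, sales_scope))
```

Sketching a scope this way before creating it makes it easier to spot an item that would be pulled in, or left out, by accident.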
Configuring Keywords and Best Bets
Keywords and Best Bets are among the few search settings you configure from your Portal rather than from the Central Administration utility. To configure Keywords and Best Bets, open a Portal site, go to Site Settings, and click the Search keywords link. Once there, click the Add Keyword link on the menu bar. You will now be on the Add Keyword page, as seen in Figure 7.
In Figure 7, I have configured a new Keyword named “Products”. In addition, I have configured the synonyms catalog and product, so any of the three terms will locate the same Best Bets. For each Keyword, you can associate one or more Best Bets. These Best Bets are displayed to the user when a search matches the Keyword or one of its synonyms.
This is the same process you will go through for each Keyword you wish to configure. After configuring my “Products” Keyword, I went to the Search Center and searched for product (a synonym for “Products”) and the results are shown in Figure 8. Notice the Best Bets section, highlighted in yellow, where our Keyword, Keyword Definition and Best Bets are displayed.
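The way a Keyword and its synonyms lead to the same Best Bets can be pictured as a simple lookup table. The sketch below is a model of that behavior, not SharePoint code: the Keyword class is my own, the Best Bet address is a placeholder, and I assume matching is case-insensitive on the whole query.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Keyword:
    phrase: str
    synonyms: List[str]
    best_bets: List[str]


def build_index(keywords: List[Keyword]) -> Dict[str, Keyword]:
    """Map the keyword phrase and every synonym to the same Keyword entry."""
    index: Dict[str, Keyword] = {}
    for kw in keywords:
        for term in [kw.phrase, *kw.synonyms]:
            index[term.lower()] = kw
    return index


def best_bets_for(query: str, index: Dict[str, Keyword]) -> List[str]:
    """Return the Best Bets for a query, or an empty list if nothing matches."""
    kw = index.get(query.strip().lower())
    return kw.best_bets if kw else []


products = Keyword(
    phrase="Products",
    synonyms=["catalog", "product"],
    best_bets=["http://portal/products"],  # hypothetical Best Bet address
)
index = build_index([products])

# Searching for the synonym "product" finds the Best Bets of "Products".
print(best_bets_for("product", index))
```

The keyword phrase and both synonyms all resolve to the same entry, which is why a search for product surfaces the “Products” Best Bets.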
It is my hope you have a better understanding of how to configure Microsoft Office SharePoint Server (MOSS) 2007 Search Services. I realize this is not an all-inclusive, completely exhaustive, or step-by-step guide to all aspects of the configuration process. It is meant to provide the groundwork for helping you understand what the configuration process entails.
To gain further knowledge of the overall Search Service Configuration process, I invite you to read Deployment for Office SharePoint Server, which can be found on Microsoft Office SharePoint Server TechCenter.