I recently called Microsoft Customer Service and Support (CSS) to help resolve what I thought was an undocumented error. As it turns out, the error was documented—I just couldn't find a reference to it in the 80 PDF manuals that came with this particular Microsoft product. Luckily, the support engineer I talked with was familiar with the error and knew the exact manual that I had to reference.
After that support incident, I recalled that I had used the Adobe PDF IFilter plug-in for the Microsoft Indexing Service several years ago to search through PDF files. Back then, I had only a dozen Adobe PDF files in a directory of hundreds of .doc, .txt, .html and .mht files. However, I had to search every file for specific text strings, and IFilter served this purpose well.
With the propeller hat spinning full tilt, I decided to again use IFilter with the Indexing Service for the purpose of searching Adobe PDF files. But this time, I created a customized Microsoft Management Console (MMC) snap-in for the UI. Although you can use Adobe Acrobat Reader to search through PDF files in a specified directory, it takes an extremely long time if that directory is large (e.g., 65MB). With the MMC snap-in, the search is almost instantaneous. Here's how you can create this snap-in on your local computer:
- Go to http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611 and download IFilter 6.0. This version supports most 32-bit Windows desktop and server versions from Windows 2000 through Windows Server 2003. (See the IFilter 6.0 download page for details.) If you already have IFilter 5.0 installed, uninstall it first. I found that version 6.0 automatically corrects a registry entry and a DLL registration that had to be manually corrected in version 5.0.
- Following the instructions provided on the Adobe Web site, install IFilter 6.0. I chose to install it to C:\Program Files\Adobe\PDF IFilter. After you install IFilter, restart your machine.
- Select Run under the Start menu. Type mmc and click OK.
- From the File menu, select Add/Remove Snap-in and click Add.
- In the Add Standalone Snap-in dialog box, highlight the Indexing Service snap-in and click Add.
- In the Connect to Computer dialog box, select Local computer and click Finish.
- Click Close in the Add Standalone Snap-in dialog box, then click OK in the Add/Remove Snap-in dialog box.
- In the Console Root window, right-click Indexing Service on Local Machine, select the New option, and click Catalog. In the Add Catalog dialog box, provide a name and location for the catalog you're creating. If you want to put the catalog in a new directory, be sure to create this directory beforehand in Windows Explorer. For this example, let's create the My Documents\Index Catalog Files\My PDFs directory for the catalog, which we'll name My PDFs. Click OK in the Add Catalog dialog box. When the message "Catalog will remain off-line until Indexing Service is restarted" appears, click OK again to create the catalog. In this case, the Indexing Service creates the My Documents\Index Catalog Files\My PDFs\catalog.wci directory.
- You need to stop the Indexing Service before you can restart it, so in the Console Root window, right-click Indexing Service on Local Machine and select Stop. Then, right-click Indexing Service on Local Machine and select Start. The unpopulated statistics for your new catalog will appear in the right pane. Don't worry if only zeros appear. This step simply builds the indexing framework for the catalog. Later, when you add a directory to the catalog, you'll provide a path to the directory containing the PDF files that will populate the catalog.
- When you use IFilter with the Indexing Service, the Indexing Service indexes not only PDF files but also all the files it natively supports, such as .doc, .txt, and .html files. Thus, I recommend that you use Windows Explorer to remove any nonessential subdirectories and files from the directory that contains the PDF files you want to be able to search. In my first test of the catalog, the directory of PDF files I wanted to search had a subdirectory that contained 50MB of streaming video files. Those streaming video files were indexed, which added an unnecessary 65MB to the index catalog.
- In the Console Root window, expand the directory that contains the My PDFs catalog. Right-click Directories, select New, then choose Directory. To fill in the Path field, browse to the directory that contains the PDF files you want to be able to search. For this example, let's say these files are in a directory named PDF Manuals. You can also enter the directory's Universal Naming Convention (UNC) name in the Alias (UNC) field if you want. Click OK. You can add as many directories as you want in the catalog by simply repeating this step.
- Right-click the path under the Directory header, then select All Tasks followed by Rescan (Full). At this point, if you click Indexing Service on Local Machine, you'll see the My PDFs entry starting to populate. This task took about five minutes for my collection; larger collections will take longer. Note that the more you move your mouse around, the longer it'll take to populate the catalog. Mouse movement causes the Indexing Service to pause because it perceives that movement as user activity on the PC.
- If you want to add a desktop icon for your new catalog, go to the Console Root window and expand the My PDFs catalog. Right-click Query the Catalog, then select the New Window option. The Query the Catalog dialog box should appear. Close the Console Root window behind the Query the Catalog dialog box because you don't need that window in your finished product. On the toolbar, click the Show/Hide Console Tree button so that all you see is the Indexing Service Query Form. Maximize the Query the Catalog dialog box. On the File menu, select Save As. Name the file My PDFs.msc and save it in the folder that contains the index framework directory (My Documents\Index Catalog Files). I don't recommend that you save it directly in the index framework directory (My Documents\Index Catalog Files\My PDFs) because if you perform an Empty Catalog operation, that operation deletes everything in that directory, including the Management Saved Console (.msc) file you just created. Close all the MMC dialog boxes. When you're asked whether you want to save the console settings, click No. You just saved the .msc file, and you don't want to overwrite that file.
- Use Windows Explorer to create a shortcut to the My PDFs.msc file.
- Test your new MMC by clicking the shortcut. A window titled "My PDFs - Query the Catalog" that contains the Indexing Service Query Form should appear.
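Because the Indexing Service indexes every file type it supports, not just PDFs, it can help to see what a directory actually contains before you add it to the catalog. The following Python sketch isn't part of the procedure above; it's just one way to total the files under a directory by extension so that large, nonessential content (such as the video files I mentioned) stands out. The function names are my own, and the script only reports on files; it doesn't touch the Indexing Service.

```python
import os
from collections import defaultdict


def summarize_by_extension(root):
    """Return {extension: (file_count, total_bytes)} for all files under root."""
    totals = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Files without an extension are grouped under "(none)".
            ext = os.path.splitext(name)[1].lower() or "(none)"
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # skip files that can't be read
            totals[ext][0] += 1
            totals[ext][1] += size
    return {ext: (count, size) for ext, (count, size) in totals.items()}


def print_report(root):
    """Print one line per extension, largest total size first."""
    summary = summarize_by_extension(root)
    ranked = sorted(summary.items(), key=lambda item: item[1][1], reverse=True)
    for ext, (count, size) in ranked:
        print(f"{ext:<10} {count:>6} files {size / (1024 * 1024):>10.1f} MB")
```

To use it, call print_report with the directory you plan to index, for example print_report(r"C:\PDF Manuals"). That path is hypothetical; substitute your own.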
The custom MMC works well and performs searches in seconds. However, I've come across two quirks you need to be aware of:
- When you're searching for a specific term such as Root Kit Virus, be sure to enclose the term in quotes and select the Advanced Query option. If you don't select the Advanced Query option, the Indexing Service will return every document that has any of those words within its contents. Although there's a Tips for searching link that has a Query Syntax button to help with search syntax, I've found that the button doesn't work on my machine. The workaround is to use the Help Topics option on the Help menu.
- If you right-click a drive in Windows Explorer and select Properties, you'll see the Allow Indexing Service to index this disk for fast file searching check box. Do yourself a big favor and leave this check box selected, which is the default. I once cleared it as a test, thinking I could just select it again, but clearing it broke the Indexing Service's ability to index Microsoft Internet Explorer's (IE's) .mht file type.
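To see why the quotes matter in the first quirk, consider the difference between matching any word and matching an exact phrase. The Python sketch below is a toy model I put together for illustration, with made-up file names and contents. It isn't how the Indexing Service is implemented; it only mimics the behavior described above.

```python
import re

# Hypothetical documents, used only for this illustration.
DOCS = {
    "security.pdf": "How to remove a root kit virus from a server.",
    "gardening.pdf": "Trim the root before planting.",
    "tools.pdf": "The repair kit includes a virus scanner.",
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_any_word(query, document):
    """True if the document contains at least one word from the query."""
    doc_words = set(tokenize(document))
    return any(word in doc_words for word in tokenize(query))


def matches_phrase(query, document):
    """True if the query's words appear consecutively and in order."""
    q = tokenize(query)
    d = tokenize(document)
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


def search(query, matcher):
    """Return the sorted names of the documents that the matcher accepts."""
    return sorted(name for name, text in DOCS.items() if matcher(query, text))


print(search("Root Kit Virus", matches_any_word))
print(search("Root Kit Virus", matches_phrase))
```

The any-word search returns all three documents because each contains at least one of the words, whereas the phrase search returns only security.pdf.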
If Microsoft's Indexing Service is of interest to you, you can find more information about it in "How to create and configure a catalog for indexing" (http://support.microsoft.com/?kbid=308202).