This week, I’m taking a much-needed break from the technical details to share my thoughts about the gross inefficiency of most search engines and the unpredictable results they produce. As regular readers of this column are aware, I do a significant amount of research to deliver current information about bugs, hotfixes, and irregularities (and I occasionally offer a word of praise). I also do a great deal of research to answer seemingly simple questions for which no apparent answer exists among the endless information sources we have at our fingertips. I search through product-specific online documentation, enormous volumes of Resource Kit publications, TechNet CD-ROMs, and vast numbers of Web sites, including regular visits to Microsoft's and Windows 2000 Magazine's sites.
I can’t count the number of times I've searched in vain for specific information about Windows NT and Win2K, especially at Microsoft. I’m continually aghast that my searches for one or two keywords—common terms such as "high-encryption" or the text of an error message—return zero, nil, zilch, nada, over and over and over again. The searches can’t or won’t find what I’m looking for, regardless of whether I dig through online Help, the Resource Kit, TechNet, the online Knowledge Base, or even the Windows 2000 Magazine Web site.
For example, I went to the Microsoft Knowledge Base to troubleshoot an annoying Win2K RRAS event log message that appeared after I upgraded to Service Pack 1 (SP1). I entered the search string "Event Id 20106," and, of course, the search "did not find any matches." The equivalent search on TechNet "returned 0 results." And a search for "RRAS errors" returned a whopping two documents, neither of which even remotely applied to my problem. Would it be so difficult for Microsoft to document all the event log messages that Win2K can issue? The need for such a reference seems so obvious. Does this reference exist, and if so, where? Do you suppose that the Knowledge Base search engine can find this document for me?
I recently experienced a similar phenomenon when I attempted to locate back issues of this column at the magazine’s Web site. (I don’t keep local copies of my columns because I figure they’re online and eminently searchable.) However, in this instance, the failure was at least partly my fault because I didn’t read the search Help instructions until after I was thoroughly frustrated, a typical techie tactic. I went to http://www.win2000mag.com and entered the search string "keeping up with nt." You guessed it—no matches. I discovered that to find specific columns, I must preface the search string with "department" and a colon. Next, I typed the search string as department: "Keeping Up With NT," because that’s how the column title appears at the Web site. As luck would have it, my first attempt contained the only combination of upper and lowercase characters that would not produce a match. When I finally read the search Help instructions, I found a concise description of the search engine match criteria that explained how the engine treats lowercase and uppercase characters.
I have a few general suggestions that might reduce the frequency of the "0 results found" response. For starters, to ensure some base level of consistency, we need an international (perhaps language-specific) standard for search engines that outlines the basic functionality in detail. Many search engines follow similar conventions for quoted strings, plus signs, and minus signs, but vary widely in how they treat uppercase versus lowercase characters, abbreviations, and special characters. With a standard in place, the hosting Web site would have to conform to it and clearly document any additional features.
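To make the idea of a baseline concrete, here is a minimal sketch in Python of the kind of matching rule such a standard might spell out: fold case and treat punctuation as word breaks before comparing. The function names `normalize` and `matches` are hypothetical, and the rules shown are an assumption for illustration, not a description of how any site mentioned here actually works.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Fold case and collapse punctuation so equivalent queries compare equal."""
    # NFKC unifies visually equivalent characters; casefold removes case differences.
    text = unicodedata.normalize("NFKC", text).casefold()
    # Treat any run of characters other than a-z and 0-9 as a single space.
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return text.strip()


def matches(query: str, document: str) -> bool:
    """Return True when every normalized query word appears in the document."""
    doc_words = set(normalize(document).split())
    return all(word in doc_words for word in normalize(query).split())
```

Under these assumed rules, "keeping up with nt" and "Keeping Up With NT" are the same query, and "Event Id 20106" finds a document that says "Event ID: 20106".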
Next, every searchable Web site should include a Help button that describes how the matching algorithms work. If Microsoft had such a button on the Knowledge Base page, I might be able to figure out how to find "Event Id 20106."
Third, because terminology varies so widely across disciplines, each searchable site should make the search engine’s index of terms available upon request. If I can first look at an index of terms, I can enter a search string with some confidence that I'll get the results I'm after. At the Microsoft Knowledge Base site, for example, an index would let me know to enter "RRAS" instead of "Routing and Remote Access" and "128-bit" instead of "high-encryption."
Finally, we need a search engine that understands when terms are equivalent or closely related: an engine with enough intelligence to return all references that contain either "128-bit" or "high-encryption." A search engine with smarts should also be able to find all references to a proper name without requiring that I enter all possible permutations (e.g., first name-last name, last name-first name, first, last, comma here, comma there, and so on).
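As a rough illustration of the equivalence idea, the Python sketch below expands a query term into its known equivalents before searching and generates the orderings of a proper name. Everything in it is hypothetical: the `EQUIVALENTS` groups, the function names, and the simple substring matching are assumptions chosen to keep the example short, not a description of any real search engine.

```python
from itertools import permutations

# Hypothetical equivalence groups; a real site would maintain its own list.
EQUIVALENTS = [
    {"128-bit", "high-encryption"},
    {"rras", "routing and remote access"},
]


def expand_term(term: str) -> set[str]:
    """Return the term plus every term listed as equivalent to it."""
    key = term.lower()
    for group in EQUIVALENTS:
        if key in group:
            return set(group)
    return {key}


def name_variants(*parts: str) -> set[str]:
    """Return every ordering of the name parts, with and without commas."""
    lowered = [part.lower() for part in parts]
    variants = set()
    for order in permutations(lowered):
        variants.add(" ".join(order))
        variants.add(", ".join(order))
    return variants


def search(term: str, documents: list[str]) -> list[str]:
    """Return the documents that contain the term or any of its equivalents."""
    wanted = expand_term(term)
    return [doc for doc in documents if any(w in doc.lower() for w in wanted)]
```

With these assumed groups, a search for "128-bit" also returns a document that mentions only "high-encryption", and `name_variants("Jane", "Doe")` covers "jane doe", "doe jane", and the comma-separated forms of each.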
I took some measure of comfort when a highly respected colleague recently asked how I find anything in the Microsoft Knowledge Base. And I know, deep down, that millions of folks are in the same jungle of cyberspace research, enduring the litany of "nothing matched your search" and "0 results found." Collectively, we wonder whether we've gone temporarily insane or have somehow regressed overnight into technical novices. Although I claim no expertise in the subject of search engines, I'm a very frustrated user. I know the industry can do a much better job; we can only hope it does so sooner rather than later.