Progressive Perl for Windows: Searching the Registry

Use the RegistryFind.pl script to search the registry for keywords that you specify.

Dave Roth

November 18, 2001


At the very root of the Win32 OS is a database called the registry, which is a clever way of consolidating configuration information in one neat, easy-to-access place. The absolute beauty of the registry is that it unifies how you access configuration data. No longer does a script need to know how to parse different .ini, .cfg, .txt, and other configuration files.

Windows uses the registry to store almost all configuration information for the OS as well as other programs and services. A Perl script can be a mighty powerful tool with which to access the registry. You can access information such as IP addresses, installed programs, and ODBC configurations.

Administrators rely on the two GUI utilities regedit and regedt32 to examine and modify the registry. These utilities work well when you don't have to make many modifications or queries. However, using these tools can be quite time-consuming when you have many queries or modifications to make. A better approach is to leave the grunt work to a good script. Perl is one of the most flexible languages for writing such a script.

Scripting the Registry
As a quick overview for those not too familiar with it, the registry is much like a directory on a hard disk. It contains subdirectory-like containers, called keys, that can contain subkeys (much as a directory can contain subdirectories). A key or subkey can also contain values. Think of a value as a file: it's a named container that holds data. The top-level keys are called roots, each of which is the base of a different tree of registry data. Table 1 shows the roots on a Win32 (or a forthcoming Win64) machine.

The Win32 OS constantly interacts with the registry to query and update keys and values. When the OS actually accesses a value's data, it sees the data in one of several formats, called a data type. The data can be a regular string (REG_SZ), a string with environment variables (REG_EXPAND_SZ), a 32-bit value (REG_DWORD), or a binary string (REG_BINARY), among other types. Figure 1 shows how the Registry Editor displays all these registry components.
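
As a rough illustration of how a script might treat these data types differently when it prints them, here is a minimal sketch. The numeric type codes are the ones the Windows API defines; describe_value() is a hypothetical helper, and the sketch assumes that REG_DWORD data arrives as a plain number and REG_BINARY data as a raw byte string.

```perl
use strict;
use warnings;

# Numeric type codes as defined by the Windows API.
my %TYPE_NAME = (
    1 => 'REG_SZ',
    2 => 'REG_EXPAND_SZ',
    3 => 'REG_BINARY',
    4 => 'REG_DWORD',
);

# describe_value() is a hypothetical helper that turns one value's
# type code and raw data into printable text.
sub describe_value {
    my ($type, $data) = @_;
    my $name = $TYPE_NAME{$type} || "type $type";
    my $text;
    if ($type == 4) {
        # Assumption: a REG_DWORD arrives as a plain number.
        $text = sprintf('0x%08x', $data);
    }
    elsif ($type == 3) {
        # Assumption: REG_BINARY arrives as a raw byte string.
        $text = join(' ', map { sprintf('%02x', ord($_)) } split(//, $data));
    }
    else {
        # REG_SZ and REG_EXPAND_SZ are already readable strings.
        $text = $data;
    }
    return "$name: $text";
}

print describe_value(1, 'C:\\Program Files'), "\n";
print describe_value(4, 255), "\n";
print describe_value(3, "\x01\xff"), "\n";
```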

Looking for What You Want
In theory, you shouldn't need to manually manipulate the registry. A well-behaved program lets you modify its settings, then updates the registry with the new configuration. Likewise, such a program removes its settings from the registry if you remove the program. If only all programs were well behaved. Quite often, software upgrades don't remove registry keys and values that are no longer needed and applications fail to clean out the registry when you remove them.

The result can be serious registry bloat, in which the registry database gets so large that the system takes an annoyingly long time to process a simple query. When you're using an application such as Windows Explorer that constantly queries the registry, bloat can be frustrating. For example, if you want to create a new folder or shortcut, severe registry bloat might make you wait several seconds between the time that you right-click the folder in which you want to create the new item and select New, and the time that the list of possible items you can create appears.

When you encounter such a wait, it's time to search for orphaned values (i.e., values that aren't used) and value data in the registry. You could use regedit or regedt32's Find command; however, this command finds only the first instance of the search string. You must run the command repeatedly to find all occurrences of the string. Here, my RegistryFind.pl script comes to the rescue by finding all instances of a string in one shot. After you find the orphans, you can evaluate whether to delete or modify them.

How It Works
RegistryFind.pl, which Listing 1 shows, isn't complex. Rather, the script illustrates how simple it is to use Perl to automate searching the registry for a series of keywords. RegistryFind.pl's ProcessKey() subroutine, which begins at callout A in Listing 1, is the real engine. The subroutine accepts two parameters: a registry key ($Root) and a path ($Path).

The $Root parameter is a Win32::Registry object that represents a particular location in the registry. By default, the script passes in $HKEY_LOCAL_MACHINE, but you can specify any registry root. The $Path parameter specifies a particular subkey in the root. For example, if you call the ProcessKey() subroutine with the code

ProcessKey( $HKEY_LOCAL_MACHINE, "SOFTWARE\\Microsoft" );

the subroutine examines the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft key. The line of code at callout I sets the root that the subroutine will use (i.e., the default root) if the user doesn't specify one. Later, the code at callout J looks up any registry root that the user specifies and assigns the appropriate Win32::Registry object to the $Config hash's root key.

The ProcessKey() subroutine attempts to open the specified key path at callout B. If the subroutine is successful, it collects a list of all the key's subkeys (at callout C), which the subroutine will process later. At callout D, the subroutine collects all the opened key's values and their data. At callout E, the subroutine examines each value to see whether it contains the target string.

Notice the line at callout F. For reasons that date back to 16-bit Windows, every registry key has a default class value. This value has no name, so it appears as an empty string (""). Win32 platforms don't really use these unnamed values, but because they have no name, the subroutine substitutes a descriptive label for the empty name so that the user is aware of their special property.

After the subroutine has examined all the values in the opened key, it closes the key and then iterates through each subkey (at callout H). Before the iteration loop, the subroutine appends a backslash to the $Path variable to make it easier to process if the script has to recursively call back into the ProcessKey() subroutine. If $Path is an empty string, then the subroutine doesn't append a backslash because a registry path can't begin with a backslash.
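
The walk described above can be sketched roughly as follows. This is not the code from Listing 1: process_key(), its argument order, the case-insensitive matching, and the array of matches are all assumptions made for illustration. The sketch takes the root as a parameter and relies only on the Open(), GetKeys(), GetValues(), and Close() methods that Win32::Registry objects provide.

```perl
use strict;
use warnings;

# process_key() is a hypothetical stand-in for the article's ProcessKey().
# $root is any object with Win32::Registry-style Open/GetKeys/GetValues/Close
# methods; $matches is an array reference that collects the results.
sub process_key {
    my ($root, $path, $target, $matches) = @_;
    my $key;

    # Try to open the key; give up on this branch if the open fails.
    return unless $root->Open($path, $key);

    # Collect the subkey names now and process them after closing the key.
    my @subkeys;
    $key->GetKeys(\@subkeys);

    # GetValues() fills the hash with entries of the form [ name, type, data ].
    my %values;
    $key->GetValues(\%values);
    $key->Close();

    # Assumption: the comparison ignores case.
    foreach my $name (sort keys %values) {
        my $data = $values{$name}[2];
        $data = '' unless defined $data;
        if (index(lc($name), lc($target)) >= 0
            or index(lc($data), lc($target)) >= 0) {
            push @$matches, [ $path, $name, $data ];
        }
    }

    # A registry path can't begin with a backslash, so add the separator
    # only when there is already a path to extend.
    my $prefix = $path eq '' ? '' : "$path\\";
    process_key($root, $prefix . $_, $target, $matches) foreach sort @subkeys;
    return;
}

# On Windows, after "use Win32::Registry;", a call might look like:
#   my @found;
#   process_key($HKEY_LOCAL_MACHINE, 'SOFTWARE\Microsoft', 'wmplayer', \@found);
```

Closing the key before recursing keeps only one key open at a time, however deep the walk goes.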

Note that RegistryFind.pl uses one of the tricks I discussed last month (at callout K). The script passes in the name of the script (in the $0 variable) on a call to the Win32::GetFullPathName() function. The first element of the array that the function returns is the directory; the second element is the script's filename. The code accesses the script's filename so that the Syntax() subroutine can print it. This technique is useful because a user might rename the script.
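
To show the idea of printing whatever name the script currently has, here is a minimal sketch. It swaps in the core File::Basename module for Win32::GetFullPathName() so that it also runs off Windows; script_name(), syntax_text(), and the wording of the usage line are assumptions, not the script's actual Syntax() subroutine.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# script_name() is a hypothetical helper. It uses File::Basename instead of
# Win32::GetFullPathName() so that the sketch runs on any platform.
sub script_name {
    my ($invoked_as) = @_;
    return basename($invoked_as);
}

# syntax_text() is a hypothetical stand-in for a Syntax() subroutine; the
# usage wording is an assumption based on the sample commands in the text.
sub syntax_text {
    my ($script) = @_;
    return "Usage: perl $script [-r ROOT] string [string ...]\n";
}

# $0 holds the name under which the script was started, so a renamed
# copy of the script reports its new name.
print STDERR syntax_text(script_name($0));
```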

Give It a Try
The script is easy to use. Just run it, passing in the string you are searching for, with a command such as

perl RegistryFind.pl wmplayer

The above sample command prompts RegistryFind.pl to search through the entire HKEY_LOCAL_MACHINE root for any instances of the string wmplayer in value names and value data. You can easily modify the script to search key names for the specified target string. The script also lets you specify different registry roots. For example, if you use the command

perl RegistryFind.pl  -r HKEY_CLASSES_ROOT wmplayer

the script searches for the string wmplayer only in the HKEY_CLASSES_ROOT subtree. You can also pass in multiple strings with a command such as

perl RegistryFind.pl  -r HKEY_CLASSES_ROOT  wmplayer wmserver mplayer2
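
One way to handle a command line like the ones above is sketched below, using the core Getopt::Long module. The script's real option handling may differ: parse_args(), the %KNOWN_ROOT table, and the error messages are assumptions. A real script would go one step further and map the root name to the matching Win32::Registry object.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Root names this sketch accepts (an assumed list for illustration).
my %KNOWN_ROOT = map { $_ => 1 } qw(
    HKEY_LOCAL_MACHINE HKEY_CLASSES_ROOT HKEY_CURRENT_USER HKEY_USERS
);

# parse_args() is a hypothetical helper that returns a hash reference
# holding the root name and the list of search strings.
sub parse_args {
    my @args = @_;

    # Default root, used when the user doesn't pass -r.
    my %config = ( root => 'HKEY_LOCAL_MACHINE' );

    GetOptionsFromArray(\@args, 'r=s' => \$config{root})
        or die "Bad options\n";

    $config{root} = uc $config{root};
    die "Unknown root: $config{root}\n" unless $KNOWN_ROOT{ $config{root} };

    # Whatever remains after option parsing is the list of search strings.
    die "No search strings given\n" unless @args;
    $config{targets} = [@args];
    return \%config;
}

my $config = parse_args(qw(-r HKEY_CLASSES_ROOT wmplayer wmserver mplayer2));
print "Root: $config->{root}\n";
print "Targets: @{ $config->{targets} }\n";
```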

As you look over the script, you'll notice that the script typically prints to the STDERR file handle. The exception is when the script finds a match at callout G. Printing to STDERR is useful because redirecting output that's printed to STDOUT doesn't affect output printed to STDERR. Therefore, you can redirect the output of the script to a file, but the information printed to STDERR will still appear on the screen. For example, if you enter the command

perl RegistryFind.pl wmplayer  > c:\temp\output.txt

RegistryFind.pl still displays on the screen the information about the search process that it prints to STDERR (e.g., the current number of keys searched, any errors, the search results). However, the script writes the data printed to STDOUT into the specified file. Redirecting the output is helpful when you want to later analyze the resulting search matches.
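
The split between the two file handles can be sketched as follows. The subroutine names, the output format, and the sample key and value are made up for illustration; the point is only that progress goes to STDERR and matches go to STDOUT, so a redirect with > captures the matches alone.

```perl
use strict;
use warnings;

# Both subroutines are hypothetical. Each takes the file handle to write to,
# which keeps progress messages and matches on separate streams.
sub report_progress {
    my ($fh, $count) = @_;
    print {$fh} "Keys searched: $count\n";
}

sub report_match {
    my ($fh, $path, $name, $data) = @_;
    print {$fh} "$path\\$name = $data\n";
}

# Progress stays on the screen even when STDOUT is redirected to a file.
report_progress(\*STDERR, 1);

# The key, value name, and data below are made-up sample values.
report_match(\*STDOUT, 'SOFTWARE\\Example', 'Path', 'C:\\wmplayer.exe');
```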

RegistryFind.pl is simple but complete. It's a good, reliable registry search tool that you can easily modify and expand. This tool can certainly help you track down problems such as unused keys that you should remove and misconfigured keys that you should update. I've used a similar tool for several years now. It's one of the most useful tools in my administrator's arsenal.
