Dictionaries Trump Arrays

Replace VBScript dynamic arrays with faster, easier-to-code WSH dictionary objects


One of VBScript's quirkiest areas is resizable arrays (commonly known as dynamic arrays). The method for working with these arrays is not intuitive and thus hard to master and remember. I'm going to demonstrate the process for creating and resizing a dynamic array to contain IP addresses associated with a computer, then discuss how the Microsoft Scripting Runtime Library's Dictionary object (aka dictionary) is an improvement over VBScript's native dynamic arrays.

Collecting Computer IP Addresses
Windows Management Instrumentation's (WMI's) Win32_NetworkAdapterConfiguration class exposes IP address information on a computer, but this information has a complex structure. Computers usually have more than one instance of the class because everything from COM ports to VPN connections to VMware's and Microsoft Virtual PC's virtual networks has a virtual adapter. Furthermore, an adapter can theoretically have multiple IP addresses, so the IPAddress property for each class instance is an array, even if it has only one address.

If we want to assemble all the addresses for a computer, we have to put together the code for it ourselves. To help separate the issues of WMI from those of VBScript array manipulation, we'll start by looking at just echoing addresses obtained through WMI, using the code in Listing 1. Because this script echoes data by using the console output stream, it must be run with cscript.exe as the host application.

The WMI query is at callout A in Listing 1; it filters the returned data down to just the IPAddress arrays. Next, the code deals with a problem that can occur for any adapter without valid TCP/IP configuration information: IPAddress might contain null data, which causes errors if the script tries to treat the null data as an array. To handle such a situation, the code at callout B in Listing 1 just skips over these instances.

Callout C is the heart of the Listing 1 script. This code extracts all the IP addresses and echoes them out one at a time. Scripts that collect data for an array from complex original information generally follow a similar process. The script drills down through the data until it finds valid information, then handles each element separately. Although there are alternative ways to perform similar operations, handling one item at a time is generally both the most efficient and the easiest to code and understand.
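Listing 1 itself isn't reproduced here, so the following is only a minimal sketch of the three steps just described, not the original listing. It assumes the local computer (the "." in the WMI path) and the root\cimv2 namespace; the variable names are placeholders. Like Listing 1, it writes to the console output stream and so needs cscript.exe.

```vbscript
Option Explicit
Dim Wmi, Adapters, Adapter, Ip

' Connect to WMI on the local computer ("." is an assumption).
Set Wmi = GetObject("winmgmts:\\.\root\cimv2")

' A: ask only for the IPAddress property of each adapter configuration.
Set Adapters = Wmi.ExecQuery( _
    "SELECT IPAddress FROM Win32_NetworkAdapterConfiguration")

For Each Adapter In Adapters
    ' B: skip adapters whose IPAddress is Null (no TCP/IP configuration).
    If Not IsNull(Adapter.IPAddress) Then
        ' C: IPAddress is an array, so write its entries one at a time.
        For Each Ip In Adapter.IPAddress
            WScript.StdOut.WriteLine Ip
        Next
    End If
Next
```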

This script also shows us one way to avoid dealing with dynamic arrays. If we just want to generate a list of information, a command shell script can echo each item for us so that we don't have to worry about arrays.

Managing a Dynamic Array
The Listing 2 script finds IP addresses just like the Listing 1 script does, but instead of echoing them, the second script collects them in the dynamic array named Ips. The code at callout A in Listing 2 declares the Ips() array. The empty parentheses make this a dynamic array that we can freely resize.

The statement at callout B in Listing 2, which redimensions the array with an upper bound of -1, will likely look peculiar to scripters. VBScript array indices always start with 0 and go up from there. If we use the statement

ReDim Ips(0) 

we have an array with exactly one element. Using -1 means that we have no elements in the array and must use ReDim again to add anything.
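A short sketch makes the difference between the two upper bounds visible. It uses the Ips name from the text and nothing beyond the built-in UBound function; run it with cscript.exe.

```vbscript
Option Explicit
Dim Ips()                      ' empty parentheses: a dynamic array, no size yet

ReDim Ips(-1)                  ' upper bound -1: an array with zero elements
WScript.Echo UBound(Ips)       ' echoes -1
WScript.Echo UBound(Ips) + 1   ' echoes 0, the number of elements

ReDim Ips(0)                   ' upper bound 0: exactly one element, Ips(0)
WScript.Echo UBound(Ips)       ' echoes 0
```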

This is precisely why VBScript supports -1 as an upper bound. An empty array is still an array, but extending the array is easier if we add an element to the array exactly when we need it. We see this if we look at what happens at callout C in Listing 2 as we collect IP addresses.

We don't have any place to put the first IP address we find because the array has no elements. Obviously, we need to add one element to store the address. To do this correctly, we can use the UBound function every time we need room. UBound(Ips) gives us the index of the last element, which is -1 on the first pass, and we just add 1 to get the new upper bound we need for the array. On the first pass, the code

NewBound = UBound(Ips) + 1 

works out to

NewBound = -1 + 1 

or 0. This means the line at callout D in Listing 2 reduces to

ReDim Preserve Ips(0) 

which extends the array by one element. The Preserve keyword isn't important on the first pass. But when we already have elements in an array—as we will on later passes—Preserve tells VBScript to preserve existing array data. If you don't use the Preserve keyword, VBScript clears out every entry in the array when it resizes the array.

Finally, because callout E in Listing 2 on this pass reduces to

Ips(0) = Ip 

we insert the new IP address into the array element we just added.
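Put together, the grow-then-store pattern can be sketched as follows. This is not Listing 2: to keep the array handling separate from WMI, it loops over a hypothetical list of sample addresses instead of querying the adapters.

```vbscript
Option Explicit
Dim Ips(), Ip, NewBound

ReDim Ips(-1)                       ' start with an empty array

' Hypothetical sample data standing in for the addresses WMI returns.
For Each Ip In Array("192.168.1.10", "10.0.0.5", "172.16.0.1")
    NewBound = UBound(Ips) + 1      ' -1 + 1 = 0 on the first pass
    ReDim Preserve Ips(NewBound)    ' grow by one element, keep existing data
    Ips(NewBound) = Ip              ' store the address in the new element
Next

WScript.Echo Join(Ips, ", ")        ' 192.168.1.10, 10.0.0.5, 172.16.0.1
```

Removing the Preserve keyword from this sketch leaves only the last address in the array, with the earlier elements empty.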

If this process is difficult for you to digest, you're not alone. I had used the same process in Microsoft QuickBasic years ago but needed to read an explanation and then run some code before I understood it in VBScript. Even then, it took me a few months before I had the process memorized. It's awkward and not intuitive. Fortunately, a better way is available: You can use a dictionary as an array.

Using a Dictionary as an Array
A dictionary is a type of collection known as an associative array. With it, you can manage arrays of named values. Let's look at how we can use this instead of VBScript's dynamic arrays. There are two distinct approaches, depending on whether the data we collect has duplicate values. For general background information about the dictionary (as well as another approach to using it instead of a VBScript array), see "The Scripting Dictionary Makes It Easy" (August 2003, InstantDoc ID 39312).

Listing 3 demonstrates using a dictionary to collect data that might contain duplicate items. Instead of creating an Ips array as we did in Listing 2, we create a dictionary instance, as shown at callout A in Listing 3.

Every time we find a new IP address, we add a new entry as shown at callout B in Listing 3. The Count property represents the number of elements in the dictionary, so on its first pass, the statement at callout B reduces to

Ips(0) = Ip 

This is much simpler than the ReDim Preserve sequence: one statement both makes room for the entry and stores it. If we want to use a VBScript array function such as Join on the information after we're done, we will need to extract the array of values, but we can do this in one line as well, shown at callout C in Listing 3. The Items method returns a native VBScript array.
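As a rough sketch of this approach (again, not Listing 3 itself), the code below uses the same hypothetical sample addresses in place of the WMI loop and deliberately includes one duplicate.

```vbscript
Option Explicit
Dim Ips, Ip

' A: create a dictionary instead of declaring an array.
Set Ips = CreateObject("Scripting.Dictionary")

' Hypothetical sample data; the first address appears twice.
For Each Ip In Array("192.168.1.10", "10.0.0.5", "192.168.1.10")
    ' B: Count is the next unused numeric key (0 on the first pass).
    Ips(Ips.Count) = Ip
Next

' C: Items returns the stored values as an ordinary VBScript array.
WScript.Echo Join(Ips.Items, ", ")  ' 192.168.1.10, 10.0.0.5, 192.168.1.10
```

Because every entry gets its own numeric key, the duplicate address is kept.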

The code in Listing 4 shows an even simpler approach. The keys of a dictionary must be unique (we can't have two keys named 0, for example), but if we don't need to preserve duplicate values, we can store the values in the keys. At callout A in Listing 4, we add each IP address as a key with an empty string as the value. If the data we're collecting has duplicate values, assigning to a key that already exists silently overwrites its value instead of adding a second entry. After we're done collecting data, we can get a real array of IP addresses by calling Ips.Keys, as shown at callout B in Listing 4.
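A minimal sketch of the keys-as-values approach, using the same hypothetical sample data rather than the contents of Listing 4:

```vbscript
Option Explicit
Dim Ips, Ip

Set Ips = CreateObject("Scripting.Dictionary")

' Hypothetical sample data; the first address appears twice.
For Each Ip In Array("192.168.1.10", "10.0.0.5", "192.168.1.10")
    ' A: the address is the key; the empty string is just a placeholder.
    Ips(Ip) = ""
Next

' B: Keys returns the unique addresses as an ordinary VBScript array.
WScript.Echo Ips.Count              ' 2
WScript.Echo Join(Ips.Keys, ", ")   ' 192.168.1.10, 10.0.0.5
```

The assignment form matters here. Calling Ips.Add Ip, "" instead raises a runtime error the second time the same address turns up, because Add refuses a key that already exists.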

The Best Solution
Scripters have begun using dictionaries more frequently in the past few years, as "Understanding VBScript: The Dictionary Object—An Alternative to Arrays" (June 2000, InstantDoc ID 8797) points out. However, I still encounter dynamic arrays much more often than dictionaries in VBScript code.

When deciding whether to use a dynamic array or a dictionary, it's important to know that there's nothing sacred about VBScript arrays. You aren't required to use them simply because they're included in the language. Eric Lippert, the Microsoft developer who owned VBScript over most of the Windows Script Host (WSH) development life cycle, regularly recommends using a dictionary instead.

If you're interested in some of the internals of VBScript, he occasionally discusses items such as the history of VBScript development in his "Fabulous Adventures in Coding" blog (http://blogs.msdn.com/ericlippert). A significant point is that elements of VBScript such as its array handling usually are designed for compatibility, not necessarily for performance or ease of use.

The dictionary is almost universally available because it was introduced in WSH 2.0. Dictionaries might even be more compatible than VBScript arrays if your scripts will be used in a tool that includes non-VBScript components. A dictionary can be used and manipulated by any language that understands COM, something that isn't guaranteed for VBScript arrays.

You don't lose anything by using a dictionary instead of a VBScript array. There are useful VBScript functions that work only with arrays, notably Join and Filter, but you can turn a dictionary into an array in one step as demonstrated in Listings 3 and 4.

Ironically, dictionaries also win in the performance department. We expect tools that are internal to VBScript such as dynamic arrays to be significantly faster than external tools such as the dictionary. Creating a dictionary does take much longer than creating a dynamic array: milliseconds instead of microseconds. After that, a very peculiar thing happens. We see similar performance when adding elements to either a dynamic array or a dictionary at first, but as the array grows, it slows down because dynamic arrays are fakes. When we redimension an array containing data that we need to keep, VBScript creates a brand new array and copies each entry to it. As a result, resizing an array from 1000 to 1001 elements takes approximately 100 times as much work as resizing it from 10 to 11 elements. There are ways to work around this dynamic array problem, but they increase code complexity and don't solve the fundamental problem of nonlinear performance.
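If you want to check this on your own machine, a rough timing sketch along the following lines is one way to do it. The element count N and the variable names are arbitrary assumptions, Timer is a coarse clock, and the numbers will vary with hardware and script engine version, so treat the output as an indication rather than a benchmark.

```vbscript
Option Explicit
Const N = 20000                 ' hypothetical element count; adjust to taste
Dim Nums(), Dict, i, Started

' Grow a dynamic array one element at a time.
ReDim Nums(-1)
Started = Timer
For i = 0 To N - 1
    ReDim Preserve Nums(UBound(Nums) + 1)
    Nums(i) = i
Next
WScript.Echo "ReDim Preserve: " & FormatNumber(Timer - Started, 3) & " s"

' Add the same number of entries to a dictionary.
Set Dict = CreateObject("Scripting.Dictionary")
Started = Timer
For i = 0 To N - 1
    Dict(i) = i
Next
WScript.Echo "Dictionary:     " & FormatNumber(Timer - Started, 3) & " s"
```

Running it with several values of N shows how each approach scales as the collection grows.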

Dictionaries are really second-generation arrays. As such, they resolve many of the historic problems of earlier array structures and present arrays as objects rather than as a set of internal language hacks. If you're dissatisfied with VBScript's dynamic arrays, dictionaries are the logical next step.
