Automatically Distribute Workloads by Opening Multiple Batch Processes

Downloads
27331.zip

Network administrators who are responsible for large user populations often must perform repetitive tasks on many computers. To perform repetitive tasks, you can take advantage of batch files that use a For command to loop through a list of remote machines. For example, you might use the For command to ping 5000 machines to determine whether they're powered on or off.

Because the batch file must handle each remote machine individually, a bottleneck typically occurs and the batch file takes a long time to complete. If you perform such repetitive tasks often, a better solution is to simultaneously run the batch file on many machines. Such a solution isn't new. Toby Everett used simultaneous processing in a Perl script, which he discussed in "An Approach to Parallelization, Part 1: How to Use the Process Farm," September 1999, http://www.winscriptingsolutions.com, InstantDoc ID 6123, and "An Approach to Parallelization, Part 2: The Internals of the Process Farm System," October 1999, http://www.winscriptingsolutions.com, InstantDoc ID 6190. And Jeff Price's VBScript file ThreadForker.vbs (http://cwashington.netreach.net/depo/view.asp?index=614&scripttype=vbscript) uses simultaneous processing. To use these scripts, however, you must know either Perl or VBScript. We've come up with a utility called Forker that performs simultaneous processing. Although the Forker utility uses a VBScript file, you need to know only the Windows shell scripting language to use the utility.

With the Forker utility, you can simulate and insert asynchronous behavior into a typically synchronous process, thereby greatly reducing the time required to complete repetitive tasks on large groups of remote machines. For example, when we ran a batch file that pinged 5000 machines, the time to task completion was 18 minutes. When we used the Forker utility to run the batch file, the time to completion was only 8 minutes. You can expect similar time savings for your batch files.

Dbfw.exe contains all the Forker utility's source files, including the two main files: DistributedWorkload.htm (the utility's UI) and SplitXRunBatch.vbs (the utility's primary script). You can find dbfw.exe in the Code Library on the Windows Scripting Solutions Web site (http://www.winscriptingsolutions.com). Dbfw.exe is a self-extracting file that extracts the source files to C:\temp. You can extract the source files to another location, but the path can't contain spaces because of the nature of several of the string concatenations in the files. All the files must be in the same directory.

The Forker utility is easy to use. After you've prepared your batch file and its accompanying input file (which contains the list of machines), you open the utility's UI. In the UI, you enter the batch file's pathname in the Batch File text box and the input file's pathname in the Input List File text box, then click Run Utility.

At this point, the UI checks that the batch file and input file you specified exist. If they don't exist, a message box asks you to enter the correct pathnames. If they exist, the UI concatenates several strings to set up the paths as parameters for SplitXRunBatch.vbs. (If you want to see in real time the paths that the UI is setting up, you can uncomment several MsgBox commands in DistributedWorkload.htm.)

SplitXRunBatch.vbs retrieves the parameters from the UI and opens the specified input file. The script splits the list of machines into X groups, where X is the optimal number of groups based on the total number of machines. The script then creates X batch processes, each of which runs an instance of the batch file. For example, if you have 5000 computers in the input list, the script splits them into 50 groups of 100 computers. The script then creates 50 batch processes, each of which runs the batch file on 100 computers. SplitXRunBatch.vbs goes into a sleep mode until all the batch processes are complete, at which point the script places all the output into a centralized file.

Now that you know how the utility works, let's look at the batch file and the input file. Then, you can learn how the script works.

The Batch and Input Files
We designed the Forker utility to work with batch files that use the For command to loop through a series of machines, whose names you supply in an input text file. The batch file needs to include a For command and an Echo command. The For command specifies the tasks to perform and sends the output to an output file. The Echo command redirects its output to a file that doesn't yet exist, which creates that file as a placeholder. SplitXRunBatch.vbs uses the placeholder to determine when the batch process has finished.

Listing 1 contains a sample batch file. Notice that it contains no code other than the For and Echo commands; the batch file can be that simple. To adapt this For loop, simply replace the embedded Ping -a -n 1 command with your command. All the other elements in the For command need to stay the same. The Echo command's elements and position also need to stay the same.
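As a rough illustration of the pattern just described (loop over the machines in the input file, write the results to an output file, then create a placeholder file when finished), here's a minimal Python sketch. It isn't the batch file from Listing 1. The function name run_worker and the check callable are hypothetical stand-ins for the For loop and the embedded Ping command.

```python
from pathlib import Path
from typing import Callable


def run_worker(input_file: str, output_file: str, done_file: str,
               check: Callable[[str], str]) -> None:
    """Mimic one batch process; the three paths play the roles of %1, %2, and %3."""
    lines = Path(input_file).read_text().splitlines()
    machines = [line.strip() for line in lines if line.strip()]
    with open(output_file, "a") as out:
        for machine in machines:
            # 'check' is a hypothetical stand-in for the embedded command (e.g., a ping).
            out.write(f"{machine}: {check(machine)}\n")
    # Creating the placeholder last signals that this worker has finished.
    Path(done_file).write_text("done\n")
```

Writing the placeholder only after the loop ends is the important detail: the placeholder's existence then means that the output file is complete.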

Next, you need to create an input file that contains the machines on which you want to perform the tasks. Enter one IP address per line, and save the file as line-delimited ASCII text. (If you'd rather use machine names instead of IP addresses, you can remove the -a switch in the Ping -a -n 1 command that's embedded in the For loop.) If you accepted the default location of C:\temp when you extracted the files from dbfw.exe, the C:\temp\dbfw\ping folder contains a sample input file called list.txt.

In list.txt, we wrote the same entry — 127.0.0.1 — on each line, which isn't representative of an actual input file. In your input file, each line would contain a different IP address. In the Ping command, 127.0.0.1 is the loopback address used for testing purposes. (For more information about the loopback address, see Win2K's online Help documentation for the Ping command.)

How the Script Works
To show you how SplitXRunBatch.vbs works, we've broken the script into 10 sections. Here's an overview of what each section does.

Section 1. Section 1 sets the stage by declaring and initializing the script's variables and constants. Web Table 1 (http://www.winscriptingsolutions.com, InstantDoc ID 27331) describes the most important variables in the script. Section 1 also retrieves the parameters that the UI prepared.

Section 2. Section 2 contains error-checking code that determines whether the correct number of parameters has been passed in. When you use the UI to provide the batch file's and input file's pathnames, the number of parameters will always be correct because the UI checks for the existence of these files. However, you can launch SplitXRunBatch.vbs from the command line if you enter the necessary parameters. In this situation, the error-checking code in Section 2 goes to work.
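The kind of check that Section 2 performs can be sketched in Python as follows. This is an illustration, not the VBScript code. It assumes exactly two parameters (the batch file and the input file), and the function name check_parameters is hypothetical. The file-existence test mirrors what the UI does and might not be part of Section 2 itself.

```python
import os
from typing import List, Optional


def check_parameters(args: List[str]) -> Optional[str]:
    """Return an error message, or None if the two expected paths look usable."""
    # Assumption: the script expects exactly two parameters.
    if len(args) != 2:
        return f"Expected 2 parameters (batch file, input file), got {len(args)}."
    for path in args:
        # Assumption: mirrors the UI's existence check.
        if not os.path.isfile(path):
            return f"File not found: {path}"
    return None
```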

Section 3 and Section 10. One crucial element of the Forker utility is that each batch process reads a separate input file, writes to a separate output file, and creates a separate file to signal its completion. To house the batch processes' files, the script creates a directory. For example, if you enter C:\temp\dbfw\ping\pingem.bat in the UI's Batch File text box, the script sets up the directory structure, which will contain the following folders:

C:\temp\dbfw\ping\splitlists\ — will contain each batch process's input file
C:\temp\dbfw\ping\outputlists\ — will contain each batch process's output file
C:\temp\dbfw\ping\confirmcomplete\ — will contain the files confirming the completion of each batch process
C:\temp\dbfw\ — will contain the centralized data file (i.e., completeoutput.txt)

Section 3 and Section 10 set up the directory structure. Section 3 begins by creating an instance of the Scripting Runtime Library's FileSystemObject object, then creates the SplitLists folder by calling the BuildSubDirStructure subroutine and using the strOutput variable as the subroutine's parameter. Section 10 contains the BuildSubDirStructure subroutine. Section 3 then creates the OutputLists folder by calling the subroutine and using the strKnown variable as the parameter. Finally, Section 3 creates the ConfirmComplete folder by calling the subroutine with the strConfirmation variable.
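For readers who want to see the idea in code, the following Python sketch creates the same three folders next to the batch file. It's an assumption-laden analogue of the BuildSubDirStructure subroutine, not a translation of it, and the function name build_sub_dirs is hypothetical.

```python
import os
from typing import Dict


def build_sub_dirs(batch_file: str) -> Dict[str, str]:
    """Create the three working folders beside the batch file and return their paths."""
    root = os.path.dirname(os.path.abspath(batch_file))
    dirs = {}
    for name in ("splitlists", "outputlists", "confirmcomplete"):
        path = os.path.join(root, name)
        # exist_ok=True makes a second run harmless if the folders already exist.
        os.makedirs(path, exist_ok=True)
        dirs[name] = path
    return dirs
```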

Section 4 and Section 5. Section 4 opens the input file and reads the machine names into the arrList array. Section 4 also keeps track of the number of machines in the array and stores that number in the intCounter variable. Section 5 then runs an algorithm against this count to determine the best way to split the machines into groups.

Understanding how this algorithm works is important. Suppose you want to split 1003 machines into groups. If you could choose between 25 and 60 groups, which number is best? Basically, we assume that the highest number of groups (i.e., the divisor) with the smallest number of remaining machines (i.e., the modulo, or mod, value) is the best choice. In this example, the best choice is 59 groups, which means that 59 batch processes will concurrently run a batch file against 17 machines, with no remaining machines (1003 / 59 = 17).
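Here's a minimal Python sketch of one way to read that rule: pick the group count with the smallest remainder and, when several counts tie, prefer the largest. The tie-breaking order is an assumption, although it agrees with the examples in this article. The function name choose_group_count is hypothetical, and the sketch ignores lists that are shorter than the lower limit.

```python
def choose_group_count(total: int, min_split: int = 25, max_split: int = 60) -> int:
    """Return the group count in [min_split, max_split] with the smallest remainder."""
    best = min_split
    best_remainder = total % min_split
    for groups in range(min_split, max_split + 1):
        remainder = total % groups
        # "<=" lets a larger group count win when remainders tie.
        if remainder <= best_remainder:
            best, best_remainder = groups, remainder
    return best


print(choose_group_count(1003))  # 59 groups of 17, nothing left over
print(choose_group_count(5000))  # 50 groups of 100
```

Passing different limits (for example, choose_group_count(1003, 5, 25)) shows how a narrower range changes the result.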

Why did we choose a number between 25 and 60? We ran many tests from an 800MHz Pentium III machine with 256MB of RAM running Windows 2000 in a 10Mbps LAN. We used varying speeds across the WAN to connect to multiple remote machines. We found that opening fewer than 25 processes wasn't conducive to performance improvement because it didn't provide a significant decrease in the time the script takes to run. We also found that using more than 60 processes adversely affected CPU performance. Opening 60 processes affected the local CPU performance, but the effect wasn't so adverse that the administrator couldn't continue to work. So, we set a lower limit of 25 and an upper limit of 60 and assigned these limits to the intMinSplit and intMaxSplit variables, respectively.

You can easily modify the lower and upper limits to fit your environment. For example, if you're using a processor-heavy command or utility (e.g., the Pulist command to display remote processes), you might not want to open 60 command windows. Instead, you can modify the intMinSplit and intMaxSplit variables to represent a lower range, such as setting intMinSplit to 5 and intMaxSplit to 25. Other than having fewer batch processes open, the script's functionality won't change.

Section 6. Section 6 creates the input files for each batch process. These files contain the IP addresses of the machines on which the batch processes will run. For example, if you have 1003 machines, Section 6 would write the first 17 machines in the array to %strRootPath%\splitlists\output1.txt, the second 17 machines to %strRootPath%\splitlists\output2.txt, and so on until it finishes writing the last 17 machines to %strRootPath%\splitlists\output59.txt. Note that although the files are input files for the batch processes, the filenames use the word output because from the perspective of SplitXRunBatch.vbs, the files contain output.

If a mod value exists, Section 6 appends the remaining machines to the final text file. For example, if you had 1005 machines, Section 6 would append the two remaining machines to %strRootPath%\splitlists\output59.txt, which would then contain 19 machines.
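The splitting step, including the handling of leftover machines, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code in Section 6: the function name write_split_lists is hypothetical, and the sketch assumes that the list contains at least as many machines as there are groups.

```python
import os
from typing import List


def write_split_lists(machines: List[str], groups: int, split_dir: str) -> List[str]:
    """Write output1.txt through output<groups>.txt and return their paths."""
    size = len(machines) // groups
    paths = []
    for i in range(groups):
        start = i * size
        # The last file also takes any leftover (mod) machines.
        end = len(machines) if i == groups - 1 else start + size
        path = os.path.join(split_dir, f"output{i + 1}.txt")
        with open(path, "w") as f:
            f.write("\n".join(machines[start:end]) + "\n")
        paths.append(path)
    return paths
```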

Section 7. Section 7 starts the batch processes and passes in the values that the batch file needs. The values are

the numbered input file from which the batch file reads data (\splitlists\outputX.txt, where X is a number that increments by one)
the numbered output file to which the batch file writes data (\outputlists\knownX.txt, where X is a number that increments by one)
the numbered output file that the batch file creates on completion (\confirmcomplete\doneX.txt, where X is a number that increments by one)

Listing 2 shows the code in Section 7. Although this code is easy to follow, the overall interaction between SplitXRunBatch.vbs and each batch file isn't immediately obvious. So, let's examine how Section 7 in Listing 2 and the batch file in Listing 1 interact with each other.

Suppose you enter C:\temp\dbfw\ping\pingem.bat in the UI's Batch File text box. The code at callout A in Listing 2 creates the following input string for the first batch process:

C:\temp\dbfw\ping\pingem.bat
C:\temp\dbfw\ping\splitlists\output1.txt
C:\temp\dbfw\ping\outputlists\known1.txt
C:\temp\dbfw\ping\confirmcomplete\done1.txt

In the actual input, these four paths would all be on one line, with a space separating each path. We just put the paths on separate lines for easy reading. Section 7 passes this string to the batch process.

The first path in the string launches PingEm.bat, which Listing 1 shows. In Listing 1, notice the %1, %2, and %3 variables. These argument-holding variables bring input into a script at runtime. In this case, the %1 variable brings in the path C:\temp\dbfw\ping\splitlists\output1.txt, the %2 variable brings in C:\temp\dbfw\ping\outputlists\known1.txt, and the %3 variable brings in C:\temp\dbfw\ping\confirmcomplete\done1.txt.

If you have 1003 machines, the batch file runs 59 times. Each time the batch file runs, the text file number increments by one, so the filenames for the final files are output59.txt, known59.txt, and done59.txt.
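To make the numbering concrete, the following Python sketch builds the four-part command for every batch process. It isn't the code from Listing 2. The function name build_commands is hypothetical, and the sketch uses Python's ntpath module so that the Windows-style paths come out the same on any platform.

```python
import ntpath
from typing import List


def build_commands(batch_file: str, groups: int) -> List[List[str]]:
    """Return one [batch file, %1, %2, %3] list per batch process."""
    root = ntpath.dirname(batch_file)
    commands = []
    for n in range(1, groups + 1):
        commands.append([
            batch_file,
            ntpath.join(root, "splitlists", f"output{n}.txt"),     # becomes %1
            ntpath.join(root, "outputlists", f"known{n}.txt"),     # becomes %2
            ntpath.join(root, "confirmcomplete", f"done{n}.txt"),  # becomes %3
        ])
    return commands
```

In a Python version of the utility, each list could be handed to subprocess.Popen, which starts a process without waiting for it to finish.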

Section 8. Section 8 counts the number of completion files (i.e., doneX.txt). When the number of completion files matches the number of batch processes, the operation is complete and the script moves on to Section 9. If the numbers don't match, Section 8 waits 2 seconds, then again counts the number of completion files.
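The polling loop can be sketched in Python as follows. This is an illustration, not the VBScript code in Section 8. The function name wait_for_completion is hypothetical, and the sketch assumes that completion files are named doneX.txt, as described earlier.

```python
import os
import time


def wait_for_completion(confirm_dir: str, expected: int,
                        poll_seconds: float = 2.0) -> None:
    """Block until confirm_dir holds at least 'expected' doneX.txt files."""
    while True:
        done = [name for name in os.listdir(confirm_dir)
                if name.startswith("done") and name.endswith(".txt")]
        if len(done) >= expected:
            return
        # Not finished yet: pause, then count again.
        time.sleep(poll_seconds)
```

A loop like this one never returns if a batch process fails before creating its completion file, so a production version would probably want a timeout.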

Section 9. Section 9 compiles all the output from the batch files (i.e., knownX.txt) into one centralized file (completeoutput.txt). This code is straightforward — Section 9 simply writes each output file to completeoutput.txt.
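In Python, the same merge step might look like the following sketch. The function name merge_outputs is hypothetical, and skipping a missing output file is an assumption of this sketch rather than documented behavior of Section 9.

```python
import os


def merge_outputs(output_dir: str, groups: int, merged_path: str) -> None:
    """Concatenate known1.txt through known<groups>.txt into one file, in order."""
    with open(merged_path, "w") as merged:
        for n in range(1, groups + 1):
            part = os.path.join(output_dir, f"known{n}.txt")
            # Assumption: skip a part that a batch process never produced.
            if not os.path.exists(part):
                continue
            with open(part) as f:
                merged.write(f.read())
```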

Give It a Test Run
Network administrators typically need fast, effective, simple solutions. If you often perform repetitive tasks on many computers and you want to save valuable time, try the Forker utility. We've tested the Forker utility on Windows XP Professional Edition, Win2K, and Windows NT 4.0 machines running Microsoft Internet Explorer (IE) 5.5 and later. If you've set your IE configuration to a high security level, you might need to alter the security settings. Make sure you first test the utility and the batch file in a test environment. When paired with an ill-formed batch file, the utility will also speed the rate at which damage can occur.
