Scriptwriting Methodology, Part 2: Advanced Data Manipulation and Formatting


Editor's note: This article is the conclusion of a two-part series about scriptwriting methodology. The series started in the March 1999 issue. Refer to the first installment for definitions and background information.

Suppose your IS director has just dramatically expanded the scope of your original script. What started out as a script to create a simple text file containing data about disk-space usage and file structures needs to become a script that creates an HTML table, complete with an IS logo and a date-and-time stamp, for the corporate intranet. Because this expanded report will be on the intranet, you also need to create a menu page from which users can access all the reports.

How do you revise your DirSizeReport.bat script to meet these new requirements? You again follow the three stages of script development—data capture, data manipulation, and data presentation—to create the revised HTMLDirSizeReport.bat script. (Because this script is long, you can find HTMLDirSizeReport.bat on the Win32 Scripting Journal Web site.)

Data Capture
Because of the expanded requirements, you now need to capture date and time information. You use the DATE command to capture the system date and the TIME command to capture the system time. You need to use the /T switch with both commands. This switch tells the command not to include a prompt that asks you to specify a new date or time. For example, if you type DATE, the command displays not only the current date but also a prompt for a new date. If you type DATE /T, the command displays only the current date, without the new-date prompt.

From the DATE command, you obtain output such as

Thu 02/18/1999

An example of the output from the TIME command on Windows NT 4.0 is

5:22p

If you're using Windows 2000 (Win2K), the TIME command displays 24-hour (military) time (e.g., 14:14). If you're using Windows NT 4.0, the TIME command displays 12-hour time with an a or p suffix (e.g., 2:14p).

Data Manipulation
In the original DirSizeReport.bat script, you used two modules: the dynamic recursion module (for manipulation down the folder hierarchy) and the dynamic repetition module (for manipulation across top-level folders). For the revised HTMLDirSizeReport.bat script, you again use both modules but you need to modify the dynamic recursion module. Before I show you the modification, I'll show you how to create the date-and-time module.

Date-and-time module. Although you can redirect the date and time information to a text file to use later in the data presentation stage, a more efficient approach is to capture the date and time information inside a FOR command and set this information to an environment variable. To capture the DATE information, you use the code

FOR /F "tokens=2,3,4 delims=/ " %%I in ('DATE /T') DO SET date1=%%I/%%J/%%K

The FOR command's file-parsing (/F) switch lets you parse data. In this case, you parse the output from the DATE command. Because the data you want to parse is a command's output, you use single quotes to enclose the data source [i.e., ('DATE /T')]. If the data source is an environment variable, you use double quotes [e.g., ("%date1%")]. If the data source is a file, you use no quotes around the filename [e.g., (date.txt)].

The next part of the code ("tokens=2,3,4 delims=/ " %%I) specifies that you want to capture tokens 2, 3, and 4 from the DATE command output. Token 1 contains the day of the week (e.g., Thu). Because you don't need to capture the day of the week, you can leave out token 1. Tokens 2, 3, and 4 represent the month (e.g., 02), day (e.g., 18), and year (e.g., 1999), respectively. As the example output from the DATE command shows, a space separates the day of the week from the date and slashes separate the parts of the date, so you specify two delimiters: a slash (/) followed by a space. The space must be the last character in the delims list. The %%I iterator variable serves as a placeholder for the captured information.

Because you started with the %%I variable, you have three parsed tokens available for use (%%I, %%J, and %%K). You want to use the information that these variables capture, so you set them to the date1 environment variable with the DO SET date1=%%I/%%J/%%K command. Notice the addition of slashes after the %%I and %%J variables. You add these slashes to replace the slashes in the date that the FOR command removes when it parses the output.

To capture and set the time information, you use the code

FOR /F "tokens=1,2 delims=: " %%I in ('TIME /T') DO SET time1=%%I:%%Jm

The FOR command's /F switch lets you parse the output from the command. Token 1 contains the hour (e.g., 5), and token 2 contains the minutes followed by either an a to represent a.m. (e.g., 22a) or a p to represent p.m. (e.g., 22p). As the example output from the TIME command shows, you need to specify that the delimiter is a colon (:) and follow it with a space. The %%I iterator variable serves as a placeholder for the captured tokens.

In the DO SET time1=%%I:%%Jm command, you set the information that the %%I and %%J variables capture to the time1 environment variable. You add a colon after the %%I variable to replace the one that the FOR command removes when it parses the output. Because the TIME command uses only one letter to represent a.m. and p.m., you add the letter m after the %%J variable. However, if you're using Win2K, you don't include the m, because the TIME command displays military time.

Together, the two FOR commands create a date-and-time module that you can use to name files or place date-and-time stamps inside reports or log files. You'll probably use this key module in many of your scripts.
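As a minimal sketch, the two FOR commands can sit together in a small batch file like the one that follows. The sketch assumes the NT 4.0 output formats shown earlier (Thu 02/18/1999 and 5:22p); other Windows versions and regional settings produce different output. The stamp1 and logfile variables and the log filename are hypothetical additions that show one way to build a filename-safe stamp, because slashes and colons aren't legal in filenames.

```batch
@ECHO OFF
REM Date-and-time module sketch. Assumes DATE /T prints "Thu 02/18/1999"
REM and TIME /T prints "5:22p"; other systems format these differently.
FOR /F "tokens=2,3,4 delims=/ " %%I IN ('DATE /T') DO SET date1=%%I/%%J/%%K
FOR /F "tokens=1,2 delims=: " %%I IN ('TIME /T') DO SET time1=%%I:%%Jm

REM Use the stamp inside a report or log line.
ECHO Report generated %date1% at %time1%

REM Hypothetical filename-safe variant: year-month-day joined with hyphens.
FOR /F "tokens=2,3,4 delims=/ " %%I IN ('DATE /T') DO SET stamp1=%%K-%%I-%%J
SET logfile=%TEMP%\DirSize_%stamp1%.log
ECHO A dated log file could be named %logfile%
```

Keeping date1 for display and a separate hyphenated stamp for filenames lets the same module serve both purposes.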

Revised dynamic recursion module. In the original DirSizeReport.bat script, you used the DIRUSE command with the /S switch to provide dynamic recursion. The original code was

DIRUSE /S /M /, "%Target_Directory%" | FINDSTR /V "\<[0].[0][0]" | FINDSTR /V "SUBTOTAL">>"%TEMP%\%Target_Directory%output.txt"

For the revised HTMLDirSizeReport.bat script, you need to filter out additional lines from the DIRUSE command output. Because you'll insert formal header information in the data presentation stage, you need to remove the column headings "Searching for directories" and "Size (mb) Files Directory" from the output. The new code is

DIRUSE /S /M /, /Q:0 /D "%Target_Directory%" | FINDSTR /V "SUBTOTAL" | FINDSTR /I /V /C:"Searching for directories" | FINDSTR /I /V /C:"Size (mb) Files Directory">>"%TEMP%\output.txt"

In this code, you use the FINDSTR command to remove the lines that contain the strings "Searching for directories" and "Size (mb) Files Directory". The /I switch makes the search case-insensitive, and the /V switch tells the command to display only those lines that don't match the specified string. The /C: switch prefacing each string ensures that the FINDSTR command reads the string as one unit rather than as separate space-delimited search words.

You need to be careful when you use the FINDSTR command, because incorrect string specification can result in filtering errors. For example, suppose you decide to simplify this filtering operation. You notice that both the column headings you want to filter contain the string "director". So, you replace the FINDSTR /I /V /C:"Searching for directories" and FINDSTR /I /V /C:"Size (mb) Files Directory" commands with the FINDSTR /I /V "director" command. This simplified command will filter out the two column headings. However, it will also filter out any directory names that contain director, such as Directors Work, Freds directory, and Directory123. Thus, you need to specify the string with sufficient granularity to ensure that you remove only those lines you want to filter.

Incorrect switch usage can also result in filtering errors. For example, if you don't preface the "Size (mb) Files Directory" string with the /C: switch, the FINDSTR command filters out any line that contains any of the individual elements of "Size", "(mb)", "Files", or "Directory".
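The following sketch illustrates the difference that the /C: switch makes. The sample file, its name, and its contents are made up for the demonstration, and the spacing of the heading line is simplified to single spaces, so treat the example as an illustration of FINDSTR's behavior rather than as actual DIRUSE output.

```batch
@ECHO OFF
REM Build a small sample file (hypothetical name and contents).
SET sample=%TEMP%\findstr_sample.txt
ECHO Size (mb) Files Directory>"%sample%"
ECHO 12.50 40 C:\Data\Files Archive>>"%sample%"
ECHO 3.25 7 C:\Data\Reports>>"%sample%"

ECHO --- With /C: the string is one unit, so only the heading line is removed
FINDSTR /I /V /C:"Size (mb) Files Directory" "%sample%"

ECHO --- Without /C: each word is a separate search string
FINDSTR /I /V "Size (mb) Files Directory" "%sample%"

REM Remove the sample file.
DEL "%sample%"
```

In the second FINDSTR command, the line for the Files Archive folder disappears along with the heading, because the folder name contains the word Files.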

In addition to filtering out the column headings, the revised dynamic recursion module has one other important change: the use of the DIRUSE command's /Q: and /D switches. The /Q: switch tells the DIRUSE command to mark those directories that exceed the specified size (in this case, zero bytes). The /D switch tells the DIRUSE command to display only those directories that exceed the specified size.

You redirect the output of the revised dynamic recursion module to the output.txt file in the %TEMP% directory. Because your script's expanded scope requires that you insert the three columns of output in output.txt into a table, you need to use the FOR command to arrange the data in individual table cells. To accomplish this task, you use the basic code

FOR /F "tokens=2,3,* delims=" %%I in (%TEMP%\output.txt) DO ECHO %%I %%J %%K

In this code, the FOR command parses the data in the output.txt file to capture the second token, the third token, and the remainder of each line. As I mentioned in Part 1, you're better off using "tokens=2,3,*" rather than "tokens=2,3,4", because the asterisk (*) specifies that you want to capture everything that remains on the line after the third token. If you use "tokens=2,3,4" and the folder name contains a space, the FOR command will capture only the first word, because the command interprets the space as a delimiter. (If you use the delims= code with no space or other character after it, the default delimiters, a space and a tab, apply. Later versions of the command shell instead treat an empty delims= as no delimiters at all, so on those systems you leave delims= out to get the defaults.) Once again, you use the %%I iterator as a placeholder for the captured information.

The DO ECHO command displays the data. You'll modify this section of the code in the data presentation stage so that the data appears in a table format.
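To see the difference between "tokens=2,3,4" and "tokens=2,3,*", you can try a sketch like the following. The sample line is made up: a placeholder first field, a size, a file count, and a folder name that contains spaces. The sketch parses a string in double quotes instead of a file and leaves out the delims= option so that the default space and tab delimiters apply.

```batch
@ECHO OFF
REM Hypothetical sample line; the first field stands in for whatever
REM precedes the size in the real output and is skipped by tokens=2,...
SET line=# 12.50 40 C:\Data\Fred Smith Files

ECHO --- tokens=2,3,4 keeps only the first word of the folder name
FOR /F "tokens=2,3,4" %%I IN ("%line%") DO ECHO [%%I] [%%J] [%%K]

ECHO --- tokens=2,3,* keeps the rest of the line in the last variable
FOR /F "tokens=2,3,*" %%I IN ("%line%") DO ECHO [%%I] [%%J] [%%K]
```

The brackets in the ECHO commands simply make the boundaries of each captured token visible.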

Data Presentation
Your data presentation tasks were minimal when you created the original directory size report. However, now you need to post the report on the corporate intranet, which requires that you convert the report to an HTML-formatted file. This conversion requires that you build HTML templates, output and munge (i.e., transform) the HTML code, and incorporate the positioning information into the revised dynamic recursion module.

Build the templates. You use an HTML editor to build the templates for the directory size report and the directory size report menu. Screen 1 contains an example of a directory size report template that I created with Microsoft FrontPage. As Screen 1 shows, you position the various iterators in the HTMLDirSizeReport.bat script in the appropriate locations. Note that you need to reserve space in the top left corner for the IS logo.

Output and munge the HTML code. After you create the HTML templates, you need to output the HTML code. For example, Listing 1 contains an excerpt from the code for the directory size report template in Screen 1.

Using unaltered HTML code in a batch file can cause problems, because you're integrating two types of code. For example, HTML's < and > tag symbols are reserved redirection symbols in batch files. If you execute a script that uses unaltered HTML code, the NT command shell tries to treat these tag symbols as redirection operators, causing the script to fail. Thus, you need to munge the HTML code so that you can use it in a batch file.

Manually munging code is time-consuming, even in a small file. Fortunately, you can automate this process with the HTMLCreator.bat script in Listing 2. This script uses the Microsoft Windows NT Server 4.0 Resource Kit's munge.exe utility, which lets you search for and replace strings in a file.

Specifically, the HTMLCreator.bat script performs three important functions. First, it puts a caret (^) before each tag symbol so that the NT command shell won't consider the tags illegal characters.

Second, the script doubles each percent sign (%). The command shell treats a percent sign as the start of a variable reference, so when it processes a line in a batch file, it consumes single percent signs and reduces each doubled percent sign (%%) to one. By doubling the percent signs, you ensure that the HTML code ends up with the ones it needs (e.g., in a width="9%" attribute).

Third, the script places an ECHO command at the beginning of each line and a redirection at the end of each line. However, you need to specify the filename of the file that is to receive the redirected output.
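The three changes can be sketched by hand as follows. This example isn't output from HTMLCreator.bat or munge.exe. It's a hand-munged illustration in which the report filename and the date1 and time1 values are hypothetical.

```batch
@ECHO OFF
REM Hypothetical output file and stamp values, for illustration only.
SET report=%TEMP%\munge_demo.html
SET date1=02/18/1999
SET time1=5:22pm

REM Each line of HTML gets ECHO in front, a caret before every tag
REM symbol, doubled percent signs, and a redirection at the end.
ECHO ^<table width="100%%"^>>"%report%"
ECHO ^<tr^>^<td width="50%%"^>%date1% %time1%^</td^>^</tr^>>>"%report%"
ECHO ^</table^>>>"%report%"

REM Show the resulting HTML, then remove the demo file.
TYPE "%report%"
DEL "%report%"
```

The first ECHO command uses a single redirection symbol to create the file, and the remaining commands use a double redirection symbol to append to it. The date1 and time1 references show how environment variables fill the blank fields in the template.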

You run the HTMLCreator.bat script against the HTML code. The script munges the HTML template file and redirects the output to the NewReport.bat file. The NewReport.bat output will contain blank fields in those spots where HTMLDirSizeReport.bat script-generated data needs to reside (e.g., spots for the date and time information). You need to specify the appropriate environment variables in those blank fields. You then cut and paste the NewReport.bat code sections into your script. Listing 3 shows an example code section from the HTMLDirSizeReport.bat script.

Incorporate the positioning information into the dynamic recursion module. The HTML code for the directory size report template in Listing 1 provides the positioning information that the dynamic recursion module needs. Specifically, the ECHO command needs the positioning information to put the output from the DIRUSE command into a table format. Callout A in Listing 1 contains the positioning information for the %%I, %%J, and %%K iterators. In this code, the <tr> tag specifies the table row definitions, the <td width> tag represents the table data definitions for one cell, the </td> tag specifies the end of the table data definitions for the cell, and the </tr> tag represents the end of the table row definitions. (If you don't have a wealth of HTML coding expertise, don't be concerned, because your HTML editor writes all the HTML code for you.)

Now that you have the HTML code, you substitute the appropriate positioning information for each iterator and specify where you want to display the command's output. So, the basic code

FOR /F "tokens=2,3,* delims=" %%I in (%TEMP%\output.txt) DO ECHO %%I %%J %%K

becomes

FOR /F "tokens=2,3,* delims=" %%I in (%TEMP%\output.txt) DO ECHO ^<tr^>^<td width^="9%%"^>%%I^<^/td^>^<td width^="10%%"^>%%J^<^/td^>^<td width^="81%%"^>%%K^<^/td^>^<^/tr^>

To display the output in the DirSize_%Target_Directory% HTML file in the %TEMP% directory, you end the ECHO command with a redirection (>>) to that file.

Putting All the Pieces Together
Now that you've prepared the modules and munged the template code, you're ready to compile the various pieces into a script. The contents of the NewReport.bat file can serve as your script's skeleton. You need to determine which code you need to run once (e.g., the date-and-time module) and which code you need to reuse (e.g., the :CLEAN_UP module, which deletes a temporary file after its use). You also need to determine where you want to echo the output and how often you want to refresh the data. When you're putting all the pieces together, keep these tips in mind:

Relocate or bypass comments. Properly commented code is a necessity, yet verbose comments can reduce a script's performance. One way to get around this catch-22 is to move your comments outside the modules. In the HTMLDirSizeReport.bat script, most of the comments are outside the normal flow.

An alternative method is to use a GOTO command to skip over large blocks of comments. For example, the HTMLDirSizeReport.bat script uses the GOTO :NEXT command to bypass the general comments at the beginning of the script.
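A minimal sketch of this technique might look like the following. Only the :NEXT label comes from the HTMLDirSizeReport.bat script; the comment text and the ECHO command are placeholders.

```batch
@ECHO OFF
REM Jump past the comment block so the shell never reads those lines.
GOTO :NEXT

REM ----------------------------------------------------------
REM General comments: purpose, author, revision history,
REM required utilities, and usage notes would go here.
REM ----------------------------------------------------------

:NEXT
ECHO The working part of the script starts here.
```

Because the command shell jumps straight to the label, it doesn't process the comment lines in between.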

Modularize your code. If you organize your script into logical sections of code, you can more easily track down problems and reuse code in the same script or in different scripts. For example, the HTMLDirSizeReport.bat script uses the :CLEAN_UP module to delete the temporary output.txt file after each use. As your scripts increase in size and complexity, breaking the script into small modules will help you keep your sanity.

Control the flow. The advantage of using a script is that you can quickly accomplish repetitive tasks. However, that advantage can turn into a disadvantage if the script quickly accomplishes tasks that you didn't plan for it to do. At some point, most scriptwriters write a script that runs wild. Runaway scripts typically result from improper flow control. The script either runs commands out of order or goes into a loop that wasn't obvious to the scriptwriter.

You can minimize the possibility of improper flow control with commands such as CALL and GOTO. The CALL command invokes another script or procedure, and the GOTO command transfers control to a different location in a script. (If you're uncertain about how to use these commands, see the sidebar "How the CALL and GOTO Commands Work" on the Win32 Scripting Journal Web site.)
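The following sketch shows one way to keep the flow explicit with CALL and GOTO. It assumes that command extensions are enabled (the default on NT 4.0 and later), because the CALL :label and GOTO :EOF forms depend on them. The folder names, the :PROCESS label, and the temporary filename are hypothetical; only the :CLEAN_UP name comes from the HTMLDirSizeReport.bat script.

```batch
@ECHO OFF
REM Hypothetical folder names; a real script would build this list.
FOR %%D IN (Data Reports) DO CALL :PROCESS %%D
ECHO All folders processed.
REM Stop here so the script doesn't fall through into the modules below.
GOTO :EOF

:PROCESS
REM %1 holds the argument that the CALL command passed in.
ECHO Processing %1
CALL :CLEAN_UP
REM GOTO :EOF returns control to the line after the CALL.
GOTO :EOF

:CLEAN_UP
REM Delete the (hypothetical) temporary file if one exists.
IF EXIST "%TEMP%\flow_demo.tmp" DEL "%TEMP%\flow_demo.tmp"
GOTO :EOF
```

The GOTO :EOF command after the main section is what prevents a runaway script: without it, the command shell would continue into the :PROCESS module after the loop finished.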

Your script is now complete and churning out reports for the intranet. After the IS director congratulates you on a job well done, he pauses for a second, then says, "But I'll bet you could make this report even better with a few small changes...."
