Use the GetToken Function to Parse Delimited Lines

Downloads
45817.zip

Text-file processing is a common scripting task. It's so common, in fact, that the Microsoft Scripting Runtime Library's FileSystemObject object includes the TextStream object, which provides properties and methods for processing text files. However, the command interpreter (cmd.exe) has one feature that the Scripting Runtime Library lacks: string token parsing.

Cmd.exe provides token parsing with the For /f command. Token parsing lets you break a line of text into chunks, or tokens, for further processing. For example, suppose you have a text file named Users.txt whose lines follow the format domainname;username;fullname;description. In each line, semicolons (;) separate the tokens. Using the For /f command, you can parse each line and extract the username (the second token) by using the following code in a .cmd script:

For /f "tokens=2 delims=;" %%t in (Users.txt) Do Echo %%t

The GetToken function, which Listing 3 shows, provides a similar capability in VBScript scripts. As the syntax

GetToken(TextToParse, Delimiter, TokenNumber)

shows, GetToken requires three parameters. The TextToParse parameter specifies the line of text to be parsed. You can specify a single line of text, or you can use the TextStream object's ReadLine method to read in a line from a text file. The line of text can contain spaces. If the line contains quotation marks, the function removes them automatically. The Delimiter parameter specifies the delimiter character to use. (The function supports only a single-character delimiter.) The TokenNumber parameter specifies the number of the token to retrieve, counting from 1.

As callout A in Listing 3 shows, GetToken works by first making sure that the text string and the delimiter aren't zero-length and that the token number isn't 0. When these conditions are met, GetToken uses VBScript's Split function to return an array of substrings based on the original string. The delimiter character marks where one substring ends and the next begins.

As callout B shows, GetToken then subtracts 1 from the requested token number because VBScript arrays are zero-based. As long as the resulting index is less than or equal to the array's upper bound (that is, the requested token actually exists), GetToken retrieves the requested token from the array. GetToken uses the Trim function to remove any spaces from the beginning and end of that substring. GetToken also uses the Replace function to remove any quotation marks from the substring because it's common for delimited text files to have quotation marks around strings. The cleaned-up string is then ready for use.
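
The steps described above can be sketched roughly as follows. This is a reconstruction from the description, not the article's Listing 3. The variable names Tokens and Index, the empty-string return value for invalid input or a missing token, and the TokenNumber < 1 check (which also rejects negative numbers) are assumptions.

```vbscript
Option Explicit

' Sketch of a GetToken-style function (not the original Listing 3).
' Assumption: returns "" when the input is invalid or the token doesn't exist.
Function GetToken(TextToParse, Delimiter, TokenNumber)
  Dim Tokens, Index
  GetToken = ""

  ' Make sure the text and delimiter aren't zero-length and that
  ' the token number is at least 1.
  If Len(TextToParse) = 0 Or Len(Delimiter) = 0 Or TokenNumber < 1 Then Exit Function

  ' Split returns a zero-based array of substrings.
  Tokens = Split(TextToParse, Delimiter)

  ' Convert the one-based token number to a zero-based array index.
  Index = TokenNumber - 1

  If Index <= UBound(Tokens) Then
    ' Remove quotation marks (Chr(34)), then trim leading and trailing spaces.
    GetToken = Trim(Replace(Tokens(Index), Chr(34), ""))
  End If
End Function

' Sample line in the domainname;username;fullname;description format.
Dim Sample
Sample = "CORP;jsmith;""John Smith"";Help desk"

WScript.Echo GetToken(Sample, ";", 2)   ' Echoes jsmith
WScript.Echo GetToken(Sample, ";", 3)   ' Echoes John Smith
```

Saved as a .vbs file, the sketch can be run from a command prompt with cscript. Requesting a token that doesn't exist, such as token 5 of the sample line, returns an empty string in this sketch, so a calling script can test the result's length before using it.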

You can use the GetToken function in any script that needs to process delimited text files. For example, the SPCheck.vbs script in the article "Are the OS Service Packs on Your Remote Computers Up-to-Date?" uses GetToken to read in lines from a text file and extract the computer name from each line. Listing 2 shows the code that calls the function, then uses the extracted computer name.
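
As a rough illustration of that pattern (this is not Listing 2 or the SPCheck.vbs code), the following sketch reads a comma-delimited file line by line and extracts the first token from each line. The file name computers-sample.txt, its contents, and the comma delimiter are hypothetical. The script creates the sample file in the temporary folder so that it can run on its own, and it repeats a compact GetToken sketch for the same reason.

```vbscript
Option Explicit

Const ForReading = 1
Const TemporaryFolder = 2

' Compact GetToken sketch, repeated here so that this script is self-contained.
Function GetToken(TextToParse, Delimiter, TokenNumber)
  Dim Tokens
  GetToken = ""
  If Len(TextToParse) = 0 Or Len(Delimiter) = 0 Or TokenNumber < 1 Then Exit Function
  Tokens = Split(TextToParse, Delimiter)
  If TokenNumber - 1 <= UBound(Tokens) Then
    GetToken = Trim(Replace(Tokens(TokenNumber - 1), Chr(34), ""))
  End If
End Function

Dim FSO, SamplePath, Stream, TextLine, ComputerName
Set FSO = CreateObject("Scripting.FileSystemObject")

' Create a hypothetical sample file for the demonstration.
SamplePath = FSO.BuildPath(FSO.GetSpecialFolder(TemporaryFolder).Path, "computers-sample.txt")
Set Stream = FSO.CreateTextFile(SamplePath, True)
Stream.WriteLine """SERVER01"", Main file server"
Stream.WriteLine "WKSTN07, Front desk"
Stream.Close

' Read the file one line at a time and extract the first token.
Set Stream = FSO.OpenTextFile(SamplePath, ForReading)
Do Until Stream.AtEndOfStream
  TextLine = Stream.ReadLine
  ComputerName = GetToken(TextLine, ",", 1)
  ' Skip blank lines and lines without a usable first token.
  If Len(ComputerName) > 0 Then WScript.Echo "Computer name: " & ComputerName
Loop
Stream.Close

' Clean up the sample file.
FSO.DeleteFile SamplePath
```

In a real script, the loop body would use the extracted computer name for its actual work instead of echoing it.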
