Phil Robyn, at Berkeley, pointed out that the batch file I scripted in tip 3530 did not
handle blank records and did not preserve leading spaces in the records. Phil submitted the following
batch:
@echo off
setlocal
if {%1} EQU {} goto syntax
if not exist %1 goto syntax
set infile=%1
if {%2} EQU {} goto syntax
set outfile=%2
type nul > %outfile%
for /f "tokens=1* delims=:" %%a in ( 'type %infile% ^| sort ^| findstr /n /v /c:"CoLoRlEsS gReEn IdEaS"' ) do call :dedup %%a "%%b"
endlocal&goto :EOF
:syntax
@echo **************************************
@echo Syntax: SortDup Input_File Output_File
@echo **************************************
endlocal&goto :EOF
:dedup
set curr_rec=%2
if [%curr_rec%]==[""] set curr_rec=$$$blankline$$$
set curr_rec=%curr_rec:"=%
if not defined prev_rec goto :write
if "%curr_rec%" EQU "%prev_rec%" goto :EOF
:write
if "%curr_rec%" EQU "$$$blankline$$$" (
 echo.>>%outfile%
) else (
 echo>>%outfile% %curr_rec%
)
set prev_rec=%curr_rec%
goto :EOF
Borrowing Phil's findstr idea, I countered with the following amendment:
@echo off
setlocal
if {%1} EQU {} goto syntax
if not exist %1 goto syntax
set file=%1
set file="%file:"=%"
set work=%~pd1\%~nx1.tmp
set work="%work:"=%"
set work=%work:\\=\%
sort %file% /O %work%
del /f /q %file%
for /f "Tokens=1* Delims=:" %%s in ('findstr /n /v /c:"dO nOt FiNd" %work%') do set record=###%%t###&call :output
REM if exist %work% del /q %work%
endlocal
goto :EOF
:syntax
@echo ***************************
@echo Syntax: SortDup Input_File
@echo ***************************
goto :EOF
:output
if not defined prev_rec goto :write
if "%record%" EQU "%prev_rec%" goto :EOF
:write
set prev_rec=%record%
set record=%record:###=%
if "%record%" EQU "" goto :blknul
if "%record%" GTR " " @echo>>%file% %record%&goto :EOF
:blknul
if defined bn_rec goto :EOF
set bn_rec=Y
@echo.>>%file%
NOTE: Neither script gracefully handles records that contain batch control characters, such as &, |, and >. Nor do they address multiple blank records of differing length, or null records. I elected to handle multiple blank records and null records by outputting a single blank record. If you don't want to output any blank records, remove the last line ( @echo.>>%file%).
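One common way to cope with records that contain &, |, and > is delayed expansion. The following is only a minimal sketch, not a replacement for either script: it simply echoes every record of the file passed as the first argument, and the variable name record is used purely for illustration. It assumes the marker string never occurs in the file. A record that begins with a colon would lose its leading colons, because the colon is the FOR /F delimiter.

```batch
@echo off
setlocal disabledelayedexpansion
if "%~1"=="" goto :EOF
rem findstr /n prefixes each line with its number and a colon, so FOR /F does not skip blank lines.
for /f "tokens=1* delims=:" %%s in ('findstr /n /v /c:"dO nOt FiNd" "%~1"') do (
    rem Assign while delayed expansion is off so that exclamation marks in the data survive.
    set "record=%%t"
    setlocal enabledelayedexpansion
    rem The delayed variable is expanded after the line is parsed, so ampersands,
    rem pipes and redirection characters are not treated as operators.
    echo(!record!
    endlocal
)
endlocal
```

The same pattern could be applied to the compare-and-write steps of either script, at the cost of toggling setlocal for every record.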
NOTE: Phil's script pipes the output of the sort command directly into the findstr command, while my script lets the sort write an output file (%work%). Phil's script runs faster on very small files, while mine is twice as fast when sorting larger files.
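The two approaches can be put side by side as follows. This is a hedged sketch that only echoes the sorted records. The work file name is hypothetical and chosen for illustration. It makes no attempt to reproduce the timings mentioned above.

```batch
@echo off
setlocal
if "%~1"=="" goto :EOF
set "infile=%~1"
rem Hypothetical work file next to the input file.
set "work=%~dpnx1.tmp"

rem Variant 1: pipe sort into findstr. Inside the FOR command each pipe is escaped with a caret.
for /f "tokens=1* delims=:" %%a in ('sort "%infile%" ^| findstr /n /v /c:"dO nOt FiNd"') do echo(%%b

rem Variant 2: let sort write the work file first, then read it with findstr.
sort "%infile%" /O "%work%"
for /f "tokens=1* delims=:" %%a in ('findstr /n /v /c:"dO nOt FiNd" "%work%"') do echo(%%b
del "%work%"
endlocal
```

Variant 2 leaves a file behind if the script is interrupted, which is the price of avoiding the pipe.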
NOTE: Phil's script uses an Input_File and an Output_File, while I elected to return the results in the Input_File. I don't delete the sort output file, which I create in the same folder as Input_File. If you wish to delete it, remove the REM from REM if exist %work% del /q %work%.