JSI Tip 7470. How can I manage service failures without configuring each service?


In tip 2516 » Managing Windows 2000 service failures, I described how to configure the actions you desire when a service fails.

I have scripted SvcChg.bat to provide notification, service restart, program execution, and computer restart capabilities for all services.

SvcChg.bat should be scheduled At System Startup, and should be run with an account that is never logged on locally, so that the process runs in the background. The account should have local or domain administrative privileges. I use a Domain Admin account on domain controllers, and a member of the local Administrators group on all other computers.
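On Windows XP, Windows Server 2003, and later, the scheduling described above could also be set up from the command line with schtasks. Windows 2000 does not include schtasks, so there you would use the Scheduled Tasks GUI instead. In this sketch the task name SvcChg, the folder C:\Scripts, and the account MYDOMAIN\SvcChgAcct are assumptions for illustration, not values from this tip.

```batch
:: Create a task that runs SvcChg.bat at system startup under a dedicated account.
:: /rp * makes schtasks prompt for that account's password.
schtasks /create /tn SvcChg /tr C:\Scripts\SvcChg.bat /sc onstart /ru MYDOMAIN\SvcChgAcct /rp *
```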

SvcChg.bat reads its configuration from a SvcChg.txt file that you place in the same folder as SvcChg.bat. The SvcChg.txt file contains:

email=NotifyPartyEmailAddress   - If missing or left blank, no email notification is provided. 

netsend=UserNameOrComputerName  - If missing or left blank, no NET SEND notification is provided.
                                  The Messenger service must be started if you configure netsend=.

seconds=nnn                     - Controls the sleep between checking cycles. SvcChg.bat sleeps for approximately
                                  nnn minus 1 seconds (it runs ping -n nnn). The default is 61, and the maximum allowed is 900.

noactioncnt=                    - Is the number of Seconds= cycles that should elapse before any action is taken.
                                  Example: If you set noactioncnt=2, a service failure won't be actioned unless it
                                           persists for approximately 2*Seconds= seconds. If noactioncnt= is missing
                                           or 0, failures are actioned as they are detected.

startcnt=                       - Is the number of failures detected before a service is restarted. If missing or 0,
                                  the service is never started. Each detection cycle counts as 1 if the service is stopped.
                                  I usually use 1, but 2 or 3 might be used if you know you will be manually
                                  stopping services, and don't want them restarted unless they are still stopped after startcnt= cycles.

runcnt=                         - Is similar to startcnt=, but it determines when the batch file or program configured
                                  at run= will be executed. If your batch file or program starts the service, you could
                                  configure startcnt= as 0.

run=                            - Is the full path to your batch file or program, which is invoked with:
                                  call <YourRun=Object> "Service Display Name" NumberOfTotalTimesThisServiceHasFailed
                                  where the last argument counts failures since the computer was last restarted.

restartcnt=                     - Is similar to startcnt=, but it determines how many failures a service can experience
                                  before SvcChg.bat restarts your computer. If missing or 0, your computer is never restarted.
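As an illustration of the run= option, here is a minimal sketch of a handler that only logs what it is given. The script invokes it as call %run% "%prev%" %times%, so the first argument is the quoted service display name and the second is the failure count. The file name SvcFail.bat, its folder, and the log name SvcFail.log are hypothetical.

```batch
@echo off
:: Hypothetical handler, saved for example as C:\Scripts\SvcFail.bat
:: and named in SvcChg.txt as run=C:\Scripts\SvcFail.bat
:: Argument 1 is the quoted service display name; argument 2 is the failure count.
setlocal
set "svc=%~1"
set "count=%~2"
:: Append one line to a log next to this handler. The redirection is written first
:: so that a trailing digit in the text cannot be taken as a file handle number.
>>"%~dp0SvcFail.log" echo %date% %time% - "%svc%" stopped, failure count %count%
endlocal
```

Because SvcChg.bat calls %run% without quoting it, a path that contains no spaces is the safer choice for run=.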

The contents of my SvcChg.txt files are generally:

email=[email protected]
netsend=JSI009
noactioncnt=0
run=
runcnt=0
startcnt=1
restartcnt=5
seconds=61

where JSI009 is my desktop computer. On domain controllers, and some servers,
I leave netsend= blank, as the Messenger service is NOT started.

NOTE: SvcChg.bat uses Blat for email notification and PsShutdown to restart your computer.
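Before relying on email= it may be worth confirming that Blat can send mail from the computer. The following is a sketch only: mail.example.com and all three addresses are placeholders, and the test file name is hypothetical. The second blat command mirrors the way SvcChg.bat calls Blat (body file, -to, -s).

```batch
:: One-time Blat setup: SMTP server, then the sender address. Both are placeholders.
blat -install mail.example.com svcchg@example.com
:: Send a test message the same way SvcChg.bat does: body file, recipient, subject.
echo SvcChg test>"%TEMP%\blattest.txt"
blat "%TEMP%\blattest.txt" -to admin@example.com -s "%computername% - SvcChg test"
```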

NOTE: A small SvcChg.log file is written to the same folder that contains SvcChg.bat, as are a few working files.

NOTE: If a service that was not running when SvcChg.bat recorded the initial service state is later started, SvcChg.bat uses email= and netsend= to send "%computername% - New service <Service Display Name> was started."

NOTE: If you make service configuration changes, like disabling a service, use Scheduled Tasks to End Task the SvcChg.bat job, and restart it once you are finished. If you have restartcnt= configured when you stop and disable a service, your computer will restart after restartcnt= cycles, unless you stop the SvcChg.bat job.
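Where schtasks is available (Windows XP, Windows Server 2003, and later), the stop and restart described in this note could be done from a command prompt instead of the Scheduled Tasks GUI. This is a sketch that assumes the job was scheduled under the hypothetical task name SvcChg.

```batch
:: Stop the running SvcChg.bat job before changing service configuration.
schtasks /end /tn SvcChg
:: ... make your service changes here, for example disabling a service ...
:: Start the job again; it records a fresh baseline of started services.
schtasks /run /tn SvcChg
```

After the job restarts, SvcChg.bat waits about 180 seconds (ping -n 181) before it records the initial service state.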

SvcChg.bat contains:

@echo on
setlocal
:: Allow time for all services to start after the computer restarts.
ping -n 181 127.0.0.1>nul
set cfg="%~DP0SvcChg.txt"
set newsvc="%~DP0SvcChg.new"
set pipe="%~DP0SvcChg.log"
if exist %pipe% del /q %pipe%
call :SvcChgLog>>%pipe% 2>>&1
endlocal
goto :EOF
:SvcChgLog
set tempsvc="%TEMP%\newsvc.tmp"
set svctmp="%TEMP%\SvcChgTMP.TMP"
set svccnt="%TEMP%\svccnt.tmp"
if exist %newsvc% del /q %newsvc%
if exist %svccnt% del /q %svccnt%
if exist %tempsvc% del /q %tempsvc%
if exist %svctmp% del /q %svctmp%
if not exist %cfg% @echo %cfg% is missing.&endlocal&goto :EOF
for /f "Tokens=*" %%c in ('type %cfg%') do set %%c
cd /d %TEMP%
if {%noactioncnt%} LSS {0} set noactioncnt=0
if {%noactioncnt%} GTR {999} set noactioncnt=0
if {%runcnt%} LSS {0} set runcnt=0
if {%runcnt%} GTR {999} set runcnt=0
if {%startcnt%} LSS {0} set startcnt=0
if {%startcnt%} GTR {999} set startcnt=0
if {%restartcnt%} LSS {0} set restartcnt=0
if {%restartcnt%} GTR {999} set restartcnt=0
if {%seconds%} LSS {0} set seconds=61
if {%seconds%} GTR {900} set seconds=900
set /a noactioncnt=1000%noactioncnt%%%1000
set /a runcnt=1000%runcnt%%%1000
set /a startcnt=1000%startcnt%%%1000
set /a restartcnt=1000%restartcnt%%%1000
set /a seconds=1000%seconds%%%1000
@echo off
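:: Record the baseline list of started services.
net start>N1.TXT
:loop
:: Sleep, then take a fresh snapshot to compare with the baseline.
ping -n %seconds% 127.0.0.1>nul
net start>N2.TXT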
if exist diff.txt del /q diff.txt
for /f "Skip=1 Tokens=*" %%f in ('fc /c /l N1.TXT N2.TXT') do set line=%%f&call :parse
sort diff.txt /o diffs.txt
set prev=none
set previd=first
for /f "Tokens=1* Delims=#" %%f in (diffs.txt) do set line=%%f&set id=%%g&call :parse1
if /i "%previd%" EQU "first" goto loop
if /i "%previd%" NEQ "none" call :svc
goto :loop
:parse
if "%line:~0,5%" NEQ "*****" goto idit
if "%line:~5,1%" EQU "" goto :EOF
set id=%line:~6%
goto :EOF
:idit
@echo %line%#%id%>>diff.txt
goto :EOF
:parse1
if /i "%line%" EQU "FC: no differences encountered" goto :EOF
if /i "%line%" EQU "%prev%" set prev=none&set previd=none&goto :EOF
if /i "%previd%" EQU "first" goto OK
:bad
if /i "%previd%" EQU "none" goto OK
call :svc
set prev=none
set previd=none
if /i "%id%" EQU "n2.txt" goto :EOF
:OK
set prev=%line%
set previd=%id%
goto :EOF
:svc
if /i "%previd%" EQU "N2.TXT" goto newservice
if not exist %svccnt% @echo %prev%#0>%svccnt%
if exist %svctmp% del /q %svctmp%
@echo %computername% - %prev%>%tempsvc%
set msg=
set shut=N
for /f "Tokens=1,2 Delims=#" %%c in ('type %svccnt%') do set svcname=%%c&set /a times=1000%%d%%1000&call :parse3
del /q %svccnt%
if exist %svctmp% copy %svctmp% %svccnt%
if "%shut%" EQU "Y" psshutdown -f -r -t 0
goto :EOF
:parse3
if /i "%prev%" NEQ "%svcname%" goto parse4
set /a times=%times% + 1
if %times% LSS %noactioncnt% goto :parse4
if %startcnt% GTR 0 if %times% GEQ %startcnt% set msg=%msg%Started&net start "%prev%"
if defined run if %runcnt% GTR 0 if %times% GEQ %runcnt% set msg=%msg%, %run%&call %run% "%prev%" %times%
if %restartcnt% GTR 0 if %times% GEQ %restartcnt% set msg=Computer restarted.&set shut=Y
if "%msg%" EQU "" set msg=No Action.
if "%msg:~0,1%" EQU "," set msg=%msg:~2%
if defined email blat %tempsvc% -to %email% -s "%computername% - Service %prev% is stopped. Action=%msg%"
if defined netsend net send %netsend% "%computername% - Service %prev% is stopped. Action=%msg%"
:parse4
@echo %prev%#%times%>>%svctmp%
goto :EOF
:newservice
set newpost=Y
if not exist %newsvc% goto postnew
for /f "Tokens=*" %%c in ('type %newsvc%') do if /i "%prev%" EQU "%%c" set newpost=N
:postnew
if "%newpost%" NEQ "Y" goto :EOF
@echo %prev%>>%newsvc%
@echo %prev%>%tempsvc%
if defined email blat %tempsvc% -to %email% -s "%computername% - New service %prev% was started."
if defined netsend net send %netsend% "%computername% - New service %prev% was started."
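The script normalizes each numeric setting with a line of the form set /a name=1000%name%%%1000. A small stand-alone demonstration of that idiom follows; the variable names blank, small, and padded are made up for the demo. Prefixing 1000 and taking the remainder modulo 1000 turns a missing value into 0 and keeps a value with a leading zero from being parsed as octal.

```batch
@echo off
setlocal
:: Demo of the normalization idiom used in SvcChg.bat.
set blank=
set small=5
set padded=08
:: 1000 mod 1000 is 0, so an undefined value becomes 0.
set /a blank=1000%blank%%%1000
:: 10005 mod 1000 is 5, so an ordinary value is unchanged.
set /a small=1000%small%%%1000
:: 100008 mod 1000 is 8; a plain "set /a padded=08" would fail as an invalid octal number.
set /a padded=1000%padded%%%1000
echo blank=%blank% small=%small% padded=%padded%
endlocal
```

One consequence of the idiom is that only the last three digits of a value survive, which is why every setting is limited to the range 0 through 999.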


