Practical SQL Server

Setting Up Additional Checks to Ensure Regular Transaction Log Backups

There’s simply no way to overstate the importance of regular Transaction Log Backups. Not only do they help protect from disaster, but regular execution of T-Log backups on Full (and Bulk-Logged) Recovery databases helps keep thing “fit and trim”. Most of the time, setting up a Notification for when T-Log Backup Jobs fail is enough to let you know when something goes wrong.

However, I have seen a couple of edge cases where this simply wasn’t enough. Typically this has been in multi-tenant systems and/or in systems with a high volume of transactions where any of the following things can or might have happened:

  • The UNC path or destination that the MIRROR TO clause (an Enterprise Edition only feature that lets you save backups to two locations (i.e., locally and to an off-box secondary location)) is pointing to is having issues and has slowed things down enough that while backups haven’t exactly failed, they’re no longer running on a regular schedule and it might therefore have been an hour or more since some databases have been adequately backed up.
  • Some third party SQL Server Backup Solutions have a GREAT ‘retry’ feature that lets you specify that if a backup fails for some reason, it should be retried N times with X amount of seconds to wait between each try. This is a great feature – unless you’ve run into problems, have too long of a wait between retries, and have GOBs of retries going on against multiple databases – at which point your backup job won’t actually ‘fail’ but can end up being seemingly ‘stalled’ for hours at a time. (e.g., if I’ve got a job set up to backup T-Logs against 20 databases every 10 minutes, and there’s some sort of problem executing the backups ‘retry’ attempts against a single database might end up taking 20 minutes – then the job will move on to the next database and so on – until you’ve spent hours with retry logic slowly sabotaging your backup chain before it’ll ever raise an error or alert)

In cases like this, and where it’s even more essential that T-Log backups are happening regularly, I’ve created a simple script that can be used to checkup on both the job that runs your backups (to make sure it has successfully completed within the last N minutes) and to checkup on the last backup dates of your FULL recovery databases.

The one obvious limitation with this job is that it runs via the SQL Server Agent – which means that if the SQL Server Agent (which is handling your backups) somehow crashes, then this ‘watchdog’ won’t be able to run either. (That said, in years of working with SQL Server I’ve never seen the SQL Server Agent seriously crash or unable to run – well, except when some enterprising soul explicitly shut it down. So if any of this is a concern, you might want to try and run this script via some OTHER scheduling mechanism than simply dropping it into a new SQL Server Agent Job and setting up a regular schedule for it to run under.)

Otherwise, the script is pretty simple. Just specify how far back you’d like your checks to run, and the name of the Job that’s handling your T-Log backups. (If you’ve got multiple jobs you could easily tweak this to account for an IN() clause.) Then, specify the name of an operator and the Database Mail Profile to use when sending alerts and this script will do the rest - once you schedule it for regular execution.


-- Variables:
DECLARE @NumberOfPastMinutesToCheck int = 30;
DECLARE @LogBackupJobName sysname = N'Customer Databases - TLOG';

-- Alerting details:
DECLARE @OperatorName sysname = 'Alerts';
DECLARE @DbMailProfileName sysname = 'General';

-- Logic/Implementation:

DECLARE @JobID uniqueidentifier; 
SELECT @JobID = job_id 
FROM msdb..sysjobs 
	name = @LogBackupJobName;

DECLARE @Subject nvarchar(200);
DECLARE @Body nvarchar(200);

-- Start with review of Backup Job - check to make sure it has 
--		run (successfully) within the last @NumberOfPastMinutesToCheck
DECLARE @ago datetime = DATEADD(n, - @NumberOfPastMinutesToCheck, GETDATE());

DECLARE @DateAsInt int;
DECLARE @TimeAsInt int;

SELECT @DateAsInt = CAST(CONVERT(varchar(20), @ago, 112) AS int);
SELECT @TimeAsInt = CAST(REPLACE(CONVERT(varchar(20), @ago, 108),':','') AS int);

DECLARE @states TABLE ( 
	run_date int, 
	run_time int, 
	run_status int

INSERT INTO @states 
	run_date >= @DateAsInt 
	AND run_time >= @TimeAsInt
	AND job_id = @JobId
	AND step_id = 0 -- job outcome (only)
	AND run_status = 1; -- succeeded (only)

DECLARE @successCounts int;
SELECT @successCounts = COUNT(*) FROM @states;

IF @successCounts < 1 BEGIN 
	SET @Subject = 
		'[SQL Server Backups] Transaction Log Backup Failure'
	SET @Body = 'The Transaction Backup Job (with ID ' 
		+ CAST(@JobId AS varchar(36)) 
		+ ' has not successfully completed in the last ' 
		+ CAST(@NumberOfPastMinutesToCheck AS varchar(20)) 
		+ ' minutes.';

	EXEC msdb..sp_notify_operator 
		@profile_name = @DbMailProfileName, 
		@name = @OperatorName, 
		@subject = @Subject, 
		@body = @Body;

-- Now check to see if any databases haven't 
--		been backed up in last @NumberOfPastMinutesToCheck:

DECLARE @databases TABLE (
	[name] sysname

INSERT INTO @databases
	recovery_model_desc = 'FULL'
	AND name NOT IN ('model')
		DISTINCT database_name 
		FROM msdb.dbo.backupset 
		WHERE [type] = 'L' AND backup_finish_date > @ago);

DECLARE @failedDbs nvarchar(2000) = '';

SELECT @failedDbs = @failedDbs + name + ',' 
	FROM @databases ORDER BY name;

IF @failedDbs != '' BEGIN 
	SET @Subject = '[SQL Server Backups] Transaction Log Backup Failure';
	SET @Body = 'The following databases have not been backed up within the last ' 
		+ CAST(@NumberOfPastMinutesToCheck AS varchar(20)) + ' minutes: ' 
		+ CHAR(13) + CHAR(10) + CHAR(9) + @failedDbs + '.';

	EXEC msdb..sp_notify_operator 
		@profile_name = @DbMailProfileName, 
		@name = @OperatorName, 
		@subject = @Subject, 
		@body = @Body;


Hide comments


  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.