Troubleshooting Always On Availability Group failovers can be a complicated and time-consuming task--and time is typically in short supply when you're in the middle of a crisis. Microsoft's new Failover Detection Utility is designed to reduce the amount of time it takes to comb through such logs as SQL Error Logs, Windows Cluster Logs, System Logs, System Health Extended Events Logs, and Availability Group Extended Events Logs.
Consider that some of these logs have to be reviewed for every replica in your Availability Group, and you're adding time on top of time. Microsoft aims to free up some of this time with the new Failover Detection Utility, a tool that requires minimal setup and then does the heavy lifting: it collects the contents of each of these logs, centralizes and analyzes them, and finally reports against all of them to provide insight into unintended or failed Availability Group failovers.
The Failover Detection Utility can be downloaded here. Among the files included in the download is a JSON file (configuration.json) that you configure to identify where you wish to centrally store the culled log files from various sources across the landscape of replicas for your Availability Group. You also need to supply the list of replicas in the Availability Group and the Failover Policy Level for the Availability Group. Those five levels for failover policy are explained here.
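The three settings described above can be sanity-checked before the tool is run. What follows is a minimal Python sketch and not part of the utility itself. The key names ("Data Source Path", "Health Level", "Instances"), the sample path and the replica names are all assumptions made for illustration, so compare them against the configuration.json shipped in the download before relying on them.

```python
import json
from typing import List

# Hypothetical key names: verify against the configuration.json in the download.
PATH_KEY = "Data Source Path"
LEVEL_KEY = "Health Level"
REPLICAS_KEY = "Instances"


def validate_config(config: dict) -> List[str]:
    """Return a list of problems found; an empty list means the basic checks passed."""
    problems = []

    # The central directory where the collected log copies are stored.
    path = config.get(PATH_KEY)
    if not isinstance(path, str) or not path.strip():
        problems.append(f"{PATH_KEY!r} must be a non-empty string")

    # The failover policy level: the article mentions five levels, so accept 1-5.
    level = config.get(LEVEL_KEY)
    if isinstance(level, bool) or not isinstance(level, int) or not 1 <= level <= 5:
        problems.append(f"{LEVEL_KEY!r} must be an integer from 1 to 5")

    # The replicas that make up the Availability Group.
    replicas = config.get(REPLICAS_KEY)
    if (
        not isinstance(replicas, list)
        or not replicas
        or not all(isinstance(r, str) and r.strip() for r in replicas)
    ):
        problems.append(f"{REPLICAS_KEY!r} must be a non-empty list of replica names")

    return problems


# Illustrative content only; the path and replica names are made up.
sample = json.loads(
    r'{"Data Source Path": "D:\\AGLogs", "Health Level": 3, '
    r'"Instances": ["SQLNODE1", "SQLNODE2"]}'
)
print(validate_config(sample))
```

An empty list means the file at least contains the three pieces of information the tool needs. It does not confirm that the path exists or that the replicas are reachable.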
Once you’ve configured the JSON file, you need to “prime” the loading channel by collecting the various log files one last time and loading copies into the directory specified in the JSON file. Note that the security context of the calling user must have proper access to the path given in the configuration.json file. After doing so, you kick off the process by running the executable included in the download: FailoverDetection.exe. The executable accepts additional parameters that control how the logs are collected into the path specified in the JSON file and how the results are reported:
- Default (no parameters provided): The configuration file is loaded, and the log data is collected and analyzed. The output and analysis are written to a report in a subdirectory (“Results”) of the path specified in configuration.json.
- Analyze: With this parameter, the tool works only with the logs already present in the path specified in configuration.json. It performs every task of the default mode except copying the log files.
- Show: When this parameter is supplied, the tool displays the results in the command console in addition to writing the report file to the Results folder.
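The modes above translate into a handful of command lines. The Python sketch below only assembles the argument list and runs nothing. The flag spellings (`--Analyze`, `--Show`) and the idea that the two can be combined are assumptions, so confirm both against the tool's own documentation.

```python
from typing import List

# Assumed flag spellings: confirm against the tool's documentation.
ANALYZE_FLAG = "--Analyze"
SHOW_FLAG = "--Show"


def build_command(
    exe: str = "FailoverDetection.exe",
    analyze: bool = False,
    show: bool = False,
) -> List[str]:
    """Assemble the argument list for one run of the utility."""
    command = [exe]
    if analyze:
        # Skip the log copy and work with logs already in the configured path.
        command.append(ANALYZE_FLAG)
    if show:
        # Also display the results in the console.
        command.append(SHOW_FLAG)
    return command


print(build_command())                         # default mode
print(build_command(analyze=True))             # analyze existing logs only
print(build_command(analyze=True, show=True))  # analyze and display in console
```

The resulting list can be passed to `subprocess.run(command, cwd=tool_folder, check=True)`, where `tool_folder` is a placeholder for wherever you extracted the download.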
The Failover Detection Utility can currently detect multiple failover root causes, spanning planned failovers, SQL Server internal issues, and issues outside the confines of the SQL Server instance on a server. Microsoft offers additional details, including setup instructions and information about the specific root cause analysis the tool currently provides, at the official release blog for the tool.