Q: What happens to running runbooks if the runbook server they're running on unexpectedly shuts down or crashes?
A: If a runbook server fails, another runbook server takes over the execution of its runbooks. However, each runbook restarts from the beginning, because all the data bus content is lost and the new runbook server has no way to know how far the runbook had progressed.
It's therefore very important when writing your runbooks to build in not only error checking but also validations that check whether a certain step has already occurred.
I try to write runbooks under the assumption that every step could have already been performed, so I have a check before performing any action that's not idempotent (i.e., an action that can't be repeated without changing the result). This is also a good reason to try to avoid long-running runbooks, because the longer a runbook runs, the greater the chance of a failure occurring during its execution.
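To make the check-before-act idea concrete, here is a minimal sketch in Python rather than in an actual runbook. Everything in it is hypothetical and for illustration only: the ensure and provision_account functions, the step names, and the in-memory dictionary that stands in for whatever real system (a directory, a database, a file share) your runbook would query. It is not an Orchestrator API.

```python
from typing import Callable, Dict, List


def ensure(name: str,
           already_done: Callable[[], bool],
           action: Callable[[], None],
           log: List[str]) -> None:
    """Run `action` only if `already_done()` says the step has not happened yet."""
    if already_done():
        log.append(f"skip {name}")
        return
    action()
    log.append(f"run {name}")


def provision_account(directory: Dict[str, dict], user: str, log: List[str]) -> None:
    """Hypothetical two-step workflow; each step checks the target's state first."""
    # The dictionary is an assumption standing in for a real external system.
    ensure(
        "create-user",
        already_done=lambda: user in directory,
        action=lambda: directory.__setitem__(user, {"groups": set()}),
        log=log,
    )
    ensure(
        "add-to-group",
        already_done=lambda: "staff" in directory[user]["groups"],
        action=lambda: directory[user]["groups"].add("staff"),
        log=log,
    )
```

Because each step asks the target system whether its work is already done, running provision_account a second time (as would happen after a restart from the beginning) skips the completed steps instead of repeating them. The check reads the state of the target system rather than any state held by the workflow itself, since that is the only state that survives the failure.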
Read more FAQs at John Savill's author page.