This Orchestrator Runbook, “SCOM2012R2_Check_HealthService”, is set up to capture a “Health Service Heartbeat Failure” alert for Windows machines and respond by either starting the HealthService or, when the cache is suspected to be corrupted, stopping the service, deleting the HealthService cache folder, and restarting the service.
The Runbook captures the alert from SCOM, waits 60 seconds, and then pings the machine. If the ping is unsuccessful, it sends an email indicating that the machine is actually offline. If the ping is successful, it waits a further 180 seconds and then checks whether the HealthService on the machine is running.
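The wait-and-ping stage described above can be sketched in Python. This is only an illustration of the control flow, not the Runbook itself: the function name `triage_heartbeat_alert`, the returned step names, and the injected `ping` and `sleep` callables are all hypothetical, and only the 60-second and 180-second waits come from the description.

```python
import time
from typing import Callable


def triage_heartbeat_alert(
    host: str,
    ping: Callable[[str], bool],
    sleep: Callable[[float], None] = time.sleep,
    initial_wait: float = 60,
    settle_wait: float = 180,
) -> str:
    """Return the next step after a heartbeat failure alert for `host`.

    `ping` is a hypothetical callable that returns True when the host
    answers; `sleep` is injectable so the flow can be exercised without
    actually waiting.
    """
    # Wait before pinging, as the Runbook does after capturing the alert.
    sleep(initial_wait)

    if not ping(host):
        # The machine is actually offline: only an email is sent.
        return "send_offline_email"

    # The machine is reachable: wait again before checking the service.
    sleep(settle_wait)
    return "check_health_service"
```

Passing the waits and the ping in as parameters keeps the sketch independent of any particular ping implementation, which the description does not specify.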
If the HealthService is running, the likely cause is a corrupted cache folder, so the Runbook stops the HealthService, deletes the cache folder, and restarts the service.
If the HealthService is not running, the Runbook starts the service.
In both cases, an email is sent as an informational alert to indicate that the Runbook resolved the issue.
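The two remediation branches can be summarised as a small Python sketch. The function name `remediation_steps` and the step labels are hypothetical, chosen only to mirror the activities named in this post; the real work is done by the Orchestrator activities, not by this code.

```python
from typing import List


def remediation_steps(service_running: bool) -> List[str]:
    """Return the ordered steps for a reachable machine.

    The step names are illustrative labels, not Orchestrator identifiers.
    """
    if service_running:
        # Service is up but not heartbeating: suspect a corrupted cache.
        steps = ["stop_service", "delete_cache_folder", "start_service"]
    else:
        # Service is simply stopped: starting it is enough.
        steps = ["start_service"]

    # Both branches end with the same informational email.
    steps.append("send_information_email")
    return steps
```

For example, `remediation_steps(False)` yields the start step followed by the email, while `remediation_steps(True)` yields the full stop, delete, start sequence followed by the email.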
Details of Configuration
Monitor Alert Properties:
Link from Monitor Alert to Run Program:
Link from Run Program to Get HealthService Status:
Link from Get HealthService Status.
If Not running:
Start HealthService Properties:
The Stop HealthService properties are almost the same as those for Start HealthService, so they are omitted here.
Delete Folder Properties:
This pertains to SCOM 2012 R2. There is a duplicate Runbook with the same configuration that checks against the old folder structure: