Step-by-Step – SCOM 2012 R2 Update Rollup 7 (UR7) Install Procedure

These are my personal notes. UR7 includes a lot of security fixes, and it is highly recommended that you upgrade your lab/Dev environments before upgrading your Production environment(s). The step-by-step procedure below is what I did; I accept no responsibility for any data loss or other issues in your environment. Always take a backup of your SQL databases and/or snapshots of your SCOM environment(s) first. Please treat these notes as suggestions, and always refer to Microsoft’s KB article (linked below) for the full documented steps.

The issues fixed in this update rollup (source: Microsoft) are listed here: https://support.microsoft.com/kb/3064919

Once you are ready to begin your upgrade, it is recommended that you update the servers/roles in the following order:

  1. Install the update rollup package on the following server infrastructure:
     • Management server or servers
     • Gateway servers
     • Web console server role computers
     • Operations console role computers
  2. Apply SQL scripts.
  3. Manually import the management packs.
  4. Apply the agent update to manually installed agents, or push the installation from the Pending view in the Operations console.

Once you have downloaded the rollup files, I like to extract them and keep only the language I need, in this case ENU (English). The packages must be installed with administrative rights; I like to launch them from a PowerShell session running as a local Administrator. Frustratingly, there is no indication that a rollup installed correctly, other than checking in File Explorer that the file version number has changed.
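Because the installer gives no confirmation, the File Explorer check can also be done from the same PowerShell session. The following is only a minimal sketch: Get-FileVersion and Test-VersionAtLeast are helper names I made up (they are not part of any SCOM module), and which file to check and which version to expect are things you need to take from the KB article.

```powershell
# Hypothetical helpers; nothing here ships with SCOM.

# Read a file's version as a [version] object so it can be compared numerically
function Get-FileVersion
{
    param([Parameter(Mandatory=$true)] [string] $Path)
    $info = (Get-Item -LiteralPath $Path).VersionInfo
    New-Object System.Version -ArgumentList $info.FileMajorPart, $info.FileMinorPart, $info.FileBuildPart, $info.FilePrivatePart
}

# True when $Actual is the same as, or newer than, $Minimum
function Test-VersionAtLeast
{
    param(
        [Parameter(Mandatory=$true)] [version] $Actual,
        [Parameter(Mandatory=$true)] [version] $Minimum
    )
    $Actual -ge $Minimum
}
```

To use it, pass the version of a file the rollup replaces, for example Test-VersionAtLeast -Actual (Get-FileVersion "<path to an updated file>") -Minimum "<UR7 version from the KB>". Both placeholders are yours to fill in.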


Once the rollups are installed, you need to apply the SQL scripts. Update the Data Warehouse first, followed by the OpsMgr DB.

The scripts can be found here, “C:\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\SQL Script for Update Rollups\”

Please note, the user executing these scripts needs to have read and write permissions to the database(s).


Once you have successfully executed the SQL scripts, you will now need to import the updated Management Packs. These MPs can be found here, “C:\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\Management Packs for Update Rollups\”.

You will need to import the following MPs:

[Screenshot: list of Management Packs to import]

Once the MPs have been imported, go back to the Pending Management view, under the Administration pane, and update all servers listed there.


And that is that! You are now on the latest System Center release for SCOM 2012 R2.


Management Pack Backup Automation

Backup, backup, backup. It is never a bad idea to be safe and back up your data, or in this case your Management Packs. I recently created a scheduled task that runs every Monday morning and backs up all the Management Packs in the environment. Please note that all Management Packs, sealed and unsealed, are backed up in an unsealed format.

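Here is the PowerShell code I use:

# Date stamp for the backup folder name (yyyy-MM-dd)
$date = Get-Date -Format "yyyy-MM-dd"
# The script runs on a management server, so connect to the local machine
$rootMS = $env:COMPUTERNAME

Import-Module OperationsManager
New-SCOMManagementGroupConnection -ComputerName $rootMS

# Create a dated folder on the backup share
$path = New-Item -ItemType Directory -Path "\\somepath\MPBackup\$date"

# Export every Management Pack as unsealed XML into that folder
Get-SCOMManagementPack | Export-SCOMManagementPack -Path $path.FullName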

Maintenance Mode History with SQL

Unfortunately, SCOM 2012 R2 does not have a native report or view that lets you quickly see the maintenance history of a specific server or servers. I have used this handy SQL query many times to find out when a given machine entered MM (Maintenance Mode). As written, the query runs against the Data Warehouse DB (OperationsManagerDW); specify the server(s) you are interested in and the date range:

USE OperationsManagerDW

-- Replace the server name pattern and the two date placeholders before running.
-- Dates in the Data Warehouse are stored in UTC.
SELECT ME.DisplayName, MMH.*
FROM ManagedEntity AS ME WITH (NOLOCK)
INNER JOIN MaintenanceMode AS MM
       ON ME.ManagedEntityRowId = MM.ManagedEntityRowId
INNER JOIN MaintenanceModeHistory AS MMH
       ON MM.MaintenanceModeRowId = MMH.MaintenanceModeRowId
WHERE ME.DisplayName LIKE 'server%.domain.net'
  AND MMH.ScheduledEndDateTime BETWEEN 'fromDateRange' AND 'toDateRange'

Wintel Gray Agents Runbook Automation

This Orchestrator Runbook, “SCOM2012R2_Check_HealthService”, is set up to capture a “Health Service Heartbeat Failure” alert for Windows machines and then either restart the HealthService, or delete the corrupted HealthService cache folder and restart the service.

The Runbook captures the alert from SCOM, waits 60 seconds, and then pings the machine. If the ping is unsuccessful, it sends an email indicating that the machine is actually offline. If the ping is successful, it waits another 180 seconds and then checks whether the HealthService on the machine is running.

If the HealthService is running, the cause is possibly a corrupted cache folder. The Runbook stops the HealthService, deletes the cache folder, and restarts the service.

If the HealthService is not running, the Runbook starts the service.

In both cases, an email is sent out as an informational alert to indicate that the Runbook resolved the issue.
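The branching described above can be summarised as a small decision function. This is only an illustrative sketch in PowerShell, not how the Runbook itself is built (it is assembled from Orchestrator activities); the function name and the action names it returns are hypothetical labels of my own.

```powershell
# Hypothetical function: maps the two checks the Runbook performs
# (ping, then HealthService state) to the actions it takes, in order.
function Get-GrayAgentActions
{
    param(
        [Parameter(Mandatory=$true)] [bool] $PingSucceeded,
        [bool] $ServiceRunning = $false
    )

    if (-not $PingSucceeded)
    {
        # Machine is really offline: nothing to repair, just notify
        return @('SendOfflineEmail')
    }
    if ($ServiceRunning)
    {
        # Service is up but the agent is gray: suspect a corrupted cache
        return @('StopHealthService', 'DeleteCacheFolder', 'StartHealthService', 'SendResolvedEmail')
    }
    # Service is stopped: starting it is enough
    return @('StartHealthService', 'SendResolvedEmail')
}
```

For example, Get-GrayAgentActions -PingSucceeded $true -ServiceRunning $true returns the stop, delete, start and email steps in that order.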


Details of Configuration

Monitor Alert Properties:

[Screenshot]

Link from Monitor Alert to Run Program:

[Screenshot]

Link from Run Program to Get HealthService Status:

[Screenshot]

Link from Get HealthService Status, if not running:

[Screenshot]

Start HealthService Properties:

[Screenshot]

Since the Stop HealthService Properties are almost the same as the Start HealthService Properties, they are omitted here.

Delete Folder Properties:

This pertains to SCOM 2012 R2. There is a duplicate Runbook with the same configuration that checks against the old folder structure:

[Screenshot]

[Screenshot]

SCOM Wintel Gray Agents Health State and Cache Flush – Part II Automation

In the previous post, we learned that we can clear the agent’s cache and recycle the health service, and that this will (hopefully) resolve our gray agent issue. But what happens when we have to do this for hundreds of agents? One word: PowerShell. PowerShell lets us automate this tedious task across hundreds of servers and makes it very quick! Here is the code I use.

Just make sure all of your servers are in the list you provide and, of course, that the account you run the script as has local administrative rights on each server.

# list.txt holds one server name per line
$list = Get-Content ".\list.txt"

# Print Successful/Failed for a WMI method result (ReturnValue 0 means success)
function Write-Result($result)
{
       if ($result.ReturnValue -eq 0) { Write-Host "Successful" -ForegroundColor Green }
       else { Write-Host "Failed" -ForegroundColor Red }
}

foreach ($server in $list)
{
       Write-Host "$server Check Service: " -NoNewline
       $service = Get-WmiObject Win32_Service -ComputerName $server -Filter "Name='HealthService'"
       if ($service.State -eq "Running")
       {
              # Stop the agent so the cache folder is no longer in use
              Write-Result ($service.StopService())
              Start-Sleep 5

              # Delete the agent cache folder (8.3 short path; backslashes are doubled for WQL)
              $folder = Get-WmiObject Win32_Directory -ComputerName $server -Filter "Name='C:\\PROGRA~1\\SYSTEM~1\\Agent\\HEALTH~1\\HEALTH~1'"
              Write-Result ($folder.DeleteEx())

              # Start the agent again; it rebuilds the cache folder
              Write-Result ($service.StartService())
       }
       else
       {
              Write-Host "Stopped"
       }
}

SCOM Wintel Gray Agents Health State and Cache Flush

The problem: you launch your SCOM console and find a Windows-based machine in a “Not monitored” state. You browse to the server, check the health service, and it is clearly running… So now what?

While SCOM thinks the machine is unresponsive, we can confirm this is not the case: we can ping the machine, and we are also able to log in to it.


This is a result of the SCOM health service needing its cache to be cleared.

SCOM has a built-in task to do exactly what we want; however, since SCOM believes the machine is offline, it cannot run the task against the “Not monitored” machine.

SOLUTION – MANUAL PROCESS

  1. Remote into the machine and launch Services (services.msc). Locate the Microsoft Monitoring Agent service and stop it.

  2. Once the service has stopped, browse to the following folder, “C:\Program Files\Microsoft Monitoring Agent\Agent\”, and delete the entire “Health Service State” folder.

  3. Go back to Services (services.msc) and start the Microsoft Monitoring Agent service. This will rebuild the folder we just deleted.

Give SCOM a few seconds, maybe a few minutes, and the Health State of our machine will turn back to healthy!
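If you prefer the keyboard to services.msc, the same three steps can be run locally from an elevated PowerShell session on the affected machine. This is a rough sketch under a couple of assumptions: Reset-AgentCache is a name I made up, the default folder is the path from step 2, and HealthService is the service name behind the Microsoft Monitoring Agent display name.

```powershell
# Hypothetical helper: the manual steps, run locally as an administrator.
function Reset-AgentCache
{
    param(
        # Path taken from step 2; adjust if the agent is installed elsewhere
        [string] $StatePath = 'C:\Program Files\Microsoft Monitoring Agent\Agent\Health Service State'
    )

    # Step 1: stop the Microsoft Monitoring Agent service
    Stop-Service -Name HealthService

    # Step 2: delete the entire Health Service State folder
    Remove-Item -LiteralPath $StatePath -Recurse -Force

    # Step 3: start the service again; the agent rebuilds the folder
    Start-Service -Name HealthService
}
```

Calling Reset-AgentCache with no arguments uses the default path; pass -StatePath if your agent lives somewhere else.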