System Center 2012 Scenario: Fabric Monitoring

This scenario provides the basic configuration and steps necessary to implement fabric monitoring in a private cloud based on Microsoft Windows Server 2012 and Microsoft System Center 2012.

System Center 2012 components and other requirements

This scenario uses the following System Center components in addition to Windows Server 2012.  The scenario assumes that these components are already installed, configured, and working properly.  Basic deployment and configuration information for these components is beyond the scope of this scenario; refer to the documentation for each component for that information.

  • System Center 2012 – Operations Manager
  • System Center 2012 – Orchestrator
  • System Center 2012 – Service Manager
  • System Center 2012 – Virtual Machine Manager

Scenario description/overview

Fabric monitoring is one feature of Fabric Management.  It refers to monitoring those resources that support the private cloud to ensure that they are available and providing adequate performance.  In keeping with the goals of a private cloud, the monitoring needs to be as automated as possible and provide the ability to remediate issues in addition to simply detecting and reporting them. 

The fabric of a private cloud includes the infrastructure required to deliver cloud services.  This includes physical components such as the host servers, storage devices, network devices, and components of the facility such as power.  It also includes virtualized components such as virtual machines, virtual disks, and virtual network components.

Operations Manager is the primary component providing monitoring services in a System Center 2012 environment, and it is the central component of this solution.  Service Manager provides complete incident management and correlation of incidents to the configuration management database. By integrating the two components, alerts in Operations Manager can automatically create incidents in Service Manager where they can be managed by support personnel. 

In addition to detecting and managing issues, an automated monitoring solution should include the ability to remediate detected issues.  Orchestrator provides runbooks that can perform such remediation and that can be launched from Service Manager either automatically or by an operator.

The following table summarizes the function of each component in this solution.

Component                  Function
Operations Manager         Provides discovery of cloud resources, detection of issues, and collection of performance and operations data.
Service Manager            Provides management of incidents and correlation with configuration items in the CMDB.
Orchestrator               Provides runbooks for automated remediation of detected issues.
Virtual Machine Manager    Primary tool for provisioning of cloud resources.

How does this scenario fit into your IT strategy?

Microsoft’s cloud strategy is hosted on the Private Cloud Solution Hub where architectural guidance is located. The strategy describes how a private cloud enables organizations to deliver information technology as services by providing a pool of computing resources delivered as a standard set of capabilities that are specified, architected, and managed based on requirements defined by a private organization.

How do you prepare System Center for this scenario?

The scenario assumes that you have installed and configured the System Center components according to their individual requirements.  It does not assume that you have configured integration between them, because that configuration is part of the scenario.  Further information on the integration between the different System Center components is available in the System Center 2012 Integration Guide hosted on the Microsoft TechNet wiki at http://social.technet.microsoft.com/wiki/contents/articles/13188.system-center-2012-integration-guide.aspx .

How to Accomplish This Scenario

Monitor Physical Resources

Implementing monitoring for the physical resources of the cloud primarily includes implementing standard features of Operations Manager.  This includes deploying agents and management packs for the physical computers and devices and discovering network components.

Compute Resources

The first task in implementing monitoring for servers and blade systems is deploying Operations Manager agents to the physical host computers.  The agent is typically included in the standard server build configuration so that monitoring can begin immediately when a new physical server is deployed.  Configuring Active Directory so that the agent can query for its management group assignment helps the agent locate the correct management group and start initial monitoring.

After the agents are deployed, you need to install management packs to provide the required monitoring.  Monitoring of the operating system and of services such as Hyper-V can be performed by installing standard management packs including the following:

Monitoring of unique aspects of the physical computer must be performed by management packs specific to the brand of device, such as HP and Dell.  These management packs can be located in the management pack catalog for the particular equipment that your private cloud is based on.

Storage Resources

Storage devices are monitored by the operating system management packs for basic issues such as availability and free space.  Detailed monitoring of the physical devices, though, requires management packs specific to the brand of device, such as NetApp, HP, and EMC.  These management packs can be located in the management pack catalog for the particular equipment that your private cloud is built on.

Network Resources

Monitoring of network devices is a standard feature of System Center 2012 Operations Manager, and a variety of different manufacturer devices are monitored without requiring additional management packs.  Before a network device can be monitored, it needs to be discovered.  Operations Manager allows you to explicitly discover individual devices or perform a recursive query that can run on a schedule to identify new devices as they are introduced into the environment.  Using this feature, new devices can be automatically discovered and monitored with no operator intervention.
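
The difference between explicit and recursive discovery can be illustrated with a short Python sketch.  This is a conceptual model only and does not use any Operations Manager API: the `discover` function, the `topology` table, and the device names are all hypothetical and stand in for the neighbor information that a real discovery would read from the devices themselves.

```python
from collections import deque


def discover(seeds, neighbors, recursive=True):
    """Return the devices found when starting from the seed devices.

    neighbors is a hypothetical mapping of device name -> directly
    connected device names. With recursive=False only the seeds
    themselves are returned, which models explicit discovery.
    """
    found = []
    seen = set()
    queue = deque(seeds)
    while queue:
        device = queue.popleft()
        if device in seen:
            continue  # already discovered through another path
        seen.add(device)
        found.append(device)
        if recursive:
            # Follow the device's neighbors to find further devices.
            queue.extend(neighbors.get(device, []))
    return found


# Hypothetical topology used only for illustration.
topology = {
    "core-switch": ["edge-switch-1", "edge-switch-2"],
    "edge-switch-1": ["core-switch", "router-1"],
    "edge-switch-2": ["core-switch"],
}

explicit_result = discover(["core-switch"], topology, recursive=False)
recursive_result = discover(["core-switch"], topology)
```

In this model, explicit discovery returns only the seed device, while recursive discovery also reaches the two edge switches and the router behind one of them.  Running the recursive form on a schedule is what allows newly connected devices to be picked up without operator intervention.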

Monitor Virtual Resources

Monitoring of virtual devices includes monitoring of the virtual machines, virtual disks, and virtual network devices.  These are provisioned and managed by System Center 2012 Virtual Machine Manager, so monitoring of VMM is the primary task required to monitor these resources.

For Operations Manager to discover and monitor VMM components, you must install the System Center Monitoring Pack for System Center 2012 - Virtual Machine Manager.  You must also configure VMM to interact with an Operations Manager management server.  VMM uses the Operations Manager SDK to perform some actions that management packs typically perform for other products, so this configuration is required in addition to installing the VMM management pack.

Part of the configuration of the VMM management pack is enabling Performance and Resource Optimization (PRO).  Management packs that leverage this feature can expose data from Operations Manager in the VMM console.  They can also perform automated actions in response to particular conditions.  Vendors may provide PRO-enabled management packs for their applications or services.

Configure Incident Management

While Operations Manager specializes in detecting issues and collecting operational data, Service Manager provides complete management of the lifecycle of detected incidents.  By integrating the two components, resources discovered by Operations Manager can be managed by Service Manager.  In addition, alerts created by Operations Manager can automatically create corresponding incidents in Service Manager, and the alert and incident are then kept in synchronization between the two tools.  Operations Manager integrates with Service Manager through two types of connectors that are both created and configured in the Service Manager console.

Synchronize Configuration Items

The Configuration Items connector imports objects from Operations Manager as Configuration Items in Service Manager.  Discoveries in Operations Manager locate resources and their properties on managed computers, and the connector allows these objects to be automatically imported into Service Manager.

Any instance of a discovered class in Operations Manager that derives from a common set of classes will be imported into Service Manager.  The only requirement is that you import into Service Manager the management pack that includes the class definitions that you want to import.  The standard set of management packs is imported using a Windows PowerShell script.  Instances of classes that do not derive from this common set of classes will not be imported unless you add their class to the Allowed List.
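
The import rule described above can be modeled in a few lines of Python.  This is only a sketch of the decision logic under stated assumptions, not connector code: the class names, the `parents` map, and the `is_imported` function are hypothetical, and the real set of common base classes is defined by the product.

```python
def derives_from(class_name, parents, roots):
    """Walk up a hypothetical parent map until one of the root classes is found."""
    current = class_name
    while current is not None:
        if current in roots:
            return True
        current = parents.get(current)  # None when there is no parent
    return False


def is_imported(class_name, parents, common_roots, allowed_list):
    """A class is imported if it is on the Allowed List or derives from a common class."""
    return class_name in allowed_list or derives_from(class_name, parents, common_roots)


# Hypothetical class hierarchy: child class -> parent class.
parents = {
    "Vendor.BladeServer": "Base.Computer",
    "Vendor.PowerSupply": "Vendor.Hardware",
}
common_roots = {"Base.Computer"}

blade_default = is_imported("Vendor.BladeServer", parents, common_roots, set())
psu_default = is_imported("Vendor.PowerSupply", parents, common_roots, set())
psu_allowed = is_imported("Vendor.PowerSupply", parents, common_roots, {"Vendor.PowerSupply"})
```

Here the blade server class is imported because it derives from a common class, while the power supply class is imported only after it is added to the allowed list.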

Synchronize Alerts

The Alerts connector imports alerts from Operations Manager into Service Manager so they can be managed as incidents.  The incident in Service Manager remains in synchronization with the alert in Operations Manager allowing updates and resolution to be performed on either side.

To synchronize alerts in Operations Manager with incidents in Service Manager, you must configure a connector in Service Manager and a subscription in Operations Manager.  The subscription defines which alerts will be forwarded to Service Manager.  This may be as simple as forwarding all new alerts with a Critical severity, or you can provide more granular criteria so that only a subset of alerts is managed as incidents.  The connector in Service Manager defines what to do with the alert once it is forwarded.  You can assign a template to different types of alerts in order to provide such details as who the incident is assigned to, its severity, and which CIs are affected.
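
The division of work between the subscription and the connector can be sketched in Python.  This is an illustrative model under stated assumptions, not the product's implementation: the `Alert` type, the keyword-based rules, and the template names are all hypothetical, and real routing criteria are configured in the consoles rather than written as code.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    name: str
    severity: str  # "Critical", "Warning", or "Information"
    is_new: bool


def subscription_forwards(alert):
    """Subscription side: forward only new alerts with a Critical severity."""
    return alert.is_new and alert.severity == "Critical"


def choose_template(alert, rules, default_template):
    """Connector side: the first rule whose keyword appears in the alert name wins."""
    for keyword, template in rules:
        if keyword.lower() in alert.name.lower():
            return template
    return default_template


# Hypothetical routing rules: (keyword in alert name, incident template).
rules = [
    ("disk", "Storage Incident Template"),
    ("network", "Network Incident Template"),
]

alerts = [
    Alert("Logical Disk Free Space Is Low", "Critical", True),
    Alert("Health Service Heartbeat Failure", "Critical", True),
    Alert("Network Adapter Disconnected", "Warning", True),
]

incidents = [
    (a.name, choose_template(a, rules, "Default Incident Template"))
    for a in alerts
    if subscription_forwards(a)
]
```

In this model the Warning alert is never forwarded, the disk alert receives the storage template, and the heartbeat alert falls back to the default template because no rule matches it.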

A typical strategy is to initially assign all incidents a basic template and have support personnel provide these details manually as they review each incident.  As you gain experience with the types of alerts that occur in your environment, you can create more granular templates to automate these details.

Configure Remediation

Management packs in Operations Manager specialize in detecting issues and collecting performance and operations data.  When an incident is assigned to a technician in Service Manager, an operator will be required to analyze the resulting alert and perform the steps required to correct the problem.  Once the operator validates that the problem has been resolved, they resolve the incident accordingly.

Runbooks in Orchestrator can be used to perform automated remediation for certain issues.  Using a runbook, a problem can be corrected and the resulting incident resolved with minimal operator intervention.  A well written runbook will validate that the conditions exist indicating the issue it is designed to correct, perform the required remediation steps, validate that the correction has been made, and then resolve any open alerts or incidents.
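
The four stages of a well-written runbook (validate, remediate, verify, resolve) can be outlined in Python.  This is a sketch of the control flow only: Orchestrator runbooks are built from activities in the Runbook Designer rather than written as Python, and the function names and the simulated `state` dictionary below are hypothetical.

```python
def run_remediation(condition_exists, remediate, resolve):
    """Validate, remediate, verify, then resolve. Returns the outcome as a string."""
    if not condition_exists():
        return "skipped"   # the issue is not present, so do nothing
    remediate()
    if condition_exists():
        return "failed"    # remediation did not correct the issue; leave the incident open
    resolve()
    return "resolved"


# Simulated environment used only for illustration.
state = {"service_running": False, "incident_open": True}


def service_stopped():
    return not state["service_running"]


def start_service():
    state["service_running"] = True


def close_incident():
    state["incident_open"] = False


first_outcome = run_remediation(service_stopped, start_service, close_incident)
second_outcome = run_remediation(service_stopped, start_service, close_incident)
```

The first run finds the stopped service, starts it, confirms the fix, and closes the incident.  The second run finds nothing to correct and exits without taking action, which is the behavior you want if the runbook is launched automatically for an alert that has already cleared.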

Install Integration Packs

Standard runbook activities installed with Orchestrator allow you to perform basic operations with Windows Server 2012, but you must have an Integration Pack installed in order to use activities designed to interact with another application or service. You should install at least the following Integration Packs to interact with the other System Center components.

You should also have the following integration packs installed to interact with basic services of the fabric.

You may require other Integration Packs to interact with other components and services in your environment.  You can typically identify these Integration Packs in the TechNet Library but may need to contact the vendor directly.  Since VMM allows you to interact with virtual resources such as storage and network, its Integration Pack can often be used to interact with these resources.  For those resources with no Integration Pack available, you can use the Run .Net Script activity to run a Windows PowerShell script that accesses the resource.

Create Runbooks

There is no current standard set of runbooks for remediating common issues in the cloud fabric.  You will have to create runbooks for your own environment or obtain them from other vendors.  A standard process is to periodically analyze issues that have occurred with the cloud fabric and determine whether a runbook could be created to automate their remediation. 

Import Runbooks

In Orchestrator, runbooks can be started from the Runbook Designer or the Orchestration Console.  If they are imported into Service Manager, though, they can be included in a Runbook Automation Activity Template, where they can be launched either manually by an operator or automatically as soon as an incident is created.

Importing runbooks into Service Manager has the following advantages over running them from the Runbook Designer or Orchestration Console.

  • Clearly associates a runbook with the incidents that it is designed to remediate.  One or more runbooks can be associated with incidents that match certain criteria so that the operator immediately has the applicable runbooks available.
  • Allows the operator to start the runbook without moving to an alternate console.
  • Provides the option of automatically launching the runbook in response to a particular incident.

In order to import runbooks into Service Manager, you must create a runbook connector.  The connector will connect to the Orchestrator Web Console server at periodic intervals and import runbooks.

Create Templates

To use a runbook that has been imported into Service Manager, you must create a Runbook Automation Activity Template.  This will allow you to define settings for the runbook and to map values into any parameters required by the runbook.  The template can then be used with a Work Item such as a Service Request.

In order to associate the runbook with an incident opened by Operations Manager, you must add the runbook activity template to a template based on the Incident class.  These are the only templates that are available to use in Alert Routing Rules in the Operations Manager connector.  Once the Incident template is created, you can add an Alert Routing Rule providing the criteria of the alerts that the runbook should be associated with.

TechNet Library Topics, TechCenter Pages, Blogs, Forums, etc.
