
SLA in SCSM 2012. Part 3. “Hidden” features and “pausing” SLA


In the last two articles of this SLA series I wrote about how SLA objects are stored and described how the system works. Today it’s time to talk about some “hidden” features of the SLA management system in SCSM 2012. Most of these features are available out of the box and supported by Microsoft, but some of them are not.

Queues

The main goal of an SLA management system is to track compliance with the agreement, and the fundamental question for SLA tracking is which elements must be tracked. In SCSM 2012 this is handled by queues. All other parameters, such as the calendar, metric and SLO, are very simple and have no particularly interesting settings. Queues, however, are a powerful instrument for SLA management: the queue determines which objects a given SLO is applied to. A small book could (and even should) be written about queues and how to use them in SCSM, but in this article I will describe only the small part related to the SLA management system.

Queue is a group

… and you must keep that in mind each time you use one. This means that:

  1. Members can be added to a queue dynamically (by a membership rule) or statically (as an “explicit member”)
  2. You can use exclusions
  3. The number of queues and the complexity of their membership rules affect the performance of the SCSM server and database.

Unfortunately, the SCSM console lets you use only dynamic membership rules. But you can always export the management pack that contains the queue and change it as you want.

Membership rules

As I wrote above, dynamic membership requires a membership rule. A membership rule is just a set of criteria that the SCSM engine uses to populate the queue with objects. Unfortunately, most SCSM administrators use only the simplest criteria, such as “Urgency = High and Priority = High” or similar. But a membership rule allows much more complex criteria: you can use not just the properties of the target class but also any relationship of that class. Do you have a “VIP User” property in your User class and want a queue with all incidents where the affected user is a VIP? No problem: just select the “Incident (typical)” combination class and use the Affected User relationship and any properties of the User class in your criteria. Do you want a queue with all service requests that affect a given Business Service? No problem either, although it is a little more complex: create your own combination class with the “Affected Configuration Items” relationship restricted to the Business Service class.

Combination classes can be nested to any level. For example, you can create a combination class for a service request with the affected business service and add the “Service owner” relationship as a child. You can then build a queue with all service requests where the owner of the affected service is located in a given organizational unit or works for a given company; in other words, you can use any property of the service owner.

Also note that you can use relationships not only from source to target but in the reverse direction too. For example, you can create a combination class for incident and add “Has parent work item”, but set the direction from target to source (in this example the target is the child incident and the source is the parent) with SeedRole="Target". As a result you can use this combination class to create a queue with all incidents that have a parent incident with Urgency = “High” (or use any other property of the parent incident).

A short summary about combination classes: if the combination classes provided out of the box do not contain all the relationships you need, you can always create your own. A high-level overview of the process:

  1. Create a combination class (or “type projection” in terms of the object model) with the necessary relationships and restrictions. You can do this with any text editor or with VSAE.
  2. Import the management pack with the new combination class into SCSM
  3. Create a new queue based on the combination class from that management pack

It’s time for a demo of membership rules. I am a little lazy, so I will not show how to create a combination class and will use an existing one instead. Let’s assume we have an OU in our domain that contains the accounts of our CIO, CEO and other executives. We want to create a queue with all incidents where the affected user is located in this OU. To do that we can:

  1. Create a new queue and select the “Incident (typical)” combination class as the target:
    image32
  2. On the “Criteria” tab select “Affected User” and add the “Distinguished Name” property to the criteria. This property contains the full LDAP path of the object in “CN=…,OU=…,DC=…,DC=…” format, so we can use the “Contains” or “Ends with” operators with the full LDAP path of the OU as the filter. With the criteria in the screenshot below, the queue will contain all incidents where the affected user is located in the OU “SystemCenter Inc\Chiefs” or any OU below it:
    image33
    Here is the AD layout:
    image34
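The effect of the “Ends with” operator on the Distinguished Name can be sketched in plain PowerShell. This only illustrates the string comparison and is not how the SCSM engine itself evaluates the rule; the distinguished names, the domain components and the Test-InOu helper are all hypothetical.

```powershell
# Hypothetical LDAP path of the OU, used as the "Ends with" filter value.
$ouPath = 'OU=Chiefs,DC=systemcenter,DC=inc'

# Hypothetical distinguished names of affected users.
$users = @(
    'CN=Alice,OU=Chiefs,DC=systemcenter,DC=inc',
    'CN=Bob,OU=Board,OU=Chiefs,DC=systemcenter,DC=inc',
    'CN=Carol,OU=Staff,DC=systemcenter,DC=inc'
)

function Test-InOu {
    param([string]$DistinguishedName, [string]$OuPath)
    # A DN lists its containers from the object up to the domain root,
    # so a user in the OU or in any child OU has a DN ending with ",<OU path>".
    return $DistinguishedName.EndsWith(",$OuPath", [System.StringComparison]::OrdinalIgnoreCase)
}

# Alice (directly in the OU) and Bob (in a child OU) match; Carol does not.
$matched = @($users | Where-Object { Test-InOu -DistinguishedName $_ -OuPath $ouPath })
$matched
```

Matching on the leading comma plus the OU path keeps an OU with a similar name (for example a hypothetical “OU=NotChiefs”) out of the result.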

To get the queue members you can use my script published earlier. But note that a queue is a group, and groups are recalculated every 60 seconds (by default).

That is all you can do from the UI. But if you export the management pack with the queue, you can also:

  1. Set exclusions for the queue. This can be helpful if you want to exclude some objects from SLA management.
  2. Use complex operators like Contains/NotContains and Contained/NotContained. These allow you to check whether an object contains (or is contained in) other groups/queues
  3. Use more than one membership rule for one queue

In other words, you can use all the features of the group membership engine in OpsMgr/SCSM.

Queue and performance

Because a queue is a group, queues have the same limitations as groups. First of all (yes, this sounds a little strange, but…) try to use as few queues as possible:

  • Try to use the same queue for everything: for SLA, for roles, for notifications
  • Remove queues as soon as they are no longer used

The second rule: use combination classes with the fewest possible relationships. And you should never use “* (typical)” combination classes in your queues.

The third rule: if you have a lot of queues and groups, you can try increasing the calculation interval. But keep in mind that this affects the SLA workflows because, as you know from Part 2, an SLA workflow runs only after the queue has been recalculated and the object has been added to the queue.

Pausing SLA

ATTENTION! All information below is provided as is, without any warranties. This approach is totally unsupported by Microsoft and/or by the author. The author is not liable for any loss of information caused by the use of these recommendations.

A long time ago I published an article about the Pause status of SLA in SCSM 2012. That article is very popular, but I couldn’t publish any information about how to use it until Microsoft approved it. Now I have the “green light”, and it’s time to open Pandora’s box.

If you read Parts 1 and 2 carefully, you have already noticed some properties of the metric object and some workflows that relate to pausing SLA. These properties and workflows are not supported by Microsoft, but they work in simple scenarios.

Criteria for pausing and resuming

To pause SLA in SCSM you must define two criteria: when it should be paused and when it should be resumed. These criteria are exactly the same as in all other workflows (notifications) and allow you to check both the values before the change and the values after it.

But fewer words and more examples! One of the most popular requirements around SLA pausing is to stop the SLA calculation when an incident changes its status to “Pending”. Based on the information above, we must define two criteria:

  1. Pause SLA: Status_BEFORE != “Pending” AND Status_AFTER = “Pending”
  2. Resume SLA: Status_BEFORE = “Pending” AND Status_AFTER != “Pending”
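The two criteria can be read as a small decision function. The sketch below is plain PowerShell and only models the logic; Get-SlaAction is a hypothetical helper, and the status names are passed as plain strings rather than the enumeration GUIDs that SCSM actually stores.

```powershell
# Hypothetical helper: decides which SLA workflow a status change would trigger.
function Get-SlaAction {
    param([string]$StatusBefore, [string]$StatusAfter)

    # Pause: the incident was not Pending before the change and is Pending after it.
    if ($StatusBefore -ne 'Pending' -and $StatusAfter -eq 'Pending') { return 'Pause' }

    # Resume: the incident was Pending before the change and is not Pending after it.
    if ($StatusBefore -eq 'Pending' -and $StatusAfter -ne 'Pending') { return 'Resume' }

    # Any other change triggers neither workflow.
    return 'None'
}

Get-SlaAction -StatusBefore 'Active'  -StatusAfter 'Pending'   # Pause
Get-SlaAction -StatusBefore 'Pending' -StatusAfter 'Active'    # Resume
Get-SlaAction -StatusBefore 'Active'  -StatusAfter 'Resolved'  # None
```

Checking both the “before” and the “after” value is what keeps an unrelated update to an incident that is already Pending from triggering the pause again.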

Note: everything below assumes that you already have a working SLO and only want to implement pausing.

The simplest way to create the criteria for our SLA pausing workflows is to create a new subscription with the necessary criteria in the SCSM console. This notification subscription must be created in the same management pack as the SLO object and must be disabled right after creation. After that you can export the management pack and copy the criteria from the subscription rule to the SLA workflow.

If you have done all of this and look into your management pack, the subscription rule should look like this:

<Rule ID="NotificationSubscription_da22181e_fb15_444c_b21b_ef45ca49b11c" Enabled="true" Target="SystemCenter!Microsoft.SystemCenter.SubscriptionWorkflowTarget" ConfirmDelivery="true" Remotable="true" Priority="Normal" DiscardLevel="100">
        <Category>System</Category>
        <DataSources>
          <DataSource ID="DS" TypeID="SystemCenter1!Microsoft.SystemCenter.CmdbInstanceSubscription.DataSourceModule">
            <Subscription>
              <InstanceSubscription Type="a604b942-4c7b-2fb2-28dc-61dc6f465c68">
                <UpdateInstance>
                  <Criteria>
                    <Expression>
                      <And>
                        <Expression>
                          <SimpleExpression>
                            <ValueExpression>
                              <Property State="Pre">$Context/Property[Type='CustomSystem_WorkItem_Incident_Library!System.WorkItem.Incident']/Status$</Property>
                            </ValueExpression>
                            <Operator>NotEqual</Operator>
                            <ValueExpression>
                              <Value>{b6679968-e84e-96fa-1fec-8cd4ab39c3de}</Value>
                            </ValueExpression>
                          </SimpleExpression>
                        </Expression>
                        <Expression>
                          <SimpleExpression>
                            <ValueExpression>
                              <Property State="Post">$Context/Property[Type='CustomSystem_WorkItem_Incident_Library!System.WorkItem.Incident']/Status$</Property>
                            </ValueExpression>
                            <Operator>Equal</Operator>
                            <ValueExpression>
                              <Value>{b6679968-e84e-96fa-1fec-8cd4ab39c3de}</Value>
                            </ValueExpression>
                          </SimpleExpression>
                        </Expression>
                      </And>
                    </Expression>
                  </Criteria>
                </UpdateInstance>
              </InstanceSubscription>
              <PollingIntervalInSeconds>60</PollingIntervalInSeconds>
              <BatchSize>100</BatchSize>
            </Subscription>
          </DataSource>
        </DataSources>
        <WriteActions>
<!-- cut here -->

Enabling pausing\resume SLA

Now you must copy and paste these criteria into the corresponding SLA workflows. To do that, copy the entire <UpdateInstance> element from the subscription rule. First, find the %SLO_NAME%-PauseEvent rule in your management pack (see Part 1 of this series for terms and abbreviations) and replace its <UpdateInstance /> element with the one copied from the subscription rule. After that, enable the SLO workflow by changing Enabled="false" to Enabled="true". As a result you should get something like this:

      <Rule ID="WorkflowSubscription_1ce1859c_73c0_43ad_bd76_87616e00ad96" Enabled="true" Target="SLAWorkflowTarget_bcae34c368294fdcb14f80d4111ee005" ConfirmDelivery="true" Remotable="true" Priority="Normal" DiscardLevel="100">
        <Category>System</Category>
        <DataSources>
          <DataSource ID="DS" TypeID="SystemCenter1!Microsoft.SystemCenter.CmdbInstanceSubscription.DataSourceModule">
            <Subscription>
              <InstanceSubscription Type="a604b942-4c7b-2fb2-28dc-61dc6f465c68">
                <UpdateInstance>
                  <Criteria>
                    <Expression>
                      <And>
                        <Expression>
                          <SimpleExpression>
                            <ValueExpression>
                              <Property State="Pre">$Context/Property[Type='CustomSystem_WorkItem_Incident_Library!System.WorkItem.Incident']/Status$</Property>
                            </ValueExpression>
                            <Operator>NotEqual</Operator>
                            <ValueExpression>
                              <Value>{b6679968-e84e-96fa-1fec-8cd4ab39c3de}</Value>
                            </ValueExpression>
                          </SimpleExpression>
                        </Expression>
                        <Expression>
                          <SimpleExpression>
                            <ValueExpression>
                              <Property State="Post">$Context/Property[Type='CustomSystem_WorkItem_Incident_Library!System.WorkItem.Incident']/Status$</Property>
                            </ValueExpression>
                            <Operator>Equal</Operator>
                            <ValueExpression>
                              <Value>{b6679968-e84e-96fa-1fec-8cd4ab39c3de}</Value>
                            </ValueExpression>
                          </SimpleExpression>
                        </Expression>
                      </And>
                    </Expression>
                  </Criteria>
                </UpdateInstance>
              </InstanceSubscription>
              <PollingIntervalInSeconds>60</PollingIntervalInSeconds>
              <BatchSize>100</BatchSize>
            </Subscription>
          </DataSource>
        </DataSources>
        <WriteActions>
          <WriteAction ID="WA" TypeID="SystemCenter1!Microsoft.EnterpriseManagement.SystemCenter.Subscription.WindowsWorkflowTaskWriteAction">
            <Subscription>
              <VisibleWorkflowStatusUi>false</VisibleWorkflowStatusUi>
              <EnableBatchProcessing>true</EnableBatchProcessing>
              <WindowsWorkflowConfiguration>
                <AssemblyName>Microsoft.EnterpriseManagement.ServiceManager.SLA.Workflows</AssemblyName>
                <WorkflowTypeName>Microsoft.EnterpriseManagement.ServiceManager.SLA.Workflows.ModifySLAOnInstanceUpdate</WorkflowTypeName>
                <WorkflowParameters>
                  <WorkflowParameter Name="WorkflowMode" Type="string">PauseEvent</WorkflowParameter>
                  <WorkflowArrayParameter Name="InstanceIds" Type="guid">
                    <Item>$Data/BaseManagedEntityId$</Item>
                  </WorkflowArrayParameter>
                  <WorkflowParameter Name="SLAConfigObjectId" Type="guid">7b1505b3-fd95-06f1-f10e-d69d00dfbe1c</WorkflowParameter>
                </WorkflowParameters>
                <RetryExceptions />
                <RetryDelaySeconds>60</RetryDelaySeconds>
                <MaximumRunningTimeSeconds>7200</MaximumRunningTimeSeconds>
              </WindowsWorkflowConfiguration>
            </Subscription>
          </WriteAction>
        </WriteActions>
      </Rule>
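If you prefer not to edit the exported XML by hand, the same two changes (replace the empty <UpdateInstance /> element and set Enabled to true) can be sketched with the .NET XML classes in PowerShell. This is only an illustration under assumptions: the management pack fragment is heavily simplified, the rule ID WorkflowSubscription_example is hypothetical, and the criteria body is a placeholder for what you copied from the subscription.

```powershell
# Simplified, hypothetical fragment of an exported management pack.
$mp = [xml]@'
<ManagementPack>
  <Monitoring>
    <Rules>
      <Rule ID="WorkflowSubscription_example" Enabled="false">
        <DataSources>
          <DataSource ID="DS">
            <Subscription>
              <InstanceSubscription Type="a604b942-4c7b-2fb2-28dc-61dc6f465c68">
                <UpdateInstance />
              </InstanceSubscription>
            </Subscription>
          </DataSource>
        </DataSources>
      </Rule>
    </Rules>
  </Monitoring>
</ManagementPack>
'@

# Placeholder for the <UpdateInstance> element copied from the subscription rule.
$criteria = [xml]'<UpdateInstance><Criteria><!-- copied criteria go here --></Criteria></UpdateInstance>'

# Find the workflow rule by its ID and enable it.
$rule = $mp.SelectSingleNode("//Rule[@ID='WorkflowSubscription_example']")
$rule.SetAttribute('Enabled', 'true')

# Replace the empty <UpdateInstance /> with the copied element.
$old = $rule.SelectSingleNode('.//UpdateInstance')
$new = $mp.ImportNode($criteria.DocumentElement, $true)
[void]$old.ParentNode.ReplaceChild($new, $old)

$rule.OuterXml
```

For a real management pack you would load the exported file into an XmlDocument instead of the inline fragment and save it before importing it back; always keep a backup of the original export.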

You must also copy the <UpdateInstance> element into the %SLO_NAME%-ResumeEvent rule in the same way, but swap the <Operator> values (Equal to NotEqual and vice versa), and enable that rule too.
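Swapping the operators can also be sketched in PowerShell. The fragment below is an abbreviated copy of the pause criteria (the Property value is shortened to “Status” for readability, which is not what the real management pack contains); the swap itself is a simple Equal/NotEqual lookup.

```powershell
# Abbreviated pause criteria: Pre != Pending AND Post = Pending.
$pause = [xml]@'
<UpdateInstance>
  <Criteria>
    <Expression>
      <And>
        <Expression>
          <SimpleExpression>
            <ValueExpression><Property State="Pre">Status</Property></ValueExpression>
            <Operator>NotEqual</Operator>
            <ValueExpression><Value>{b6679968-e84e-96fa-1fec-8cd4ab39c3de}</Value></ValueExpression>
          </SimpleExpression>
        </Expression>
        <Expression>
          <SimpleExpression>
            <ValueExpression><Property State="Post">Status</Property></ValueExpression>
            <Operator>Equal</Operator>
            <ValueExpression><Value>{b6679968-e84e-96fa-1fec-8cd4ab39c3de}</Value></ValueExpression>
          </SimpleExpression>
        </Expression>
      </And>
    </Expression>
  </Criteria>
</UpdateInstance>
'@

# Lookup table for the swap.
$swap = @{ 'Equal' = 'NotEqual'; 'NotEqual' = 'Equal' }

# Work on a copy so the pause criteria stay untouched.
$resume = $pause.Clone()
foreach ($op in @($resume.SelectNodes('//Operator'))) {
    $op.InnerText = $swap[$op.InnerText]
}

# Resume criteria operators, in document order: Equal, NotEqual
$resume.SelectNodes('//Operator') | ForEach-Object { $_.InnerText }
```

The result reads as Pre = Pending AND Post != Pending, which is the Resume criterion defined earlier.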

Let’s recall the configuration of my test SLO:

  • Name: Test SLO
  • Metric: From First Assigned Date to First Answer Date
  • Calendar: Monday-Friday, from 10:00 to 19:00
  • Target time: 1 hour
  • Warning threshold time: 50 minutes

My incident was assigned at 15:54, so the Target End Date is 16:51:
image1

The status of the incident was changed at 15:56:
image2
Right after that the PauseEvent workflow fired and the SLO was paused:
image3

Note: the Target End Date does NOT change when the object goes into the Paused state.

The status of the incident was changed back to Active at 16:20:
image4

As a result the SLO was recalculated and its status changed to Active:
image5

As you can see, the time the SLO spent in the Paused status is not included in the total time. In other words, if 30 minutes remained before the breach at the moment the SLO was paused, the same 30 minutes remain when the SLO is resumed. You can see this behavior in the example above:

SLO paused at 15:58
SLO resumed at 16:22
16:22 – 15:58 = 24 minutes
Target End Date before paused: 16:51

So the expected new Target End Date is
16:51 + 24 minutes = 17:15
and that is exactly what we see: Target End Date after resume: 17:15
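The same arithmetic as a short PowerShell sketch. It assumes that the whole pause falls within working hours of the calendar, as in this example; the real recalculation in SCSM takes the calendar into account, which is not reproduced here.

```powershell
$culture = [System.Globalization.CultureInfo]::InvariantCulture

# Times taken from the example above (the date part defaults to today).
$paused    = [datetime]::ParseExact('15:58', 'HH:mm', $culture)
$resumed   = [datetime]::ParseExact('16:22', 'HH:mm', $culture)
$targetEnd = [datetime]::ParseExact('16:51', 'HH:mm', $culture)

# Time spent in the Paused status.
$pausedFor = $resumed - $paused

# The Target End Date is shifted by exactly the paused time.
$newTargetEnd = $targetEnd + $pausedFor

$pausedFor.TotalMinutes                    # 24
$newTargetEnd.ToString('HH:mm', $culture)  # 17:15
```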

The fast way to find pausing and resuming SLA workflows (rules)

To find the necessary workflows you must build the display name, find this display name in the management pack and get the ID of the workflow. That is not easy and a little complicated. To speed up this process I’ve created a PowerShell script:

# Get-SCSMSLAWorkflows.ps1
# Prints the internal names (IDs) of the Pause and Resume workflows of an SLO.
param([string]$SLADisplayName)

# Show usage when no SLO display name was passed
if (!$SLADisplayName)
{
    Write-Host ""
    Write-Host "Usage:"
    Write-Host "Get-SCSMSLAWorkflows.ps1 ""SLO Display Name"""
    Write-Host ""
    return
}

# The script relies on the SMLets module
Import-Module SMLets

# Make sure that an SLO with this display name exists
$SLAConfigObject = Get-SCSMObject -Class (Get-SCSMClass -Name "System.SLA.Configuration") -Filter "DisplayName = '$SLADisplayName'"
if ($SLAConfigObject)
{
    # ID of the SLO configuration object (not used further below)
    [guid]$SLAConfigObjectId = $SLAConfigObject.Get_Id()

    # The workflows are rules named "<SLO display name>-PauseEvent" and "<SLO display name>-ResumeEvent"
    $pauseRule = Get-SCSMRule | ? {$_.DisplayName -eq ($SLADisplayName + "-PauseEvent")}
    $resumeRule = Get-SCSMRule | ? {$_.DisplayName -eq ($SLADisplayName + "-ResumeEvent")}

    Write-Host ""
    Write-Host "Pause and resume workflows for SLO '$SLADisplayName':"
    Write-Host ("`tPause workflow: `t" + $pauseRule.Name)
    Write-Host ("`tResume workflow: `t" + $resumeRule.Name)
    Write-Host ("`tManagement Pack: `t" + $resumeRule.ManagementPack.DisplayName + " [Name: " + $resumeRule.ManagementPack.Name + "]")
    Write-Host ""
}
else
{
    Write-Error "SLO '$SLADisplayName' not found!"
}

Save this script as Get-SCSMSLAWorkflows.ps1 and run it, passing the display name of your SLO as the parameter:
image40

All you need to do now is export the management pack and find the workflows by ID.

Pausing SLA. Epilog.

A high-level overview of the SLA pausing process:

  1. Get the IDs (internal names) of the Pause and Resume workflows
  2. Export the management pack with the SLO and find those workflows in it
  3. Add criteria to the Pause and Resume workflows
  4. Enable the workflows
  5. Import the management pack back into SCSM

You can use any criteria for pausing and resuming SLA, which can help you build almost any SLA. But keep in mind that SLA pausing is not supported by Microsoft, so you must test each criterion before using it in production.

Note: There is one more way to implement pausing. You can set the PauseEventCriteria and ResumeEventCriteria properties of the Metric object through the SDK. In that case these criteria will be used for every SLO created from this Metric.

Summary

The SLA system in SCSM 2012 (and SP1) is very powerful. An entire book could be written about SLA, and this series of articles covers just a small part of the SLA system. After you implement your SLA you should create reports and/or OLAP cubes to analyze your SLA metrics. But you can use the SLA system for more than SLA itself. For example, you can use it to set a maximum duration for an approval process (“this review activity must be voted on within 3 days or… [do something useful]”). You can run any other workflow when an SLO object changes its status to Warning or Breached. In the example above, you could run a PowerShell script that auto-approves review activities that are still not approved.

So don’t be afraid to test any SLA scenario.


21 Comments

Great article, Anton!

I have a question for you: We’re using 5 priorities (5 queues and SLOs), but I’m thinking that instead of adding this feature for the 5 of them, I could create a sixth queue and SLO called “All open incidents” to include all 5 priorities and add the pause/resume functionality there.

Have you tried something like that? Any recommendations?

Thanks!
German

Why pausing manually with SMLets (setting status to Pause and writing in PausedDate) does not Resume afterwards via resume workflow?

I need it because if someone puts ticket “On Hold” before SLO objects are generated for that ticket, they do not go “On Hold”, as there is no event firing the workflow, so I do it with Orchestrator instead.

Can i delete the subscription that I made after copying the criteria from it? Or should I just leave it disabled?

Yes, of course you can delete subscription.

I am running that ps script but its throwing the error as below..

Get-SCSMRule : The member “ManagementPack” is already present from the extended
type data file.
At C:\SCSM 2012 Source Files\Get-SCSMSLAWorkflows.ps1:19 char:30
+ $pauseRule = Get-SCSMRule <<<< | ? {$_.DisplayName -eq ($SLADisplayName
+ "-PauseEvent")}
+ CategoryInfo : NotSpecified: (:) [Get-SCSMRule], ExtendedTypeSy
stemException
+ FullyQualifiedErrorId : AlreadyPresentInTypesXml,SMLets.GetSMRuleCommand

Get-SCSMRule : The member "ManagementPack" is already present from the extended
type data file.
At C:\SCSM 2012 Source Files\Get-SCSMSLAWorkflows.ps1:20 char:31
+ $resumeRule = Get-SCSMRule <<<< | ? {$_.DisplayName -eq ($SLADisplayName
+ "-ResumeEvent")}
+ CategoryInfo : NotSpecified: (:) [Get-SCSMRule], ExtendedTypeSy
stemException
+ FullyQualifiedErrorId : AlreadyPresentInTypesXml,SMLets.GetSMRuleCommand

After that it prints blank names for all as below:

Pause and resume workflows for SLO 'MMXX_Incident Resolution Time SLO – P1':
Pause workflow:
Resume workflow:
Management Pack: [Name: ]

The pause-resume workflow works for some time but later it stops working for some reason. When I checked the MP, I realized that the 2 enabled lines in the XML had no criteria in it. Both of them turned into their default state. I tried this in 3 project but experienced this issue in each of them.
