DWS Service Desk Rescue Series: Improving Ticket Backlogs
July 11, 2023
Quite often, clients will contact the Service Desk or raise escalations to complain about delays in restoring their impacted service or in fulfilling their Service Requests. The lack of timely resolutions to Incidents or Service Requests can be a cause of poor client experience, as it can affect their ability to perform their work. Disruptions such as VPN connectivity issues, Laptop/Desktop issues, Software issues, Printer issues, etc., or delays in receiving a new Laptop/Desktop, New Phone, New ID creation, etc., can result in a great deal of frustration. This, coupled with a lack of updates, can compound the issue and cause even greater dissatisfaction.
The Backlog Analysis process focuses on understanding the flow rate of tickets, the processing rates, SLAs, and Client/Supplier availabilities to establish a process behaviour. Once this is established, the process aims to identify the root cause of the aging backlog. Could it be a lack of queue management, a lack of proper reporting and measurements, a lack of management oversight, a lack of resources, workload dynamic shifts, or a host of other factors contributing to the aging backlog?
Ticket Backlogs are ITSM records (Incidents, Problems, Changes, Service Requests, and Catalog Tasks) that the Service Desk has not been able to resolve at first touch and as a result, has transferred the record to other Technical Teams. While Ticket Backlogs are normal, the count and age of the backlog should be established as they can vary from client to client.
A healthy backlog rate should be established by factoring in (1) the incoming volume of tickets, (2) First Time Fix Rates/First Call Resolution Rates, (3) Service Level Agreements, and (4) SLA Hold usage (availability of the user/vendor/supplier to work on the issue).
Above and beyond the establishment of a Healthy Backlog rate, the aging status of the backlog should be looked at to ensure tickets are not slipping through the system. For example, for a given client, based on an incoming volume of 6000 tickets a week, an SLA of 5 days, First Call Resolution of 85%, and SLA Hold usage of 47%, the normal/healthy backlog level is between 3800 and 4200 tickets. With a current backlog count of approximately 3900, one would assume the backlog is healthy. However, when looking at the aging state of the backlog, 65-75% of the tickets are beyond the SLA due date and 10-15% of the tickets are more than 120 days old.
A lack of Operational Management systems can result in growth and mismanagement of ticket backlogs, leading to poor service experience and client dissatisfaction.
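The aging check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Ticket fields, the ticket numbers, and the sample dates are hypothetical and do not come from any particular ticketing tool, and the 120-day threshold is taken from the example above. It reports the share of open tickets that are past their SLA due date and the share that are older than the threshold, which is the view that exposes an unhealthy backlog even when the overall count looks normal.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Ticket:
    number: str    # hypothetical ticket identifier
    opened: date   # date the ticket was opened
    sla_due: date  # SLA due date for resolution


def aging_summary(backlog, today, old_after_days=120):
    """Return the share of open tickets past SLA and older than old_after_days."""
    total = len(backlog)
    if total == 0:
        return {"total": 0, "past_sla_pct": 0.0, "old_pct": 0.0}
    past_sla = sum(1 for t in backlog if today > t.sla_due)
    old = sum(1 for t in backlog if (today - t.opened).days > old_after_days)
    return {
        "total": total,
        "past_sla_pct": round(100 * past_sla / total, 1),
        "old_pct": round(100 * old / total, 1),
    }


# Made-up sample backlog, for illustration only.
backlog = [
    Ticket("INC001", date(2023, 1, 10), date(2023, 1, 17)),
    Ticket("INC002", date(2023, 6, 1), date(2023, 6, 8)),
    Ticket("INC003", date(2023, 7, 7), date(2023, 7, 14)),
    Ticket("INC004", date(2023, 7, 10), date(2023, 7, 17)),
]
summary = aging_summary(backlog, today=date(2023, 7, 11))
print(summary)
```

In practice the backlog list would come from an export of open tickets, and the two percentages would be tracked alongside the overall count rather than in place of it.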
The Objective of The Process:
The objective of the Backlog Management process is to ensure the tickets (Incidents, RITMs, Tasks, Problems, Changes) are being monitored, managed, and driven to closure in a timely manner.
Sample List of Benefits:
Quicker service restoration
Quicker service request fulfillment
Reduced escalations
Improved client experience
Sample List of Observations:
Client dissatisfaction due to the length of time being taken to resolve the issues
Tickets being placed on SLA Hold to avoid SLA impact
Tickets in a Queued state, indicating delays in picking up tickets and working on them
Lack of updates to the client, resulting in client dissatisfaction and status calls
Sample List of Recommendations:
Understand workload dynamics. Are volumes increasing? (OM3)
Understand impact of skills/training on backlogs. Are tickets aging due to skill issues? (OM5)
Understand the impact of staffing levels on backlogs. Do you have sufficient staff to handle the workload you are receiving? (OM6)
Understand the impact of knowledge base gaps on backlogs. Is your knowledge base up to date and is it being consistently used? (OM8)
Understand the impact of technology and tools on your backlog. Is the ticketing tool, software distribution tool, remote takeover tool, etc. impacting your ability to work on the tickets? (OM9)
Understand utilization dynamics. Are people working at expected levels? Do you have the right reports to show you what is happening? Are analytics taking place to help you understand where the issues are? (OM14)
Understand the impact that a lack of continual service improvement management has on the backlog. Is someone working with the knowledge, skills, reporting, and analytics teams to understand the root cause of the backlog and taking action to address the issues? (OM15)
Understand the impact that the lack of an issues/escalation management system has on the backlog. Are issues that are impacting the backlog, such as client availability to troubleshoot or provide the required information, being addressed in a timely manner? (OM17)
Understand the impact of queue management on the backlog. Is queue management taking place? Is there a process to deal with aging tickets? Are the queues being managed 24/7? Are tickets being assigned to people to work, or are people picking up the tickets they want to work? (OM19)
Understand the impact of Event / Monitoring & Alerting Management on the backlog. Are false alerts resulting in the creation of tickets? Is an event management process in place to deal with the false alerts and remediate them immediately? (CORE-C)
Understand the impact of Performance & Capacity Management issues on the backlog. Are issues with CPU, Memory, Disk, etc. resulting in alerts being generated, contributing to the backlog? (CORE-C)
Understand the impact of Asset / Configuration Management issues on the backlog. Are issues with the asset database preventing tickets from being marked complete, resulting in the growth and aging of the backlog? (CORE-C)
Understand the impact of cross-department management issues on the backlog. Are issues with other teams impacting the backlog? Are other teams slow to respond/resolve? Are other teams bouncing tickets back? (UM5)
Understand the impact of Vendor Management issues on the backlog. Are the Vendors slow to respond, slow to resolve, or rejecting work being assigned to them? Are the contracts with the vendor unclear in terms of expectations? (UM6)
Understand the impact of Global Delivery Management issues on the backlog. Are issues with GD resources/processes impacting the backlog? Hours of work? Quality of work? Responsiveness? DOU/SOW issues with GD teams? (UM9)
Sample List of Areas to Probe:
Compare the overall Backlog count to the Healthy Backlog target that has been set and determine whether the current backlog is healthy or unhealthy. To determine the healthy target, factor in incoming volumes, resolution SLAs, typical SLA hold usage, the 3 strike policy, etc.
Compare current backlog to previous week backlog count and determine % increase/decrease.
Explain the root cause of any variance.
Have overall counts changed?
Have counts in the various aging buckets changed because work is being done, or are tickets shifting to the right for lack of work (30 days to 60 days to 120 days, etc.)?
Review distribution of aging tickets across a by-week timeline to identify severely aged tickets. Explain the aged tickets.
Review aged tickets by Incident State and Days Aging and compare current week against previous week to identify shifts in age groups.
Review aged tickets by Incident State and Days Aging to identify the distribution of aged tickets by state and days aging. Identify oldest groups and explain why.
Review aged tickets by Incident State and Last Update date (by week) to identify which tickets have not been updated in a timely manner.
Review aged tickets by Incident State, Days Aging, and Escalation Type to identify whether Users are escalating aged tickets.
Review aged tickets by Incident State Reason and Days Aging to identify reasons for why backlog tickets are on SLA Hold.
Review aged tickets by category and days aging to identify any anomalies. Drill down by looking at the short description.
Review aged tickets by subcategory and days aging to identify any anomalies. Drill down by looking at the short description.
Review aged tickets by contact type and days aging to identify any anomalies. Drill down by looking at the short description.
Review aged tickets by Assigned To and days aging to identify any anomalies. Drill down by looking at the short description.
Review aged tickets by Assignment group and days aging to identify any anomalies. Drill down by looking at the short description.
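Several of the probes above come down to the same mechanics: group the backlog by a field and an aging bucket, then compare the current week against the previous week. The Python sketch below illustrates this under stated assumptions. The bucket boundaries, the state names, and the sample data are hypothetical; a real analysis would start from an export of the ticketing tool and could group by category, Assigned To, or Assignment group in the same way.

```python
from collections import Counter

# Hypothetical aging buckets as (upper bound in days, label); older tickets fall in "120+".
BUCKETS = [(30, "0-30"), (60, "31-60"), (120, "61-120")]


def bucket_for(days_aging):
    """Map a ticket age in days to its aging bucket label."""
    for upper, label in BUCKETS:
        if days_aging <= upper:
            return label
    return "120+"


def count_by_state_and_bucket(tickets):
    """Count tickets per (state, bucket); tickets is an iterable of (state, days_aging)."""
    return Counter((state, bucket_for(days)) for state, days in tickets)


def pct_change(previous, current):
    """Week-over-week percentage change, or None when there is no previous count."""
    if previous == 0:
        return None
    return round(100 * (current - previous) / previous, 1)


# Made-up snapshots of the same backlog one week apart, for illustration only.
previous_week = [("On Hold", 25), ("On Hold", 58), ("In Progress", 10), ("Queued", 3)]
current_week = [("On Hold", 32), ("On Hold", 65), ("In Progress", 17),
                ("Queued", 10), ("Queued", 2)]

prev_counts = count_by_state_and_bucket(previous_week)
curr_counts = count_by_state_and_bucket(current_week)

# Overall change in the backlog count.
overall_change = pct_change(len(previous_week), len(current_week))

# Only the (state, bucket) cells whose count moved between the two weeks.
shifts = {
    key: curr_counts[key] - prev_counts[key]
    for key in sorted(set(prev_counts) | set(curr_counts))
    if curr_counts[key] != prev_counts[key]
}

print(overall_change)
print(shifts)
```

A growing count in an older bucket alongside a shrinking count in a younger one is the "shifting to the right" pattern; those are the cells to drill into by short description.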