Organizations that do not adequately manage performance and capacity expose key business operations to unnecessary risk and incur avoidable costs from under- or over-utilized resources. The Performance Management process ensures that infrastructure hardware and software perform at acceptable levels and that services are not at risk. It establishes warnings, alerts, and utilization targets and tracks them at defined intervals so that disruptions are avoided. The Capacity Management process ensures that enough resources are available to prevent performance issues that might impact the service.
The Objective of the process:
The objective of Performance Management is to optimize the effective use of current resources. Performance Management links with Control Capacity to encompass best-practice Capacity Management.
- Review the current Performance and Capacity report and identify areas of improvement
- Review the current roles and responsibilities and identify areas of improvement
- Review the current process and identify areas of risks and improvement
- Review the current process of identifying actions that need to be taken and identify areas of improvement
- Establish an optimized governance structure around Performance & Capacity management
- Ensure monthly meetings are scheduled to review Performance & Capacity reports with local management team before being sent to the Enterprise owner.
- Ensure monthly meetings are scheduled with the Enterprise Owner to review the reports and the overall efficiency of the process.
Sample list of benefits:
- Minimize impact on business-critical applications
- Proactively monitor the resource utilization to identify any risks
- Develop a proactive risk mitigation plan
- Improve availability
- Improved management of IT resources
- Reduced financial impact as a result of improved resource management
Sample list of observations:
- Storage causing outages and service disruptions due to out-of-capacity issues.
- Compute resources causing outages or service disruptions for business applications.
- Network resources causing outages or service disruptions due to bandwidth or latency issues.
- Firewall resources causing outages or service disruptions due to utilization issues.
Sample list of recommendations:
- Implement a robust monitoring and alerting system to monitor compute, storage, and network resources.
- Set accurate thresholds for alerts and warnings
- Work with business leaders to understand demand management
- Work with Application, Database, and other teams to understand performance and capacity requirements for products and services being released, upgraded, or managed.
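The monitoring and threshold recommendations above can be illustrated with a minimal sketch. It assumes utilization is reported as a percentage per resource type and that each resource has a warning level and a higher alert level. The names `Threshold`, `THRESHOLDS`, `classify`, and `review`, and all of the numeric values, are hypothetical and are not taken from any particular monitoring product; real thresholds should come from the demand and requirements discussions with business and application teams.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    warning: float  # percent utilization at which a warning is raised
    alert: float    # percent utilization at which an alert is raised


# Hypothetical thresholds for illustration only; tune per environment.
THRESHOLDS = {
    "cpu": Threshold(warning=70.0, alert=85.0),
    "memory": Threshold(warning=75.0, alert=90.0),
    "storage": Threshold(warning=80.0, alert=90.0),
    "network": Threshold(warning=60.0, alert=80.0),
}


def classify(resource: str, utilization_pct: float) -> str:
    """Return 'ok', 'warning', or 'alert' for one utilization sample."""
    threshold = THRESHOLDS[resource]
    if utilization_pct >= threshold.alert:
        return "alert"
    if utilization_pct >= threshold.warning:
        return "warning"
    return "ok"


def review(samples: dict[str, float]) -> dict[str, str]:
    """Classify the latest utilization sample for each resource."""
    return {name: classify(name, pct) for name, pct in samples.items()}


if __name__ == "__main__":
    # Example samples (made-up values) from one collection interval.
    print(review({"cpu": 72.5, "memory": 40.0, "storage": 93.0, "network": 61.0}))
```

Keeping the warning level separate from the alert level gives the team time to plan a capacity change before the service is affected.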
Assessment Questions:
- Do you monitor infrastructure hardware components such as CPU, Memory and Storage?
- Do you monitor infrastructure software components such as operating systems, applications, processes, databases, etc.?
- What specific attributes do you monitor?
- What KPIs have you set to alert you of any issues?
- What trigger points have you set for adding CPU, memory, or storage?
- Do you work with other infra teams, app/dev teams and lines of businesses to review Performance & Capacity reports?
- Do you review the current performance and capacity reports and identify areas of improvement? How often do you review the reports?
- Do you have a designated role on your team for Performance & Capacity or do you use an enterprise team?
- Did you have input into design and development of the Performance & Capacity reports?
- Do you just get Performance & Capacity reports, or do you get analysis with risks and recommendations with the report?
- Do you review the current process and identify areas of risks and improvement?
- Do you review the current process of identifying actions that need to be taken and identify areas of improvement?
- Do you establish an optimized governance structure around performance & capacity management?
- Do you ensure monthly meetings are scheduled to review performance & capacity reports with local management team before being sent to the enterprise owner?
- Do you ensure monthly meetings are scheduled with the enterprise owner to review the reports and the overall efficiency of the process?
- Is anyone looking at Performance & Capacity reports to understand performance and capacity issues?
- Are there Performance & Capacity issues on your team?
- Is your capacity growing or shrinking?
- Is any impact of Performance & Capacity issues visible in event-based incidents?
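The questions above about trigger points and whether capacity is growing or shrinking can be answered from the periodic reports themselves. The following is a minimal sketch, assuming one used-capacity figure per reporting period (for example, monthly) and a simple straight-line trend. The function names `growth_per_period` and `periods_until_trigger` are hypothetical, and a linear fit is only one possible model; seasonal or step-change workloads would need something more careful.

```python
from typing import Optional


def growth_per_period(used: list[float]) -> float:
    """Least-squares slope of used capacity across reporting periods.

    Positive means capacity use is growing, negative means it is shrinking.
    """
    n = len(used)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(used) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(used))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def periods_until_trigger(
    used: list[float], total: float, trigger_pct: float
) -> Optional[float]:
    """Estimate periods remaining before usage reaches the trigger point.

    Returns 0.0 if the trigger is already reached, or None if usage is
    flat or shrinking and the trigger is not projected to be reached.
    """
    slope = growth_per_period(used)
    limit = total * trigger_pct / 100
    if used[-1] >= limit:
        return 0.0
    if slope <= 0:
        return None
    return (limit - used[-1]) / slope


if __name__ == "__main__":
    # Made-up monthly storage usage in TB against a 100 TB pool.
    history = [40.0, 50.0, 60.0, 70.0]
    print(growth_per_period(history))
    print(periods_until_trigger(history, total=100.0, trigger_pct=90.0))
```

A projection like this is most useful when it is reviewed in the monthly meetings together with the planned demand from business and application teams, since historical trend alone does not capture upcoming releases.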