Security alert resolution time
Security alert resolution time is the average elapsed time between a security alert being raised and that alert being fully resolved or closed. It measures how quickly a security team moves from detection to containment and remediation. A lower resolution time shrinks the window an attacker has to operate inside your systems.
What is security alert resolution time?
The metric is measured per alert, from the moment the alert is raised to the moment it is fully resolved and closed, and then averaged. If an alert fires at 09:00 and the team confirms it is contained and remediated at 13:00, that alert took four hours to resolve. Averaging this across every alert closed in a period gives you the headline figure, often reported as mean time to resolve, or MTTR.
The metric matters because every open alert represents unmanaged risk. A confirmed intrusion that sits unresolved for days gives an attacker time to move laterally, escalate privileges, and exfiltrate data. A false positive that sits unresolved still consumes analyst attention and erodes trust in the alerting system. Resolution time is the single number that tells leadership how fast the security function actually closes the loop, not just how fast it detects.
Resolution time sits alongside detection time. Detection measures how long it takes to notice something is wrong. Resolution measures how long it takes to make it right. A team can detect quickly and still resolve slowly if triage queues are deep, runbooks are missing, or ownership is unclear. Reading the two together tells you where the real bottleneck lives.
Resolution time should be measured from the moment the alert is raised to the moment the underlying issue is remediated, not the moment a ticket is acknowledged. Closing a ticket without remediating the root cause makes the number look good while leaving the risk in place.
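The definition above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the record layout and the field names `raised` and `remediated` are assumptions made for the example, and open alerts are simply excluded from the average.

```python
from datetime import datetime
from statistics import mean

# Hypothetical alert records; the field names are assumptions for this sketch.
alerts = [
    {"id": "A-1", "raised": datetime(2024, 5, 1, 9, 0), "remediated": datetime(2024, 5, 1, 13, 0)},
    {"id": "A-2", "raised": datetime(2024, 5, 1, 10, 0), "remediated": datetime(2024, 5, 1, 12, 0)},
    {"id": "A-3", "raised": datetime(2024, 5, 2, 8, 0), "remediated": None},  # still open
]


def resolution_hours(alert):
    """Hours from the alert being raised to remediation, or None if still open."""
    if alert["remediated"] is None:
        return None
    return (alert["remediated"] - alert["raised"]).total_seconds() / 3600


def mean_time_to_resolve(alerts):
    """Mean resolution time in hours across resolved alerts only."""
    durations = [h for h in map(resolution_hours, alerts) if h is not None]
    return mean(durations) if durations else None


print(resolution_hours(alerts[0]))   # 4.0, the 09:00 to 13:00 example
print(mean_time_to_resolve(alerts))  # 3.0
```

The clock stops at the remediation timestamp, not at acknowledgement, which is the distinction the paragraph above draws.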
How to calculate security alert resolution time
The basic calculation divides the total resolution time across all closed alerts by the number of alerts closed in the period. The real insight comes from breaking that lifecycle into its stages, because each stage is owned by different people and improved by different actions.
1. Time to acknowledge: The gap between an alert being raised and an analyst picking it up. This is driven by alert routing, on-call coverage, and how much noise buries the signal. Long acknowledgement times usually point to alert fatigue.
2. Time to triage: The time spent deciding whether an alert is a true positive, a false positive, or a duplicate. Good enrichment and context reduce this; raw alerts with no surrounding data lengthen it.
3. Time to investigate: The time spent understanding scope and impact once an alert is confirmed real. This depends on log access, tooling, and how well the affected systems are documented.
4. Time to remediate: The time spent containing and fixing the underlying issue: isolating a host, rotating a credential, patching a service, or removing access. This is where resolution time is genuinely earned or lost.
5. Time to verify and close: The time spent confirming the remediation worked and the threat is gone before the alert is closed. Skipping verification produces fast but unreliable resolution times.
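The five stages above can be derived from timestamps recorded at each hand-off. The sketch below assumes every alert carries one timestamp per stage boundary. The field names (`acknowledged`, `triaged`, and so on) are hypothetical and would need mapping to whatever your ticketing or SIEM tool actually records.

```python
from datetime import datetime

# (stage name, start field, end field); the field names are assumptions for this sketch.
STAGES = [
    ("acknowledge", "raised", "acknowledged"),
    ("triage", "acknowledged", "triaged"),
    ("investigate", "triaged", "investigated"),
    ("remediate", "investigated", "remediated"),
    ("verify_and_close", "remediated", "closed"),
]


def stage_minutes(timestamps):
    """Return the minutes spent in each lifecycle stage for one alert."""
    return {
        name: (timestamps[end] - timestamps[start]).total_seconds() / 60
        for name, start, end in STAGES
    }


alert = {
    "raised": datetime(2024, 5, 1, 9, 0),
    "acknowledged": datetime(2024, 5, 1, 9, 20),
    "triaged": datetime(2024, 5, 1, 9, 50),
    "investigated": datetime(2024, 5, 1, 11, 0),
    "remediated": datetime(2024, 5, 1, 12, 40),
    "closed": datetime(2024, 5, 1, 13, 0),
}

breakdown = stage_minutes(alert)
for name, minutes in breakdown.items():
    print(f"{name}: {minutes:.0f} min")
print(f"total: {sum(breakdown.values()):.0f} min")  # total: 240 min
```

Because the stages share boundaries, they sum to the full raised-to-closed time, so the breakdown always reconciles with the headline number.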
Report resolution time separately by severity. A critical alert and a low-severity informational alert should never be averaged into one number, because the urgency and the acceptable resolution window are completely different. Most teams track the median and 90th percentile alongside the mean. A handful of very slow alerts can drag the mean up and hide the fact that typical alerts resolve quickly, while a healthy-looking median can hide a long tail of alerts that stall.
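As a rough illustration of severity-banded reporting, the sketch below groups resolution times by severity and reports the mean, median, and 90th percentile for each. The sample figures are invented. The percentile uses the simple nearest-rank method, which is one of several common definitions and is an assumption here, not something the metric prescribes.

```python
import math
from collections import defaultdict
from statistics import mean, median


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarise_by_severity(alerts):
    """Group (severity, hours) pairs and summarise each severity band separately."""
    grouped = defaultdict(list)
    for severity, hours in alerts:
        grouped[severity].append(hours)
    return {
        severity: {
            "mean": mean(hours),
            "median": median(hours),
            "p90": percentile(hours, 90),
        }
        for severity, hours in grouped.items()
    }


# Invented sample data: (severity, resolution time in hours).
alerts = [
    ("critical", 0.5), ("critical", 1.0), ("critical", 2.0), ("critical", 30.0),
    ("low", 24.0), ("low", 48.0), ("low", 72.0), ("low", 96.0), ("low", 120.0),
]

summary = summarise_by_severity(alerts)
print(summary["critical"])  # {'mean': 8.375, 'median': 1.5, 'p90': 30.0}
```

In this sample one stalled critical alert pushes the mean to more than five times the median, which is exactly the distortion that reporting all three figures is meant to expose.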
Security alert resolution time in a metric tree
A metric tree decomposes resolution time into the lifecycle stages above and then traces each stage back to the operational levers that drive it. This turns a single backward-looking number into a diagnostic map of where time is actually lost.
The first level splits the total into acknowledge, triage, investigate, remediate, and verify. Each of those decomposes further. Time to acknowledge is a function of alert volume, the false-positive rate, and on-call coverage. Time to remediate depends on how much of the response is automated, how clear the runbooks are, and whether the responder has the access needed to act without waiting on another team.
The value of the tree is precision. When resolution time rises, the tree tells you whether the cause is a flood of low-quality alerts, a triage queue with no owner, or a remediation step that always stalls waiting on infrastructure access. Each of those leads to a different fix owned by a different person.
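One way to picture the tree is as a small nested data structure in which the leaves hold stage durations and the root rolls them up. The sketch below is illustrative only: the `MetricNode` class is hypothetical, the minute values are invented, and drivers are listed only for the two stages whose drivers are named above.

```python
from dataclasses import dataclass, field


@dataclass
class MetricNode:
    """A node in a hypothetical metric tree; leaves carry mean minutes per stage."""
    name: str
    minutes: float = 0.0
    drivers: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def total(self):
        """Leaf nodes return their own minutes; parents sum their children."""
        if not self.children:
            return self.minutes
        return sum(child.total() for child in self.children)


tree = MetricNode("resolution time", children=[
    MetricNode("acknowledge", 20, ["alert volume", "false-positive rate", "on-call coverage"]),
    MetricNode("triage", 30),
    MetricNode("investigate", 70),
    MetricNode("remediate", 100, ["response automation", "runbook clarity", "responder access"]),
    MetricNode("verify", 20),
])

slowest = max(tree.children, key=lambda node: node.total())
print(tree.total())                   # 240
print(slowest.name, slowest.drivers)  # remediate, followed by its three drivers
```

Keeping the drivers on the node is a design choice for the sketch: it means the stage that dominates the total immediately points at the levers worth examining.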
Metric tree insight
In many teams the largest single block of resolution time is not investigation but the wait inside remediation, when responders lack the access to act and have to raise a request with another team. Pre-authorising containment actions for on-call responders can remove hours from the average without touching detection.
Security alert resolution time benchmarks
Benchmarks vary widely by severity, sector, and the maturity of the security operation. The useful comparison is resolution time banded by severity rather than a single blended figure, because a blended number hides whether critical alerts are getting the urgency they need.
| Alert severity | Target resolution window | What good looks like |
|---|---|---|
| Critical | Under 1 hour to contain, under 24 hours to fully resolve | Confirmed active threats are contained almost immediately and remediation runs to completion within a day, with verification before close. |
| High | 4 to 24 hours | Clear ownership and runbooks let the team resolve serious but non-active issues inside a working day without escalation friction. |
| Medium | 1 to 5 days | Alerts are triaged the same day and remediated within the working week, batched where it is efficient to do so. |
| Low and informational | 5 to 30 days | Low-risk alerts are tuned, suppressed, or resolved on a predictable cadence rather than left to accumulate indefinitely. |
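The bands in the table can be turned into a simple check of how many alerts were resolved inside their window. The sketch below uses the upper bound of each full-resolution window from the table, and the separate one-hour containment target for critical alerts is not modelled. The severity keys and sample data are assumptions for illustration.

```python
# Upper bound of each target resolution window from the table, in hours.
TARGET_HOURS = {
    "critical": 24,
    "high": 24,
    "medium": 5 * 24,
    "low": 30 * 24,
}


def within_target_share(alerts):
    """Return, per severity, the fraction of alerts resolved inside the target window."""
    counts = {}
    for severity, hours in alerts:
        met, total = counts.get(severity, (0, 0))
        counts[severity] = (met + (hours <= TARGET_HOURS[severity]), total + 1)
    return {severity: met / total for severity, (met, total) in counts.items()}


# Invented sample data: (severity, resolution time in hours).
alerts = [
    ("critical", 3), ("critical", 20), ("critical", 40), ("critical", 6),
    ("medium", 30), ("medium", 200),
]

shares = within_target_share(alerts)
print(shares)  # {'critical': 0.75, 'medium': 0.5}
```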
Trend matters more than any absolute target. A team whose median critical resolution time is falling quarter on quarter is improving, even if it has not yet hit a textbook number. Watch the gap between the median and the 90th percentile too. A wide gap means a long tail of alerts that stall, and that tail is usually where a serious incident hides.
How to improve security alert resolution time
Reducing resolution time is rarely about working faster. It is about removing the waits, the ambiguity, and the noise that sit between an alert and its fix. The biggest gains usually come from cutting alert noise and clarifying ownership rather than buying faster tooling.
Cut alert noise at source
Tune rules so the false-positive rate drops and analysts spend their time on real signal. Suppress known-benign patterns and deduplicate aggressively. Every false positive removed is acknowledgement and triage time given back to genuine threats.
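Deduplication can be as simple as collapsing repeats of the same rule on the same host inside a short window. The sketch below is one possible approach, not a description of how any particular SIEM does it. The `rule` and `host` fields and the 30-minute window are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def deduplicate(alerts, window=timedelta(minutes=30)):
    """Keep an alert only if the same (rule, host) was not kept within `window` before it."""
    last_kept = {}
    kept = []
    for alert in sorted(alerts, key=lambda a: a["raised"]):
        key = (alert["rule"], alert["host"])
        previous = last_kept.get(key)
        if previous is None or alert["raised"] - previous > window:
            kept.append(alert)
            last_kept[key] = alert["raised"]
    return kept


# Invented sample alerts; field names are hypothetical.
alerts = [
    {"rule": "brute-force", "host": "web-1", "raised": datetime(2024, 5, 1, 9, 0)},
    {"rule": "brute-force", "host": "web-2", "raised": datetime(2024, 5, 1, 9, 5)},
    {"rule": "brute-force", "host": "web-1", "raised": datetime(2024, 5, 1, 9, 10)},
    {"rule": "brute-force", "host": "web-1", "raised": datetime(2024, 5, 1, 9, 45)},
]

kept = deduplicate(alerts)
print(len(kept))  # 3: the 09:10 repeat on web-1 is dropped
```

The window is measured from the last alert that was kept, so a long-running burst still surfaces again once the window has passed instead of being silenced indefinitely.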
Assign clear ownership
Make sure every alert class has a named accountable owner before it fires, not after. Ambiguity about who handles what is one of the most common reasons alerts sit in a queue untouched for hours.
Automate containment
Pre-authorise and automate first-response actions such as isolating a host, disabling a credential, or blocking an address. Automating the predictable parts of remediation lets responders focus on the judgement calls.
Verify before closing
Add a short verification step that confirms the threat is gone before an alert is closed. This raises resolution time slightly but cuts the reopened-alert rate, which is a far more expensive form of slowness.
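The reopened-alert rate mentioned above is straightforward to track alongside resolution time. A minimal sketch follows, assuming each alert record carries hypothetical `closed` and `reopened` flags.

```python
def reopened_rate(alerts):
    """Fraction of closed alerts that were later reopened (0.0 if none are closed)."""
    closed = [a for a in alerts if a["closed"]]
    if not closed:
        return 0.0
    return sum(a["reopened"] for a in closed) / len(closed)


# Invented sample records; the flag names are assumptions for this sketch.
alerts = [
    {"id": 1, "closed": True, "reopened": False},
    {"id": 2, "closed": True, "reopened": True},
    {"id": 3, "closed": True, "reopened": False},
    {"id": 4, "closed": True, "reopened": False},
    {"id": 5, "closed": False, "reopened": False},
]

rate = reopened_rate(alerts)
print(rate)  # 0.25
```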
The metric tree approach starts by finding the stage with the largest gap between current and achievable performance. If acknowledgement is slow, the fix is noise reduction and routing. If remediation stalls, the fix is access and automation. Spending effort on the wrong stage moves the headline number very little.
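Finding the stage with the largest gap is a small ranking exercise. In the sketch below both the current and the achievable figures are invented, and in practice the achievable number is a judgement call, for example your own best quarter or a peer team's performance.

```python
# Mean minutes per stage; all values are invented for illustration.
current = {"acknowledge": 45, "triage": 30, "investigate": 70, "remediate": 100, "verify": 20}
achievable = {"acknowledge": 10, "triage": 20, "investigate": 60, "remediate": 80, "verify": 15}


def rank_by_gap(current, achievable):
    """Return (stage, gap in minutes) pairs, largest gap first."""
    gaps = {stage: current[stage] - achievable[stage] for stage in current}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


ranking = rank_by_gap(current, achievable)
print(ranking[0])  # ('acknowledge', 35)
```

In this sample remediation is the longest stage, but acknowledgement has the largest recoverable gap, so noise reduction and routing would move the headline number more than further remediation work.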
KPI Tree lets you connect each stage of the resolution lifecycle to the team that owns it. Detection engineering owns alert quality. The on-call rota owns acknowledgement. Incident responders own remediation. Platform and infrastructure own the access that responders depend on. With RACI ownership on each node and a push to the accountable owner when their stage starts to slow, the bottleneck surfaces to the right person while it is still cheap to fix, and the verified impact loop confirms whether the change actually moved the number.
Common mistakes when tracking security alert resolution time
1. Measuring acknowledgement instead of resolution: Stopping the clock when a ticket is acknowledged rather than when the issue is remediated makes the number look healthy while the actual risk window stays open. Always measure to remediation and verification.
2. Blending severities into one average: Averaging critical and informational alerts together produces a meaningless figure. A slow critical alert and a fast low-severity alert can cancel out and hide a serious problem.
3. Reporting only the mean: A few very slow alerts drag the mean up and obscure the typical case. Track median and 90th percentile alongside it so the long tail is visible.
4. Closing alerts without verifying: Closing fast without confirming the threat is gone produces flattering resolution times and a growing reopened-alert rate. Speed without verification is not resolution.
5. Tracking the total without decomposing it: Watching only the headline number tells you the operation is slow but not where. Without breaking the lifecycle into stages, you cannot tell whether the problem is noise, ownership, access, or tooling.
Related metrics
First Response Time
FRT = Total First Response Times / Total Tickets With a First Response
First response time measures the elapsed time between a customer creating a support ticket and receiving the first substantive response from a human agent. It is the metric that shapes the customer's initial impression of the support experience and sets the tone for the entire interaction.
Average Resolution Time
Average Resolution Time = Total Resolution Time Across All Tickets / Total Tickets Resolved
Average resolution time measures the mean elapsed time from when a support ticket is created to when it is fully resolved and closed. It captures the end-to-end customer experience of getting an issue fixed, encompassing wait times, agent work time, escalations, and any back-and-forth exchanges required to reach a solution.
Escalation Rate
Escalation Rate = (Escalated Tickets / Total Tickets Handled) × 100
Escalation rate measures the percentage of support tickets that are transferred from one tier or team to a higher tier or specialist group for resolution. It reflects the gap between the issues customers raise and the ability of frontline agents to resolve them, making it a key indicator of agent readiness, process maturity, and product complexity.
Ticket Volume
Ticket Volume = Total New Tickets Created in Period
Ticket volume is the total number of new support tickets created within a defined period. It is the fundamental demand metric for support operations, determining staffing requirements, budget allocation, and the urgency of self-service and product quality investments.
Why did my metric change?
When security alert resolution time creeps up, this diagnostic framework helps you trace whether the cause is alert volume, triage delays or staffing rather than guessing.
Metric trees for operations teams
Security alert resolution time is an operations and incident-response metric, so this guide shows how it fits alongside the other throughput and reliability measures operations teams own.