The Measurement Problem

Most security awareness programmes measure inputs rather than outcomes. Training completion rates, phishing simulation click rates, and quiz scores are the standard currency of programme reporting. They are easy to collect, easy to present, and almost entirely useless as indicators of actual security posture.

The fundamental problem is displacement: a measure of activity stands in for a measure of risk. An organisation with a 98 per cent training completion rate but a poorly designed approval process for financial transactions is not more secure than one with a 60 per cent completion rate and a robust dual-authorisation protocol. Training completion measures whether people sat through a course. It says nothing about whether any subsequent behaviour changed, and even less about whether the organisation is less likely to experience an incident.

This is not a minor technical problem with how metrics are collected. It is a conceptual problem with what the field has decided to measure, and it has a direct consequence: security leaders cannot make a credible evidence-based case to their boards for investment in human-centred programmes because the data they have does not connect to the risk language boards understand.

 


The COM-B Model as a Measurement Framework

The COM-B model introduced in Part 1 is not only a design tool. It is a diagnostic framework that can structure the measurement of a human security programme with considerably more precision than completion rates allow. By separately assessing capability, opportunity, and motivation for each target behaviour, security teams can identify exactly where a programme is failing and what kind of intervention is needed to address it.

| COM-B Component | What to Measure | Methods | Leading or Lagging? |
| --- | --- | --- | --- |
| Capability (Knowledge) | Do employees know what the correct behaviour is? | Knowledge assessments, scenario-based questions | Leading |
| Capability (Skill) | Can employees perform the correct behaviour when required? | Simulated tasks, observed process completion | Leading |
| Opportunity (Physical) | Is the environment designed to make secure behaviour easy? | Process mapping, friction analysis, UX audit | Leading |
| Opportunity (Social) | Do workplace norms support secure behaviour? | Peer surveys, observational data, norm perception measures | Leading |
| Motivation (Automatic) | Are secure behaviours habitual and emotionally positive? | Longitudinal behavioural tracking, sentiment data | Leading |
| Motivation (Reflective) | Do employees believe secure behaviour is worthwhile? | Attitude surveys, qualitative interviews | Leading |
| Behaviour | Is the target behaviour actually occurring? | Behavioural logging, reporting rates, access logs | Both |
| Risk Outcome | Is the programme reducing incident rates or impact? | Incident data, near-miss reports, financial impact estimates | Lagging |

Source: Adapted from Michie et al. (2011). COM-B measurement framework applied to security behaviour contexts.
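
As a rough illustration of how the framework can point to a barrier, the sketch below averages sub-component scores for a single target behaviour and names the weakest COM-B component. The 0 to 100 scale, the score values, and the `diagnose` function are assumptions made for this example; they do not come from Michie et al. (2011), and a real programme would derive the scores from the methods listed in the table.

```python
from statistics import mean

# Hypothetical 0-100 scores for one target behaviour, for example
# "report a suspicious email". Illustrative values only, not real data.
scores = {
    "capability_knowledge": 82,
    "capability_skill": 74,
    "opportunity_physical": 41,
    "opportunity_social": 58,
    "motivation_automatic": 63,
    "motivation_reflective": 77,
}


def diagnose(component_scores):
    """Average the sub-component scores and return (averages, weakest component)."""
    grouped = {}
    for name, value in component_scores.items():
        component = name.split("_")[0]  # "capability", "opportunity" or "motivation"
        grouped.setdefault(component, []).append(value)
    averages = {component: mean(values) for component, values in grouped.items()}
    weakest = min(averages, key=averages.get)
    return averages, weakest


averages, weakest = diagnose(scores)
print(f"Component averages: {averages}")
print(f"Primary barrier to investigate: {weakest}")
```

With these illustrative inputs the weakest component is opportunity, which would suggest looking at process friction before commissioning more training.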

Leading and Lagging Indicators

The distinction between leading and lagging indicators is borrowed from safety management, where it has a long and successful track record, but it applies with equal force to security (Hopkins, 2009). Lagging indicators measure what has already happened: incident counts, breach costs, and mean time to detection. They are important for accountability and trend analysis, but they are backward-looking and provide no actionable signal until something has gone wrong.

Leading indicators measure the conditions that predict future outcomes: reporting rates, MFA adoption, time to complete critical patches, and phishing simulation results when they are read as a diagnostic rather than a performance score. A programme that tracks only lagging indicators is managing the past. One that also tracks leading indicators has the information to manage the future.

| Indicator Type | Example Metrics | Frequency | Primary Audience |
| --- | --- | --- | --- |
| Leading (Behavioural) | Phishing report rate; MFA adoption rate; password manager usage; policy exception requests | Monthly | Security team; CISO |
| Leading (Environmental) | Friction score for key secure processes; default security setting compliance; tool adoption rates | Quarterly | Security architecture; IT leadership |
| Leading (Cultural) | Security norm perception scores; reporting confidence index; trust in security team measures | Bi-annually | CISO; HR; Executive team |
| Lagging (Incident) | Social engineering incident count; credential compromise events; insider risk incidents | Monthly / per event | CISO; Board; Audit |
| Lagging (Impact) | Financial impact of human-element incidents; recovery time; regulatory consequence | Annually / per event | Board; CEO; Risk committee |
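
To show why the phishing report rate is worth tracking alongside the click rate, the following sketch computes both from aggregate simulation counts. The `SimulationResult` fields and the quarterly figures are hypothetical and chosen only to illustrate the calculation.

```python
from dataclasses import dataclass


@dataclass
class SimulationResult:
    """Aggregate counts from one phishing simulation (hypothetical fields)."""
    delivered: int
    clicked: int
    reported: int


def rates(result):
    """Return (click_rate, report_rate) as fractions of delivered emails."""
    if result.delivered == 0:
        raise ValueError("no emails were delivered")
    return result.clicked / result.delivered, result.reported / result.delivered


# Illustrative figures only, not real data.
q1 = SimulationResult(delivered=400, clicked=92, reported=60)
q2 = SimulationResult(delivered=400, clicked=88, reported=132)

for label, result in (("Q1", q1), ("Q2", q2)):
    click_rate, report_rate = rates(result)
    print(f"{label}: click rate {click_rate:.0%}, report rate {report_rate:.0%}")
```

In this made-up example the click rate barely moves between quarters while the report rate more than doubles, a change that a click-rate dashboard alone would miss.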

A Five-Level Maturity Model

Maturity models provide a structured way to assess programme development and communicate progress to stakeholders. The following model describes five levels of maturity in human cyber risk management, from the reactive approaches that characterise most organisations to the predictive capability of the most advanced programmes.

| Level | Name | Characteristics | Typical Metrics |
| --- | --- | --- | --- |
| 1 | Reactive | No structured programme. Training delivered in response to incidents. Measurement is absent or ad hoc. | Training completion (if tracked at all) |
| 2 | Compliance | Annual mandatory training. Phishing simulations run periodically. Metrics focused on completion and click rates. | Completion rate; click rate; quiz scores |
| 3 | Proactive | Risk-based programme design. Behavioural segmentation by role. Leading indicators tracked. Interventions designed using behavioural frameworks. | Reporting rate; MFA adoption; behaviour change rates |
| 4 | Integrated | Human risk is integrated into enterprise risk management. Behavioural data feeds into broader risk models. CISO presents human risk in financial terms to the board. | COM-B component scores; risk-adjusted human factor index |
| 5 | Predictive | Continuous behavioural monitoring. Intervention impact is measured with control groups. The programme adapts in near real-time based on behavioural signals. | Predictive risk models; causal attribution of behaviour change to incident reduction |

Most organisations sit at Level 2. The transition to Level 3 is the most practically impactful step available, and it does not require sophisticated technology. It requires a shift in the questions being asked: from 'did people complete the training?' to 'did any behaviour change, and what drove the change?'

Communicating Human Risk to Boards

One of the most frequently cited frustrations among security professionals is the difficulty of communicating human risk to boards and executive teams in terms that generate meaningful engagement. The measurement frameworks described above are part of the solution, but the framing matters as much as the data.

Boards understand risk in financial terms. They understand probability and consequence, even if not in those words. A presentation that states '23 per cent of employees clicked on last quarter's phishing simulation' communicates almost nothing to a board, because the board has no reference point for whether 23 per cent is good, bad, or irrelevant to business risk. A presentation that states 'our current phishing susceptibility rate, applied to our transaction volumes, implies an expected annual exposure from business email compromise of approximately £400,000' is a different conversation entirely.

An imperfect estimate of behavioural risk expressed in financial terms is far more useful to a board than a precise measurement of training completion. Boards make resource allocation decisions. To do so they need to understand what the risk is worth, not how many people sat through a module.

The methodology for converting behavioural metrics into financial risk estimates need not be actuarially precise. It needs to be defensible, transparent in its assumptions, and directionally useful. Start with published industry loss data, apply your organisation's known exposure profile, and adjust for your measured behavioural risk indicators. The conversation that follows will be more productive than any simulation click rate could generate.
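
One simple way to make such an estimate transparent is to multiply frequency, probability, and impact, and to present a range rather than a single number. The sketch below does this under stated assumptions: the `expected_annual_exposure` function and every input value are hypothetical placeholders, and in practice they would be replaced with published loss data, the organisation's own exposure profile, and its measured behavioural indicators.

```python
def expected_annual_exposure(attempts_per_year, success_probability, loss_per_incident):
    """Expected annual loss = frequency x probability of success x loss per incident."""
    if not 0.0 <= success_probability <= 1.0:
        raise ValueError("success_probability must be between 0 and 1")
    return attempts_per_year * success_probability * loss_per_incident


# Hypothetical placeholder inputs. Each is an assumption that should be
# written down and shown to the board alongside the result.
scenarios = {
    "low": dict(attempts_per_year=20, success_probability=0.01, loss_per_incident=50_000),
    "central": dict(attempts_per_year=40, success_probability=0.02, loss_per_incident=100_000),
    "high": dict(attempts_per_year=60, success_probability=0.04, loss_per_incident=150_000),
}

estimates = {
    name: expected_annual_exposure(**inputs) for name, inputs in scenarios.items()
}

for name, value in estimates.items():
    print(f"{name}: £{value:,.0f}")
```

Reporting low, central, and high scenarios keeps the assumptions visible and makes clear that the figure is directional rather than actuarial.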

A 90-Day Implementation Plan

For practitioners who want to move from the ideas in this series to practical action, the following phased approach provides a starting structure. It is not prescriptive, but it reflects the logical dependencies between diagnosis, design, and measurement.

| Phase | Timeframe | Focus | Output |
| --- | --- | --- | --- |
| Diagnose | Days 1 to 30 | Use COM-B to audit two or three key security behaviours. Interview employees. Map the friction in current processes. Identify whether the primary barriers are capability, opportunity, or motivation. | Barrier diagnosis report; prioritised behaviour list |
| Design | Days 31 to 60 | Select one target behaviour. Apply EAST to design an intervention addressing the diagnosed barrier. Document the hypothesis and measurement plan before launch. | Intervention specification; measurement protocol; baseline data |
| Test and Learn | Days 61 to 90 | Run a controlled pilot with a specific group. Measure both the behaviour and the COM-B components. Compare against baseline. Define what constitutes a successful outcome before you start. | Pilot results; decision on scaling or iterating; board-ready summary |
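
For the Test and Learn phase, one possible way to compare a pilot against its baseline is a pooled two-proportion z-test. This is a standard statistical technique chosen here for illustration, not a method the plan prescribes. The counts, the `MIN_UPLIFT` and `ALPHA` thresholds, and the function name are all assumptions; the point is that the success criterion is fixed before the data is examined.

```python
from math import sqrt
from statistics import NormalDist

# Success criteria fixed before the pilot starts (assumed values).
MIN_UPLIFT = 0.10  # reporting rate must rise by at least 10 percentage points
ALPHA = 0.05       # significance threshold for the two-sided test


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (rate_b - rate_a, two-sided p-value) from a pooled z-test."""
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        raise ValueError("cannot test: pooled proportion is 0 or 1")
    z = (rate_b - rate_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate_b - rate_a, p_value


# Hypothetical counts: simulated phish reported at baseline and during the pilot.
difference, p_value = two_proportion_z_test(30, 200, 58, 200)
successful = difference >= MIN_UPLIFT and p_value < ALPHA
print(f"Uplift: {difference:.1%}, p-value: {p_value:.4f}, success: {successful}")
```

A before-and-after comparison on a single group cannot rule out other causes of change, so a concurrent comparison group is preferable where one is available.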

The Ethical Dimension of Measurement

Programmes that track individual employee behaviour raise legitimate concerns about privacy and the potential for surveillance. Three principles should govern any measurement programme.

Transparency requires that employees know what is being measured and why. Covert individual monitoring, particularly when linked to punitive consequences, erodes the trust that effective security culture depends upon. In a large-scale, long-term study within a single organisation, Lain and colleagues (2022) found that embedded training delivered after failed phishing simulations did not make employees more resilient, whereas voluntary employee reporting proved an effective way to detect phishing. A surveillance dynamic that discourages reporting therefore undermines the security outcome it purports to serve.

Aggregation means reporting behavioural metrics at the team or departmental level wherever possible. This provides the diagnostic granularity needed for intervention design without creating an individual performance assessment. Individual-level data should be accessed only when there is a specific, documented justification.

Purpose limitation requires that data collected for security improvement not be repurposed for performance management or disciplinary action. Under the UK GDPR and equivalent frameworks, this is both a legal and an ethical obligation. Compliance with it is also a practical requirement: employees who believe their security behaviour data could be used against them are likely to conceal mistakes and report less.
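
The aggregation principle can be enforced in the reporting pipeline itself. The sketch below accepts only team labels and outcomes, never individual identifiers, and suppresses any team too small to report without exposing individuals. The `MIN_GROUP_SIZE` threshold of five and the function name are assumptions for illustration; the appropriate threshold is a policy decision.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 5  # assumed suppression threshold; set by policy, not by this sketch


def team_report_rates(records, min_group_size=MIN_GROUP_SIZE):
    """Aggregate (team, reported) pairs into team-level reporting rates.

    Teams with fewer than min_group_size records are returned as None so
    that small groups cannot be used to infer individual behaviour.
    """
    totals = defaultdict(lambda: [0, 0])  # team -> [reported count, record count]
    for team, reported in records:
        totals[team][0] += int(reported)
        totals[team][1] += 1
    return {
        team: (reported / count if count >= min_group_size else None)
        for team, (reported, count) in totals.items()
    }


# Illustrative records only: no names or user identifiers are passed in.
records = (
    [("finance", True)] * 3
    + [("finance", False)] * 3
    + [("legal", True)] * 2
)
result = team_report_rates(records)
print(result)
```

Here the finance team is reported as a rate, while the two-person legal team is suppressed rather than published.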

Conclusion: From Awareness to Human Risk Management

Across this four-part series, the argument has been consistent. Understanding why people behave as they do requires theory, specifically dual process theory and the COM-B model. Understanding what makes them exploitable requires knowledge of cognitive biases and how attackers operationalise them. Changing behaviour requires an intervention design that addresses the actual barriers rather than assuming information provision will suffice. And knowing whether any of it has worked requires measurement frameworks that connect behaviour to risk, not to training completion.

The shift from security awareness to human risk management is not a rebrand. It is a substantive change in how the profession understands and addresses its most persistent challenge. The evidence base is mature, the frameworks are practical, and the tools exist. What has been lacking in many organisations is the willingness to measure honestly and to let the evidence determine the response. That willingness is, in the end, the only prerequisite that cannot be designed in by a choice architect.

 


References

Hopkins, A. (2009). Failure to Learn: The BP Texas City Refinery Disaster. CCH Australia.

Lain, D., Kostiainen, K. and Capkun, S. (2022). Phishing in organizations: Findings from a large-scale and long-term study. In 2022 IEEE Symposium on Security and Privacy (SP), pp.842-859.

Michie, S., van Stralen, M.M. and West, R. (2011). The behaviour change wheel: A new method for characterising and designing behaviour change interventions. Implementation Science, 6(1), p.42.

Service, O. et al. (2014). EAST: Four Simple Ways to Apply Behavioural Insights. Behavioural Insights Team.