The Discipline That Assumed a Static Threat
Behavioural cybersecurity emerged as a field from a recognition that technical controls, however sophisticated, cannot fully account for the human element of security risk. People make decisions under uncertainty. They are subject to cognitive biases, social pressures, and motivational dynamics that operate well below the level of conscious reasoning. They take shortcuts when they are busy, defer to authority when they are anxious, and extend trust when communication feels familiar. The discipline developed frameworks, interventions, and measurement approaches designed to understand these dynamics and build organisational resilience against adversaries that exploit them.
The problem is that the discipline largely assumed a relatively stable adversarial model. The phishing email, the vishing call, the pretexting scenario: these were the canonical threat vectors against which awareness programmes were calibrated. Skilled human social engineers could execute them with sophistication, but their capability was bounded by human cognitive bandwidth, by the time and effort required to research and personalise an approach, and by the inevitable inconsistencies that emerge when a human being attempts to sustain a deceptive persona under operational pressure.
That assumption of a bounded adversary is no longer valid. The emergence of AI-enabled adversarial capability has not simply made existing social engineering faster or more scalable. It has changed the nature of the threat in ways that require the discipline to revisit its foundational frameworks and ask harder questions about what human resilience actually means in this new environment.
The COM-B Lens: A Framework Built for This Moment
The COM-B model, developed by Michie and colleagues as part of the Behaviour Change Wheel framework, offers a structural lens that remains genuinely useful for understanding how AI-enabled adversaries target human behaviour. The model holds that any behaviour is a product of three interacting components: Capability, the physical and psychological capacity to perform the behaviour; Opportunity, the environmental and social factors that make the behaviour possible or impossible; and Motivation, the reflective and automatic processes that direct behaviour towards particular outcomes.
What makes COM-B valuable in this context is that it does not treat human vulnerability as a monolithic property. It treats it as a dynamic product of conditions, conditions that can be shaped by the environment, by the adversary, and by the defender. An AI-enabled adversary does not exploit "human weakness" in the abstract. It engineers the conditions under which specific individuals are most likely to behave in ways that serve the adversary's objectives, targeting capability, opportunity, and motivation with a degree of specificity and coordination that no human social engineer could achieve at scale.
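One way to make this conditional view of vulnerability concrete is to sketch it in code. The fragment below is illustrative only: the snapshot idea, the field names, and the naive scoring rule are assumptions introduced here for exposition, not part of the COM-B model or of any particular tooling.

```python
# Minimal sketch (hypothetical): expressing COM-B conditions as a per-person,
# per-moment vulnerability profile rather than a single population-level score.
# Field names and the naive scoring rule are illustrative assumptions only.
from dataclasses import dataclass

@dataclass
class CombSnapshot:
    # Capability: can this person detect and verify right now?
    knows_verification_protocol: bool    # e.g. a new joiner may not
    # Opportunity: do the environment and the moment favour the adversary?
    under_time_pressure: bool            # e.g. quarter-end, travel day
    request_arrives_in_trusted_channel: bool
    # Motivation: which automatic drivers is the lure engineered to trigger?
    lure_invokes_authority: bool
    lure_invokes_urgency: bool

def exposure_score(s: CombSnapshot) -> int:
    """Count the conditions currently favouring the adversary (0-5)."""
    return sum([
        not s.knows_verification_protocol,
        s.under_time_pressure,
        s.request_arrives_in_trusted_channel,
        s.lure_invokes_authority,
        s.lure_invokes_urgency,
    ])

# The same person can score 1 on a quiet morning and 5 at quarter-end:
# vulnerability is a property of conditions, not of the individual.
```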
Understanding the mechanism at each level is essential for designing defences that are proportionate to the actual threat.
Capability: The Erosion of Linguistic Heuristics
For many years, one of the most reliable and teachable signals of a social engineering attempt was the quality of the language used. Grammatical errors, awkward phrasing, inconsistent tone, and contextual implausibilities were not simply the products of careless adversaries. They were structural artefacts of the mismatch between the attacker's cultural and linguistic context and that of the target. Non-native speakers crafting communications in English for a UK or North American corporate audience would, almost inevitably, produce signals that an alert recipient could detect. Security awareness programmes invested significantly in training people to notice these signals, and for a period, the investment had genuine returns.
Large language models have systematically dismantled this heuristic. The linguistic fluency achievable through current-generation AI is not marginally better than that of an average human writer. In many domains it is indistinguishable from that of an expert. Content produced through AI systems is grammatically correct, tonally appropriate, and contextually coherent. It does not carry the surface-level errors that awareness programmes trained people to notice. The linguistic detection capability that organisations spent years building in their people has been rendered obsolete not through any failure of the training methodology, but through an irreversible change in the adversarial capability landscape.
The implications go beyond phishing detection. The human cognitive capacity to assess credibility through linguistic cues is a broad-spectrum faculty, not a narrowly trained security skill. We use it when we read news, evaluate advice, assess professional communications, and navigate our social environment. When that faculty is systematically confounded, the downstream effects are not confined to security contexts. The erosion of trust in digital communication is itself a security consequence, one whose full implications are still unfolding.
What replaces linguistic detection as a capability that organisations can meaningfully develop in their people? The answer points away from content-level detection and towards process-level verification habits. The capability that matters in an AI-enabled adversarial environment is not the ability to spot a bad email. It is the internalised reflex to verify through an independent channel before acting on any request that involves access, credentials, financial transactions, or sensitive information, regardless of how credible the communication appears. Building this reflex requires a different kind of training design than the "spot the phish" paradigm that has dominated the field.
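Expressed as a rule, the reflex keys off what a request asks for, never how credible the message appears. The sketch below is a minimal illustration of that principle, with hypothetical action categories; it is not a prescription for any particular system or tooling.

```python
# Minimal sketch (hypothetical) of the process-level rule described above:
# verification is triggered by what the request *asks for*, never by how
# credible the message *looks*. Categories and names are illustrative.
SENSITIVE_ACTIONS = {"credentials", "access_grant", "payment", "data_export"}

def requires_out_of_band_verification(requested_action: str,
                                       sender_looks_trusted: bool) -> bool:
    # Deliberately ignore sender_looks_trusted: apparent credibility is
    # exactly the signal an AI-enabled adversary can now manufacture.
    return requested_action in SENSITIVE_ACTIONS

# Example: a flawless, in-context message from "your manager" asking for an
# access grant still routes through an independent channel.
assert requires_out_of_band_verification("access_grant", sender_looks_trusted=True)
```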
Opportunity: The Precision of the Mapped Attack Surface
The opportunity component of the COM-B model refers to the environmental and social conditions that make a behaviour more or less likely. In security terms, opportunity has traditionally referred to the conditions that make a target more or less vulnerable at a given moment: the time-pressured executive, the distracted employee, the new joiner who does not yet know the verification protocols. Skilled social engineers have always sought to identify and exploit these windows of opportunity. The constraint was always the cost of identifying them, which required surveillance, reconnaissance, and inference, all of which consume human time and attention.
AI actors operating across digital environments face no comparable constraint. The digital surface of a professional's life is extraordinarily rich in observable signals, and those signals are increasingly accessible through legitimate means. Professional network profiles reveal reporting lines, project involvement, and career trajectories. Public communications reveal communication style, current preoccupations, and relationship dynamics. Calendar and meeting metadata, where accessible, reveal the rhythm of a person's working week and the moments when cognitive load is likely to be highest. Corporate communications and announcements reveal the organisational pressures that are currently live and the decision-making timescales in play.
An AI actor that synthesises these signals does not construct a generic model of "a busy professional." It constructs a model of a specific individual at a specific point in time, including the particular pressures they are under, the relationships they rely on, and the moments during which their attention is most divided. A message that appears to come from a trusted colleague, references a real project by its actual name, asks for something consistent with that colleague's genuine role, and arrives at a moment of documented time pressure is not a social engineering attempt in any sense that traditional security awareness addresses. It is a precisely mapped exploitation of a specifically identified opportunity window.
This precision changes the risk calculus in an important way. Generic awareness training operates at the population level. It aims to raise the average detection rate across the workforce by providing general heuristics applicable to a broad range of social engineering scenarios. The assumption underlying this approach is that the adversary is also operating at the population level, casting a wide net and accepting that most of it will be detected or ignored. When the adversary is operating at the individual level, selecting targets based on precisely characterised vulnerability profiles and timing attacks to coincide with mapped opportunity windows, population-level training provides much weaker protection than the aggregate statistics suggest.
The individual whose behavioural profile has been mapped and who is being targeted at a carefully selected moment of vulnerability is not meaningfully protected by the fact that the organisational average detection rate on the last phishing simulation was eighty-three per cent. Their specific vulnerability at that moment is the variable that matters. And it is a variable that population-level training does not address.
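A back-of-the-envelope calculation makes the gap visible. The numbers below are deliberately hypothetical; the point is structural, not empirical.

```python
# Back-of-the-envelope sketch with hypothetical numbers: why an 83% average
# detection rate on a generic simulation says little about a targeted attack.
population_detection_rate = 0.83   # last phishing simulation, aggregate

# Assume (illustratively) the adversary targets five mapped individuals at
# their most vulnerable moments, and each tailored lure has a 40% chance of
# producing compliance despite the healthy aggregate statistic.
targets = 5
per_target_compliance = 0.40

p_at_least_one_compromise = 1 - (1 - per_target_compliance) ** targets
print(f"{p_at_least_one_compromise:.0%}")   # ~92%

# The organisational average never enters the calculation that matters.
```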
Motivation: Bypassing the Cognitive Architecture
The motivational dimension is where the interaction between AI-enabled adversarial capability and human cognitive architecture becomes most consequential, and most difficult to address through conventional security culture interventions.
Dual-process theories of cognition, originating in the work of Kahneman and Evans and developed subsequently by a wide range of researchers in social and cognitive psychology, characterise human decision-making as the product of two broadly distinct systems. System 1 processing is fast, automatic, associative, and largely unconscious. It operates on heuristics, pattern recognition, and emotional resonance. It is the system that recognises a familiar face, reads the tone of a message before processing its content, and generates an immediate sense of whether a situation feels right or wrong. System 2 processing is slow, deliberate, effortful, and conscious. It is the system that performs logical analysis, evaluates evidence, and overrides intuitive responses when they conflict with reasoned judgment.
Security awareness training, in virtually all its conventional forms, is directed at System 2. It attempts to load deliberate knowledge, heuristics, checklists, and recognition criteria into the reflective system in the hope that this knowledge will be deployed when needed. The fundamental problem with this approach, which behavioural scientists have identified for decades, is that System 2 is only engaged under specific conditions. Under time pressure, cognitive load, strong social context, and emotional activation, System 1 dominates. People do not consult their training. They respond to the immediate social and emotional signals of the situation.
An AI actor that has constructed a detailed model of a specific individual's relational context can craft communications that are precisely calibrated to trigger System 1 responses. It can replicate not just the surface language of a trusted colleague, but the characteristic tone, the familiar reference points, the communication patterns that signal "this is a message from someone you trust." It can time the communication to arrive when cognitive load is high and System 2 engagement is therefore least likely. It can frame the request in terms of urgency or authority that activate the emotional and social motivators most likely to drive automatic compliance. The target is not failing to apply their training. The attack is specifically designed to prevent the conditions under which training could be applied.
This is not a peripheral concern for security culture design. It is its central problem. If the primary vulnerability is the gap between what people know in reflective mode and how they behave under System 1 conditions, then the solution is not more knowledge transfer. It is the design of environmental, social, and process-level conditions that automatically engage verification behaviour, without requiring deliberate System 2 engagement. This means building verification into workflow design. It means creating social norms where asking "can I just confirm this through another channel" is the unremarkable default response to any unusual request, rather than a behaviour that requires deliberate effort and carries an implicit social cost.
The False Assurance of Phishing Simulation
The argument crystallises into a direct challenge to the dominant paradigm of security awareness measurement: the phishing simulation programme.
Phishing simulation, as typically deployed, works by sending a simulated phishing email to a population of employees and measuring the proportions who click, report, or ignore it. The aggregate metrics are then used as proxies for organisational resilience: a high click rate signals vulnerability; a high report rate signals a healthy security culture; and trends over time demonstrate the effectiveness of awareness interventions. The methodology is widespread, commercially well-supported, and intuitively appealing to leadership because it produces numbers.
The problem is that the numbers it produces do not measure what they claim to measure.
Phishing simulations, in their standard form, deploy generic or lightly personalised content against a broad population at an unspecified and uncontextualised moment in time. They test whether people can identify a social engineering attempt under conditions that are broadly representative of the pre-AI threat landscape. They do not test whether people can identify a social engineering attempt built on a detailed behavioural model of their specific individual profile. They do not test resilience at the moments of highest individual vulnerability. They do not test the response to communications that replicate trusted relationships with AI-level fidelity. And they do not test the motivational dynamics that operate when a communication has been specifically designed to trigger System 1 compliance.
A workforce that scores well on a phishing simulation programme has demonstrated population-level resilience against a class of threat that is rapidly becoming secondary. It has not demonstrated resilience against an adversary that is capable of individualised behavioural targeting. Using the former as evidence of the latter is not merely methodologically weak. It creates a false assurance, documented and reported upwards, that may actively delay investment in more structurally sound resilience approaches.
This is not an argument against phishing simulation as one element of a broader programme. It is an argument against its use as the primary or definitive measure of human security resilience in an environment where the adversarial capability has moved decisively beyond what simulation programmes are designed to assess.
Towards Resilience That Is Proportionate to the Threat
If the conventional toolkit is insufficient, what does proportionate resilience look like in an AI-enabled adversarial environment?
The first requirement is a shift in the unit of analysis. Population-level training and measurement must be complemented by individual-level risk assessment that acknowledges the differential targeting that AI-enabled adversaries can execute. This does not mean abandoning broad awareness programmes. It means recognising that certain individuals, by virtue of their role, access privileges, public profile, or position in the organisation's trust network, represent higher-value targets and may require substantively different resilience investment.
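A rough sketch of what that shift in the unit of analysis could look like follows; the attributes and thresholds are illustrative assumptions rather than a validated scoring model.

```python
# Minimal sketch (hypothetical) of moving from population-level to
# individual-level resilience investment, using the attributes named above.
# Attributes, thresholds, and tier descriptions are illustrative only.
from dataclasses import dataclass

@dataclass
class Person:
    has_privileged_access: bool     # systems, finance, credentials
    high_public_profile: bool       # easily mapped from open sources
    trust_hub: bool                 # many colleagues act on their requests

def resilience_tier(p: Person) -> str:
    factors = sum([p.has_privileged_access, p.high_public_profile, p.trust_hub])
    if factors >= 2:
        return "individualised: tailored verification drills and monitored channels"
    if factors == 1:
        return "enhanced: role-specific scenarios alongside broad awareness"
    return "baseline: population-level awareness programme"

# Example: a finance director with a visible public profile lands in the
# individualised tier regardless of the organisation-wide simulation score.
print(resilience_tier(Person(True, True, False)))
```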
The second requirement is a reorientation of training design from knowledge transfer to behaviour design. The question is not "what do people need to know" but "what environmental, social, and process-level conditions will produce secure behaviour reliably, including under System 1 conditions." This means designing verification workflows that are frictionless enough to be used consistently. It means creating social norms around out-of-band verification that remove the social cost of appearing to distrust a colleague. It means building security into the default path rather than making it an effortful departure from it.
The third requirement is investment in informal trust networks as a structural security infrastructure. Security Champions Networks, properly designed as distributed influence networks rather than awareness delivery mechanisms, create a fabric of trusted relationships through which suspicious communications are more likely to be surfaced, discussed, and verified. The informal social layer of an organisation is both the primary target of AI-enabled social engineering and the primary mechanism of organic resilience. Investing in it deliberately, through the structured development of network-embedded champions who can receive and propagate behavioural security signals, is one of the most structurally sound responses to the precision targeting that AI-enabled adversaries can execute.
The fourth requirement is honesty about measurement. The metrics that security leaders report to boards and executives should accurately represent what is being measured and what it does and does not demonstrate about organisational resilience. Phishing simulation click rates are a measure of population-level heuristic detection under controlled conditions. They are not a measure of resilience against individualised AI-enabled social engineering. Saying so clearly and proposing measurement frameworks that are more proportionate to the actual threat is itself a strategic leadership act.
The Discipline Needs to Evolve
Behavioural cybersecurity is a young discipline, and its foundations were built for a threat landscape that no longer fully describes the environment its practitioners are navigating. The COM-B model, dual-process theory, and social influence frameworks remain genuinely valuable: they describe real mechanisms through which human behaviour is shaped, and those mechanisms are precisely the ones that AI-enabled adversaries are engineered to exploit. The frameworks are not wrong. But their application must evolve in response to an adversarial capability that is qualitatively different from what the discipline was designed to address.
The shift required is from a model that treats human beings as the primary variable to be trained, and training as the primary intervention, towards a model that treats the full environmental, social, and motivational system as the design surface. The human being is not the problem to be solved. The conditions under which human beings make decisions are the system to be understood, shaped, and continuously assessed against a threat that is itself continuously adaptive.
That is a harder problem than phishing simulation. It is also the right one.