CONTROLLED AGENCY | Issue 09: The Experience Hijack

When AI Learns the Wrong Lessons: The Rise of Memory Poisoning Attacks

Jun 09, 2026

The most dangerous memory attacks against AI agents may now require nothing more than a conversation.

Issue 08 of this newsletter introduced persistent memory as a new attack surface, one that survives session termination, propagates influence across future interactions, and operates in the gap between sessions where traditional monitoring architectures have limited visibility. The attack class was real, and the governance implications were significant.

Since that issue, research on memory attacks has advanced rapidly. Recent work has revealed something more concerning than the existence of the attack surface itself: the barriers to exploiting it are becoming dramatically lower.

You no longer necessarily need infrastructure access to influence an agent’s memory. You no longer need write access to a vector database. In some cases, influence over the content that an agent processes may be sufficient. A small number of carefully crafted statements, disguised as operational context, technical facts, or incident reports, can shape how an agent reasons about future decisions without modifying its code, compromising its infrastructure, or altering its runtime environment.

I refer to this emerging class of attacks as the experience hijack.

The target is not the agent’s code, infrastructure, or runtime behavior. The target is the agent’s accumulated experience, the memory layer that allows agentic systems to learn from prior interactions and adapt their future decisions.

What the research established this month

Three recent research threads released in May and June 2026 point toward the same structural conclusion: memory itself is becoming a primary attack surface for agentic AI systems.

The first is MemMorph, a memory-poisoning attack that influences tool selection via long-term memory. Earlier tool manipulation attacks typically relied on poisoning tool metadata, descriptions, schemas, or capability declarations. MemMorph takes a different approach. Rather than directly instructing an agent to invoke a specific tool, it injects a small number of crafted memory records disguised as legitimate operational information.

These records reshape the contextual information available to the agent during future decision-making. The result is not explicit tool manipulation but altered reasoning. The agent arrives at its preferred tool selection through its own inference process, based on memories it now considers trustworthy.

This distinction matters, as the attack operates through the agent’s understanding of prior experience rather than through direct instruction. Defenses designed to detect prompt injections, malicious instructions, or suspicious tool descriptions may never observe the manipulation because the manipulation occurred earlier, during memory formation.

The second thread is sleeper memory poisoning.

In stateless systems, adversarial content generally affects only the current interaction. A malicious document, webpage, or email may influence behavior within a session, but its effect typically disappears once the session ends.

Memory-enabled agents change that equation.

If an attacker can influence what is written into long-term memory, a single exposure may affect future behavior long after the original content is gone. The sleeper variant is particularly concerning because the poisoned memory may remain dormant for extended periods. It activates only when specific conditions are met; a topic, context, workflow, or combination of circumstances that triggers retrieval.

The attacker does not need to be present when the impact occurs. The manipulation is completed long before the harmful behavior emerges.

The third thread is MemPoison, which explores how memory poisoning can persist despite emerging memory-management defenses.

Many organizations have begun deploying mechanisms such as memory summarization, selective retrieval, temporal decay, and memory eviction policies. MemPoison demonstrates that carefully crafted memory entries can remain highly retrievable and continue influencing agent behavior even within large memory stores containing thousands of benign records.

The significance is not merely that poisoning can occur. It is that poisoning can persist through the very mechanisms intended to reduce risk.

The lowering of the write barrier

One of the most important observations across this body of research is the gradual reduction in attacker requirements.

Earlier retrieval poisoning attacks generally assumed some level of access to underlying storage systems. That represented a meaningful barrier because successful attacks required infrastructure compromise, credential theft, insider access, or supply-chain compromise.

Subsequent research demonstrated attacks achievable through query-level interaction with the system.

Recent memory-poisoning research pushes that boundary further. In some scenarios, influence over content processed by the agent may be enough. The attacker may only need to place content into environments the agent already interacts with: documents, tickets, knowledge bases, web content, emails, or conversations.

For agents that routinely process external information, this represents a significant shift. Recent research suggests the traditional write barrier is collapsing, reducing the distinction between ordinary interaction and attack access.

This is what makes the experience hijack fundamentally different from many earlier attack classes.

Prompt injection requires the agent to process adversarial instructions directly. Tool poisoning often requires control over tool definitions or infrastructure. Memory-focused attacks can instead exploit the normal mechanisms through which agents learn from experience.

Why existing defenses struggle

Most organizations currently build agentic AI security around three checkpoints:

Input monitoring
Tool-call inspection
Output review

Each addresses an important risk. None was designed specifically for memory corruption.

Input monitoring focuses on detecting suspicious instructions, adversarial patterns, or known prompt-injection techniques. Memory-poisoning attacks often avoid these signals entirely because the injected content resembles legitimate operational information.

Tool-call inspection evaluates whether actions remain within the authorized scope. However, memory poisoning may not generate obviously anomalous actions. The resulting tool choices can appear entirely reasonable when viewed in isolation because the manipulation occurred earlier during memory formation.

Output review focuses on what the agent produces. Yet a poisoned memory may alter decision-making without creating visibly abnormal outputs. The effect may be subtle, cumulative, or operational rather than conversational.

The security challenge is therefore different.

The anomaly is not necessarily present in the input, the action, or the output. The anomaly exists within the experience that shaped the decision.

The OWASP Agentic AI framework now explicitly recognizes memory and context poisoning as a distinct security concern. This is an important milestone because it reflects growing industry recognition that memory is not simply a feature of agentic systems but a security boundary.

Recognition, however, is not the same as operational defense.

The governance implication: experience is infrastructure

Current governance frameworks, including NIST guidance, OWASP frameworks, ISO/IEC 42001, and emerging regulatory approaches such as the EU AI Act, increasingly emphasize recurring themes of identity, authorization, accountability, and auditability.

Memory introduces a fifth concern that remains comparatively underdeveloped: memory integrity.

An agent’s memory is not merely data storage.

It is the mechanism through which the system develops judgment. Memory influences what context matters, which patterns are recognized, which tools appear appropriate, and how future decisions are made.

When that mechanism is corrupted, the agent’s judgment may be corrupted as well.

Unlike traditional security incidents, the effects may emerge gradually. No single session appears suspicious, no single event explains the outcome. Instead, behavior shifts through the accumulation of experiences that were never genuine.

This is why memory integrity deserves treatment as a distinct governance requirement rather than a subset of audit logging.

Audit trails record what the system did.

Memory integrity seeks to establish whether the experiences informing those decisions remain trustworthy.

What does defending against this actually require

Three controls stand out as increasingly important.

Provenance-aware memory writes

Every memory entry should carry provenance information: the originating session, source content, writing mechanism, timestamp, and confidence level.

This metadata should remain preserved throughout summarization, consolidation, and retention operations.

Without provenance, a memory entry is effectively an unsigned assertion. With provenance, investigators can evaluate trustworthiness and remove suspicious entries without discarding legitimate institutional memory.

Memory integrity auditing

Memory auditing should occur independently of user sessions.

Organizations routinely monitor runtime behavior. They rarely inspect the memory layer itself.

Dedicated audits can identify anomalous memory entries, unusual provenance patterns, and clusters of information inconsistent with the agent’s declared operational purpose.

This is a fundamentally different task from runtime monitoring and requires dedicated controls.

Bounded memory lifetime and re-validation

Memory should not persist indefinitely without reassessment.

As memories age, they should undergo automated re-validation against current context, provenance information, and organizational policy.

Entries that fail validation should be quarantined for review rather than continuing to influence behavior indefinitely.

None of these controls is technically exotic. What is missing today is widespread adoption.

The structural argument

Issue 08 established that persistent memory creates a new attack surface for AI agents.

Issue 09 advances that argument further.

Recent research suggests that influencing this attack surface may require substantially less access than previously assumed. In some circumstances, ordinary interaction with content processed by the agent may provide sufficient opportunity for manipulation.

Memory poisoning is therefore not simply a database problem.

It is a challenge to the mechanism through which agents form experience.

A database compromise is a discrete event. Experience formation is a continuous process. It occurs every time an agent reads a document, processes a ticket, retrieves context, or participates in a conversation.

That distinction matters.

Organizations that recognize memory integrity as a first-class security requirement will be better positioned to govern agentic systems as they become more autonomous and persistent.

Those that do not may eventually face incidents that their audit trails can describe but cannot fully explain.

The explanation will reside in the memory layer itself.

And the memory layer remains one of the least instrumented components of modern agentic AI architectures.

The experience hijack is no longer a purely theoretical concern. Recent research has demonstrated high-success memory poisoning attacks that require only a small number of injected records and no infrastructure compromise. Governance frameworks are beginning to recognize the problem.

Operational defenses, however, remain at an early stage.

That gap is where the next generation of agentic AI security incidents is likely to emerge.

Miracle Owolabi

Discussion about this post

Ready for more?