A significant security vulnerability has been discovered in the Chrome extension for Anthropic’s Claude AI model, potentially allowing malicious actors to compromise user data and conduct unauthorized actions. The flaw enables other browser extensions, even those without extensive permissions, to embed hidden instructions that can hijack the AI agent’s capabilities.
Browser security firm LayerX disclosed the findings, detailing how the exploit bypasses existing security measures. The vulnerability could have broad implications as AI agents become increasingly integrated into the daily online activities of businesses and individuals. Because these agents can access the internet and perform complex tasks, their security is paramount.
Exploiting Claude’s Chrome Extension Vulnerability
According to LayerX senior researcher Aviad Gispan, the flaw lies in the extension’s code, which permits any script within the browser’s origin to communicate with Claude’s large language model (LLM) without proper verification. This lack of validation means that any installed extension can leverage a content script, which requires minimal permissions, to issue commands directly to the Claude extension.
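The general weakness can be illustrated with a toy model. The sketch below is not the Claude extension's code: the names `IncomingMessage`, `TRUSTED_ORIGINS`, and `acceptsMessage`, along with the example origins, are hypothetical. It assumes a handler that decides whom to trust based only on the origin of the sending script, and shows that such a check cannot tell a page's own code from a script that another extension's content script placed in the same page.

```typescript
// Hypothetical model for illustration; none of these names come from the Claude extension.
interface IncomingMessage {
  origin: string; // origin the browser reports for the sending script
  prompt: string; // text the sender wants forwarded to the agent
}

// Assumed allowlist of origins the handler trusts.
const TRUSTED_ORIGINS: readonly string[] = ["https://trusted.example"];

// Origin-only check: accepts anything running in a trusted origin,
// without verifying which code actually sent the message.
function acceptsMessage(msg: IncomingMessage): boolean {
  return TRUSTED_ORIGINS.indexOf(msg.origin) !== -1;
}

// A message sent by the trusted page's own code.
const fromPage: IncomingMessage = {
  origin: "https://trusted.example",
  prompt: "Summarize this page",
};

// A message sent by a script that another extension injected into the same page.
// It runs in the page's origin, so it looks identical to the handler.
const fromInjectedScript: IncomingMessage = {
  origin: "https://trusted.example",
  prompt: "Share my files with an outside address",
};

// A message from an unrelated origin, which the check does reject.
const fromElsewhere: IncomingMessage = {
  origin: "https://other.example",
  prompt: "Anything",
};

const results = {
  page: acceptsMessage(fromPage),
  injected: acceptsMessage(fromInjectedScript),
  elsewhere: acceptsMessage(fromElsewhere),
};
```

In this model the check stops unrelated origins but accepts both same-origin senders, which is why an origin test alone does not establish who is issuing a command.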
Demonstrated Malicious Actions
LayerX successfully demonstrated the exploit’s power by executing arbitrary prompts, bypassing Claude’s safety features, and circumventing user confirmation steps. The researchers were able to perform cross-site actions across various Google tools, effectively gaining unauthorized access and control.
As concrete proof of concept, LayerX reported exploiting the vulnerability to extract sensitive files from Google Drive folders and share them with unauthorized parties. They also demonstrated the ability to monitor recent email activity, send emails on behalf of users, and pilfer private source code from connected GitHub repositories.
Gispan characterized the vulnerability as a critical breach, stating it “effectively breaks Chrome’s extension security” by creating a “privilege escalation primitive across extensions.” This type of escalation is precisely what Chrome’s security model is designed to prevent.
The attack vector involves manipulating Claude’s perceived environment. Claude relies on text, user interface semantics, and screenshot interpretation to make decisions. Attackers can control this input by modifying Claude’s user interface to remove labels or indicators around sensitive information, such as passwords or sharing confirmations. Subsequently, they can prompt Claude to perform actions like sharing files with external servers.
This method presents a challenge for cybersecurity defenses, as there may be no overtly malicious activity to detect. Even when an attack leaves visible traces, the AI model can be prompted to erase the evidence, for example by deleting emails, which makes forensic investigation difficult.
Industry Reaction and Mitigation Efforts
Ax Sharma, Head of Research at Manifold Security, called the findings a “useful demonstration of why monitoring AI agents at the prompt layer is fundamentally insufficient.” He emphasized that the sophisticated aspect of the attack is not the injection itself but the manipulation of the agent’s perceived reality, so that its actions appear legitimate from the inside.
LayerX reported the flaw to Anthropic on April 27th. According to LayerX, Anthropic initially stated the bug was a duplicate of another vulnerability already slated for a future update. A subsequent fix was released on May 6th, introducing new approval flows for privileged actions to make exploitation more difficult.
However, Gispan indicated that even with the partial fix, he was still able to gain control of Claude’s agent in some scenarios. He noted that switching to a “privileged” mode, which could be done without user notification or consent, let him circumvent the new security checks and inject prompts, much as in the original exploit.
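The reporting does not describe how the mode switch works internally, so the following is only a sketch of the general pattern, under stated assumptions. All names (`ActionRequest`, `AgentState`, `needsApprovalFragile`, `needsApprovalStrict`, `switchMode`) and the list of sensitive actions are hypothetical. It contrasts an approval gate that believes a mode claimed by the caller with one that consults state held by the extension, where entering the privileged mode itself requires user consent.

```typescript
// Hypothetical names throughout; this is not the extension's actual design.
type Mode = "standard" | "privileged";

interface ActionRequest {
  action: string;       // e.g. "share_file"
  requestedMode?: Mode; // mode the caller claims to be in; untrusted input
}

interface AgentState {
  mode: Mode; // mode tracked by the extension itself
}

// Assumed set of actions that should trigger an approval prompt.
const SENSITIVE_ACTIONS: readonly string[] = ["share_file", "send_email", "delete_email"];

// Fragile gate: skips approval whenever the caller says it is privileged.
function needsApprovalFragile(req: ActionRequest): boolean {
  if (req.requestedMode === "privileged") return false;
  return SENSITIVE_ACTIONS.indexOf(req.action) !== -1;
}

// Stricter gate: ignores the caller's claim and consults extension-held state.
function needsApprovalStrict(req: ActionRequest, state: AgentState): boolean {
  if (state.mode === "privileged") return false;
  return SENSITIVE_ACTIONS.indexOf(req.action) !== -1;
}

// Entering privileged mode is refused unless the user explicitly consented.
function switchMode(state: AgentState, target: Mode, userConsented: boolean): AgentState {
  if (target === "privileged" && !userConsented) return state;
  return { mode: target };
}

// A request that claims privileged mode for a sensitive action.
const request: ActionRequest = { action: "share_file", requestedMode: "privileged" };
const standardState: AgentState = { mode: "standard" };

const fragileAsks = needsApprovalFragile(request);              // caller's claim is trusted
const strictAsks = needsApprovalStrict(request, standardState); // caller's claim is ignored
const silentSwitch = switchMode(standardState, "privileged", false);
const consentedSwitch = switchMode(standardState, "privileged", true);
```

The design point is that an approval flow is only as strong as the path into the mode that skips it: if that path needs no consent, the approval step can be sidestepped.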
Anthropic did not provide a comment to CyberScoop regarding the research or its ongoing mitigation efforts. The effectiveness of the May 6th update and the timeline for a complete resolution remain points of concern for users of AI agents integrated with browser extensions.

