Cybersecurity researchers have uncovered a series of seven vulnerabilities affecting OpenAI’s latest AI models, GPT-4o and GPT-5, including methods for attackers to steal personal information from user memories and chat histories without their knowledge. These findings, detailed by Tenable, highlight significant risks associated with the increasing integration of AI chatbots with external web services. While OpenAI has reportedly begun addressing some of these disclosed issues, the vulnerabilities expose the AI systems to indirect prompt injection attacks, allowing malicious actors to manipulate the chatbot’s intended functions.
The discovered vulnerabilities stem from how large language models (LLMs), like those powering ChatGPT, process information from various contexts, including web browsing and search. Security researchers Moshe Bernstein and Liv Matan outlined in a report that these flaws can be exploited through sophisticated attack techniques that trick the AI into executing unintended or harmful commands. This news follows a wave of recent disclosures detailing various prompt injection attacks against AI tools, indicating a broader challenge in securing these evolving technologies.
Understanding the ChatGPT Vulnerabilities
At the core of these vulnerabilities is the concept of indirect prompt injection, where attackers don’t directly interact with the AI but instead embed malicious instructions within external data sources that the AI then accesses and processes. This technique leverages the AI’s reliance on web browsing and search capabilities, turning trusted external information into a potential attack vector.
One significant vulnerability involves indirect prompt injection via trusted sites in the Browsing Context. Attackers can embed malicious instructions within the comment sections of web pages. When ChatGPT is prompted to summarize such pages, it may inadvertently execute these hidden instructions, leading to the compromise of user data or an alteration of its behavior.
A particularly concerning finding is the zero-click indirect prompt injection vulnerability in Search Context. This method allows attackers to trick the LLM into executing malicious commands simply by referencing a niche website within a natural language query. If the website has been indexed by search engines like Bing and subsequently crawled by OpenAI’s systems, the LLM may execute hidden directives within the site’s content without any direct user interaction or prior knowledge.
Further attacks identified include prompt injection via one-click, which exploits URL structures to automatically execute a malicious query when a user clicks a crafted link. The safety mechanism bypass vulnerability takes advantage of allow-listed domains, such as Bing, to mask malicious URLs within seemingly legitimate ad tracking links. Additionally, the conversation injection technique embeds malicious instructions directly into website content, causing the LLM to respond with unintended replies in subsequent interactions due to the prompt being incorporated into the conversational context.
Researchers also detailed a malicious content hiding technique, which exploits how ChatGPT renders markdown to conceal malicious prompts. Any data appearing on the same line as a fenced code block opening () after the first word might not be rendered, allowing malicious instructions to remain hidden. Lastly, the memory injection technique involves poisoning a user’s ChatGPT memory by concealing hidden instructions on a website that, when summarized, become embedded in the AI’s recollection of the interaction.
Broader Implications of AI Vulnerabilities
These disclosures add to a growing list of security concerns surrounding AI chatbots and their integration with external systems. Recent research has demonstrated similar prompt injection techniques affecting other AI platforms, including Anthropic’s Claude and Microsoft 365 Copilot. These attacks have shown capabilities ranging from remote code execution and data exfiltration to bias amplification and the circumvention of safety guardrails.
The underlying issue, as noted by Tenable researchers, is the fundamental challenge LLMs face in distinguishing between legitimate user commands and attacker-controlled data ingested from external sources. Exposing AI chatbots to external tools and systems, a necessity for building functional AI agents, inherently expands their attack surface. This provides threat actors with more opportunities to conceal malicious prompts that can be inadvertently parsed and executed by the models.
Researchers have cautioned that prompt injection is an inherent challenge in the current architecture of LLMs and is unlikely to be resolved comprehensively in the immediate future. They emphasize the critical need for AI vendors to rigorously ensure that all safety mechanisms, such as URL safety checks, are functioning correctly to mitigate the potential damage from such attacks.
This ongoing development coincides with research highlighting other potential weaknesses in AI models. Studies have indicated that training AI models on “junk data” or contaminated internet content can lead to LLM “brain rot,” degrading their performance and reliability. Furthermore, recent findings suggest that poisoning AI models with a relatively small number of malicious documents can effectively backdoor them, questioning previous assumptions about the scale of data control required for such tampering.
Looking ahead, the focus will remain on how AI developers and cybersecurity professionals address these evolving vulnerabilities. The ongoing integration of AI into various aspects of digital life necessitates robust security measures and a proactive approach to identifying and mitigating new attack vectors. Users are advised to remain vigilant about the information they share with AI chatbots and to be aware of the inherent risks associated with their operation, especially when integrated with external web services.

