Cybersecurity researchers have identified a critical vulnerability in Ollama, a popular open-source framework for running large language models (LLMs) locally. The flaw, dubbed “Bleeding Llama,” could allow remote attackers to leak the entire process memory of an Ollama server without authenticating, potentially exposing sensitive data. Tracked as CVE-2026-7482, it carries a CVSS score of 9.1, which falls in the critical severity range.
Ollama, which has over 171,000 stars on GitHub, lets users deploy LLMs on their own hardware without relying on cloud services. “Bleeding Llama” is an out-of-bounds read in Ollama’s GGUF model loader that is reachable through the `/api/create` endpoint. According to initial assessments by security researchers, the issue could affect an estimated 300,000 servers globally.
Ollama Vulnerability Allows Memory Leak
The core of the “Bleeding Llama” vulnerability (CVE-2026-7482) lies in how Ollama handles GGUF files, a format designed for storing large language models. According to the CVE description, the vulnerability exists in Ollama versions prior to 0.17.1. When an attacker provides a specially crafted GGUF file whose declared tensor offset and size exceed the file’s actual length, Ollama may read past the end of its allocated heap buffer during quantization. The affected code is in `fs/ggml/gguf.go` and in the `WriteTo()` function in `server/quantization.go`. The out-of-bounds read is possible because this code uses Go’s `unsafe` package, which bypasses the language’s memory-safety guarantees.
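The kind of bounds check this class of bug calls for can be sketched as follows. This is a minimal Python illustration, not Ollama’s actual Go code or its patch: the `TensorInfo` type, the `out_of_bounds_tensors` function, and the field names are hypothetical, and the real GGUF format carries more metadata than is modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorInfo:
    """Hypothetical, simplified view of one tensor entry in a model file."""
    name: str
    offset: int  # byte offset of the tensor data, relative to data_start
    size: int    # declared byte length of the tensor data


def out_of_bounds_tensors(tensors, data_start, file_size):
    """Return the names of tensors whose declared byte range leaves the file."""
    bad = []
    for t in tensors:
        end = data_start + t.offset + t.size
        # Reject negative values and any range that ends past the file's real length.
        if t.offset < 0 or t.size < 0 or end > file_size:
            bad.append(t.name)
    return bad


if __name__ == "__main__":
    tensors = [
        TensorInfo("tok_embd.weight", offset=0, size=4096),
        TensorInfo("output.weight", offset=4096, size=10**9),  # inflated size
    ]
    # Only the second tensor claims more bytes than the 8192-byte file holds.
    print(out_of_bounds_tensors(tensors, data_start=1024, file_size=8192))
```

The point of the sketch is that the declared offset and size are attacker-controlled input, so they have to be compared against the actual file length before any read is attempted.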
In a hypothetical attack scenario, a malicious actor uploads a GGUF file with an artificially inflated tensor shape to an exposed Ollama server and then calls the `/api/create` endpoint, which triggers the out-of-bounds read. The read can pull sensitive information out of the Ollama process memory, such as environment variables, API keys, system prompts, and even other users’ concurrent conversation data. The attacker can then send the leaked data to a server they control via the `/api/push` endpoint.
The exploitation chain for this Ollama vulnerability involves three key steps. First, an attacker uploads a crafted GGUF file with an exaggerated tensor shape to a network-accessible Ollama server using an HTTP POST request. Second, the attacker uses the `/api/create` endpoint to initiate model creation, thereby activating the out-of-bounds read. Finally, the attacker utilizes the `/api/push` endpoint to transfer the leaked data from the heap memory to an external server under their control. “An attacker can learn basically anything about the organization from your AI inference — API keys, proprietary code, customer contracts, and much more,” stated Cyera security researcher Dor Attias. The implications are particularly severe for organizations integrating Ollama with other tools, as outputs from those tools could also be compromised.
Recommendations for Mitigating Ollama Risks
To mitigate the risks associated with this Ollama vulnerability, users are strongly advised to upgrade promptly to a fixed release; the CVE description lists versions prior to 0.17.1 as affected. Limiting network access to Ollama instances, auditing running instances for internet exposure, and isolating them behind firewalls are also crucial security measures. Because Ollama’s REST API does not offer built-in authentication, deploying an authentication proxy or API gateway in front of all Ollama instances is recommended as well.
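As a starting point for auditing instances, the sketch below compares a server’s reported version against 0.17.1, the first version outside the affected range given in the CVE description. It assumes the server answers `GET /api/version` with JSON containing a `version` field; that detail does not come from the advisory, so verify it against your own deployment. The helper names are illustrative.

```python
import json
import urllib.request

# First version outside the affected range, per the CVE description.
FIXED_VERSION = (0, 17, 1)


def parse_version(text):
    """Turn '0.17.0' or 'v0.17.1-rc2' into a tuple of ints such as (0, 17, 0)."""
    core = text.strip().lstrip("v").split("-")[0].split("+")[0]
    return tuple(int(part) for part in core.split("."))


def is_affected(version_text):
    """True if the version sorts before FIXED_VERSION (pre-release tags are ignored)."""
    return parse_version(version_text) < FIXED_VERSION


def fetch_version(base_url, timeout=5.0):
    """Ask a server for its version. Assumes GET /api/version returns {"version": "..."}."""
    with urllib.request.urlopen(f"{base_url}/api/version", timeout=timeout) as resp:
        return json.load(resp)["version"]
```

Calling `is_affected(fetch_version(base_url))` against hosts you are authorized to test gives a quick inventory. If the version request succeeds from an untrusted network at all, the instance is exposed regardless of its version.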
Additional Unpatched Flaws in Ollama for Windows
Meanwhile, security researchers at Striga have detailed two separate vulnerabilities in Ollama’s Windows update mechanism that, when chained, can lead to persistent code execution. These flaws, disclosed on January 27, 2026, remain unpatched following the expiration of a 90-day disclosure period. The vulnerabilities are identified as CVE-2026-42248 (CVSS score: 7.7) and CVE-2026-42249 (CVSS score: 7.7).
According to Bartłomiej “Bartek” Dmitruk, co-founder of Striga, the Windows desktop client for Ollama runs automatically on login via the Windows Startup folder. It also periodically checks for updates via the `/api/update` endpoint. The two vulnerabilities exploit this update process. CVE-2026-42248 is a missing signature verification flaw, meaning the update binary is not properly verified before installation on Windows, unlike its macOS counterpart. CVE-2026-42249 is a path traversal vulnerability, stemming from the Windows updater creating the installer’s staging directory path directly from HTTP response headers without adequate sanitization.
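The sanitization missing in the path traversal case can be sketched as below. This is an illustration in Python rather than the updater’s actual code: `safe_staging_dir` is a hypothetical helper, and it assumes the server-supplied value is meant to be a single directory name placed under a fixed staging root.

```python
from pathlib import Path


def safe_staging_dir(staging_root, header_value):
    """Resolve a server-supplied directory name under staging_root, or raise ValueError."""
    # Treat the value as one directory name: no separators, drive colons, or dot entries.
    if header_value in ("", ".", "..") or any(c in header_value for c in ("/", "\\", ":")):
        raise ValueError(f"unsafe staging name from header: {header_value!r}")
    root = Path(staging_root).resolve()
    candidate = (root / header_value).resolve()
    # Defense in depth: after resolving, the path must be a direct child of the root.
    if candidate.parent != root:
        raise ValueError(f"staging path escapes root: {candidate}")
    return candidate
```

With a check like this, a value such as `..\..\Startup` is rejected before any file is written, while an ordinary name such as `0.17.2` resolves to a directory inside the staging root.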
Exploiting these Ollama vulnerabilities requires an attacker to control an update server that the victim’s Ollama client can reach. By overriding the `OLLAMA_UPDATE_URL` setting to point to a local HTTP server, with `AutoUpdateEnabled` left at its default value of `true`, an attacker can control the update responses. The path traversal vulnerability then allows them to write an arbitrary executable to the Windows Startup folder. Because the signature check is missing, the update process never flags the unauthorized file.
The chain can result in persistent code execution at the privilege level of the user running Ollama. The missing integrity check alone can lead to code execution, albeit not persistent, as a legitimate update might overwrite the staged file. However, when combined with the path traversal, an attacker can place an executable outside the standard update path, achieving persistent code execution. CERT Polska, which managed the coordinated disclosure, noted that Ollama for Windows versions 0.12.10 through 0.17.5 are affected. Users are advised to disable automatic updates and remove any Ollama shortcuts from the Windows Startup folder to prevent this silent on-login execution pathway.
Dmitruk further said that the affected range is wider than the one in the CERT Polska advisory, with Ollama for Windows versions 0.12.10 through 0.22.0 vulnerable to the chained flaws. The path traversal allows attackers to place chosen executables in the Startup folder, and the missing signature verification allows them to remain there. The chain supports realistic payloads such as reverse shells, information stealers, or droppers. While removing the dropped binary from the Startup folder ends the persistence, the underlying flaws remain unaddressed in affected versions. Until patches are available, users should apply the recommended workarounds and install fixes as they are released.
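To support the Startup folder workaround, a small audit script can list what currently runs at login. The sketch below is an assumption-laden illustration: it builds the per-user Startup folder path from `%APPDATA%` in the standard Windows layout, and the function names are hypothetical. A dropped payload need not have “ollama” in its name, so the full listing deserves manual review.

```python
import os
from pathlib import Path


def default_startup_dir():
    """Per-user Startup folder built from %APPDATA%; None if APPDATA is unset."""
    appdata = os.environ.get("APPDATA")
    if not appdata:
        return None
    return Path(appdata) / "Microsoft" / "Windows" / "Start Menu" / "Programs" / "Startup"


def list_startup_entries(startup_dir):
    """Return the sorted names of all files in the folder, for manual review."""
    folder = Path(startup_dir)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.iterdir() if p.is_file())


def ollama_entries(names):
    """Pick out entries whose name mentions Ollama, ignoring case."""
    return [n for n in names if "ollama" in n.lower()]


if __name__ == "__main__":
    folder = default_startup_dir()
    entries = list_startup_entries(folder) if folder else []
    print("All startup entries:", entries)
    print("Ollama-related:", ollama_entries(entries))
```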

