Hackers Feast on Leftovers
The LeftoverLocals exploit exposes a major GPU vulnerability, allowing unauthorized access to private data like LLM chat transcripts.
Do you remember back when graphics processing units (GPUs) were intended only for rendering graphics? The days of the 3dfx Voodoo, and the other powerhouses of the era when computer gaming started to come of age, are now long gone. As technology advanced, GPUs underwent a transformative evolution. Their parallel processing capabilities were recognized as valuable not only for graphical tasks but also for handling complex computational workloads. This realization led to the emergence of GPU computing, where GPUs began to play a crucial role in scientific simulations, artificial intelligence, and other data-intensive applications. Today, a GPU is more likely to be associated with machine learning than gaming.
This rapid advancement in GPU technology, driven by our unquenchable thirst for more parallel processing power, led to something of a Wild West in the industry. If you remember the “I'm a Mac, and I'm a PC” ads of the mid-2000s, traditional CPUs were playing the role of the PC, with well-defined instruction set architectures and mountains of documentation. GPUs, on the other hand, were the cool, laid-back younger generation that was moving fast and breaking things. While this undoubtedly gave rise to the tremendous improvements in computing power of today’s GPUs, it also fostered an environment of rapid shifts in architecture, lackluster documentation, and an insufficient focus on matters of security.
We have to pay the piper eventually, and now that bill is coming due. Tyler Sorensen, a security researcher at Trail of Bits, has found a significant vulnerability that impacts GPUs from many major hardware manufacturers. On affected devices, GPU memory is not protected as well as a system’s main memory, allowing it to be eavesdropped on with very little effort. Named LeftoverLocals, the vulnerability can reveal private information, like chat transcripts with large language models, without any special privileges on a system.
GPUs manufactured by Apple, Qualcomm, AMD, and Imagination are known to be vulnerable to LeftoverLocals. When running code on a GPU, much of the data is stored in an optimized GPU memory region called local memory. It was discovered that a user who is able to run any GPU compute application (via OpenCL, Vulkan, or Metal, for example) can eavesdrop on the contents of local memory used by other applications on the system, without escalated privileges. The attack can be implemented in fewer than 10 lines of code and is simple enough for even an inexperienced programmer to carry out.
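To make the mechanism concrete, here is a minimal toy model in plain Python. It is not GPU code and it is not the researchers' proof of concept. It assumes, as a simplification, that local memory behaves like a scratchpad that persists between kernel launches unless something clears it. Every name in it (ToyComputeUnit, victim_kernel, listener_kernel, clear_on_exit) is hypothetical and invented for illustration.

```python
# Toy model only: plain Python, no real GPU API is used.
# All class and function names here are hypothetical.

class ToyComputeUnit:
    """Stand-in for one GPU compute unit and its local memory."""

    def __init__(self, size: int, clear_on_exit: bool = False):
        # Scratchpad that persists across kernel launches (assumption).
        self.local_memory = bytearray(size)
        # Assumed mitigation switch: wipe the scratchpad after each kernel.
        self.clear_on_exit = clear_on_exit

    def run(self, kernel):
        """Run a 'kernel' (a Python callable) against the local memory."""
        try:
            return kernel(self.local_memory)
        finally:
            if self.clear_on_exit:
                # Zero the scratchpad in place before the next kernel runs.
                self.local_memory[:] = bytes(len(self.local_memory))


def victim_kernel(local_memory: bytearray) -> None:
    """Writes working data into local memory, as an LLM kernel might."""
    secret = b"chat transcript fragment"
    local_memory[: len(secret)] = secret


def listener_kernel(local_memory: bytearray) -> bytes:
    """Reads local memory without writing to it first."""
    return bytes(local_memory)


def leaked_bytes(clear_on_exit: bool) -> bytes:
    """Run the victim, then the listener, and return what the listener saw."""
    unit = ToyComputeUnit(size=64, clear_on_exit=clear_on_exit)
    unit.run(victim_kernel)
    # Strip trailing zero padding so only leftover data remains.
    return unit.run(listener_kernel).rstrip(b"\x00")


if __name__ == "__main__":
    print("without clearing:", leaked_bytes(clear_on_exit=False))
    print("with clearing:   ", leaked_bytes(clear_on_exit=True))
```

In this sketch the listener never touches the victim's process. It simply reads a region that nobody wiped, which is why no special privileges are involved. Clearing the scratchpad between kernels is one plausible way to close the gap in the toy model; how actual vendor patches work is not described here.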
Further complicating the matter, it is exceedingly difficult to determine if an application is using GPU local memory, leaving users uncertain whether an application may be impacted by LeftoverLocals. It is similarly challenging to determine if another user is reading the local memory used by an application. This is very bad news from a security standpoint: there is an easy-to-implement exploit, and if we are being targeted, we are virtually blind to that fact.
At the time of writing, Apple, Qualcomm, and Imagination have released patches that protect some, but not all, of their GPUs from the exploit. AMD devices are still impacted, but the company is hard at work on a fix. If you happen to have an NVIDIA or Arm GPU, you can rest easy, as those devices are not impacted by LeftoverLocals. In any case, we hope that this exploit will be a wake-up call to GPU manufacturers. Progress must continue, but security cannot be taken lightly in the process.