Microsoft researchers have devised a new AI side-channel attack that uses metadata patterns to infer the topic of a user's conversation with a remote language model, even when the communication is end-to-end encrypted.
The issue, they say, affects all LLMs and poses a significant risk to users under surveillance by ISPs, governments, or other network-level threat actors, as it exposes sensitive conversations, from legal advice to medical consultations and other private topics, to eavesdropping.
“This especially poses real-world risks to users by oppressive governments where they may be targeting topics such as protesting, banned material, election process, or journalism,” Microsoft notes.
The attack, called Whisper Leak, assumes that the adversary is positioned to monitor the network traffic between the victim and the LLM. Even without being able to decrypt the traffic, the adversary can infer the conversation topic based on packet size and timing patterns in the chatbot’s responses.
The attack exploits the fact that LLMs generate responses step by step, predicting each token (a word or sub-word) from the user's input and the previously generated tokens. The tokens are then served to the client immediately or in small batches, in a 'streaming' approach.
According to Microsoft’s researchers, this influences the timing and size of the data chunks the LLM sends to the client. The communication, however, is typically encrypted using HTTP over TLS (HTTPS).
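The observer's vantage point can be sketched in a few lines. This is an illustrative simulation, not Microsoft's code: the token strings, per-token latencies, and the 29-byte framing overhead are all invented for the example, but the shape of the data an on-path adversary collects (a sequence of ciphertext sizes and inter-arrival gaps) is the point.

```python
# Sketch (assumptions noted above): what a passive on-path observer records
# while an LLM streams its reply one token per encrypted chunk.
import random
import time

def stream_response(tokens):
    """Return (ciphertext_size, inter_arrival_seconds) pairs as an observer sees them."""
    observed = []
    last = time.monotonic()
    for tok in tokens:
        time.sleep(random.uniform(0.005, 0.02))  # invented per-token generation latency
        now = time.monotonic()
        # Each chunk carries one token plus a fixed framing/TLS overhead (29 is assumed).
        observed.append((len(tok.encode()) + 29, now - last))
        last = now
    return observed

reply = ["Money", " laundering", " is", " illegal", " in", " most", " jurisdictions", "."]
for size, gap in stream_response(reply):
    print(f"ciphertext bytes: {size:3d}  gap: {gap * 1000:5.1f} ms")
```

Note that the observer never sees the tokens themselves, only the size/timing trace, and that trace is exactly what the Whisper Leak classifier consumes.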
“Modern TLS encryption schemes preserve the size relationship between plaintext and ciphertext. When data is encrypted, the resulting ciphertext size is directly proportional to the original plaintext size, plus a small constant overhead,” the researchers note in their technical paper.
Essentially, this means that while the content of the communication remains encrypted, the size of each transmitted data chunk is leaked.
“For LLM services that stream responses token by token, this size information reveals patterns about the tokens being generated. Combined with timing information between packets, these leaked patterns form the basis of the Whisper Leak attack,” the researchers explain.
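The size relationship the researchers describe can be made concrete with a toy cipher. This is not real TLS: the keystream construction and the 22-byte overhead below are invented stand-ins, but they mirror how a stream/AEAD cipher preserves plaintext length and adds only a fixed per-record cost, so longer tokens always produce proportionally longer ciphertexts.

```python
# Toy length-preserving cipher (NOT real TLS) to show why ciphertext size
# leaks plaintext size: len(ciphertext) == len(plaintext) + fixed overhead.
import hashlib
from itertools import count

def keystream(key: bytes):
    # Derive an endless byte stream from the key (illustrative construction only).
    for i in count():
        yield from hashlib.sha256(key + i.to_bytes(8, "big")).digest()

def encrypt(key: bytes, plaintext: bytes, overhead: int = 22) -> bytes:
    ks = keystream(key)
    body = bytes(p ^ next(ks) for p in plaintext)
    # 5-byte record header plus 17 bytes standing in for an auth tag = 22 bytes total.
    return b"\x17\x03\x03" + len(body).to_bytes(2, "big") + body + b"\x00" * (overhead - 5)

for token in [b"Hi", b" laundering", b" jurisdiction"]:
    ct = encrypt(b"secret-key", token)
    print(len(token), "->", len(ct))  # ciphertext is always plaintext + 22 bytes
```

Because the overhead is constant, subtracting it recovers the exact plaintext length of every streamed chunk, which is the leak the attack builds on.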
To evaluate the attack, the researchers simulated a scenario where the attacker could only observe the encrypted traffic, and trained a binary classifier to distinguish between the topic of “legality of money laundering” and background traffic.
The researchers’ experiment showed that for 17 of the 28 tested models, the classifier achieved over 98% accuracy in distinguishing the target topic, in some cases exceeding 99.9%. At these accuracy levels, an attacker could “identify 1 in 10,000 target conversations with near-zero false positives”, the researchers say.
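The classifier idea can be sketched with a hand-rolled nearest-centroid model on synthetic traces. Everything here is invented for illustration: the paper's authors trained real classifiers on recorded encrypted traffic, while this sketch fabricates traces where the target topic happens to yield longer, larger-chunked replies, then separates them from background traffic on simple summary features.

```python
# Minimal sketch of the attacker's classification step, on synthetic data
# (feature choice, trace generator, and all numbers are invented).
import random
import statistics

random.seed(0)

def features(packet_sizes):
    # Summarize a trace by mean chunk size, size spread, and chunk count.
    return (statistics.mean(packet_sizes), statistics.pstdev(packet_sizes), len(packet_sizes))

def make_trace(topic_like: bool):
    # Pretend the target topic produces longer replies with bigger chunks.
    n = random.randint(40, 60) if topic_like else random.randint(10, 30)
    base = 80 if topic_like else 45
    return [base + random.randint(-10, 10) for _ in range(n)]

train = [(features(make_trace(t)), t) for t in [True, False] * 50]
centroids = {
    label: tuple(statistics.mean(f[i] for f, l in train if l == label) for i in range(3))
    for label in (True, False)
}

def classify(trace):
    f = features(trace)
    dist = {l: sum((a - b) ** 2 for a, b in zip(f, c)) for l, c in centroids.items()}
    return min(dist, key=dist.get)

hits = sum(classify(make_trace(t)) == t for t in [True, False] * 100)
print(f"accuracy on synthetic traces: {hits / 200:.0%}")
```

The synthetic classes here are deliberately easy to separate; the striking finding in the paper is that real encrypted traces turn out to be almost as separable.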
The researchers suggest random padding, token batching, and packet injection as possible mitigation strategies. OpenAI and Microsoft Azure have implemented an additional field in streaming responses, adding a random sequence of text of variable length to mask the token length. Mistral added a new parameter with a similar effect.
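The random-padding mitigation can be sketched as follows. This is a hypothetical server-side implementation, not the actual OpenAI, Azure, or Mistral change: the chunk format and the field name "p" are invented, but the mechanism matches the described fix, appending a variable-length filler to each streamed chunk so the wire size no longer tracks token length.

```python
# Sketch of the random-padding mitigation (chunk schema and field name "p"
# are invented; real providers' obfuscation fields differ).
import json
import secrets
import string

def pad_chunk(token: str, max_pad: int = 32) -> bytes:
    filler = "".join(secrets.choice(string.ascii_letters)
                     for _ in range(secrets.randbelow(max_pad + 1)))
    # The client ignores "p"; its only job is to randomize the chunk size.
    return json.dumps({"token": token, "p": filler}).encode()

# The same token now appears on the wire with many different sizes.
sizes = {len(pad_chunk(" laundering")) for _ in range(200)}
print(f"one token now yields {len(sizes)} distinct wire sizes")
```

Token batching and packet injection attack the same signal from other directions: batching blurs per-token boundaries, while injected dummy packets pollute the timing trace.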
Users, the researchers say, should avoid discussing sensitive topics with AI chatbots on untrusted networks, use VPN services, choose providers that have implemented the mitigations, opt for non-streaming models, and stay informed about their provider’s security practices.
Related: Researchers Hack ChatGPT Memories and Web Search Features
Related: The Y2K38 Bug Is a Vulnerability, Not Just a Date Problem, Researchers Warn
Related: Researchers Earn $150,000 for L1TF Exploit Leaking Data From Public Cloud
Related: RMPocalypse: New Attack Breaks AMD Confidential Computing

