Hacking Generative AI

By Ido Kilovaty

Generative AI platforms, like ChatGPT, hold great promise in enhancing human creativity, productivity, and efficiency. However, generative AI platforms are prone to manipulation. Specifically, they are susceptible to a new type of attack called “prompt injection.” In prompt injection, attackers carefully craft their input prompt to manipulate AI into generating harmful, dangerous, or illegal content as output. Examples of such outputs include instructions on how to build an improvised bomb, how to make meth, how to hotwire a car, and more. Researchers have also been able to make ChatGPT generate malicious code. This article asks a basic question: do prompt injection attacks violate computer crime law, mainly the Computer Fraud and Abuse Act? This article argues that they do. Prompt injection attacks lead AI to disregard its own hard-coded content generation restrictions, which allows the attacker to access portions of the AI that are beyond what the system’s developers authorized. Therefore, this constitutes the criminal offense of accessing a computer in excess of authorization. Although prompt injection attacks could run afoul of the Computer Fraud and Abuse Act, this article offers ways to distinguish serious acts of AI manipulation from less serious ones, so that prosecution would only focus on a limited set of harmful and dangerous prompt injections.

Loyola of Los Angeles Law Review, Vol. 58, 2025, Kilovaty, Ido, Hacking Generative AI (March 1, 2024). Loyola of Los Angeles Law Review, Vol. 58, 2025,

download

Social SciencesRead-Me.OrgAugust 30, 2024AI, ChatGPT, Manipulation

CRIME