On January 2nd, in our segment on the show L’Altra Ràdio, Javier Otero and Marcos Montero, heads of Artificial Intelligence and Digital Transformation at IThinkUPC, spoke about one of the major security risks facing generative artificial intelligence: Prompt Injection.
This concept, derived from the traditional “SQL Injection” used in databases, involves inserting malicious or misleading instructions into the system to trick the AI into bypassing its security barriers and providing restricted information. During the program, Javier and Marcos explained the key aspects of this vulnerability:
- Direct Prompt Injection: This occurs when a user attempts to directly deceive the AI through language and context manipulation. A real-world example is the “Bambi” case: instead of asking directly how to make a bomb (an instruction blocked by the system), an emotional story about a nuclear engineer mother is used so that the AI, in its attempt to be empathetic, ends up revealing the process.
- Indirect Prompt Injection: In this case, the AI processes information that already contains hidden instructions. For instance, invisible commands can be included within a resume or a scientific article to force the system to ignore certain data or prioritize a specific candidate during an automated analysis.
- Constant Updates: AI models are updated almost daily to fix these weaknesses as new ethical or malicious “hacking” methods are detected. Consequently, a vulnerability that works today might be resolved by tomorrow.
If you want to learn more details about the risks of Prompt Injection and how AI model security is tested, we invite you to listen to the L’Altra Ràdio podcast (starting at the 6:35 mark).