Prompt Injection
An attack where malicious input manipulates an AI model into ignoring its instructions or producing unintended outputs.
An attack where malicious input manipulates an AI model into ignoring its instructions or producing unintended outputs.
Prompt injection is a security vulnerability specific to large language models (LLMs) where an attacker crafts input that causes the model to override its system prompt, ignore safety guardrails, or execute unintended actions. There are two main forms:
Prompt injection is particularly dangerous for AI agents with tool access, where a successful injection could cause the agent to read unauthorized files, send emails, or modify data.
As organizations deploy AI agents with increasing autonomy and tool access, prompt injection becomes a critical attack vector. Security teams need to evaluate how AI tools handle adversarial inputs as part of their vendor risk assessment.
We use cookies and similar technologies to improve your experience, analyze traffic, and support marketing. Cookie Policy