Microsoft Defender for AI offers real-time threat detection and response for generative AI applications. It is generally available and covers the Azure OpenAI and Azure AI Model Inference services on the Azure commercial cloud, providing activity monitoring and prompt evidence. This blog post explains Defender for AI alerts, how they map to MITRE ATT&CK, and how to mitigate them.

It also outlines five key generative AI security threats: poisoning, evasion, functional extraction, inversion, and prompt injection. Poisoning attacks corrupt training data; evasion attacks bypass security controls; functional extraction aims to replicate the model; inversion attacks infer sensitive data; and prompt injection manipulates AI behavior through malicious prompts.

Defender for AI generates alerts for these threats. A credential theft attempt detected on an AI model, classified as medium severity and mapped to inversion, warns that credentials have appeared in AI responses, often because they were present in training data or were elicited by a crafted prompt. Mitigation involves training-data hygiene, output filtering, Zero Trust principles, and prompt injection defenses.

A blocked jailbreak attempt alert, medium severity and mapped to prompt injection, signifies that Azure AI Content Safety Prompt Shields prevented an attempt to manipulate the AI's safeguards. To avoid recurrence, continuous use of Prompt Shields, retrieval isolation, regular testing, and Zero Trust are recommended.

A detected jailbreak attempt alert, also medium severity and prompt injection, means Prompt Shields identified an attempt that was not fully blocked, typically because of content-filtering configuration or low-confidence settings. Similar mitigations apply: consistent Prompt Shields usage, retrieval isolation, continuous testing, and Zero Trust.

Finally, a high-severity alert for a corrupted AI application directing users to phishing sites indicates a poisoning attack: the AI system, model, or data has been compromised so that responses share malicious URLs.
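The credential-theft mitigation above mentions output filtering. As a minimal sketch of what such a filter might look like (the patterns, function name, and placeholder below are illustrative, not a Defender for AI feature), a response scrubber could run regexes for common secret formats before model output is returned to the user:

```python
import re

# Illustrative secret patterns; a real deployment would use a broader,
# maintained set (or a dedicated secret-scanning service).
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key ID shape
    re.compile(r"(?i)api[_-]?key\s*[:=]\s*\S{16,}"),    # generic "api_key = ..." assignment
    re.compile(r"(?i)AccountKey=[A-Za-z0-9+/=]{40,}"),  # Azure Storage connection string fragment
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
]

def scrub_response(text: str, placeholder: str = "[REDACTED]") -> tuple:
    """Replace credential-like substrings in a model response.

    Returns (scrubbed_text, leaked) where leaked is True if any
    pattern matched, so callers can also raise an internal alert.
    """
    leaked = False
    for pattern in SECRET_PATTERNS:
        text, count = pattern.subn(placeholder, text)
        leaked = leaked or count > 0
    return text, leaked
```

Redacting rather than blocking keeps the response usable while still letting the application log the incident for investigation.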
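The jailbreak alerts above rely on Azure AI Content Safety Prompt Shields screening prompts before they reach the model. The sketch below shows roughly how the Prompt Shields REST operation is invoked from an application; the api-version, payload shape, and endpoint here are assumptions to verify against the current Content Safety documentation, and the helper name is made up for illustration:

```python
import json
from urllib import request

API_VERSION = "2024-09-01"  # assumed api-version; check current Content Safety docs

def build_shield_request(endpoint: str, key: str, user_prompt: str, documents=None):
    """Build (but do not send) an HTTP request for the shieldPrompt operation.

    Assumed request body: {"userPrompt": ..., "documents": [...]}; the
    response is expected to flag detected attacks per analyzed input.
    """
    url = f"{endpoint}/contentsafety/text:shieldPrompt?api-version={API_VERSION}"
    body = json.dumps({"userPrompt": user_prompt,
                       "documents": documents or []}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        method="POST",
        headers={"Ocp-Apim-Subscription-Key": key,
                 "Content-Type": "application/json"},
    )
```

Calling the operation on every user prompt (and on retrieved documents, for retrieval isolation) is what lets attempts be blocked rather than merely detected after the fact.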
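Since the poisoning alert concerns an application sharing malicious URLs, one defense-in-depth layer (again illustrative, not a Defender for AI capability) is to check URLs in model output against a domain allowlist before rendering them; the allowlist and function below are hypothetical:

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of domains the application is permitted to link to.
ALLOWED_DOMAINS = {"contoso.com", "learn.microsoft.com"}

URL_RE = re.compile(r"https?://[^\s)\"']+")

def flag_untrusted_urls(text: str) -> list:
    """Return URLs in a model response whose host is not allowlisted.

    Subdomains of allowlisted domains are accepted; everything else is
    flagged so the application can strip the link or raise an alert.
    """
    flagged = []
    for url in URL_RE.findall(text):
        host = urlparse(url).hostname or ""
        if not any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS):
            flagged.append(url)
    return flagged
```

An allowlist is deliberately strict: a poisoned model can emit arbitrary attacker-controlled domains, so blocklisting known-bad hosts alone is insufficient.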