Pentesting AI systems
Pentesting AI systems has become essential, driven by mandates from regulations such as the EU AI Act and the recent Executive Order on AI from the White House. These directives require rigorous testing of foundational models and high-risk AI applications.
Our course is designed to equip security professionals with the skills to thoroughly red-team AI systems, addressing traditional security vulnerabilities and novel adversarial machine learning and Responsible AI issues. This comprehensive approach ensures that AI system failures are identified and addressed before deployment.
The course offers a hands-on learning environment where participants can experiment with custom-built AI applications. Practical exercises challenge them to red-team these applications, followed by sessions on strengthening AI defenses. The training covers a range of prompting strategies for large language models, from zero-shot to more complex methods like Retrieval Augmented Generation. We utilize open-source tools to demonstrate effective red-teaming strategies. The course also examines various threat taxonomies and failure models, using the frameworks of MITRE ATT&CK and the NIST Cybersecurity Framework to distinguish between intended and unintended failure modes.
A significant focus of our course is the OWASP LLM Top 10, a set of guidelines that identify the most critical security risks to large language models. This component of the curriculum is dedicated to deep-diving into each of these top security threats, providing participants with the knowledge to identify, exploit, and mitigate vulnerabilities in line with the latest industry standards. The top 10 risks include:
- Input Data Manipulation
- Malicious Model Training
- Model Theft
- Data Privacy Violations
- Inference Attacks
- Model Inversion Attacks
- Misuse of Automated Outputs
- Supply Chain Attacks
- Weakness in Transfer Learning
- Adversarial Sample Attacks
By the end of the course, participants will have mastered various attack strategies, including:
- Perturbation
- Poisoning
- Model inversion
This knowledge is vital for text-generating systems and extends to multimodal models such as Whisper and Stable Diffusion. Understanding these diverse AI technologies is crucial because it enables security professionals to apply red teaming techniques across various platforms and modalities, from audio processing to image generation. This broad competence ensures that professionals are well-equipped to anticipate and mitigate potential threats in different AI applications, enhancing overall system robustness and reliability.
In the final modules, we’ll explore mitigation strategies against advanced threats and discuss red teaming requirements under current and upcoming regulations. Participants will also receive a comprehensive checklist to assess potential harms, categorized by:
- Base model
- AI application
- Any additional features