Red Teaming - Sparring & Support


We hack your agents before others do.
With Agentic Red Teaming, we test your GenAI applications, RAG pipelines and tool-calling agents under real attack conditions. The aim: to make vulnerabilities visible, prioritise risks and enable you to implement effective fixes quickly - including sparring by our experts.



What you will learn about Red Teaming

  • Intro & Red Teaming Fundamentals
    Core concepts of red teaming, ethics, scope, and rules for safe, controlled testing
  • Framing & Safety Fundamentals
    Threat modeling & scoping, rules of engagement; safe handling of data (PII/secrets); measurable success criteria for each run
  • LLM Jailbreaking – Core Techniques
    Overview of prompt injection & jailbreak patterns, plus tool/permission abuse (conceptual tactics, no exploit how-tos)
  • Advanced Attacks
    Multi-turn tactics, multimodal risks (text/image/audio), converter vectors (format & channel switching), and other current patterns; typical detection signals
  • Automation & Metrics – Prompt Sets, Replay Harness & Scoring
    Evaluation rubrics, regression suites, comparability across runs/models
  • Defense, Detection & Reporting – Guardrails
    Policies (input and output controls, approval gates), AI compliance (e.g., EU AI Act/GDPR/Swiss FADP), reporting (risk rating, evidence, executive summary) & mitigation planning (short-, mid-, long-term roadmap)



After this course, your business and tech teams will better understand:


• AI security fundamentals

• Agentic AI fundamentals

• Prompt Injection & Jailbreaks – How users can trick an AI into ignoring its rules and doing things it should not do
• RAG Poisoning – How false or manipulated documents are injected into a system’s knowledge base to produce wrong answers
• Data Exfiltration – How sensitive data can be leaked through an AI system without being noticed

• PII Leakage – When an AI exposes personal data that should be protected
• Tool & Function-Call Abuse – When an AI is manipulated into executing connected tools or systems in dangerous ways
• Insecure Output Handling – When AI responses are used without being checked for risks such as code, commands, or sensitive content
• Long-Context Contamination – A security and quality issue where earlier or hidden inputs in long contexts push the model into unsafe, incorrect, or manipulable behavior
• Cost Abuse & Denial of Service (DoS) – When an AI system is flooded with requests to drive up costs or make it unavailable

• Hallucination Induction – Techniques that deliberately cause an AI to generate convincing but false information


How we work

  1. Scoping & Threat Model - Objectives, protection needs, architecture, data flows, policies.
  2. Attack Simulation - Manual exploits, automated red-team agents, fuzzing & scenario testing (prod-like in a safe sandbox).
  3. Purple-Team Fix Sprint - Joint hardening: RAG controls, tool safeguards, guardrails, prompt & policy design, safe output handling.
  4. Report, Retest & Enablement - Action plan, KPIs, go-live gates, runbook, training.

We build agentic end-to-end automation every day, test chatbots and agents, so we know where it breaks in the real world. We want to share this valuable experiences with you.


Our promise: hands-on, measurable risk reduction, and awareness of the risks and migitation methods that exist.

Ready to leverage your AI potential?

Contact us today to schedule a free initial consultation and take the first step toward realizing high-value AI use cases in your organization.

Nina Habicht

CEO & Founder

Interested? Contact us

Our AI Experts will help you with your challenge.

+ 41 78 874 04 08

Chat with us


Your Questions Answered

  • Do you test Microsoft Copilot / Google Vertex / OpenAI- or Llama-based apps?

    Yes. We test platform copilots and custom LLM apps (OpenAI, Azure OpenAI, Anthropic, Google, Mistral, Llama, etc.), including tool-calling and retrieval layers.

  • Will you see our production data?

    No. We prefer prod-like sandboxes or masked data. If production is required, we define strict scopes, rate limits, and logging.

  • How do you handle sensitive data/PII?

    We follow least privilege, data minimization, encrypted transit/storage, and provide a data-processing addendum upon request.

  • What about compliance (EU AI Act, OWASP GenAI, NIST AI RMF)?

    Our findings map to these frameworks; your report includes a coverage matrix and recommended next steps.

  • Can you help fix issues, not just find them?

    Yes. That’s the Purple-Team Fix Sprint — we co-implement mitigations with your team and verify via retest.

  • Where do you operate?

    We serve customers across DACH (Switzerland, Germany, Austria) and the EU; remote-first with on-site options.

  • How soon can we start?

    Typical lead time is short; a scoping call is enough to confirm targets and data handling.