📖 Step 9: AI/LLM#342 / 350

Red Teaming

📖One-line summary

Testing that deliberately probes a model's vulnerabilities to verify its safety.

💡Easy explanation

A team that plays hacker, deliberately attacking an AI to find its weaknesses. They simulate worst-case scenarios before launch to strengthen defenses.

✨Example

의도적으로 약점 찾기

🔴 레드 팀 (공격)

→ "유해 콘텐츠 유도 시도"

→ "개인정보 추출 시도"

🟢 결과: 취약점 3건 발견 → 패치 완료

AI Safety

Jailbreak

Other terms in this category

Chain of Thought (CoT)