Reinforcement Learning from Human Feedback (RLHF)
One-line summary
A reinforcement learning technique that aligns model behavior using human preference feedback.
Easy explanation
A training method where humans compare model answers and pick 'this answer is better,' and the model is then trained to favor the kind of answer people preferred. It's a big part of why ChatGPT answers politely instead of chaotically.
Example
Tuning a model with human feedback
A
Polite answer
B
Rude answer
A human chooses A
The model learns in the direction of A
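The A-versus-B choice above can be sketched in code. What follows is a minimal illustration, not how any particular system implements RLHF: it assumes a tiny linear reward model and a pairwise (Bradley-Terry style) preference loss, which is one common way to turn "a human chose A" into a training signal. The feature vectors, the function names (preference_loss, train_reward_weights, reward), and the learning-rate and epoch values are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss: small when the chosen answer scores above the rejected one."""
    return -math.log(sigmoid(reward_chosen - reward_rejected))


def reward(weights, features):
    """Linear reward: dot product of weights and answer features."""
    return sum(w * f for w, f in zip(weights, features))


def train_reward_weights(pairs, n_features, lr=0.5, epochs=200):
    """Fit the linear reward by gradient descent on the pairwise loss."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for chosen, rejected in pairs:
            diff = [c - r for c, r in zip(chosen, rejected)]
            margin = sum(w * d for w, d in zip(weights, diff))
            # Gradient of -log(sigmoid(margin)) w.r.t. weights is
            # -(1 - sigmoid(margin)) * diff.
            grad_scale = -(1.0 - sigmoid(margin))
            weights = [w - lr * grad_scale * d for w, d in zip(weights, diff)]
    return weights


# Hypothetical features for each answer: [politeness, rudeness].
answer_a = [1.0, 0.0]  # polite answer
answer_b = [0.0, 1.0]  # rude answer

# One human comparison: A was chosen over B.
pairs = [(answer_a, answer_b)]
weights = train_reward_weights(pairs, n_features=2)

print("reward(A):", reward(weights, answer_a))
print("reward(B):", reward(weights, answer_b))
```

After training, the reward for A ends up higher than the reward for B. This covers only the preference-learning stage. In a full RLHF setup the language model itself is then fine-tuned with reinforcement learning (PPO is a common choice) to produce answers that this learned reward scores highly.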