Step 9: AI/LLM
Guardrails
📌 One-line summary
Safety checks placed around model inputs and outputs to keep behavior within policy.
💡 Easy explanation
Safety checks on input and output that keep the AI from going off-policy. Profanity filters and banned-topic blocks are both guardrails.
✨ Example
Safety checks on input and output:
🛡️ Mask PII in input
↓
🤖 LLM call
↓
🛡️ Filter forbidden words in output
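The three-step flow above can be sketched as a small pipeline. This is a minimal illustration, not a production filter: the regexes, the banned-word set, and the `call_llm` parameter are all hypothetical stand-ins (a real system would use a proper PII detector and load its keyword list from configuration).

```python
import re

# Hypothetical patterns for illustration only; real PII detection needs more.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b\d{2,3}-\d{3,4}-\d{4}\b")
BANNED = {"internal-project-x", "secret"}  # in practice, load from env/config

def mask_pii(text: str) -> str:
    """Input guardrail: replace emails and phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

def filter_output(text: str) -> str:
    """Output guardrail: refuse responses containing banned keywords."""
    lowered = text.lower()
    if any(word in lowered for word in BANNED):
        return "I can't share that."
    return text

def guarded_call(user_input: str, call_llm) -> str:
    """Wrap any LLM call (passed in as a function) with both guardrails."""
    return filter_output(call_llm(mask_pii(user_input)))
```

`call_llm` is any function taking a prompt and returning text, so the same wrapper works regardless of which model API sits in the middle.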
⚡ Vibe coding prompt examples
>_
Write an input guardrail that masks PII (emails, phone numbers, national IDs).
>_
Build a post-processing guardrail that blocks output containing internal keywords; load the keyword list from env.
>_
Write code that calls a small classifier model as a guardrail to detect prompt injection or jailbreak attempts.
Try these prompts in your AI coding assistant!
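For the third prompt, a classifier-based guardrail screens the prompt before it ever reaches the model. The sketch below is a stand-in: `score_injection` uses a keyword heuristic so the example stays self-contained, but in a real system it would call a small fine-tuned classifier model, and the phrase list and threshold are assumptions.

```python
# Phrases typical of injection/jailbreak attempts (illustrative, not exhaustive).
SUSPICIOUS = (
    "ignore previous instructions",
    "ignore all previous",
    "disregard your system prompt",
    "jailbreak",
)

def score_injection(prompt: str) -> float:
    """Return a pseudo-probability that the prompt is an injection attempt.
    A real guardrail would replace this with a classifier model's score."""
    lowered = prompt.lower()
    hits = sum(phrase in lowered for phrase in SUSPICIOUS)
    return min(1.0, hits / 2)

def injection_guardrail(prompt: str, threshold: float = 0.5) -> bool:
    """True if the prompt should be blocked before reaching the LLM."""
    return score_injection(prompt) >= threshold
```

Keeping the scorer behind a single function makes it easy to swap the heuristic for a model call later without touching the blocking logic.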