
Throughput

📖 One-line summary

The number of requests or tokens that can be processed per unit of time.

In practice, throughput is how many requests a system can handle per second (or per minute). The more users you serve, the higher the throughput you need.
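As a rough illustration of the definition above, throughput can be estimated by dividing the work completed by the elapsed wall-clock time. The sketch below is a minimal example, not a benchmark harness: `measure_throughput` and `fake_handler` are hypothetical names, and `fake_handler` only simulates a model call by sleeping and counting words as "tokens".

```python
import time
from typing import Callable, Dict, Sequence


def measure_throughput(
    handler: Callable[[str], int], requests: Sequence[str]
) -> Dict[str, float]:
    """Run each request through `handler` and report requests/s and tokens/s.

    `handler` is a hypothetical stand-in for a model call; it must return
    the number of tokens it produced for the request.
    """
    start = time.perf_counter()
    total_tokens = 0
    for request in requests:
        total_tokens += handler(request)
    # Guard against a zero interval on very fast runs.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return {
        "requests": len(requests),
        "tokens": total_tokens,
        "elapsed_s": elapsed,
        "requests_per_s": len(requests) / elapsed,
        "tokens_per_s": total_tokens / elapsed,
    }


def fake_handler(prompt: str) -> int:
    # Assumption for illustration: each request takes about 1 ms and
    # yields one "token" per whitespace-separated word.
    time.sleep(0.001)
    return len(prompt.split())


if __name__ == "__main__":
    stats = measure_throughput(fake_handler, ["hello world"] * 200)
    print(
        f"{stats['requests_per_s']:.1f} req/s, "
        f"{stats['tokens_per_s']:.1f} tokens/s"
    )
```

This version processes requests one at a time, so it measures sequential throughput only. A real serving system typically handles requests concurrently or in batches, which would give different numbers.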

How many requests are processed per second?

# Throughput monitoring

▓▓▓▓▓▓▓▓▓▓ 100 req/s ✅

▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 250 req/s ⚡

▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 500 req/s 🔥

The higher the throughput, the more users the system can serve.