When does prompt caching deliver the greatest cost savings?
When a large, stable prefix (such as a long system prompt or reference document) is reused across many requests
When the user message is very short, reducing input token cost
When you use streaming, because cached tokens stream faster
When requests are sent in parallel, sharing compute across multiple users