Question 1

Why dedicate patterns to performance UX?

Accepted Answer

Long-running model work breaks a user's flow unless the interface shows honest progress and makes limits visible. These patterns prevent silent failures and surprise throttling.

Question 2

What should users see during long generations?

Accepted Answer

Elapsed time, staged progress, a cancel control, and optional step detail. An indeterminate spinner alone is not enough for multi-minute agent jobs.
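
As a rough illustration of the state such a progress panel needs, here is a minimal Python sketch. The `JobProgress` class, its field names, and the stage names are all hypothetical and not taken from any particular framework; a real UI would render this state rather than format strings.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JobProgress:
    """Holds what a progress panel needs: stages, elapsed time, cancel state."""
    stages: List[str]                      # hypothetical stage names
    started_at: float = field(default_factory=time.monotonic)
    current: int = 0                       # index of the stage in progress
    cancelled: bool = False

    def advance(self) -> None:
        # Move to the next stage; ignored after cancel, clamped at the end.
        if not self.cancelled:
            self.current = min(self.current + 1, len(self.stages))

    def cancel(self) -> None:
        self.cancelled = True

    def status_line(self, now: Optional[float] = None) -> str:
        # `now` is injectable so the elapsed clock can be tested deterministically.
        now = time.monotonic() if now is None else now
        elapsed = int(now - self.started_at)
        clock = f"{elapsed // 60}:{elapsed % 60:02d}"
        if self.cancelled:
            return f"Cancelled after {clock}"
        if self.current >= len(self.stages):
            return f"Done in {clock}"
        step = self.stages[self.current]
        return f"Step {self.current + 1} of {len(self.stages)}: {step} ({clock} elapsed)"
```

The status line always carries the elapsed clock, so even a stage that stalls still shows the user that time is passing and that cancel remains available.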

Question 3

How do cost and token indicators help?

Accepted Answer

They set expectations before spend accumulates, which matters most for API-backed tools and team admins. A sudden hard stop without warning feels punitive.
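
One way to surface cost before it accumulates is to estimate each request up front and compare it with a budget. The sketch below assumes per-1,000-token pricing and an 80% warning threshold; the `Pricing` type, both function names, and every number are illustrative assumptions, not real provider rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    # Hypothetical prices in USD per 1,000 tokens; substitute real rates.
    input_per_1k: float
    output_per_1k: float


def estimate_cost(input_tokens: int, max_output_tokens: int, pricing: Pricing) -> float:
    """Upper-bound cost of one request, assuming the full output budget is used."""
    input_cost = (input_tokens / 1000) * pricing.input_per_1k
    output_cost = (max_output_tokens / 1000) * pricing.output_per_1k
    return input_cost + output_cost


def budget_notice(spent: float, estimate: float, budget: float,
                  warn_at: float = 0.8) -> str:
    """Classify a pending request against the budget before it runs."""
    projected = spent + estimate
    if projected > budget:
        return "blocked"   # would exceed the budget: explain why and offer options
    if projected >= warn_at * budget:
        return "warn"      # approaching the limit: show a soft warning first
    return "ok"
```

Checking the projected total rather than the amount already spent is what lets the warning appear before the hard stop instead of after it.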

Question 4

When should rate-limit warnings appear?

Accepted Answer

Before users hit the wall: show soft warnings with the reset time, an upgrade path, or a queue position. Errors that arrive only after work is lost destroy trust.
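
A soft warning of this kind can be sketched as a small function over quota state. The function name, the 20% threshold, and the message wording below are assumptions chosen for illustration; the inputs stand in for whatever quota data the backend actually exposes.

```python
from typing import Optional


def rate_limit_warning(remaining: int, limit: int, reset_in_seconds: int,
                       warn_fraction: float = 0.2) -> Optional[str]:
    """Return a soft warning once remaining quota falls to warn_fraction of the limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if remaining > warn_fraction * limit:
        return None  # plenty of quota left: stay quiet
    # Round up so the stated wait is never shorter than the real one.
    minutes = -(-reset_in_seconds // 60)
    if remaining <= 0:
        return f"Limit reached. Resets in about {minutes} min."
    return f"{remaining} of {limit} requests left. Resets in about {minutes} min."
```

Returning `None` above the threshold keeps the interface quiet during normal use, so the warning stays meaningful when it does appear.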

Question 5

What is the difference between caching indicators and skeleton loaders?

Accepted Answer

A caching indicator signals that a reply reused prior work and arrived instantly or nearly so. A skeleton loader covers a cold load while content is still being produced. Label cached replies honestly so users know why a reply was fast.
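
The distinction can be made concrete by having the response layer report whether a reply came from cache, so the UI can show a cache badge instead of a skeleton. This is a minimal sketch: `CachedResponder`, `Reply`, and the `generate` callback are hypothetical names, and the key normalization is deliberately naive.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Reply:
    text: str
    from_cache: bool   # drives a "served from cache" badge in the UI


class CachedResponder:
    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate          # hypothetical slow model call
        self._cache: Dict[str, str] = {}

    def ask(self, prompt: str) -> Reply:
        key = prompt.strip().lower()       # naive normalization, for illustration only
        if key in self._cache:
            # Warm path: reuse prior work and say so.
            return Reply(self._cache[key], from_cache=True)
        # Cold path: the UI would show a skeleton while this call runs.
        text = self._generate(prompt)
        self._cache[key] = text
        return Reply(text, from_cache=False)
```

Carrying `from_cache` on the reply itself means the interface never has to infer cache hits from timing.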

Question 6

Do performance patterns apply to voice and video AI?

Accepted Answer

Yes. Processing-time estimates and running meters matter for transcription and media jobs, where latency is multiplicative (it grows with the length of the input) rather than tied to a single token stream.
