Performance AI UX patterns

Performance patterns surface latency, progress, rate limits, and resource usage so users trust the system is working.

Start here

Core patterns for performance UX.

12 patterns

Frequently asked questions

Why dedicate patterns to performance UX?

Long-running model work breaks the user's flow unless the interface reports progress and limits honestly. These patterns prevent silent failures and surprise throttling.

What should users see during long generations?

Show elapsed time, staged progress, a cancel control, and optional step detail. An indeterminate spinner alone is not enough for multi-minute agent jobs.
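As a rough illustration, the status line for a staged job could be assembled as follows. This is a minimal sketch: the helper names `format_elapsed` and `progress_line` and the stage names are hypothetical, and the cancel control is left to the UI layer.

```python
from typing import Optional, Sequence


def format_elapsed(seconds: float) -> str:
    """Render elapsed seconds as '42s' or '3m 07s'."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m {secs:02d}s" if minutes else f"{secs}s"


def progress_line(stages: Sequence[str], current: int, elapsed_s: float,
                  detail: Optional[str] = None) -> str:
    """Build one status line: stage position, stage name, elapsed time, optional detail."""
    if not 0 <= current < len(stages):
        raise ValueError("current must index into stages")
    line = (f"Step {current + 1} of {len(stages)}: {stages[current]} "
            f"({format_elapsed(elapsed_s)} elapsed)")
    if detail:
        # Optional step detail, e.g. the file or query being processed.
        line += f" | {detail}"
    return line


# Hypothetical stage names for an agent job.
print(progress_line(["Planning", "Searching", "Drafting"], 1, 65))
```

Naming the stage and its position gives users a sense of how far along the job is, even when no percentage can be computed.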

How do cost and token indicators help?

They set expectations before spend accumulates, especially for API-backed tools and team admins. A sudden hard stop without warning feels punitive.
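One way to drive such an indicator is to estimate cost from token counts and compare running spend against a budget. The sketch below makes several assumptions: the per-million-token prices, the 80% warning threshold, and the function names are all made up for illustration and are not taken from any real provider.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in_per_million: float, price_out_per_million: float) -> float:
    """Estimated cost in currency units, given per-million-token prices (assumed values)."""
    return (input_tokens / 1_000_000 * price_in_per_million
            + output_tokens / 1_000_000 * price_out_per_million)


def budget_status(spent: float, budget: float, warn_at: float = 0.8) -> str:
    """Classify spend so the UI can warn before a hard stop.

    warn_at is a hypothetical threshold: the fraction of budget at which to warn.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    if spent >= budget:
        return "exhausted"
    if spent / budget >= warn_at:
        return "warning"
    return "ok"


# Illustrative prices only.
cost = estimate_cost(1_000_000, 500_000, price_in_per_million=2.0, price_out_per_million=8.0)
print(cost, budget_status(cost, budget=10.0))
```

The "warning" state is the point of the pattern: it gives users and admins a chance to react before the "exhausted" state blocks their work.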

When should rate-limit warnings appear?

Before users hit the limit: show a soft warning with the reset time, an upgrade path, or the queue position. An error that arrives only after work is lost destroys trust.
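A soft warning of this kind might be computed as in the sketch below. The function name `rate_limit_notice`, the 20% threshold, and the message wording are assumptions for illustration; a real product would take the remaining quota and reset time from whatever its backend reports.

```python
import math
from typing import Optional


def rate_limit_notice(remaining: int, limit: int, reset_in_s: float,
                      warn_fraction: float = 0.2) -> Optional[str]:
    """Return a warning once remaining quota drops to warn_fraction of the limit.

    Returns None while the user is comfortably under the limit.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    minutes = math.ceil(reset_in_s / 60)
    if remaining <= 0:
        return f"Limit reached. Resets in {minutes} min."
    if remaining / limit <= warn_fraction:
        return f"{remaining} of {limit} requests left. Resets in {minutes} min."
    return None


print(rate_limit_notice(10, 100, 600))
```

Because the notice appears while some quota remains, users can finish or save their current work before the limit takes effect.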

What is the difference between caching indicators and skeleton loaders?

A caching indicator signals instant or near-instant reuse of prior work. A skeleton loader covers a cold load while content is still being fetched or generated. Label cached replies honestly so users know why a reply was fast.
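The distinction can be sketched in code: a reply carries a flag saying whether it came from cache, and the UI picks an indicator from that flag. The names `Reply`, `ReplyCache`, and `indicator_for` are hypothetical, and the in-memory dictionary stands in for whatever cache a real system uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Reply:
    text: str
    from_cache: bool  # lets the UI explain why a reply was fast


class ReplyCache:
    """Wraps a generator function and records whether each reply was reused."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate
        self._store: Dict[str, str] = {}

    def get(self, prompt: str) -> Reply:
        if prompt in self._store:
            return Reply(self._store[prompt], from_cache=True)
        text = self._generate(prompt)  # cold path: the UI shows a skeleton meanwhile
        self._store[prompt] = text
        return Reply(text, from_cache=False)


def indicator_for(reply: Optional[Reply]) -> str:
    """None means the reply is still loading, so a skeleton is shown."""
    if reply is None:
        return "skeleton"
    return "cached" if reply.from_cache else "fresh"


cache = ReplyCache(lambda prompt: prompt.upper())  # stand-in for a model call
print(indicator_for(None), indicator_for(cache.get("hi")), indicator_for(cache.get("hi")))
```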

Do performance patterns apply to voice and video AI?

Yes. Processing-time estimates and running meters matter for transcription and media jobs, where latency scales with the length of the input media rather than arriving as a single token stream.
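For a media job, a simple estimate multiplies the media duration by a real-time factor, and a running meter reports how much of the media has been processed. The sketch below assumes a hypothetical factor of 0.25 (one minute of audio takes about 15 seconds to process); a real factor would have to be measured for the system in question.

```python
def estimate_processing_seconds(media_seconds: float, realtime_factor: float) -> float:
    """Estimated processing time: media duration times an assumed real-time factor."""
    if media_seconds < 0 or realtime_factor <= 0:
        raise ValueError("media_seconds must be >= 0 and realtime_factor > 0")
    return media_seconds * realtime_factor


def clock(seconds: float) -> str:
    """Render seconds as m:ss."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


def meter(processed_media_seconds: float, media_seconds: float) -> str:
    """Running meter text, e.g. '2:30 of 10:00 processed (25%)'."""
    pct = 100 if media_seconds == 0 else round(100 * processed_media_seconds / media_seconds)
    return f"{clock(processed_media_seconds)} of {clock(media_seconds)} processed ({pct}%)"


# A 10-minute recording with an assumed real-time factor of 0.25.
print(estimate_processing_seconds(600, 0.25), meter(150, 600))
```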