Performance docs

Latency and performance, using real logs.

These numbers come from analysis_logs.latency_ms in production D1, not invented marketing claims. They are observed telemetry for the last 30 days as of 2026-05-23, computed as nearest-rank percentiles over successful logged requests. They are not a contractual SLA.
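To make the percentile method concrete, here is a minimal sketch of a nearest-rank percentile in Python. The function name and the sample values are illustrative assumptions, not the production query or real log data.

```python
import math

def nearest_rank_percentile(values, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not values:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    # 1-based rank: the smallest rank covering at least p% of the samples.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]

# Made-up latencies in ms, for illustration only (not production data).
sample = [4700, 5200, 6100, 6900, 7400, 8800, 9100, 10400]
p50 = nearest_rank_percentile(sample, 50)  # rank ceil(4.0) = 4
p95 = nearest_rank_percentile(sample, 95)  # rank ceil(7.6) = 8
```

Nearest-rank always returns a value that was actually observed and never interpolates, so with a small sample the high percentiles can coincide with the maximum.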


Observed latency

Modality    n   p50       p95        p99        min–max
image      25   6861 ms   9115 ms    10488 ms   4683–10488 ms
text      199   8616 ms   11180 ms   12912 ms   2704–14454 ms

Source: Remote D1 analysis_logs.latency_ms. Observed production telemetry, not a contractual SLA. Latency in public demos or during provider outages can differ from these values.

Headers

JSON API and demo responses expose Server-Timing: total;dur=<ms> and Access-Control-Expose-Headers: X-Request-Id, Server-Timing so browser clients can measure actual end-to-end request duration.
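As a rough illustration of reading that header, the sketch below parses a Server-Timing value into a mapping from metric name to duration in milliseconds. The function name and the sample header value are assumptions for illustration. The parser is deliberately simple and does not handle commas or semicolons inside a quoted desc parameter.

```python
def parse_server_timing(header_value):
    """Map each metric name to its dur value in ms (None when dur is absent)."""
    durations = {}
    for entry in header_value.split(","):
        parts = [part.strip() for part in entry.split(";")]
        name = parts[0]
        if not name:
            continue  # skip empty entries such as a trailing comma
        dur = None
        for param in parts[1:]:
            key, _, value = param.partition("=")
            if key.strip().lower() == "dur":
                try:
                    dur = float(value.strip().strip('"'))
                except ValueError:
                    dur = None  # malformed duration: treat as missing
        durations[name] = dur
    return durations

# Example header value in the documented shape; the number is illustrative.
total_ms = parse_server_timing("total;dur=8616")["total"]
```

A client could log total_ms next to the X-Request-Id value to correlate a slow request with server-side records.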

Integration guidance

  • Use a 30s timeout for text.
  • Use a 60s timeout for image.
  • Use async queues for image moderation when batching at scale.
  • Retry transient 503 responses with exponential backoff; do not retry validation failures.
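The timeout and retry guidance above could be sketched as follows. This is a minimal sketch under stated assumptions: call_with_retry, the send callable, the attempt count, the delay values, and the added jitter are all hypothetical choices, not part of the API. Only the 30s and 60s timeouts and the retry-on-503 rule come from the list above.

```python
import random
import time

TIMEOUT_S = {"text": 30, "image": 60}  # per-modality timeouts from the guidance
RETRYABLE_STATUS = {503}               # only transient 503 is retried

def call_with_retry(send, modality, max_attempts=4, base_delay_s=1.0,
                    max_delay_s=20.0, sleep=time.sleep):
    """Call send(timeout_s) -> (status, body), retrying 503 with backoff.

    `send` is a hypothetical callable wrapping whatever HTTP client is in use.
    Any non-503 status, including validation errors, is returned immediately.
    """
    timeout_s = TIMEOUT_S[modality]
    for attempt in range(max_attempts):
        status, body = send(timeout_s)
        if status not in RETRYABLE_STATUS:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts: return the last 503 to the caller
        # Exponential backoff: 1s, 2s, 4s, ... capped at max_delay_s.
        delay = min(max_delay_s, base_delay_s * 2 ** attempt)
        # Jitter (an assumption, not from the guidance) spreads out retries.
        sleep(delay + random.uniform(0, delay / 2))
    return status, body
```

Passing sleep as a parameter keeps the function easy to test without real waiting; production callers can leave the default.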