
Overview

Observe and control your AI applications.

Available on all plans

Cloudflare’s AI Gateway gives you visibility into and control over your AI apps. Once your apps are connected to AI Gateway, you can use analytics and logging to see how people use your application, then control how it scales with features such as caching, rate limiting, request retries, and model fallback. Better yet, it only takes one line of code to get started.

Check out the Get started guide to learn how to configure your applications with AI Gateway.
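The "one line of code" usually means pointing your provider client at a gateway URL instead of the provider's own endpoint. The sketch below builds such a base URL. It assumes the URL pattern `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}`; the helper name `gateway_base_url` and the IDs shown are hypothetical placeholders, so confirm the exact endpoint for your gateway in the Get started guide.

```python
from urllib.parse import quote

# Assumed gateway host and version prefix; verify against your dashboard.
GATEWAY_HOST = "https://gateway.ai.cloudflare.com/v1"


def gateway_base_url(account_id: str, gateway_id: str, provider: str) -> str:
    """Build a base URL that routes provider requests through AI Gateway.

    All three arguments are placeholders supplied by the caller:
    the Cloudflare account ID, the gateway name, and a provider slug
    such as "openai".
    """
    fields = {
        "account_id": account_id,
        "gateway_id": gateway_id,
        "provider": provider,
    }
    for name, value in fields.items():
        if not value:
            raise ValueError(f"{name} must be a non-empty string")
    # Percent-encode each path segment so unusual characters cannot
    # change the shape of the URL.
    segments = [quote(value, safe="") for value in fields.values()]
    return "/".join([GATEWAY_HOST, *segments])


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    print(gateway_base_url("YOUR_ACCOUNT_ID", "my-gateway", "openai"))
```

Most provider SDKs accept a base URL option when the client is constructed, so the resulting string is typically the only value that has to change in an existing application.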

Features

Analytics

View metrics such as the number of requests, the number of tokens, and the cost of running your application.

Real-time logs

Gain insight into requests and errors.

Caching

Serve requests directly from Cloudflare’s cache instead of the original model provider for faster requests and cost savings.

Rate limiting

Control how your application scales by limiting the number of requests your application receives.

Request retry and fallback

Improve resilience by defining request retry and model fallbacks in case of an error.

Your favorite providers

Workers AI, OpenAI, Azure OpenAI, Hugging Face, Replicate, and more work with AI Gateway.


Workers AI

Run machine learning models, powered by serverless GPUs, on Cloudflare’s global network.

Vectorize

Build full-stack AI applications with Vectorize, Cloudflare’s vector database. With Vectorize, you can perform tasks such as semantic search, recommendations, and anomaly detection, or provide context and memory to an LLM.

More resources

Developer Discord

Connect with the Workers community on Discord to ask questions, show what you are building, and discuss the platform with other developers.

Use cases

Learn how you can build and deploy ambitious AI applications to Cloudflare’s global network.

@CloudflareDev

Follow @CloudflareDev on Twitter to learn about product announcements and what is new in Cloudflare Workers.