Progressive Discovery Cuts API Load for AI Agents, Cloudflare Engineer Says
Matt Carey of Cloudflare explains how progressive discovery lets AI agents access APIs efficiently, avoiding token overload and redundant integrations.

Matt Carey on AI Agents and Cloudflare's API
TL;DR
Matt Carey, an AI engineer at Cloudflare, explained how progressive discovery lets AI agents tap into Cloudflare’s API suite without exceeding their context windows. By revealing tools only when needed, the method cuts redundant work and keeps agents responsive.
Context
Historically, each application built its own copy of common integrations, such as weather or email services, which duplicated effort and slowed development. Carey noted that this pattern, in which “every app simply implemented the same integrations,” created unnecessary work as the number of available services grew. The shift toward a shared API model lets agents discover and use tools on demand, reducing duplication.
Key Facts
Carey pointed out that loading Cloudflare’s full OpenAPI specification would require 2.3 million tokens, far exceeding the context window of today’s language models. The context window is the amount of text a model can process at once; surpassing it prevents the model from seeing the entire API list. To avoid this overload, he described three progressive‑discovery techniques: a command‑line interface for dynamic tool selection, a natural‑language search for specific tools, and a code‑generation mode that lets the model write the needed API calls itself. Each approach exposes only the subset of APIs relevant to the task at hand, keeping token usage within limits.
What It Means
By applying progressive discovery, AI agents can access Cloudflare’s vast array of services—such as workers, storage, and security features—without hitting context constraints. This improves agent reliability and expands the range of tasks they can perform autonomously. Carey also highlighted security considerations, noting that untrusted agent code runs in isolated sandboxes to protect the broader infrastructure. The result is a thinner, interoperable layer that exposes capabilities through MCP (the Model Context Protocol) while keeping the underlying servers simple.
Future work will focus on shrinking agent frameworks, increasing reliance on TypeScript for type safety, and refining sandbox isolation as more agents adopt the shared‑API model.