The edge intelligence layer for your AI traffic
Route across 200+ models with edge-level capabilities, using a single API, in 2 minutes.
How it works
One gateway, many providers
Your application calls Edgee. We apply policies at the edge (routing, privacy controls, retries), then forward the request to the best provider for the job.
- Normalize responses across models so you can switch providers easily
- Observe and debug production AI traffic end-to-end
- Control costs with routing policies and caching
import Edgee from 'edgee';

// Authenticate the SDK with your Edgee API key
const edgee = new Edgee('your-edgee-api-key');

// One call reaches any supported provider; Edgee applies edge policies,
// then forwards the request and normalizes the response
const res = await edgee.send({
  model: 'openai/gpt-5.2',
  input: 'Explain edge computing like I’m 5',
});
console.log(res.text);

- Fallbacks, retries, and policy-based model selection (sketched below).
- Understand latency, spend, and errors per provider.
- Configure data handling and retention for prompts.
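Policy-based selection is easiest to picture in code. Here is a minimal sketch, reusing the edgee client from the snippet above and assuming hypothetical fallbacks and retry options; the option names are illustrative, not Edgee's confirmed API:

// Hypothetical 'fallbacks' and 'retry' options, shown only to convey
// the routing idea; not confirmed Edgee API.
const answer = await edgee.send({
  model: 'openai/gpt-5.2',                  // preferred model
  fallbacks: ['anthropic/claude-sonnet-4'], // tried in order if the primary fails
  retry: { attempts: 3, backoff: 'exponential' },
  input: 'Summarize this support ticket',
});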
The vision behind Edgee
Hear from Sacha, Edgee’s co-founder, as he explains how we’re reinventing the way edge computing and AI work together. Edgee acts like an intelligent nervous system, connecting large language models (the “brain”) with lightweight edge tools and models that run close to users.
These edge reflexes complement larger models, making every interaction faster and smarter. Discover how Edgee bridges to a future where your applications don’t just stay smart: they react with the speed and efficiency of a reflex, right at the edge.
AI Gateway: technical overview
See how Edgee controls AI traffic at the edge
This animation walks through the core building blocks: one API to reach any model; policy routing with fallbacks and retries; streaming; observability for latency, errors, usage, and cost; configurable privacy controls; BYOK; and edge tools and private models. Together, these let production teams add capabilities without rewriting their integration.
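As one concrete example of those building blocks, streaming through the gateway could look like the sketch below, reusing the client from above. The stream flag and the async-iterable response shape are both assumptions, not confirmed API:

// Hypothetical 'stream' option and async-iterable response shape,
// shown only to illustrate the streaming building block.
const stream = await edgee.send({
  model: 'openai/gpt-5.2',
  input: 'Write a haiku about reflexes',
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.text ?? '');
}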
Why Edgee AI Gateway?
An edge intelligence layer for your AI traffic
Edgee sits between your app and LLM providers with one OpenAI-compatible API, then adds edge-level capabilities like routing policies, cost controls, private models, and tools, so you can ship AI features with confidence.
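In practice, OpenAI-compatible means an existing OpenAI SDK client should work by swapping the base URL. A sketch, where the base URL below is a placeholder; check Edgee's docs for the real endpoint:

import OpenAI from 'openai';

// Placeholder base URL; substitute the endpoint from Edgee's documentation.
const client = new OpenAI({
  baseURL: 'https://api.edgee.example/v1',
  apiKey: process.env.EDGEE_API_KEY,
});

const completion = await client.chat.completions.create({
  model: 'openai/gpt-5.2',
  messages: [{ role: 'user', content: 'Explain edge computing like I’m 5' }],
});

console.log(completion.choices[0].message.content);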
Edge Models
Run small, fast models at the edge to classify, redact, enrich, or route requests before they hit an LLM provider.
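For intuition, a pre-processing stage like redaction or classification might be expressed as below. The edge option and its fields are hypothetical, shown only to illustrate the concept:

// Hypothetical 'edge' pre-processing options; illustrative, not confirmed API.
const res = await edgee.send({
  model: 'openai/gpt-5.2',
  edge: {
    redact: ['email', 'phone'], // small edge model strips PII before forwarding
    classify: 'intent',         // tags the request so routing policies can use it
  },
  input: 'My email is jane@example.com. Please update my plan.',
});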
Edge Tools
Invoke shared tools we operate at the edge, or deploy your own private tools close to users and providers for lower latency and better control.
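A sketch of invoking a shared edge tool from the same call; the tools option and the tool identifier are hypothetical:

// Hypothetical 'tools' option and tool identifier; illustrative only.
const res = await edgee.send({
  model: 'openai/gpt-5.2',
  tools: ['edgee/web-search'], // a shared tool operated at the edge
  input: 'What are the current HTTP standards for caching?',
});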
Token compression
Reduce prompt size without losing intent to cut costs and latency, especially for long contexts, RAG payloads, and multi-turn agents.
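Conceptually, compression would be an opt-in flag on the request. In the sketch below, the compression option is hypothetical and loadRetrievedChunks is an assumed helper standing in for your RAG pipeline:

// Hypothetical 'compression' option; illustrative of prompt compression only.
const ragContext = await loadRetrievedChunks(); // assumed helper returning a long string
const res = await edgee.send({
  model: 'openai/gpt-5.2',
  compression: { prompt: true }, // shrink the prompt before it leaves the edge
  input: `${ragContext}\n\nQuestion: What is our refund policy?`,
});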
Bring Your Own Keys
Use Edgee’s keys for convenience, or plug in your own provider keys to keep billing on your accounts and access custom models.
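A sketch of bringing your own key at client construction; the options object and providerKeys field are hypothetical, not confirmed Edgee API:

// Hypothetical 'providerKeys' option; illustrative only.
const edgee = new Edgee('your-edgee-api-key', {
  providerKeys: {
    openai: process.env.OPENAI_API_KEY, // requests to OpenAI bill your own account
  },
});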
Private Models
Spawn serverless OSS LLM instances on demand (deployed where you need them), and expose them through the same gateway API alongside public providers.
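A sketch of addressing a private instance through the same API; the private/ model namespace is hypothetical:

// Hypothetical 'private/' model namespace; illustrative only.
const res = await edgee.send({
  model: 'private/llama-3.1-8b', // a serverless OSS instance you control
  input: 'Classify this log line: OOMKilled',
});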
Observability
Track latency, errors, usage, and cost per model, per app, and per environment.
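If per-request metrics are surfaced on the response, reading them might look like this; the usage and latencyMs fields are assumptions, not documented Edgee fields:

// Hypothetical response metadata fields; illustrative only.
const res = await edgee.send({ model: 'openai/gpt-5.2', input: 'ping' });
console.log(res.usage);     // assumed: token counts for the request
console.log(res.latencyMs); // assumed: end-to-end latency in milliseconds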
Ship faster
Start with one key. Scale with policies.
Use Edgee’s unified access to get moving quickly, then add routing, budgets, and privacy controls as your AI usage grows.