The edge intelligence layer for AI

Route across 200+ models with edge-level capabilities, using a single API, in 2 minutes.

3B+ Requests/Month
25% Cost Reduction
200+ Models supported
100+ Global PoPs


How it works

One gateway, many providers

Your application calls Edgee. We apply policies at the edge (routing, privacy controls, retries), then forward the request to the best provider for the job.

  • Normalize responses across models so you can switch providers easily
  • Observe and debug production AI traffic end-to-end
  • Control costs with routing policies and caching

Quick example

import Edgee from 'edgee';

// Authenticate with your Edgee API key
const edgee = new Edgee('your-edgee-api-key');

// One request shape for every provider; Edgee routes it to the model you name
const res = await edgee.send({
  model: 'openai/gpt-5.2',
  input: 'Explain edge computing like I’m 5',
});

console.log(res.text);

Routing

Fallbacks, retries, and policy-based model selection (see the sketch after these cards).

Observability

Understand latency, spend, and errors per provider.

Privacy

Configure data handling and retention for prompts.
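
Here is a rough sketch of how the routing and privacy controls above could look on a single request. The fallbackModels, retry, and privacy options are illustrative guesses at the shape, not Edgee’s documented API:

import Edgee from 'edgee';

const edgee = new Edgee('your-edgee-api-key');

// Hypothetical per-request policy options, shown for illustration only
const res = await edgee.send({
  model: 'openai/gpt-5.2',
  // Try alternate providers if the primary model fails
  fallbackModels: ['anthropic/claude-sonnet', 'mistral/mistral-large'],
  // Retry transient provider errors before falling back
  retry: { attempts: 2, backoffMs: 500 },
  // Opt out of prompt retention for this request
  privacy: { retainPrompts: false },
  input: 'Summarize this support ticket: ...',
});

console.log(res.text);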

The vision behind Edgee

Hear from Sacha, Edgee’s co-founder, as he explains how we’re reinventing the way edge computing and AI work together. Edgee acts like an intelligent nervous system, connecting large language models (the “brain”) with lightweight edge tools and models that run close to users.

These edge reflexes complement larger models, making every interaction faster and smarter. Discover how Edgee is building toward a future where your applications don’t just stay smart: they react with the speed and efficiency of a reflex system, right at the edge.

AI Gateway: technical overview

See how Edgee controls AI traffic at the edge

This animation walks through the core building blocks: one API to reach any model, policy routing with fallbacks and retries, streaming, observability for latency, errors, usage, and cost, configurable privacy controls, BYOK, and edge tools and private models. Together, these let production teams add capabilities without rewriting their integration.

Why Edgee AI Gateway?

An edge intelligence layer for your AI traffic

Edgee sits between your app and LLM providers with one OpenAI-compatible API, then adds edge-level capabilities like routing policies, cost controls, private models, and tools, so you can ship AI features with confidence.
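
Because the gateway is OpenAI-compatible, one way to adopt it is to point an existing OpenAI client at Edgee rather than switching SDKs. The base URL below is a placeholder for illustration, not a documented endpoint:

import OpenAI from 'openai';

// Reuse the standard OpenAI SDK, pointed at Edgee instead of api.openai.com
// (the base URL here is illustrative; use the one from Edgee's docs)
const client = new OpenAI({
  apiKey: 'your-edgee-api-key',
  baseURL: 'https://api.edgee.example/v1',
});

const completion = await client.chat.completions.create({
  model: 'openai/gpt-5.2',
  messages: [{ role: 'user', content: 'Explain edge computing like I’m 5' }],
});

console.log(completion.choices[0].message.content);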

Edge Models

Run small, fast models at the edge to classify, redact, enrich, or route requests before they hit an LLM provider.

Learn more
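
As a sketch under assumptions: if edge models are addressable through the same send call as the quick example, a pre-processing step might look like this. The edgee/pii-redactor slug is hypothetical:

import Edgee from 'edgee';

const edgee = new Edgee('your-edgee-api-key');

const rawUserMessage = 'My card is 4242 4242 4242 4242. Why was I charged twice?';

// Hypothetical: a small edge model strips PII before any provider sees it
const cleaned = await edgee.send({
  model: 'edgee/pii-redactor', // illustrative edge-model name
  input: rawUserMessage,
});

// Forward the redacted text to a full-size model
const res = await edgee.send({
  model: 'openai/gpt-5.2',
  input: cleaned.text,
});

console.log(res.text);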

Edge Tools

Invoke shared tools we operate at the edge, or deploy your own private tools close to users and providers for lower latency and better control.

Learn more

Token compression

Reduce prompt size without losing intent, cutting cost and latency, especially for long contexts, RAG payloads, and multi-turn agents.

Learn more
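
If compression is exposed as a request option, usage might look like the sketch below; the compress flag is an assumption, not a documented parameter:

import Edgee from 'edgee';

const edgee = new Edgee('your-edgee-api-key');

// Imagine several KB of retrieved passages for a RAG answer
const chunks = ['...retrieved passage 1...', '...retrieved passage 2...'];

const res = await edgee.send({
  model: 'openai/gpt-5.2',
  compress: true, // hypothetical flag: let the edge shrink the prompt first
  input: `Answer using this context:\n${chunks.join('\n\n')}\n\nQuestion: Why did latency spike?`,
});

console.log(res.text);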

Bring Your Own Keys

Use Edgee’s keys for convenience, or plug in your own provider keys for direct billing and access to custom models.

Learn more
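
Supplying your own provider key could look like this; the providerKeys option is a guess at the shape, shown for illustration:

import Edgee from 'edgee';

// Hypothetical: route through Edgee, but bill OpenAI usage to your own account
const edgee = new Edgee('your-edgee-api-key', {
  providerKeys: {
    openai: process.env.OPENAI_API_KEY, // illustrative option, not documented
  },
});

const res = await edgee.send({
  model: 'openai/gpt-5.2',
  input: 'Hello from my own OpenAI account',
});

console.log(res.text);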

Private Models

Spawn serverless OSS LLM instances on demand, deployed where you need them, and expose them through the same gateway API alongside public providers.

Learn more
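
A sketch of what spawn-and-call could look like; the deployModel call and its options are hypothetical:

import Edgee from 'edgee';

const edgee = new Edgee('your-edgee-api-key');

// Hypothetical management call: spin up a serverless OSS model near your users
const instance = await edgee.deployModel({
  model: 'meta-llama/llama-3.1-8b-instruct', // illustrative OSS model slug
  region: 'eu-west',                          // illustrative placement option
});

// Call it through the same gateway API as any public provider
const res = await edgee.send({
  model: instance.id,
  input: 'You are a private deployment. Say hi.',
});

console.log(res.text);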

Observability

Track latency, errors, usage, and cost per model, per app, and per environment.

Learn more
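
Dashboards aside, per-request metadata could also surface on responses; the usage and meta fields below are illustrative, not a documented shape:

import Edgee from 'edgee';

const edgee = new Edgee('your-edgee-api-key');

const res = await edgee.send({
  model: 'openai/gpt-5.2',
  input: 'Ping',
});

// Hypothetical response metadata for cost and latency tracking
console.log(res.usage?.inputTokens, res.usage?.outputTokens);
console.log(res.meta?.provider, res.meta?.latencyMs);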

Ship faster

Start with one key. Scale with policies.

Use Edgee’s unified access to get moving quickly, then add routing, budgets, and privacy controls as your AI usage grows.

Want product news and updates?

Subscribe to our newsletter

We care about your privacy. Read our privacy policy.
