The Doubleword Control Layer is the world's fastest AI model gateway (450x lower overhead than LiteLLM). It provides a single, high-performance interface for routing, managing, and securing LLM inference across model providers, users, and deployments, covering both open-source and proprietary models.
Transform your AI infrastructure from scattered API calls to a centralized, secure, and manageable platform that scales with your organization.
Key Resources
GitHub Repo — Explore the open-source Control Layer code.
Announcement Blog — Read the launch announcement and background story.
Demo Video — Watch the Control Layer in action.
CTO Blog — Technical deep dive by Fergus Finn.
Benchmarking Writeup — Performance comparison vs alternatives.
Key Capabilities
OpenAI-Compatible API
A single OpenAI-compatible gateway to all of your AI models across all of your
providers. Your existing code works without modification - simply point your
applications at the Control Layer's /ai/ endpoints and supply a Control
Layer API key.
Using the Control Layer with the OpenAI Python SDK
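A minimal sketch of that integration, assuming the gateway is reachable at http://localhost:3000 and you have already created a Control Layer API key. The base URL, key, and model name are placeholders for your own deployment, and the exact path under /ai/ may differ:

```python
from openai import OpenAI

# Point the standard OpenAI client at the Control Layer instead of
# api.openai.com. Whether the OpenAI-style /v1 suffix sits under /ai/
# depends on your deployment; the URL and key below are placeholders.
client = OpenAI(
    base_url="http://localhost:3000/ai/v1",
    api_key="dw-your-control-layer-key",
)

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",  # any model registered in the Control Layer
    messages=[{"role": "user", "content": "Hello from the Control Layer!"}],
)
print(response.choices[0].message.content)
```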
Learn more about API Integration →
Learn more about adding endpoints to the Control Layer →
Centralized User Management & API Key Authentication
Manage every user who can access your AI models from one simple interface. Map users to groups, let users create their own API keys, and monitor AI API usage across your organization.
The Control Layer lets you turn unauthenticated, self-hosted model deployments into production-ready services with user authentication and access control, as illustrated in the sketch below.
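A hedged illustration of that flow: a self-hosted model server that previously accepted unauthenticated traffic is instead reached through the gateway, which rejects requests that lack a valid per-user key. The URLs and expected status codes are assumptions for illustration, not documented Control Layer behavior:

```python
import requests

GATEWAY = "http://localhost:3000/ai/v1/chat/completions"  # placeholder gateway URL
payload = {
    "model": "llama-3.1-8b-instruct",
    "messages": [{"role": "user", "content": "ping"}],
}

# Without a key, the gateway refuses the request...
anon = requests.post(GATEWAY, json=payload)
print(anon.status_code)  # expected: 401 or 403, depending on configuration

# ...while a user-scoped Control Layer key is accepted and attributed
# to that user in the usage analytics.
auth = requests.post(
    GATEWAY,
    json=payload,
    headers={"Authorization": "Bearer dw-user-key"},  # key created in Users & Groups
)
print(auth.status_code)  # expected: 200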
The Control Layer's Users & Groups interface
Learn more about Users & Groups →
Real-Time Monitoring & Analytics
Track request volumes, token usage, response times, and model performance across your entire organization. Built-in analytics help you understand usage patterns, optimize costs, and identify bottlenecks with detailed insights into every API call. Store every request and response you send to LLM APIs for auditing, compliance, or later fine-tuning.
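Because the gateway speaks the OpenAI wire format, clients can also read per-request token counts from the standard usage field of each response - the same figures the Control Layer aggregates in its dashboards. A small sketch, with the same placeholder URL and key as above:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/ai/v1", api_key="dw-your-control-layer-key")

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Summarize our Q3 report in one line."}],
)

# Standard OpenAI-compatible per-call accounting; the Control Layer
# aggregates the same counters across users, models, and time.
print("prompt tokens:    ", response.usage.prompt_tokens)
print("completion tokens:", response.usage.completion_tokens)
print("total tokens:     ", response.usage.total_tokens)
```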
At a glance model metrics showing request counts, token usage, and average response times
Drill down into user behaviour across providers - with opinionated analytics
for understanding which models are being used, by whom, and for what purposes.
Traffic analytics showing gateway metrics, status codes, model breakdown, and usage over time
Learn more about Models & Access →
Interactive AI Playgrounds
Test and compare generative, embedding, and reranker models side by side with custom settings before integrating them into your applications. Experiment with different prompts, parameters, and models to find the best solution for your use case.
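Once the playground has surfaced promising candidates, the same comparison can be scripted against the gateway's OpenAI-compatible endpoint. A hedged sketch, with model IDs and connection details as placeholders:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/ai/v1", api_key="dw-your-control-layer-key")

PROMPT = "Explain retrieval-augmented generation in two sentences."

# Hypothetical model IDs; substitute whatever is registered in your Control Layer.
for model in ["llama-3.1-8b-instruct", "mistral-small-24b"]:
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
        temperature=0.2,  # one of the settings the playground lets you vary
    )
    print(f"--- {model} ---\n{reply.choices[0].message.content}\n")
```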
Chat Playground enabling side-by-side model comparison and testing