Overview
Understanding how TokenBill AI Gateway manages model routing and usage tracking.
How it all works together
Merchants set up providers, endpoints, and routing in the console. API users then call the gateway with an access key. When a Virtual Endpoint (VE) budget is exhausted, the gateway returns an error that includes a checkout link for topping up.
# VE_ID = Virtual Endpoint ID in the URL hostname; LOGICAL_MODEL_NAME = Logical Model name in the JSON "model" field
curl -X POST 'https://<VE_ID>.ve.test.tokenbill.io/v1/chat/completions' \
-H 'Authorization: Bearer <ACCESS_KEY>' \
-H 'Content-Type: application/json' \
-d '{"model":"<LOGICAL_MODEL_NAME>","messages":[{"role":"user","content":"Hello"}]}'
Three-Tier Architecture
TokenBill uses a unified three-tier architecture to provide high availability, load balancing, and clear billing separation.
[Diagram: three-tier architecture, with example models gpt-4o and dall-e-3]
Request Flow
Every API call to a Virtual Endpoint goes through the TokenBill Smart Proxy, which resolves the Logical Model, selects the best Physical Model by priority and weight, and streams the response back to the client.
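The resolve-then-select step can be sketched in a few lines of Python. This is an illustration only, not TokenBill's implementation: the routing table, the provider names, and the rule that a lower priority number wins are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalModel:
    name: str      # hypothetical upstream model identifier
    priority: int  # assumption: lower number is tried first
    weight: int    # relative share of traffic within its priority tier


def select_physical_model(candidates, rng=random):
    """Pick one candidate: best priority tier first, then weighted random."""
    usable = [c for c in candidates if c.weight > 0]
    if not usable:
        raise ValueError("no routable physical model")
    best = min(c.priority for c in usable)
    tier = [c for c in usable if c.priority == best]
    return rng.choices(tier, weights=[c.weight for c in tier], k=1)[0]


# Hypothetical routing table: Logical Model name -> Physical Models.
ROUTES = {
    "chat-default": [
        PhysicalModel("provider-a/gpt-4o", priority=1, weight=3),
        PhysicalModel("provider-b/gpt-4o", priority=1, weight=1),
        PhysicalModel("provider-c/gpt-4o", priority=2, weight=1),
    ],
}


def route(logical_model, rng=random):
    """Resolve a Logical Model name and select a Physical Model for it."""
    if logical_model not in ROUTES:
        raise KeyError(f"unknown logical model: {logical_model}")
    return select_physical_model(ROUTES[logical_model], rng)
```

In this sketch, provider-c receives traffic only if both priority-1 entries are removed or given a weight of zero, while provider-a and provider-b split requests roughly 3:1.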
Dynamic Smart Routing
The Smart Proxy engine distributes traffic in real time and entirely in memory. It reads from a local L1 cache to compute weighted routes in microseconds, with no remote polling.
Magic Command: /tokenbill
No extra REST API needed. Simply type /tokenbill in any chat message, and the gateway intercepts it to return your VE's balance, budget, and available models. The response never contains api_key or upstream credentials.
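As a rough sketch, the magic command can be sent as an ordinary chat completion whose user message is /tokenbill. The example below only builds the request and does not send it. It assumes the same host pattern and endpoint as the curl example, and that a "model" field is still included; the function name, the VE ID, the access key, and the model name are placeholders.

```python
import json
import urllib.request


def build_tokenbill_request(ve_id, access_key, logical_model):
    """Build (but do not send) a chat request whose message is /tokenbill."""
    # Assumption: same host pattern and path as the curl example.
    url = f"https://{ve_id}.ve.test.tokenbill.io/v1/chat/completions"
    body = {
        "model": logical_model,
        "messages": [{"role": "user", "content": "/tokenbill"}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute a real VE ID, access key, and model name.
req = build_tokenbill_request("demo-ve", "sk-example", "my-logical-model")
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it. The shape of the gateway's reply is not documented here, so the sketch stops before parsing it.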
See the full checkout top-up flow →
Budget Exhausted
Your Credit balance is insufficient. Please top up: https://tokenbill.io/checkout?ve_id=ve_abc123
Checkout
VE: ve_abc123
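A client can pull the checkout link out of a budget-exhausted message and show it to the user. The sketch below works on the message text only, using the sample message above, because the exact JSON shape of the gateway's error is not specified here. The function name and the regular expression are illustrative assumptions.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: the link is an https URL with a /checkout path and a query string.
CHECKOUT_URL = re.compile(r"https://\S+/checkout\?\S+")


def extract_checkout(message):
    """Return (url, ve_id) from a budget-exhausted message, or None."""
    match = CHECKOUT_URL.search(message)
    if match is None:
        return None
    url = match.group(0).rstrip(".,)")  # drop trailing sentence punctuation
    ve_ids = parse_qs(urlparse(url).query).get("ve_id", [])
    return url, (ve_ids[0] if ve_ids else None)


message = (
    "Your Credit balance is insufficient. "
    "Please top up: https://tokenbill.io/checkout?ve_id=ve_abc123"
)
print(extract_checkout(message))
# prints ('https://tokenbill.io/checkout?ve_id=ve_abc123', 've_abc123')
```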
Two-Layer Billing Architecture
TokenBill decouples physical model consumption from financial accounting using a standardized Credit and Unit system.
Multi-Format Upstream Support
Each Provider is a custom upstream account — your own deployment, a third-party API, or a cloud service. TokenBill normalizes different API formats behind a single OpenAI-compatible interface.