Self-hosted gateway unifying China's top 5 AI providers — DeepSeek, Qwen, GLM, Kimi, ERNIE — behind a single OpenAI-compatible endpoint. 5-minute deploy. Your server. Your keys. Zero platform fees.
    $ curl -s https://chinaigateway.xyz/v1/chat/completions \
        -H "Authorization: Bearer sk-IxF6Z..." \
        -d '{"model":"deepseek-v4-pro","messages":[{"role":"user","content":"Hello"}]}'

    {
      "model": "deepseek-v4-pro",
      "choices": [{
        "message": {
          "role": "assistant",
          "content": "Hello! I'm DeepSeek V4 Pro, running through Chinai Gateway, an open-source proxy that unifies Chinese AI models behind one standard API."
        }
      }],
      "usage": { "prompt_tokens": 8, "completion_tokens": 35 }
    }
Copy the repo, add your API keys to .env. One key or all five — your choice. Leave a provider blank and its models are disabled.
Single command boots PostgreSQL 16 + LiteLLM. ~430MB RAM total. Runs on any $5/month VPS alongside your existing services.
Use any OpenAI SDK — Python, JS, curl, LangChain, Cursor, AutoGPT. Change one base_url. All 14 models speak the same protocol.
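The rule from the first step (a blank key disables that provider's models) can be pictured with a short sketch. The environment variable names below are hypothetical placeholders, not the project's actual names, so check the repo's .env template before relying on them.

```python
# Hypothetical variable names; the real ones live in the repo's .env template.
PROVIDER_ENV_VARS = {
    "DeepSeek": "DEEPSEEK_API_KEY",
    "Qwen": "QWEN_API_KEY",
    "GLM": "GLM_API_KEY",
    "Kimi": "KIMI_API_KEY",
    "ERNIE": "ERNIE_API_KEY",
}


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping comments and blank lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def enabled_providers(env: dict[str, str]) -> list[str]:
    """Return providers whose key is present and non-blank."""
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]


sample = """
# only two providers configured
DEEPSEEK_API_KEY=sk-example
QWEN_API_KEY=
GLM_API_KEY="glm-example"
"""
print(enabled_providers(parse_env(sample)))
```

Running a check like this before boot is one way to confirm which providers a given .env would switch on.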
💡 Demo key: read-only · $0.05 budget · 10 RPM · DeepSeek V4 Pro & Flash only. Full access: deploy your own (free, MIT, 5 min).
API keys are read from your .env file, injected into LiteLLM at startup, and never logged. No telemetry. No phoning home. No third party sees your credentials, requests, or responses. The MIT license means you can audit every line.
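DeepSeek V4 Pro costs ¥3 per million input tokens (about $0.41), versus $2.50 per million for GPT-4o. For a typical app handling 1,000 requests a day, that cuts a bill of roughly $225/month to about $18/month; the exact saving depends on how many input and output tokens each request uses.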
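A monthly bill depends on how many tokens each request consumes, which the comparison above leaves open. Here is a minimal sketch for estimating your own number. The workload (2,000 input and 500 output tokens per request) is a hypothetical assumption, the deepseek-v4-pro prices come from the pricing table, and the exchange rate is the one implied by "¥3 ($0.41)".

```python
def monthly_cost(requests_per_day, input_tokens, output_tokens,
                 input_price_per_m, output_price_per_m, days=30):
    """Monthly cost, in the same currency as the per-million-token prices."""
    requests = requests_per_day * days
    input_cost = requests * input_tokens / 1_000_000 * input_price_per_m
    output_cost = requests * output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Hypothetical workload: 1,000 requests/day, 2,000 input + 500 output tokens each.
# Prices for deepseek-v4-pro: ¥3 input, ¥6 output per 1M tokens (pricing table).
cost_cny = monthly_cost(1000, 2000, 500, input_price_per_m=3, output_price_per_m=6)

# Rate implied by the page's "¥3 ($0.41)"; substitute the current rate.
CNY_PER_USD = 3 / 0.41

print(f"deepseek-v4-pro: ¥{cost_cny:.2f}/month (about ${cost_cny / CNY_PER_USD:.2f})")
```

Swap in another model's prices or your own token counts to compare options on the same workload.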
Change one field in your request to switch between DeepSeek, Qwen, GLM, Kimi, and ERNIE. Every model speaks OpenAI protocol. No SDK changes. No library updates. No vendor lock-in.
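As a rough illustration of "change one field", the sketch below builds the same request for one model from each provider. Only the `model` value differs. The helper name `chat_request` is hypothetical, and the actual call is left as a comment because it needs a running gateway.

```python
def chat_request(model: str, prompt: str) -> dict:
    """Build keyword arguments for an OpenAI-style chat completion call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


# One model per provider, names taken from the pricing table.
MODELS = ["deepseek-v4-pro", "qwen-plus", "glm-4-plus", "kimi", "ernie-4.0-turbo"]

payloads = [chat_request(m, "Summarize this paragraph.") for m in MODELS]

for kwargs in payloads:
    # With a configured OpenAI client: client.chat.completions.create(**kwargs)
    print(kwargs["model"], "->", kwargs["messages"][0]["content"])
```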
Built-in UI at /ui. Create virtual keys with per-user budgets and rate limits. Track spend per model. No extra tools required.
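Virtual keys can also be created programmatically. This sketch assumes the deployment exposes LiteLLM's `/key/generate` management endpoint unchanged and that the master key is allowed to call it. The budget, rate limit, and user ID are illustrative values, and `build_key_request` is a hypothetical helper name. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_key_request(base_url, master_key, models, max_budget, rpm_limit, user_id=None):
    """Build a POST request asking the proxy for a budget- and rate-limited key."""
    body = {"models": models, "max_budget": max_budget, "rpm_limit": rpm_limit}
    if user_id is not None:
        body["user_id"] = user_id
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/key/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_key_request(
    "http://localhost:4000",
    "YOUR_MASTER_KEY",
    models=["deepseek-v4-flash"],
    max_budget=5.0,   # spend cap in USD (illustrative)
    rpm_limit=10,     # requests per minute (illustrative)
    user_id="alice",  # hypothetical user
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To send it against a running gateway:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["key"])
```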
~430MB RAM total: PostgreSQL (~80MB) + LiteLLM (~300MB) + Nginx (~50MB). No GPU. This entire demo runs on a $2/month RackNerd VPS alongside Hysteria2 and other services. The infrastructure costs less than lunch.
Change one base_url in your OpenAI client. Every SDK, every framework, every tool that speaks OpenAI — they all just work.
    ┌─ Your Infrastructure ──────────────────────────────────────┐
    │                                                            │
    │  ┌──────────┐      ┌───────────────┐      ┌────────────┐   │
    │  │  Nginx   │ ───▶ │   LiteLLM     │ ───▶ │  DeepSeek  │   │
    │  │  :443    │      │  Proxy :4000  │      │  Qwen      │   │
    │  │  HTTPS   │      │  Docker       │      │  GLM       │   │
    │  └──────────┘      │               │      │  Kimi      │   │
    │                    │ ┌──────────┐  │      │  ERNIE     │   │
    │                    │ │PostgreSQL│  │      └────────────┘   │
    │                    │ │(internal)│  │                       │
    │                    │ └──────────┘  │                       │
    │                    └───────────────┘                       │
    │                                                            │
    │  ▲ .env file (API keys): never leaves this server          │
    │  ▲ PostgreSQL: internal Docker network, no external port   │
    │  ▲ No telemetry, no phoning home, no analytics             │
    └────────────────────────────────────────────────────────────┘
Keys are read at container startup, injected as environment variables, and never persisted to disk or logged. PostgreSQL stores virtual key metadata — never provider credentials.
PostgreSQL listens on the internal Docker network. No port exposed to the host. No external access possible. Only LiteLLM can reach it.
Every line of our code is public. The Docker images are pulled from GitHub Container Registry with SHA256 pins. No black boxes. No trust required.
Chinai Gateway does not phone home. No usage analytics. No crash reports. No update checks. The demo at chinaigateway.xyz is the only thing we run — and it's optional.
| Route | Typical Latency | Overhead |
|---|---|---|
| Direct → DeepSeek API | ~200ms | baseline |
| Your App → Chinai Gateway → DeepSeek | ~220ms | +20ms (1.1×) |
| Your App → OpenRouter → DeepSeek | ~250ms | +50ms (1.25×) |
* Approximate, measured on a $5/month VPS. Streaming first-token latency is typically lower. Actual latency depends on model, prompt length, and network conditions.
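The Overhead column is simple arithmetic on the typical latencies: the added milliseconds and the ratio to the direct baseline. A small sketch that reproduces it from the table's own figures (the function name `overhead` is ours):

```python
def overhead(baseline_ms: float, routed_ms: float) -> tuple[float, float]:
    """Return (added milliseconds, ratio to baseline)."""
    return routed_ms - baseline_ms, routed_ms / baseline_ms


BASELINE_MS = 200  # Direct → DeepSeek API, from the table

for route, latency_ms in [("Chinai Gateway", 220), ("OpenRouter", 250)]:
    added, ratio = overhead(BASELINE_MS, latency_ms)
    print(f"{route}: +{added:.0f}ms ({ratio:.2f}x)")
```

The same function can be applied to your own measurements, which will differ with model, prompt length, and network path.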
| Model | Provider | Input / 1M tokens | Output / 1M tokens | Context (tokens) | Features |
|---|---|---|---|---|---|
| deepseek-v4-pro | DeepSeek | ¥3 | ¥6 | 1,048,576 | Agent, Thinking, Func Call |
| deepseek-v4-flash | DeepSeek | ¥1 | ¥2 | 1,048,576 | Thinking, Best Value |
| deepseek-chat | DeepSeek (legacy) | ¥1 | ¥2 | 65,536 | Deprecated Jul 2026 |
| deepseek-reasoner | DeepSeek (legacy) | ¥4 | ¥16 | 65,536 | Deprecated Jul 2026 |
| qwen-plus | Alibaba Qwen | ¥2 | ¥6 | 131,072 | Chinese, Func Call |
| qwen-max | Alibaba Qwen | ¥20 | ¥60 | 32,768 | Best CN, Flagship |
| qwen-vl-plus | Alibaba Qwen | ¥2 | ¥6 | 32,768 | Vision, Image |
| glm-4-plus | Zhipu GLM | ¥1 | ¥4 | 131,072 | Func Call, 128K |
| glm-4-flash | Zhipu GLM | Free | Free | 131,072 | Free Tier, Fast |
| glm-4v-plus | Zhipu GLM | ¥5 | ¥5 | 32,768 | Vision, OCR |
| kimi | Moonshot | ¥12 | ¥12 | 8,192 | Doc Analysis |
| kimi-128k | Moonshot | ¥60 | ¥60 | 131,072 | Ultra-Long, 128K |
| ernie-4.0-turbo | Baidu ERNIE | ¥4 | ¥12 | 8,192 | Search, Enterprise |
| ernie-speed | Baidu ERNIE | Free | Free | 131,072 | Free Tier, 128K |
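One way to read the table is to filter by context window and then sort by price. The sketch below does that for a subset of the rows. Ranking by the sum of input and output price is a deliberately crude assumption, and `cheapest_for_context` is a hypothetical helper, not part of the gateway.

```python
# (model, input ¥/1M, output ¥/1M, context tokens), copied from the table; 0 means free.
MODELS = [
    ("deepseek-v4-pro", 3, 6, 1_048_576),
    ("deepseek-v4-flash", 1, 2, 1_048_576),
    ("qwen-plus", 2, 6, 131_072),
    ("qwen-max", 20, 60, 32_768),
    ("glm-4-plus", 1, 4, 131_072),
    ("glm-4-flash", 0, 0, 131_072),
    ("kimi", 12, 12, 8_192),
    ("kimi-128k", 60, 60, 131_072),
    ("ernie-4.0-turbo", 4, 12, 8_192),
    ("ernie-speed", 0, 0, 131_072),
]


def cheapest_for_context(min_context, paid_only=False):
    """Return the lowest-priced model whose context window is at least min_context."""
    candidates = [
        m for m in MODELS
        if m[3] >= min_context and (not paid_only or m[1] + m[2] > 0)
    ]
    if not candidates:
        return None
    # Crude ranking: input + output price, ties broken alphabetically by name.
    return min(candidates, key=lambda m: (m[1] + m[2], m[0]))[0]


print(cheapest_for_context(200_000))
print(cheapest_for_context(100_000, paid_only=True))
```

A real choice would also weigh the Features column and your actual input/output ratio.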
    # Replace YOUR_KEY with your master key from .env
    curl -X POST http://localhost:4000/v1/chat/completions \
      -H "Authorization: Bearer YOUR_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "deepseek-v4-pro",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Explain quantum computing in simple terms."}
        ]
      }'
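    from openai import OpenAI

    client = OpenAI(
        api_key="YOUR_MASTER_KEY",
        base_url="http://localhost:4000/v1",
    )

    # Streaming response with reasoning_content (DeepSeek V4 Pro)
    stream = client.chat.completions.create(
        model="deepseek-v4-pro",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"},
        ],
        stream=True,
        temperature=0.7,
    )

    for chunk in stream:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        delta = chunk.choices[0].delta
        # reasoning_content is a provider extension and may be absent
        reasoning = getattr(delta, "reasoning_content", None)
        if reasoning:
            print(reasoning, end="", flush=True)
        if delta.content:
            print(delta.content, end="", flush=True)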
    import OpenAI from 'openai';

    const client = new OpenAI({
      apiKey: 'YOUR_MASTER_KEY',
      baseURL: 'http://localhost:4000/v1',
    });

    const response = await client.chat.completions.create({
      model: 'deepseek-v4-pro',
      messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: 'Write a haiku about recursion.' }
      ],
    });

    console.log(response.choices[0].message.content);
    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(
        api_key="YOUR_MASTER_KEY",
        base_url="http://localhost:4000/v1",
        model="deepseek-v4-pro",
        temperature=0.7,
    )

    response = llm.invoke("What is the capital of France?")
    print(response.content)
Neither. Chinai Gateway is free, open-source software (MIT) that you deploy on your own server. We don't run a hosted version. We don't charge anything. We don't see your data, keys, or traffic. The demo at chinaigateway.xyz is just that — a demo — running on a $2/month VPS with a read-only DeepSeek key capped at $0.05.
OpenRouter is a managed cloud service — your requests go through their infrastructure, and they charge 5.5% on top of model pricing. Chinai Gateway is self-hosted: you run it on your VPS, your data stays local, zero platform fees. We're pre-configured specifically for Chinese AI models with bilingual documentation. Think of OpenRouter as a service you rent; Chinai Gateway as infrastructure you own.
For DeepSeek, you register at platform.deepseek.com — the interface has an English option. For Qwen, GLM, Kimi, and ERNIE, the registration pages are primarily in Chinese, but the process is standard (phone/email verification, API key generation). Our docs/models.md links to each provider's key page. Once you have keys, everything else is in English.
Yes. Edit config.yaml and add any provider from LiteLLM's 100+ supported backends. Chinai Gateway is a starting point — not a walled garden. You can route some requests to DeepSeek (cheap) and others to Claude (quality), all through the same endpoint.
~430MB RAM total: PostgreSQL (~80MB) + LiteLLM (~300MB) + Nginx (~50MB). A $5/month VPS with 1GB RAM is more than enough. No GPU required. The demo runs on a $2/month RackNerd VPS alongside Hysteria2 and other services.
LiteLLM (the engine) is production-tested by thousands of teams. Chinai Gateway adds pre-configuration, docs, and deploy tooling. For critical workloads: add Nginx + HTTPS (our deployment.md covers this), set up monitoring, and pin Docker image SHAs. The MIT license means you can harden it to your own standards.
Docker ensures PostgreSQL + LiteLLM are isolated, versioned, and reproducible across any Linux server. It also keeps the host clean, with no Python dependencies to manage. If you prefer bare metal, you can run the LiteLLM proxy with pip install 'litellm[proxy]' and connect it to any PostgreSQL instance. But Docker is the path of least friction, and that's the point.
Chinai Gateway was built by a university student in China who wanted overseas developers to access Chinese AI models without the friction of registering on five different platforms and adapting five different API formats. It's MIT licensed — free forever, no strings attached. The project is a portfolio piece and a public good, not a startup.
One docker compose up -d. Five AI providers. Fourteen models. Your server, your keys, MIT license. Free forever.