How it works
How GonkaGate routes requests to Gonka Network
Routing architecture and health signals that drive node selection in Gonka Network, from client to response
- Routing + fallback — one decision
- Signals — availability + latency
- Visibility — usage in dashboard
Get a key to try routing live. The Gonka API overview has the full surface and compatibility details
Detailed API surface lives on the Gonka API page
Routing snapshot
Routing guarantees at a glance
Key outcomes you can rely on per request
- Primary goal: Fast first byte
- Retries: Up to x3 per request
- Switch signal: No data in 10 s
- Visibility: Dashboard usage after completion
Who it's for
- API teams — Stable streaming, bounded retries
- Product leaders — Lower perceived latency, higher reliability
- No-code builders — Predictability without routing tuning
Access check
Verify API key status and account access
Route selection
Pick the best node for the selected model
Fallback handling
Re-route on capacity or health issues
Usage logging
Capture cost, latency, and token metadata
Response delivery
Return an OpenAI-compatible payload
Request flow
Routing decisions use availability and latency signals
Why routing stays fast
- Pool: Per-model route pool
- Scoring: Recent success + latency
- Safety: Unstable routes cool off
- Result: Fast, reliable route now
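The scoring idea above can be sketched roughly as follows. This is an illustration only, not GonkaGate's actual scorer: the `RouteStats` class, the 20-sample window, and the `success_rate / (1 + latency)` formula are all assumptions chosen to show how recent success and latency could combine into one ranking while cooling-off routes are skipped.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class RouteStats:
    """Rolling window of recent outcomes for one route (hypothetical structure)."""
    name: str
    window: int = 20                                  # assumed window size
    outcomes: deque = field(default_factory=deque)    # True = success
    latencies: deque = field(default_factory=deque)   # seconds to first byte
    cooling_off: bool = False                         # set while a route is unstable

    def record(self, success: bool, first_byte_s: float) -> None:
        """Store one outcome and drop samples that fall outside the window."""
        self.outcomes.append(success)
        self.latencies.append(first_byte_s)
        while len(self.outcomes) > self.window:
            self.outcomes.popleft()
            self.latencies.popleft()

    def score(self) -> float:
        """Higher is better: recent success rate, discounted by average latency."""
        if not self.outcomes:
            return 0.0
        success_rate = sum(self.outcomes) / len(self.outcomes)
        avg_latency = sum(self.latencies) / len(self.latencies)
        return success_rate / (1.0 + avg_latency)


def pick_route(pool):
    """Return the best-scoring route in a per-model pool, skipping cooling-off routes."""
    candidates = [route for route in pool if not route.cooling_off]
    if not candidates:
        return None
    return max(candidates, key=RouteStats.score)
```

With this shape, a route that succeeds quickly outranks one that succeeds slowly, and a route marked as cooling off is never picked regardless of its score.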
Guardrails
If no data arrives in 10 s, stop and switch routes
- Attempt 1 — no data in 10 s, stop
- Attempt 2 — first byte, continue stream
Up to 3 attempts per request, each on a different route
- Retry on — timeout or 429 (rate limit)
- Isolation — each attempt uses a new route
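A minimal sketch of these guardrails, assuming a caller-supplied `send(route, timeout)` function. The exception names, `send_with_fallback`, and the route list are hypothetical; only the limits (3 attempts, 10 s without data, retry on timeout or 429, a new route per attempt) come from the rules above.

```python
MAX_ATTEMPTS = 3              # up to 3 attempts per request
FIRST_BYTE_TIMEOUT_S = 10.0   # stop and switch if no data arrives in 10 s


class NoDataTimeout(Exception):
    """No data arrived within the first-byte window (hypothetical error type)."""


class RateLimited(Exception):
    """The route answered with HTTP 429 (hypothetical error type)."""


def send_with_fallback(routes, send):
    """Try routes in order until one delivers data.

    `send(route, timeout)` is an assumed callable that returns a response or
    raises NoDataTimeout / RateLimited. Returns (response, routes_tried).
    """
    tried = []
    for route in routes:
        if len(tried) == MAX_ATTEMPTS:
            break
        if route in tried:
            continue  # a route is never reused within one request
        tried.append(route)
        try:
            return send(route, FIRST_BYTE_TIMEOUT_S), tried
        except (NoDataTimeout, RateLimited):
            continue  # retryable: move on to a different route
    raise RuntimeError(f"all attempts failed: {tried}")
```

Any other exception propagates immediately in this sketch, since the page lists only timeouts and 429 responses as retry triggers.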
New routes warm up first; repeated failures go to quarantine
- x3 — Quarantine, 5 min backoff
- Release — Healthy
Operational rules
- Slow first byte — weak signal; hard errors are strong signals
- No repeat — a route is not reused within one request
- Fallback — if healthy routes are missing, use cached endpoints excluding quarantine
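The quarantine and fallback rules can be sketched like this. It is an illustrative model under stated assumptions, not the production logic: the `Quarantine` class and `candidates` function are hypothetical names, "5 m" is read as five minutes, and a route is treated as released either when the backoff expires or when it succeeds again.

```python
QUARANTINE_AFTER = 3   # repeated hard failures before quarantine
BACKOFF_S = 5 * 60     # quarantine backoff, assuming "5 m" means five minutes


class Quarantine:
    """Tracks hard failures per route (hypothetical helper)."""

    def __init__(self):
        self.failures = {}   # route -> consecutive hard failures
        self.until = {}      # route -> timestamp when quarantine ends

    def record_failure(self, route, now):
        """Count a hard error; quarantine the route once the threshold is hit."""
        self.failures[route] = self.failures.get(route, 0) + 1
        if self.failures[route] >= QUARANTINE_AFTER:
            self.until[route] = now + BACKOFF_S

    def record_success(self, route):
        """A healthy response releases the route and resets its counter."""
        self.failures[route] = 0
        self.until.pop(route, None)

    def is_quarantined(self, route, now):
        return self.until.get(route, 0.0) > now


def candidates(healthy, cached, quarantine, now, already_tried=()):
    """Prefer healthy routes; otherwise fall back to cached endpoints.

    Quarantined routes and routes already used in this request are excluded
    from both lists.
    """
    def usable(route):
        return route not in already_tried and not quarantine.is_quarantined(route, now)

    primary = [route for route in healthy if usable(route)]
    return primary or [route for route in cached if usable(route)]
```

Only hard errors would feed `record_failure` in this model; a slow first byte is a weak signal and would more naturally lower a route's score than count toward quarantine.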