How it works
How GonkaGate routes requests to Gonka Network
Routing architecture and health signals that drive node selection in Gonka Network, from client to response.
- Routing + fallback — one decision
- Signals — availability + latency
- Metadata — cost + latency per response
Get a key to try routing live. The Gonka API overview has the full surface and compatibility details.
Routing snapshot
Routing guarantees at a glance
Key outcomes you can rely on per request.
- Primary goal
- Fast first byte
- Retries
- Up to x3 per request
- Switch signal
- No data in 10 s
- Visibility
- Usage metadata per response
Who it's for
Teams that need predictability
Built for stable streaming with bounded retries.
- API teams — Stable streaming, bounded retries
- Product leaders — Lower perceived latency, higher reliability
- No-code builders — Predictability without routing tuning
Access check
Verify API key status and account access.
Route selection
Pick the best node for the selected model.
Fallback handling
Re-route on capacity or health issues.
Usage logging
Capture cost, latency, and token metadata.
Response delivery
Return payload with usage fields.
Request flow
Routing decisions use availability and latency signals.
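The five stages above can be read as a simple pipeline. The sketch below is illustrative only: `handle_request`, the `Usage` fields, the in-memory key set, and the route table are hypothetical stand-ins, not GonkaGate's actual implementation or response schema.

```python
from dataclasses import dataclass
import time


@dataclass
class Usage:
    # Hypothetical usage fields; the real response schema may differ.
    latency_ms: float
    total_tokens: int
    cost: float


# Hypothetical in-memory stand-ins for account state and the route table.
ACTIVE_KEYS = {"demo-key"}
ROUTES = {"demo-model": ["node-a", "node-b"]}
UNHEALTHY = {"node-a"}


def handle_request(api_key: str, model: str, prompt: str) -> dict:
    started = time.monotonic()

    # 1. Access check: verify API key status and account access.
    if api_key not in ACTIVE_KEYS:
        raise PermissionError("inactive or unknown API key")

    # 2. Route selection: candidates for the selected model, in preference order.
    candidates = ROUTES.get(model, [])

    # 3. Fallback handling: skip nodes with capacity or health issues.
    node = next((n for n in candidates if n not in UNHEALTHY), None)
    if node is None:
        raise RuntimeError(f"no healthy route for {model}")

    # Placeholder for the real upstream call to the chosen node.
    text = f"[{node}] echo: {prompt}"

    # 4. Usage logging: capture cost, latency, and token metadata.
    tokens = len(prompt.split()) + len(text.split())
    usage = Usage(
        latency_ms=(time.monotonic() - started) * 1000,
        total_tokens=tokens,
        cost=tokens * 0.000001,  # made-up price per token
    )

    # 5. Response delivery: return the payload together with usage fields.
    return {"node": node, "text": text, "usage": usage}
```

Calling `handle_request("demo-key", "demo-model", "hello world")` skips the unhealthy `node-a` and returns a payload from `node-b` with a `usage` object attached.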
Why routing stays fast
Smart routing per model
We score routes per model and pick the one most likely to respond fast, which keeps reliability high even when the network is unstable.
- Pool
- Per-model route pool
- Scoring
- Recent success + latency
- Safety
- Unstable routes cool off
- Result
- Fast, reliable route now
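As a rough illustration of the pool, scoring, and safety ideas above, the sketch below ranks routes by recent success rate and first-byte latency and skips routes that are cooling off. The window size, the scoring formula, and the names `RouteStats` and `pick_route` are assumptions made for this example; the actual scoring is not described in this page.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class RouteStats:
    name: str
    # Recent outcomes as (succeeded, first_byte_seconds); window size is an assumption.
    recent: deque = field(default_factory=lambda: deque(maxlen=20))
    cooling_off: bool = False  # unstable routes are skipped while this is set

    def record(self, ok: bool, first_byte_s: float) -> None:
        self.recent.append((ok, first_byte_s))

    def score(self) -> float:
        if not self.recent:
            return 0.0  # no history yet; treat as unproven
        success_rate = sum(ok for ok, _ in self.recent) / len(self.recent)
        avg_latency = sum(t for _, t in self.recent) / len(self.recent)
        # Higher success and lower latency give a higher score (illustrative formula).
        return success_rate / (1.0 + avg_latency)


def pick_route(pool: list[RouteStats]) -> RouteStats:
    """Return the best-scoring route in a per-model pool, ignoring cooling-off routes."""
    eligible = [r for r in pool if not r.cooling_off]
    if not eligible:
        raise LookupError("no eligible routes in pool")
    return max(eligible, key=RouteStats.score)
```

A route with a clean, fast history outranks one with failures or slow first bytes; setting `cooling_off` on the leader makes `pick_route` fall through to the next best candidate.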
Guardrails
Speed with safe recovery
If the first byte is slow, we switch quickly and keep retries bounded.
If no data arrives in 10 s, stop and switch routes.
- Attempt 1 — no data in 10 s, stop
- Attempt 2 — first byte, continue stream
Up to x3 attempts, always with a different route.
- Retry on — timeout or 429 (rate limit)
- Isolation — each attempt uses a new route
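The retry rules above (10 s first-byte window, at most three attempts, a different route each time, retry only on timeout or 429) can be sketched as a small loop. The exception classes, `send_with_retries`, and the `send(route, timeout)` callable are hypothetical names used for illustration; only the two constants come from the text.

```python
class FirstByteTimeout(Exception):
    """No data arrived within the first-byte window."""


class RateLimited(Exception):
    """Upstream answered with HTTP 429."""


FIRST_BYTE_TIMEOUT_S = 10  # from the text: no data in 10 s, switch routes
MAX_ATTEMPTS = 3           # from the text: up to x3 attempts per request


def send_with_retries(routes, send):
    """Try up to MAX_ATTEMPTS routes, never reusing one within a request.

    `routes` is an ordered list of route names. `send(route, timeout)` is a
    hypothetical callable that returns a response or raises one of the
    retryable errors above. Returns (response, routes_tried).
    """
    tried = []
    for route in routes:
        if len(tried) == MAX_ATTEMPTS:
            break
        if route in tried:
            continue  # isolation: each attempt uses a new route
        tried.append(route)
        try:
            return send(route, FIRST_BYTE_TIMEOUT_S), tried
        except (FirstByteTimeout, RateLimited):
            continue  # retry only on timeout or 429; other errors propagate
    raise RuntimeError(f"all attempts failed: {tried}")
```

Any error other than a timeout or a 429 is not retried in this sketch and surfaces to the caller immediately.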
New routes warm up first; repeated failures go to quarantine.
- x3 — Quarantine, 5 m backoff
- Release — Healthy
Operational rules
- Slow first byte — weak signal; hard errors are strong signals
- No repeat — a route is not reused within one request
- Fallback — if healthy routes are missing, use cached endpoints excluding quarantine
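A minimal sketch of the quarantine and fallback rules follows. It assumes that the three failures are counted consecutively and that a route leaves quarantine either when the 5-minute backoff expires or after a successful response; the page does not spell out these details, and the `Quarantine` class and `candidates` function are hypothetical.

```python
QUARANTINE_AFTER = 3        # from the text: x3 failures
QUARANTINE_BACKOFF_S = 300  # from the text: 5 m backoff


class Quarantine:
    def __init__(self):
        self.failures = {}     # route -> consecutive failure count (assumption)
        self.released_at = {}  # route -> timestamp when quarantine ends

    def record_failure(self, route, now):
        self.failures[route] = self.failures.get(route, 0) + 1
        if self.failures[route] >= QUARANTINE_AFTER:
            self.released_at[route] = now + QUARANTINE_BACKOFF_S

    def record_success(self, route):
        # A healthy response clears the failure count and releases the route.
        self.failures.pop(route, None)
        self.released_at.pop(route, None)

    def is_quarantined(self, route, now):
        return now < self.released_at.get(route, 0)


def candidates(healthy, cached, quarantine, now):
    """Prefer healthy routes; otherwise use cached endpoints minus quarantine."""
    live = [r for r in healthy if not quarantine.is_quarantined(r, now)]
    if live:
        return live
    return [r for r in cached if not quarantine.is_quarantined(r, now)]
```

Timestamps are passed in explicitly (for example from `time.time()`) so the backoff logic can be exercised without waiting five minutes.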