Gonka AI API Reference (via GonkaGate)

Reference for the Gonka AI API via GonkaGate: endpoints, parameters, streaming, error codes, and models.

Need an overview of the Gonka API? Start with the quick start.

Jump to the detailed reference sections without losing context.

Streaming: SSE format and chunk structure.
Error codes: statuses, causes, and fixes.
Rate limits: throughput and retries.
Model selection guide: model IDs and selection tips.

Base URL

Send all API requests to:

Base URL: https://api.gonkagate.com/v1

Try it in the Playground

Authentication

Authorize API requests with a Bearer token in the Authorization header.

HTTP Header

Authorization: Bearer YOUR_API_KEY

Get an API key | Learn more about authentication
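
As a minimal sketch, the header above can be attached with the Python standard library alone. The helper name `build_request` and the `GONKAGATE_API_KEY` environment variable are illustrative assumptions, not part of the GonkaGate API. The request is constructed but never sent.

```python
import os
import urllib.request

BASE_URL = "https://api.gonkagate.com/v1"


def build_request(path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated request for an API path."""
    url = f"{BASE_URL}/{path.lstrip('/')}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
    )


# GONKAGATE_API_KEY is a hypothetical variable name; the fallback is a placeholder.
api_key = os.environ.get("GONKAGATE_API_KEY", "YOUR_API_KEY")
request = build_request("/models", api_key)
print(request.full_url)
```

Reading the key from the environment keeps it out of source control.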

Endpoints

Create Chat Completion

POST /chat/completions

Creates a response for a chat conversation.

Request body

Parameter	Type	Required	Default	Description
`model`	string	Yes	-	Model ID (e.g., qwen/qwen3-32b-fp8)
`messages`	array	Yes	-	Array of message objects with role and content
`stream`	boolean	No	false	Enable streaming responses
`temperature`	number	No	1.0	Sampling temperature (0-2)
`max_tokens`	integer	No	4096	Maximum tokens in response
`top_p`	number	No	1.0	Nucleus sampling threshold
`stop`	string | array	No	null	Stop sequences
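
The table can be mirrored by a small builder that sends only the fields the caller sets and leaves the rest to the documented defaults. This is a sketch: `build_chat_request` is a hypothetical helper. The temperature range check follows the 0-2 range in the table; the other checks are basic assumptions about what the server would reject.

```python
from typing import Any, Optional, Sequence, Union


def build_chat_request(
    model: str,
    messages: Sequence[dict],
    *,
    stream: Optional[bool] = None,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    top_p: Optional[float] = None,
    stop: Union[str, Sequence[str], None] = None,
) -> dict[str, Any]:
    """Return a JSON-serializable body for POST /chat/completions."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must contain at least one message")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if max_tokens is not None and max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")

    body: dict[str, Any] = {"model": model, "messages": list(messages)}
    optional = {
        "stream": stream,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "top_p": top_p,
        "stop": stop,
    }
    # Omit unset fields so the server applies its own defaults.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


body = build_chat_request(
    "qwen/qwen3-32b-fp8",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
)
print(body)
```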

Response

json

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1735500000,
  "model": "qwen/qwen3-32b-fp8",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 10,
    "total_tokens": 35,
    "base_cost_usd": 0.000175,
    "platform_fee_usd": 0.0000175,
    "total_cost_usd": 0.0001925
  }
}

Code examples

chat_completions.py

from openai import OpenAI

client = OpenAI(
    base_url="https://api.gonkagate.com/v1",
    api_key="your-gonkagate-api-key"
)

response = client.chat.completions.create(
    model="qwen/qwen3-32b-fp8",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    temperature=0.7,
    max_tokens=1000
)

print(response.choices[0].message.content)
print(f"Usage: {response.usage.total_tokens} tokens")

List models

GET /models

Returns the list of available models.

Response

json

{
  "data": [
    {
      "id": "qwen/qwen3-235b-a22b-instruct-2507-fp8",
      "name": "Qwen3 235B A22B Instruct 2507 FP8",
      "description": "A powerful 235B parameter model for complex reasoning tasks.",
      "context_length": 131072,
      "pricing": {
        "prompt": "0.000000350",
        "completion": "0.000000350"
      }
    },
    {
      "id": "qwen/qwq-32b",
      "name": "QwQ 32B",
      "description": "Compact reasoning model with strong logic capabilities.",
      "context_length": 32768,
      "pricing": {
        "prompt": "0.000000580",
        "completion": "0.000000580"
      }
    }
  ]
}

Code examples

list_models.py

import requests

response = requests.get(
    "https://api.gonkagate.com/v1/models",
    headers={"Authorization": "Bearer your-gonkagate-api-key"}
)

models = response.json()["data"]

for model in models:
    price_per_m = float(model["pricing"]["prompt"]) * 1_000_000
    print(f"{model['name']}: ${price_per_m:.2f}/1M tokens")

View the model list

Parameters

Full reference for request parameters.

OpenAI compatibility

GonkaGate supports all standard OpenAI Chat Completions parameters; see the OpenAI API reference for details.

Core parameters

Required and commonly used parameters for chat completions.

Parameter	Type	Required	Default	Description
`model`	string	Yes	-	Model ID to use for completion. The model ID specifies which Gonka Network model to use. Example: `qwen/qwen3-32b-fp8`.
`messages`	array	Yes	-	Array of message objects with `role` and `content`. The messages make up the conversation; each has a `role` (system, user, assistant, or tool) and `content`. See the messages array schema below for the full structure.
`stream`	boolean	No	false	Enable streaming responses. If true, partial message deltas are sent as Server-Sent Events (SSE), with tokens sent as they become available.
`max_tokens`	integer	No	4096	Maximum number of tokens to generate. The model stops when this limit is reached. Constraints: 1 to the context window size.
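
The `max_tokens` constraint can be checked on the client before sending. This is a sketch under assumptions: `clamp_max_tokens` is a hypothetical helper, and `context_length` is taken to be the value reported for the model by GET /models, which may be null.

```python
from typing import Optional


def clamp_max_tokens(requested: int, context_length: Optional[int]) -> int:
    """Keep max_tokens within 1..context_length when the window is known."""
    if requested < 1:
        raise ValueError("max_tokens must be at least 1")
    if context_length is None:
        # Window size unknown: pass the value through and let the API decide.
        return requested
    return min(requested, context_length)


print(clamp_max_tokens(200_000, 131_072))
```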

Sampling parameters

Control the randomness and creativity of model output.

Advanced parameters

Additional options for JSON mode, tools, and reproducibility.

Messages array schema

Structure of the messages parameter. Each message is an object with role and content.

Field	Type	Required	Description
`role`	string	Yes	The role of the message author. Allowed values: `system`, `user`, `assistant`, `developer`
`content`	string | null	Yes	The message content. Can be null for assistant messages with tool_calls.
`name`	string	No	Optional name for the participant. Useful for multi-agent conversations.
`tool_calls`	array	No	Tool calls made by the assistant. Only present in assistant messages.
`tool_call_id`	string	No	ID of the tool call this message is responding to. Required for tool role.

Examples

json

[
  {
    "role": "user",
    "content": "Hello!"
  }
]
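
A messages array can be checked against the schema table before it is sent. This is a sketch: `validate_messages` is a hypothetical helper, and the `tool` role is included as an assumption because the `tool_call_id` row refers to it even though the allowed-values list does not.

```python
# "tool" is an assumption: the schema's tool_call_id row mentions a tool role.
ALLOWED_ROLES = {"system", "user", "assistant", "developer", "tool"}


def validate_messages(messages: list) -> None:
    """Raise ValueError if a message violates the schema table."""
    for i, msg in enumerate(messages):
        role = msg.get("role")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"messages[{i}]: unknown role {role!r}")
        if "content" not in msg:
            raise ValueError(f"messages[{i}]: content is required")
        # content may be null only for assistant messages that carry tool_calls.
        if msg["content"] is None and not (role == "assistant" and msg.get("tool_calls")):
            raise ValueError(f"messages[{i}]: content may not be null here")
        if role == "tool" and not msg.get("tool_call_id"):
            raise ValueError(f"messages[{i}]: tool messages need tool_call_id")


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I help you today?"},
    {"role": "user", "content": "Which models are available?"},
]
validate_messages(conversation)  # passes silently
```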

Response schemas

Structure of API responses.

Chat Completion response

Field	Type	Description
`id`	string	Unique identifier for the completion
`object`	string	Always "chat.completion"
`created`	integer	Unix timestamp when created
`model`	string	Model used for the completion
`choices`	array	Array of completion choices
`choices[].index`	integer	Index of this choice
`choices[].message`	object	The generated message
`choices[].message.role`	string	Always "assistant"
`choices[].message.content`	string | null	The generated text
`choices[].message.tool_calls`	array	Tool calls (if any)
`choices[].finish_reason`	string	"stop", "length", "tool_calls", or "content_filter"
`usage`	object	Token usage and cost statistics
`usage.prompt_tokens`	integer	Tokens in the input prompt
`usage.completion_tokens`	integer	Tokens in the response
`usage.total_tokens`	integer	Total tokens used
`usage.base_cost_usd`	number	Base cost before platform fee (USD)
`usage.platform_fee_usd`	number	Platform fee, 10% of base cost (USD)
`usage.total_cost_usd`	number	Total cost deducted from balance (USD)

json

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1735500000,
  "model": "qwen/qwen3-32b-fp8",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 10,
    "total_tokens": 35,
    "base_cost_usd": 0.000175,
    "platform_fee_usd": 0.0000175,
    "total_cost_usd": 0.0001925
  }
}
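
The cost fields in `usage` can be sanity-checked on the client. This is a sketch: `check_usage` is a hypothetical helper, and it assumes that the total equals base cost plus platform fee and that total tokens equal prompt plus completion tokens. Both relations hold in the sample above but are not stated as guarantees.

```python
import math


def check_usage(usage: dict, fee_rate: float = 0.10) -> bool:
    """Return True if token and cost fields are mutually consistent."""
    tokens_ok = (
        usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]
    )
    # Compare floats with a tolerance rather than exact equality.
    fee_ok = math.isclose(
        usage["platform_fee_usd"], usage["base_cost_usd"] * fee_rate, rel_tol=1e-6
    )
    total_ok = math.isclose(
        usage["total_cost_usd"],
        usage["base_cost_usd"] + usage["platform_fee_usd"],
        rel_tol=1e-6,
    )
    return tokens_ok and fee_ok and total_ok


sample_usage = {
    "prompt_tokens": 25,
    "completion_tokens": 10,
    "total_tokens": 35,
    "base_cost_usd": 0.000175,
    "platform_fee_usd": 0.0000175,
    "total_cost_usd": 0.0001925,
}
print(check_usage(sample_usage))
```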

Streaming response

When stream: true, responses are sent as Server-Sent Events (SSE).

Chunk format

Field	Type	Description
`id`	string	Same ID for all chunks in a stream
`object`	string	Always "chat.completion.chunk"
`created`	integer	Unix timestamp
`model`	string	Model used
`choices`	array	Array with incremental content
`choices[].index`	integer	Choice index
`choices[].delta`	object	Incremental content
`choices[].delta.role`	string	Role (first chunk only)
`choices[].delta.content`	string	Text content to append
`choices[].finish_reason`	string | null	null until final chunk, then "stop"
`usage`	object | undefined	Token usage and cost (only in the final chunk before [DONE])
`usage.prompt_tokens`	integer	Tokens in the input prompt
`usage.completion_tokens`	integer	Tokens in the response
`usage.total_tokens`	integer	Total tokens used
`usage.base_cost_usd`	number	Base cost before platform fee (USD)
`usage.platform_fee_usd`	number	Platform fee, 10% of base cost (USD)
`usage.total_cost_usd`	number	Total cost deducted from balance (USD)

json

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1735500000,"model":"qwen/qwen3-32b-fp8","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1735500000,"model":"qwen/qwen3-32b-fp8","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1735500000,"model":"qwen/qwen3-32b-fp8","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1735500000,"model":"qwen/qwen3-32b-fp8","choices":[{"index":0,"delta":{},"finish_reason":"stop"}],"usage":{"prompt_tokens":10,"completion_tokens":5,"total_tokens":15,"base_cost_usd":0.000075,"platform_fee_usd":0.0000075,"total_cost_usd":0.0000825}}

data: [DONE]

data: [DONE] marks the end of the stream. No further chunks follow.
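
The chunk format can be consumed with a few lines of parsing. The sketch below assembles the text and captures `usage` from raw SSE lines. `collect_stream` is a hypothetical helper, the sample lines are abbreviated versions of the chunks shown above, and real responses would arrive over the network.

```python
import json
from typing import Iterable, Optional


def collect_stream(lines: Iterable[str]) -> tuple[str, Optional[dict]]:
    """Concatenate delta content from SSE lines; return (text, usage)."""
    parts: list[str] = []
    usage: Optional[dict] = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end of stream
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
        # usage appears only in the final chunk; keep the last one seen.
        usage = chunk.get("usage") or usage
    return "".join(parts), usage


sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}],'
    '"usage":{"prompt_tokens":10,"completion_tokens":5,"total_tokens":15}}',
    "",
    "data: [DONE]",
]
text, usage = collect_stream(sample)
print(text, usage)
```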

Model list response

Field	Type	Description
`data`	array	Array of model objects
`data[].id`	string	Model identifier for API requests
`data[].name`	string	Human-readable model name
`data[].description`	string | null	Model description (from HuggingFace)
`data[].context_length`	number | null	Maximum context window in tokens
`data[].pricing`	object	Cost per token in USD
`data[].pricing.prompt`	string	Cost per input token in USD
`data[].pricing.completion`	string	Cost per output token in USD

json

{
  "data": [
    {
      "id": "qwen/qwen3-235b-a22b-instruct-2507-fp8",
      "name": "Qwen3 235B A22B Instruct 2507 FP8",
      "description": "A powerful 235B parameter model for complex reasoning tasks.",
      "context_length": 131072,
      "pricing": {
        "prompt": "0.000000350",
        "completion": "0.000000350"
      }
    },
    {
      "id": "qwen/qwq-32b",
      "name": "QwQ 32B",
      "description": "Compact reasoning model with strong logic capabilities.",
      "context_length": 32768,
      "pricing": {
        "prompt": "0.000000580",
        "completion": "0.000000580"
      }
    }
  ]
}
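
Because prices arrive as decimal strings, a cost estimate can use `decimal.Decimal` to avoid floating-point rounding. This is a sketch: `estimate_cost_usd` is a hypothetical helper, and it multiplies only the listed per-token prices. How the platform fee relates to these prices is not specified here, so treat the result as approximate.

```python
from decimal import Decimal


def estimate_cost_usd(pricing: dict, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate token cost from a model's pricing object."""
    return (
        Decimal(pricing["prompt"]) * prompt_tokens
        + Decimal(pricing["completion"]) * completion_tokens
    )


pricing = {"prompt": "0.000000350", "completion": "0.000000350"}
cost = estimate_cost_usd(pricing, prompt_tokens=1000, completion_tokens=500)
print(f"Estimated cost: ${cost}")
```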

Error codes

GonkaGate uses standard HTTP status codes and OpenAI-compatible error responses.

HTTP status codes

All possible API error responses and their meanings.

Status	Type	Code	Description

Error handling

Implement retry logic with exponential backoff for rate limits and server errors. Do not retry client errors (4xx other than 429).

error_handling.py

import openai
import time

client = openai.OpenAI(
    api_key="your-gonkagate-api-key",
    base_url="https://api.gonkagate.com/v1"
)

def chat_with_retry(messages, max_retries=3):
    """Make API request with exponential backoff retry logic."""
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="qwen/qwen3-32b-fp8",
                messages=messages
            )
            return response.choices[0].message.content

        except openai.RateLimitError as e:
            # 429: Wait and retry with exponential backoff
            wait_time = 2 ** attempt
            print(f"Rate limited. Waiting {wait_time}s...")
            time.sleep(wait_time)

        except openai.AuthenticationError as e:
            # 401: Invalid API key - don't retry
            print(f"Auth error: {e.message}")
            raise

        except openai.BadRequestError as e:
            # 400: Invalid request - don't retry
            print(f"Bad request: {e.message}")
            raise

        except openai.APIStatusError as e:
            # 5xx: Server error - retry with backoff
            if e.status_code >= 500:
                wait_time = 2 ** attempt
                print(f"Server error {e.status_code}. Retrying in {wait_time}s...")
                time.sleep(wait_time)
            else:
                raise

    raise Exception("Max retries exceeded")

# Usage
result = chat_with_retry([{"role": "user", "content": "Hello!"}])
print(result)

Rate limits

Default limits for chat completions. In production they may be overridden by configuration.

Authenticated endpoint

Default limits for requests made with an API key.

Scope	RPM	TPM	Burst (10s)	Concurrency
Per API key	600	2,000,000	200/10s	50
Per user	3000	10,000,000	1000/10s	200
Per IP	3000	10,000,000	1000/10s	-
Per API key + IP	600	2,000,000	200/10s	-
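
A client can stay under the burst limit by counting its own requests. The sketch below is a generic sliding-window limiter configured with the per-key default of 200 requests per 10 seconds. `SlidingWindowLimiter` is a hypothetical class, and how the server actually measures the window is an assumption.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any rolling `window` seconds."""

    def __init__(self, limit: int, window: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self._calls = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True, or return False if over the limit."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        if len(self._calls) < self.limit:
            self._calls.append(now)
            return True
        return False


burst = SlidingWindowLimiter(limit=200, window=10.0)
if burst.try_acquire():
    print("OK to send request")
```

When `try_acquire` returns False, the caller can sleep briefly and try again instead of sending a request that would likely be rejected with 429.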

Anti-abuse limits

Additional restrictions against abuse and excessive key sharing.

Limit	Value	Scope
Unique IPs per API key (per hour)	200	Per API key
Unique API keys per IP (per hour)	200	Per IP

Public Playground

Separate limits for the unauthenticated endpoint /v1/public/chat/completions, which the Playground uses.

Limit	Value	Scope
Requests per minute	20	Per IP
Requests per hour	50	Per IP

When a limit is exceeded, check the Retry-After header for how long to wait. Use exponential backoff for retries.
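
The two rules can be combined in one delay function: honor Retry-After when it is present, otherwise fall back to exponential backoff. This is a sketch: `retry_delay` is a hypothetical helper, Retry-After is assumed to carry a number of seconds, and the 1-second base and 60-second cap are illustrative values.

```python
from typing import Mapping


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    value = headers.get("Retry-After")
    if value is not None:
        try:
            return max(0.0, float(value))
        except ValueError:
            pass  # not a number of seconds (e.g. an HTTP date); use backoff
    # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`.
    return min(cap, base * 2 ** attempt)


print(retry_delay({"Retry-After": "7"}, attempt=0))
print(retry_delay({}, attempt=3))
```

The lookup is case-sensitive for a plain dict; HTTP client libraries usually expose case-insensitive header mappings.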

Low balance warning

When your balance drops below $5, a warning appears in the Dashboard. We recommend topping up to avoid service interruption.

External resources

Protocol and network documentation is maintained officially by Gonka Network. This page describes the GonkaGate API gateway.

Official Gonka Network documentation
