Qwen: Qwen3 235B A22B Instruct 2507 FP8
qwen/qwen3-235b-a22b-instruct-2507-fp8
Updated Qwen3 MoE FP8 non-thinking instruction model with 235B total parameters, 22B activated parameters, and 240K context on Gonka. It focuses on instruction following, reasoning, coding, tool use, multilingual tasks, and long-context tasks.
- Modalities: text -> text
- Network cost: N/A
- Platform fee: N/A
- Total price: N/A
- Context: 240K
- Released: Jul 21, 2025
Performance for Qwen3 235B A22B Instruct 2507 FP8
Gonka Network throughput, latency, and reliability metrics are coming soon.
Latency
E2E Latency
Tool call error rate
Structured output error rate
Pricing history for Qwen3 235B A22B Instruct 2507 FP8
Historical Gonka Network pricing for this model is coming soon.
Current pricing
- Total price: Soon (per 1M tokens)
History preview
- Total price
Sample code and API for Qwen3 235B A22B Instruct 2507 FP8
GonkaGate accepts OpenAI-compatible requests and sends them to Gonka Network.
Get your API key
Create an API key from your GonkaGate dashboard and set it as an environment variable:
export GONKAGATE_API_KEY=gp-your-api-key
Make your first request
Use qwen/qwen3-235b-a22b-instruct-2507-fp8 as the model value in your requests.
GonkaGate provides an OpenAI-compatible chat completions API for Gonka Network models. You can call it directly or use common OpenAI SDKs.
Set the base URL to https://api.gonkagate.com/v1 and pass your GonkaGate API key in the Authorization header.
import OpenAI from "openai";
// Point the OpenAI SDK at GonkaGate and authenticate with your API key.
const client = new OpenAI({
  baseURL: "https://api.gonkagate.com/v1",
  apiKey: process.env.GONKAGATE_API_KEY
});
const response = await client.chat.completions.create({
  model: "qwen/qwen3-235b-a22b-instruct-2507-fp8",
  messages: [
    { role: "user", content: "What is the meaning of life?" }
  ]
});
// Print the text of the first completion choice.
console.log(response.choices[0].message.content);
Using third-party SDKs
For SDK and framework setup examples, see SDK documentation and integrations.
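The same request can also be made without any SDK. The following is a minimal sketch of how the URL, headers, and JSON body fit together, using only values shown on this page. The helper name `buildChatRequest` and the `PreparedRequest` type are hypothetical and exist only for illustration; the API key is the placeholder from the setup step.

```typescript
// Sketch only: builds the URL and fetch options for one chat completion request.
// The endpoint and model slug come from this page; the helper name is hypothetical.
const BASE_URL = "https://api.gonkagate.com/v1";
const MODEL = "qwen/qwen3-235b-a22b-instruct-2507-fp8";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildChatRequest(apiKey: string, messages: ChatMessage[]): PreparedRequest {
  return {
    url: `${BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // The API key travels as a Bearer token in the Authorization header.
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model: MODEL, messages }),
    },
  };
}

// Placeholder key; in practice read it from the GONKAGATE_API_KEY environment variable.
const prepared = buildChatRequest("gp-your-api-key", [
  { role: "user", content: "Hello" },
]);
console.log(prepared.url);
```

Passing `prepared.url` and `prepared.init` to `fetch` would send the request. Because the API is described as OpenAI-compatible, the reply text is assumed to sit at `choices[0].message.content` in the JSON response, as in the SDK example above.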
Enable streaming
Add "stream": true to your request body to receive responses as server-sent events:
curl -N https://api.gonkagate.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GONKAGATE_API_KEY" \
  -d '{
    "model": "qwen/qwen3-235b-a22b-instruct-2507-fp8",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
Endpoint
- Authorization: Bearer $GONKAGATE_API_KEY
- Content-Type: application/json
- Model: qwen/qwen3-235b-a22b-instruct-2507-fp8
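When a request with these headers sets "stream": true, the response arrives as server-sent events rather than a single JSON object. The sketch below shows one way a client might pull the text out of such a stream. It assumes the OpenAI streaming conventions (`data: {json}` lines carrying `choices[0].delta.content`, terminated by `data: [DONE]`), which this page does not spell out; the function name `extractDeltas` and the sample payload are hypothetical.

```typescript
// Sketch only: pulls text fragments out of a buffered server-sent event payload.
// Assumes OpenAI-style chunks: `data: {json}` lines, ended by `data: [DONE]`.
interface StreamChunk {
  choices?: { delta?: { content?: string | null } }[];
}

function extractDeltas(sseText: string): string[] {
  const pieces: string[] = [];
  for (const rawLine of sseText.split("\n")) {
    const line = rawLine.trim();
    if (!line.startsWith("data:")) continue; // skip blank lines and other SSE fields
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") break; // assumed end-of-stream sentinel
    const chunk = JSON.parse(payload) as StreamChunk;
    const text = chunk.choices?.[0]?.delta?.content;
    if (text) pieces.push(text); // ignore chunks without text, e.g. the role-only first chunk
  }
  return pieces;
}

// Hand-written sample input for illustration, not captured from the service.
const sample = [
  'data: {"choices":[{"delta":{"role":"assistant"}}]}',
  "",
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  "",
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "",
  "data: [DONE]",
  "",
].join("\n");

console.log(extractDeltas(sample).join("")); // prints "Hello"
```

A live stream is delivered in network chunks that can split an event across reads, so a real client would buffer incoming text and only parse complete lines before handing them to a function like this one.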
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| reasoning | map | — | Controls reasoning behavior for models that support thinking tokens, including reasoning effort and whether reasoning is included in the response. |
| max_tokens | integer | — | Sets the upper limit on the number of tokens the model can generate in a response. |
| temperature | float | 1 | Controls sampling randomness: lower values make responses more focused and repeatable, higher values make them more varied. |
| top_p | float | 1 | Limits choices to the smallest set of likely tokens whose probabilities add up to this value. |
| seed | integer | — | Samples deterministically when supported, so repeated requests with the same seed and parameters can return the same result. |
| presence_penalty | float | 0 | Penalizes tokens that have already appeared in the text so far, so positive values discourage the model from repeating them. |
| response_format | map | — | Requests a specific output format when the model supports it. |
| tools | array | — | Tool calling parameter following OpenAI's tool calling request shape. |
| tool_choice | string or object | — | Controls which tool, if any, is called by the model. |
| logprobs | boolean | — | Requests output-token log probabilities when supported. |
| top_logprobs | integer | — | Specifies how many most likely tokens to return at each token position, each with an associated log probability. |
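To show how several of these parameters combine in one request body, here is a small sketch. The tool definition follows OpenAI's tool calling request shape, as the table states; the tool name `get_weather`, its schema, the chosen parameter values, and the `"auto"` value for `tool_choice` are assumptions made for illustration rather than values documented on this page.

```typescript
// Sketch only: a request body combining several parameters from the table above.
// The tool name `get_weather` and its schema are hypothetical illustrations.
interface ToolDefinition {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>; // JSON Schema describing the arguments
  };
}

const weatherTool: ToolDefinition = {
  type: "function",
  function: {
    name: "get_weather",
    description: "Look up the current weather for a city.",
    parameters: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
};

const requestBody = {
  model: "qwen/qwen3-235b-a22b-instruct-2507-fp8",
  messages: [{ role: "user", content: "What is the weather in Paris?" }],
  max_tokens: 256, // cap on generated tokens
  temperature: 0.2, // below the default of 1 for less varied output
  top_p: 0.9, // sample only from the most likely tokens covering 90% probability
  seed: 42, // repeatable sampling, only when supported
  tools: [weatherTool], // OpenAI-style tool calling shape
  tool_choice: "auto", // assumed OpenAI value: let the model decide whether to call a tool
};

console.log(JSON.stringify(requestBody, null, 2));
```

The serialized object can be sent as the request body to the chat completions endpoint, or the same fields can be passed to `client.chat.completions.create` in the SDK example. Parameters left out fall back to the defaults listed in the table.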