Qwen: Qwen3 235B A22B Instruct 2507 FP8

qwen/qwen3-235b-a22b-instruct-2507-fp8

Updated Qwen3 MoE FP8 non-thinking instruction model with 235B total parameters, 22B activated parameters, and a 240K context on Gonka. It is focused on instruction following, reasoning, coding, tool usage, multilingual, and long-context tasks.

by qwen
Modalities: text -> text
Network cost: N/A
Platform fee: N/A
Total price: N/A
Context: 240K
Released: Jul 21, 2025

Performance for Qwen3 235B A22B Instruct 2507 FP8

Gonka Network throughput, latency, and reliability metrics are coming soon.

Throughput: coming soon
Latency: coming soon
E2E Latency: coming soon
Tool call error rate: coming soon
Structured output error rate: coming soon

Pricing history for Qwen3 235B A22B Instruct 2507 FP8

Historical Gonka Network pricing for this model is coming soon.

Current pricing
Total price (per 1M tokens): coming soon

History preview
Total price: coming soon

Sample code and API for Qwen3 235B A22B Instruct 2507 FP8

GonkaGate accepts OpenAI-compatible requests and sends them to Gonka Network.

1. Get your API key

Create an API key from your GonkaGate dashboard and set it as an environment variable:

export GONKAGATE_API_KEY=gp-your-api-key

2. Make your first request

Set the model field of your request to qwen/qwen3-235b-a22b-instruct-2507-fp8.

GonkaGate provides an OpenAI-compatible chat completions API for Gonka Network models. You can call it directly or use common OpenAI SDKs.

Set the base URL to https://api.gonkagate.com/v1 and pass your GonkaGate API key as a Bearer token in the Authorization header.

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.gonkagate.com/v1",
  apiKey: process.env.GONKAGATE_API_KEY
});

const response = await client.chat.completions.create({
  model: "qwen/qwen3-235b-a22b-instruct-2507-fp8",
  messages: [
    { role: "user", content: "What is the meaning of life?" }
  ]
});

console.log(response.choices[0].message.content);

Using third-party SDKs

For SDK and framework setup examples, see SDK documentation and integrations.

3. Enable streaming

Add "stream": true to your request body to receive responses as server-sent events:

curl -N https://api.gonkagate.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GONKAGATE_API_KEY" \
  -d '{
    "model": "qwen/qwen3-235b-a22b-instruct-2507-fp8",
    "stream": true,
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
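
The same streaming request can be consumed from Node. The sketch below is not an official client. It assumes the stream follows the usual OpenAI convention, which this page does not spell out: each event is a `data:` line carrying a JSON chunk with the text in `choices[0].delta.content`, and the stream ends with `data: [DONE]`. `createSseParser` and `streamChat` are hypothetical helper names, and the code relies on the global `fetch` in Node 18+.

```javascript
// createSseParser is a hypothetical helper. It buffers raw text and returns
// the content fragments found in each complete "data:" line.
function createSseParser() {
  let buffer = "";
  let done = false;
  return {
    push(text) {
      buffer += text;
      const lines = buffer.split("\n");
      buffer = lines.pop(); // keep the trailing partial line for the next call
      const fragments = [];
      for (const raw of lines) {
        const line = raw.trim();
        if (!line.startsWith("data:")) continue; // skip blank lines and comments
        const payload = line.slice(5).trim();
        if (payload === "[DONE]") {
          done = true; // assumed OpenAI-style end-of-stream sentinel
          continue;
        }
        const chunk = JSON.parse(payload);
        // Assumed chunk shape: choices[0].delta.content holds the new text.
        const content = chunk.choices?.[0]?.delta?.content;
        if (content) fragments.push(content);
      }
      return fragments;
    },
    isDone() {
      return done;
    },
  };
}

// Sends a streaming request and calls onText for every text fragment.
async function streamChat(messages, onText) {
  const res = await fetch("https://api.gonkagate.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.GONKAGATE_API_KEY}`,
    },
    body: JSON.stringify({
      model: "qwen/qwen3-235b-a22b-instruct-2507-fp8",
      stream: true,
      messages,
    }),
  });
  if (!res.ok || !res.body) {
    throw new Error(`Request failed with HTTP ${res.status}`);
  }
  const parser = createSseParser();
  const decoder = new TextDecoder();
  // In Node 18+, the response body is async-iterable and yields byte chunks.
  for await (const bytes of res.body) {
    const text = decoder.decode(bytes, { stream: true });
    for (const fragment of parser.push(text)) onText(fragment);
    if (parser.isDone()) break;
  }
}
```

A call such as `streamChat([{ role: "user", content: "Hello" }], (t) => process.stdout.write(t))` would print fragments as they arrive. The parser keeps the trailing partial line in a buffer because a network chunk can end in the middle of an event.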

Endpoint

POST https://api.gonkagate.com/v1/chat/completions
Authorization: Bearer $GONKAGATE_API_KEY
Content-Type: application/json
Model: qwen/qwen3-235b-a22b-instruct-2507-fp8
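
For readers who prefer not to use an SDK, the endpoint details above can be assembled into a plain HTTP request. This is a minimal sketch, assuming Node 18+ with the global `fetch` and the OpenAI-style response shape used in the SDK example. `buildRequest` and `chat` are hypothetical helper names, not part of any GonkaGate library.

```javascript
// Endpoint and model ID as listed on this page.
const ENDPOINT = "https://api.gonkagate.com/v1/chat/completions";
const MODEL = "qwen/qwen3-235b-a22b-instruct-2507-fp8";

// buildRequest is a hypothetical helper: it returns the URL and fetch options
// without sending anything, so the request can be inspected or logged first.
function buildRequest(apiKey, messages) {
  if (!apiKey) {
    throw new Error("GONKAGATE_API_KEY is not set");
  }
  return {
    url: ENDPOINT,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model: MODEL, messages }),
    },
  };
}

// Sends the request and returns the text of the first choice.
async function chat(messages) {
  const { url, options } = buildRequest(process.env.GONKAGATE_API_KEY, messages);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`Request failed with HTTP ${res.status}`);
  }
  const data = await res.json();
  // Assumes the OpenAI-style response shape: choices[0].message.content.
  return data.choices[0].message.content;
}
```

Calling `chat([{ role: "user", content: "Hello" }]).then(console.log)` would send one non-streaming request. Separating request construction from sending keeps the header and body logic easy to check without network access.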

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| reasoning | map | | Controls reasoning behavior for models that support thinking tokens, including reasoning effort and whether reasoning is included in the response. |
| max_tokens | integer | | Sets the upper limit for the number of tokens the model can generate in response. |
| temperature | float | 1 | Influences the variety of the model's responses. |
| top_p | float | 1 | Limits choices to the smallest set of likely tokens whose probabilities add up to this value. |
| seed | integer | | Samples deterministically when supported, so repeated requests with the same seed and parameters can return the same result. |
| presence_penalty | float | 0 | Adjusts how often the model repeats specific tokens already used in the input. |
| response_format | map | | Requests a specific output format when the model supports it. |
| tools | array | | Tool calling parameter following OpenAI's tool calling request shape. |
| tool_choice | string or object | | Controls which tool, if any, is called by the model. |
| logprobs | boolean | | Requests output-token log probabilities when supported. |
| top_logprobs | integer | | Specifies how many most likely tokens to return at each token position, each with an associated log probability. |
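
To show how several of these parameters fit together in one request body, here is a hedged sketch. The numeric values are illustrative only, not recommendations. `get_weather` is a hypothetical tool, `buildBody` is a hypothetical helper, and the tool definition and the `"auto"` value for `tool_choice` follow OpenAI's tool calling request shape, which this page says the `tools` parameter mirrors.

```javascript
// get_weather is a hypothetical tool, defined in OpenAI's tool calling shape.
const weatherTool = {
  type: "function",
  function: {
    name: "get_weather",
    description: "Look up the current weather for a city.",
    parameters: {
      type: "object",
      properties: { city: { type: "string" } },
      required: ["city"],
    },
  },
};

// buildBody is a hypothetical helper that returns a chat completions body.
// Any field can be replaced through the overrides argument.
function buildBody(userText, overrides = {}) {
  return {
    model: "qwen/qwen3-235b-a22b-instruct-2507-fp8",
    messages: [{ role: "user", content: userText }],
    max_tokens: 256, // upper limit on generated tokens
    temperature: 0.2, // lower than the default of 1 for less varied output
    top_p: 0.9, // sample from the smallest token set covering 90% probability
    seed: 42, // may give repeatable results when supported
    presence_penalty: 0, // the documented default
    tools: [weatherTool],
    tool_choice: "auto", // assumed OpenAI value: the model decides whether to call a tool
    ...overrides,
  };
}
```

The resulting object can be passed to `JSON.stringify` and sent as the body of a POST to the chat completions endpoint, or spread into the arguments of `client.chat.completions.create` in the SDK example.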
