Qwen3-Coder-Next is built for agentic coding: multi-step software tasks where the model needs to plan, call tools, recover from failures, and maintain context across long workflows.
On Novita AI, you can run Qwen3-Coder-Next through an OpenAI-compatible API—getting strong coding-agent performance without standing up or managing your own GPU infrastructure.
What is Qwen3-Coder-Next?
Model Overview
| Item | Details |
| --- | --- |
| Organization | Qwen Team (Alibaba) |
| Release Date | Feb 4, 2026 |
| Parameters | 80B total / ~3B activated (MoE) |
| Architecture | Hybrid Attention + High-sparsity MoE (Hybrid layout with Gated DeltaNet + Gated Attention) |
| Context Window | 262,144 tokens (256K) native, extendable |
Qwen3-Coder-Next is an open-weight, agentic code model optimized for strong real-world benchmarks while keeping inference costs low. Its MoE design limits active parameters at runtime, and hybrid attention enables long-context reasoning. The model is designed to plug directly into practical coding workflows—CLI tools, IDE agents, and structured tool calling—while remaining fast enough for everyday development.
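To make the sparsity and context figures concrete, the short calculation below uses only the numbers from the overview table (80B total parameters, roughly 3B activated, a 262,144-token native window). The helper `fits_in_context` and the 8,192-token reply reserve are illustrative assumptions, not part of any Qwen or Novita API.

```python
TOTAL_PARAMS = 80e9        # total parameters, from the overview table
ACTIVE_PARAMS = 3e9        # approximate activated parameters per token
CONTEXT_WINDOW = 262_144   # native context window in tokens (256 * 1024)

# Share of the model's weights that participate in each forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS


def fits_in_context(prompt_tokens: int, reserved_for_output: int = 8_192) -> bool:
    """Return True if the prompt plus a reserved reply budget fits the native window."""
    return prompt_tokens + reserved_for_output <= CONTEXT_WINDOW


print(f"Active fraction per token: {active_fraction:.2%}")  # 3.75%
print(fits_in_context(200_000))  # True
print(fits_in_context(260_000))  # False
```

Token counts here are assumed to come from whatever tokenizer or usage report you already have; the check itself is plain arithmetic.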
Benchmarks and Performance

Practical takeaways
- Strong SWE-Bench Verified performance: A 70.6% score indicates the model can handle real repository-level bug fixing, including search, patching, and test loops. This is an important signal for production-grade coding agents.
- Competitive multilingual repository support: The 62.8% score on SWE-Bench Multilingual suggests the model is not strictly English-first, making it suitable for global teams with multilingual issues, comments, and documentation.
- Solid results on SWE-Bench Pro: A 44.3% score on the harder Pro subset reflects stronger long-horizon reasoning, especially in multi-step debugging and recovery scenarios.
- TerminalBench relevance for tool use: TerminalBench 2.0 evaluates structured command/output loops, which closely map to DevOps automation, CI debugging, and shell-driven agents.
- Aider score supports interactive coding: A 66.2% Aider score is a good indicator for pair-programming workflows such as iterative edits, refactors, and incremental feature development.
Speed & latency
Developer experience depends heavily on latency, not just raw accuracy.
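One way to check latency for your own workload is to time requests yourself. The sketch below is a generic timing helper around any callable. `measure_latency` and `fake_request` are hypothetical names used for illustration; in practice you would pass a function that sends a real request to the API.

```python
import statistics
import time
from typing import Callable


def measure_latency(call: Callable[[], object], runs: int = 5) -> dict:
    """Time `call` several times and summarize wall-clock latency in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "min_s": min(samples),
        "median_s": statistics.median(samples),
        "max_s": max(samples),
    }


def fake_request() -> str:
    """Stand-in for a real API call so the example runs offline."""
    time.sleep(0.01)
    return "ok"


print(measure_latency(fake_request, runs=3))
```

Reporting the median alongside min and max keeps a single slow request from dominating the picture.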

Why this matters on Novita AI
Qwen3-Coder-Next is built to be efficient—with 80B total parameters but only ~3B active at inference—making it especially cost-effective for agentic coding workloads. On Novita AI, this efficiency translates directly into predictable, competitive pricing:
- Input: $0.20 / million tokens
- Output: $1.50 / million tokens
Combined with Novita AI’s scalable API, you can deploy high-performance coding agents that handle long-context reasoning and multi-step workflows—without managing GPUs or incurring unpredictable infrastructure costs.
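As a rough budgeting aid, the listed per-token prices can be turned into a cost estimate. This is a minimal sketch: the prices are the ones quoted above, while `estimate_cost_usd` and the 40-step session are hypothetical and only illustrate the arithmetic.

```python
INPUT_PRICE_PER_M = 0.20   # USD per million input tokens (quoted above)
OUTPUT_PRICE_PER_M = 1.50  # USD per million output tokens (quoted above)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a workload from its total input and output tokens."""
    input_cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M
    output_cost = (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical agent session: 40 steps, each sending 30k tokens and receiving 1.5k.
steps = 40
cost = estimate_cost_usd(steps * 30_000, steps * 1_500)
print(f"${cost:.2f}")  # $0.33
```

Because agent loops resend context on every step, input tokens usually dominate the total, which is why the lower input price matters for this kind of workload.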
How to Access Qwen3-Coder-Next on Novita AI
Quick Start: Playground
For the fastest evaluation, start with Novita’s Playground to experiment with prompts, compare models, and validate output quality before integration.

Use Qwen3-Coder-Next via API
How to Get API Keys
- Step 1: Create or Log In to Your Account: Visit [https://novita.ai](https://novita.ai) and sign up or log in.
- Step 2: Navigate to Key Management: After logging in, find “API Keys”.
- Step 3: Create a New Key: Click the “Add New Key” button.
- Step 4: Save Your Key Immediately: Copy and store the key as soon as it is generated; it is shown only once.
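Since the key is shown only once, a common pattern is to keep it in an environment variable rather than in source code. The sketch below assumes the variable is named `NOVITA_API_KEY`, matching the Python example in this article; `load_api_key` is a hypothetical helper, not part of any SDK.

```python
import os


def load_api_key(env_var: str = "NOVITA_API_KEY") -> str:
    """Read the API key from the environment and fail early with a clear message."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Export it in your shell before running, "
            f"for example: export {env_var}=<your key>"
        )
    return key


# Usage (raises RuntimeError if the variable is missing):
# api_key = load_api_key()
```

Failing at startup with an explicit message tends to be easier to debug than an authentication error returned later by the API.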

Use the following code examples to integrate with our API:
Python (Example)
```python
import os

from openai import OpenAI

# Point the standard OpenAI client at Novita's OpenAI-compatible endpoint.
client = OpenAI(
    api_key=os.environ["NOVITA_API_KEY"],
    base_url="https://api.novita.ai/v3/openai",
)

resp = client.chat.completions.create(
    model="Qwen/Qwen3-Coder-Next",
    messages=[
        {"role": "system", "content": "You are a senior software engineer."},
        {"role": "user", "content": "Fix the bug and write tests. Here is the stack trace: ..."},
    ],
    temperature=0.2,
)

print(resp.choices[0].message.content)
```
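For agent workflows, the same Chat Completions call can carry tool definitions. The sketch below assumes the endpoint accepts the standard OpenAI `tools` parameter for this model, which you should verify before relying on it. The `run_tests` tool, `dispatch_tool_call`, and `ask_with_tools` are hypothetical names invented for this example; `client` is an `OpenAI` client configured as in the example above.

```python
import json

# Tool schema in the OpenAI Chat Completions "tools" format.
# `run_tests` is a hypothetical tool, not something the API provides.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "run_tests",
            "description": "Run the project's test suite and return a summary.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "Test file or directory"},
                },
                "required": ["path"],
            },
        },
    }
]


def run_tests(path: str) -> dict:
    """Placeholder; a real agent would invoke the test runner here."""
    return {"path": path, "passed": True}


LOCAL_TOOLS = {"run_tests": run_tests}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Execute a tool call requested by the model and return a JSON string result."""
    if name not in LOCAL_TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        args = json.loads(arguments_json)
        return json.dumps(LOCAL_TOOLS[name](**args))
    except (json.JSONDecodeError, TypeError) as exc:
        # Report bad arguments back to the model so it can retry.
        return json.dumps({"error": str(exc)})


def ask_with_tools(client, messages: list) -> list:
    """One round trip: send messages with tools, run requested calls, append results."""
    resp = client.chat.completions.create(
        model="Qwen/Qwen3-Coder-Next", messages=messages, tools=TOOLS
    )
    msg = resp.choices[0].message
    messages.append(msg)
    for call in msg.tool_calls or []:
        result = dispatch_tool_call(call.function.name, call.function.arguments)
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    return messages
```

Returning errors as tool results, rather than raising, gives the model a chance to correct its arguments on the next turn, which is the recovery behavior agentic workflows depend on.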
SDK
If you’re building agents, Novita integrates cleanly with frameworks that expect OpenAI Chat Completions:
- OpenAI Agents SDK compatibility
- Standard OpenAI Python/Node SDKs work with minimal changes due to API compatibility
Third-Party Platforms
Novita-hosted models can also be used across many popular ecosystems—so you can bring Qwen3-Coder-Next into existing tools without changing your workflow:
- Agent frameworks & app builders: Integration guides for Continue, AnythingLLM, LangChain, and Langflow.
- Hugging Face Hub: Novita is listed as an Inference Provider, enabling supported model runs via Hugging Face’s provider ecosystem.
- OpenAI-compatible tools: Novita follows the OpenAI API standard, so you can connect OpenAI-style apps and tools such as Cline, Cursor, Trae, and Qwen Code with minimal changes.
- Anthropic-compatible access: Novita also supports Anthropic SDK–compatible integration for Claude Code–style workflows.
- OpenCode & observability: Use Novita directly in OpenCode.
Conclusion
Qwen3-Coder-Next hits a practical sweet spot: agentic coding strength, long-context reasoning, and high throughput, powered by an MoE design that keeps runtime costs under control. For teams looking to ship coding agents—or simply accelerate development workflows—running Qwen3-Coder-Next on Novita AI through its OpenAI-compatible API is one of the simplest paths from evaluation to production.
Novita AI is an AI cloud platform that gives developers an easy way to deploy AI models through a simple API, along with an affordable and reliable GPU cloud for building and scaling.
Frequently Asked Questions
What is Qwen3-Coder-Next?
Qwen3-Coder-Next is an open-weight coding model from the Qwen team, built for agentic coding (multi-step coding tasks with tool use, execution feedback, and recovery). It’s based on Qwen3-Next-80B-A3B-Base and uses a hybrid attention + MoE architecture to achieve strong coding/agent performance with lower inference cost.
How much does Qwen3-Coder-Next cost?
On Novita AI, Qwen3-Coder-Next is priced at $0.20 / 1M input tokens and $1.50 / 1M output tokens (serverless).
Which API providers offer Qwen3-Coder-Next?
Qwen3-Coder-Next is available through multiple API providers, with Novita AI as a cost-effective and OpenAI-compatible option. Other providers include Chutes, Parasail, and Together AI, which differ in latency, throughput, and pricing.
