Developers today face increasing complexity when building intelligent applications that combine vision and code. Traditional text-only models struggle with UI comprehension, layout translation, and structured visual reasoning. Qwen3-VL-235B-A22B bridges this gap through a powerful multimodal design that integrates visual perception with programming logic.
Readers will understand not only how Qwen3-VL-235B-A22B outperforms peers such as GLM-4.5V, but also how to implement it efficiently across development environments like Cursor, Trae, and Codex.
What is Qwen3-VL-235B-A22B?
Model Type: Multimodal (Vision-Language) large model in the Qwen3 family.
Architecture: Mixture-of-Experts (MoE) with ~235 B total parameters, ~22 B activated per inference.
Context Length: Supports up to 256 K tokens, extendable to 1 M tokens.
Visual Capabilities: Excels in GUI element recognition, screenshot-to-code (HTML/CSS/JS/Draw.io), and 2D/3D spatial reasoning.
Language Performance: Matches text-only LLMs in comprehension and reasoning while integrating visual input seamlessly.
OCR & Multilingual: Handles 32 languages with strong performance under blur, tilt, or low-light conditions.
Variants:
- Instruct — optimized for interactive tasks and dialogue.
- Thinking — tuned for extended reasoning and chain-of-thought inference.
Qwen3-VL-235B-A22B leads in OCR, GUI reasoning, and code generation, showing broad multimodal competence. Weaknesses are mainly in complex 3D spatial grounding and subjective alignment tasks. Overall, it’s one of the most balanced and high-performing vision-language models currently benchmarked.

How to Use Qwen3-VL-235B-A22B to Create a Fast Code Demo?
Qwen3-VL-235B-A22B showcases unmatched power in visual coding. With a record 92.0 on Design2Code and 80.5 on ChartMimic, it can accurately translate complex interfaces, charts, and dashboards into clean, executable code.
Novita AI offers APIs supporting a 32.8K context window, priced at $0.98 per 1K input tokens and $3.95 per 1K output tokens. It delivers strong performance with an average latency of 1.17 seconds and throughput of 26.78 TPS (tokens per second).
How to Develop Qwen3-VL-235B-A22B’s Code Ability?
Prompt-engineering for visual-to-code workflows
- First instruct the model to describe a UI or chart image in detail, then ask for code generation. (Technique: Chain-of-Description).
- Provide clear examples of “screenshot → HTML/CSS/JS” conversions so the model learns pattern mapping.
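The chain-of-description flow above can be sketched in a few lines. This is a minimal sketch assuming Novita AI's OpenAI-compatible endpoint and OpenAI-style `image_url` content parts; the helper names (`describe_messages`, `code_messages`, `load_image_b64`) are ours, not part of any official SDK:

```python
import base64

NOVITA_BASE_URL = "https://api.novita.ai/openai"  # OpenAI-compatible endpoint
MODEL = "qwen/qwen3-vl-235b-a22b-thinking"

def load_image_b64(path: str) -> str:
    """Read a screenshot and base64-encode it for an image_url content part."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

def describe_messages(image_b64: str) -> list:
    """Stage 1: ask the model to describe the UI before any code is written."""
    return [
        {"role": "system", "content": "You are a meticulous UI analyst."},
        {"role": "user", "content": [
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            {"type": "text",
             "text": "Describe every visible element: layout, colors, fonts, spacing."},
        ]},
    ]

def code_messages(description: str) -> list:
    """Stage 2: generate HTML/CSS from the stage-1 description."""
    return [
        {"role": "system", "content": "You are a front-end engineer."},
        {"role": "user",
         "content": f"Write a single-file HTML/CSS page matching this description:\n\n{description}"},
    ]

# Usage with the openai SDK (requires a Novita API key):
# from openai import OpenAI
# client = OpenAI(base_url=NOVITA_BASE_URL, api_key="<NOVITA_API_KEY>")
# desc = client.chat.completions.create(
#     model=MODEL, messages=describe_messages(load_image_b64("ui.png"))
# ).choices[0].message.content
# html = client.chat.completions.create(
#     model=MODEL, messages=code_messages(desc)
# ).choices[0].message.content
```

Splitting description from generation forces the model to commit to a layout analysis before writing markup, which tends to reduce missed elements.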
Tool and agent integration
- Enable the model to call code-editing tools: open file, apply diff, run tests. Use it as an interactive “assistant” rather than static code generator.
- Loop: Plan → Act → Observe → Revise, with actual feedback from linting/tests, so the model improves via environment.
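The Plan → Act → Observe → Revise loop can be sketched as a small harness. In this hedged sketch, `generate` stands in for a model call and `check` runs real lint/test feedback; `fake_generate` and `run_tests` are illustrative stubs, not real tooling:

```python
def revise_loop(generate, check, task: str, max_rounds: int = 3) -> str:
    """Plan -> Act -> Observe -> Revise: feed checker feedback back to the model.
    `generate(prompt)` returns code; `check(code)` returns (ok, feedback)."""
    prompt = task
    code = ""
    for _ in range(max_rounds):
        code = generate(prompt)            # Act: ask the model for code
        ok, feedback = check(code)         # Observe: run lint/tests
        if ok:
            return code
        # Revise: next prompt carries the failing attempt plus the feedback
        prompt = (f"{task}\n\nPrevious attempt:\n{code}\n"
                  f"It failed with this feedback, fix it:\n{feedback}")
    return code

# Stub "model" that only fixes its bug once it sees feedback:
def fake_generate(prompt: str) -> str:
    if "feedback" in prompt.lower():
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"

def run_tests(code: str):
    """Execute the candidate and run a real assertion against it."""
    ns = {}
    exec(code, ns)
    if ns["add"](2, 3) == 5:
        return True, ""
    return False, "add(2, 3) should be 5"
```

Because the revision prompt contains concrete failure output rather than a vague "try again", the model converges on a fix instead of guessing.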
Fine-tuning / instruction-tuning on code corpora
- Gather datasets of UI screenshots + target code + tests. Fine-tune the model (or use LoRA) with a mix of reasoning dialogues and code generation.
- Mix reasoning tasks and code tasks so the model retains logic and execution understanding.
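One way to structure such a training record is sketched below. The schema is illustrative only, not an official Qwen fine-tuning format; `make_record` and `write_jsonl` are our own helper names:

```python
import json

def make_record(image_path: str, target_code: str, tests: str, reasoning: str) -> dict:
    """One supervised example: screenshot prompt, chain-of-thought, target code, tests."""
    return {
        "messages": [
            {"role": "user", "content": [
                {"type": "image", "image": image_path},
                {"type": "text",
                 "text": "Reproduce this UI as HTML/CSS and explain your layout choices."},
            ]},
            # The assistant turn mixes reasoning with the final code, so the
            # model learns both the logic and the generation in one example.
            {"role": "assistant", "content": f"{reasoning}\n\n{target_code}"},
        ],
        "tests": tests,  # kept alongside so generated code can be auto-verified later
    }

def write_jsonl(records, path: str) -> None:
    """Dump records in the JSONL layout most fine-tuning toolchains accept."""
    with open(path, "w") as f:
        for r in records:
            f.write(json.dumps(r) + "\n")
```

Interleaving records like these with plain reasoning dialogues in the same JSONL file is the simplest way to keep the mix of code and logic tasks the bullet points recommend.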
Long-context and multi-file awareness
- Leverage the model’s large context window (up to 256K tokens) to feed entire projects: multiple files, dependencies, interface specs.
- Include cross-file references and task specs so code output is contextual and correct.
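Feeding a whole project into the context window can be as simple as concatenating files with clear delimiters. A minimal sketch (the `build_project_prompt` helper and its section format are our own convention):

```python
from pathlib import Path
import tempfile

def build_project_prompt(root, task, exts=(".py", ".js", ".html", ".css")):
    """Concatenate an entire project into one prompt, one labelled section
    per file, so cross-file references stay visible in the context window."""
    parts = [f"Task: {task}", "Project files:"]
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in exts:
            rel = path.relative_to(root)
            parts.append(f"--- {rel} ---\n{path.read_text()}")
    return "\n\n".join(parts)

# Tiny demo with a throwaway project:
root = tempfile.mkdtemp()
Path(root, "app.py").write_text("print('hi')")
prompt = build_project_prompt(root, "Add a --verbose CLI flag")
```

With a 256K-token window, even mid-sized repositories fit whole; for larger ones, filter `exts` or prune directories before concatenating.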
Evaluation and iteration
- Benchmark using tasks like “convert UI mockup → code” (metrics: correctness, UI fidelity, runtime).
- Monitor error types (layout mismatch, logic bug, missing dependency) and iterate dataset and prompting accordingly.
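The error-type monitoring above can be sketched as a tiny aggregator; the `score_runs` helper and the result-dict schema are our own illustration:

```python
from collections import Counter

def score_runs(results):
    """Aggregate benchmark runs. `results` is a list of dicts like
    {"task": ..., "passed": bool, "error_type": str or None}.
    Returns the pass rate plus a histogram of error types to guide iteration."""
    passed = sum(1 for r in results if r["passed"])
    errors = Counter(r["error_type"] for r in results if not r["passed"])
    return passed / len(results), errors

# Example run over four UI-to-code tasks:
runs = [
    {"task": "mockup-1", "passed": True,  "error_type": None},
    {"task": "mockup-2", "passed": False, "error_type": "layout mismatch"},
    {"task": "chart-1",  "passed": False, "error_type": "layout mismatch"},
    {"task": "chart-2",  "passed": True,  "error_type": None},
]
rate, errs = score_runs(runs)
```

A histogram dominated by one error type (here, layout mismatches) tells you whether to fix the dataset, the prompting, or the post-generation checks first.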
How to Access Qwen3-VL-235B-A22B?
1. Interface (Easiest for Beginners)

2. API Access (For Developers)
Step 1: Log In and Access the Model Library
Log in to your account and click on the Model Library button.

Step 2: Choose Your Model
Browse through the available options and select the model that suits your needs.

Step 3: Start Your Free Trial
Begin your free trial to explore the capabilities of the selected model.

Step 4: Get Your API Key
To authenticate with the API, you need an API key. Open the “Settings” page and copy the API key as indicated in the image.

Step 5: Install the SDK and Make Your First Call
Install the client SDK using the package manager for your programming language. After installation, import the necessary libraries into your development environment and initialize the client with your API key to start interacting with Novita AI LLMs. The following curl example calls the chat completions API:
curl "https://api.novita.ai/openai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_NOVITA_API_KEY>" \
  -d @- << 'EOF'
{
  "model": "qwen/qwen3-vl-235b-a22b-thinking",
  "messages": [
    { "role": "system", "content": "Be a helpful assistant" },
    { "role": "user", "content": "Hi there!" }
  ],
  "response_format": { "type": "text" },
  "max_tokens": 16384,
  "temperature": 1,
  "top_p": 1,
  "min_p": 0,
  "top_k": 50,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "repetition_penalty": 1
}
EOF
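The same request can be issued from Python. This is a stdlib-only sketch that builds the request shown above; the `chat_request` helper is our own, and the payload fields mirror the curl example:

```python
import json
import urllib.request

def chat_request(api_key: str, user_text: str) -> urllib.request.Request:
    """Build the same chat completions request as the curl example,
    using only the Python standard library."""
    payload = {
        "model": "qwen/qwen3-vl-235b-a22b-thinking",
        "messages": [
            {"role": "system", "content": "Be a helpful assistant"},
            {"role": "user", "content": user_text},
        ],
        "max_tokens": 16384,
        "temperature": 1,
    }
    return urllib.request.Request(
        "https://api.novita.ai/openai/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

# To actually send it (requires a real key and network access):
# with urllib.request.urlopen(chat_request("<YOUR_NOVITA_API_KEY>", "Hi there!")) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the endpoint is OpenAI-compatible, the official `openai` Python SDK also works: point `base_url` at `https://api.novita.ai/openai` and pass your Novita key as `api_key`.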
3. Local Deployment or Dedicated Endpoint
Requirements:
- Qwen3-VL-235B-A22B: 8 NVIDIA H200 GPUs.
Installation Steps:
- Download model weights from HuggingFace or ModelScope
- Choose inference framework: vLLM or SGLang supported
- Follow deployment guide in the official GitHub repository
You’d choose a dedicated endpoint when you need stable high-performance inference, custom model control, and lower cost under continuous or heavy workloads, instead of maintaining local GPUs and infrastructure.

4. Code Agent Tools Integration
By using Novita AI’s service, you can bypass the regional restrictions of Claude Code. Novita AI also provides access guides for Trae and Qwen Code, which can be found in the following articles.
Novita also provides SLA guarantees with 99% service stability, making it especially suitable for high-frequency scenarios such as code generation and automated testing.
In addition to DeepSeek-V3-0324, users can also access powerful coding models such as Kimi K2 and Qwen3 Coder, whose performance approaches Anthropic’s closed-source Claude Sonnet 4 at less than one-fifth of the cost.
First, get your API key:

Qwen3-VL-235B-A22B in Cursor
Step 1: Install and Activate Cursor
- Download the newest version of Cursor IDE from cursor.com
- Subscribe to the Pro plan to enable API-based features
- Open the app and finish the initial configuration
Step 2: Access Advanced Model Settings

- Open Cursor Settings (use Ctrl + F to find it quickly)
- Go to the “Models” tab in the left menu
- Find the “API Configuration” section
Step 3: Configure Novita AI Integration
- Expand the “API Keys” section
- ✅ Enable “OpenAI API Key” toggle
- ✅ Enable “Override OpenAI Base URL” toggle
- In “OpenAI API Key” field: Paste your Novita AI API key
- In “Override OpenAI Base URL” field: Replace default with:
https://api.novita.ai/openai
Step 4: Add Multiple AI Coding Models
Click “+ Add Custom Model” and add each model:
- qwen/qwen3-vl-235b-a22b-thinking
- zai-org/glm-4.6
- deepseek/deepseek-v3.1
- moonshotai/kimi-k2-0905
- openai/gpt-oss-120b
- google/gemma-3-12b-it
Step 5: Test Your Integration

- Start new chat in Ask Mode or Agent Mode
- Test different models for various coding tasks
- Verify all models respond correctly
Qwen3-VL-235B-A22B in Claude Code
For Windows
Open Command Prompt and set the following environment variables:
set ANTHROPIC_BASE_URL=https://api.novita.ai/anthropic
set ANTHROPIC_AUTH_TOKEN=<Novita API Key>
set ANTHROPIC_MODEL=qwen/qwen3-vl-235b-a22b-thinking
set ANTHROPIC_SMALL_FAST_MODEL=qwen/qwen3-vl-235b-a22b-thinking
Replace <Novita API Key> with your actual API key obtained from the Novita AI platform. These variables remain active for the current session and must be reset if you close the Command Prompt.
For Mac and Linux
Open Terminal and export the following environment variables:
export ANTHROPIC_BASE_URL="https://api.novita.ai/anthropic"
export ANTHROPIC_AUTH_TOKEN="<Novita API Key>"
export ANTHROPIC_MODEL="qwen/qwen3-vl-235b-a22b-thinking"
export ANTHROPIC_SMALL_FAST_MODEL="qwen/qwen3-vl-235b-a22b-thinking"
Starting Claude Code
With installation and configuration complete, you can now start Claude Code in your project directory. Navigate to your desired project location using the cd command:
cd <your-project-directory>
claude
Qwen3-VL-235B-A22B in Trae
Step 1: Open Trae and Access Models
Launch the Trae app. Click the Toggle AI Side Bar in the top-right corner to open the AI Side Bar. Then, go to AI Management and select Models.


Step 2: Add a Custom Model and Choose Novita as Provider
Click the Add Model button to create a custom model entry. In the add-model dialog, select Provider = Novita from the dropdown menu.


Step 3: Select or Enter the Model
From the Model dropdown, pick your desired model (DeepSeek-R1-0528, Kimi K2, DeepSeek-V3-0324, MiniMax-M1-80k, or GLM 4.6). If the exact model isn’t listed (for example, qwen/qwen3-vl-235b-a22b-thinking), simply type the model ID you noted from the Novita library. Make sure you choose the correct variant of the model you want to use.
Qwen3-VL-235B-A22B in Codex
Setup Configuration File
Codex CLI uses a TOML configuration file located at:
- macOS/Linux: ~/.codex/config.toml
- Windows: %USERPROFILE%\.codex\config.toml
Basic Configuration Template
model = "qwen/qwen3-vl-235b-a22b-thinking"
model_provider = "novitaai"
[model_providers.novitaai]
name = "Novita AI"
base_url = "https://api.novita.ai/openai"
http_headers = {"Authorization" = "Bearer YOUR_NOVITA_API_KEY"}
wire_api = "chat"
Launch Codex CLI
codex
Basic Usage Examples
Code Generation:
> Create a Python class for handling REST API responses with error handling
Project Analysis:
> Review this codebase and suggest improvements for performance
Bug Fixing:
> Fix the authentication error in the login function
Testing:
> Generate comprehensive unit tests for the user service module
5. Third-Party Platforms Integration
- OpenAI-Compatible API: Enjoy hassle-free migration and integration with tools such as Cline and Cursor, designed for the OpenAI API standard.
- Hugging Face: Use models in Spaces, pipelines, or with the Transformers library via Novita AI endpoints.
- Agent & Orchestration Frameworks: Easily connect Novita AI with partner platforms like Continue, AnythingLLM, LangChain, Dify, and Langflow through official connectors and step-by-step integration guides.
Qwen3-VL-235B-A22B demonstrates leading performance in visual coding, OCR, and reasoning benchmarks, redefining multimodal programming standards. With Novita AI’s 32.8 K-context API, flexible deployment (local or dedicated endpoint), and integration with modern coding agents, the model delivers high precision and scalability at a competitive cost.
Frequently Asked Questions
What makes Qwen3-VL-235B-A22B strong at visual coding?
It combines a 235 B-parameter Mixture-of-Experts architecture with strong visual reasoning, achieving state-of-the-art results in Design2Code and ChartMimic benchmarks.
How can developers improve its code ability?
Apply chain-of-description prompting, integrate code-editing tools, fine-tune with UI-to-code datasets, and leverage its 256 K-token context for multi-file reasoning.
Does it work with coding agent tools?
Yes. It connects seamlessly with Cursor, Codex, and Trae via Novita AI’s OpenAI-compatible API endpoints.
What is Novita AI?
Novita AI is an AI cloud platform that offers developers an easy way to deploy AI models using our simple API, while also providing an affordable and reliable GPU cloud for building and scaling.
Recommended Reading
- Qwen3 Coder vs DeepSeek V3.1: Choosing the Right LLM for Your Program
- Comparing Kimi K2-0905 API Providers: Why NovitaAI Stands Out
- How to Use GLM-4.6 in Cursor to Boost Productivity for Small Teams