Qwen3, an AI model family, is designed for developers seeking cutting-edge capabilities in reasoning, multilingual support, and lightweight efficiency. With free access on Novita AI’s platform and seamless API integration, Qwen3 enables dynamic applications, from coding assistance to complex problem-solving.
Qwen3 models introduce a hybrid problem-solving approach with two modes:
Thinking Mode: For complex problems, the model reasons step by step, delivering thoughtful answers.
Non-Thinking Mode: For simpler tasks, the model provides fast, near-instant responses.
This flexibility lets users control the model’s reasoning effort based on task requirements. Harder problems benefit from extended reasoning, while simpler ones are solved quickly.
By combining these modes, Qwen3 achieves stable and efficient thinking-budget control: performance scales with the reasoning budget allocated to a request. This design makes task-specific budgeting easier, balancing cost efficiency and inference quality.
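As a client-side sketch of task-specific budgeting, the helper below routes hard tasks to Thinking Mode with a larger token budget and easy tasks to fast responses. The `enable_thinking` flag and the use of `max_tokens` as the reasoning budget are assumptions for illustration; check Novita AI's documentation for the actual parameter names.

```python
def build_request(task: str, hard: bool) -> dict:
    # Hypothetical helper: hard tasks get Thinking Mode and a larger
    # token budget; easy tasks get a fast, non-thinking response.
    return {
        "model": "qwen3-0.6b-fp8",
        "messages": [{"role": "user", "content": task}],
        "max_tokens": 4096 if hard else 512,      # reasoning budget
        "extra_body": {"enable_thinking": hard},  # assumed flag name
    }

easy = build_request("What is 2 + 2?", hard=False)
hard_req = build_request("Prove the sum of two odd numbers is even.", hard=True)
```

The same request-building code serves both modes; only the budget and mode flag change per task.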
Multilingual Support
Qwen3 models support 119 languages and dialects, unlocking new possibilities for global applications. Optimized for coding, agentic capabilities, and MCP, Qwen3 enables users worldwide to leverage its power effectively.
Improved Agentic Capabilities
Qwen3 is optimized for coding and agentic workflows, with enhanced support for the Model Context Protocol (MCP), allowing the model to reason about its environment and interact with external tools.
Qwen 3 Small Models
Tied embedding (weight tying) is a technique commonly used in natural language processing (NLP) models to share weights between embedding layers. Specifically, it ties the weights of the input embedding layer and the output projection layer in a neural network, particularly in language models like transformers, halving the parameter count of those layers.
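A minimal NumPy sketch of weight tying: one matrix `E` serves as both the input embedding table (row lookup) and, transposed, the output projection to vocabulary logits. Sizes here are toy values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model = 10, 4

# One shared weight matrix serves both roles.
E = rng.normal(size=(vocab_size, d_model))

# Input side: token ids -> embeddings (a simple row lookup).
token_ids = np.array([1, 5, 7])
x = E[token_ids]            # shape (3, d_model)

# Output side: hidden states -> vocabulary logits, reusing E transposed.
hidden = rng.normal(size=(3, d_model))
logits = hidden @ E.T       # shape (3, vocab_size)
```

With tying, the model stores one `(vocab_size, d_model)` matrix instead of two, which matters most for small models, where the embedding table is a large fraction of total parameters.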
Training Methods of Qwen 3 Small Models
From the diagram, we can infer that Qwen3-0.6B, 1.7B, and 4B were trained through a Strong-to-Weak Distillation process, which is part of the pipeline for creating Lightweight Models. Here's a step-by-step breakdown of the training process:
Base Models: The process begins with pre-trained Base Models, which act as the foundation for subsequent training and distillation.
Frontier Models:
Base Models are first trained through a multi-stage process to create Frontier Models like Qwen3-235B-A22B and Qwen3-32B.
This training involves:
Stage 1 (Long-CoT Cold Start): Initial training with long chain-of-thought (CoT) reasoning.
Stage 2 (Reasoning RL): Reinforcement learning focused on reasoning tasks.
Stage 3 (Thinking Mode Fusion): Integration of thinking modes (e.g., reasoning and quick-response modes).
Stage 4 (General RL): General reinforcement learning for broader capabilities.
Strong-to-Weak Distillation:
The large Frontier Models (e.g., Qwen3-235B-A22B and Qwen3-32B) are then used as teacher models to guide the training of Lightweight Models like Qwen3-4B.
This distillation process ensures that the smaller models retain the knowledge and performance of the larger models while significantly reducing size and computational requirements.
Lightweight Models:
As a result of this distillation process, Qwen3-0.6B, 1.7B, and 4B are lightweight versions that benefit from the knowledge of the larger models while being optimized for efficiency.
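The distillation step above can be sketched with the standard soft-label objective: the student is trained to match the teacher's temperature-softened output distribution via KL divergence. This is a generic NumPy illustration of the technique, not Qwen3's actual training code.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax over the last axis.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # KL(teacher || student) on softened distributions: the student is
    # penalized for diverging from the teacher's "soft labels".
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q))) / p.shape[0])

# Toy logits: a confident teacher vs. an untrained (uniform) student.
teacher = np.array([[4.0, 1.0, 0.0]])
student = np.zeros((1, 3))
loss = distillation_loss(teacher, student)
```

The loss is zero when the student exactly matches the teacher and grows as the distributions diverge; minimizing it transfers the teacher's behavior into the smaller model.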
How to Access Qwen 3 Small Models via Novita API?
Step 1: Log In and Access the Model Library
Log in to your account and click on the Model Library button.
Step 2: Choose Your Model
Browse through the available options and select the model that suits your needs.
Step 3: Start Your Free Trial
Begin your free trial to explore the capabilities of the selected model.
Step 4: Get Your API Key
To authenticate with the API, Novita AI provides you with an API key. On the "Settings" page, you can copy the API key as indicated in the image.
Step 5: Install the API
Install the OpenAI-compatible SDK using the package manager for your programming language (for Python, `pip install openai`).
After installation, import the necessary libraries into your development environment and initialize the client with your API key to start interacting with Novita AI's LLM service. Below is an example of using the Chat Completions API in Python.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.novita.ai/v3/openai",
    api_key="<YOUR Novita AI API Key>",
)

model = "qwen3-0.6b-fp8"
stream = True  # or False
max_tokens = 2048
system_content = """Be a helpful assistant"""
temperature = 1
top_p = 1
min_p = 0
top_k = 50
presence_penalty = 0
frequency_penalty = 0
repetition_penalty = 1
response_format = {"type": "text"}

chat_completion_res = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": system_content,
        },
        {
            "role": "user",
            "content": "Hi there!",
        },
    ],
    stream=stream,
    max_tokens=max_tokens,
    temperature=temperature,
    top_p=top_p,
    presence_penalty=presence_penalty,
    frequency_penalty=frequency_penalty,
    response_format=response_format,
    # Sampling parameters not in the standard OpenAI API go in extra_body.
    extra_body={
        "top_k": top_k,
        "repetition_penalty": repetition_penalty,
        "min_p": min_p,
    },
)

if stream:
    for chunk in chat_completion_res:
        print(chunk.choices[0].delta.content or "", end="")
else:
    print(chat_completion_res.choices[0].message.content)
Qwen3 offers unparalleled versatility with its hybrid thinking modes, multilingual capabilities, and lightweight efficiency. Whether you’re solving complex problems or building global applications, Qwen3 empowers you to achieve more. Start your journey today with Novita AI’s free access and explore the future of AI-powered development.
How do I get started with Qwen3 on Novita AI?
Log in to Novita AI, select a model, get your API key, and integrate it into your project with the provided documentation.
Are Qwen3 models free to use?
Yes! Novita AI offers free access to Qwen3 models with easy API integration.
Novita AI is an AI cloud platform that offers developers an easy way to deploy AI models through a simple API, while also providing affordable and reliable GPU cloud resources for building and scaling.