Novita AI Launches Top THUDM Models: GLM-4 Series Model

Table Of Contents

What Are the GLM-4 Series Models?
GLM-4-32B-0414 Benchmark
GLM-4 Series Capability
How to Access the GLM-4 Series Models?
Conclusion
Frequently Asked Questions

Novita AI is excited to introduce five top-performing THUDM models that excel in math and coding: GLM-4-32B-0414, GLM-Z1-32B-0414, GLM-Z1-Rumination-32B-0414, GLM-4-9B-0414, and GLM-Z1-9B-0414. To support developers and the open-source community, GLM-4-9B-0414 and GLM-Z1-9B-0414 are now available with free API access!

Novita AI proudly launches five top-tier THUDM models: GLM-4-32B-0414, GLM-Z1-32B-0414, GLM-Z1-Rumination-32B-0414, GLM-4-9B-0414, and GLM-Z1-9B-0414.
GLM-4-32B-0414, built on 15T tokens of high-quality data and human preference alignment, leads with strong general abilities and excels in instruction following, tool use, and search QA.
For developers seeking high performance or cost-effective solutions, Novita AI now offers free API access to GLM-4-9B-0414 and GLM-Z1-9B-0414.

What Are the GLM-4 Series Models?

THUDM’s GLM series demonstrates strong technical performance, especially in math, coding, and reasoning tasks.

The 32B models (GLM-4-32B, GLM-Z1-32B, GLM-Z1-Rumination) offer a balance of general capabilities and deep reasoning, with GLM-Z1-Rumination specializing in open-ended problem solving and search-augmented reasoning.
The 9B models (GLM-4-9B, GLM-Z1-9B) are highly optimized for math reasoning and general task performance, achieving an impressive performance-to-size ratio ideal for lightweight deployments.

GLM-4-32B-Base-0414 serves as the technical foundation for the entire series.

It was pre-trained on 15T tokens of high-quality data, including substantial reasoning-focused synthetic data, establishing a strong base for complex task handling.
Post-training optimization involved human preference alignment, enhancing the model’s ability to deliver natural and user-aligned dialogue experiences.

Clear model tiering supports different development needs.

For complex reasoning, deep writing, and cross-domain analysis, GLM-Z1-Rumination-32B is recommended.
For strong general-purpose performance, GLM-4-32B is the ideal choice.
For budget-conscious projects or large-scale batch operations (e.g., translation, QA), the free GLM-4-9B and GLM-Z1-9B models offer an excellent cost-performance balance.
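The tiering above can be sketched as a small lookup helper. This is only an illustration: the category names are hypothetical, and only thudm/glm-4-32b-0414 appears in this article's API example. The other model IDs are assumed to follow the same naming pattern, so check the Model Library for the exact identifiers.

```python
# Hypothetical mapping from a project need to a model tier.
# Only "thudm/glm-4-32b-0414" is confirmed by the API example in this
# article; the remaining IDs are ASSUMED to follow the same pattern.
MODEL_BY_NEED = {
    "deep_reasoning": "thudm/glm-z1-rumination-32b-0414",  # assumed ID
    "general": "thudm/glm-4-32b-0414",
    "budget": "thudm/glm-4-9b-0414",                       # assumed ID
    "budget_reasoning": "thudm/glm-z1-9b-0414",            # assumed ID
}


def pick_model(need: str) -> str:
    """Return the model ID for a need, or raise ValueError if unknown."""
    try:
        return MODEL_BY_NEED[need]
    except KeyError:
        known = ", ".join(sorted(MODEL_BY_NEED))
        raise ValueError(f"Unknown need {need!r}; expected one of: {known}") from None
```

The returned string can be passed as the model argument of a chat completions request.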

GLM-4-32B-0414 Benchmark

What GLM-4-32B-0414 Does Best

Instruction Following (IFEval):
GLM-4-32B-0414 excels in instruction understanding and execution, achieving the highest score among the models compared.

Tool Use Capability (BFCL-v3 / TAU-Bench):
GLM-4-32B-0414 demonstrates outstanding performance in tool use tasks across multiple industries (retail, airline), leading or tying for first place in both single-turn and multi-turn scenarios.
Its advantage is especially prominent in complex multi-turn tool use, outperforming the second-best model by nearly 10 points.

Search-Based Question Answering (SimpleQA, HotpotQA):
GLM-4-32B-0414 shows strong capabilities in search QA, achieving the highest score (88.1) in SimpleQA and nearly matching GPT-4o-1120 in HotpotQA, while significantly outperforming DeepSeek-V3-0324 and Qwen2.5-Max.

GLM-4 Series Capability

Code VS Gemini 2.5 Flash

How to Access GLM-4 Series Model?

Step 1: Log In and Access the Model Library

Log in to Novita AI and open the Model Library to try the GLM-4 demo.

Step 2: Choose Your Model

Browse through the available options and select the model that suits your needs.

Step 3: Start Your Free Trial

Begin your free trial to explore the capabilities of the selected model.

Step 4: Get Your API Key

Novita AI provides an API key for authenticating with the API. Open the “Settings” page and copy your API key from there.
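Rather than pasting the key into source code, one option is to read it from an environment variable. The sketch below assumes a variable named NOVITA_API_KEY, a name chosen here for illustration and not something the platform requires.

```python
import os


def load_api_key(env_var: str = "NOVITA_API_KEY") -> str:
    """Read the API key from an environment variable.

    The default variable name is an assumption made for this example.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Novita AI API key."
        )
    return key
```

The result can then be passed as the api_key argument when creating the client, which keeps the key out of version control.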

Step 5: Install the API

Install the client library using the package manager for your programming language. The Python example below uses the openai package, which can be installed with pip install openai.

After installation, import the library into your development environment and initialize the client with your API key to start interacting with Novita AI LLMs. The following example uses the chat completions API from Python.

from openai import OpenAI
  
client = OpenAI(
    base_url="https://api.novita.ai/v3/openai",
    api_key="<YOUR Novita AI API Key>",
)

model = "thudm/glm-4-32b-0414"
stream = True # or False
max_tokens = 2048
system_content = """Be a helpful assistant"""
temperature = 1
top_p = 1
min_p = 0
top_k = 50
presence_penalty = 0
frequency_penalty = 0
repetition_penalty = 1
response_format = { "type": "text" }

chat_completion_res = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": system_content,
        },
        {
            "role": "user",
            "content": "Hi there!",
        }
    ],
    stream=stream,
    max_tokens=max_tokens,
    temperature=temperature,
    top_p=top_p,
    presence_penalty=presence_penalty,
    frequency_penalty=frequency_penalty,
    response_format=response_format,
    extra_body={
        "top_k": top_k,
        "repetition_penalty": repetition_penalty,
        "min_p": min_p,
    },
)

if stream:
    for chunk in chat_completion_res:
        print(chunk.choices[0].delta.content or "", end="")
else:
    print(chat_completion_res.choices[0].message.content)
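When streaming, it is often useful to keep the full reply as a single string instead of only printing it. The helper below is a minimal sketch that assumes each chunk has the same shape as in the example above (choices[0].delta.content). The fake_chunk function is a hypothetical stand-in used only to exercise the helper without a network call.

```python
from types import SimpleNamespace


def collect_stream(chunks) -> str:
    """Join the text pieces of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        # Some chunks may carry no choices or no text; skip those.
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:
            parts.append(piece)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in that mimics the shape of a streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


if __name__ == "__main__":
    print(collect_stream([fake_chunk("Hel"), fake_chunk(None), fake_chunk("lo")]))
```

With a real request, the object returned by client.chat.completions.create with stream=True would be passed to collect_stream in place of the fake chunks.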

Conclusion

GLM-4-32B-0414 demonstrates outstanding strength across key AI tasks, from instruction adherence to complex multi-turn tool use. Its balanced capabilities, combined with fine-tuned dialogue optimization, make it an ideal choice for developers needing robust, versatile models.
With Novita AI’s support, exploring GLM models has never been easier. Dive into the model playground and see what GLM-4 can do.

Frequently Asked Questions

What makes GLM-4-32B-0414 special compared to other models?

GLM-4-32B-0414 excels in instruction following, multi-turn tool use, and search-based QA, backed by extensive pre-training and fine-tuned dialogue optimization.

Can I try GLM-4-32B-0414 via Novita AI?

Yes, GLM-4-32B-0414 is available on Novita AI with competitive API pricing, while GLM-4-9B and GLM-Z1-9B are free to access.

What are GLM-4 series models best suited for?

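GLM-4-32B-0414 is ideal for tasks requiring deep reasoning, complex dialogue, and high-accuracy instruction execution. GLM-Z1-Rumination-32B suits complex reasoning, deep writing, and cross-domain analysis, while the free 9B models fit budget-conscious projects and large-scale batch operations.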

Novita AI is an AI cloud platform that offers developers an easy way to deploy AI models through a simple API, along with an affordable and reliable GPU cloud for building and scaling.
