Key Highlights
GPT-OSS-120B: A 117B-parameter open-weight MoE model by OpenAI, designed for enterprise-grade reasoning and production deployment with near-o4-mini performance.
GLM-4.5: A foundation model that unifies reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.
Novita AI not only provides stable API services but also offers extremely cost-effective pricing. For example, GPT-OSS-120B costs $0.1 per 1M input tokens and $0.5 per 1M output tokens, while GLM-4.5 costs $0.6 per 1M input tokens and $2.2 per 1M output tokens.
Basic Introduction of the Models
GPT-OSS-120B
GPT-OSS-120B is an open-weight Mixture-of-Experts (MoE) language model built by OpenAI, focusing on powerful reasoning, agentic applications, and scalable deployment. It is engineered for both enterprise and developer use, achieving near-parity with OpenAI’s o4-mini model on core reasoning benchmarks while remaining cost-efficient and accessible.
- Parameters: 117 billion total parameters with 5.1 billion active parameters per inference step.
- Architecture: Mixture-of-Experts design with 128 experts across 36 layers, utilizing SwiGLU activations. An efficient expert routing system activates only the most relevant subset of experts for each token, allowing for increased specialization and efficiency. Features learned attention sinks per attention head for performance improvements.
- Context Window: Supports a context window of 128K tokens, enabling long-form reasoning, multi-turn dialogues, and handling of extensive documents or codebases.
- Agentic Capabilities: Native support for function calling, web browsing, Python code execution, and structured outputs, making it suitable for building intelligent agents. Compatible with OpenAI’s response API formats and fine-tunable for custom use cases.
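To make the expert routing idea above more concrete, here is a minimal sketch of top-k routing for a single token. It is an illustration only, not OpenAI's implementation: the function name route_token, the value TOP_K = 4, and the random router scores are assumptions made for this example. Only the figure of 128 experts comes from the description above.

```python
import math
import random

NUM_EXPERTS = 128  # from the architecture description above
TOP_K = 4          # assumption for illustration; not stated in this article


def route_token(router_logits, top_k=TOP_K):
    """Return [(expert_index, weight), ...] for the top_k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    if not 0 < top_k <= len(router_logits):
        raise ValueError("top_k must be between 1 and the number of experts")
    # Indices of the top_k largest router scores, best first.
    chosen = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Numerically stable softmax restricted to the chosen experts.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


if __name__ == "__main__":
    rng = random.Random(0)  # hypothetical router scores for one token
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    for expert, weight in route_token(logits):
        print(f"expert {expert:3d}  weight {weight:.3f}")
```

Only the selected experts run for that token, which is why the active parameter count is far smaller than the total.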
GLM-4.5
GLM-4.5 is a foundation model designed for intelligent agents with 355 billion total parameters and 32 billion active parameters. The model unifies reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications. GLM-4.5 is a hybrid reasoning model that provides two modes: thinking mode for complex reasoning and tool usage, and non-thinking mode for immediate responses.
Key Features and Architecture
- Parameters: 355 billion total parameters with 32 billion active parameters.
- Hybrid Reasoning: Two operational modes - thinking mode for complex reasoning and tool usage, and non-thinking mode for immediate responses.
- Model Versions: Available in base models, hybrid reasoning models, and FP8 versions.
- Context Window: 128K tokens.
- Licensing: MIT open-source license for commercial use and secondary development.
- Capabilities: Unified reasoning, coding, and intelligent agent functionalities for complex applications.
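As a quick sanity check on the parameter counts quoted in this article, the snippet below computes what share of each model's weights is active. The helper name active_fraction is made up for this example; the numbers are the ones listed above.

```python
# Parameter counts in billions, as quoted in this article.
MODELS = {
    "GPT-OSS-120B": {"total_b": 117.0, "active_b": 5.1},
    "GLM-4.5": {"total_b": 355.0, "active_b": 32.0},
}


def active_fraction(total_b, active_b):
    """Share of the total parameters that are active."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


for name, params in MODELS.items():
    print(f"{name}: {active_fraction(**params):.1%} of parameters active")
# GPT-OSS-120B: 4.4% of parameters active
# GLM-4.5: 9.0% of parameters active
```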
Benchmark Comparison of GPT-OSS-120B and GLM-4.5
1. Intelligence Benchmarks

2. Context Window:
GPT-OSS-120B: 128K tokens
GLM-4.5: 128K tokens
3. API Pricing:
GPT-OSS-120B: $0.10 input / $0.50 output per 1M tokens
GLM-4.5: $0.60 input / $2.20 output per 1M tokens
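To see what these prices mean for a real workload, a rough cost estimate can be computed as below. This is a sketch using only the per-1M-token prices quoted in this article; the function name estimate_cost and the example token counts are hypothetical, and actual billing may differ.

```python
# (input price, output price) in USD per 1M tokens, as quoted in this article.
PRICES_PER_1M = {
    "GPT-OSS-120B": (0.10, 0.50),
    "GLM-4.5": (0.60, 2.20),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimated cost in USD for the given token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    price_in, price_out = PRICES_PER_1M[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical workload: 2M input tokens and 500K output tokens.
for model in PRICES_PER_1M:
    print(f"{model}: ${estimate_cost(model, 2_000_000, 500_000):.2f}")
# GPT-OSS-120B: $0.45
# GLM-4.5: $2.30
```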
Applied Skills Test of GPT-OSS-120B and GLM-4.5
1. GPT-OSS-120B vs GLM-4.5: Which is better for code generation?
Task Description: Implement a Python class SmartQueue with the following functionality:
- Support priority queue operations (lower numbers = higher priority)
- Support batch operations (add/remove multiple elements at once)
- Implement an intelligent auto_process() method that automatically processes the top N highest-priority items
- Include basic statistics tracking (total processed count, current queue size)
Specific Requirements:
# Expected usage example:
queue = SmartQueue()
queue.add_task("task1", priority=2)
queue.add_batch([("task2", 1), ("task3", 3), ("task4", 1)])
print(queue.get_stats()) # Returns statistics dictionary
processed = queue.auto_process(count=2) # Process top 2 highest-priority tasks
print(processed) # Returns list of processed tasks
Evaluation Dimensions:
- Data Structure Choice: Whether appropriate data structures (e.g., heapq) are selected
- API Design: Reasonableness of method names and parameter design
- Error Handling: Handling edge cases (empty queue, invalid parameters, etc.)
- Code Organization: Class structure and method implementation logic
- Python Idioms: Usage of Python-specific features and conventions
- Algorithm Efficiency: Time complexity considerations for operations
Additional Challenge: The auto_process() method should return processed items in a meaningful format and update internal statistics.
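For reference, here is one possible way to meet these requirements using heapq. It is a minimal sketch written for this article, not the output of either model; the remove_batch method name and the (task, priority) return format are assumptions, since the task description leaves them open.

```python
import heapq
import itertools


class SmartQueue:
    """Min-heap priority queue: lower number = higher priority."""

    def __init__(self):
        self._heap = []                    # entries: (priority, seq, task)
        self._counter = itertools.count()  # tie-breaker keeps insertion order
        self._total_processed = 0

    def add_task(self, task, priority=0):
        if isinstance(priority, bool) or not isinstance(priority, (int, float)):
            raise TypeError("priority must be a number")
        heapq.heappush(self._heap, (priority, next(self._counter), task))

    def add_batch(self, items):
        """Add several (task, priority) pairs."""
        for task, priority in items:
            self.add_task(task, priority)

    def remove_batch(self, tasks):
        """Remove queued entries whose task is in `tasks` (tasks must be hashable)."""
        targets = set(tasks)
        kept = [entry for entry in self._heap if entry[2] not in targets]
        removed = len(self._heap) - len(kept)
        heapq.heapify(kept)  # O(n) rebuild after filtering
        self._heap = kept
        return removed

    def auto_process(self, count=1):
        """Pop up to `count` highest-priority tasks; return [(task, priority), ...]."""
        if not isinstance(count, int) or count < 0:
            raise ValueError("count must be a non-negative integer")
        processed = []
        while self._heap and len(processed) < count:
            priority, _, task = heapq.heappop(self._heap)
            processed.append((task, priority))
        self._total_processed += len(processed)
        return processed

    def get_stats(self):
        return {"total_processed": self._total_processed,
                "queue_size": len(self._heap)}


queue = SmartQueue()
queue.add_task("task1", priority=2)
queue.add_batch([("task2", 1), ("task3", 3), ("task4", 1)])
print(queue.get_stats())             # {'total_processed': 0, 'queue_size': 4}
print(queue.auto_process(count=2))   # [('task2', 1), ('task4', 1)]
```

The sequence counter means tasks with equal priority come out in insertion order, and it also prevents heapq from comparing the task objects themselves.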
GPT-OSS-120B

GLM-4.5

GLM-4.5: Suitable for learning and simple prototypes; needs optimization for production use
GPT-OSS-120B: Ready for professional projects; meets industrial code quality standards
2. How does GLM-4.5 handle ambiguous queries compared to GPT-OSS-120B?
Prompt:
“The bank called about the check. It was insufficient.”
This sentence contains multiple ambiguities. Please:
- Identify all possible interpretations of this sentence
- Rank these interpretations by likelihood in a modern context
- Explain your reasoning process for disambiguation
- If you needed additional context to be certain, what specific questions would you ask?
- Provide a response as if you were helping someone understand what likely happened
Be explicit about your thought process and any assumptions you’re making.
GPT-OSS-120B

GLM-4.5

GPT-OSS-120B demonstrated superior analytical depth and systematic reasoning, while GLM-4.5 showed better practical judgment in balancing thoroughness with usability.
GPT-OSS-120B excels at exhaustive disambiguation, while GLM-4.5 prioritizes clarity and actionable guidance.
How to Access GPT-OSS-120B and GLM-4.5 on Novita AI
Step 1: Log In and Access the Model Library
Log in to your account and click on the Model Library button.

Step 2: Choose Your Model
Browse through the available options and select the model that suits your needs.

Step 3: Start Your Free Trial
Begin your free trial to explore the capabilities of the selected model.

Step 4: Get Your API Key
To authenticate with the API, you need an API key, which Novita AI provides. Open the “Settings” page and copy your API key, as indicated in the image.

Step 5: Install the API
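Install the client library using the package manager for your programming language. The Python example in this guide uses the openai package, which can be installed with pip install openai.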

After installation, import the necessary libraries into your development environment and initialize the client with your API key to start interacting with the Novita AI LLM API. The following example uses the chat completions API from Python.
from openai import OpenAI

# The base_url points the OpenAI client at Novita AI's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.novita.ai/v3/openai",
    api_key="",  # paste your Novita AI API key here
)

model = "zai-org/glm-4.5"
stream = True  # or False
max_tokens = 65536
system_content = "Be a helpful assistant"
temperature = 1
top_p = 1
min_p = 0
top_k = 50
presence_penalty = 0
frequency_penalty = 0
repetition_penalty = 1
response_format = {"type": "text"}

chat_completion_res = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": system_content,
        },
        {
            "role": "user",
            "content": "Hi there!",
        },
    ],
    stream=stream,
    max_tokens=max_tokens,
    temperature=temperature,
    top_p=top_p,
    presence_penalty=presence_penalty,
    frequency_penalty=frequency_penalty,
    response_format=response_format,
    # Sampling parameters outside the standard OpenAI arguments go in extra_body.
    extra_body={
        "top_k": top_k,
        "repetition_penalty": repetition_penalty,
        "min_p": min_p,
    },
)

if stream:
    # Streaming: print each text fragment as it arrives.
    for chunk in chat_completion_res:
        print(chunk.choices[0].delta.content or "", end="")
else:
    # Non-streaming: the full reply is available at once.
    print(chat_completion_res.choices[0].message.content)
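If you want the streamed reply as a single string rather than printed fragments, a small helper like the one below can collect it. This is a sketch: collect_stream and fake_chunk are hypothetical names introduced here, and the fake chunks only imitate the choices[0].delta.content shape used in the example above so the helper can be tried without an API key.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # a chunk can arrive without any choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:              # content can be None or empty on some chunks
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in for a streamed chunk, for offline testing only."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Hel"), fake_chunk(None), fake_chunk("lo!"),
        SimpleNamespace(choices=[])]
print(collect_stream(demo))  # Hello!
```

With a real request, passing the streaming response object to collect_stream in place of demo should work the same way, assuming the response follows the chunk shape shown above.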
GPT-OSS-120B stands out as a production-ready solution, delivering comprehensive analytical depth and systematic reasoning that meets industrial code quality standards. Its exhaustive approach to problem-solving makes it suitable for professional projects requiring detailed documentation and rigorous analysis.
GLM-4.5 excels as a development and prototyping tool, offering practical, user-focused responses with streamlined reasoning. It balances thoroughness with usability, making it ideal for learning environments and rapid prototyping.
Frequently Asked Questions
What is GPT-OSS-120B?
GPT-OSS-120B is a 117B parameter open-weight MoE model by OpenAI, designed for enterprise-grade reasoning and production deployment with near-o4-mini performance.
How to fit a GLM model?
GLM models can be deployed through official APIs on platforms like Novita AI, with specific setup instructions varying by model version and hardware requirements.
Is GPT-4o open-source?
No. GPT-4o is not open-source, but GPT-OSS-120B is an open-weight model from OpenAI with near-o4-mini reasoning performance, and it can be used on Novita AI.
About Novita AI
Novita AI is an AI cloud platform that offers developers an easy way to deploy AI models through a simple API, while also providing an affordable and reliable GPU cloud for building and scaling.
