MiniMax-M1, the world’s first open-weight hybrid-attention reasoning model, is now live on Novita AI! The model has 456 billion parameters, of which 45.9 billion are activated per token, and natively supports a 1 million-token context, 8× the context window of DeepSeek R1.
For a limited time, new users can claim $10 in free credits to explore MiniMax-M1’s advanced reasoning capabilities. Power your applications with cutting-edge hybrid-attention technology—MiniMax-M1 is just an API call away.
Here’s the current MiniMax-M1 pricing on Novita AI:
MiniMax-M1-80K: $0.55 / M input tokens, $2.2 / M output tokens
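As a rough illustration of what these rates mean in practice, the sketch below estimates the cost of a single request from its token counts. The function name `estimate_cost_usd` and the example token counts are hypothetical; the rates are the ones listed above and may change.

```python
# Rates copied from the pricing line above (USD per million tokens).
INPUT_USD_PER_M = 0.55
OUTPUT_USD_PER_M = 2.20


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: a 200K-token document and a 20K-token answer.
    print(f"${estimate_cost_usd(200_000, 20_000):.3f}")
```

At these rates, the hypothetical request above comes to roughly $0.15, most of it from the input side.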
What is MiniMax-M1?
MiniMax-M1 represents a paradigm shift in large language model architecture. Developed by MiniMax-AI, this innovative model introduces the world’s first open-weight, large-scale hybrid-attention reasoning system that combines a hybrid Mixture-of-Experts (MoE) architecture with a revolutionary lightning attention mechanism.
Key Features of MiniMax-M1
🔹 Hybrid-Attention & Mixture-of-Experts
MoE layers activate 45.9 B parameters out of 456 B for each token, paired with lightning attention for speed and efficiency.
🔹 Massive 1 Million‑Token Context
Supports up to a million tokens natively, ideal for summarizing books, logs, and whole codebases.
🔹 Efficient Reinforcement Learning
CISPO improves RL training efficiency by clipping importance-sampling weights rather than token updates, which MiniMax describes as a first for an MoE, hybrid-attention architecture.
🔹 Dual Thinking Budgets: 40K & 80K
Choose between MiniMax‑M1‑40K or MiniMax‑M1‑80K depending on required reasoning depth and compute trade-offs.
🔹 Agentic Capabilities & Plugins
Built-in function calling, tool access (search, code execution, image/video generation, TTS), optimized for real-world agent workflows.
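To make the CISPO item above more concrete, here is a minimal pure-Python sketch of the idea as it is publicly described: the per-token importance-sampling ratio is clipped and then used as a fixed weight on an advantage-weighted log-probability term. The function names, the epsilon values, and the toy inputs are illustrative assumptions, not MiniMax's training code. Plain Python has no autograd, so the stop-gradient on the weights is only noted in a comment.

```python
import math


def clipped_is_weights(logp_new, logp_old, eps_low=0.2, eps_high=0.2):
    """Clip per-token importance-sampling ratios to [1 - eps_low, 1 + eps_high]."""
    weights = []
    for new, old in zip(logp_new, logp_old):
        ratio = math.exp(new - old)  # pi_new(token) / pi_old(token)
        weights.append(min(max(ratio, 1.0 - eps_low), 1.0 + eps_high))
    return weights


def cispo_style_loss(logp_new, logp_old, advantages, eps_low=0.2, eps_high=0.2):
    """Scalar value of a CISPO-style surrogate loss, averaged over tokens.

    In an autodiff framework the weights would be wrapped in a stop-gradient,
    so gradients would flow only through logp_new.
    """
    weights = clipped_is_weights(logp_new, logp_old, eps_low, eps_high)
    per_token = [w * a * lp for w, a, lp in zip(weights, advantages, logp_new)]
    return -sum(per_token) / len(per_token)


if __name__ == "__main__":
    # Toy values: the first token's ratio (e ~ 2.72) is clipped to 1.2.
    print(clipped_is_weights([0.0, -1.0], [-1.0, -1.0]))
```

Because the clipped ratio only rescales each token's contribution, no token is dropped from the update, which is the property the feature description points to.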
Benchmarks and Performance Analysis
MiniMax-M1 delivers exceptional performance across comprehensive AI benchmarks. Standard evaluations demonstrate that the model outperforms strong open-weight alternatives like DeepSeek-R1 and Qwen3-235B, particularly excelling in software engineering, tool usage, and long context understanding.
Comprehensive Performance Comparison
| Category | Task | MiniMax-M1-80K | MiniMax-M1-40K | Qwen3-235B-A22B | DeepSeek-R1-0528 | DeepSeek-R1 | Seed-Thinking-v1.5 | Claude 4 Opus | Gemini 2.5 Pro | OpenAI-o3 |
|---|---|---|---|---|---|---|---|---|---|---|
| | Extended Thinking | 80K | 40K | 32k | 64k | 32k | 32k | 64k | 64k | 100k |
| Mathematics | AIME 2024 | 86.0 | 83.3 | 85.7 | 91.4 | 79.8 | 86.7 | 76.0 | 92.0 | 91.6 |
| | AIME 2025 | 76.9 | 74.6 | 81.5 | 87.5 | 70.0 | 74.0 | 75.5 | 88.0 | 88.9 |
| | MATH-500 | 96.8 | 96.0 | 96.2 | 98.0 | 97.3 | 96.7 | 98.2 | 98.8 | 98.1 |
| General Coding | LiveCodeBench | 65.0 | 62.3 | 65.9 | 73.1 | 55.9 | 67.5 | 56.6 | 77.1 | 75.8 |
| | FullStackBench | 68.3 | 67.6 | 62.9 | 69.4 | 70.1 | 69.9 | 70.3 | — | 69.3 |
| Reasoning & Knowledge | GPQA Diamond | 70.0 | 69.2 | 71.1 | 81.0 | 71.5 | 77.3 | 79.6 | 86.4 | 83.3 |
| | ZebraLogic | 86.8 | 80.1 | 80.3 | 95.1 | 78.7 | 84.4 | 95.1 | 91.6 | 95.8 |
| | MMLU-Pro | 81.1 | 80.6 | 83.0 | 85.0 | 84.0 | 87.0 | 85.0 | 86.0 | 85.0 |
| Software Engineering | SWE-bench Verified | 56.0 | 55.6 | 34.4 | 57.6 | 49.2 | 47.0 | 72.5 | 67.2 | 69.1 |
| Long Context | OpenAI-MRCR (128k) | 73.4 | 76.1 | 27.7 | 51.5 | 35.8 | 54.3 | 48.9 | 76.8 | 56.5 |
| | OpenAI-MRCR (1M) | 56.2 | 58.6 | — | — | — | — | — | 58.8 | — |
| | LongBench-v2 | 61.5 | 61.0 | 50.1 | 52.1 | 58.3 | 52.5 | 55.6 | 65.0 | 58.8 |
| Agentic Tool Use | TAU-bench (airline) | 62.0 | 60.0 | 34.7 | 53.5 | — | 44.0 | 59.6 | 50.0 | 52.0 |
| | TAU-bench (retail) | 63.5 | 67.8 | 58.6 | 63.9 | — | 55.7 | 81.4 | 67.0 | 73.9 |
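To read the table at a glance, it can help to compute point gaps directly. The short sketch below does this for four rows, using scores copied from the table; the choice of rows and the helper name `gaps` are illustrative only.

```python
# Scores copied from the table above:
# (MiniMax-M1-80K, Qwen3-235B-A22B, DeepSeek-R1-0528).
SCORES = {
    "AIME 2024": (86.0, 85.7, 91.4),
    "SWE-bench Verified": (56.0, 34.4, 57.6),
    "OpenAI-MRCR (128k)": (73.4, 27.7, 51.5),
    "TAU-bench (airline)": (62.0, 34.7, 53.5),
}


def gaps(scores):
    """Return M1-80K's lead (positive) or deficit (negative) per task and rival."""
    return {
        task: {
            "vs Qwen3-235B-A22B": round(m1 - qwen, 1),
            "vs DeepSeek-R1-0528": round(m1 - r1, 1),
        }
        for task, (m1, qwen, r1) in scores.items()
    }


if __name__ == "__main__":
    for task, row in gaps(SCORES).items():
        print(task, row)
```

On these rows the largest lead is on OpenAI-MRCR (128k), while DeepSeek-R1-0528 stays ahead on AIME 2024 and SWE-bench Verified.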
Mathematics and Reasoning Excellence
Competition-Level Mathematics
MiniMax-M1-80K scores 86.0 on AIME 2024 and 76.9 on AIME 2025, ahead of the original DeepSeek-R1 (79.8 and 70.0) but behind DeepSeek-R1-0528 (91.4 and 87.5). Its 96.8% accuracy on MATH-500 is in line with other leading open-weight models.
Advanced Reasoning Tasks
The model scores 86.8% on ZebraLogic, ahead of Qwen3-235B-A22B (80.3%) and the original DeepSeek-R1 (78.7%), though behind DeepSeek-R1-0528 (95.1%). The 80K and 40K variants score almost identically on MMLU-Pro (81.1% and 80.6%), so the smaller thinking budget costs little on this benchmark.
Software Engineering and Coding Excellence
Real-World Software Development
MiniMax-M1-80K reaches 56.0% on SWE-bench Verified, well ahead of Qwen3-235B-A22B (34.4%) and close to DeepSeek-R1-0528 (57.6%). This performance demonstrates the model’s ability to understand complex codebases, identify issues, and propose effective solutions in real-world scenarios.
Coding Versatility
Solid results on LiveCodeBench (65.0%) and FullStackBench (68.3%) show the model’s versatility across programming paradigms and frameworks, making it a credible option for software development applications.
Long Context and Agentic Superiority
Extended Context Processing
MiniMax-M1 demonstrates exceptional long-context understanding. On OpenAI-MRCR (128k), MiniMax-M1-80K scores 73.4%, far ahead of other open-weight models such as Qwen3-235B-A22B (27.7%) and DeepSeek-R1-0528 (51.5%). At 1M tokens it scores 56.2%, close to Gemini 2.5 Pro (58.8%), the only other model in the table with a reported result at that length.
Agentic Capabilities
On TAU-bench, MiniMax-M1-80K scores 62.0% on airline tasks and 63.5% on retail tasks. The airline result is the highest in the table and well ahead of Qwen3-235B-A22B (34.7%), while the retail result trails Claude 4 Opus (81.4%) and OpenAI-o3 (73.9%).
Performance Leadership Analysis

Open-Weight Dominance
MiniMax-M1 leads open-weight competitors on the long-context benchmarks and on TAU-bench, and stays competitive elsewhere, although DeepSeek-R1-0528 scores higher on the math, coding, and reasoning tasks in the table.
Context Length Advantage
The 1 million-token context pays off on long-context tasks: on OpenAI-MRCR (128k), MiniMax-M1-80K leads Qwen3-235B-A22B by 45.7 percentage points (73.4% vs. 27.7%) and DeepSeek-R1-0528 by 21.9 points (73.4% vs. 51.5%).
Commercial Competitiveness
While commercial models achieve higher scores in some categories, MiniMax-M1 offers competitive performance with open-weight accessibility and deployment flexibility advantages.
How to Access MiniMax-M1 on Novita AI
Getting started with MiniMax-M1 on Novita AI is streamlined and risk-free. New users receive $10 in free credits—sufficient to explore MiniMax-M1’s hybrid-attention reasoning capabilities, build prototypes, and launch initial use cases without upfront costs.
Use the Playground (No Coding Required)
Instant Access: Sign up, claim your free credits, and start experimenting with MiniMax-M1 and other top models in seconds.
Interactive UI: Test prompts and chain-of-thought reasoning, and view results in real time.
Model Comparison: Effortlessly switch between MiniMax-M1, Qwen 3, Llama 4, DeepSeek, and more to find the perfect fit for your needs.
Integrate via API (For Developers)
Connect MiniMax-M1 to applications, workflows, or chatbots using Novita AI’s unified REST API. There are no model weights or infrastructure to manage: Novita AI provides examples for Python, Node.js, and cURL, along with advanced parameter controls.
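For readers who prefer plain HTTP over an SDK, here is a minimal sketch of such a request using only Python's standard library. It assumes the chat completions route sits at `/chat/completions` under the OpenAI-compatible base URL used in this article, following the OpenAI API convention. The environment variable name `NOVITA_API_KEY` and the helper `build_chat_request` are hypothetical.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.novita.ai/v3/openai"  # base URL used in this article
MODEL = "minimaxai/minimax-m1-80k"


def build_chat_request(prompt: str, api_key: str, max_tokens: int = 1024) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completions request."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",  # assumed route, per the OpenAI API convention
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # NOVITA_API_KEY is a hypothetical variable name; nothing is sent without it.
    api_key = os.environ.get("NOVITA_API_KEY")
    if api_key:
        request = build_chat_request("Hi there!", api_key)
        with urllib.request.urlopen(request) as response:
            body = json.load(response)
        print(body["choices"][0]["message"]["content"])
```

Reading the key from the environment keeps it out of source files and shared snippets.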
Option 1: Direct API Integration (Python Example)
from openai import OpenAI

# Point the OpenAI client at Novita AI's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.novita.ai/v3/openai",
    api_key="<YOUR_NOVITA_API_KEY>",  # replace with your own key; never publish it
)

model = "minimaxai/minimax-m1-80k"
stream = True  # or False
max_tokens = 20000
system_content = "Be a helpful assistant"
temperature = 1
top_p = 1
min_p = 0
top_k = 50
presence_penalty = 0
frequency_penalty = 0
repetition_penalty = 1
response_format = {"type": "text"}

chat_completion_res = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": system_content,
        },
        {
            "role": "user",
            "content": "Hi there!",
        },
    ],
    stream=stream,
    max_tokens=max_tokens,
    temperature=temperature,
    top_p=top_p,
    presence_penalty=presence_penalty,
    frequency_penalty=frequency_penalty,
    response_format=response_format,
    # These sampling parameters are not part of the standard OpenAI
    # signature, so they are passed through extra_body.
    extra_body={
        "top_k": top_k,
        "repetition_penalty": repetition_penalty,
        "min_p": min_p,
    },
)

if stream:
    # Print each streamed fragment as it arrives.
    for chunk in chat_completion_res:
        print(chunk.choices[0].delta.content or "", end="")
else:
    print(chat_completion_res.choices[0].message.content)
Option 2: Multi-Agent Workflows with OpenAI Agents SDK
Build advanced multi-agent systems by integrating Novita AI with the OpenAI Agents SDK:
- Plug-and-play: Use Novita AI’s MiniMax-M1 in any OpenAI Agents workflow
- Supports handoffs, routing, and tool use: Design agents that can delegate, triage, or run functions, all powered by MiniMax-M1’s hybrid-attention capabilities
- Python integration: Simply point the SDK to Novita’s endpoint (https://api.novita.ai/v3/openai) and use your API key
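The tool-use bullet above comes down to a loop in which the model names a function and your code runs it. The sketch below shows only the local half of that loop, with no SDK and no network call. The `get_weather` tool, its schema, and `dispatch_tool_call` are hypothetical. The schema follows the OpenAI-style `tools` format, and whether a given model on Novita AI accepts it should be confirmed in Novita's documentation.

```python
import json

# OpenAI-style tool schema; get_weather is a hypothetical example tool.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]


def get_weather(city: str) -> str:
    """Stub implementation; a real tool would call a weather service."""
    return f"Sunny in {city}"


# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the local function a model asked for, given its JSON-encoded arguments."""
    if name not in REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    kwargs = json.loads(arguments_json)
    return REGISTRY[name](**kwargs)


if __name__ == "__main__":
    # Simulated tool call, shaped like the name and arguments a model would return.
    print(dispatch_tool_call("get_weather", '{"city": "Paris"}'))
```

In an OpenAI-compatible chat call, `TOOLS` would typically be passed as the `tools` argument and the dispatcher's return value sent back in a `tool` role message.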
Connect MiniMax-M1 API on Third-Party Platforms
- Hugging Face: Use MiniMax-M1 in Spaces, pipelines, or with the Transformers library via Novita AI endpoints.
- Agent & Orchestration Frameworks: Easily connect Novita AI with partner platforms like Continue, AnythingLLM, LangChain, Dify and Langflow through official connectors and step-by-step integration guides.
- OpenAI-Compatible API: Enjoy hassle-free migration and integration with tools such as Cline and Cursor, designed for the OpenAI API standard.
Conclusion
MiniMax-M1 establishes a new benchmark for open-weight reasoning models through its hybrid-attention architecture and exceptional performance across diverse domains.
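The model’s 1 million-token context and its reported compute savings (about 75% fewer FLOPs than DeepSeek R1 at a 100K-token generation length, according to MiniMax) make it well suited to complex, real-world applications that need both efficiency and advanced reasoning.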
Ready to experience the future of AI reasoning? Try MiniMax-M1 on Novita AI and claim your $10 free credits today.
Novita AI is an AI cloud platform that offers developers an easy way to deploy AI models through a simple API, while also providing an affordable and reliable GPU cloud for building and scaling.