MiniMax-M1 Now Available on Novita AI — Experience Hybrid-Attention Reasoning

Table Of Contents

What is MiniMax-M1?
Benchmarks and Performance Analysis
How to Access MiniMax-M1 on Novita AI
Conclusion

MiniMax-M1, the world’s first open-weight, large-scale hybrid-attention reasoning model, is now live on Novita AI! The model has 456 billion parameters, of which 45.9 billion are activated per token, and natively supports a 1 million-token context, 8× the context size of DeepSeek R1.

For a limited time, new users can claim $10 in free credits to explore MiniMax-M1’s advanced reasoning capabilities. Power your applications with cutting-edge hybrid-attention technology—MiniMax-M1 is just an API call away.

Here’s the current MiniMax-M1 pricing on Novita AI:

MiniMax-M1-80K: $0.55 / M input tokens, $2.2 / M output tokens
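
To make the pricing concrete, here is a minimal sketch that estimates the cost of a single request from the per-million-token prices listed above. The function name and the example token counts are illustrative assumptions, not part of Novita AI's API.

```python
# Prices copied from the pricing line above (USD per million tokens).
INPUT_PRICE_PER_M = 0.55
OUTPUT_PRICE_PER_M = 2.20


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed prices."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: a 200K-token document summarized with 20K output tokens.
    print(f"${estimate_cost_usd(200_000, 20_000):.4f}")
```

Because reasoning models can emit long chains of thought, output tokens tend to dominate the bill, so it is worth setting max_tokens deliberately.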

Try MiniMax-M1-80K Demo Now

What is MiniMax-M1?

MiniMax-M1 represents a paradigm shift in large language model architecture. Developed by MiniMax-AI, this innovative model introduces the world’s first open-weight, large-scale hybrid-attention reasoning system that combines a hybrid Mixture-of-Experts (MoE) architecture with a revolutionary lightning attention mechanism.

Key Features of MiniMax-M1

🔹 Hybrid-Attention & Mixture-of-Experts

MoE layers activate 45.9 B parameters out of 456 B for each token, paired with lightning attention for speed and efficiency.

🔹 Massive 1 Million‑Token Context

Supports up to a million tokens natively, ideal for summarizing books, logs, and whole codebases.

🔹 Efficient Reinforcement Learning

CISPO enhances RL training efficiency by clipping importance-sampling weights—a first for MoE + hybrid attention architecture.

🔹 Dual Thinking Budgets: 40K & 80K

Choose between MiniMax‑M1‑40K or MiniMax‑M1‑80K depending on required reasoning depth and compute trade-offs.

🔹 Agentic Capabilities & Plugins

Built-in function calling, tool access (search, code execution, image/video generation, TTS), optimized for real-world agent workflows.
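
The "Efficient Reinforcement Learning" item above mentions clipping importance-sampling weights. As a rough illustration of that general idea only, and not of MiniMax's actual CISPO implementation, the sketch below turns per-token log-probabilities under a new and an old policy into importance-sampling ratios and clips them to a fixed range. The function name and the clipping margins are hypothetical.

```python
import math
from typing import List


def clipped_is_weights(
    logp_new: List[float],
    logp_old: List[float],
    eps_low: float = 0.2,   # hypothetical lower clipping margin
    eps_high: float = 0.2,  # hypothetical upper clipping margin
) -> List[float]:
    """Return per-token importance-sampling ratios clipped to [1 - eps_low, 1 + eps_high]."""
    if len(logp_new) != len(logp_old):
        raise ValueError("log-probability lists must have the same length")
    weights = []
    for new, old in zip(logp_new, logp_old):
        ratio = math.exp(new - old)  # pi_new(token) / pi_old(token)
        weights.append(min(max(ratio, 1.0 - eps_low), 1.0 + eps_high))
    return weights


if __name__ == "__main__":
    # A token whose probability doubled gets its weight capped at 1 + eps_high.
    print(clipped_is_weights([math.log(2.0), 0.0], [0.0, 0.0]))
```

In a policy-gradient objective, weights like these would scale each token's contribution, and the clipping bounds how much any single token can move the update.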

Benchmarks and Performance Analysis

MiniMax-M1 delivers exceptional performance across comprehensive AI benchmarks. Standard evaluations demonstrate that the model outperforms strong open-weight alternatives like DeepSeek-R1 and Qwen3-235B, particularly excelling in software engineering, tool usage, and long context understanding.

Comprehensive Performance Comparison

Category	Task	MiniMax-M1-80K	MiniMax-M1-40K	Qwen3-235B-A22B	DeepSeek-R1-0528	DeepSeek-R1	Seed-Thinking-v1.5	Claude 4 Opus	Gemini 2.5 Pro	OpenAI-o3
Extended Thinking		80K	40K	32k	64k	32k	32k	64k	64k	100k
Mathematics	AIME 2024	86.0	83.3	85.7	91.4	79.8	86.7	76.0	92.0	91.6
	AIME 2025	76.9	74.6	81.5	87.5	70.0	74.0	75.5	88.0	88.9
	MATH-500	96.8	96.0	96.2	98.0	97.3	96.7	98.2	98.8	98.1
General Coding	LiveCodeBench	65.0	62.3	65.9	73.1	55.9	67.5	56.6	77.1	75.8
	FullStackBench	68.3	67.6	62.9	69.4	70.1	69.9	70.3	—	69.3
Reasoning & Knowledge	GPQA Diamond	70.0	69.2	71.1	81.0	71.5	77.3	79.6	86.4	83.3
	ZebraLogic	86.8	80.1	80.3	95.1	78.7	84.4	95.1	91.6	95.8
	MMLU-Pro	81.1	80.6	83.0	85.0	84.0	87.0	85.0	86.0	85.0
Software Engineering	SWE-bench Verified	56.0	55.6	34.4	57.6	49.2	47.0	72.5	67.2	69.1
Long Context	OpenAI-MRCR (128k)	73.4	76.1	27.7	51.5	35.8	54.3	48.9	76.8	56.5
	OpenAI-MRCR (1M)	56.2	58.6	—	—	—	—	—	58.8	—
	LongBench-v2	61.5	61.0	50.1	52.1	58.3	52.5	55.6	65.0	58.8
Agentic Tool Use	TAU-bench (airline)	62.0	60.0	34.7	53.5	—	44.0	59.6	50.0	52.0
	TAU-bench (retail)	63.5	67.8	58.6	63.9	—	55.7	81.4	67.0	73.9

Models evaluated with temperature=1.0, top_p=0.95

Mathematics and Reasoning Excellence

Competition-Level Mathematics
MiniMax-M1-80K scores 86.0 on AIME 2024 and 76.9 on AIME 2025. That is ahead of the original DeepSeek-R1 (79.8 and 70.0) but behind DeepSeek-R1-0528 (91.4 and 87.5), which leads the open-weight models on both. On MATH-500 the model reaches 96.8%, in line with the other open-weight models in the table (96.2% to 98.0%).

Advanced Reasoning Tasks
The model scores 86.8% on ZebraLogic, ahead of Qwen3-235B-A22B (80.3%) and the original DeepSeek-R1 (78.7%), though DeepSeek-R1-0528 reaches 95.1%. On MMLU-Pro the 80K and 40K variants score 81.1% and 80.6%, so the two thinking budgets perform almost identically on this benchmark.

Software Engineering and Coding Excellence

Real-World Software Development
MiniMax-M1-80K achieves 56.0% on SWE-bench Verified, far ahead of Qwen3-235B-A22B (34.4%) and close to DeepSeek-R1-0528 (57.6%). This performance demonstrates the model’s ability to understand complex codebases, identify issues, and propose effective solutions in real-world scenarios.

Coding Versatility
Scores of 65.0% on LiveCodeBench and 68.3% on FullStackBench are comparable to or ahead of Qwen3-235B-A22B (65.9% and 62.9%) and show that the model handles a range of programming tasks, making it a solid option for software development applications.

Long Context and Agentic Superiority

Extended Context Processing
MiniMax-M1 demonstrates exceptional long-context understanding. On OpenAI-MRCR (128k), MiniMax-M1-80K scores 73.4%, far ahead of the other open-weight models, including Qwen3-235B-A22B (27.7%) and DeepSeek-R1-0528 (51.5%). At 1M tokens it scores 56.2%, close to Gemini 2.5 Pro (58.8%), the only other model in the table with a reported result at that length.

Agentic Capabilities
On TAU-bench, MiniMax-M1-80K scores 62.0% on airline tasks and 63.5% on retail tasks. The airline result is well ahead of Qwen3-235B-A22B (34.7%) and is the highest in the table; the retail result is competitive with DeepSeek-R1-0528 (63.9%).

Performance Leadership Analysis

Benchmark performance comparison of leading commercial and open-weight models across competition-level mathematics, coding, software engineering, agentic tool use, and long-context understanding tasks. MiniMax AI uses the MiniMax-M1-80k model here for MiniMax-M1.

Open-Weight Dominance
MiniMax-M1 leads the open-weight models in the table on long-context understanding and on TAU-bench airline tasks, and is close to DeepSeek-R1-0528 on SWE-bench Verified. On mathematics and general coding, DeepSeek-R1-0528 remains ahead.

Context Length Advantage
The 1 million-token context window is matched by strong long-context results: on OpenAI-MRCR (128k), MiniMax-M1-80K leads Qwen3-235B-A22B by 45.7 percentage points (73.4% vs. 27.7%).
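
The gap quoted under Context Length Advantage can be checked directly from the OpenAI-MRCR (128k) row of the comparison table. The short sketch below does that arithmetic for the open-weight models; the dictionary and function names are illustrative.

```python
# OpenAI-MRCR (128k) scores copied from the comparison table above.
MRCR_128K = {
    "MiniMax-M1-80K": 73.4,
    "Qwen3-235B-A22B": 27.7,
    "DeepSeek-R1-0528": 51.5,
    "DeepSeek-R1": 35.8,
}


def gaps_vs(baseline: str, scores: dict) -> dict:
    """Return the baseline score minus each other model's score, in percentage points."""
    base = scores[baseline]
    return {
        name: round(base - score, 1)
        for name, score in scores.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, gap in gaps_vs("MiniMax-M1-80K", MRCR_128K).items():
        print(f"{name}: +{gap} points")
```

The 45-point figure corresponds to the comparison with Qwen3-235B-A22B; the gaps to the two DeepSeek models are smaller.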

Commercial Competitiveness
While commercial models achieve higher scores in some categories, MiniMax-M1 offers competitive performance with open-weight accessibility and deployment flexibility advantages.

How to Access MiniMax-M1 on Novita AI

Getting started with MiniMax-M1 on Novita AI is streamlined and risk-free. New users receive $10 in free credits—sufficient to explore MiniMax-M1’s hybrid-attention reasoning capabilities, build prototypes, and launch initial use cases without upfront costs.

Use the Playground (No Coding Required)

Instant Access: Sign up, claim your free credits, and start experimenting with MiniMax-M1 and other top models in seconds.

Interactive UI: Test prompts and chain-of-thought reasoning, and view results in real time.

Model Comparison: Effortlessly switch between MiniMax-M1, Qwen 3, Llama 4, DeepSeek, and more to find the perfect fit for your needs.

Integrate via API (For Developers)

Seamlessly connect MiniMax-M1 to applications, workflows, or chatbots using Novita AI’s unified REST API. No model weight management or infrastructure concerns—Novita AI provides multi-language SDKs (Python, Node.js, cURL) and advanced parameter controls.

Option 1: Direct API Integration (Python Example)

from openai import OpenAI

# Novita AI exposes an OpenAI-compatible endpoint, so the OpenAI SDK can be used directly.
client = OpenAI(
    base_url="https://api.novita.ai/v3/openai",
    api_key="<Your API Key>",  # replace with your own Novita AI API key
)

model = "minimaxai/minimax-m1-80k"
stream = True  # set to False to receive the full response in one piece
max_tokens = 20000
system_content = "Be a helpful assistant"
temperature = 1
top_p = 1
min_p = 0
top_k = 50
presence_penalty = 0
frequency_penalty = 0
repetition_penalty = 1
response_format = {"type": "text"}

chat_completion_res = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": system_content,
        },
        {
            "role": "user",
            "content": "Hi there!",
        },
    ],
    stream=stream,
    max_tokens=max_tokens,
    temperature=temperature,
    top_p=top_p,
    presence_penalty=presence_penalty,
    frequency_penalty=frequency_penalty,
    response_format=response_format,
    # Parameters outside the standard OpenAI schema are passed through extra_body.
    extra_body={
        "top_k": top_k,
        "repetition_penalty": repetition_penalty,
        "min_p": min_p,
    },
)

if stream:
    # Print tokens as they arrive.
    for chunk in chat_completion_res:
        print(chunk.choices[0].delta.content or "", end="")
else:
    print(chat_completion_res.choices[0].message.content)
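
The SDK call above is an HTTP POST under the hood, so the same request can be sketched with only the Python standard library. This assumes the endpoint follows the usual OpenAI convention of a /chat/completions route under the base URL (which is what the OpenAI SDK appends) and returns the standard chat completion JSON shape; the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.novita.ai/v3/openai"
MODEL_ID = "minimaxai/minimax-m1-80k"


def build_chat_request(
    api_key: str, user_message: str, max_tokens: int = 1024
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions route."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
        # Sampling settings matching those stated under the benchmark table.
        "temperature": 1.0,
        "top_p": 0.95,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant's reply text (assumed response shape)."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling send(build_chat_request("<Your API Key>", "Hi there!")) would perform the actual network request; keeping construction and sending separate makes the payload easy to inspect or log first.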

Option 2: Multi-Agent Workflows with OpenAI Agents SDK

Build advanced multi-agent systems by integrating Novita AI with the OpenAI Agents SDK:

Plug-and-play: Use Novita AI’s MiniMax-M1 in any OpenAI Agents workflow
Supports handoffs, routing, and tool use: Design agents that can delegate, triage, or run functions, all powered by MiniMax-M1’s hybrid-attention capabilities
Python integration: Simply point the SDK to Novita’s endpoint (https://api.novita.ai/v3/openai) and use your API key
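
As a minimal sketch of the points above, the following shows one way to wire the OpenAI Agents SDK (installed with pip install openai-agents) to Novita's endpoint and define a triage agent with two handoff targets. The agent names, instructions, and the NOVITA_API_KEY environment variable are assumptions made for illustration, and handoffs rely on the endpoint accepting tool calls for this model.

```python
import os

NOVITA_BASE_URL = "https://api.novita.ai/v3/openai"  # endpoint given in the text above
MODEL_ID = "minimaxai/minimax-m1-80k"                # model ID from the Python example


def build_triage_agent(api_key: str):
    """Create a triage agent that can hand off to two specialist agents."""
    # Imported here so this module loads even when the SDK is not installed.
    from agents import Agent, OpenAIChatCompletionsModel, set_tracing_disabled
    from openai import AsyncOpenAI

    # Tracing is exported to OpenAI by default and needs an OpenAI key, so disable it.
    set_tracing_disabled(True)

    client = AsyncOpenAI(base_url=NOVITA_BASE_URL, api_key=api_key)
    model = OpenAIChatCompletionsModel(model=MODEL_ID, openai_client=client)

    # Hypothetical specialist agents, both backed by MiniMax-M1.
    math_agent = Agent(
        name="Math tutor",
        instructions="Solve math problems step by step.",
        model=model,
    )
    code_agent = Agent(
        name="Code reviewer",
        instructions="Find and explain bugs in the user's code.",
        model=model,
    )
    return Agent(
        name="Triage",
        instructions="Route each question to the most suitable specialist.",
        model=model,
        handoffs=[math_agent, code_agent],
    )


# Only runs when executed directly with NOVITA_API_KEY (an assumed variable name) set.
if __name__ == "__main__" and os.environ.get("NOVITA_API_KEY"):
    from agents import Runner

    triage = build_triage_agent(os.environ["NOVITA_API_KEY"])
    result = Runner.run_sync(triage, "Why does my Python loop never terminate?")
    print(result.final_output)
```

Sharing one model object across agents keeps every step of the workflow on MiniMax-M1; a different model could be passed per agent if a cheaper router is preferred.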

Connect MiniMax-M1 API on Third-Party Platforms

Hugging Face: Use MiniMax-M1 in Spaces, pipelines, or with the Transformers library via Novita AI endpoints.
Agent & Orchestration Frameworks: Easily connect Novita AI with partner platforms like Continue, AnythingLLM, LangChain, Dify, and Langflow through official connectors and step-by-step integration guides.
OpenAI-Compatible API: Enjoy hassle-free migration and integration with tools such as Cline and Cursor, designed for the OpenAI API standard.

Conclusion

MiniMax-M1 establishes a new benchmark for open-weight reasoning models through its hybrid-attention architecture and exceptional performance across diverse domains.

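The model’s 1 million-token context window and its compute efficiency (MiniMax reports that M1 uses about 25% of the FLOPs of DeepSeek R1 at a generation length of 100K tokens, a 75% reduction) make it well suited to complex, real-world applications that require both efficiency and advanced reasoning.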

Ready to experience the future of AI reasoning? Try MiniMax-M1 on Novita AI and claim your $10 free credits today.

Novita AI is an AI cloud platform that offers developers an easy way to deploy AI models through a simple API, while also providing an affordable and reliable GPU cloud for building and scaling.
