MiniMax-M1 Now Available on Novita AI — Experience Hybrid-Attention Reasoning


MiniMax-M1, the world’s first open-weight hybrid-attention reasoning model, is now live on Novita AI! This breakthrough model features 456 billion parameters—with 45.9 billion activated per token—and natively supports a 1 million-token context, which is 8× larger than DeepSeek R1.

For a limited time, new users can claim $10 in free credits to explore MiniMax-M1’s advanced reasoning capabilities. Power your applications with cutting-edge hybrid-attention technology—MiniMax-M1 is just an API call away.

Here’s the current MiniMax-M1 pricing on Novita AI:

MiniMax-M1-80K: $0.55 / M input tokens, $2.2 / M output tokens
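To get a feel for what these rates mean per request, here is a minimal sketch that turns token counts into an estimated cost. The two prices come from the line above; the helper name `estimate_cost_usd` and the example token counts are illustrative assumptions, not part of any Novita AI SDK.

```python
from decimal import Decimal

# Prices quoted above for MiniMax-M1-80K on Novita AI (USD per million tokens).
INPUT_PRICE_PER_M = Decimal("0.55")
OUTPUT_PRICE_PER_M = Decimal("2.2")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_PRICE_PER_M
             + Decimal(output_tokens) * OUTPUT_PRICE_PER_M)
    return total / TOKENS_PER_M


# Example: a 200,000-token document summarized with a 20,000-token answer.
print(estimate_cost_usd(200_000, 20_000))
```

Decimal arithmetic avoids floating-point rounding noise in the result. Actual billing may differ, so treat the number as an estimate only.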

What is MiniMax-M1?

MiniMax-M1 represents a paradigm shift in large language model architecture. Developed by MiniMax-AI, this innovative model introduces the world’s first open-weight, large-scale hybrid-attention reasoning system that combines a hybrid Mixture-of-Experts (MoE) architecture with a revolutionary lightning attention mechanism.

Key Features of MiniMax-M1

🔹 Hybrid-Attention & Mixture-of-Experts

MoE layers activate 45.9 B of the model’s 456 B parameters for each token (about 10%), paired with lightning attention for speed and efficiency.

🔹 Massive 1 Million‑Token Context

Supports up to a million tokens natively, ideal for summarizing books, logs, and whole codebases.

🔹 Efficient Reinforcement Learning

CISPO, the reinforcement learning algorithm introduced with the model, improves RL training efficiency by clipping importance-sampling weights rather than token updates.

🔹 Dual Thinking Budgets: 40K & 80K

Choose between MiniMax‑M1‑40K or MiniMax‑M1‑80K depending on required reasoning depth and compute trade-offs.

🔹 Agentic Capabilities & Plugins

Built-in function calling, tool access (search, code execution, image/video generation, TTS), optimized for real-world agent workflows.
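As a rough illustration of the function-calling feature, the sketch below describes a tool in the OpenAI-style "tools" schema and dispatches a tool call locally. The tool `get_word_count`, the `REGISTRY` mapping, and `run_tool_call` are hypothetical names invented for this example. Passing `tools=TOOLS` to `client.chat.completions.create` follows the OpenAI convention; this article does not confirm that Novita AI's endpoint accepts it for this model, so treat that as an assumption.

```python
import json


# Hypothetical local tool; a real agent might call a search API or run code instead.
def get_word_count(text: str) -> int:
    return len(text.split())


# Tool description in the OpenAI-style "tools" schema.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_word_count",
        "description": "Count the words in a piece of text.",
        "parameters": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    },
}]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_word_count": get_word_count}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Execute one tool call and return a JSON string for the follow-up "tool" message."""
    if name not in REGISTRY:
        return json.dumps({"error": f"unknown tool: {name}"})
    args = json.loads(arguments_json)
    return json.dumps({"result": REGISTRY[name](**args)})


# Simulated tool call, shaped like the name/arguments pair in a chat completion.
print(run_tool_call("get_word_count", '{"text": "hybrid attention reasoning model"}'))
```

In a real loop, the returned string would be sent back to the model as a message with role "tool", and the model would then produce its final answer.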

Benchmarks and Performance Analysis

MiniMax-M1 performs well across a broad set of benchmarks. Standard evaluations show it is comparable or superior to strong open-weight alternatives such as the original DeepSeek-R1 and Qwen3-235B-A22B, with its largest advantages in software engineering, tool use, and long-context understanding.

Comprehensive Performance Comparison

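| Category | Task | MiniMax-M1-80K | MiniMax-M1-40K | Qwen3-235B-A22B | DeepSeek-R1-0528 | DeepSeek-R1 | Seed-Thinking-v1.5 | Claude 4 Opus | Gemini 2.5 Pro | OpenAI-o3 |
|---|---|---|---|---|---|---|---|---|---|---|
| | Extended Thinking | 80K | 40K | 32k | 64k | 32k | 32k | 64k | 64k | 100k |
| Mathematics | AIME 2024 | 86.0 | 83.3 | 85.7 | 91.4 | 79.8 | 86.7 | 76.0 | 92.0 | 91.6 |
| | AIME 2025 | 76.9 | 74.6 | 81.5 | 87.5 | 70.0 | 74.0 | 75.5 | 88.0 | 88.9 |
| | MATH-500 | 96.8 | 96.0 | 96.2 | 98.0 | 97.3 | 96.7 | 98.2 | 98.8 | 98.1 |
| General Coding | LiveCodeBench | 65.0 | 62.3 | 65.9 | 73.1 | 55.9 | 67.5 | 56.6 | 77.1 | 75.8 |
| | FullStackBench | 68.3 | 67.6 | 62.9 | 69.4 | 70.1 | 69.9 | 70.3 | n/a | 69.3 |
| Reasoning & Knowledge | GPQA Diamond | 70.0 | 69.2 | 71.1 | 81.0 | 71.5 | 77.3 | 79.6 | 86.4 | 83.3 |
| | ZebraLogic | 86.8 | 80.1 | 80.3 | 95.1 | 78.7 | 84.4 | 95.1 | 91.6 | 95.8 |
| | MMLU-Pro | 81.1 | 80.6 | 83.0 | 85.0 | 84.0 | 87.0 | 85.0 | 86.0 | 85.0 |
| Software Engineering | SWE-bench Verified | 56.0 | 55.6 | 34.4 | 57.6 | 49.2 | 47.0 | 72.5 | 67.2 | 69.1 |
| Long Context | OpenAI-MRCR (128k) | 73.4 | 76.1 | 27.7 | 51.5 | 35.8 | 54.3 | 48.9 | 76.8 | 56.5 |
| | OpenAI-MRCR (1M) | 56.2 | 58.6 | n/a | n/a | n/a | n/a | n/a | 58.8 | n/a |
| | LongBench-v2 | 61.5 | 61.0 | 50.1 | 52.1 | 58.3 | 52.5 | 55.6 | 65.0 | 58.8 |
| Agentic Tool Use | TAU-bench (airline) | 62.0 | 60.0 | 34.7 | 53.5 | n/a | 44.0 | 59.6 | 50.0 | 52.0 |
| | TAU-bench (retail) | 63.5 | 67.8 | 58.6 | 63.9 | n/a | 55.7 | 81.4 | 67.0 | 73.9 |

n/a: no score reported.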
Models evaluated with temperature=1.0, top_p=0.95

Mathematics and Reasoning Excellence

Competition-Level Mathematics
MiniMax-M1-80K scores 86.0 on AIME 2024 and 76.9 on AIME 2025, ahead of the original DeepSeek-R1 (79.8 and 70.0) though behind DeepSeek-R1-0528 (91.4 and 87.5). Its 96.8% accuracy on MATH-500 shows that it handles complex mathematical problems reliably.

Advanced Reasoning Tasks
MiniMax-M1-80K scores 86.8% on ZebraLogic, ahead of Qwen3-235B-A22B (80.3%) and the original DeepSeek-R1 (78.7%), though behind DeepSeek-R1-0528 (95.1%). The 80K and 40K variants score within half a point of each other on MMLU-Pro (81.1% and 80.6%), which shows consistent results across thinking budgets.

Software Engineering and Coding Excellence

Real-World Software Development
MiniMax-M1-80K reaches 56.0% on SWE-bench Verified, far ahead of Qwen3-235B-A22B (34.4%) and close to DeepSeek-R1-0528 (57.6%). This performance demonstrates the model’s ability to understand complex codebases, identify issues, and propose effective solutions in real-world scenarios.

Coding Versatility
Strong performance on LiveCodeBench (65.0%) and FullStackBench (68.3%) highlights the model’s versatility across programming paradigms and frameworks, establishing it as a leading choice for software development applications.

Long Context and Agentic Superiority

Extended Context Processing
MiniMax-M1 shows strong long-context understanding. On OpenAI-MRCR (128k), MiniMax-M1-80K scores 73.4%, far ahead of other open-weight models such as Qwen3-235B-A22B (27.7%) and DeepSeek-R1-0528 (51.5%). At the 1M-token setting it scores 56.2%, which shows that it stays coherent across extremely long sequences.

Agentic Capabilities
On TAU-bench, MiniMax-M1-80K scores 62.0% on airline tasks and 63.5% on retail tasks. The airline score is well ahead of Qwen3-235B-A22B (34.7%), which shows competitive agentic capability across domains.

Performance Leadership Analysis

Figure: Benchmark performance comparison of leading commercial and open-weight models across competition-level mathematics, coding, software engineering, agentic tool use, and long-context understanding tasks. MiniMax-M1-80K represents MiniMax-M1 in this comparison.

Open-Weight Dominance
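Among the open-weight models compared, MiniMax-M1-80K leads on the long-context benchmarks (OpenAI-MRCR and LongBench-v2) and on TAU-bench airline, and it is close to DeepSeek-R1-0528 on SWE-bench Verified. DeepSeek-R1-0528 remains ahead on the mathematics and general coding benchmarks.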

Context Length Advantage
The long context window pays off on long-context benchmarks: on OpenAI-MRCR (128k), MiniMax-M1-80K leads Qwen3-235B-A22B by more than 45 percentage points (73.4 vs. 27.7).
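The long-context gap can be checked directly from the OpenAI-MRCR (128k) scores reported in the comparison table. This small sketch uses only those published numbers. The helper name `gaps_vs` is an illustrative assumption.

```python
# OpenAI-MRCR (128k) scores for the open-weight models in the comparison table.
MRCR_128K = {
    "MiniMax-M1-80K": 73.4,
    "Qwen3-235B-A22B": 27.7,
    "DeepSeek-R1-0528": 51.5,
    "DeepSeek-R1": 35.8,
}


def gaps_vs(scores: dict, baseline: str) -> dict:
    """Return the baseline's lead over every other model, in percentage points."""
    base = scores[baseline]
    return {
        name: round(base - score, 1)
        for name, score in scores.items()
        if name != baseline
    }


for model, gap in gaps_vs(MRCR_128K, "MiniMax-M1-80K").items():
    print(f"{model}: +{gap} points")
```

The lead over Qwen3-235B-A22B exceeds 45 points. The lead over the two DeepSeek models is smaller but still substantial.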

Commercial Competitiveness
While commercial models achieve higher scores in some categories, MiniMax-M1 offers competitive performance with open-weight accessibility and deployment flexibility advantages.

How to Access MiniMax-M1 on Novita AI

Getting started with MiniMax-M1 on Novita AI is streamlined and risk-free. New users receive $10 in free credits—sufficient to explore MiniMax-M1’s hybrid-attention reasoning capabilities, build prototypes, and launch initial use cases without upfront costs.

Use the Playground (No Coding Required)

Instant Access: Sign up, claim your free credits, and start experimenting with MiniMax-M1 and other top models in seconds.

Interactive UI: Test prompts, chain-of-thought reasoning, and visualize results in real time.

Model Comparison: Effortlessly switch between MiniMax-M1, Qwen 3, Llama 4, DeepSeek, and more to find the perfect fit for your needs.

Integrate via API (For Developers)

Seamlessly connect MiniMax-M1 to applications, workflows, or chatbots using Novita AI’s unified REST API. No model weight management or infrastructure concerns—Novita AI provides multi-language SDKs (Python, Node.js, cURL) and advanced parameter controls.

Option 1: Direct API Integration (Python Example)

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.novita.ai/v3/openai",
    # Read the key from the environment instead of hardcoding it.
    # NOVITA_API_KEY is an arbitrary variable name; use whatever your setup defines.
    api_key=os.environ["NOVITA_API_KEY"],
)

model = "minimaxai/minimax-m1-80k"
stream = True # or False
max_tokens = 20000
system_content = "Be a helpful assistant"
temperature = 1
top_p = 1
min_p = 0
top_k = 50
presence_penalty = 0
frequency_penalty = 0
repetition_penalty = 1
response_format = { "type": "text" }

chat_completion_res = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": system_content,
        },
        {
            "role": "user",
            "content": "Hi there!",
        }
    ],
    stream=stream,
    max_tokens=max_tokens,
    temperature=temperature,
    top_p=top_p,
    presence_penalty=presence_penalty,
    frequency_penalty=frequency_penalty,
    response_format=response_format,
    extra_body={
      "top_k": top_k,
      "repetition_penalty": repetition_penalty,
      "min_p": min_p
    }
)

if stream:
    for chunk in chat_completion_res:
        print(chunk.choices[0].delta.content or "", end="")
else:
    print(chat_completion_res.choices[0].message.content)
 

Option 2: Multi-Agent Workflows with OpenAI Agents SDK

Build advanced multi-agent systems by integrating Novita AI with the OpenAI Agents SDK:

  • Plug-and-play: Use Novita AI’s MiniMax-M1 in any OpenAI Agents workflow
  • Supports handoffs, routing, and tool use: Design agents that can delegate, triage, or run functions, all powered by MiniMax-M1’s hybrid-attention capabilities
  • Python integration: Simply point the SDK to Novita’s endpoint (https://api.novita.ai/v3/openai) and use your API key
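The points above can be sketched as follows. This is a minimal example under several assumptions: the `openai-agents` package is installed, the API key lives in an environment variable named `NOVITA_API_KEY` (a name chosen for this example), and Novita AI's endpoint supports the tool calls that the SDK uses for handoffs. The agent names and instructions are illustrative.

```python
import os

NOVITA_BASE_URL = "https://api.novita.ai/v3/openai"
MODEL_NAME = "minimaxai/minimax-m1-80k"


def build_triage_agent(api_key: str):
    """Return a triage agent that hands off to two specialists backed by MiniMax-M1."""
    # Imported here so this file still loads where the SDK is not installed
    # (pip install openai-agents).
    from openai import AsyncOpenAI
    from agents import Agent, OpenAIChatCompletionsModel, set_tracing_disabled

    # Tracing uploads to OpenAI by default and expects an OpenAI key, so turn it off.
    set_tracing_disabled(True)

    client = AsyncOpenAI(base_url=NOVITA_BASE_URL, api_key=api_key)
    model = OpenAIChatCompletionsModel(model=MODEL_NAME, openai_client=client)

    coder = Agent(name="Coder", instructions="Answer programming questions.", model=model)
    writer = Agent(name="Writer", instructions="Answer writing questions.", model=model)
    return Agent(
        name="Triage",
        instructions="Route each request to the Coder or the Writer.",
        handoffs=[coder, writer],
        model=model,
    )


def main() -> None:
    from agents import Runner

    agent = build_triage_agent(os.environ["NOVITA_API_KEY"])  # hypothetical variable name
    result = Runner.run_sync(agent, "Write a Python function that reverses a string.")
    print(result.final_output)

# Call main() to run the workflow; it sends requests to Novita AI.
```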

Connect MiniMax-M1 API on Third-Party Platforms

  • Hugging Face: Use MiniMax-M1 in Spaces, pipelines, or with the Transformers library via Novita AI endpoints.
  • Agent & Orchestration Frameworks: Easily connect Novita AI with partner platforms like Continue, AnythingLLM, LangChain, Dify, and Langflow through official connectors and step-by-step integration guides.
  • OpenAI-Compatible API: Enjoy hassle-free migration and integration with tools such as Cline and Cursor, designed for the OpenAI API standard.

Conclusion

MiniMax-M1 establishes a new benchmark for open-weight reasoning models through its hybrid-attention architecture and exceptional performance across diverse domains.

The model’s 1 million-token context and lower compute cost (MiniMax reports about 25% of DeepSeek R1’s FLOPs at a 100K-token generation length, a 75% reduction) make it ideal for complex, real-world applications requiring both efficiency and advanced reasoning.

Ready to experience the future of AI reasoning? Try MiniMax-M1 on Novita AI and claim your $10 free credits today.

Novita AI is an AI cloud platform that offers developers an easy way to deploy AI models through a simple API, while also providing an affordable and reliable GPU cloud for building and scaling.

