MiniMax M2.5 on Novita AI: How to Set It Up and Cost Breakdown

MiniMax M2.5 is one of the fastest, most cost-effective AI coding agents available — and with Novita AI, you can access it for just $0.30/$1.20 per 1M tokens. Achieving 80.2% on SWE-Bench Verified and 51.3% on Multi-SWE-Bench, M2.5 delivers SOTA coding performance while completing tasks 37% faster than M2.1 — matching Claude Opus 4.6’s speed at a fraction of the cost.

This guide shows you exactly how to access MiniMax M2.5 through Novita AI’s OpenAI-compatible API, deploy it for production workloads, and maximize its unique strengths in agentic coding, tool use, and office automation.

What Is MiniMax M2.5?

MiniMax M2.5 is a 228.7B-parameter mixture-of-experts (MoE) model specifically trained for real-world productivity tasks. Built with 256 experts and 8 experts activated per token, it delivers frontier-level performance in coding, agentic tool use, web search, and office automation while maintaining extreme inference efficiency.

Architecture of MiniMax M2.5

| Specification | MiniMax M2.5 |
| --- | --- |
| Total Parameters | 229B |
| Architecture | Mixture-of-Experts (MoE) |
| Number of Experts | 256 total, 8 active per token |
| Context Length | 196,608 tokens (~196K) |
| Hidden Size | 3072 |
| Layers | 62 |
| Vocabulary Size | 200,064 |
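
The routing ratio above means only a small fraction of the model's weights are touched per token. As a rough back-of-envelope sketch (it assumes expert weights dominate the parameter count and deliberately ignores the always-active attention, embedding, and router weights, whose exact sizes are not published):

```python
# Back-of-envelope estimate of per-token active parameters for an MoE model.
# Assumption: expert weights dominate and are evenly sized; shared
# (attention/embedding/router) parameters are ignored for simplicity.

TOTAL_PARAMS = 229e9      # MiniMax M2.5 total parameters
NUM_EXPERTS = 256         # total experts
ACTIVE_EXPERTS = 8        # experts routed per token

routing_fraction = ACTIVE_EXPERTS / NUM_EXPERTS   # 8/256 = 3.125%
active_params = TOTAL_PARAMS * routing_fraction   # naive estimate

print(f"Routing fraction: {routing_fraction:.3%}")
print(f"Approx. active parameters per token: {active_params / 1e9:.1f}B")
```

Under these assumptions, only about 7B of the 229B parameters participate in each forward pass, which is where the model's inference efficiency comes from.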

Benchmarks of MiniMax M2.5

MiniMax M2.5 achieves state-of-the-art results across coding, agentic tasks, and office automation benchmarks — matching or exceeding models 3-5x more expensive. The model was trained with reinforcement learning in 200,000+ real-world environments, giving it unmatched generalization on practical tasks.

Coding, Agentic Tasks & Tool Use

MiniMax M2.5 does not dominate every benchmark, but it maintains consistently strong results across simulation, retrieval, and multi-turn reasoning tasks. Its profile suggests:

  • Strong agent-style task coordination
  • Robust retrieval and search integration
  • Stable multi-turn reasoning
  • Competitive structured environment simulation

Overall, MiniMax M2.5 appears optimized for applied agentic workflows and complex multi-step execution rather than purely academic reasoning benchmarks.

Office Automation

MiniMax M2.5 is not designed to dominate abstract academic reasoning benchmarks or pure mathematical competitions. Its strength lies in professional office execution tasks, especially those requiring structured, deliverable outputs.

| Benchmark | MiniMax M2.5 | MiniMax M2.1 | Claude Opus 4.5 | Claude Opus 4.6 | Gemini 3 Pro | GPT-5.2 |
| --- | --- | --- | --- | --- | --- | --- |
| GDPval-MM | 59.0 | 24.6 | 61.1 | 73.5 | 28.1 | 54.5 |
| MEWC | 74.4 | 55.6 | 82.1 | 89.8 | 78.7 | 41.3 |
| Finance Modeling | 21.6 | 17.3 | 30.1 | 33.2 | 15.0 | 20.0 |

Speed of MiniMax M2.5

Why M2.5’s speed matters: completing SWE-Bench tasks 37% faster than M2.1 means both lower API costs and faster iteration cycles. For a typical multi-file refactoring task, M2.5 finishes in 45 seconds versus M2.1’s 70 seconds, saving time and money at scale.
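
Using the example per-task timings above (45 s vs 70 s), a quick sketch of what that speedup adds up to over a day. The 200-tasks-per-day workload is a hypothetical figure chosen for illustration:

```python
# Time saved from M2.5's faster task completion, using the example
# per-task timings quoted above (45 s vs 70 s for a multi-file refactor).

M25_SECONDS = 45
M21_SECONDS = 70
TASKS_PER_DAY = 200  # hypothetical agent workload

speedup = 1 - M25_SECONDS / M21_SECONDS                      # fraction of time saved
hours_saved = (M21_SECONDS - M25_SECONDS) * TASKS_PER_DAY / 3600

print(f"Per-task time reduction: {speedup:.1%}")
print(f"Hours saved per day at {TASKS_PER_DAY} tasks: {hours_saved:.2f}")
```

On these example numbers the per-task reduction works out to roughly 36%, in line with the 37% SWE-Bench figure.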

Why MiniMax M2.5 on Novita AI?

Novita AI offers the best cost-performance trade-off for running MiniMax M2.5 in production. While self-hosting requires 4-8 H100 GPUs (minimum $5.80/hr), Novita’s serverless API costs just $0.30 input / $1.20 output per 1M tokens — with zero infrastructure overhead, instant scaling, and 99.5% uptime SLA.

Key advantages of Novita AI for MiniMax M2.5:

| Feature | Novita AI | Self-Hosted |
| --- | --- | --- |
| Setup Time | 2 minutes (API key) | 2-5 days (GPU provisioning + setup) |
| Cost Model | Pay-per-token ($0.30/$1.20 per 1M) | Fixed GPU rental ($5.80/hr+ for 4×H100) |
| Scaling | Instant auto-scaling | Manual GPU provisioning |
| Maintenance | Zero (managed service) | High (vLLM, drivers, updates) |
| Availability | 99.5% SLA | Depends on your infrastructure |
| Best For | Variable workloads, rapid prototyping, production APIs | 24/7 high-volume inference with predictable load |
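
To make the trade-off concrete, here is a rough break-even sketch. The GPU figure is the $5.80/hr minimum quoted above, assumed rented 24/7; the 3:1 input-to-output token mix is our own assumption (a common ratio for chat workloads), so treat the result as an order-of-magnitude estimate, not a quote:

```python
# Rough monthly break-even between Novita's pay-per-token API and
# renting 4xH100 GPUs around the clock. The 3:1 input:output token
# mix is an assumption for illustration.

INPUT_PRICE = 0.30        # $ per 1M input tokens
OUTPUT_PRICE = 1.20       # $ per 1M output tokens
GPU_RATE = 5.80           # $ per hour, 4xH100 minimum
HOURS_PER_MONTH = 24 * 30

gpu_monthly = GPU_RATE * HOURS_PER_MONTH                 # fixed self-hosting floor
blended = (3 * INPUT_PRICE + 1 * OUTPUT_PRICE) / 4       # $/1M tokens at 3:1 mix
breakeven_tokens_m = gpu_monthly / blended               # millions of tokens/month

print(f"Self-hosting floor: ${gpu_monthly:,.0f}/month")
print(f"Blended API price:  ${blended:.3f} per 1M tokens")
print(f"Break-even volume:  ~{breakeven_tokens_m:,.0f}M tokens/month")
```

Under these assumptions the fixed GPU bill only pays for itself at billions of tokens per month of sustained load; below that, pay-per-token is cheaper.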

How to Access MiniMax M2.5 on Novita AI

Step 1: Log In and Access the Model Library

Log in to your account and click on the Model Library button.


Step 2: Choose Your Model

Browse through the available options and select the model that suits your needs.


Step 3: Start Your Free Trial

Begin your free trial to explore the capabilities of the selected model.

Novita AI provides OpenAI-compatible endpoints for MiniMax M2.5

Step 4: Get Your API Key

To authenticate with the API, you will need an API key. Open the “Settings” page and copy your API key, as shown in the image.


Step 5: Install the SDK and Call the API

Install the OpenAI-compatible SDK using your language’s package manager (for Python: pip install openai).

After installation, import the library and initialize the client with your API key to start interacting with Novita AI’s LLM endpoint. Here is an example using the chat completions API in Python.

from openai import OpenAI

client = OpenAI(
    api_key="<Your API Key>",
    base_url="https://api.novita.ai/openai"
)

response = client.chat.completions.create(
    model="minimax/minimax-m2.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"}
    ],
    max_tokens=131072,
    temperature=0.7
)

print(response.choices[0].message.content)
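
For long agentic responses you will usually want streaming rather than waiting for the full completion. The sketch below separates the request parameters (testable offline) from the network call; `build_request` is a helper name of our own, not part of the SDK:

```python
def build_request(prompt: str, stream: bool = True) -> dict:
    """Assemble kwargs for the OpenAI-compatible chat completions call."""
    return {
        "model": "minimax/minimax-m2.5",
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
        "temperature": 0.7,
    }

if __name__ == "__main__":
    # Deferred import so the helper above stays dependency-free.
    from openai import OpenAI

    client = OpenAI(api_key="<Your API Key>", base_url="https://api.novita.ai/openai")
    # With stream=True the SDK returns an iterator of chunks; print tokens as they arrive.
    for chunk in client.chat.completions.create(**build_request("Summarize MoE routing.")):
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
```

Streaming makes agent UIs feel responsive even on multi-thousand-token outputs, since the first tokens appear as soon as generation starts.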

Easily connect Novita AI with partner platforms like Trae, Continue, Codex, OpenCode, AnythingLLM, LangChain, Dify, Langflow, and Openclaw through official integrations and step-by-step guides.

Use Cases: Where MiniMax M2.5 Shines

As a practical test, we gave M2.5 a scoped software engineering task to see how it plans and executes. The model produced a complete spec-first plan with UI wireframes and API endpoints, then implemented it in 1,200+ lines of TypeScript/JavaScript. The tests passed on the first run in 22 minutes, faster than Claude Opus 4.6’s average, and the result is a functional application with JWT auth and MongoDB integration.

Prompt used: “Build a React app with Node.js backend for user authentication, including database schema.”

MiniMax M2.5 on Novita AI delivers frontier-level agentic coding performance at 1/10th the cost of premium alternatives. With 80.2% SWE-Bench Verified, 37% faster task completion than M2.1, and $0.30/$1.20 per 1M tokens, it’s the optimal choice for production AI coding agents, office automation, and tool orchestration workflows.

Frequently Asked Questions

How does MiniMax M2.5 compare to M2.1?

M2.5 is 37% faster on SWE-Bench tasks and achieves 80.2% vs ~70% on SWE-Bench Verified. Both cost the same ($0.30/$1.20 per 1M tokens on Novita), making M2.5 the clear upgrade.

Can I self-host MiniMax M2.5 instead of using Novita API?

Yes, but it requires 4-8 H100 GPUs (minimum $5.80/hr on Novita GPU instances). Self-hosting only makes economic sense above 500M tokens/month — for most developers, the API is far more cost-effective.

Does MiniMax M2.5 support function calling?

Yes. M2.5 was extensively trained on tool use and function calling across 200,000+ real-world environments, achieving industry-leading performance on BrowseComp (76.3%) and Wide Search benchmarks.
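
Function calling works through the standard OpenAI `tools` parameter. The sketch below uses a hypothetical `get_weather` tool with a stubbed local implementation to show the round trip; only the schema shape and the `tool_calls` response field come from the OpenAI-compatible API, everything else is illustrative:

```python
import json

# Hypothetical local tool; the schema follows the standard OpenAI `tools` format.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Route a model-issued tool call to the matching local function (stubbed)."""
    args = json.loads(arguments_json)
    if name == "get_weather":
        return f"Sunny, 22°C in {args['city']}"  # stub; swap in a real weather API
    raise ValueError(f"unknown tool: {name}")

if __name__ == "__main__":
    from openai import OpenAI

    client = OpenAI(api_key="<Your API Key>", base_url="https://api.novita.ai/openai")
    resp = client.chat.completions.create(
        model="minimax/minimax-m2.5",
        messages=[{"role": "user", "content": "What's the weather in Paris?"}],
        tools=[WEATHER_TOOL],
    )
    # The model responds with a tool call; execute it locally and feed the
    # result back in a follow-up message for a final answer.
    call = resp.choices[0].message.tool_calls[0]
    print(dispatch_tool_call(call.function.name, call.function.arguments))
```

In a full agent loop you would append the tool result as a `"tool"` role message and call the model again so it can compose the final reply.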

Novita AI is an AI & agent cloud platform helping developers and startups build, deploy, and scale models and agentic applications with high performance, reliability, and cost efficiency.
