Use MiniMax M2.7 in Claude Code via Novita AI

MiniMax M2.7 is available on Novita AI through an Anthropic-compatible endpoint, so you can drop it into Claude Code with four environment variables and no other changes. Input costs $0.30 per million tokens, roughly 23x cheaper than the ~$7.00 this guide lists for Claude Opus 4.6, and the model holds 97% tool-call accuracy across 40+ concurrent tools. This guide covers the exact setup, verified specs, cost math, and the tradeoffs worth knowing before you commit.

Quick answer: Set ANTHROPIC_BASE_URL=https://api.novita.ai/anthropic, ANTHROPIC_AUTH_TOKEN=<your key>, ANTHROPIC_MODEL=minimax/minimax-m2.7, and ANTHROPIC_SMALL_FAST_MODEL=minimax/minimax-m2.7 before launching Claude Code. That’s the entire integration.

Why M2.7 Works as a Claude Code Backend

Claude Code routes every file read, terminal command, and edit through tool calls. The model behind it needs to reliably invoke those tools across long, multi-step sessions — not just on simple prompts.

M2.7’s headline stat is 97% skill-following accuracy with 40+ concurrent tools, each with definitions exceeding 2,000 tokens. That maps directly to Claude Code’s tool surface: Read, Write, Edit, Bash, Glob, Grep, WebFetch, and more. Where models with weaker tool adherence drop parameters or invoke the wrong function mid-session, M2.7 holds its accuracy.

The other relevant number is the context window: 204,800 tokens. That’s enough to keep a large codebase, conversation history, and tool results in context without forced compaction mid-task.

What M2.7 is good at in Claude Code:

  • Multi-file refactors requiring sustained tool chaining
  • Production debugging workflows (log analysis → database queries → fix)
  • SWE-Bench-style tasks: 56.2% on SWE-Bench Pro, 52.7% on Multi-SWE-Bench
  • Agent workflows that need role-consistent, multi-turn reasoning
  • Cost-sensitive teams running many concurrent sessions

Where to set expectations:

  • Text-only input modality (no image or file uploads to the model)
  • 204,800-token context is large but not the 1M tier of M3
  • Self-hosted deployment requires significant GPU — API access is the practical path

MiniMax M2.7 API Specs on Novita AI

Field | Value
Model ID | minimax/minimax-m2.7
Anthropic base URL | https://api.novita.ai/anthropic
OpenAI base URL | https://api.novita.ai/openai/v1
Context window | 204,800 tokens
Max output | 131,072 tokens
Input modalities | Text
Output modalities | Text
Supported features | Function calling, structured output, reasoning, JSON mode, Anthropic API compatibility
Input pricing | $0.30 per 1M tokens
Output pricing | $1.20 per 1M tokens
Cache read pricing | $0.06 per 1M tokens
Date checked | 2026-07-03 (sources: Novita AI model page and MiniMax M2.7 on Novita)

The Anthropic-compatible endpoint (/anthropic) is what Claude Code uses. The OpenAI-compatible endpoint works for direct API calls and other tools.

Step 1: Get Your Novita AI API Key

  1. Go to novita.ai and create an account (free sign-up)
  2. Navigate to Settings → API Keys
  3. Click Create New Key, name it (e.g., claude-code-m2.7), and copy the key
  4. Keep the key out of source code, public repos, and shared notebooks

Store it as an environment variable rather than hardcoding it anywhere:

export NOVITA_API_KEY="your_key_here"

Step 2: Install Claude Code

If Claude Code is already installed, skip to Step 3.

macOS, Linux, or WSL:

curl -fsSL https://claude.ai/install.sh | bash

Windows PowerShell:

irm https://claude.ai/install.ps1 | iex

Windows CMD:

curl -fsSL https://claude.ai/install.cmd -o install.cmd && install.cmd && del install.cmd

Windows users need Git for Windows installed first.

Verify the installation:

claude --version

Step 3: Configure Environment Variables

Claude Code reads four variables to route requests to a custom model endpoint. Set all four before launching Claude Code.

macOS and Linux:

export ANTHROPIC_BASE_URL="https://api.novita.ai/anthropic"
export ANTHROPIC_AUTH_TOKEN="<your Novita API key>"
export ANTHROPIC_MODEL="minimax/minimax-m2.7"
export ANTHROPIC_SMALL_FAST_MODEL="minimax/minimax-m2.7"

Windows PowerShell:

$env:ANTHROPIC_BASE_URL = "https://api.novita.ai/anthropic"
$env:ANTHROPIC_AUTH_TOKEN = "<your Novita API key>"
$env:ANTHROPIC_MODEL = "minimax/minimax-m2.7"
$env:ANTHROPIC_SMALL_FAST_MODEL = "minimax/minimax-m2.7"

ANTHROPIC_SMALL_FAST_MODEL controls lightweight operations like file navigation and quick searches. Setting it to the same model keeps all requests on the same billing path and pricing.

To make the configuration persistent, add the export lines to your shell profile (~/.bashrc or ~/.zshrc), or the $env: lines to your PowerShell profile. Alternatively, keep a project-level .env file containing the export lines and source it before each session. Don’t commit that file, since it holds your API key.
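If you would rather not leave the overrides active in every shell, one option is to scope them to a single command. The sketch below assumes the key from Step 1 is stored in NOVITA_API_KEY. The function name claude_m27 and the CLAUDE_BIN override are hypothetical names introduced here for illustration; neither is part of Claude Code or Novita AI tooling.

```shell
# Hypothetical wrapper: apply the four variables to one Claude Code run only.
# Assumes NOVITA_API_KEY was exported as shown in Step 1.
claude_m27() {
  if [ -z "${NOVITA_API_KEY:-}" ]; then
    echo "NOVITA_API_KEY is not set; export it first" >&2
    return 1
  fi
  # Assignments placed before a command apply only to that command's
  # environment, so the calling shell is left unchanged.
  # CLAUDE_BIN is a made-up override that lets you swap in another
  # binary (for example `env`) to inspect what would be passed along.
  ANTHROPIC_BASE_URL="https://api.novita.ai/anthropic" \
  ANTHROPIC_AUTH_TOKEN="$NOVITA_API_KEY" \
  ANTHROPIC_MODEL="minimax/minimax-m2.7" \
  ANTHROPIC_SMALL_FAST_MODEL="minimax/minimax-m2.7" \
  "${CLAUDE_BIN:-claude}" "$@"
}
```

With this in your shell profile, running claude_m27 starts a Novita-backed session, while a plain claude keeps whatever configuration you had before.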

Step 4: Start Claude Code and Verify

Navigate to your project directory and start Claude Code:

cd your-project-directory
claude

At the prompt, run a quick test:

> What files are in this directory?

If the model responds with a file list by invoking the appropriate tool, the integration is working. You can confirm the model in use and check per-request token costs in your Novita AI usage dashboard.

If you see an authentication error, double-check that ANTHROPIC_AUTH_TOKEN is set in the current shell session — not just in a different terminal window.
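To separate configuration problems from Claude Code itself, you can send one small request straight to the endpoint. This is a sketch under two assumptions that are not confirmed by this guide: that the endpoint follows the Anthropic Messages API shape at <base URL>/v1/messages, and that it accepts the key as a bearer token. Check Novita AI's API documentation before relying on either. The function names are hypothetical.

```shell
# Sketch only. Assumptions: the endpoint serves the Anthropic Messages API
# at <base>/v1/messages and accepts "Authorization: Bearer <key>".

# Build the request URL from the same variable Claude Code reads,
# dropping a trailing slash if one is present.
messages_url() {
  printf '%s/v1/messages\n' "${ANTHROPIC_BASE_URL%/}"
}

# Send one tiny request and print only the HTTP status code.
novita_ping() {
  curl -sS -o /dev/null -w '%{http_code}\n' "$(messages_url)" \
    -H "Authorization: Bearer ${ANTHROPIC_AUTH_TOKEN}" \
    -H "anthropic-version: 2023-06-01" \
    -H "content-type: application/json" \
    -d "{\"model\": \"${ANTHROPIC_MODEL}\", \"max_tokens\": 16, \"messages\": [{\"role\": \"user\", \"content\": \"ping\"}]}"
}
```

Run novita_ping in the same shell where you exported the variables. A 200 suggests the URL, key, and model ID are all accepted; a 401 points at the token rather than at Claude Code.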

Step 5: Optimize for Agentic Workflows

M2.7’s 97% tool adherence and 204,800-token context make it well-suited for extended agentic sessions. A few practices that get more out of those capabilities:

Give the model the full codebase context first. Before asking for a change, let Claude Code read the relevant files. M2.7’s large context window means you can load multiple modules without hitting limits early.

> Read all files in src/api/ and identify any functions that handle authentication

Use explicit task decomposition for complex work. Breaking a large request into sequential steps reduces the chance of partial output or mid-session confusion:

> First, analyze the current database schema. Then propose the migration needed to add soft deletes. Don't write any code yet.

Leverage tool chaining for debugging. M2.7’s production debugging strength is most visible in multi-step sequences: reproduce → trace → fix → verify.

> Run the test suite, identify failing tests, find the root cause in the source code, and propose a fix

Keep sessions focused. M2.7 handles multi-turn coherence well, but narrower sessions with clear goals produce more reliable output than open-ended exploratory sessions that shift direction repeatedly.

Cost Analysis

At $0.30 input and $1.20 output per million tokens, M2.7 is materially cheaper than most frontier options while staying competitive on agentic benchmarks.

Typical Claude Code workflow costs with M2.7 via Novita AI:

Workflow | Estimated cost
Small refactor (5 files, ~200 line changes) | $0.02–$0.05
Feature implementation (20 files, ~1000 lines) | $0.10–$0.20
Full codebase analysis (100+ files) | $0.25–$0.60
One hour of continuous coding at 100 tokens/sec | ~$0.27

Model cost comparison for Claude Code use:

Model | Input per 1M tokens | Output per 1M tokens | vs. M2.7 input
MiniMax M2.7 (Novita AI) | $0.30 | $1.20 | baseline
MiniMax M2.5 (Novita AI) | $0.30 | $1.20 | same
GLM-5 (Novita AI) | $1.00 | $3.20 | 3.3x more
Claude Opus 4.6 | ~$7.00 | ~$21.00 | 23x more

Sources: Novita AI model pages, checked 2026-07-03. Verify current pricing before budgeting production workloads.

Prompt caching cuts costs further on repeated context. At $0.06 per 1M tokens for cache reads, one fifth of the regular input rate, workloads with stable system prompts or large shared context benefit significantly.
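To run the cost math for your own token counts, a small calculator along these lines may help. It hardcodes the three rates from the spec table; the function name estimate_cost is hypothetical, and the rates should be rechecked against the current pricing page before you budget with them.

```shell
# Rates from the spec table in this guide, in USD per 1M tokens.
# Usage: estimate_cost <fresh_input_tokens> <output_tokens> [cache_read_tokens]
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" -v cache_tok="${3:-0}" 'BEGIN {
    in_rate = 0.30; out_rate = 1.20; cache_rate = 0.06
    cost = (in_tok * in_rate + out_tok * out_rate + cache_tok * cache_rate) / 1000000
    printf "%.4f\n", cost
  }'
}

estimate_cost 200000 20000          # 200k fresh input, 20k output  -> 0.0840
estimate_cost 0 20000 200000        # same context served from cache -> 0.0360
estimate_cost 180000 180000         # 360k tokens, even split        -> 0.2700
```

One hour at 100 tokens/sec is 360,000 tokens, and the last line shows that the table's ~$0.27 figure is consistent with those tokens splitting evenly between input and output. That split is inferred from the arithmetic, not stated on the pricing page; if all 360,000 tokens were output, the same rates would give about $0.43.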

Real-World Use Cases

Multi-file refactoring. M2.7’s 52.7% on Multi-SWE-Bench shows it can handle complex cross-file changes. Claude Code provides the interactive approval layer — review each proposed change before committing.

Production SRE and incident response. The model was trained on production debugging scenarios: correlating metrics, querying databases, tracing root causes, applying non-blocking fixes. Claude Code’s terminal integration lets it run diagnostics and verify fixes in the same session.

Agent framework development. If you’re building something that orchestrates other tools or models, M2.7’s native Agent Teams support and role consistency make it a solid choice for the orchestrator layer. Its 97% tool adherence matters more here than raw benchmark scores.

Polyglot codebases. M2.7 shows strong performance across multiple programming languages and handles multi-language projects without the quality drop you sometimes see from models that were primarily trained on Python/JavaScript.

Cost-sensitive teams scaling agentic workflows. At $0.30 per 1M input tokens, teams running continuous Claude Code sessions across multiple developers can keep costs manageable without dropping to a significantly weaker model.

Troubleshooting

Authentication error (401)

Check that ANTHROPIC_AUTH_TOKEN is set in the active shell session. The variable must be exported before you run claude. Regenerating the key and re-exporting it is the fastest fix if you suspect key corruption.

Model not found

Confirm the model ID is exactly minimax/minimax-m2.7 — no spaces, no capitalization changes. Display names like MiniMax M2.7 do not work as model IDs.

Requests reaching Anthropic instead of Novita

Verify ANTHROPIC_BASE_URL is set to https://api.novita.ai/anthropic. If the variable is missing or blank, Claude Code falls back to Anthropic’s endpoint, which rejects requests that lack a valid Anthropic key.

Slow responses on first request

The first request in a session may take a few extra seconds due to cold-start behavior. Subsequent requests in the same session are typically faster. If slowness persists across requests, try reducing max_tokens on test calls to isolate whether it’s generation time or network latency.

Context window errors

M2.7 supports 204,800 tokens. If you’re loading very large codebases, Claude Code’s context management will handle compaction automatically. For manual control, limit the files you ask Claude Code to read in a single turn.
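Before asking Claude Code to read a whole directory, a rough size check can tell you whether it is likely to fit. The sketch below assumes about 4 bytes per token, which is a common rule of thumb for English text and code and not M2.7's actual tokenizer, so treat the result as an order-of-magnitude estimate. Both function names are hypothetical.

```shell
# Rough heuristic only: assumes ~4 bytes per token, NOT the model's tokenizer.
CONTEXT_LIMIT=204800   # M2.7 context window from the spec table

# Print an approximate token count for all regular files under a directory.
estimate_tokens() {
  bytes=$(find "${1:-.}" -type f -exec cat {} + | wc -c | tr -d ' ')
  echo $(( bytes / 4 ))
}

# Succeed only if the estimate is below the context limit.
fits_context() {
  [ "$(estimate_tokens "${1:-.}")" -lt "$CONTEXT_LIMIT" ]
}
```

For example, fits_context src/api || echo "split this read into smaller turns". Point it at a source subdirectory and not at the repository root, since the count includes every file it finds, including .git contents and binaries.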

Higher-than-expected costs

Check ANTHROPIC_SMALL_FAST_MODEL. If it’s pointing to a different, more expensive model, lightweight operations like directory navigation will cost more than expected. Setting both variables to minimax/minimax-m2.7 normalizes all traffic.
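Most of the issues in this section come down to one of the four variables being missing or different from what Step 3 sets. A minimal preflight check, assuming the exact values used in this guide, could look like the following. The function name check_claude_env is hypothetical, and the token is tested for presence only and never printed.

```shell
# Hypothetical preflight check for the four variables from Step 3.
# Prints one line per problem and returns non-zero if anything is off.
check_claude_env() {
  want_url="https://api.novita.ai/anthropic"
  want_model="minimax/minimax-m2.7"
  problems=0

  if [ -z "${ANTHROPIC_AUTH_TOKEN:-}" ]; then
    echo "ANTHROPIC_AUTH_TOKEN is empty in this shell"
    problems=$((problems + 1))
  fi
  if [ "${ANTHROPIC_BASE_URL:-}" != "$want_url" ]; then
    echo "ANTHROPIC_BASE_URL is '${ANTHROPIC_BASE_URL:-}', expected '$want_url'"
    problems=$((problems + 1))
  fi
  if [ "${ANTHROPIC_MODEL:-}" != "$want_model" ]; then
    echo "ANTHROPIC_MODEL is '${ANTHROPIC_MODEL:-}', expected '$want_model'"
    problems=$((problems + 1))
  fi
  if [ "${ANTHROPIC_SMALL_FAST_MODEL:-}" != "$want_model" ]; then
    echo "ANTHROPIC_SMALL_FAST_MODEL is '${ANTHROPIC_SMALL_FAST_MODEL:-}', expected '$want_model'"
    problems=$((problems + 1))
  fi

  [ "$problems" -eq 0 ] && echo "All four variables match this guide"
  return "$problems"
}
```

Run it in the same terminal you launch Claude Code from, since variables exported in another window are not visible there.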

FAQ

Is MiniMax M2.7 available on Novita AI with the Anthropic-compatible endpoint?

Yes. The model is available as minimax/minimax-m2.7 with both OpenAI-compatible and Anthropic-compatible endpoints. Claude Code uses the Anthropic path at https://api.novita.ai/anthropic.

What’s the difference between M2.7 and M2.5 for Claude Code workflows?

M2.7 scores higher on each benchmark cited here: SWE-Bench Pro improved from 52.2% to 56.2%, Multi-SWE-Bench from 51.3% to 52.7%, and MLE-Bench lite from 31.5% to 66.6%. Pricing is the same ($0.30 input and $1.20 output per million tokens), so there’s no cost reason to choose M2.5 over M2.7 for new setups.

Does M2.7 support vision or image inputs in Claude Code?

No. MiniMax M2.7 is text-only. If your workflow involves screenshots, UI analysis, or image context, you’d need a multimodal model. For text-based coding workflows, this limitation doesn’t apply.

Can I use M2.7 in Claude Code alongside MCP tools?

Yes. Claude Code’s MCP (Model Context Protocol) support works at the tool layer, not the model layer. M2.7’s high tool adherence makes it a good candidate for MCP-heavy setups.

Does function calling work reliably for Claude Code’s built-in tools?

Yes. M2.7 was trained specifically for high tool-call accuracy — 97% adherence across 40+ concurrent tools in production conditions. Claude Code’s built-in tools (Read, Write, Edit, Bash, Glob, Grep) are well within that range.

Will this setup work with the Claude Code VS Code extension?

The environment variables apply to the CLI runtime. If you’re using the VS Code extension, set the variables in your shell profile or in VS Code’s integrated terminal settings so they’re available when the extension starts a Claude Code session.


