How to Use Kimi K2.7 Code in Claude Code via Novita AI

How to Use Kimi K2.7 Code in Claude Code via Novita AI

Kimi K2.7 Code is a coding-specialized MoE model from MoonshotAI with a 256K context window, interleaved thinking, and multi-step tool calling. Through Novita AI’s Anthropic-compatible endpoint, you can wire it directly into Claude Code — keeping your existing workflow while swapping to a model built specifically for agentic coding at a fraction of Claude Sonnet’s price.

This guide walks through every step: getting your API key, setting environment variables, and starting Claude Code with moonshotai/kimi-k2.7-code as the model.

Why Use Kimi K2.7 Code in Claude Code?

Claude Code uses the Anthropic SDK under the hood, so it needs an Anthropic-compatible endpoint — not an OpenAI-compatible one. Novita AI exposes exactly that at https://api.novita.ai/anthropic, making Kimi K2.7 Code a drop-in model for Claude Code without any wrapper libraries or extra tooling.

The practical case comes down to three things:

Cost. At $0.95 per million input tokens and $4.00 per million output tokens on Novita AI (verified June 2026), Kimi K2.7 Code is roughly 68% cheaper on input and 73% cheaper on output than Claude Sonnet 4.5 ($3.00/$15.00 per million tokens). For teams running hundreds of coding tasks a day, that difference is meaningful.

Context. The 256K token context window means you can send substantial repository context — multiple files, test output, architecture notes — without hitting a wall mid-session. Most day-to-day coding agents work well within 32K–64K tokens; having 256K means you rarely need to prune context.

Coding specialization. Kimi K2.7 Code is purpose-built for coding and agentic workflows, not a general-purpose model. Its interleaved thinking architecture generates ~30% fewer thinking tokens than Kimi K2.6, which translates to faster responses on multi-step coding tasks.

Kimi K2.7 Code Specs at a Glance

FieldValue
Model IDmoonshotai/kimi-k2.7-code
ArchitectureMixture of Experts (MoE)
Total parameters1T
Activated parameters32B per token
Context window262,144 tokens (~256K)
Max output tokens262,144 tokens
Input modalitiesText, image, video
Output modalityText
FeaturesFunction calling, structured outputs, reasoning (interleaved thinking)
Endpoints on Novita AIchat/completions, anthropic

For Claude Code, use the anthropic endpoint family — that is what the Anthropic SDK expects.

How Much Does Kimi K2.7 Code Cost on Novita AI?

Token typeNovita AI priceClaude Sonnet 4.5 price
Input$0.95 / 1M$3.00 / 1M
Cache-read input$0.19 / 1M
Output$4.00 / 1M$15.00 / 1M

Pricing based on the Kimi K2.7 Code model page on Novita AI as of June 2026. Novita AI also lists cache-read pricing, which matters for repeated-context workflows like agents that reuse the same system prompt and tool schema across many calls.

Step 1: Get Your Novita AI API Key

Sign up for a Novita AI account — new accounts get free trial credits.

Once logged in:

  1. Go to Key Management in your dashboard.
  2. Click Create New Key.
  3. Copy the key immediately and store it somewhere safe — it is shown only once.

You will use this key as ANTHROPIC_AUTH_TOKEN in the next step.

Step 2: Install Claude Code

Claude Code requires Node.js 18 or higher. Check your version first:

node --version

Install Claude Code globally:

npm install -g @anthropic-ai/claude-code

Verify the installation:

claude --version

Step 3: Configure Environment Variables

Claude Code reads four environment variables to know which endpoint, API key, and model to use. Set all four — ANTHROPIC_SMALL_FAST_MODEL controls which model Claude Code uses for lightweight sub-tasks like summaries and quick edits.

Mac and Linux

export ANTHROPIC_BASE_URL="https://api.novita.ai/anthropic"
export ANTHROPIC_AUTH_TOKEN="YOUR_NOVITA_API_KEY"
export ANTHROPIC_MODEL="moonshotai/kimi-k2.7-code"
export ANTHROPIC_SMALL_FAST_MODEL="moonshotai/kimi-k2.7-code"

To make these permanent, add the four lines to ~/.zshrc or ~/.bashrc, then run source ~/.zshrc (or ~/.bashrc).

Windows (Command Prompt)

set ANTHROPIC_BASE_URL=https://api.novita.ai/anthropic
set ANTHROPIC_AUTH_TOKEN=YOUR_NOVITA_API_KEY
set ANTHROPIC_MODEL=moonshotai/kimi-k2.7-code
set ANTHROPIC_SMALL_FAST_MODEL=moonshotai/kimi-k2.7-code

These environment variables last for the current session. For permanent setup on Windows, add them through System Properties → Environment Variables.

What each variable does

VariableValuePurpose
ANTHROPIC_BASE_URLhttps://api.novita.ai/anthropicPoints Claude Code to Novita AI’s Anthropic-compatible endpoint
ANTHROPIC_AUTH_TOKENYour Novita API keyAuthenticates your requests
ANTHROPIC_MODELmoonshotai/kimi-k2.7-codeSets the primary model for coding tasks
ANTHROPIC_SMALL_FAST_MODELmoonshotai/kimi-k2.7-codeSets the model for lightweight sub-tasks

Step 4: Launch Claude Code

Navigate to your project directory and start a session:

cd your-project-directory
claude .

Claude Code opens an interactive prompt. You can now describe tasks in plain English — implement a feature, fix a bug, refactor a module, write tests — and Kimi K2.7 Code handles the reasoning and code generation through Novita AI’s endpoint.

To verify the model is routing correctly, run /status inside the Claude Code session. It should show the configured base URL and model.

Practical Coding Workflow Tips

Send more context upfront. With 256K tokens available, you can include the full content of relevant files rather than just excerpts. Claude Code can reference the actual code rather than reasoning from summaries, which reduces hallucination on implementation details.

Use interleaved thinking for complex tasks. Kimi K2.7 Code reasons step-by-step before generating code. For multi-file refactors or architecture decisions, give the model enough context to see the full picture — it will plan before writing, which reduces follow-up corrections.

Multimodal debugging. Kimi K2.7 Code accepts images and video as input. If your workflow includes screenshot-based bug reports or UI review tasks, you can pipe those directly into the session. Responses are always text, so the output is code, plans, or analysis.

Cache-heavy system prompts. If you use a consistent system prompt across many sessions — coding standards, project conventions, architecture notes — Novita AI’s cache-read pricing at $0.19 per million tokens kicks in on repeated content. For teams with long, stable system prompts, this reduces per-task cost significantly.

Stay on one model for consistency. Setting both ANTHROPIC_MODEL and ANTHROPIC_SMALL_FAST_MODEL to moonshotai/kimi-k2.7-code keeps behavior consistent across the main task and sub-tasks. If you later want a lighter model for sub-tasks to save cost, you can swap only ANTHROPIC_SMALL_FAST_MODEL.

Troubleshooting

401 Unauthorized

Your API key is incorrect or has expired. Verify the key at Novita AI Key Management. Check for extra spaces or line breaks when copying the key.

Model not found / 404

Confirm the model ID is exactly moonshotai/kimi-k2.7-code — no extra spaces, correct capitalization. You can verify it on the Kimi K2.7 Code model page.

Slow responses on long prompts

Enable streaming by default in Claude Code — most configurations do this automatically. For very long context inputs (100K+ tokens), initial response latency increases. Consider trimming low-priority context first.

Environment variables not picked up

On Mac/Linux, confirm you sourced the profile file after editing it (source ~/.zshrc). On Windows, environment variables set via set last only for the current Command Prompt session — use the System Properties panel for persistent variables.

FAQ

Does Kimi K2.7 Code work with Claude Code’s tool use and MCP integrations?

Yes. Kimi K2.7 Code supports function calling through Novita AI’s Anthropic-compatible endpoint, which is what Claude Code uses for tool calls and MCP integrations.

Why use the Anthropic endpoint instead of OpenAI-compatible?

Claude Code is built on the Anthropic SDK. It communicates using Anthropic’s message format, not OpenAI’s. Novita AI’s https://api.novita.ai/anthropic endpoint translates that format, so Claude Code works without any modification.

How does Kimi K2.7 Code compare to Kimi K2.5 for Claude Code?

Kimi K2.7 Code generates approximately 30% fewer thinking tokens than K2.6 (and improves over K2.5’s efficiency), while maintaining coding quality. For Claude Code sessions with repeated multi-step tasks, fewer thinking tokens means faster responses and lower token cost per task.

Can I use this setup in VS Code or Cursor?

Yes. Claude Code integrates with VS Code and Cursor through plugins and the terminal. The same environment variable configuration applies — once set, both IDE integrations and the standalone terminal use the configured model.

Novita AI is an AI cloud platform that offers developers an easy way to access state-of-the-art models through a simple API, with affordable and reliable GPU infrastructure.