Kimi K2-Instruct-0905, the latest release from Moonshot AI, is a mixture-of-experts (MoE) language model built for agentic and coding workloads. It is now available on Novita AI with 1 trillion total parameters, 32 billion activated parameters, and an extended 256,000-token context window. With Claude Code integration supported, developers can use its agentic coding capabilities directly in their terminal workflows.
Current pricing for Kimi K2-Instruct-0905 on Novita AI: $0.60 per million input tokens and $2.50 per million output tokens.
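To make the pricing concrete, here is a minimal sketch of a per-request cost estimate. It uses only the two prices quoted above; the function name and the example token counts are illustrative, and actual billing may differ (for instance, if prices change).

```python
from decimal import Decimal

# Prices quoted above, in USD per one million tokens.
INPUT_PRICE_PER_M = Decimal("0.6")
OUTPUT_PRICE_PER_M = Decimal("2.5")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at the listed prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 200,000-token prompt with a 4,000-token reply.
    print(estimate_cost_usd(200_000, 4_000))
```

At these prices, a request that nearly fills the context window with 200,000 input tokens and returns 4,000 output tokens comes to about $0.13.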
What is Kimi K2-Instruct-0905?
Kimi K2-Instruct-0905 is the latest, most capable version of Kimi K2. It is a state-of-the-art mixture-of-experts (MoE) language model, featuring 32 billion activated parameters and a total of 1 trillion parameters.
Enhanced agentic coding intelligence
Kimi K2-Instruct-0905 improves on its predecessor on public benchmarks and real-world coding agent tasks; on SWE-Bench Verified, for example, accuracy rises from 65.8 to 69.2.
Improved frontend coding experience
Kimi K2-Instruct-0905 offers advancements in both the aesthetics and practicality of frontend programming.
Extended context length
Kimi K2-Instruct-0905’s context window has been increased from 128k to 256k tokens, providing better support for long-horizon tasks.
Technical Architecture and Specifications
The key architectural parameters of Kimi K2-Instruct-0905 are:
| Specification | Value |
|---|---|
| Architecture | Mixture-of-Experts (MoE) |
| Total Parameters | 1 Trillion |
| Activated Parameters | 32 Billion |
| Context Length | 256,000 tokens |
| Number of Layers | 61 (including 1 dense layer) |
| Attention Mechanism | MLA (Multi-Head Latent Attention) |
| Number of Experts | 384 |
| Selected Experts per Token | 8 |
| Vocabulary Size | 160,000 |
| Activation Function | SwiGLU |
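Because each token is routed to only 8 of the 384 experts, about 32 billion of the model's 1 trillion parameters are active per token. This keeps per-token compute far below that of a dense model of the same size while retaining the capacity of the full parameter set.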
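The expert-selection step can be illustrated with a toy top-k router. This is a sketch only: the expert count and k come from the table above, but the random router scores and the softmax over the selected scores are illustrative assumptions, not necessarily the gating function Kimi K2 uses.

```python
import math
import random

NUM_EXPERTS = 384   # from the table above
TOP_K = 8           # experts selected per token, from the table above


def route_token(router_logits: list[float], k: int = TOP_K) -> list[tuple[int, float]]:
    """Pick the k highest-scoring experts and normalise their weights.

    Returns (expert_index, weight) pairs whose weights sum to 1.
    """
    if not 0 < k <= len(router_logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest logits, highest first.
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the selected experts.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    for expert, weight in route_token(logits):
        print(f"expert {expert:3d}  weight {weight:.3f}")
    # Share of parameters active per token, from the table: 32B of 1T.
    print(f"active fraction: {32e9 / 1e12:.1%}")
```

Only the selected experts run for a given token, which is why the activated parameter count is a small fraction (about 3.2%) of the total.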
Benchmark Performance: Leading the Industry
The table below compares Kimi K2-Instruct-0905 with its predecessor and other leading models on coding and agentic benchmarks:
Coding Excellence
| Benchmark | Metric | K2-Instruct-0905 | K2-Instruct-0711 | Qwen3-Coder-480B-A35B-Instruct | GLM-4.5 | DeepSeek-V3.1 | Claude-Sonnet-4 | Claude-Opus-4 |
|---|---|---|---|---|---|---|---|---|
| SWE-Bench Verified | ACC | 69.2 ± 0.63 | 65.8 | 69.6* | 64.2* | 66.0* | 72.7* | 72.5* |
| SWE-Bench Multilingual | ACC | 55.9 ± 0.72 | 47.3 | 54.7* | 52.7 | 54.5* | 53.3* | – |
| Multi-SWE-Bench | ACC | 33.5 ± 0.28 | 31.3 | 32.7 | 31.7 | 29.0 | 35.7 | – |
| Terminal-Bench | ACC | 44.5 ± 2.03 | 37.5 | 37.5* | 39.9* | 31.3* | 36.4* | 43.2* |
| SWE-Dev | ACC | 66.6 ± 0.72 | 61.9 | 64.7 | 63.2 | 53.3 | 67.1 | – |
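These results position Kimi K2-Instruct-0905 as a strong performer in real-world coding scenarios. It posts the highest score in the table on SWE-Bench Multilingual and Terminal-Bench, and improves on K2-Instruct-0711 on every benchmark listed. Claude Sonnet 4 remains ahead on SWE-Bench Verified, Multi-SWE-Bench, and SWE-Dev.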
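The generation-over-generation gains can be read directly off the table. The short sketch below computes them from the K2-Instruct-0905 means and the K2-Instruct-0711 scores; it ignores the reported ± spread, so small differences should be read with that in mind.

```python
# Scores copied from the table above: (K2-Instruct-0905 mean, K2-Instruct-0711).
SCORES = {
    "SWE-Bench Verified": (69.2, 65.8),
    "SWE-Bench Multilingual": (55.9, 47.3),
    "Multi-SWE-Bench": (33.5, 31.3),
    "Terminal-Bench": (44.5, 37.5),
    "SWE-Dev": (66.6, 61.9),
}


def gains(scores: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Return the absolute gain in accuracy points for each benchmark."""
    return {name: round(new - old, 1) for name, (new, old) in scores.items()}


if __name__ == "__main__":
    for name, delta in gains(SCORES).items():
        print(f"{name:<24} +{delta:.1f}")
```

The largest gains are on SWE-Bench Multilingual (+8.6 points) and Terminal-Bench (+7.0 points).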
How to Access Kimi K2-Instruct-0905 on Novita AI
Option 1: Interactive Playground
Experience Kimi K2-Instruct-0905 immediately through Novita AI’s user-friendly interface:
- Instant access: No setup required
- Function calling support: Test tool calling capabilities directly in the playground
- Model comparison: Test against other leading models
- Real-time experimentation: Iterate quickly on prompts and use cases
Option 2: API Integration
Seamlessly integrate Kimi K2-Instruct-0905 into your applications:
```python
from openai import OpenAI

# Point the OpenAI SDK at Novita AI's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.novita.ai/openai",
    api_key="",  # your Novita AI API key
)

model = "moonshotai/kimi-k2-0905"
stream = True  # or False
max_tokens = 131072
system_content = "Be a helpful assistant"
temperature = 1
top_p = 1
min_p = 0
top_k = 50
presence_penalty = 0
frequency_penalty = 0
repetition_penalty = 1
response_format = {"type": "text"}

chat_completion_res = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": system_content,
        },
        {
            "role": "user",
            "content": "Hi there!",
        },
    ],
    stream=stream,
    max_tokens=max_tokens,
    temperature=temperature,
    top_p=top_p,
    presence_penalty=presence_penalty,
    frequency_penalty=frequency_penalty,
    response_format=response_format,
    # Sampling parameters that are not named SDK arguments go in extra_body.
    extra_body={
        "top_k": top_k,
        "repetition_penalty": repetition_penalty,
        "min_p": min_p,
    },
)

if stream:
    # Print each piece of the reply as it arrives.
    for chunk in chat_completion_res:
        print(chunk.choices[0].delta.content or "", end="")
else:
    print(chat_completion_res.choices[0].message.content)
```
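The same endpoint can in principle be used for function calling. The sketch below assumes the endpoint accepts the standard OpenAI `tools` parameter for this model, which should be checked against Novita AI's documentation. The `get_weather` tool and the `NOVITA_API_KEY` environment variable are hypothetical names introduced for illustration.

```python
import json
import os


def get_weather(city: str) -> str:
    """Hypothetical local tool; a real one would call a weather service."""
    return json.dumps({"city": city, "forecast": "sunny"})


# Description of the tool in the OpenAI "tools" format (JSON Schema parameters).
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return a short forecast for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool(name: str, arguments_json: str) -> str:
    """Execute a tool call requested by the model and return its result."""
    if name not in LOCAL_TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    return LOCAL_TOOLS[name](**json.loads(arguments_json))


def ask_with_tools(question: str) -> str:
    """Send a question, run any requested tools, and return the final answer."""
    from openai import OpenAI  # lazy import so the helpers above work without it

    client = OpenAI(base_url="https://api.novita.ai/openai",
                    api_key=os.environ["NOVITA_API_KEY"])  # hypothetical variable
    messages = [{"role": "user", "content": question}]
    first = client.chat.completions.create(
        model="moonshotai/kimi-k2-0905", messages=messages, tools=TOOLS)
    reply = first.choices[0].message
    if not reply.tool_calls:
        return reply.content or ""
    # Echo the assistant turn, then attach one tool result per requested call.
    messages.append(reply)
    for call in reply.tool_calls:
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": run_tool(call.function.name, call.function.arguments),
        })
    final = client.chat.completions.create(
        model="moonshotai/kimi-k2-0905", messages=messages, tools=TOOLS)
    return final.choices[0].message.content or ""
```

Calling `ask_with_tools("What is the weather in Singapore?")` would perform at most two requests: one in which the model may ask for `get_weather`, and one in which it answers using the tool result.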
Option 3: Multi-Agent Workflows with OpenAI Agents SDK
Build multi-agent systems by integrating Novita AI with the OpenAI Agents SDK:
- Plug-and-play: Use Kimi K2-Instruct-0905 in any OpenAI Agents workflow.
- Supports handoffs, routing, and tool use: Design agents that can route requests, delegate tasks, or run functions.
- Python integration: Simply point the SDK to Novita’s endpoint (https://api.novita.ai/v3/openai) and use your API key for seamless agent workflows.
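As a rough sketch of what pointing the SDK at Novita's endpoint might look like, the example below wires a two-agent handoff through the `openai-agents` package. The agent names, instructions, and the `NOVITA_API_KEY` environment variable are illustrative assumptions. The model ID is the one used in the API example, and handoffs rely on the endpoint supporting tool calls for this model.

```python
import os

NOVITA_BASE_URL = "https://api.novita.ai/v3/openai"  # endpoint quoted above
MODEL_ID = "moonshotai/kimi-k2-0905"                 # model ID from the API example


def build_triage_agent(api_key: str):
    """Return a triage agent that can hand off to a coding agent."""
    # Lazy imports: requires `pip install openai-agents`.
    from agents import Agent, OpenAIChatCompletionsModel, set_tracing_disabled
    from openai import AsyncOpenAI

    # Tracing uploads to OpenAI by default, so turn it off for a third-party key.
    set_tracing_disabled(True)
    client = AsyncOpenAI(base_url=NOVITA_BASE_URL, api_key=api_key)
    model = OpenAIChatCompletionsModel(model=MODEL_ID, openai_client=client)

    coder = Agent(
        name="Coder",
        instructions="Write and explain Python code.",
        model=model,
    )
    return Agent(
        name="Triage",
        instructions="Answer general questions; hand coding requests to Coder.",
        model=model,
        handoffs=[coder],
    )


def run_once(prompt: str) -> str:
    """Run one prompt through the triage agent and return the final output."""
    from agents import Runner

    agent = build_triage_agent(os.environ["NOVITA_API_KEY"])  # hypothetical variable
    return Runner.run_sync(agent, prompt).final_output
```

With the key set, `run_once("Write a function that reverses a string")` should be routed from the triage agent to the coding agent.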
Option 4: Connect Kimi K2-Instruct-0905 API on Third-Party Platforms
- Hugging Face: Use Kimi K2-Instruct-0905 in Spaces, pipelines, or with the Transformers library via Novita AI endpoints.
- Agent & Orchestration Frameworks: Easily connect Novita AI with partner platforms like Continue, AnythingLLM, LangChain, Dify and Langflow through official connectors and step-by-step integration guides.
- OpenAI-Compatible API: Enjoy hassle-free migration and integration with tools built for the OpenAI API standard, such as Cline, Trae, Cursor, and Qwen Code.
- Anthropic-Compatible API: Seamlessly integrate with Claude Code for agentic coding workflows and other Anthropic API-compatible tools.
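For the Claude Code route, one possible setup is to launch the `claude` CLI with its endpoint, token, and model overridden through environment variables. The sketch below makes several assumptions: the Anthropic-compatible base URL is read from a hypothetical `NOVITA_ANTHROPIC_BASE_URL` variable (take the actual value from Novita AI's documentation), the API key from a hypothetical `NOVITA_API_KEY` variable, and the model ID is assumed to match the one used in the OpenAI-compatible example.

```python
import os
import subprocess


def claude_code_env(base_url: str, api_key: str, model: str) -> dict[str, str]:
    """Copy the current environment and add Claude Code's override variables."""
    if not base_url or not api_key:
        raise ValueError("base_url and api_key are required")
    env = dict(os.environ)
    env.update({
        "ANTHROPIC_BASE_URL": base_url,    # Anthropic-compatible endpoint
        "ANTHROPIC_AUTH_TOKEN": api_key,   # credential sent with each request
        "ANTHROPIC_MODEL": model,          # model used for the session
    })
    return env


def launch_claude_code(project_dir: str) -> int:
    """Start the `claude` CLI in project_dir and return its exit code."""
    env = claude_code_env(
        base_url=os.environ["NOVITA_ANTHROPIC_BASE_URL"],  # hypothetical variable
        api_key=os.environ["NOVITA_API_KEY"],              # hypothetical variable
        model="moonshotai/kimi-k2-0905",                   # assumed model ID
    )
    return subprocess.run(["claude"], cwd=project_dir, env=env).returncode
```

Building the environment in a separate function keeps the overrides scoped to the launched process rather than changing the shell's global settings.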
Use Cases and Applications
Autonomous Coding Agents
- Code generation: Complete functions, classes, and modules
- Bug fixing: Identify and resolve software issues
- Code review: Automated code quality assessment
- Documentation: Generate comprehensive code documentation
Advanced Frontend Development
- Component libraries: Create reusable UI components
- Responsive design: Generate mobile-first, adaptive layouts
- Framework migration: Convert code between different frontend frameworks
- Performance optimization: Suggest and implement performance improvements
Long-Context Applications
- Document analysis: Process and understand lengthy technical documents
- Codebase exploration: Navigate and understand large software projects
- Multi-turn conversations: Maintain context across extended interactions
- Complex reasoning: Handle multi-step analytical tasks
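For long-context use, it helps to check that the input fits in the window before sending it. The sketch below is a rough budgeting helper: the 256,000-token limit comes from the specification table, while the four-characters-per-token ratio and the default output reserve are crude assumptions. A real tokenizer would give accurate counts.

```python
CONTEXT_WINDOW = 256_000   # tokens, from the specification table
CHARS_PER_TOKEN = 4        # rough assumption; use a real tokenizer for accuracy


def estimate_tokens(text: str) -> int:
    """Crude token estimate: len(text) / CHARS_PER_TOKEN, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def pack_documents(docs: list[str], reserve_for_output: int = 8_000) -> list[str]:
    """Keep documents, in order, until the estimated input budget is used up."""
    budget = CONTEXT_WINDOW - reserve_for_output
    if budget <= 0:
        raise ValueError("reserve_for_output leaves no room for input")
    kept: list[str] = []
    used = 0
    for doc in docs:
        cost = estimate_tokens(doc)
        if used + cost > budget:
            break  # stop at the first document that would overflow
        kept.append(doc)
        used += cost
    return kept
```

Reserving part of the window for the reply matters because input and output tokens share the same context limit.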
Conclusion
Kimi K2-Instruct-0905 represents the cutting edge of agentic AI technology, combining massive scale with practical intelligence. Its enhanced coding capabilities, extended context window, and superior tool-calling abilities make it an ideal choice for developers pushing the boundaries of what’s possible with AI.
Available now on Novita AI, this model offers the perfect balance of power, accessibility, and cost-effectiveness for both research and production applications.
Try the Kimi K2-Instruct-0905 Demo on Novita AI today and experience the future of agentic intelligence!
Novita AI is a leading AI cloud platform that provides developers with easy-to-use APIs and affordable, reliable GPU infrastructure for building and scaling AI applications.