Kimi K2-Instruct-0905: Next-Generation Agentic AI Now Available on Novita AI

Kimi K2-Instruct-0905, the latest evolution from Moonshot AI, represents a breakthrough in agentic intelligence and coding capabilities. This state-of-the-art mixture-of-experts (MoE) language model is now accessible via Novita AI, bringing 1 trillion total parameters, 32 billion activated parameters, and an extended 256,000-token context window to developers worldwide. With support for Claude Code integration, developers can leverage its advanced agentic coding capabilities directly in their terminal workflows.

Current pricing for Kimi K2-Instruct-0905 on Novita AI: $0.60 per million input tokens and $2.50 per million output tokens.
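
As a quick illustration of what these rates mean for a workload, the sketch below estimates the cost of one request from its token counts. The function name and the example token counts are illustrative assumptions; the rates are the ones listed above and may change.

```python
# Rates quoted above, in USD per million tokens (subject to change).
INPUT_USD_PER_M = 0.60
OUTPUT_USD_PER_M = 2.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: a 200,000-token prompt and a 4,000-token reply.
    print(f"${estimate_cost_usd(200_000, 4_000):.4f}")
```

Because output tokens cost roughly four times as much as input tokens, long-prompt, short-answer workloads such as codebase analysis are dominated by the input rate.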

What is Kimi K2-Instruct-0905?

Kimi K2-Instruct-0905 is the latest, most capable version of Kimi K2. It is a state-of-the-art mixture-of-experts (MoE) language model, featuring 32 billion activated parameters and a total of 1 trillion parameters.

Enhanced agentic coding intelligence

Kimi K2-Instruct-0905 improves on the previous K2-Instruct-0711 release on public benchmarks and real-world coding agent tasks, for example rising from 37.5 to 44.5 on Terminal-Bench.

Improved frontend coding experience

Kimi K2-Instruct-0905 offers advancements in both the aesthetics and practicality of frontend programming.

Extended context length

Kimi K2-Instruct-0905’s context window has been increased from 128k to 256k tokens, providing better support for long-horizon tasks.

Technical Architecture and Specifications

Kimi K2-Instruct-0905 represents cutting-edge engineering in mixture-of-experts architecture:

| Specification | Value |
|---|---|
| Architecture | Mixture-of-Experts (MoE) |
| Total Parameters | 1 Trillion |
| Activated Parameters | 32 Billion |
| Context Length | 256,000 tokens |
| Number of Layers | 61 (including 1 dense layer) |
| Attention Mechanism | MLA (Multi-Head Latent Attention) |
| Number of Experts | 384 |
| Selected Experts per Token | 8 |
| Vocabulary Size | 160,000 |
| Activation Function | SwiGLU |

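Because the router activates only 8 of the 384 experts for each token, roughly 32 billion of the 1 trillion parameters (about 3.2%) take part in any single forward pass. This keeps per-token compute far below that of a dense model of the same size while retaining the capacity of the full parameter set.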
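
To make expert selection concrete, here is a minimal top-k routing sketch in plain Python. It uses the expert counts from the table (384 experts, 8 selected per token) but random router scores. It illustrates the general MoE gating technique and is not Moonshot AI's actual implementation; the function name and the softmax normalization over the selected experts are assumptions.

```python
import math
import random

NUM_EXPERTS = 384  # from the specification table
TOP_K = 8          # experts selected per token


def route_token(router_logits: list[float], top_k: int = TOP_K) -> dict[int, float]:
    """Pick the top_k highest-scoring experts and softmax-normalize their weights."""
    ranked = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)
    top = ranked[:top_k]
    # Subtract the largest selected logit for numerical stability.
    peak = max(router_logits[i] for i in top)
    exps = {i: math.exp(router_logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    # Random scores stand in for the output of a learned router layer.
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    weights = route_token(logits)
    share = len(weights) / NUM_EXPERTS
    print(f"active experts: {len(weights)} of {NUM_EXPERTS} ({share:.1%})")
```

Only the selected experts run for a given token, and their outputs would be combined using the returned weights.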

Benchmark Performance: Leading the Industry

Kimi K2-Instruct-0905 demonstrates exceptional performance across critical evaluation metrics, particularly in coding and agentic tasks:

Coding Excellence

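| Benchmark | Metric | K2-Instruct-0905 | K2-Instruct-0711 | Qwen3-Coder-480B-A35B-Instruct | GLM-4.5 | DeepSeek-V3.1 | Claude-Sonnet-4 | Claude-Opus-4 |
|---|---|---|---|---|---|---|---|---|
| SWE-Bench verified | ACC | 69.2 ± 0.63 | 65.8 | 69.6* | 64.2* | 66.0* | 72.7* | 72.5* |
| SWE-Bench Multilingual | ACC | 55.9 ± 0.72 | 47.3 | 54.7* | 52.7 | 54.5* | 53.3* | - |
| Multi-SWE-Bench | ACC | 33.5 ± 0.28 | 31.3 | 32.7 | 31.7 | 29.0 | 35.7 | - |
| Terminal-Bench | ACC | 44.5 ± 2.03 | 37.5 | 37.5* | 39.9* | 31.3* | 36.4* | 43.2* |
| SWE-Dev | ACC | 66.6 ± 0.72 | 61.9 | 64.7 | 63.2 | 53.3 | 67.1 | - |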

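These results place Kimi K2-Instruct-0905 among the top performers in real-world coding scenarios. It posts the highest Terminal-Bench score in the table (44.5, ahead of Claude Opus 4 at 43.2) and improves on K2-Instruct-0711 on every benchmark listed, while still trailing Claude Sonnet 4 and Claude Opus 4 on SWE-Bench verified (69.2 versus 72.7 and 72.5).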
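
For a quick view of the generation-over-generation change, the snippet below computes the point gain of K2-Instruct-0905 over K2-Instruct-0711 from the two leftmost score columns of the table. It uses the mean scores only and ignores the reported ± intervals; the helper name is illustrative.

```python
# (K2-Instruct-0905, K2-Instruct-0711) accuracy, copied from the table above.
SCORES = {
    "SWE-Bench verified": (69.2, 65.8),
    "SWE-Bench Multilingual": (55.9, 47.3),
    "Multi-SWE-Bench": (33.5, 31.3),
    "Terminal-Bench": (44.5, 37.5),
    "SWE-Dev": (66.6, 61.9),
}


def gains(scores: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Return the accuracy gain in points for each benchmark."""
    return {name: round(new - old, 1) for name, (new, old) in scores.items()}


if __name__ == "__main__":
    for name, delta in gains(SCORES).items():
        print(f"{name}: +{delta} points")
```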

How to Access Kimi K2-Instruct-0905 on Novita AI

Option 1: Interactive Playground

Experience Kimi K2-Instruct-0905 immediately through Novita AI’s user-friendly interface:

  • Instant access: No setup required
  • Function calling support: Test tool calling capabilities directly in the playground
  • Model comparison: Test against other leading models
  • Real-time experimentation: Iterate quickly on prompts and use cases

Option 2: API Integration

Seamlessly integrate Kimi K2-Instruct-0905 into your applications:

import os

from openai import OpenAI

# Read the API key from the environment rather than hard-coding it.
# NOVITA_API_KEY is an assumed variable name; use whatever your setup defines.
client = OpenAI(
    base_url="https://api.novita.ai/openai",
    api_key=os.environ["NOVITA_API_KEY"],
)

model = "moonshotai/kimi-k2-0905"
stream = True  # set to False for a single, non-streamed response
max_tokens = 131072
system_content = "Be a helpful assistant"
temperature = 1
top_p = 1
min_p = 0
top_k = 50
presence_penalty = 0
frequency_penalty = 0
repetition_penalty = 1
response_format = {"type": "text"}

chat_completion_res = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "system", "content": system_content},
        {"role": "user", "content": "Hi there!"},
    ],
    stream=stream,
    max_tokens=max_tokens,
    temperature=temperature,
    top_p=top_p,
    presence_penalty=presence_penalty,
    frequency_penalty=frequency_penalty,
    response_format=response_format,
    # Sampling parameters outside the OpenAI spec are passed through extra_body.
    extra_body={
        "top_k": top_k,
        "repetition_penalty": repetition_penalty,
        "min_p": min_p,
    },
)

if stream:
    # Print text as it arrives; some chunks may carry no choices or no text.
    for chunk in chat_completion_res:
        if chunk.choices:
            print(chunk.choices[0].delta.content or "", end="")
else:
    print(chat_completion_res.choices[0].message.content)
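
Tool calling can be exercised through the same API. The following is a minimal sketch of one tool-calling round using the OpenAI-style `tools` parameter, assuming the endpoint accepts it for this model. The `get_weather` tool, its schema, and the helper names are hypothetical; `client` is an OpenAI client configured for Novita AI as in the example above.

```python
import json

MODEL = "moonshotai/kimi-k2-0905"


def get_weather(city: str) -> str:
    """Hypothetical local tool; replace with a real implementation."""
    return f"Sunny in {city}"


# JSON schema describing the tool to the model (OpenAI tools format).
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

LOCAL_FUNCTIONS = {"get_weather": get_weather}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Execute one tool call requested by the model and return its result."""
    kwargs = json.loads(arguments_json)
    return LOCAL_FUNCTIONS[name](**kwargs)


def ask_with_tools(client, question: str) -> str:
    """Let the model request tools, run them locally, then get the final answer."""
    messages = [{"role": "user", "content": question}]
    first = client.chat.completions.create(model=MODEL, messages=messages, tools=TOOLS)
    reply = first.choices[0].message
    if not reply.tool_calls:
        return reply.content  # the model answered without using a tool

    messages.append(reply)  # keep the assistant turn that requested the tools
    for call in reply.tool_calls:
        result = run_tool_call(call.function.name, call.function.arguments)
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})

    final = client.chat.completions.create(model=MODEL, messages=messages, tools=TOOLS)
    return final.choices[0].message.content
```

Calling `ask_with_tools(client, "What's the weather in Paris?")` would send the question, execute any requested `get_weather` call locally, and return the model's final text.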
  
  

Option 3: Multi-Agent Workflows with OpenAI Agents SDK

Build advanced multi-agent systems by integrating Novita AI with the OpenAI Agents SDK:

  • Plug-and-play: Use Kimi K2-Instruct-0905 in any OpenAI Agents workflow.
  • Supports handoffs, routing, and tool use: Design agents that can delegate tasks, route requests, or run functions.
  • Python integration: Simply point the SDK to Novita’s endpoint (https://api.novita.ai/v3/openai) and use your API key for seamless agent workflows.
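
As a rough sketch of that setup, the code below wires a two-agent handoff to the endpoint quoted above. It assumes the `openai-agents` package is installed; the `NOVITA_API_KEY` variable name, the agent names, and their instructions are illustrative assumptions rather than anything prescribed by Novita AI.

```python
import os

NOVITA_BASE_URL = "https://api.novita.ai/v3/openai"  # endpoint quoted above
MODEL_NAME = "moonshotai/kimi-k2-0905"


def build_triage_agent():
    """Create a triage agent that can hand coding requests off to a second agent."""
    # Imported lazily so this module still loads when the SDK is not installed.
    from agents import Agent, OpenAIChatCompletionsModel, set_tracing_disabled
    from openai import AsyncOpenAI

    client = AsyncOpenAI(
        base_url=NOVITA_BASE_URL,
        api_key=os.environ["NOVITA_API_KEY"],  # assumed variable name
    )
    # Tracing uploads to OpenAI by default and expects an OpenAI key, so disable it.
    set_tracing_disabled(True)
    model = OpenAIChatCompletionsModel(model=MODEL_NAME, openai_client=client)

    coder = Agent(
        name="Coder",
        instructions="Write and explain Python code.",
        model=model,
    )
    return Agent(
        name="Triage",
        instructions="Answer general questions; hand off coding requests to Coder.",
        model=model,
        handoffs=[coder],
    )


if __name__ == "__main__":
    from agents import Runner

    result = Runner.run_sync(build_triage_agent(), "Write a function that reverses a string.")
    print(result.final_output)
```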

Option 4: Connect Kimi K2-Instruct-0905 API on Third-Party Platforms

  • Hugging Face: Use Kimi K2-Instruct-0905 in Spaces, pipelines, or with the Transformers library via Novita AI endpoints.
  • Agent & Orchestration Frameworks: Easily connect Novita AI with partner platforms like Continue, AnythingLLM, LangChain, Dify and Langflow through official connectors and step-by-step integration guides.
  • OpenAI-Compatible API: Enjoy hassle-free migration and integration with tools such as Cline, Trae, Cursor, and Qwen Code, which are designed for the OpenAI API standard.
  • Anthropic-Compatible API: Seamlessly integrate with Claude Code for agentic coding workflows and other Anthropic API-compatible tools.

Use Cases and Applications

Autonomous Coding Agents

  • Code generation: Complete functions, classes, and modules
  • Bug fixing: Identify and resolve software issues
  • Code review: Automated code quality assessment
  • Documentation: Generate comprehensive code documentation

Advanced Frontend Development

  • Component libraries: Create reusable UI components
  • Responsive design: Generate mobile-first, adaptive layouts
  • Framework migration: Convert code between different frontend frameworks
  • Performance optimization: Suggest and implement performance improvements

Long-Context Applications

  • Document analysis: Process and understand lengthy technical documents
  • Codebase exploration: Navigate and understand large software projects
  • Multi-turn conversations: Maintain context across extended interactions
  • Complex reasoning: Handle multi-step analytical tasks
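
For codebase exploration and document analysis, the practical question is how much material fits in one request. The sketch below packs files into a single prompt until a token budget derived from the 256,000-token window is used up. The four-characters-per-token estimate is a crude heuristic rather than the model's tokenizer, and the function names and the 8,000-token output reservation are illustrative assumptions.

```python
CONTEXT_WINDOW = 256_000  # tokens, from the specification above
CHARS_PER_TOKEN = 4       # rough heuristic, NOT the model's tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate; use a real tokenizer for accurate counts."""
    return len(text) // CHARS_PER_TOKEN + 1


def pack_files(
    files: dict[str, str], reserved_for_output: int = 8_000
) -> tuple[str, list[str]]:
    """Concatenate files into one prompt until the token budget is used up.

    Returns the prompt text and the names of the files that did not fit.
    """
    budget = CONTEXT_WINDOW - reserved_for_output
    parts: list[str] = []
    skipped: list[str] = []
    used = 0
    for name, content in files.items():
        section = f"### {name}\n{content}\n"
        cost = estimate_tokens(section)
        if used + cost > budget:
            skipped.append(name)  # too large for the remaining budget
            continue
        parts.append(section)
        used += cost
    return "".join(parts), skipped


if __name__ == "__main__":
    demo = {"main.py": "print('hi')\n", "huge.log": "x" * 2_000_000}
    prompt, left_out = pack_files(demo)
    print(f"packed {estimate_tokens(prompt)} estimated tokens; skipped: {left_out}")
```

Files that do not fit are reported rather than truncated, so the caller can decide whether to summarize them or send them in a follow-up request.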

Conclusion

Kimi K2-Instruct-0905 represents the cutting edge of agentic AI technology, combining massive scale with practical intelligence. Its enhanced coding capabilities, extended context window, and superior tool-calling abilities make it an ideal choice for developers pushing the boundaries of what’s possible with AI.

Available now on Novita AI, this model offers the perfect balance of power, accessibility, and cost-effectiveness for both research and production applications.

Try the Kimi K2-Instruct-0905 Demo on Novita AI today and experience the future of agentic intelligence!


Novita AI is a leading AI cloud platform that provides developers with easy-to-use APIs and affordable, reliable GPU infrastructure for building and scaling AI applications.
