GLM-4.7 on Novita AI: Long-Context Agentic Coding via API
By Novita AI / January 27, 2026 / LLM / 6 minutes of reading
GLM-4.7 is now available on the Novita AI platform, bringing Z.AI’s latest flagship text model to a production-ready, OpenAI-compatible serverless API. GLM-4.7 is optimized for agentic coding, long-horizon planning, and tool-using workflows, with stronger “think → act” reliability and noticeably improved front-end aesthetics for real product delivery.
On Novita AI, you can run GLM-4.7 with a 204,800-token context window, up to 131,072 output tokens, FP8 quantization, and built-in support for Function Calling and Structured Output.
GLM-4.7 is Z.AI’s latest flagship text model, with major upgrades focused on advanced coding, long-range task planning, and more reliable tool collaboration—designed to complete tasks end-to-end rather than just generating isolated code snippets.
Core specs (official):
Context window: 200K tokens
Max output: 128K tokens
Capabilities: thinking modes, streaming, function calling, context caching, structured output (JSON), and MCP tool/data-source integration
💡 What you get on Novita AI (production-ready serverless): if you already use OpenAI's Chat Completions-style APIs, you can migrate by setting Novita's base URL and switching the model name—no new protocol to learn.
Built for agentic delivery
Z.AI positions GLM-4.7 around “task completion,” with stronger instruction following during tool use and improved stability for complex agent loops.
GLM-4.7 Capabilities & Benchmarks
GLM-4.7 is designed around agentic coding (shipping tasks end-to-end), stronger reasoning with controllable thinking, and more reliable tool-using workflows—with a noticeable jump in web/UI generation quality (“vibe coding”).
Capabilities
Agentic Coding, end-to-end: better at planning, implementing, and iterating across multi-file projects and real agent frameworks.
Thinking before acting (more stable agents): improved instruction-following and complex-task stability; supports turn-level control to balance cost/latency vs. reliability.
Tool Using & Web browsing: stronger tool execution patterns and browsing-style tasks.
Complex Reasoning uplift: measurable gains on hard reasoning evaluations (including tool-augmented settings).
Vibe Coding (UI & slides quality): cleaner modern webpages and better-looking slides/layout.
Standardized Benchmarks
The following scores are reported by Z.AI:
| Category | Benchmark | GLM-4.7 |
| --- | --- | --- |
| Coding (real bugfix) | SWE-bench Verified | 73.8 |
| Agentic / terminal | Terminal Bench 2.0 | 41.0 |
| Coding (live) | LiveCodeBench v6 | 84.9 |
| Tool use (interactive) | τ²-Bench | 87.4 |
| Web browsing | BrowseComp | 52.0 (67.5 with context management) |
| Reasoning (tools) | HLE (w/ Tools) | 42.8 |
LMArena “Human Preference” Signal
LMArena rankings are based on blind user votes and are a useful “how it feels” complement to benchmarks.
WebDev Leaderboard: GLM-4.7 is #6 with Score 1447 (+10/-10), 4,833 votes (last updated Jan 16, 2026).
Text Arena (Overall): GLM-4.7 is #18 with Score 1443 (±7), 8,258 votes (last updated Jan 12, 2026).
🏆Open-model positioning: on both leaderboards, the models ranked above GLM-4.7 are shown with proprietary licenses, while GLM-4.7 is MIT-licensed—making it the highest-ranked open-license model in WebDev and Text (Overall) at the time of those leaderboard updates.
Getting Started with GLM-4.7 on Novita AI
Option A: Use the Playground
The easiest way to get to know GLM-4.7 is to try it directly in the Novita AI Playground—no setup, no code. Just sign up, open the Playground, and test prompts in real time. New accounts receive free credits after registration, so you can try the model right away.
Option B: Direct API Integration
Connect GLM-4.7 to your applications using Novita AI's unified REST API.
Getting Your API Key on Novita AI
Step 1: Create or Login to Your Account
Visit https://novita.ai and sign up or log in to your existing account
Step 2: Navigate to Key Management
After logging in, find “API Keys”
Step 3: Create a New Key
Click the “Add New Key” button.
Step 4: Save Your Key Immediately
Copy and store the key as soon as it is generated; it is usually shown only once and cannot be retrieved later. Keep the key in a secure location such as a password manager or encrypted notes.
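In application code, one common pattern is to read the key from an environment variable instead of hard-coding it. A minimal sketch, assuming the variable name `NOVITA_API_KEY` (a convention chosen here, not an official requirement):

```python
import os

def load_api_key(var: str = "NOVITA_API_KEY") -> str:
    """Read the Novita AI key from the environment; fail fast if missing.

    The variable name is an illustrative convention -- use whatever name
    fits your deployment and secret-management setup.
    """
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"Set {var} before running this script.")
    return key
```

The returned value can then be passed as `api_key=` when constructing the client shown in the next section.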
Example Request
```python
from openai import OpenAI

client = OpenAI(
    api_key="<Your API Key>",
    base_url="https://api.novita.ai/openai",
)

response = client.chat.completions.create(
    model="zai-org/glm-4.7",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"},
    ],
    max_tokens=131072,
    temperature=0.7,
)

print(response.choices[0].message.content)
```
Build sophisticated agent systems with plug-and-play integration—supporting handoffs, routing, and tool use via native function calling, plus the full long-context window for complex, multi-step tasks.
Option C: Connect with Third-Party Platforms
If you’re already building with agent frameworks or developer tools, Novita AI is designed to plug in with minimal friction:
Agent frameworks & app builders: Follow Novita’s step-by-step integration guides to connect with popular tooling such as Continue, AnythingLLM, LangChain, and Langflow.
Hugging Face Hub: Novita is listed as an Inference Provider on Hugging Face, so you can run supported models through Hugging Face’s provider workflow and ecosystem.
OpenAI-compatible API: Novita's LLM endpoints are compatible with the OpenAI API standard, making it easy to migrate existing OpenAI-style apps and connect many OpenAI-compatible tools (Cline, Cursor, Trae, and Qwen Code).
Anthropic-compatible API (Claude Code workflows): Novita also provides Anthropic SDK–compatible access so you can integrate Novita-backed models into Claude Code style agentic coding workflows.
OpenCode (Built-in provider): Novita AI is now integrated directly into OpenCode as a supported provider, so users can select Novita in OpenCode without manual configuration.
Production Patterns
Use Prompt Cache for long-horizon agents
If you run multi-turn workflows on large, stable context (repo snapshot, long spec, design doc), caching can significantly reduce cost—Novita exposes Cache Read pricing explicitly.
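As a back-of-envelope illustration of why this matters, here is a sketch of the input-cost arithmetic using the per-token prices listed in the FAQ ($0.6/M input, $0.11/M cache read). The function name and the simplified turn model are illustrative assumptions; real bills also depend on per-turn message deltas and actual cache-hit behavior.

```python
def agent_input_cost(stable_tokens: int, turns: int,
                     input_per_m: float = 0.60,
                     cache_read_per_m: float = 0.11) -> tuple[float, float]:
    """Estimate input cost for an agent that re-sends the same stable
    context (repo snapshot, long spec, design doc) on every turn.

    Returns (cost_without_cache, cost_with_cache) in dollars.
    Per-turn deltas (new user/tool messages) are ignored to keep the
    sketch simple.
    """
    # Without caching: the full stable prefix is billed at the input
    # rate on every single turn.
    without = stable_tokens * turns * input_per_m / 1e6
    # With caching: the first turn pays the full input rate, and later
    # turns pay the cheaper Cache Read rate for the stable prefix.
    with_cache = (stable_tokens * input_per_m
                  + stable_tokens * (turns - 1) * cache_read_per_m) / 1e6
    return without, with_cache
```

For example, a 100K-token stable context over 20 turns works out to roughly $1.20 of input cost without caching versus about $0.27 with it.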
Structured Output for reliable pipelines
When integrating with workflow engines, validators, or UIs, prefer JSON-structured outputs (schema-driven) to reduce parsing edge cases. Novita lists Structured Output as supported for GLM-4.7.
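A sketch of what a schema-driven request can look like, following the OpenAI-style `response_format` / `json_schema` convention. The triage schema, the `name` field, and the helper function are illustrative assumptions; confirm the exact parameter shape Novita accepts in its API reference.

```python
def build_structured_request(user_prompt: str) -> dict:
    """Build an OpenAI-style Chat Completions payload that constrains
    the model's reply to a JSON schema (here: a hypothetical bug-report
    triage result)."""
    schema = {
        "type": "object",
        "properties": {
            "severity": {"type": "string", "enum": ["low", "medium", "high"]},
            "component": {"type": "string"},
            "summary": {"type": "string"},
        },
        "required": ["severity", "component", "summary"],
        "additionalProperties": False,
    }
    return {
        "model": "zai-org/glm-4.7",
        "messages": [{"role": "user", "content": user_prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "triage_result",
                "schema": schema,
                "strict": True,
            },
        },
    }
```

Because the reply is constrained to the schema, downstream consumers can parse it with a plain `json.loads` plus schema validation instead of regex scraping.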
Function Calling for tool-augmented coding
Wrap your tools as functions: repo search, ticket lookup, CI trigger, database read, web fetch—then let the model decide when to call them. GLM-4.7 is explicitly designed for stronger tool collaboration.
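A minimal sketch of that wrapping in the OpenAI function-calling format. The `repo_search` tool and the dispatcher are hypothetical stand-ins for whatever your agent actually exposes:

```python
# Tool definitions passed as the `tools` parameter of a chat completion.
# `repo_search` is an illustrative example, not a built-in tool.
tools = [{
    "type": "function",
    "function": {
        "name": "repo_search",
        "description": "Search the repository for a symbol or string.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search term."},
                "max_results": {"type": "integer", "default": 10},
            },
            "required": ["query"],
        },
    },
}]

def dispatch(tool_name: str, arguments: dict) -> str:
    """Route a model-issued tool call to a local implementation."""
    if tool_name == "repo_search":
        # Stand-in implementation; a real agent would query the repo here.
        return f"0 results for {arguments['query']!r}"
    raise ValueError(f"Unknown tool: {tool_name}")
```

In an agent loop, you check each response's `tool_calls`, run `dispatch` on them, append the results as `role: "tool"` messages, and call the model again until it answers without requesting a tool.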
Thinking mode policy: “fast by default, deep when needed”
trivial Q&A / formatting: thinking off
debugging / multi-step refactors: thinking on
long tasks: consider modes that improve stability and cache hit rate
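The policy above can be sketched as a small helper that maps task type to request options. The `thinking` field shown here mirrors the shape Z.AI documents for its GLM API; whether Novita accepts the same field (e.g. via the SDK's `extra_body`) is an assumption to verify against Novita's model documentation.

```python
def request_kwargs(task_kind: str) -> dict:
    """'Fast by default, deep when needed': enable thinking only for
    task types that benefit from it. The `thinking` payload shape is
    borrowed from Z.AI's GLM API docs and may differ per provider."""
    deep = task_kind in {"debugging", "refactor", "long_task"}
    return {
        "model": "zai-org/glm-4.7",
        "extra_body": {"thinking": {"type": "enabled" if deep else "disabled"}},
    }
```

Keeping the toggle in one place also makes it easy to tune the cost/latency vs. reliability trade-off per workload.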
Conclusion
GLM-4.7 brings a practical set of upgrades for developers building agentic coding and long-horizon tool-using workflows: 200K context, controllable thinking, stronger function calling behavior, and better front-end “vibe coding” outputs.
On Novita AI, you can start immediately with an OpenAI-compatible serverless API, with transparent token pricing and built-in support for function calling and structured outputs—ready for production-grade agent pipelines.
Novita AI is an AI cloud platform that offers developers an easy way to deploy AI models using our simple API, while also providing affordable and reliable GPU cloud for building and scaling.
Frequently Asked Questions
What is GLM-4.7?
GLM-4.7 is Z.AI’s flagship LLM, positioned for enhanced programming and more stable multi-step reasoning/execution, and it is released with an official open-weights model (available on Hugging Face).
Is GLM-4.7 free?
On Novita AI, GLM-4.7 is pay-per-token: $0.6/M tokens (input), $0.11/M tokens (cache read), and $2.2/M tokens (output). On Z.ai, access is commonly packaged via a paid Coding Plan (starting at $3/month). Some platforms, including Novita AI, may offer limited trials or free quotas, but GLM-4.7 itself isn't universally free.
Is GLM-4.7 really good?
For coding and agentic workflows, it's positioned as a top-tier open model by its publisher. Z.AI reports strong results on coding and agent benchmarks (e.g., LiveCodeBench v6, SWE-bench Verified, BrowseComp, τ²-Bench) and frames it as competitive with Claude Sonnet 4.5 on several measurements.