Building AI agents is increasingly straightforward. Deploying them at scale remains a significant challenge. Traditional infrastructure like containers, virtual machines, and serverless functions was designed for conventional web applications. It doesn’t meet the unique demands of AI agent workloads.
Novita Agent Runtime addresses this gap by providing purpose-built serverless infrastructure for deploying AI agents. Built on Novita’s agent sandbox, it supports LangGraph workflows, Microsoft AutoGen multi-agent systems, and custom implementations. Deploy your existing agents to production with minimal code changes and zero infrastructure management.
The Infrastructure Gap Holding Back AI Agents
Traditional applications follow a predictable pattern: receive a request, process it, return a response. The entire lifecycle completes in milliseconds or seconds, with each request handled independently.
AI agents operate differently. They maintain reasoning state across interactions, run extended workflows, call external services with unpredictable latency, and require strong isolation for concurrent users handling sensitive data.
Existing infrastructure doesn’t fit these needs well:
| Feature | Agent Sandbox | Container | Serverless | Virtual Machine |
| --- | --- | --- | --- | --- |
| Startup Time | <200 ms (MicroVM) | Seconds | ~1 s | 30+ seconds |
| Security Isolation | Strong (MicroVM) | Weak (shared kernel) | Strong | Strong |
| State Persistence | Instant suspend/resume | Supported | Stateless | Slow snapshot recovery |
| Max Execution | Hours | Unlimited | Typically 15-minute limit | Unlimited |
| Developer Experience | Agent-optimized APIs | Generic APIs | Function-level APIs | Infra-level APIs |
Novita Agent Sandbox combines millisecond startup, MicroVM isolation, stateful execution, and agent-native APIs. Novita Agent Runtime, built on Agent Sandbox, lets you deploy existing agents to this infrastructure with minimal code changes.
Novita Agent Runtime: Purpose-Built for AI Agents

Novita Agent Runtime is a lightweight, framework-agnostic serverless deployment toolkit built on AgentCore-compatible architecture. It enables you to deploy existing AI agents to production safely and efficiently, without requiring extensive DevOps expertise or infrastructure management.
The toolkit supports both real-time interactions and long-running workloads, from conversational interfaces that need sub-second response times to complex reasoning tasks that may take hours to complete.
Novita Agent Runtime includes an SDK and CLI tools. The SDK provides decorator-based APIs to expose your agent as a standard HTTP service and methods to invoke agents programmatically. The CLI enables one-click configuration and deployment to the Novita Agent Sandbox ecosystem.
Key Capabilities of Novita Agent Runtime

Framework agnostic. Novita Agent Runtime works with LangGraph, Microsoft AutoGen, Google ADK, OpenAI Agents SDK, CrewAI, and custom implementations. Use your preferred framework without infrastructure lock-in.
Model agnostic. The runtime operates independently of your model choice. It works with Novita AI, Anthropic Claude, Google Gemini, OpenAI, and any other provider.
Sub-200ms cold starts. Lightweight virtualization technology achieves near-container startup speeds with hardware-level environment isolation. Even first requests receive sub-second responses.
Full session isolation. Each user session runs in a dedicated microVM with isolated CPU, memory, and filesystem resources. After session completion, the entire microVM terminates and memory is sanitized.
Hours-long execution. The platform supports long-running workloads spanning multiple hours, enabling complex agent reasoning, asynchronous workflows, and multi-agent collaboration.
Consumption-based pricing. Pay only for what you use. Scale from prototype to production without overpaying for unused capacity or worrying about resource planning.
From Code to Cloud in Three Steps
Step 1: Integrate the SDK. Add a few lines of code to your existing agent implementation using the SDK’s decorator-based API. The SDK automatically handles request routing, response formatting, and health checks.
```python
from novita_sandbox.agent_runtime import AgentRuntimeApp

app = AgentRuntimeApp()

@app.entrypoint
def my_agent(request: dict):
    # Agent business logic
    return {"result": "..."}
```
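If `@app.entrypoint` returns the decorated function unchanged (a common decorator convention, assumed here rather than confirmed by the SDK docs), you can unit-test your handler logic locally before deploying. The sketch below uses a standalone stand-in handler so it runs without the SDK installed:

```python
# Stand-in handler mirroring the entrypoint signature above;
# in real code this function would carry the @app.entrypoint decorator.
def my_agent(request: dict) -> dict:
    prompt = request.get("prompt", "")
    return {"result": f"echo: {prompt}"}

# Call the handler directly, exactly as the runtime would with a parsed request.
assert my_agent({"prompt": "ping"}) == {"result": "echo: ping"}
```

Testing the handler as a plain function keeps the feedback loop fast: the runtime only adds HTTP routing and formatting around the same call.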
Step 2: Deploy with one command. Use the CLI to configure and deploy your agent:
```bash
# Configure your agent
novita-sandbox-cli agent configure

# Deploy to cloud
novita-sandbox-cli agent launch
```
The CLI generates a `Dockerfile` and a `.novita-agent.yaml` configuration file, builds and uploads your sandbox template, then produces an Agent ID in the format `<agent_name>-<template_id>`.
Step 3: Invoke your agent. After deployment, invoke your agent via CLI for quick testing:
```bash
# Quick test with CLI
novita-sandbox-cli agent invoke "Hello, Agent!"
```
Or invoke programmatically via SDK:
```python
import asyncio
import json

from novita_sandbox.agent_runtime import AgentRuntimeClient

async def main():
    client = AgentRuntimeClient(api_key="your-api-key")
    # Prepare request data (converted to a JSON string and encoded as bytes)
    payload = json.dumps({"prompt": "Hello, Agent!"}).encode()
    response = await client.invoke_agent_runtime(
        agentId="agent-xxxxx",
        payload=payload,
    )
    print(response)

asyncio.run(main())
```
Each invocation creates an isolated sandbox instance, executes the agent within that secure environment, and returns the result.
For a detailed walkthrough with code examples, see the Quick Start Guide. For complete installation instructions, see the Installation Guide.
Works With Every Major AI Framework
All framework integrations follow a consistent pattern: initialize the AgentRuntimeApp, set up your framework, define an entry point with the decorator, and run the application. The SDK works seamlessly with LangGraph, OpenAI Agents SDK, Microsoft AutoGen, Google ADK, and custom implementations.
```python
from novita_sandbox.agent_runtime import AgentRuntimeApp

# 1. Create Agent Runtime application instance
app = AgentRuntimeApp()

# 2. Initialize your Agent framework here

# 3. Define entry point with decorator
@app.entrypoint
def agent_invocation(request: dict) -> dict:
    """
    Args:
        request: Request data, typically contains fields like prompt
    Returns:
        Response data dictionary
    """
    prompt = request.get("prompt", "")
    # Call your Agent framework
    result = your_agent.run(prompt)
    return {"result": result}

# 4. Run the application
if __name__ == "__main__":
    app.run()
```
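The pattern above makes no assumptions beyond "an agent you can call". As a hedged illustration, here is the same entry point wired to a minimal custom implementation; `EchoAgent` is a hypothetical placeholder standing in for whatever framework object your code exposes:

```python
# A stand-in "framework": any object exposing run() slots into the pattern.
# EchoAgent is a hypothetical example, not a real framework class.
class EchoAgent:
    def run(self, prompt: str) -> str:
        return prompt.upper()

your_agent = EchoAgent()

# Same shape as the decorated entry point above, minus the SDK decorator.
def agent_invocation(request: dict) -> dict:
    prompt = request.get("prompt", "")
    return {"result": your_agent.run(prompt)}

print(agent_invocation({"prompt": "hello"}))  # → {'result': 'HELLO'}
```

Swapping `EchoAgent` for a LangGraph graph, an AutoGen group chat, or a CrewAI crew changes only the body of `agent_invocation`; the runtime contract stays the same.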
For detailed integration examples, see the Agent Framework Integration Guide. For advanced features including streaming responses, multi-turn conversations, and environment variable management, refer to the Advanced Features documentation.
Transparent Pay-Per-Second Pricing
Novita Agent Runtime uses consumption-based pricing with per-second billing granularity.
CPU
| vCPUs | Unit Price (per second) |
| --- | --- |
| 1 vCPU | $0.00000784/s |
| 2 vCPUs | $0.00001568/s |
| 3 vCPUs | $0.00002352/s |
| 4 vCPUs | $0.00003136/s |
| 5 vCPUs | $0.00003920/s |
| 6 vCPUs | $0.00004704/s |
| 7 vCPUs | $0.00005488/s |
| 8 vCPUs | $0.00006272/s |
Memory
Valid memory values are multiples of 512 MiB, from 512 MiB to 8192 MiB, billed at $0.00000256/GiB/s.
| Memory | Unit Price (per second) |
| --- | --- |
| 512 MiB | $0.00000128/s |
| 1 GiB | $0.00000256/s |
| 2 GiB | $0.00000512/s |
Storage: Each account includes 60 GB of free storage. Additional storage is billed at $0.000072/GB/h.
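To make the per-second rates concrete, a quick cost estimate can be computed from the CPU and memory prices above (the session shape here, 2 vCPUs with 2 GiB for 10 minutes, is an illustrative assumption, not a benchmark):

```python
# Published per-second rates from the pricing tables above.
CPU_RATE = 0.00000784   # $ per vCPU per second
MEM_RATE = 0.00000256   # $ per GiB per second

def session_cost(vcpus: int, mem_gib: float, seconds: float) -> float:
    """Estimated cost of one sandbox session, excluding storage."""
    return (vcpus * CPU_RATE + mem_gib * MEM_RATE) * seconds

# A 10-minute session on 2 vCPUs with 2 GiB of memory:
print(f"${session_cost(2, 2, 600):.5f}")  # → $0.01248
```

Per-second granularity means short, bursty agent sessions cost fractions of a cent, which is where consumption-based billing diverges most from always-on VMs.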
Start Building Production AI Agents Now
Novita Agent Runtime removes the infrastructure complexity that traditionally slows AI agent deployment. Purpose-built serverless infrastructure with sub-200ms cold starts, complete session isolation, and framework flexibility enables development teams to focus on building intelligent agent behaviors rather than managing infrastructure.
Get started today by installing the SDK and CLI, then deploy your first agent in minutes. For additional support, join the Novita AI Discord community or contact our sales team.
About Novita AI
Novita AI is an AI cloud platform that offers developers an easy way to deploy AI models using our simple API, while also providing an affordable and reliable GPU cloud for building and scaling.