- How Sandbox Pricing Actually Works
- Per-Session Fees
- Compute Tiers: vCPU and Memory
- Storage: Ephemeral vs. Persistent
- Egress and Network Fees
- Package Caching Economics
- Idle Time and Autopause
- Self-Hosted: The Hidden Cost Model
- Cost Estimates for Three Common Workloads
- Questions to Ask Any Sandbox Vendor
- Conclusion
- FAQ
Before you commit to an AI agent sandbox platform, understand how its pricing model fits your actual workload. Sandbox costs are not just compute rates — they’re a combination of session fees, resource tiers, storage, egress, package caching behavior, and idle time handling. Get one dimension wrong and your cost estimate for a real coding agent or browser automation workflow can be off by an order of magnitude.
This guide breaks down each pricing axis, shows how they interact in common workloads, and gives you a comparison framework to evaluate vendors on cost before signing up.
How Sandbox Pricing Actually Works
Most managed sandbox providers bill on some combination of:
- Compute time: CPU and RAM consumed per second (or per minute) while the sandbox is running
- Session overhead: a flat per-session startup charge, or a minimum billing unit that applies even for short runs
- Storage: persistent volume space above the included free tier
- Egress: outbound data transfer, usually measured in GB
- Subscription tier: a monthly minimum that unlocks higher concurrency, longer sessions, or custom resource configs
In principle, a sandbox that qualifies for autopause should not accrue compute charges while it is idle, but providers do not all implement autopause the same way. The billing model’s edge cases matter as much as the headline rate.
Per-Session Fees
Some providers charge a flat fee for each sandbox that starts, independent of how long it runs or what resources it uses. Others bill only compute time with no per-session overhead.
A per-session charge matters most when you have high-frequency, short-lived workloads — for example, a code interpreter that spawns and destroys a sandbox for every user turn in a chat session. If a session costs $0.001 to start and your application runs 10,000 sessions per day, that’s $10/day in session fees before any compute is counted.
What to ask: Does the provider charge a minimum per-session fee, or only for actual compute time? What is the minimum billing unit (per second, per minute, per 5 minutes)?
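To make the two questions above concrete, here is a minimal sketch of how a flat startup fee and a coarse billing unit inflate short runs. The function names and the 20-second run are illustrative assumptions; the $0.001 fee and 10,000 sessions/day are the figures from the example above.

```python
import math


def billed_seconds(run_s: float, billing_unit_s: int) -> int:
    """Round a run up to whole billing units (1 = per second, 60 = per minute)."""
    return math.ceil(run_s / billing_unit_s) * billing_unit_s


def daily_session_fees(fee_per_session: float, sessions_per_day: int) -> float:
    """Flat startup fees per day, before any compute is counted."""
    return fee_per_session * sessions_per_day


if __name__ == "__main__":
    # Fee and volume from the example in the text.
    print(f"session fees: ${daily_session_fees(0.001, 10_000):.2f}/day")
    # Hypothetical 20-second run under three billing granularities.
    for unit in (1, 60, 300):
        print(f"unit={unit}s -> billed {billed_seconds(20, unit)}s")
```

Under these assumptions, a 20-second run is billed as 60 seconds with per-minute billing and as 300 seconds with 5-minute billing, so for very short tasks the billing unit can matter more than the per-second rate.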
For Novita Agent Sandbox, billing is per second based on actual vCPU and memory usage with no additional per-session startup fee. Pricing as of mid-2026: 1 vCPU at $0.0000098/s, with memory at $0.0000016/GiB/s. A short 5-minute task on 1 vCPU + 512 MiB RAM costs approximately $0.0032 total: (0.0000098 × 300s) + (0.0000016 × 0.5 × 300s) = $0.00294 + $0.00024. (Source: Novita AI pricing page, verified in published Novita documentation.)
For E2B Pro (as documented in Novita’s published comparison articles), 1 vCPU is priced at $0.0000140/s with memory at $0.0000045/GiB/s, plus a $150/month subscription requirement to access custom CPU/RAM configuration and 24-hour session lengths.
Always verify current rates on each provider’s pricing page before committing — sandbox pricing is actively changing in this market.
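The per-second arithmetic can be sketched as a small calculator. The rates below are the ones quoted in this article, not live prices, and the dictionary layout and function names are hypothetical. The $0 subscription for Novita reflects this article’s statement that it has no subscription requirement.

```python
# Rates as quoted in this article; verify against each provider's pricing page.
NOVITA = {"vcpu_per_s": 0.0000098, "mem_per_gib_s": 0.0000016, "subscription": 0.0}
E2B_PRO = {"vcpu_per_s": 0.0000140, "mem_per_gib_s": 0.0000045, "subscription": 150.0}


def session_cost(rates: dict, vcpu: float, mem_gib: float, seconds: float) -> float:
    """Compute-only cost of one session under per-second billing."""
    return seconds * (vcpu * rates["vcpu_per_s"] + mem_gib * rates["mem_per_gib_s"])


def monthly_cost(rates: dict, vcpu: float, mem_gib: float,
                 seconds: float, sessions_per_month: int) -> float:
    """Subscription plus compute for a month of identical sessions."""
    return rates["subscription"] + sessions_per_month * session_cost(
        rates, vcpu, mem_gib, seconds
    )


if __name__ == "__main__":
    # The 5-minute, 1 vCPU + 512 MiB task from the text.
    for name, rates in (("Novita", NOVITA), ("E2B Pro", E2B_PRO)):
        print(name, round(session_cost(rates, 1, 0.5, 300), 6))
```

Including the subscription in `monthly_cost` matters at low volume, where a fixed monthly fee can outweigh the compute charges entirely.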
Compute Tiers: vCPU and Memory
Compute is the dominant cost for most sandbox workloads. The variables are:
- vCPU count: most providers bill linearly per vCPU
- Memory: billed per GiB/s, usually at a lower rate than compute
- Configurability: some providers offer fixed tiers (e.g., 1/2/4/8 vCPU), others allow arbitrary allocation
For batch agent workloads (many short tasks running in parallel), the ratio of memory to vCPU matters. A data analysis task that loads a large CSV may need 4 GiB RAM but only 1 vCPU. On a fixed 4 vCPU + 4 GiB tier, that task pays for three unused vCPUs for its entire duration, every time it runs.
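A rough way to quantify that waste is to price the same task on the fixed tier and on a right-sized allocation. This sketch borrows the per-second rates quoted earlier in this article purely as illustrative inputs, and the 10-minute duration is an assumption.

```python
def task_cost(vcpu: float, mem_gib: float, seconds: float,
              vcpu_rate: float, mem_rate: float) -> float:
    """Cost of one task when vCPU and memory are billed linearly per second."""
    return seconds * (vcpu * vcpu_rate + mem_gib * mem_rate)


def overpay_fraction(needed_cost: float, billed_cost: float) -> float:
    """Share of the bill that pays for resources the task did not need."""
    return (billed_cost - needed_cost) / billed_cost


if __name__ == "__main__":
    VCPU_RATE, MEM_RATE = 0.0000098, 0.0000016  # illustrative rates (USD/s)
    fixed = task_cost(4, 4, 600, VCPU_RATE, MEM_RATE)     # fixed 4 vCPU + 4 GiB tier
    flexible = task_cost(1, 4, 600, VCPU_RATE, MEM_RATE)  # right-sized 1 vCPU + 4 GiB
    print(f"fixed={fixed:.5f} flexible={flexible:.5f} "
          f"overpay={overpay_fraction(flexible, fixed):.0%}")
```

With these inputs, roughly two thirds of the fixed-tier bill goes to vCPUs the task never uses.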
What to ask: Can I configure vCPU and memory independently? Is there a minimum allocation? What GPU tiers are available if I need model inference inside the sandbox?
The practical implication: a provider with flexible per-resource billing gives teams running mixed workloads (some CPU-heavy, some memory-heavy) better cost control than one with fixed compute bundles.
Storage: Ephemeral vs. Persistent
Sandbox storage comes in two forms with different billing behavior:
Ephemeral storage is the sandbox’s local filesystem during a session. It disappears when the sandbox terminates. Most providers include a free allocation (10–20 GB is common) and do not charge extra for it within that limit.
Persistent storage survives across sessions. This is where agents store checkpoints, generated files, cached artifacts, or workspace state that needs to be available next time. Persistent volumes are typically billed per GB per month, similar to cloud block storage pricing.
The cost trap: if your agent generates large intermediate files (logs, model outputs, raw data) and those accumulate in persistent storage without cleanup, storage charges compound over time. An agent that generates 1 GB of output per day and retains everything for 30 days accumulates 30 GB of storage before you notice.
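The accumulation can be modeled in a few lines. This is a sketch under stated assumptions: output arrives at a constant daily rate, anything older than the retention window is deleted, and the $0.10 per GB-month price is a placeholder rather than any provider’s published rate.

```python
def stored_gb(daily_output_gb: float, retention_days: float, day: int) -> float:
    """GB held on a given day (1-indexed) when older output is deleted."""
    return daily_output_gb * min(day, retention_days)


def first_month_storage_cost(daily_output_gb: float, retention_days: float,
                             price_per_gb_month: float, days: int = 30) -> float:
    """Approximate the first month's bill as average GB held times the monthly price."""
    average_gb = sum(
        stored_gb(daily_output_gb, retention_days, d) for d in range(1, days + 1)
    ) / days
    return average_gb * price_per_gb_month


if __name__ == "__main__":
    PRICE = 0.10  # hypothetical USD per GB-month
    for retention in (7, 30, float("inf")):  # inf = never clean up
        print(retention, stored_gb(1, retention, 90),
              round(first_month_storage_cost(1, retention, PRICE), 2))
```

Passing `float("inf")` as the retention window models the no-cleanup case, where held storage grows without bound.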
What to ask: What is the free ephemeral storage allocation per sandbox? Is persistent/workspace storage available and how is it priced? Is there a maximum sandbox disk size? Are there snapshot or template storage fees?
Novita Agent Sandbox includes 20 GB of free sandbox storage. Persistent storage pricing beyond the free tier should be verified on the current pricing page.
Egress and Network Fees
Egress fees catch developers by surprise because they are invisible during development but material at production scale.
Most managed cloud providers charge for:
- Outbound data transfer from the sandbox to the public internet
- Cross-region data transfer if your sandbox region differs from your application servers
- Large file downloads within sandboxes (e.g., downloading datasets, model weights, npm packages)
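Sandbox workloads that pull external data (browser automation agents fetching pages, data agents downloading datasets, coding agents cloning repositories) can generate meaningful network transfer at scale. A coding agent that clones a 500 MB repository in every session and runs 1,000 sessions per day moves 500 GB/day over the network. From the sandbox’s point of view a clone is inbound traffic, so whether it is billed depends on whether the provider meters inbound transfer as well as egress.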
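Transfer volume is easy to estimate before you have a price. The sketch below converts per-session transfer into daily volume and then applies a per-GB rate. The $0.09/GB rate and the free allowance parameter are hypothetical placeholders, since many sandbox providers do not publish an explicit figure.

```python
def daily_transfer_gb(mb_per_session: float, sessions_per_day: int) -> float:
    """Daily network transfer in decimal GB (1 GB = 1000 MB)."""
    return mb_per_session * sessions_per_day / 1000


def monthly_transfer_cost(gb_per_day: float, price_per_gb: float,
                          free_gb_per_month: float = 0.0, days: int = 30) -> float:
    """Monthly charge for metered transfer after any free allowance."""
    billable_gb = max(gb_per_day * days - free_gb_per_month, 0.0)
    return billable_gb * price_per_gb


if __name__ == "__main__":
    volume = daily_transfer_gb(500, 1000)  # the repository-clone example
    print(f"{volume:.0f} GB/day")
    # Hypothetical rate; substitute the provider's actual price.
    print(f"${monthly_transfer_cost(volume, 0.09):,.2f}/month")
```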
What to ask: Does the provider charge for outbound egress? At what rate? Is inbound data (uploads to the sandbox) also charged? Are there egress caps or throttling at lower plan tiers?
Many sandbox providers do not publish explicit egress pricing and instead include it in platform-wide network cost summaries. Get a clear answer before scaling.
Package Caching Economics
Installing Python packages, npm dependencies, or system packages inside a sandbox on every run is expensive in time, not just cost. A fresh pip install torch can take minutes and add significant compute billing to every session.
Providers handle this differently:
No caching: every sandbox starts from a base image and installs packages from scratch each time. Startup latency is high; compute billing includes install time.
Template/snapshot caching: you create a pre-built sandbox template with packages installed. Sessions start from that snapshot. Startup is fast; package install compute is paid once when the template is built, not per session.
Implicit layer caching: some providers cache package layers automatically across sandboxes of the same image, similar to Docker layer caching, so frequently-used packages are pulled from cache rather than downloaded again.
The economics: if a 5-minute agent task requires 2 minutes of package installation per run, you’re paying 40% of your compute bill for setup, not work. Templates or snapshots eliminate that overhead at the cost of template storage and management complexity.
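A simple break-even calculation shows when a template is worth maintaining. In this sketch the $1.00/month template cost is a hypothetical figure, the per-second rates are the illustrative ones quoted earlier in this article, and the install time is assumed to disappear entirely once a template is used.

```python
import math


def setup_share(install_s: float, total_session_s: float) -> float:
    """Fraction of billed session time spent installing packages."""
    return install_s / total_session_s


def template_break_even_sessions(template_cost_per_month: float, install_s: float,
                                 vcpu: float, mem_gib: float,
                                 vcpu_rate: float, mem_rate: float) -> int:
    """Sessions per month at which avoided install compute covers the template cost."""
    saved_per_session = install_s * (vcpu * vcpu_rate + mem_gib * mem_rate)
    return math.ceil(template_cost_per_month / saved_per_session)


if __name__ == "__main__":
    print(f"setup share: {setup_share(120, 300):.0%}")  # 2 of 5 minutes
    # Hypothetical $1.00/month template cost, 1 vCPU + 1 GiB sandbox.
    print(template_break_even_sessions(1.00, 120, 1, 1, 0.0000098, 0.0000016))
```

This counts only compute savings. Faster startup is a separate benefit that often matters more for interactive workloads.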
What to ask: Does the provider support sandbox templates or snapshots? Are templates billed per-template or only when sessions launch from them? How often do template images need to be rebuilt (e.g., when base packages update)?
Novita Agent Sandbox supports templates for pre-built environments. Teams running high-frequency tasks against a consistent package set should evaluate the template storage cost against the per-session package install time savings — for most workloads, templates pay for themselves quickly.
Idle Time and Autopause
Sandboxes that sit idle between agent steps waste money. If the sandbox stays running while an agent waits 30 seconds for an LLM response, those 30 seconds are still billed as compute.
Autopause / autoresume (sometimes called pause/resume or snapshot-on-idle) means the sandbox is frozen when no code is executing and only billed for compute when active. This can dramatically reduce costs for workflows with long LLM-wait gaps — for example, a multi-turn coding agent where the LLM takes 10 seconds to generate each code snippet and the sandbox sits idle for those 10 seconds.
What to ask: Does the provider support autopause? What triggers a pause (idle time threshold, explicit API call)? How fast is resume — under 1 second, or closer to a full cold start? Is there a billing difference between a paused sandbox and a running one?
The tradeoff: autopause with slow resume adds latency to each agent step. For latency-sensitive interactive workloads, keeping the sandbox warm (and paying for idle time) may be the right call. For batch workloads running overnight, autopause is almost always worth it.
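The tradeoff can be estimated with a per-step model. This sketch assumes paused time is free and resume time is billed as active compute, and both assumptions should be checked against the provider’s billing rules. The 2-second execution step, 20 steps, and 0.5-second resume are illustrative values.

```python
def idle_fraction(wait_s: float, exec_s: float) -> float:
    """Share of each agent step spent waiting rather than executing."""
    return wait_s / (wait_s + exec_s)


def always_on_cost(steps: int, wait_s: float, exec_s: float, rate_per_s: float) -> float:
    """Session cost when the sandbox stays running through every LLM wait."""
    return steps * (wait_s + exec_s) * rate_per_s


def autopause_cost(steps: int, exec_s: float, rate_per_s: float,
                   resume_s: float = 0.0) -> float:
    """Session cost when waits are paused; resume time is assumed to be billed."""
    return steps * (exec_s + resume_s) * rate_per_s


if __name__ == "__main__":
    RATE = 0.0000098 + 0.0000016  # illustrative 1 vCPU + 1 GiB rate (USD/s)
    print(f"idle: {idle_fraction(10, 2):.0%}")
    print(f"always on: {always_on_cost(20, 10, 2, RATE):.6f}")
    print(f"autopause: {autopause_cost(20, 2, RATE, resume_s=0.5):.6f}")
```

The same model exposes the latency side of the tradeoff: `resume_s` is added to every step, so a slow resume costs both money and wall-clock time.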
Self-Hosted: The Hidden Cost Model
Self-hosted or bring-your-own-cloud (BYOC) sandbox deployments have a fundamentally different cost structure than managed cloud services. The infrastructure bill is lower per unit of compute, but the operational overhead is real.
What you pay for in self-hosted:
- VM or bare-metal costs (typically at cloud spot/reserved rates, which are lower than managed sandbox rates)
- Storage: EBS/persistent volumes, snapshot storage, and outbound egress from your cloud account
- Ops engineering time: provisioning, scaling, patching, security hardening, and incident response
- Observability infrastructure: logging, metrics, tracing for sandbox lifecycle events
- Compliance work: if you need SOC 2, HIPAA, or similar controls, the work falls on your team
The common mistake is to compare self-hosted compute rates against managed sandbox rates and conclude the self-hosted option is cheaper. The ops and compliance overhead often costs more than the infrastructure savings, especially for teams with fewer than three platform engineers who can own sandbox infrastructure full-time.
Where self-hosted makes sense:
- Teams with existing cloud infrastructure and platform engineering capacity
- Regulatory environments where data cannot leave a specific cloud account or region
- Very high-volume workloads where the managed-vs-self-hosted cost delta at scale exceeds the ops overhead
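The high-volume case can be framed as a break-even point: the monthly session count at which the per-session savings of self-hosting cover its fixed overhead. Every number in this sketch is hypothetical, including the $4,000/month overhead and both per-session costs. Substitute your own estimates.

```python
import math
from typing import Optional


def break_even_sessions(fixed_overhead_per_month: float,
                        managed_cost_per_session: float,
                        self_hosted_cost_per_session: float) -> Optional[int]:
    """Monthly sessions at which self-hosting starts to cost less than managed.

    Returns None when self-hosting is not cheaper per session, in which case
    no volume recovers the fixed ops and compliance overhead.
    """
    saving = managed_cost_per_session - self_hosted_cost_per_session
    if saving <= 0:
        return None
    return math.ceil(fixed_overhead_per_month / saving)


if __name__ == "__main__":
    # Hypothetical: $4,000/month of ops time and tooling, $0.047 per session
    # on a managed service versus $0.020 on self-hosted infrastructure.
    print(break_even_sessions(4000, 0.047, 0.020))
```

Fixed overhead here should include engineering time, observability, and compliance work, not just the cloud bill.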
Novita Agent Sandbox supports BYOC deployment into AWS or GCP accounts for teams that need sandboxes running inside their own VPC for compliance or network policy reasons. E2B does not currently document BYOC as an available option for standard Pro plans, though this may change — verify with each provider at the time of your evaluation.
Cost Estimates for Three Common Workloads
These estimates use Novita’s documented pricing as a reference point. Scale the estimates for your workload’s actual vCPU, memory, session length, and daily session count. Always verify current rates before using these figures for budget planning.
Workload 1: Coding agent (interactive, short sessions)
- Profile: 1 vCPU, 1 GiB RAM, average 10-minute session, 500 sessions/day
- Compute: (0.0000098 × 600s) + (0.0000016 × 1 × 600s) = $0.00588 + $0.00096 = ~$0.007 per session
- Daily: ~$3.42/day, ~$103/month for 500 sessions/day (computed from the unrounded $0.00684 per session over a 30-day month)
- Key variable: package caching — without templates, add 2–3 minutes of install time per session
Workload 2: Data analysis agent (medium sessions, larger memory)
- Profile: 2 vCPU, 4 GiB RAM, average 30-minute session, 100 sessions/day
- Compute: (0.0000196 × 1800s) + (0.0000016 × 4 × 1800s) = $0.03528 + $0.01152 = ~$0.047 per session
- Daily: ~$4.68/day, ~$140/month for 100 sessions/day (computed from the unrounded $0.0468 per session over a 30-day month)
- Key variable: output file retention — if each session generates 100 MB of stored output, 100 sessions/day = 10 GB/day of storage accumulation
Workload 3: Browser automation agent (long sessions, network-heavy)
- Profile: 2 vCPU, 2 GiB RAM, average 60-minute session, 50 sessions/day
- Compute: (0.0000196 × 3600s) + (0.0000016 × 2 × 3600s) = $0.07056 + $0.01152 = ~$0.082 per session
- Daily: ~$4.10/day, ~$123/month for 50 sessions/day
- Key variable: egress. Browser agents fetching 10 MB of page data per session × 50 sessions = 500 MB/day of transfer that may be billed
These estimates exclude subscription fees, egress, and persistent storage. For providers with a monthly subscription minimum, add that fixed cost before comparing.
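To adapt the three estimates to your own profile, the arithmetic can be wrapped in a small script. It uses the Novita rates quoted in this article and a 30-day month. The workload table and function name are illustrative. Totals are computed from unrounded per-session costs, so they can differ by a few percent from figures derived from rounded per-session values.

```python
VCPU_RATE = 0.0000098  # USD per vCPU-second, as quoted in this article
MEM_RATE = 0.0000016   # USD per GiB-second, as quoted in this article

# name: (vCPU, GiB RAM, session seconds, sessions per day)
WORKLOADS = {
    "coding agent": (1, 1, 600, 500),
    "data analysis agent": (2, 4, 1800, 100),
    "browser automation agent": (2, 2, 3600, 50),
}


def estimate(vcpu: float, mem_gib: float, seconds: float,
             sessions_per_day: int, days: int = 30) -> tuple:
    """Return (per-session, daily, monthly) compute cost in USD."""
    per_session = seconds * (vcpu * VCPU_RATE + mem_gib * MEM_RATE)
    daily = per_session * sessions_per_day
    return per_session, daily, daily * days


if __name__ == "__main__":
    for name, profile in WORKLOADS.items():
        per_session, daily, monthly = estimate(*profile)
        print(f"{name}: ${per_session:.4f}/session, "
              f"${daily:.2f}/day, ${monthly:.2f}/month")
```

As with the estimates above, this covers compute only. Subscription fees, egress, and persistent storage need to be added separately.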
Questions to Ask Any Sandbox Vendor
Use this list when evaluating sandbox providers on cost:
Billing model
- Is billing per second, per minute, or in larger units?
- Is there a per-session minimum charge or startup fee?
- Is there a monthly subscription required to access custom resource configs or long sessions?
Compute
- Can vCPU and memory be configured independently?
- What are the minimum and maximum vCPU/memory allocations?
- Are GPU-attached sandboxes available and how are they billed?
Storage
- How much ephemeral storage is included per sandbox?
- Is persistent/workspace storage available? At what price per GB/month?
- Are there snapshot or template storage fees?
Egress
- Is outbound network egress charged? At what rate?
- Are there free egress tiers?
Idle time
- Is autopause supported? What triggers it?
- How fast is resume from paused state?
- Is a paused sandbox billed differently from a running one?
Session limits
- What is the maximum session duration at each plan tier?
- What happens to a session when it exceeds the limit — graceful termination or hard kill?
- What concurrency limits apply at each tier?
Package caching
- Are templates or snapshots supported?
- How are template builds billed?
Self-hosted / BYOC
- Is BYOC deployment supported?
- Which cloud providers (AWS, GCP, Azure)?
- What operational support is provided?
Pricing stability
- When were current rates last updated?
- Is there a committed-use or volume discount available?
Conclusion
Sandbox pricing is more than a per-second rate. The real cost of running AI agents in the cloud depends on how session minimums, compute configurability, storage retention, egress, package caching, and idle time handling combine for your specific workload profile.
Getting this right before you commit matters. A provider that looks cheap on vCPU rate can turn expensive when you factor in a $150/month subscription to unlock custom resource configs, or egress charges on a browser agent that fetches megabytes of page data per session. Conversely, a provider with autopause and snapshot templates can cost significantly less at scale than the headline rate suggests.
Use the estimates and question framework in this guide as a starting point. Plug in your actual session length, vCPU/memory profile, session frequency, and expected storage growth — then compare that against each provider’s current pricing page, not marketing summaries. Sandbox pricing in this market is actively changing, and the number that applies today may not apply in six months.
For teams already familiar with E2B’s SDK, Novita Agent Sandbox is worth evaluating: it uses the same E2B-compatible interface, bills per second with no monthly subscription requirement, and supports BYOC deployment for teams with VPC or compliance constraints. Whether it fits your workload depends on the variables above.
FAQ
What is the cheapest way to run AI agent sandboxes at scale?
The lowest total cost depends on your workload mix. For high-frequency short sessions, minimize per-session overhead and use templates to avoid paying for package installation time. For long-running sessions with LLM wait gaps, autopause significantly reduces idle compute billing. Compare providers on the specific vCPU, memory, and session duration profile that matches your use case. Headline rates don’t tell you your actual cost until you plug in those variables.
Is self-hosted always cheaper than managed sandbox services?
Not necessarily. Self-hosted infrastructure has lower per-unit compute costs but adds real operational overhead: provisioning, scaling, patching, observability, and compliance work. For teams without dedicated platform engineering capacity, the ops cost often exceeds the infrastructure savings compared to a managed service. Evaluate the total cost of ownership, not just the cloud bill.
How does package caching affect sandbox pricing?
Without caching, every sandbox start includes package installation time billed as compute. For Python workloads that install common data science or ML libraries, installation can add 2–5 minutes of compute billing per session. Templates or snapshots let you pay for package installation once and reuse that environment across many sessions. For any workload running more than a few hundred sessions per day against a consistent package set, templates typically pay for themselves quickly.
What should I watch out for with egress pricing in sandbox workloads?
Browser automation, data ingestion agents, and workloads that download large files (datasets, model weights, packages from external registries) can generate significant outbound data transfer. Clarify whether your sandbox provider charges for egress and at what rate before scaling these workloads. In some cases, using package mirrors, pre-baked templates, or regional data sources within the same cloud provider can reduce egress charges substantially.
How do I evaluate idle time cost for agents with slow LLM responses?
Estimate the ratio of LLM wait time to active execution time in your workload. If an agent waits 10 seconds for an LLM response between each 2-second code execution step, roughly 83% of session time is idle. A provider with autopause that bills only for active compute saves most of that cost. Compare the pause/resume latency against your workload’s tolerance — if users are waiting on results interactively, slow resume adds noticeable lag.
