AIO Sandbox: Browser, Shell, Files, MCP, and VSCode in One Docker Container for AI Agents
The agent sandbox problem has been solved wrong for too long. Youβd spin up a browser sandbox for web automation, a separate code execution environment for Python, a third thing for file operations β and then spend half your integration time shuffling data between them.
AIO Sandbox fixes this by putting everything in one Docker container: browser, shell, files, MCP servers, VSCode, and Jupyter. One filesystem. One environment. Zero coordination overhead.
docker run --security-opt seccomp=unconfined --rm -it -p 8080:8080 ghcr.io/agent-infra/sandbox:latest
Thatβs it. You now have:
- Browser + VNC at
/vnc/index.html?autoconnect=true - VSCode Server at
/code-server/ - API docs at
/v1/docs - MCP services at
/mcp
Why This Matters
Traditional agent sandboxes are single-purpose. Browser sandbox? Great, but now how do you get that downloaded PDF into your code execution environment? Shell sandbox? Cool, but whereβs the browser for web scraping?
AIO Sandbox runs everything in the same container with a shared filesystem. Download a file in the browser β itβs immediately available in shell and Jupyter. Write code in VSCode β execute it via the shell API. No file transfers, no mounting tricks, no coordination logic.
from agent_sandbox import Sandbox
client = Sandbox(base_url="http://localhost:8080")
home_dir = client.sandbox.get_context().home_dir
# Execute shell commands
result = client.shell.exec_command(command="ls -la")
print(result.data.output)
# File operations
content = client.file.read_file(file=f"{home_dir}/.bashrc")
# Browser automation
screenshot = client.browser.screenshot()
Same client, same environment, same filesystem.
The MCP Integration
Pre-configured MCP servers ship out of the box:
| Server | Tools |
|---|---|
| browser | navigate, screenshot, click, type, scroll |
| file | read, write, list, search, replace |
| shell | exec, create_session, kill |
| markitdown | convert, extract_text, extract_images |
Point your MCP client at http://localhost:8080/mcp and youβre connected. No configuration, no server setup.
Real-World Example: Web to Markdown
Hereβs the pattern that shows why unified environments matter β scrape a webpage, convert to markdown, save to disk:
import asyncio
import base64
from playwright.async_api import async_playwright
from agent_sandbox import Sandbox
async def site_to_markdown():
c = Sandbox(base_url="http://localhost:8080")
home_dir = c.sandbox.get_context().home_dir
# Browser automation via CDP
async with async_playwright() as p:
browser_info = c.browser.get_info().data
page = await (await p.chromium.connect_over_cdp(browser_info.cdp_url)).new_page()
await page.goto("https://example.com", wait_until="networkidle")
html = await page.content()
screenshot_b64 = base64.b64encode(await page.screenshot()).decode('utf-8')
# Execute Python in Jupyter (same container!)
c.jupyter.execute_code(code=f"""
from markdownify import markdownify
html = '''{html}'''
screenshot_b64 = "{screenshot_b64}"
md = f"{{markdownify(html)}}\\n\\n"
with open('{home_dir}/site.md', 'w') as f:
f.write(md)
print("Done!")
""")
# Read result via file API
return c.file.read_file(file=f"{home_dir}/site.md").data.content
if __name__ == "__main__":
result = asyncio.run(site_to_markdown())
print("Markdown saved!")
Browser β Jupyter β File system. No file transfers. No mounting. Just works.
Agent Framework Integrations
Browser Use
from agent_sandbox import Sandbox
from browser_use import Agent, Tools
from browser_use.browser import BrowserProfile, BrowserSession
sandbox = Sandbox(base_url="http://localhost:8080")
cdp_url = sandbox.browser.get_info().data.cdp_url
browser_session = BrowserSession(
browser_profile=BrowserProfile(cdp_url=cdp_url, is_local=True)
)
agent = Agent(
task='Search for "AI agents" on DuckDuckGo',
llm=ChatOpenAI(model="gpt-4"),
browser_session=browser_session,
)
await agent.run()
LangChain
from langchain.tools import BaseTool
from agent_sandbox import Sandbox
class SandboxTool(BaseTool):
name = "sandbox_execute"
description = "Execute commands in AIO Sandbox"
def _run(self, command: str) -> str:
client = Sandbox(base_url="http://localhost:8080")
result = client.shell.exec_command(command=command)
return result.data.output
OpenAI Assistants
from openai import OpenAI
from agent_sandbox import Sandbox
sandbox = Sandbox(base_url="http://localhost:8080")
def run_code(code, lang="python"):
if lang == "python":
return sandbox.jupyter.execute_code(code=code).data
return sandbox.nodejs.execute_nodejs_code(code=code).data
# Use as a tool in function calling
response = client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": "calculate 1+1"}],
tools=[{
"type": "function",
"function": {
"name": "run_code",
"parameters": {
"type": "object",
"properties": {
"code": {"type": "string"},
"lang": {"type": "string"},
},
},
},
}],
)
SDK Options
Python:
pip install agent-sandbox
TypeScript/JavaScript:
npm install @agent-infra/sandbox
Go:
go get github.com/agent-infra/sandbox-sdk-go
Deployment
Docker Compose for production:
version: '3.8'
services:
sandbox:
image: ghcr.io/agent-infra/sandbox:latest
security_opt:
- seccomp:unconfined
shm_size: "2gb"
ports:
- "8080:8080"
environment:
TZ: America/New_York
Kubernetes deployment is documented with resource limits (2Gi memory, 1 CPU recommended).
The Architecture
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β π Browser + VNC β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β π» VSCode Server β π Shell Terminal β π File Ops β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β π MCP Hub + π Sandbox Fusion β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β π Preview Proxy + π Service Monitoring β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Everything shares the same filesystem. Browser downloads land in /home/gem/. Code execution reads from /home/gem/. File APIs write to /home/gem/. One source of truth.
Why Now?
AI agents need to do things, not just talk. They need to browse websites, run code, manipulate files, interact with APIs. The traditional approach β spin up separate sandboxes for each capability β adds integration complexity that compounds as you add features.
AIO Sandbox is the βjust put it all in one containerβ solution. Itβs not clever. Itβs not novel architecture. Itβs just the obvious thing, done well.
Apache 2.0 licensed. Active development. SDKs in three languages.
Links: