AIO Sandbox: Browser, Shell, Files, MCP, and VSCode in One Docker Container for AI Agents

By Prahlad Menon

The agent sandbox problem has been solved wrong for too long. You’d spin up a browser sandbox for web automation, a separate code execution environment for Python, and a third thing for file operations, then spend half your integration time shuffling data between them.

AIO Sandbox fixes this by putting everything in one Docker container: browser, shell, files, MCP servers, VSCode, and Jupyter. One filesystem. One environment. Zero coordination overhead.

docker run --security-opt seccomp=unconfined --rm -it -p 8080:8080 ghcr.io/agent-infra/sandbox:latest

That’s it. You now have:

  • Browser + VNC at /vnc/index.html?autoconnect=true
  • VSCode Server at /code-server/
  • API docs at /v1/docs
  • MCP services at /mcp
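
All four entry points hang off the same port, so they can be derived from a single base URL. A minimal sketch: the paths come from the list above, while the helper name `sandbox_endpoints` and the dictionary keys are illustrative and not part of any SDK.

```python
# Paths as listed above; the helper name and dict keys are hypothetical.
ENDPOINT_PATHS = {
    "vnc": "/vnc/index.html?autoconnect=true",
    "vscode": "/code-server/",
    "api_docs": "/v1/docs",
    "mcp": "/mcp",
}


def sandbox_endpoints(base_url: str = "http://localhost:8080") -> dict:
    """Return an absolute URL for each service the container exposes."""
    base = base_url.rstrip("/")  # tolerate a trailing slash on the base URL
    return {name: base + path for name, path in ENDPOINT_PATHS.items()}


endpoints = sandbox_endpoints()
print(endpoints["vnc"])
```

If you map the container to a different host port, pass that base URL instead of the default.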

Why This Matters

Traditional agent sandboxes are single-purpose. Browser sandbox? Great, but now how do you get that downloaded PDF into your code execution environment? Shell sandbox? Cool, but where’s the browser for web scraping?

AIO Sandbox runs everything in the same container with a shared filesystem. Download a file in the browser → it’s immediately available in shell and Jupyter. Write code in VSCode → execute it via the shell API. No file transfers, no mounting tricks, no coordination logic.

from agent_sandbox import Sandbox

client = Sandbox(base_url="http://localhost:8080")
home_dir = client.sandbox.get_context().home_dir

# Execute shell commands
result = client.shell.exec_command(command="ls -la")
print(result.data.output)

# File operations
content = client.file.read_file(file=f"{home_dir}/.bashrc")

# Browser automation
screenshot = client.browser.screenshot()

Same client, same environment, same filesystem.
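
To see the shared filesystem at work, write a file through one API and read it back through another. The sketch below uses only calls that appear in this post (`get_context`, `exec_command`, `read_file`). The function name is hypothetical, and it assumes `read_file` returns the text under `.data.content`, as in the web-to-Markdown example later in this post.

```python
import shlex


def shell_write_file_read(client, filename: str, text: str) -> str:
    """Write a file with the shell API, then read it back with the file API.

    `client` is expected to behave like the Sandbox client shown above.
    """
    home_dir = client.sandbox.get_context().home_dir
    path = f"{home_dir}/{filename}"

    # shlex.quote keeps spaces and quotes in the payload from breaking the command.
    command = f"printf %s {shlex.quote(text)} > {shlex.quote(path)}"
    client.shell.exec_command(command=command)

    # No copy step: the file API reads what the shell just wrote.
    return client.file.read_file(file=path).data.content
```

Against a running container, `shell_write_file_read(client, "note.txt", "hello")` should hand back the text it just wrote.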

The MCP Integration

Pre-configured MCP servers ship out of the box:

Server      Tools
browser     navigate, screenshot, click, type, scroll
file        read, write, list, search, replace
shell       exec, create_session, kill
markitdown  convert, extract_text, extract_images

Point your MCP client at http://localhost:8080/mcp and you’re connected. No configuration, no server setup.
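
Under the hood, MCP clients speak JSON-RPC 2.0, with `tools/list` and `tools/call` as the two methods that matter for tool use. The sketch below only builds the message bodies so you can see their shape. A real MCP client library also handles the initialization handshake and the HTTP transport. The tool name and argument in the second request are placeholders, so check the output of `tools/list` for the exact names the sandbox exposes.

```python
import itertools
import json
from typing import Optional

_request_ids = itertools.count(1)


def mcp_request(method: str, params: Optional[dict] = None) -> str:
    """Serialize one JSON-RPC 2.0 request, the message format MCP uses."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it exposes.
list_tools = mcp_request("tools/list")

# Call one tool. Name and arguments are placeholders, not confirmed tool names.
call_tool = mcp_request(
    "tools/call",
    {"name": "navigate", "arguments": {"url": "https://example.com"}},
)
```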

Real-World Example: Web to Markdown

Here’s a pattern that shows why unified environments matter. Scrape a webpage, convert it to Markdown, and save it to disk:

import asyncio
import base64
from playwright.async_api import async_playwright
from agent_sandbox import Sandbox

async def site_to_markdown():
    c = Sandbox(base_url="http://localhost:8080")
    home_dir = c.sandbox.get_context().home_dir

    # Browser automation via CDP
    async with async_playwright() as p:
        browser_info = c.browser.get_info().data
        page = await (await p.chromium.connect_over_cdp(browser_info.cdp_url)).new_page()
        await page.goto("https://example.com", wait_until="networkidle")
        html = await page.content()
        screenshot_b64 = base64.b64encode(await page.screenshot()).decode('utf-8')

    # Execute Python in Jupyter (same container!)
    c.jupyter.execute_code(code=f"""
from markdownify import markdownify
# !r embeds the page as a valid string literal, whatever quotes or backslashes it contains
html = {html!r}
screenshot_b64 = "{screenshot_b64}"

md = f"{{markdownify(html)}}\\n\\n![Screenshot](data:image/png;base64,{{screenshot_b64}})"
with open('{home_dir}/site.md', 'w') as f:
    f.write(md)
print("Done!")
""")

    # Read result via file API
    return c.file.read_file(file=f"{home_dir}/site.md").data.content

if __name__ == "__main__":
    result = asyncio.run(site_to_markdown())
    print("Markdown saved!")

Browser → Jupyter → File system. No file transfers. No mounting. Just works.
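
One fragile spot in this pattern is pasting scraped HTML into generated Python source: a page that contains triple quotes or backslashes can break the cell. A minimal sketch of a safer way to build the cell, using `repr()` to produce the string literals. The helper name `build_cell` is hypothetical, and the screenshot is left out to keep it short.

```python
def build_cell(html: str, out_path: str) -> str:
    """Generate Python source that converts `html` and writes it to `out_path`.

    repr() yields a valid Python string literal for any input, so quotes,
    backslashes, and newlines in the page cannot break the generated code.
    """
    return (
        "from markdownify import markdownify\n"
        f"html = {html!r}\n"
        f"with open({out_path!r}, 'w') as f:\n"
        "    f.write(markdownify(html))\n"
    )


cell = build_cell("<h1>It's a '''tricky''' page</h1>", "/home/gem/site.md")
print(cell)
```

The resulting string can be passed to `c.jupyter.execute_code(code=cell)` in place of the inline f-string.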

Agent Framework Integrations

Browser Use

import asyncio

from agent_sandbox import Sandbox
# ChatOpenAI ships with recent browser_use releases; older versions import it from langchain_openai
from browser_use import Agent, ChatOpenAI
from browser_use.browser import BrowserProfile, BrowserSession

sandbox = Sandbox(base_url="http://localhost:8080")
cdp_url = sandbox.browser.get_info().data.cdp_url

# Attach to the browser already running inside the sandbox
browser_session = BrowserSession(
    browser_profile=BrowserProfile(cdp_url=cdp_url, is_local=True)
)

async def main():
    agent = Agent(
        task='Search for "AI agents" on DuckDuckGo',
        llm=ChatOpenAI(model="gpt-4"),
        browser_session=browser_session,
    )
    await agent.run()

asyncio.run(main())

LangChain

from langchain.tools import BaseTool
from agent_sandbox import Sandbox

class SandboxTool(BaseTool):
    # BaseTool is a pydantic model, so overridden fields need type annotations
    name: str = "sandbox_execute"
    description: str = "Execute commands in AIO Sandbox"

    def _run(self, command: str) -> str:
        client = Sandbox(base_url="http://localhost:8080")
        result = client.shell.exec_command(command=command)
        return result.data.output

OpenAI Assistants

from openai import OpenAI
from agent_sandbox import Sandbox

client = OpenAI()  # reads OPENAI_API_KEY from the environment
sandbox = Sandbox(base_url="http://localhost:8080")

def run_code(code, lang="python"):
    if lang == "python":
        return sandbox.jupyter.execute_code(code=code).data
    return sandbox.nodejs.execute_nodejs_code(code=code).data

# Use as a tool in function calling
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "calculate 1+1"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "run_code",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {"type": "string"},
                    "lang": {"type": "string"},
                },
            },
        },
    }],
)
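
The snippet above stops at the request. When the model decides to use the tool, the response carries `tool_calls` whose arguments arrive as a JSON string, and something has to route them to `run_code`. A minimal sketch of that dispatch step; the helper name `dispatch_tool_calls` is hypothetical, and it assumes every handler accepts the arguments as keyword parameters.

```python
import json


def dispatch_tool_calls(message, handlers: dict) -> list:
    """Run each tool call on an assistant message and build the reply messages.

    `message` is `response.choices[0].message` from a chat completion;
    `handlers` maps tool names to callables, e.g. {"run_code": run_code}.
    """
    replies = []
    for call in message.tool_calls or []:
        # The model sends arguments as a JSON-encoded string.
        arguments = json.loads(call.function.arguments)
        result = handlers[call.function.name](**arguments)
        replies.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": str(result),
        })
    return replies
```

Append the assistant message and the returned replies to `messages`, then call the API again so the model can use the result.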

SDK Options

Python:

pip install agent-sandbox

TypeScript/JavaScript:

npm install @agent-infra/sandbox

Go:

go get github.com/agent-infra/sandbox-sdk-go

Deployment

Docker Compose for production:

version: '3.8'
services:
  sandbox:
    image: ghcr.io/agent-infra/sandbox:latest
    security_opt:
      - seccomp:unconfined
    shm_size: "2gb"
    ports:
      - "8080:8080"
    environment:
      TZ: America/New_York

Kubernetes deployment is documented with resource limits (2Gi memory, 1 CPU recommended).
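
However you deploy it, the container takes a moment to start its services, so agent code usually wants to wait before sending the first request. A rough sketch using only the standard library. Treating a 2xx response from `/v1/docs` as "ready" is an assumption, not a documented health check, and both function names are illustrative.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` until it returns True or `attempts` runs out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no point sleeping after the final attempt
    return False
```

A typical call would be `wait_until_ready(lambda: http_ok("http://localhost:8080/v1/docs"))`. Passing the probe in as a callable keeps the retry loop easy to test without a running container.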

The Architecture

┌─────────────────────────────────────────────────────────────┐
│                    🌐 Browser + VNC                         │
├─────────────────────────────────────────────────────────────┤
│  💻 VSCode Server  │  🐚 Shell Terminal  │  📁 File Ops    │
├─────────────────────────────────────────────────────────────┤
│              🔗 MCP Hub + 🔒 Sandbox Fusion                 │
├─────────────────────────────────────────────────────────────┤
│        🚀 Preview Proxy + 📊 Service Monitoring             │
└─────────────────────────────────────────────────────────────┘

Everything shares the same filesystem. Browser downloads land in /home/gem/. Code execution reads from /home/gem/. File APIs write to /home/gem/. One source of truth.

Why Now?

AI agents need to do things, not just talk. They need to browse websites, run code, manipulate files, interact with APIs. The traditional approach, spinning up separate sandboxes for each capability, adds integration complexity that compounds as you add features.

AIO Sandbox is the “just put it all in one container” solution. It’s not clever. It’s not novel architecture. It’s just the obvious thing, done well.

Apache 2.0 licensed. Active development. SDKs in three languages.

