HUD Documentation — Evaluations and RL Environments.

Get up and running with HUD in minutes. Follow these four steps to install the CLI and run your first evaluation.

Install uv

HUD uses uv for fast, reliable Python package management.

macOS / Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh

Windows:

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

Install HUD Tool

Install the HUD CLI globally using uv:

uv tool install hud-python@latest --python 3.12
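After both installs, it can help to confirm that the executables are on your PATH. The following is a minimal sketch, not part of HUD: it assumes the installed commands are named uv and hud (as they are invoked in this guide) and uses only the Python standard library. The helper name find_missing_tools is hypothetical.

```python
import shutil


def find_missing_tools(tools=("uv", "hud")):
    """Return the names from `tools` that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("uv and hud are both available.")
```

If a tool is reported missing right after installation, opening a new terminal session is often enough, since installers typically update PATH only for new shells.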

Get and Set API Keys

Get your HUD API Key from hud.ai
Get your Anthropic API Key from console.anthropic.com
Set them using the CLI:

hud set HUD_API_KEY=sk-hud-... ANTHROPIC_API_KEY=sk-ant-...
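Before starting a long run, a script can check that both keys are available. This is a sketch under an assumption: it only inspects environment variables, and this guide does not state whether hud set also exports the keys into your shell environment, so a key reported here as missing may still be visible to the CLI. missing_keys is a hypothetical helper.

```python
import os

# Key names taken from the `hud set` command above.
REQUIRED_KEYS = ("HUD_API_KEY", "ANTHROPIC_API_KEY")


def missing_keys(env=None):
    """Return the required key names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

Calling missing_keys() with no arguments checks the current process environment; passing a dict makes the function easy to test.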

Run Your First Evaluation

Run the standard SheetBench-50 benchmark using Claude:

hud eval hud-evals/SheetBench-50 claude

This will stream results to your terminal and the HUD Dashboard.
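To launch the same evaluation from a Python script, one option is to shell out to the CLI. This is a minimal sketch that assumes only the command form shown above (hud eval followed by a source and an agent name). build_eval_command and run_eval are hypothetical names, and no additional flags are assumed.

```python
import subprocess


def build_eval_command(source, agent="claude"):
    """Build the argv list for `hud eval <source> <agent>`."""
    return ["hud", "eval", source, agent]


def run_eval(source, agent="claude"):
    """Run the evaluation and return the CLI's exit code."""
    completed = subprocess.run(build_eval_command(source, agent), check=False)
    return completed.returncode
```

For example, run_eval("hud-evals/SheetBench-50") invokes the same command as the step above. Passing the arguments as a list avoids shell quoting issues.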

SDK Quick Reference

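import asyncio

import hud

async def main():
    # Run evaluation with the new eval API
    async with hud.eval("hud-evals/SheetBench-50:0") as ctx:
        agent = MyAgent()  # MyAgent is a placeholder for your own agent class
        result = await agent.run(ctx)
        ctx.reward = result.reward

asyncio.run(main())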
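The snippet above does not define MyAgent. As a rough sketch of the shape it relies on, the example below assumes only what that snippet uses: an async run(ctx) method whose result has a reward attribute. FakeContext and AgentResult are hypothetical stand-ins so the example runs without the HUD SDK; they are not HUD classes, and the real context object will expose more than this.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class AgentResult:
    """Hypothetical result type; only `reward` is used by the quickstart."""
    reward: float


class FakeContext:
    """Stand-in for the object yielded by hud.eval (an assumption)."""

    def __init__(self):
        self.reward = None

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False  # do not suppress exceptions


class MyAgent:
    async def run(self, ctx):
        # A real agent would interact with the environment through ctx here.
        return AgentResult(reward=0.0)


async def main():
    async with FakeContext() as ctx:
        agent = MyAgent()
        result = await agent.run(ctx)
        ctx.reward = result.reward
    return ctx.reward


final_reward = asyncio.run(main())
```

Swapping FakeContext() for hud.eval("hud-evals/SheetBench-50:0") gives the same control flow as the quick reference.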

CLI Quick Reference

# Create sample environment
hud init

# Change directory to the environment and build the docker container
cd sample && hud build

# Start hot-reload development server for an environment
hud dev . --build --interactive

# Run a task in your environment
hud eval tasks.json claude

Next Steps

Understand MCP: Learn how agents connect to environments.

Run Benchmarks: Test on SheetBench and OSWorld.

Build Environments: Create your own MCP environment.

GitHub: Star us and contribute!

See the CLI Reference for detailed command documentation.
