Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Introduction

agsh (agentic shell) is a shell where you type natural language instead of commands. An LLM interprets your instructions and executes them using built-in tools like file operations, search, web access, and shell command execution.

agsh [r] > find all Rust files in this project and count the lines of code

Instead of remembering find . -name '*.rs' | xargs wc -l, you describe what you want and the agent figures out how to do it.

Features

  • Natural language interface – describe what you want instead of memorizing syntax
  • Built-in tools – file read/write/edit, glob search, regex content search (ripgrep), web fetch, web search, shell command execution
  • Scratchpad – session-scoped working memory for the agent to store and retrieve intermediate results
  • Sub-agents – delegate research tasks to read-only sub-agents
  • Multiple LLM providers – OpenAI and Claude, with support for any OpenAI-compatible API
  • MCP support – extend the agent with tools from external MCP servers
  • Permission system – control what the agent can do (none/read/ask/write), switchable mid-session
  • Session management – conversations are persisted in SQLite; resume, export, or compact any session
  • Streaming output – responses stream to the terminal in real time with syntax highlighting
  • Interactive and one-shot modes – use it as a REPL or pipe a single prompt
  • Extended thinking – Claude provider supports extended thinking for complex reasoning

How It Works

  1. You type a natural language instruction
  2. agsh sends it to the configured LLM along with tool definitions and a system prompt
  3. The LLM decides which tools to call (if any) and returns text and/or tool calls
  4. agsh executes the tool calls, feeds results back to the LLM, and repeats until the LLM is done
  5. The final response is rendered as Markdown in the terminal

Installation

agsh is written in Rust and builds as a single binary.

Pre-Built Binaries

Download the latest release for your platform from the GitHub Releases page.

PlatformArchive
Linux (x86_64)agsh-linux-amd64.tar.gz
macOS (Apple Silicon)agsh-macos-arm64.tar.gz
Windows (x86_64)agsh-windows-amd64.zip

Extract the binary and place it somewhere on your $PATH:

# Linux/macOS
tar -xzf agsh-*.tar.gz
cp agsh ~/.local/bin/

Cargo Install

If you have Rust installed, you can install agsh directly from the Git repository:

cargo install --locked --git https://github.com/k4yt3x/agsh.git

This builds the latest version from source and installs it to ~/.cargo/bin/.

Building from Source

Prerequisites

  • Rust (edition 2024, requires Rust 1.85+)
  • A C compiler (for the bundled SQLite)

Build

git clone https://github.com/k4yt3x/agsh.git
cd agsh
cargo build --release

The binary will be at target/release/agsh. Copy it somewhere on your $PATH:

cp target/release/agsh ~/.local/bin/

Verify

agsh --version
agsh --help

Quick Start

1. Run the Setup Wizard

On first launch, agsh automatically starts an interactive setup wizard:

agsh

The wizard will guide you through:

  1. Provider selection — Choose between claude and openai
  2. Authentication — OAuth login (Claude only) or API key entry
  3. Model selection — Enter the model name to use
  4. Base URL — Optionally set a custom API endpoint

The wizard writes your configuration to ~/.config/agsh/config.toml. You can re-run it at any time with agsh setup.

You can also create the config file manually or use environment variables (OPENAI_API_KEY, AGSH_PROVIDER, etc.) and CLI flags (--provider, -m) as overrides. See Configuration for all options.

2. Start Using agsh

After setup, you will see a prompt:

agsh [r] >

You will see a prompt:

agsh [r] >

The [r] indicates read permission mode (the default). The agent can read files and search, but cannot write files or run commands.

3. Ask It Something

agsh [r] > what files are in the current directory?

The agent will use the find_files tool to list files and describe them.

4. Enable Write Mode

Press Shift+Tab to cycle the permission to write mode:

agsh [w] >

Now the agent can execute commands and modify files:

agsh [w] > create a file called hello.txt with the text "hello world"

5. One-Shot Mode

For quick tasks without entering the interactive shell:

agsh "what is my current working directory?"

The process exits after the agent responds.

6. Continue a Previous Session

To pick up where you left off, continue the last session:

agsh -c

Or resume a specific session by its UUID:

agsh -c 550e8400-e29b-41d4-a716-446655440000

See Sessions for more details.

Configuration Overview

The recommended way to configure agsh is with a config file at ~/.config/agsh/config.toml:

[provider]
name = "openai"
model = "gpt-4o"
api_key = "sk-..."

This is all you need to get started. See Config File for the full reference.

Required Settings

agsh requires three settings to function. If any are missing, it prints an error with setup instructions:

SettingConfig KeyEnv VarCLI Flag
Providerprovider.nameAGSH_PROVIDER--provider
Modelprovider.modelAGSH_MODEL-m, --model
API Keyprovider.api_keyOPENAI_API_KEY or CLAUDE_API_KEY

Override Layers

Configuration is layered. Higher-priority layers override lower ones:

  1. CLI flags – per-invocation overrides (--provider, --model, --base-url, -p)
  2. Environment variables – useful for CI, containers, or temporary overrides (AGSH_PROVIDER, etc.)
  3. Config file – persistent settings in ~/.config/agsh/config.toml
  4. Built-in defaults – permission defaults to read, streaming defaults to on

For example, --model gpt-4o-mini on the command line overrides both AGSH_MODEL and provider.model in the config file.

API Key Resolution

The API key environment variable depends on the configured provider:

  • Provider openai: reads OPENAI_API_KEY
  • Provider claude: reads CLAUDE_API_KEY (or CLAUDE_OAUTH_TOKEN for OAuth)

If the environment variable is not set, it falls back to provider.api_key in the config file.

Config File

agsh looks for a TOML configuration file at a platform-specific location:

PlatformPath
Linux~/.config/agsh/config.toml ($XDG_CONFIG_HOME/agsh/config.toml)
macOS~/Library/Application Support/agsh/config.toml
Windows%APPDATA%\agsh\config.toml

The config file is optional. If it does not exist, agsh silently skips it.

Set the AGSH_CONFIG_DIR environment variable to override the default location entirely — the value points at the agsh directory itself (contains config.toml and skills/). Useful for tests, portable installs, and isolating a per-project config from your global one.

Format

[provider]
name = "openai"
model = "gpt-4o"
api_key = "sk-..."
base_url = "https://api.openai.com/v1"

All fields under [provider] are optional individually – you can set some in the config file and override others with environment variables or CLI flags.

Fields

provider.name

The LLM provider to use.

ValueDescription
openaiOpenAI Chat Completions API (also works with OpenAI-compatible APIs)
claudeClaude API (Messages endpoint)

provider.model

The model identifier to send to the provider. Examples:

  • gpt-4o, gpt-4o-mini (OpenAI)
  • claude-sonnet-4-20250514, claude-haiku-4-5-20251001 (Claude)
  • Any model supported by an OpenAI-compatible endpoint

provider.api_key

The API key for authentication. It is recommended to use environment variables (OPENAI_API_KEY or CLAUDE_API_KEY) instead of storing the key in the config file.

provider.oauth_token

OAuth access token for the Claude provider. Can also be set via CLAUDE_OAUTH_TOKEN env var. The token is saved to the database on first use and loaded automatically on subsequent launches.

provider.oauth_token_url

Custom OAuth token refresh endpoint. Defaults to https://console.anthropic.com/v1/oauth/token.

provider.base_url

Custom API base URL. Useful for:

  • Self-hosted models via Ollama (http://localhost:11434/v1)
  • OpenRouter (https://openrouter.ai/api/v1)
  • Other OpenAI-compatible API providers

If not set, defaults to:

  • https://api.openai.com/v1 for the openai provider
  • https://api.anthropic.com for the claude provider

provider.reasoning_effort

Reasoning effort level for OpenAI o-series models. When set, the reasoning_effort parameter is included in API requests and max_completion_tokens is used instead of max_tokens.

Accepted values: low, medium, high. Omitted by default.

[provider]
reasoning_effort = "medium"

Examples

OpenAI

[provider]
name = "openai"
model = "gpt-4o"
# API key via env: export OPENAI_API_KEY=sk-...

Claude

[provider]
name = "claude"
model = "claude-sonnet-4-20250514"
# API key via env: export CLAUDE_API_KEY=sk-ant-api03-...
# Or OAuth token via env: export CLAUDE_OAUTH_TOKEN=sk-ant-oat01-...

Ollama (local)

[provider]
name = "openai"
model = "llama3"
api_key = "unused"
base_url = "http://localhost:11434/v1"

OpenRouter

[provider]
name = "openai"
model = "anthropic/claude-sonnet-4-20250514"
base_url = "https://openrouter.ai/api/v1"
# API key via env: export OPENAI_API_KEY=sk-or-...

[display]

Settings for output formatting.

display.render_mode

Output render mode. Equivalent to the --render-mode CLI flag.

ValueDescription
batSyntax-highlighted markdown via bat (default)
termimadTerminal formatting via termimad (box-drawn code blocks, reflowed paragraphs). Alias: rich
rawRaw markdown printed verbatim with aligned tables

Default: bat

[display]
render_mode = "raw"

display.show_session_id_on_create

Whether to display the session ID when a new session is created.

Default: false

display.show_session_id_on_exit

Whether to display the session ID when agsh exits.

Default: true

[display]
show_session_id_on_create = true
show_session_id_on_exit = false

display.show_path_in_prompt

Whether to show the current working directory in the interactive prompt.

Default: true

display.newline_before_prompt

Whether to add a blank line before the prompt after each agent response.

Default: true

display.newline_after_prompt

Whether to add a blank line after the prompt (before the agent response).

Default: true

display.input_style

Visual style applied to text typed into the REPL prompt. Makes submitted prompts easy to spot when scrolling back through a long session — reedline paints the buffer with this style on every repaint, including the final paint before the newline, so the styling lands in the terminal’s scrollback alongside the literal text.

Accepted values:

  • default (or unset): bold white-ish foreground on a slate-blue background, rendered in truecolor RGB so it looks the same across terminal themes.
  • none: disable styling entirely.
  • reverse: reverse video (swaps the terminal’s current foreground and background).
  • bold, dim, italic, underline: single attribute, no colour change.
  • A colour name (black, red, green, yellow, blue, magenta / purple, cyan, white): set only the foreground, mapped to the terminal’s palette.

Unknown values warn at startup and fall back to default.

Default: the banner preset described above.

[display]
show_path_in_prompt = false
newline_before_prompt = false
newline_after_prompt = false
input_style = "none"    # or "cyan", "bold", "dim", etc.

[web]

Settings for the HTTP client shared by fetch_url and web_search. All keys are optional; unset fields use the defaults shown below.

KeyTypeDefaultPurpose
user_agentstringReal Chrome UASome search engines block non-browser UAs. Override if you need a specific identifier.
request_timeout_secondsint30Total request budget (connect + TLS + read). 0 falls back to the default.
connect_timeout_secondsintunsetSeparate cap on TCP + TLS handshake. Fail fast on unreachable hosts without shortening the whole request budget.
read_timeout_secondsintunsetPer-chunk idle timeout. Catches bodies that stall mid-stream.
max_redirectsint10Cap on 3xx hops. 0 disables redirects entirely.
proxystringunset (honours HTTP_PROXY / HTTPS_PROXY / ALL_PROXY env)Proxy URL. Schemes: http://, https://, socks5://, socks5h://, socks4://. The literal string "none" explicitly disables env-var auto-detection.
ca_cert_filepathunsetExtra PEM bundle to trust on top of the system store. Useful for corporate MITM proxies or self-signed internal services. Accepts single-cert and multi-cert files.
https_onlyboolfalseRefuse plain http:// URLs.
min_tls_versionstringunset (reqwest default)Minimum TLS version. Accepts "1.0", "1.1", "1.2", "1.3". Unknown values log a warn and fall through. Note: the bundled rustls backend supports only TLS 1.2 and 1.3 — "1.0" / "1.1" will surface a build error.
danger_accept_invalid_certsboolfalseDANGEROUS. Disable TLS certificate validation entirely. Emits a warn! on every startup when enabled. Only use against trusted local dev servers.
danger_accept_invalid_hostnamesboolfalseDANGEROUS. Accept certificates whose hostname doesn’t match. Emits a warn! on every startup when enabled. Only use against trusted local dev servers.

Example: corporate proxy with a private CA

[web]
proxy = "http://corp-proxy.internal:3128"
ca_cert_file = "/etc/ssl/corp-root-ca.pem"
min_tls_version = "1.2"
request_timeout_seconds = 60

Example: local testing against self-signed certs

[web]
# Route everything through a local SOCKS proxy you control.
proxy = "socks5h://127.0.0.1:1080"
# Accept self-signed certs on dev.local — KEEP THIS OFF IN PROD.
danger_accept_invalid_certs = true

Example: fail-fast timeouts

[web]
request_timeout_seconds = 5
connect_timeout_seconds = 2
max_redirects = 0

[shell]

Settings for shell command execution.

shell.sandbox

Whether to enable read-only filesystem sandboxing for shell commands in read mode. When enabled (default), shell commands can be executed in read mode but with the filesystem physically write-protected. When disabled, shell commands require write mode.

Default: true

[shell]
sandbox = false  # disable sandboxed shell in read mode

The sandbox uses Landlock on Linux (kernel 5.13+) and sandbox-exec on macOS. On platforms where sandboxing is unavailable, shell commands always require write mode regardless of this setting.

[session]

Settings for session history retention and context window management.

session.context_messages

Maximum number of messages to send to the LLM API per request. Older messages are truncated from the beginning while preserving tool call chain integrity. The full history remains stored in SQLite – only the API payload is limited.

Default: 200

[session]
context_messages = 100

session.retention_days

Automatically delete sessions older than this many days on startup. Uses the session’s updated_at timestamp, so actively-resumed sessions are preserved even if originally created long ago.

Default: 90

[session]
retention_days = 30

session.max_storage_bytes

Maximum total byte size of all stored message content across all sessions. When exceeded on startup, the oldest sessions are deleted until the total is under the limit.

Default: 52428800 (50 MB)

[session]
max_storage_bytes = 10485760  # 10 MB

session.auto_compact

Automatically compact the conversation when input tokens exceed 80% of the context window. Compaction summarizes older messages and preserves recent ones, the todo list, and scratchpad entries.

Default: true

[session]
auto_compact = false

session.context_window

Override the model’s context window size (in tokens). Used for auto-compact threshold calculation. If not set, agsh infers the context window from the model name.

[session]
context_window = 200000

[thinking]

Settings for extended thinking (Claude provider only). Claude 4.6+ models use adaptive thinking automatically; older models use a fixed token budget.

thinking.enabled

Whether to enable extended thinking. When enabled, the model can use additional tokens for internal reasoning before responding.

Default: true

thinking.budget_tokens

Maximum number of tokens the model can use for thinking (for non-adaptive models).

Default: 16000

[thinking]
enabled = true
budget_tokens = 20000

[prompt]

Settings for injecting custom instructions into the system prompt. Use this to set installation-specific rules that should apply to every session – things the agent needs to know about your system, preferred tools, or policies.

prompt.instructions

A string of custom instructions that agsh will include in every system prompt, under a ## User Instructions section. The model is told to treat them as hard constraints unless they conflict with safety requirements.

Suitable use cases:

  • System-specific policies: “Never install Python packages globally with pip – always use uv or a venv.”
  • Installed tooling the agent should know about: “Poppler is available on this system – use pdftotext for PDFs.”
  • Workflow preferences: “Prefer ripgrep over grep; it’s installed and faster.”
  • Signing / compliance rules: “Git commits on this system must use gpg signing.”

Default: unset (no custom instructions).

[prompt]
instructions = """
Never install Python packages globally with pip. Always use `uv` or a venv.
Poppler is available on this system — use `pdftotext` for PDFs.
Prefer ripgrep over grep.
"""

Notes:

  • Empty or whitespace-only strings are treated as unset.
  • Instructions apply to sub-agents spawned via spawn_agent too.
  • Instructions are included at all permission levels (including none) because they are authored by you.

[mcp]

Settings for MCP (Model Context Protocol) tool servers. MCP allows agsh to discover and use tools provided by external servers.

[[mcp.servers]]

An array of MCP server configurations. Each entry defines a server to connect to at startup.

FieldRequiredDescription
nameYesUnique name for this server. Used as namespace prefix for tools (name__tool). Must match [A-Za-z0-9_-]+, must not contain __, and must not be agsh, ide, or start with mcp_.
transportYesTransport type: "stdio" (spawn subprocess) or "http" (streamable HTTP).
commandStdio onlyPath or name of the executable to spawn. On Windows, npx / .cmd / .bat / .ps1 are auto-wrapped in cmd /c.
argsNoArguments to pass to the command.
envNoEnvironment variables to set for the spawned process (stdio only).
urlHTTP onlyURL of the MCP server endpoint.
auth_tokenNoBearer token for HTTP authentication (sent as Authorization: Bearer <token>).
authNoOAuth authentication configuration (see below). Mutually exclusive with auth_token.
headersNoCustom HTTP headers to include with every request (HTTP only).
headers_helperNoPath to an executable whose stdout (Name: Value\n lines) is merged over headers at connect-time (HTTP only). Executed with AGSH_MCP_SERVER_NAME / AGSH_MCP_SERVER_URL in env; 15 s timeout.
permissionNoServer-wide permission override. Applies to every tool on this server, beating the readOnlyHint the server advertises and the [mcp].default_permission global fallback. See Permission resolution below.
allowed_toolsNoOptional allow-list of raw tool names (the form the server advertises, not the server__tool namespaced form). When set and non-empty, only these tools are registered; all others from this server are ignored.
disabled_toolsNoOptional block-list of raw tool names. Applied after allowed_tools — tools listed here are never registered. Both lists can coexist; the net set is allowed_tools \ disabled_tools.
tool_permissionsNoPer-tool permission overrides keyed by raw tool name. Beats the server-level permission and the server’s readOnlyHint when resolving a tool’s required permission.
samplingNoAllow this server to call sampling/createMessage against your configured LLM provider. Default false (reject). Enabling this lets a compromised server inject arbitrary messages into your LLM context and burn your provider quota — opt in per-server, deliberately.
sampling_limitNoCap on sampling calls per agsh session from this server when sampling = true. Default 10. Requests beyond the limit return an INTERNAL_ERROR to the server.
disabledNoWhen true, the server is skipped entirely at startup — no process is spawned, no HTTP connect is attempted. Flip it back with agsh mcp enable <name> or by editing the config. Defaults to false.

[mcp] top-level table

FieldPurpose
default_permissionFallback permission for MCP tools whose server didn’t advertise readOnlyHint and doesn’t have a permission override. Accepts "none", "read", "ask", or "write". If unset the hardcoded fallback is "write" (strict).
strictWhen true (default), every turn is gated on all enabled MCP servers being Connected. If any are not, the turn is rejected with a shell-style error instead of sending the request to the model. Set to false to proceed with whichever servers are ready (a warn log names the missing ones).
grace_secondsPer-turn cap on how long to wait for still-Pending servers to connect before applying the strict check. Default 3. Set to 0 to skip waiting (useful for scripts that want to fail fast).
connect_timeout_secondsPer-server timeout for connect + initialize + list_tools. A hung stdio spawn or slow HTTPS handshake can’t stall the whole fleet past this bound. Default 30.

Startup concurrency

MCP servers connect in parallel at startup, partitioned by transport so a fleet of stdio servers (process-spawn bound) doesn’t fight a fleet of HTTP servers (network bound):

  • stdio: AGSH_MCP_STDIO_CONCURRENCY (default 3)
  • http: AGSH_MCP_HTTP_CONCURRENCY (default 20)

These env vars are tuning knobs — rarely needed, but useful if you’re running ~30 stdio servers on a constrained box (lower it) or ~50 HTTP servers (raise it).

Permission resolution

Every MCP tool’s required permission is resolved through a five-step chain; the first match wins:

  1. server.tool_permissions[<raw-tool>] — explicit per-tool override.
  2. server.permission — explicit server-level override. Applies to every tool on that server regardless of what the server advertises.
  3. tool.annotations.readOnlyHint from the server: trueRead, falseWrite.
  4. [mcp].default_permission — global fallback.
  5. Hardcoded Write — strict ultimate fallback.

User-supplied config (1, 2, 4) always beats the server’s self-classification — if a server lies about a tool, you can override. But when no user config says anything, the server’s hint is trusted for that specific tool so readOnlyHint = false destructive tools don’t silently become Read-accessible just because the user opted into a lenient global default.

Hint spoofing: a compromised server could claim readOnlyHint = true on a destructive tool. Defend by setting server.permission = "write" on suspect servers (step 2 wins) or by listing the destructive tools explicitly in tool_permissions / disabled_tools.

Stale config: entries in allowed_tools / disabled_tools / tool_permissions that don’t match any advertised tool get a warn! line at connect time. The server still connects; you just see a heads-up so you can clean up after the server renames a tool.

Visibility across levels: the resolved permission doesn’t hide a tool from the agent. Every registered tool is listed in the system prompt with its required level noted inline, and a per-turn [Permission context] block names the current level plus any tools it blocks. The agent can still reason about an inaccessible tool and suggest /permission <level> to enable it; the permission gate is enforced at dispatch time. Keeping the tool catalogue visible across levels is also what lets the Anthropic prompt cache survive mid-session permission toggles.

Examples

Exa — reliable web search when the built-in DuckDuckGo scraper gets CAPTCHA’d. The free tier works without an API key; paste a key into the headers table for the paid tier:

# Free tier — no key required
agsh mcp add exa https://mcp.exa.ai/mcp
# Paid tier — expands from EXA_API_KEY at connect time
agsh mcp add exa https://mcp.exa.ai/mcp --header "x-api-key=${EXA_API_KEY}"

Well-annotated server — no config needed. Every tool is classified by its own readOnlyHint (read tools Read, write tools Write):

[[mcp.servers]]
name = "notion"
transport = "http"
url = "https://mcp.notion.com/mcp"

User-declared trust on an unannotated server — all tools accessible in Read:

[[mcp.servers]]
name       = "internal"
transport  = "http"
url        = "https://mcp.internal/…"
permission = "read"

Overriding a mis-annotated or distrusted tool — one specific tool requires Write:

[[mcp.servers]]
name      = "notion"
transport = "http"
url       = "https://mcp.notion.com/mcp"

[mcp.servers.tool_permissions]
"notion-do-something-scary" = "write"

Subset of a server’s tools — only query registers, all others are ignored:

[[mcp.servers]]
name          = "pg"
transport     = "stdio"
command       = "npx"
args          = ["-y", "@modelcontextprotocol/server-postgres"]
allowed_tools = ["query"]

Block-list with a narrow exception — all fs tools are Read-accessible except the two destructive ones, which are never registered:

[[mcp.servers]]
name           = "filesystem"
transport      = "stdio"
command        = "npx"
args           = ["-y", "@modelcontextprotocol/server-filesystem"]
permission     = "read"
disabled_tools = ["delete_file", "move_file"]

MCP tools are registered with namespaced names in the format servername__toolname to prevent collisions with built-in tools or between servers.

Tool and resource descriptions returned from MCP servers are truncated at 2048 characters to keep the system prompt bounded.

[tools] — built-in tool filters

The three knobs [[mcp.servers]] exposes for MCP tools also apply to agsh’s built-in tools (read_file, write_file, execute_command, web_search, etc.) via a top-level [tools] table. MCP per-server filtering is separate from this and keeps its own namespaces — this block only affects the built-ins.

KeyPurpose
allowed_toolsOptional allow-list of built-in tool names. When set and non-empty, only these built-ins register. Use agsh tools list to see the canonical names.
disabled_toolsBlock-list of built-in tool names. Applied after allowed_tools; a tool here is never registered even if it also appears in the allow-list.
tool_permissionsPer-tool required-permission override keyed by built-in name. Beats the hardcoded required level from the tool’s impl. Levels: none, read, ask, write.

Stale entries (a name that doesn’t match any built-in) emit a warn! at startup. agsh still starts — the warning just flags a likely typo or a tool the binary renamed.

Restrict a session to read-only inspection:

[tools]
allowed_tools = ["read_file", "find_files", "search_contents", "fetch_url"]

Force execute_command to need write so ask mode prompts for every shell call:

[tools.tool_permissions]
execute_command = "write"

Disable web access entirely in a locked-down environment:

[tools]
disabled_tools = ["web_search", "fetch_url"]

Sub-agents spawned via spawn_agent inherit the same filter — a disabled built-in is disabled everywhere. Run agsh tools list to see every built-in’s effective required permission, whether a [tools.tool_permissions] override is in effect, and whether the current config enables it.

Environment variable substitution

Every string field listed above (command, args, env values, url, headers values, auth_token) supports ${VAR} and ${VAR:-default} expansion from the process environment. Missing variables with no default leave the literal ${VAR} in place and log a warning at startup. Use this to avoid committing secrets:

[[mcp.servers]]
name = "github"
transport = "http"
url = "https://mcp.github.com"
auth_token = "${GITHUB_MCP_TOKEN}"

Environment variables

VariableDefaultPurpose
AGSH_MCP_TOOL_TIMEOUT600000 ms (600 s)Per-call timeout for MCP tools. Triggers notifications/cancelled on expiry.

agsh mcp CLI

Manage configured servers without editing config.toml by hand:

CommandAction
agsh mcp listPrint all configured servers.
agsh mcp get <name>Print full details for one server.
agsh mcp add <name> <url-or-command> [args...] [flags]Persist a server. Transport is auto-detected: a URL starting with http[s]:// means HTTP, anything else means stdio. Preserves existing formatting/comments via toml_edit.
agsh mcp remove <name>Best-effort revoke stored OAuth tokens (RFC 7009) at the provider, then delete the server entry, clear stored credentials, and drop any resource-update ledger entries.
agsh mcp disable <name>Set disabled = true on the server entry. The next agsh start skips it entirely.
agsh mcp enable <name>Clear the disabled flag, so the server connects on the next start.
agsh mcp reconnect <name>Smoke-test a connect; prints ok or the error.
agsh mcp tools <name>Connect and list every advertised tool with its resolved permission, the chain step that decided it, and whether the current config allows it. Useful for populating --allow-tool, --disable-tool, or --tool-permission overrides without leaving the CLI.
agsh mcp login <name>Drive interactive OAuth. If the server has no [auth] block and uses HTTP, assumes type = "oauth" and persists the block on success.
agsh mcp logout <name>Call the provider’s revocation_endpoint (RFC 7009) best-effort, then clear stored credentials + auth-probe cache.

agsh mcp add flags

FlagPurpose
--transport <stdio|http>Override the auto-detected transport.
--env KEY=VALUEEnvironment variable for stdio (repeatable).
--header KEY=VALUEHTTP header (repeatable).
--auth <oauth|client-credentials|client-credentials-jwt>Configure the [auth] block.
--auth-token <TOKEN>Static bearer token. Mutually exclusive with --auth.
--client-id, --client-secretOAuth / client-credentials client identifiers.
--signing-key <PATH>, --signing-algorithm <ALG>JWT signing material (client-credentials-jwt only).
--scope <SCOPE>OAuth scope (repeatable).
--redirect-port <PORT>Fixed OAuth redirect port (default: ephemeral).
--permission <none|read|ask|write>Per-server permission cap (applies to all tools on the server).
--allow-tool <NAME>Raw tool name to allow (repeatable). When set, only listed tools register.
--disable-tool <NAME>Raw tool name to block (repeatable). Applied after --allow-tool.
--tool-permission <NAME=LEVEL>Per-tool permission override (repeatable). LEVEL is none/read/ask/write.
--sampling, --sampling-limit <N>Opt into server-initiated sampling/createMessage.

Example: Notion

$ agsh mcp add notion https://mcp.notion.com/mcp
ok: added 'notion' to ~/.config/agsh/config.toml
probe: server requires OAuth.
running OAuth authorisation for 'notion' (use --no-login to skip).
no [auth] block for 'notion' — assuming OAuth authorization_code.
…
ok: authorized 'notion'

agsh mcp add on an HTTP endpoint:

  1. Probe — issues an unauthenticated GET (3 s timeout, redirects off) and classifies the response per the MCP authorization spec + RFC 6750 + RFC 9728:

    • 2xx → server is open, no login needed.
    • 401 / 403 with WWW-Authenticate: Bearer … → OAuth required. The resource_metadata="…" attribute (RFC 9728) is captured at DEBUG.
    • Any other status → couldn’t infer, prints the status code.
    • Network failure → prints the error.
  2. Auto-login — if the probe says OAuth is required (or --auth oauth was explicitly set), the OAuth authorization_code flow runs immediately as though the user had chained agsh mcp login <name> themselves. The synthesised [auth] = oauth block is written back to config.toml on success.

  3. Rollback on failure — if the OAuth flow errors out, the entry we just wrote is purged from config.toml (alongside any partial credentials + probe cache), leaving the user’s config clean. The command exits non-zero.

  4. --no-login — skips step 2. The entry is still persisted and the probe’s hint is still printed; run agsh mcp login <name> when ready. Useful for scripted setup or when you expect to edit [auth] by hand.

The probe and the auto-login only run for HTTP servers, and only when the user didn’t provide --auth-token (static bearer) or --auth (other than oauth). Stdio servers skip both.

Remote hosts / SSH sessions

The OAuth flow redirects the browser to http://127.0.0.1:<port>/callback. When agsh is running on a different host than the browser (SSH session, container, Codespace, WSL), the browser can’t reach back and shows a “connection refused” error page. agsh handles this automatically:

  • While agsh mcp login <name> waits for the callback it also watches stdin.
  • The browser’s address bar still contains the full callback URL (including code and state) even when the connection fails. Copy it, paste it into the agsh prompt, and press Enter.
  • Whichever completes first — the TCP callback or the pasted URL — wins.
$ agsh mcp login notion
server 'notion' has no [auth] block; assuming OAuth authorization_code.
Opening browser for MCP server 'notion' OAuth authorization...
If the browser didn't open, visit:
  https://mcp.notion.com/authorize?response_type=code&…
Waiting for OAuth callback (up to 120s).
  If the browser can't reach this host (e.g. you're over SSH), paste the full
  callback URL here and press Enter.
http://127.0.0.1:46437/callback?code=…&state=…     ← paste here
ok: authorized 'notion'

REPL parity

Inside the REPL:

  • /mcp list — list configured servers.
  • /mcp reconnect <server> — reconnect smoke-test.
  • /mcp login <server> / /mcp logout <server> — run the auth flow or revoke.
  • /mcp <server>:<prompt> [args...] — render a server-defined prompt as the next user turn.

Resources and prompts

In addition to tools, agsh exposes MCP resources and prompts through four builtin tools (deferred — the agent activates them when needed):

BuiltinPurpose
list_mcp_resourcesList resources from one or every configured server.
read_mcp_resourceRead a resource by server + uri; text inline, binary base64-encoded.
list_mcp_promptsList prompts from one or every configured server, including their declared arguments.
get_mcp_promptRender a prompt by server + name with optional arguments; returns <role>: <text> lines.
subscribe_mcp_resourceSubscribe to resources/updated notifications for a specific URI.
unsubscribe_mcp_resourceCancel a prior subscription.
list_mcp_resource_updatesPrint every resource that has been reported as updated since the session started.

Connection lifecycle

  • Reconnection is automatic for all transports (stdio, plain HTTP, OAuth-authenticated HTTP) when the transport closes mid-session. HTTP transports use exponential backoff (1s, 2s, 4s, 8s, 16s, capped 30s, max 5 attempts); stdio gets one immediate retry. The reconnect runs on a blocking thread to work around an upstream rmcp bug where the auth future is !Send.
  • Session-expired recovery: rmcp 1.5 transparently re-initialises HTTP sessions on 404 / JSON-RPC -32001. agsh relies on this; no per-call handling is required.
  • Cancellation: when the agent cancels a tool call (e.g. Ctrl-C), agsh sends notifications/cancelled to the server with the in-flight request id so the server can stop work.
  • Timeouts: tool calls default to 600 s; override with AGSH_MCP_TOOL_TIMEOUT in ms.
  • Tool list refresh: on tools/list_changed, agsh re-discovers the server’s tools and hot-swaps them in the registry — no restart needed.
  • Progress notifications: MCP tool calls attach a per-request progressToken; incoming notifications/progress render as a live status line under the tool invocation.
  • Server instructions: InitializeResult.instructions is captured once per connection and spliced into the system prompt (sanitised + truncated to 2048 chars) under ## MCP Server Instructions.
  • Auth-probe cache: 401 responses are cached for 15 minutes so a restart after a failed auth flow skips the unauthenticated probe and goes straight to OAuth. Cleared by agsh mcp logout.
  • resources/list_changed, prompts/list_changed, and resources/updated notifications are logged at info/debug level.

Server-to-client features

Featureagsh behaviour
roots/listReturns a single root: file://<current-working-directory> with the directory basename as the name.
elicitation/createAlways responds with Decline and logs a warning — interactive form/URL input is not wired into the REPL.
sampling/createMessageRejected with METHOD_NOT_FOUND unless the server has sampling = true in its config. When allowed, the current provider handles the request; per-session sampling_limit caps how many times each server may invoke it.

[mcp.servers.auth]

OAuth authentication for HTTP MCP servers. Set type to choose the authentication method. This is mutually exclusive with auth_token.

FieldRequiredDescription
typeYesAuth method: "client_credentials", "client_credentials_jwt", or "oauth"
client_idVariesOAuth client ID (required for client_credentials/jwt, optional for oauth with dynamic registration)
client_secretVariesClient secret (required for client_credentials, optional for oauth)
scopesNoOAuth scopes to request
resourceNoResource parameter (RFC 8707), client_credentials only
signing_key_pathJWT onlyPath to PEM private key file
signing_algorithmNoJWT signing algorithm: RS256 (default), RS384, RS512, ES256, ES384
redirect_portNoLocal port for OAuth authorization code callback. When omitted, agsh binds to a random ephemeral port (recommended). oauth only.

Examples

Stdio server

[[mcp.servers]]
name = "postgres"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]
permission = "write"

HTTP server

[[mcp.servers]]
name = "web-tools"
transport = "http"
url = "http://localhost:8080/mcp"
permission = "read"

HTTP server with authentication

[[mcp.servers]]
name = "api"
transport = "http"
url = "https://api.example.com/mcp"
auth_token = "your-bearer-token"
permission = "write"

[mcp.servers.headers]
X-Custom-Header = "value"

Stdio server with environment variables

[[mcp.servers]]
name = "github"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
permission = "read"

[mcp.servers.env]
GITHUB_TOKEN = "ghp_..."

Multiple servers

[[mcp.servers]]
name = "filesystem"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/projects"]
permission = "read"

[[mcp.servers]]
name = "github"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
permission = "write"

HTTP server with OAuth client credentials

[[mcp.servers]]
name = "api"
transport = "http"
url = "https://api.example.com/mcp"
permission = "write"

[mcp.servers.auth]
type = "client_credentials"
client_id = "my-client-id"
client_secret = "my-client-secret"
scopes = ["read", "write"]

HTTP server with JWT client credentials

[[mcp.servers]]
name = "api"
transport = "http"
url = "https://api.example.com/mcp"

[mcp.servers.auth]
type = "client_credentials_jwt"
client_id = "my-client-id"
signing_key_path = "/path/to/private-key.pem"
signing_algorithm = "RS256"
scopes = ["admin"]

HTTP server with OAuth authorization code flow

On first connection, agsh opens a browser for authorization and stores the token for future use.

[[mcp.servers]]
name = "github-mcp"
transport = "http"
url = "https://mcp.example.com"

[mcp.servers.auth]
type = "oauth"
client_id = "my-app-id"
scopes = ["repo", "user"]
redirect_port = 8400

If client_id is omitted, agsh attempts dynamic client registration with the server.

Environment Variables

The config file is the recommended way to configure agsh. Environment variables are useful as overrides – for example, in CI pipelines, containers, or when you want to temporarily switch providers without editing your config.

Environment variables override config file values but are overridden by CLI flags.

agsh-Specific Variables

VariableDescriptionExample
AGSH_PROVIDERLLM provider nameopenai, claude
AGSH_MODELModel identifiergpt-4o, claude-sonnet-4-20250514
AGSH_PERMISSIONDefault permission modenone, read, write
AGSH_CONFIG_DIROverride the default config directory. Points at the agsh directory itself (contains config.toml and skills/). The only isolation knob that works on every platform — dirs::config_dir() ignores $XDG_CONFIG_HOME on macOS/Windows./tmp/agsh-test/agsh

MCP Variables

VariableDescriptionDefault
AGSH_MCP_TOOL_TIMEOUTPer-call timeout for MCP tools, in milliseconds. Applies to every remote tool invocation; on timeout agsh cancels the request and returns an error to the model.600000 (600s)

Provider API Keys

VariableUsed When
OPENAI_API_KEYProvider is openai
CLAUDE_API_KEYProvider is claude

OAuth Authentication

VariableDescription
CLAUDE_OAUTH_TOKENOAuth access token for the Claude provider

OAuth tokens (with sk-ant-oat01- prefix) are also auto-detected when passed via CLAUDE_API_KEY.

On first use, the OAuth token is saved to the database and loaded automatically on subsequent launches. Setting the env var again replaces the stored token.

Provider Base URL

VariableDescription
OPENAI_BASE_URLCustom base URL for the OpenAI-compatible endpoint

Logging

agsh uses the tracing framework. The log level can be controlled with:

VariableDescriptionExample
RUST_LOGStandard Rust log filteragsh=debug, agsh=trace

If RUST_LOG is not set, the verbosity flag (-v, -vv, -vvv) controls the level:

FlagLevel
(none)warn
-vinfo
-vvdebug
-vvvtrace

Logs are written to stderr so they do not interfere with agent output.

CLI Options

agsh [OPTIONS] [PROMPT]
agsh <COMMAND>

Commands

setup

Run the interactive configuration wizard. Prompts for provider, authentication, model, and base URL, then writes the configuration to ~/.config/agsh/config.toml.

agsh setup

This wizard also runs automatically on first launch when no config file exists.

export

Export a session as Markdown.

agsh export <SESSION_ID> [-o <PATH>]

Use -o - to print to stdout. See Sessions for details.

delete

Delete one or more sessions by UUID, or all sessions with --all.

agsh delete <SESSION_ID>...
agsh delete --all

list

List past sessions with ID, last update time, and a preview.

agsh list [-n <LIMIT>]

Default limit: 20.

Arguments

[PROMPT]

Run a one-shot prompt and exit. The agent processes the prompt, prints its response, and the process terminates.

agsh "list all files larger than 1MB in the current directory"

When omitted, agsh starts in interactive mode.

Options

-c, --continue [SESSION_ID]

Resume a session. Without a session ID, resumes the most recently updated session. With a session ID, resumes that specific session.

agsh -c                                          # resume last session
agsh -c 550e8400-e29b-41d4-a716-446655440000     # resume specific session

Errors if the session does not exist or is locked by another agsh instance.

--permission <MODE>

Set the initial permission mode. Accepts none (or n), read (or r), ask (or a), write (or w).

agsh --permission write
agsh --permission ask

Default: read.

--provider <NAME>

Set the LLM provider. Overrides AGSH_PROVIDER and the config file.

agsh --provider claude

Supported values: openai, claude.

-m, --model <MODEL>

Set the model name. Overrides AGSH_MODEL and the config file.

agsh -m gpt-4o-mini

--base-url <URL>

Set a custom API base URL. Overrides OPENAI_BASE_URL and the config file.

agsh --base-url http://localhost:11434/v1

--no-stream

Disable streaming mode. The agent waits for the complete response before displaying it. By default, responses are streamed token-by-token.

agsh --no-stream

--render-mode <MODE>

Set the output render mode. Accepts bat (default), termimad (or rich), or raw.

  • bat: Syntax-highlighted markdown output via bat.
  • termimad: Full terminal formatting (box-drawn code blocks, reflowed paragraphs, formatted tables).
  • raw: Raw markdown printed verbatim with aligned tables.
agsh --render-mode raw

Can also be set permanently via display.render_mode in the config file.

--thinking

Enable extended thinking (Claude provider only).

agsh --thinking

--thinking-budget <TOKENS>

Set the extended thinking token budget. Implies --thinking.

agsh --thinking-budget 20000

-v, --verbose

Increase log verbosity. Can be repeated up to three times.

agsh -v      # info
agsh -vv     # debug
agsh -vvv    # trace

--help

Print help information.

--version

Print version information.

Interactive Mode

Start agsh without the -p flag to enter interactive mode:

agsh

You get a prompt:

agsh [r] >

Type your instruction and press Enter to submit. The agent processes your request and prints its response (streamed in real time as Markdown). When it finishes, you get another prompt.

Keybindings

agsh uses Emacs-style keybindings (provided by reedline).

Input

KeyAction
EnterSubmit the current prompt
Alt+EnterInsert a newline (for multi-line input)
Shift+TabCycle the permission mode (none → read → ask → write → none)
KeyAction
Ctrl+AMove cursor to start of line
Ctrl+EMove cursor to end of line
Ctrl+FMove cursor forward one character
Ctrl+BMove cursor backward one character
Alt+FMove cursor forward one word
Alt+BMove cursor backward one word

Editing

KeyAction
Ctrl+DDelete character under cursor / exit on empty line
Ctrl+H, BackspaceDelete character before cursor
Ctrl+KKill text from cursor to end of line
Ctrl+UKill text from start of line to cursor
Ctrl+WKill word before cursor
Ctrl+YYank (paste) killed text

Control

KeyAction
Ctrl+CInterrupt the running agent; clear the line if idle
Ctrl+DExit the shell (when the line is empty)
Ctrl+RReverse incremental search through history
Ctrl+LClear the screen

Prompt Format

agsh [indicator] >

The indicator shows the current permission mode:

ModeIndicatorColor
None[n]Green
Read[r]Yellow
Ask[a]Magenta
Write[w]Red

The color provides a visual cue about the agent’s current capabilities. Red means the agent can modify your system.

Multi-Line Input

Press Alt+Enter to insert a newline instead of submitting. The prompt changes to show continuation:

agsh [r] > write a python script that
  ... prints hello world
  ... and saves it to hello.py

Press Enter on the last line to submit the entire multi-line input.

Pasting multi-line content also works seamlessly — all pasted lines appear in the buffer for review, and you press Enter to submit.

Slash Commands

agsh supports / prefix commands for controlling the shell:

CommandDescription
/helpShow available commands
/exitExit the shell
/clearClear the terminal screen
/sessionShow the current session ID
/permission [none|read|ask|write]Show or set the permission level
/compactSummarize and compact the session history
/cd [path]Change working directory
/mcp listList configured MCP servers with their live state (pending / connected / failed / disabled)
/mcp reconnect <server>Smoke-test connect for one server
/mcp login <server>Run the OAuth flow from the REPL
/mcp logout <server>Revoke cached credentials for a server
/mcp <server>:<prompt> [args...]Render a server-defined prompt and send it to the agent

/compact

The /compact command asks the LLM to summarize the entire conversation, then replaces the message history with a single summary message. This is useful for long sessions that are approaching the context window limit or becoming expensive.

After compacting, the session continues with the summary as context. The previous messages are removed from both memory and the database.

Shell Escape

Prefix any input with ! to execute it directly as a shell command, bypassing the LLM entirely:

agsh [r] > !pwd
/home/user/projects
agsh [r] > !ls -la
total 32
drwxr-xr-x  5 user user 4096 Mar  4 10:00 .
...
agsh [r] > !ping 1.1.1.1 -c 2
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
...

The command runs with inherited stdin/stdout/stderr, so it behaves exactly like a regular shell. This is useful for quick checks without waiting for the LLM.

Exiting

You can exit agsh in any of these ways:

  • Type /exit
  • Type exit or quit
  • Press Ctrl+D on an empty line

Interrupting the Agent

Press Ctrl+C while the agent is running to interrupt it. This cancels the current LLM request and kills any running shell commands that were spawned by the agent.

One-Shot Mode

One-shot mode runs a single prompt and exits, similar to bash -c:

agsh "your prompt here"

The agent processes the prompt (including any tool calls), prints its response, and the process terminates. The session UUID is printed to stderr on exit.

Examples

# Simple question
agsh "what is my current working directory?"

# File operations (requires write permission)
agsh --permission write "create a file called notes.txt with today's date"

# Search
agsh "find all TODO comments in this project"

# Web search
agsh "search the web for the latest Rust release"

Combining with Other Flags

All configuration flags work in one-shot mode:

# Use a specific provider and model
agsh --provider claude -m claude-sonnet-4-20250514 "explain this codebase"

# With write permission
agsh --permission write "run 'cargo test' and summarize the results"

# Disable streaming
agsh --no-stream "read README.md and summarize it"

Session Behavior

One-shot mode creates a new session for each invocation. The session UUID is printed to stderr when the run completes:

Session: 550e8400-e29b-41d4-a716-446655440000

You can resume this session later in interactive mode:

agsh -s 550e8400-e29b-41d4-a716-446655440000

Permissions

agsh uses a four-level permission system to control what tools the agent can use. This gives you control over the agent’s capabilities and prevents accidental modifications.

Permission Levels

LevelIndicatorAllowed Tools
None[n] (green)No tools. The agent can only respond with text.
Read[r] (yellow)Read-only tools: read_file, find_files, search_contents, fetch_url, web_search, execute_command (sandboxed), todo_write, spawn_agent, scratchpad tools
Ask[a] (magenta)All tools, but each call requires user approval (Y/n prompt)
Write[w] (red)All tools without restrictions: write_file, edit_file, execute_command (unsandboxed)

Each level includes all tools from the levels below it. Write mode includes all read tools.

Default Permission

The default permission is read. You can change it with:

  • CLI flag: agsh --permission write
  • Environment variable: export AGSH_PERMISSION=write

Changing Permissions at Runtime

Press Shift+Tab to cycle through permission levels:

none → read → ask → write → none → ...

Or use the /permission slash command:

/permission write
/permission ask

The prompt indicator updates immediately to reflect the new level. The agent learns the current level via a per-turn [Permission context] block prepended to your message (see How Permissions Work below).

Ask Mode

In ask mode, the agent has access to all tools, but each tool call is paused for your approval:

[ask] Shell ls -la (Y/n)

Press Enter or y to approve, or n to deny. If denied, the agent receives an error and may try an alternative approach.

This mode is useful when you want the agent to have full capabilities but want to review each action before it executes.

How Permissions Work

When the agent attempts to use a tool, agsh checks whether the current permission level allows it:

  • If allowed, the tool executes normally.
  • In ask mode, you are prompted to approve or deny.
  • If denied, agsh returns an error message to the agent explaining which level is required and suggests running /permission <level>.

Telling the agent the current level

agsh lists every registered tool in the system prompt with its required permission level inline — nothing is filtered out — and each user message carries a compact [Permission context] block:

<context>
[Permission context]
Current permission level: read
Only read-only tools are executable.
...
</context>

That two-line block is the only permission-dependent content in the request. The system prompt and the tools-array schemas stay byte-identical across /permission toggles, so mid-session level changes don’t invalidate the Anthropic prompt cache — the entire message history stays warm.

MCP tool permissions

MCP tools are classified through a 5-step resolution chain: per-tool override → server-level override → the server’s own readOnlyHint[mcp].default_permission → hardcoded Write fallback. See the Permission resolution section of the Config File docs for the full rules and how to override a misclassified tool.

Built-in tool permissions

Any built-in tool’s required permission can be overridden from config.toml without editing code — see [tools] — built-in tool filters. The same section documents how to allow-list or block-list specific built-ins (e.g. disabling web_search in a locked-down environment).

Examples

Read Mode (Default)

agsh [r] > read the contents of main.rs

The agent uses read_file and shows the contents. Shell commands also work in read mode, but run in a read-only sandbox – the filesystem is physically write-protected for the child process:

agsh [r] > list the files in this directory
agsh [r] > show me the git log

Commands like ls, cat, git log, df, ps, and uname work normally. Commands that attempt to write to the filesystem (e.g., touch, rm, mkdir) will fail with a permission error.

If you ask the agent to modify a file:

agsh [r] > add a comment to the top of main.rs

The agent will explain that it cannot write files in read mode and suggest switching to write mode.

Note: The read-only sandbox uses Landlock on Linux (kernel 5.13+) and sandbox-exec on macOS. On platforms where sandboxing is unavailable, shell commands are not available in read mode. You can disable sandboxed shell execution by setting sandbox = false under [shell] in the config file (see Config File).

Write Mode

agsh [w] > run cargo test and show me the output

The agent uses execute_command to run the tests and shows the results.

Sessions

Sessions persist your conversation history so you can resume later. Each session is identified by a UUID and stored in a SQLite database.

How Sessions Work

  • A session is not created when agsh starts. It is created lazily when you send the first message.
  • When a session is created, its UUID is printed to stderr.
  • When you exit agsh (Ctrl+D), the session UUID is printed again so you can note it for later.
  • Sessions include the full message history: your inputs, the agent’s responses, and tool call results.

Resuming a Session

Continue Last Session

agsh -c

This resumes the most recently updated session.

By UUID

agsh -c 550e8400-e29b-41d4-a716-446655440000

The agent loads the previous conversation history and continues from where you left off.

Session Locking

Only one agsh instance can be attached to a session at a time. This prevents race conditions from concurrent writes.

  • If you try to resume a session that is locked by a running agsh process, you will get an error.
  • If the locking process has exited (crashed or was killed), agsh detects this and allows you to take over the lock.

Storage Location

Sessions are stored in a SQLite database at a platform-specific location:

PlatformPath
Linux~/.local/share/agsh/sessions.db ($XDG_DATA_HOME/agsh/sessions.db)
macOS~/Library/Application Support/agsh/sessions.db
Windows%APPDATA%\agsh\sessions.db

Database Schema

The database has three tables:

sessions – one row per session:

ColumnTypeDescription
idTEXT (UUID)Primary key
created_atTEXT (RFC 3339)When the session was created
updated_atTEXT (RFC 3339)When the session was last updated
locked_byTEXT (PID)PID of the process holding the lock, or NULL
metadataTEXTReserved for future use

messages – one row per message in a session:

ColumnTypeDescription
idINTEGERAuto-incrementing primary key
session_idTEXT (UUID)Foreign key to sessions.id
roleTEXTuser, assistant, or tool_results
contentTEXTMessage content (plain text or JSON)
created_atTEXT (RFC 3339)When the message was saved

tool_outputs – scratchpad entries, one row per entry:

ColumnTypeDescription
session_idTEXT (UUID)Part of composite primary key
nameTEXTPart of composite primary key
contentTEXTThe stored content
created_atTEXT (RFC 3339)When the entry was created

Scratchpad entries are scoped to a session. Two sessions can have entries with the same name. Entries are preserved across compaction but deleted when a session is deleted.

History Retention

agsh automatically manages session storage on startup with sensible defaults:

  • retention_days (default: 90) – deletes sessions whose updated_at is older than this many days.
  • max_storage_bytes (default: 52428800 / 50 MB) – when total message content exceeds this limit, the oldest sessions are deleted until the total is under the limit.

You can override these defaults in the config file under [session]:

[session]
retention_days = 30          # delete sessions not used in 30 days
max_storage_bytes = 10485760 # cap total storage at ~10 MB

See Config File for details.

Context Window Limiting

Long sessions can exceed the LLM’s context window or become expensive. The context_messages setting (default: 200) limits how many recent messages are sent to the API:

[session]
context_messages = 100

The full history remains in SQLite for resumption. Only the API payload is truncated. The truncation preserves tool call chains (it never splits a tool use from its result).

Compacting a Session

If a session becomes too long, you can use the /compact command to have the LLM summarize the conversation and replace older messages with a structured summary. Recent messages are preserved verbatim. The summary includes key files, decisions, errors, and user preferences.

Compaction preserves scratchpad entries and the todo list, and re-injects environment context so the agent isn’t disoriented after compaction.

Auto-Compact

When auto_compact is enabled (default: true), agsh automatically compacts the conversation when the input token count exceeds 80% of the context window. This runs between turns, not during tool loops.

[session]
auto_compact = true
context_window = 200000  # optional override

Listing Sessions

To see past sessions:

agsh list

This shows a table with each session’s ID, last update time, and a preview of the first message:

ID                                    Updated              Preview
550e8400-e29b-41d4-a716-446655440000  2026-03-14 12:00:00  How do I implement a binary search tree?
a1b2c3d4-e5f6-7890-abcd-ef1234567890  2026-03-13 09:30:00  Fix the login page CSS

By default the 20 most recent sessions are shown. Use -n to change:

agsh list -n 50

Exporting a Session

You can export any session as a Markdown file:

agsh export 550e8400-e29b-41d4-a716-446655440000

This writes session-550e8400-e29b-41d4-a716-446655440000.md in the current directory with the full conversation history. User and assistant messages are rendered as Markdown sections, while tool calls and results are wrapped in collapsible <details> blocks.

To write to a specific file:

agsh export 550e8400-e29b-41d4-a716-446655440000 -o conversation.md

To print to stdout (for piping):

agsh export 550e8400-e29b-41d4-a716-446655440000 -o -

Deleting Sessions

Delete specific sessions by UUID:

agsh delete 550e8400-e29b-41d4-a716-446655440000

Delete multiple sessions at once:

agsh delete 550e8400-e29b-41d4-a716-446655440000 a1b2c3d4-e5f6-7890-abcd-ef1234567890

Delete all sessions:

agsh delete --all

Managing Sessions via SQLite

You can also manage sessions directly through the SQLite database. For example, to list all sessions:

sqlite3 ~/.local/share/agsh/sessions.db \
  "SELECT id, created_at, updated_at FROM sessions ORDER BY updated_at DESC;"

Skills

Skills are user-defined knowledge packages that give the agent non-standard knowledge – manuals, procedures, tool-specific instructions, and experience the LLM doesn’t have natively. Each skill is a directory containing a SKILL.md file with structured metadata.

How Skills Work

  • Skills live in ~/.config/agsh/skills/ (platform-specific config dir).
  • Each skill is a directory: skills/<name>/SKILL.md.
  • SKILL.md starts with a YAML frontmatter block declaring the skill’s metadata, followed by Markdown body content.
  • On every prompt, agsh discovers all valid skills and lists them in the system prompt with their description and when_to_use.
  • The agent invokes a skill by calling the skill tool with the skill name. The tool returns the full body, which the agent follows.
  • Skills are available in read, ask, and write permission modes (not in none).

File Format

A skill is a directory under ~/.config/agsh/skills/ containing a SKILL.md file:

~/.config/agsh/skills/
└── download-videos/
    └── SKILL.md

SKILL.md must begin with a YAML frontmatter block, followed by the skill body:

---
description: Download videos from various websites using yt-dlp
when_to_use: When the user wants to download a video from a website
allowed_tools: [execute_command]
version: "1.0"
user_invocable: true
---

# Download Videos with yt-dlp

## Installation

Install yt-dlp:

\```bash
pip install yt-dlp
\```

## Basic Usage

Download a video:

\```bash
yt-dlp "https://example.com/video"
\```

Required Frontmatter Fields

FieldDescription
descriptionOne-line summary of what the skill does. Shown in the system prompt.
when_to_useA hint telling the agent when to invoke the skill. Shown in the system prompt.

Skills missing either field are skipped at discovery with a warning log.

Optional Frontmatter Fields

FieldDefaultDescription
allowed_tools[]Array or CSV string of tool names the skill expects. Currently advisory (not enforced).
versionnoneFree-form version label (e.g. "1.0", "2024-03-14").
user_invocabletrueReserved for future /skill <name> slash command.

Variable Substitution

The skill body may reference these variables, which are expanded when the skill is loaded:

  • ${AGSH_SKILL_DIR} – the absolute path to the skill’s directory. Use this to reference bundled helper files (e.g. ${AGSH_SKILL_DIR}/helper.sh).
  • ${AGSH_SESSION_ID} – the current session UUID.

Storage Location

PlatformPath
Linux~/.config/agsh/skills/<name>/SKILL.md ($XDG_CONFIG_HOME/agsh/skills/)
macOS~/Library/Application Support/agsh/skills/<name>/SKILL.md
Windows%APPDATA%\agsh\skills\<name>\SKILL.md

How the Agent Uses Skills

When skills are available, the system prompt includes a ## Skills section like:

## Skills

- **download-videos**: Download videos from various websites using yt-dlp — When the user wants to download a video from a website
- **deploy-kubernetes**: Deploy services to a K8s cluster — When the user asks to deploy to Kubernetes

The agent loads a skill by calling the skill tool:

skill(name: "download-videos")

The tool returns the full body of SKILL.md (with variables expanded) as its output. The agent then follows the instructions.

Tips

  • Use short, unambiguous skill names (e.g. setup-postgres, not pg). The name is what the agent sees and calls.
  • Write description and when_to_use concisely – they go into every system prompt and consume tokens.
  • Keep each skill focused on a single topic or procedure. Spawn multiple skills rather than one giant one.
  • Bundle supporting files in the skill directory and reference them with ${AGSH_SKILL_DIR}/file.ext.
  • Skills are re-discovered on every prompt, so you can add, edit, or remove skills mid-session without restarting agsh.

Providers Overview

Providers are the LLM inference backends that agsh uses to process your instructions. agsh ships with two built-in providers:

ProviderAPIStreamingTool Calling
OpenAIChat CompletionsSSEFunction calling
ClaudeMessages APISSE (named events)Content blocks

Selecting a Provider

Set the provider via any configuration layer:

# CLI flag
agsh --provider openai

# Environment variable
export AGSH_PROVIDER=claude

# Config file (~/.config/agsh/config.toml)
[provider]
name = "openai"

OpenAI-Compatible APIs

The openai provider works with any API that implements the OpenAI Chat Completions format. This includes:

  • OpenAI (default endpoint)
  • Ollama (http://localhost:11434/v1)
  • OpenRouter (https://openrouter.ai/api/v1)
  • vLLM, LiteLLM, and other OpenAI-compatible servers

Set the --base-url flag or OPENAI_BASE_URL environment variable to point to the alternative endpoint.

Streaming vs Non-Streaming

By default, agsh uses streaming mode: tokens appear in the terminal as they are generated. Use --no-stream to wait for the complete response before displaying it.

Streaming is recommended for interactive use. Non-streaming may be useful for scripting or when the provider does not support SSE.

OpenAI Provider

The OpenAI provider uses the Chat Completions API. It also works with any OpenAI-compatible API endpoint.

Configuration

SettingValue
Provider nameopenai
Default base URLhttps://api.openai.com/v1
API key env varOPENAI_API_KEY
Auth methodBearer token (Authorization: Bearer <key>)

Minimal Setup

export AGSH_PROVIDER=openai
export AGSH_MODEL=gpt-4o
export OPENAI_API_KEY=sk-...
agsh

Config File

[provider]
name = "openai"
model = "gpt-4o"

Supported Models

Any model available through the OpenAI Chat Completions API (or compatible endpoint) that supports tool calling:

  • gpt-4o, gpt-4o-mini
  • gpt-4-turbo
  • o1, o3-mini
  • Third-party models via compatible APIs

Custom Base URL

To use an OpenAI-compatible endpoint, set the base URL:

# Ollama
agsh --provider openai --model llama3 --base-url http://localhost:11434/v1

# OpenRouter
agsh --provider openai --model anthropic/claude-sonnet-4-20250514 --base-url https://openrouter.ai/api/v1

Or in the config file:

[provider]
name = "openai"
model = "llama3"
api_key = "unused"
base_url = "http://localhost:11434/v1"

API Details

Endpoint: POST {base_url}/chat/completions

Tool format: Tools are sent as function definitions:

{
  "type": "function",
  "function": {
    "name": "read_file",
    "description": "Read the contents of a file at the given path.",
    "parameters": { "type": "object", "properties": { ... } }
  }
}

Tool results: Sent back as messages with role: "tool" and the corresponding tool_call_id.

Streaming: Uses Server-Sent Events (SSE) with data: {...} lines. The stream ends with data: [DONE].

Claude Provider

The Claude provider uses the Claude API Messages endpoint.

Configuration

SettingValue
Provider nameclaude
Default base URLhttps://api.anthropic.com
API key env varCLAUDE_API_KEY
OAuth token env varCLAUDE_OAUTH_TOKEN
Auth methodx-api-key header (API key) or Authorization: Bearer (OAuth)
API version2023-06-01
Max tokens8192

Quickest Start (OAuth Login)

Run the setup wizard and choose OAuth login when prompted:

agsh setup

This opens your browser for authorization, exchanges the code for tokens, and saves them to the database. No API key needed.

Minimal Setup (API Key)

export AGSH_PROVIDER=claude
export AGSH_MODEL=claude-sonnet-4-20250514
export CLAUDE_API_KEY=sk-ant-api03-...
agsh

Minimal Setup (OAuth Token)

export AGSH_PROVIDER=claude
export AGSH_MODEL=claude-sonnet-4-20250514
export CLAUDE_OAUTH_TOKEN=sk-ant-oat01-...
agsh

On the first run, the OAuth token is saved to the database. On subsequent runs, the token is loaded automatically without needing the environment variable.

Config File

[provider]
name = "claude"
model = "claude-sonnet-4-20250514"

Authentication

agsh supports two authentication methods for the Claude provider:

OAuth Login

The recommended way to authenticate. Run agsh setup (or let the first-launch wizard guide you) and select OAuth login. This performs an OAuth Authorization Code flow with PKCE:

  1. agsh generates a PKCE challenge and opens your browser to Claude’s authorization page
  2. You authorize the application in your browser
  3. You paste the authorization code back into agsh
  4. agsh exchanges the code for access and refresh tokens
  5. Tokens are stored in the database and refreshed automatically

The OAuth client ID defaults to Claude Code’s client ID but can be overridden via the CLAUDE_CLIENT_ID environment variable.

API Key

Traditional API key authentication using the x-api-key header. Set via CLAUDE_API_KEY env var or provider.api_key in the config file.

Manual OAuth Token

OAuth token authentication using the Authorization: Bearer header. Set via CLAUDE_OAUTH_TOKEN env var or provider.oauth_token in the config file.

OAuth tokens are automatically detected by their sk-ant-oat01- prefix, even when passed via CLAUDE_API_KEY.

Token lifecycle:

  1. Provide the initial token via env var, config, or OAuth login
  2. agsh saves it to the database on first use
  3. On subsequent launches, the token is loaded from the database
  4. If the token expires, agsh refreshes it automatically and updates the database
  5. Setting a new env var or config value replaces the stored token

Token refresh URL: Defaults to https://api.anthropic.com/v1/oauth/token. Configurable via provider.oauth_token_url in the config file.

Supported Models

Any model available through the Claude API:

  • claude-opus-4-20250514
  • claude-sonnet-4-20250514
  • claude-haiku-4-5-20251001

Custom Base URL

To use a Claude-compatible proxy or gateway:

agsh --provider claude --model claude-sonnet-4-20250514 --base-url https://my-proxy.example.com

API Details

Endpoint: POST {base_url}/v1/messages

Headers (API key):

  • x-api-key: <api_key>
  • anthropic-version: 2023-06-01
  • content-type: application/json

Headers (OAuth):

  • Authorization: Bearer <oauth_token>
  • anthropic-version: 2023-06-01
  • content-type: application/json

System prompt: Sent as a top-level system field in the request body (not as a message).

Tool format: Tools are defined with input_schema instead of parameters:

{
  "name": "read_file",
  "description": "Read the contents of a file at the given path.",
  "input_schema": { "type": "object", "properties": { ... } }
}

Tool use and results: Expressed as content blocks within messages:

  • Tool use: {"type": "tool_use", "id": "...", "name": "...", "input": {...}}
  • Tool result: {"type": "tool_result", "tool_use_id": "...", "content": "..."}

Streaming: Uses Server-Sent Events with named event types:

EventDescription
message_startMessage initialization
content_block_startBegin a text or tool_use block
content_block_deltaIncremental text (text_delta) or tool input (input_json_delta)
content_block_stopEnd of a content block
message_deltaFinal metadata including stop_reason
message_stopStream complete
pingKeep-alive

Tools Overview

Tools are the actions that the agent can perform on your behalf. The LLM decides which tools to call based on your instructions.

Available Tools

ToolPermissionDescription
read_fileReadRead file contents
edit_fileWriteMake string replacements in a file
write_fileWriteCreate or overwrite a file
find_filesReadFind files by glob pattern
search_contentsReadSearch file contents with regex
fetch_urlReadFetch a web page as markdown
web_searchReadSearch the web
execute_commandRead/WriteRun a shell command
todo_writeReadManage a structured task list
spawn_agentReadDelegate tasks to a sub-agent
scratchpad_writeReadStore content in the scratchpad
scratchpad_readReadRead a scratchpad entry
scratchpad_editReadEdit a scratchpad entry
scratchpad_listReadList scratchpad entries
scratchpad_deleteReadDelete a scratchpad entry
skillReadLoad a named skill’s instructions
render_imageReadView an image from in-memory base64 or scratchpad

Permission Requirements

Tools are grouped by the minimum permission level required:

Read permission (available in read, ask, and write modes):

  • read_file, find_files, search_contents, fetch_url, web_search
  • execute_command (sandboxed, filesystem write-protected)
  • todo_write, spawn_agent, skill, render_image
  • All scratchpad tools

Write permission (only available in write mode):

  • edit_file, write_file, execute_command (unsandboxed)

In ask mode, all tools are available but each call requires user confirmation.

In none mode, no tools are available. The agent can only respond with text.

Filtering Built-in Tools

Any built-in can be allow-listed, blocked, or have its required permission overridden via the [tools] table in config.toml. See [tools] — built-in tool filters. Run agsh tools list to see every built-in with its effective permission and current status.

MCP Tools

When MCP servers are configured, their tools are registered under a namespaced name of the form <server>__<tool> (e.g. notion__notion-search). They appear in the system prompt catalogue alongside the built-ins — with their resolved permission level annotated inline — and are called the same way.

agsh also exposes seven built-in MCP meta-tools for browsing server-side resources and prompts. All are deferred by default (loaded on first use):

ToolPermissionDescription
list_mcp_resourcesReadList resources a server exposes
read_mcp_resourceReadRead a server resource by URI
list_mcp_promptsReadList server-defined prompts
get_mcp_promptReadRender a server prompt with arguments
subscribe_mcp_resourceReadReceive change notifications for a resource
unsubscribe_mcp_resourceReadStop receiving change notifications
list_mcp_resource_updatesReadInspect pending resource-change notifications

Scratchpad Parameter

All tools support an optional scratchpad string parameter. When provided, the tool’s output is saved to the scratchpad under that name instead of being returned inline. This lets the agent store large outputs for later processing without consuming conversation context.

execute_command({"command": "pdftotext doc.pdf -", "scratchpad": "pdf_text"})

How Tool Calls Work

  1. The agent receives your instruction and decides which tools to call
  2. For each tool call, agsh checks the current permission level
  3. In ask mode, you are prompted to approve or deny each tool call
  4. If permitted, the tool executes and its output is fed back to the agent
  5. The agent may make additional tool calls or respond with text
  6. This loop continues until the agent has no more tool calls to make

Tool calls and their results are displayed in the terminal so you can see what the agent is doing.

todo_write

A built-in tool for managing a structured task list during a session. The agent uses this to track multi-step work and communicate progress. The task list is displayed in the terminal and injected into the conversation context each turn.

spawn_agent

Spawns a read-only sub-agent to perform research or analysis tasks. The sub-agent has access to the same tools (except spawn_agent and todo_write) and returns a report. This is useful for delegating exploration without polluting the main conversation context.

skill

Loads a named skill’s instructions. Skills are user-defined knowledge packages stored in ~/.config/agsh/skills/<name>/SKILL.md. The system prompt lists available skills with their description and when-to-use hint; the agent calls skill({"name": "<skill-name>"}) to load the full body. See Skills for how to author skills.

render_image

Displays an image the agent has in memory — as base64 bytes or in a scratchpad entry — as a multimodal content block. Complements fetch_url (network) and read_file (local file) by covering the third case: image data produced on the fly by a command pipeline.

Typical workflow:

execute_command({"command": "ffmpeg -i input.mp4 -vframes 1 -f image2pipe pipe: | base64 -w0", "scratchpad": "frame"})
render_image({"from_scratchpad": "frame"})

Parameters:

NameTypeRequiredDescription
from_scratchpadstringone of twoName of a scratchpad entry containing base64-encoded image bytes
base64stringone of twoBase64-encoded image bytes, passed inline

Exactly one of from_scratchpad or base64 must be provided. Prefer from_scratchpad for large images — inline base64 inflates tool-call JSON.

The bytes must decode to a supported raster image. PNG, JPEG, GIF, WebP, and BMP pass through unchanged; TIFF, ICO, HDR, EXR, TGA, PNM, QOI, DDS, and Farbfeld are auto-converted to PNG. Size cap is ~3.75 MB on the final payload.

Only call render_image when the current model supports vision input.

Redirecting output to the scratchpad

Several tools — execute_command, find_files, search_contents, fetch_url, spawn_agent — accept an optional scratchpad parameter that redirects their output to a named scratchpad entry instead of returning it inline. When this parameter is set, the tool produces its full, untruncated output: internal result-count caps (find_files 200, search_contents 100) and length caps (fetch_url max_length) are lifted for the scratchpad-bound result.

File Operations

read_file

Read the contents of a file at a given path. Supports text files and images.

Permission: Read

Parameters

NameTypeRequiredDescription
pathstringyesThe file path to read
offsetintegernoLine number to start reading from (0-based)
limitintegernoMaximum number of lines to read
scratchpadstringnoSave output to the scratchpad under this name

Behavior

  • When offset and limit are both omitted, defaults to the first 2000 lines. If the file has more, a truncation notice is appended.
  • Use offset/limit to page through large files.

Image files

Recognized image extensions are returned as base64-encoded multimodal content:

  • Provider-native (pass-through): .png, .jpg/.jpeg, .gif, .webp, .bmp
  • Convertible (decoded and re-encoded as PNG transparently): .tif/.tiff, .ico, .hdr, .exr, .tga, .pbm/.pgm/.ppm/.pnm, .qoi, .dds, .ff/.farbfeld
  • Unsupported (fall through to text read, which will fail on binary): .svg, .jxl, .heic, .avif

Images are rejected if the final payload exceeds 3.75 MB (~5 MB base64). Conversion can enlarge an image, so a small TIFF may produce a too-large PNG.

Only read image files when the current model supports vision input — text-only models will either error or silently drop the image block.

Examples

Read an entire file:

agsh [r] > show me the contents of src/main.rs

Read lines 10-20:

agsh [r] > show me lines 10 through 20 of src/main.rs

edit_file

Make a string replacement in a file. The file must have been read with read_file first (unless force is set).

Permission: Write

Parameters

NameTypeRequiredDescription
pathstringyesThe file path to edit
old_stringstringyesThe exact string to find and replace
new_stringstringyesThe replacement string
replace_allbooleannoReplace all occurrences (default: false)
forcebooleannoBypass read-before-edit requirement (default: false)
scratchpadstringnoSave output to the scratchpad under this name

Behavior

  • By default, only the first occurrence of old_string is replaced. Set replace_all to replace every occurrence.
  • The file must have been previously read with read_file on the same path. This prevents blind edits. Set force to bypass this requirement.
  • If old_string is not found, the tool returns an error (without modifying the file).

write_file

Create or overwrite a file with the given content.

Permission: Write

Parameters

NameTypeRequiredDescription
pathstringyesThe file path to write
contentstringyesThe content to write to the file
scratchpadstringnoSave output to the scratchpad under this name

Behavior

  • Creates parent directories if they do not exist.
  • Overwrites the file if it already exists.

Search Tools

find_files

Find files matching a glob pattern.

Permission: Read

Parameters

NameTypeRequiredDescription
patternstringyesGlob pattern to match files against
pathstringnoDirectory to search in (defaults to current directory)
scratchpadstringnoSave output to the scratchpad under this name

Behavior

  • Results are limited to 200 matches.
  • Returns one file path per line.

Glob Patterns

PatternMatches
*.rsAll .rs files in the current directory
**/*.rsAll .rs files recursively
src/*.txtAll .txt files in src/
test_*All files starting with test_

search_contents

Search file contents using a regex pattern. Powered by the ripgrep library.

Permission: Read

Parameters

NameTypeRequiredDescription
patternstringyesRegex pattern to search for
pathstringnoFile or directory to search in (defaults to current directory)
globstringnoGlob pattern to filter which files are searched (e.g., *.rs)
scratchpadstringnoSave output to the scratchpad under this name

Behavior

  • Searches recursively through directories.
  • Skips hidden files (starting with .) and common non-text directories (target, node_modules).
  • Results are limited to 100 matches.
  • Each result includes the file path, line number, and matching line.

Web Tools

fetch_url

Fetch a web page and return its content as markdown text.

Permission: Read

Parameters

NameTypeRequiredDescription
urlstringyesThe URL to fetch
max_lengthintegernoMaximum characters to return (default: 30000, 0 for no limit)
headersobjectnoCustom HTTP headers (overrides defaults like User-Agent)
regexstringnoIf provided, return only matching content (matches joined by newlines)
rawbooleannoReturn raw HTML instead of converting to markdown (default: false)
scratchpadstringnoSave output to the scratchpad under this name

Behavior

  • Fetches the page via HTTP GET.
  • Converts HTML to Markdown using fast_html2md (unless raw is true).
  • Truncates the output to max_length characters (default: 30,000).
  • HTTP timeout: 30 seconds.
  • Returns the HTTP status code as an error if the request fails (e.g., 404, 500).

Image URLs

If the response Content-Type is a supported raster image format, fetch_url returns a multimodal Image content block instead of markdown. No disk is touched — bytes are base64-encoded in memory.

Provider-native formats (passed through unchanged):

  • image/png, image/jpeg (and image/jpg), image/gif, image/webp, image/bmp (and image/x-ms-bmp)

Convertible formats (decoded and re-encoded as PNG transparently):

  • image/tiff, image/vnd.microsoft.icon / image/x-icon, image/vnd.radiance (HDR), image/x-exr, image/x-targa, image/x-portable-* (PNM), image/qoi, image/vnd.ms-dds, image/x-farbfeld

Unsupported formats (fall through to the text branch): image/svg+xml, image/jxl, image/heic, image/avif.

  • The max_length, regex, and raw options do not apply to image responses.
  • Size cap of ~3.75 MB applies to the output bytes (after conversion). Conversion can enlarge an image, so a 1 MB TIFF may produce a larger PNG.
  • Detection uses the response’s actual Content-Type header, so redirect chains and extension-less URLs are handled correctly.

Only fetch image URLs when the current model supports vision input — text-only models will either error or silently drop the image block.


Search DuckDuckGo and return the top results.

Permission: Read

Parameters

NameTypeRequiredDescription
querystringyesThe search query
headersobjectnoCustom HTTP headers (overrides defaults like User-Agent)
scratchpadstringnoSave output to the scratchpad under this name

Behavior

  • Returns up to 10 results per search.
  • Each result includes the title, source domain, URL, and a snippet with matched terms emphasised in bold.
  • Snippets are capped at 300 characters; use fetch_url on the result URL for the full page.
  • Uses HTML scraping (no API key required).
  • HTTP timeout: 30 seconds.

CAPTCHA detection

DuckDuckGo occasionally serves a bot-challenge page instead of results (detected by the anomaly-modal element). web_search returns a distinct error so the agent doesn’t silently retry:

DuckDuckGo served a CAPTCHA challenge (bot detection / rate limit).
Retry later.

If this happens often in your environment, configure a search-capable MCP server — see the MCP configuration examples for patterns that work well.

Shell Tool

execute_command

Execute a shell command and return its output.

Permission: Read (sandboxed) / Write (unsandboxed)

Parameters

NameTypeRequiredDescription
commandstringyesThe shell command to execute
timeout_msintegernoTimeout in milliseconds (default: 30000)
scratchpadstringnoSave output to the scratchpad under this name

Behavior

  • Executes the command via sh -c "<command>" on Unix, or powershell.exe -NoProfile -NonInteractive -Command "<command>" on Windows (same shell in both sandboxed and unsandboxed mode).
  • Captures both stdout and stderr.
  • Returns the exit code along with the output if non-zero.
  • Oversized output is losslessly persisted to the scratchpad by the agent layer — the tool itself never truncates.
  • Default timeout is 30 seconds. If the command exceeds the timeout, it is killed (on Unix, via the process group so backgrounded grandchildren are caught too).
  • Supports cancellation: pressing Ctrl+C while a command is running kills the child process.

Shell-specific semantics

  • Unix (sh -c): POSIX $VAR expansion applies. Pass a literal $ with single quotes ('$foo') or backslash escape (\$foo).
  • Windows (powershell.exe -Command): The script body reaches PowerShell directly. Use PowerShell syntax ($var = ..., $env:PATH) — and crucially, do not wrap your command in another powershell -Command "...". The outer PowerShell will expand your inner $var references to empty strings before the inner shell runs, producing a parser error on mangled syntax. If you need to invoke a nested script, drop it into a .ps1 file and run it by path, use -EncodedCommand <base64>, or escape each $ as `$.

Read-Only Sandbox

In read mode, commands run inside a filesystem sandbox that blocks writes to the user’s real data. Reads and program execution still work normally:

  • Linux: Uses Landlock LSM (kernel 5.13+). The child process is restricted via landlock_restrict_self before exec. Only READ_FILE, READ_DIR, and EXECUTE access rights are granted — writes anywhere on the filesystem return EACCES.
  • macOS: Uses sandbox-exec with a SBPL profile that denies all file-write* operations.
  • Windows: Spawns the child with a duplicated primary token dropped to Low integrity (SECURITY_MANDATORY_LOW_RID) via SetTokenInformation(TokenIntegrityLevel, …). Writes to the home directory, %APPDATA%, Program Files, and system directories — any location with Medium-or-higher integrity ACLs — are blocked by the kernel. Unlike Landlock, Low integrity is not a total write-denial: the child can still write to the small residual Low-integrity-writable surface (%LOCALAPPDATA%\Low, %TEMP%\Low, any path with an explicit Low-integrity write ACE) and to files it creates itself (which inherit Low integrity). For practical purposes this matches the guarantees of sandbox-exec and prevents the agent from touching user data, but full “zero writes anywhere” on Windows would require Windows Sandbox or an AppContainer and is out of scope.
  • Unsupported platforms: Shell commands are not available in read mode — switch to write mode to execute commands without a sandbox.

In write mode, commands run without any sandbox restrictions.

To disable sandboxed shell execution in read mode, set sandbox = false under [shell] in the config file. When disabled, shell commands require write mode.

[shell]
sandbox = false

Scratchpad

The scratchpad is a session-scoped working memory that the agent can use to store, retrieve, edit, and manage content without consuming conversation context. Entries are identified by string names and persist across turns within a session.

When the Scratchpad is Used

  • Proactively: The agent stores intermediate results (extracted text, API responses, research notes) for later use.
  • Via scratchpad parameter: Any tool can save its output directly to the scratchpad by including a scratchpad parameter in the tool call.
  • Automatically: When a tool’s output exceeds 30,000 characters, it is saved to the scratchpad under an auto-generated name (e.g., execute_command_1) and replaced with a preview in the conversation.

Tools

scratchpad_write

Store content in the scratchpad. If the name already exists, the content is overwritten.

Permission: Read

NameTypeRequiredDescription
namestringyesName for the entry
contentstringyesThe content to store

scratchpad_read

Read or search a scratchpad entry by name.

Permission: Read

NameTypeRequiredDescription
namestringyesThe entry name
offsetintegernoCharacter offset to start reading from (default: 0)
limitintegernoMaximum characters to return (default: 30000)
regexstringnoSearch the entry and return matching lines (max 100)

scratchpad_edit

Edit a scratchpad entry in place. Provide content for a full overwrite, or old_string/new_string for targeted replacement.

Permission: Read

NameTypeRequiredDescription
namestringyesThe entry name
contentstringnoFull replacement (mutually exclusive with old/new)
old_stringstringnoString to find
new_stringstringnoReplacement string
replace_allbooleannoReplace all occurrences (default: false)

scratchpad_list

List all scratchpad entries with their name, size, and creation time. No parameters.

Permission: Read

scratchpad_delete

Delete a scratchpad entry by name.

Permission: Read

NameTypeRequiredDescription
namestringyesThe entry name to delete

Lifecycle

  • Entries are scoped to the session and persist across turns.
  • Entries survive session compaction (/compact).
  • Entries are deleted when the session is deleted.
  • Two sessions can have entries with the same name without conflict.
  • Writing to an existing name overwrites it silently.