Introduction
agsh (agentic shell) is a shell where you type natural language instead of commands. An LLM interprets your instructions and executes them using built-in tools like file operations, search, web access, and shell command execution.
agsh [r] > find all Rust files in this project and count the lines of code
Instead of remembering find . -name '*.rs' | xargs wc -l, you describe what you want and the agent figures out how to do it.
Features
- Natural language interface – describe what you want instead of memorizing syntax
- Built-in tools – file read/write/edit, glob search, regex content search (ripgrep), web fetch, web search, shell command execution
- Scratchpad – session-scoped working memory for the agent to store and retrieve intermediate results
- Sub-agents – delegate research tasks to read-only sub-agents
- Multiple LLM providers – OpenAI and Claude, with support for any OpenAI-compatible API
- MCP support – extend the agent with tools from external MCP servers
- Permission system – control what the agent can do (none/read/ask/write), switchable mid-session
- Session management – conversations are persisted in SQLite; resume, export, or compact any session
- Streaming output – responses stream to the terminal in real time with syntax highlighting
- Interactive and one-shot modes – use it as a REPL or pipe a single prompt
- Extended thinking – Claude provider supports extended thinking for complex reasoning
How It Works
- You type a natural language instruction
- agsh sends it to the configured LLM along with tool definitions and a system prompt
- The LLM decides which tools to call (if any) and returns text and/or tool calls
- agsh executes the tool calls, feeds results back to the LLM, and repeats until the LLM is done
- The final response is rendered as Markdown in the terminal
Installation
agsh is written in Rust and builds as a single binary.
Pre-Built Binaries
Download the latest release for your platform from the GitHub Releases page.
| Platform | Archive |
|---|---|
| Linux (x86_64) | agsh-linux-amd64.tar.gz |
| macOS (Apple Silicon) | agsh-macos-arm64.tar.gz |
| Windows (x86_64) | agsh-windows-amd64.zip |
Extract the binary and place it somewhere on your $PATH:
# Linux/macOS
tar -xzf agsh-*.tar.gz
cp agsh ~/.local/bin/
Cargo Install
If you have Rust installed, you can install agsh directly from the Git repository:
cargo install --locked --git https://github.com/k4yt3x/agsh.git
This builds the latest version from source and installs it to ~/.cargo/bin/.
Building from Source
Prerequisites
- Rust (edition 2024, requires Rust 1.85+)
- A C compiler (for the bundled SQLite)
Build
git clone https://github.com/k4yt3x/agsh.git
cd agsh
cargo build --release
The binary will be at target/release/agsh. Copy it somewhere on your $PATH:
cp target/release/agsh ~/.local/bin/
Verify
agsh --version
agsh --help
Quick Start
1. Run the Setup Wizard
On first launch, agsh automatically starts an interactive setup wizard:
agsh
The wizard will guide you through:
- Provider selection — Choose between
claudeandopenai - Authentication — OAuth login (Claude only) or API key entry
- Model selection — Enter the model name to use
- Base URL — Optionally set a custom API endpoint
The wizard writes your configuration to ~/.config/agsh/config.toml. You can re-run it at any time with agsh setup.
You can also create the config file manually or use environment variables (
OPENAI_API_KEY,AGSH_PROVIDER, etc.) and CLI flags (--provider,-m) as overrides. See Configuration for all options.
2. Start Using agsh
After setup, you will see a prompt:
agsh [r] >
You will see a prompt:
agsh [r] >
The [r] indicates read permission mode (the default). The agent can read files and search, but cannot write files or run commands.
3. Ask It Something
agsh [r] > what files are in the current directory?
The agent will use the find_files tool to list files and describe them.
4. Enable Write Mode
Press Shift+Tab to cycle the permission to write mode:
agsh [w] >
Now the agent can execute commands and modify files:
agsh [w] > create a file called hello.txt with the text "hello world"
5. One-Shot Mode
For quick tasks without entering the interactive shell:
agsh "what is my current working directory?"
The process exits after the agent responds.
6. Continue a Previous Session
To pick up where you left off, continue the last session:
agsh -c
Or resume a specific session by its UUID:
agsh -c 550e8400-e29b-41d4-a716-446655440000
See Sessions for more details.
Configuration Overview
The recommended way to configure agsh is with a config file at ~/.config/agsh/config.toml:
[provider]
name = "openai"
model = "gpt-4o"
api_key = "sk-..."
This is all you need to get started. See Config File for the full reference.
Required Settings
agsh requires three settings to function. If any are missing, it prints an error with setup instructions:
| Setting | Config Key | Env Var | CLI Flag |
|---|---|---|---|
| Provider | provider.name | AGSH_PROVIDER | --provider |
| Model | provider.model | AGSH_MODEL | -m, --model |
| API Key | provider.api_key | OPENAI_API_KEY or CLAUDE_API_KEY | – |
Override Layers
Configuration is layered. Higher-priority layers override lower ones:
- CLI flags – per-invocation overrides (
--provider,--model,--base-url,-p) - Environment variables – useful for CI, containers, or temporary overrides (
AGSH_PROVIDER, etc.) - Config file – persistent settings in
~/.config/agsh/config.toml - Built-in defaults – permission defaults to
read, streaming defaults to on
For example, --model gpt-4o-mini on the command line overrides both AGSH_MODEL and provider.model in the config file.
API Key Resolution
The API key environment variable depends on the configured provider:
- Provider
openai: readsOPENAI_API_KEY - Provider
claude: readsCLAUDE_API_KEY(orCLAUDE_OAUTH_TOKENfor OAuth)
If the environment variable is not set, it falls back to provider.api_key in the config file.
Config File
agsh looks for a TOML configuration file at a platform-specific location:
| Platform | Path |
|---|---|
| Linux | ~/.config/agsh/config.toml ($XDG_CONFIG_HOME/agsh/config.toml) |
| macOS | ~/Library/Application Support/agsh/config.toml |
| Windows | %APPDATA%\agsh\config.toml |
The config file is optional. If it does not exist, agsh silently skips it.
Set the AGSH_CONFIG_DIR environment variable to override the default location entirely — the value points at the agsh directory itself (contains config.toml and skills/). Useful for tests, portable installs, and isolating a per-project config from your global one.
Format
[provider]
name = "openai"
model = "gpt-4o"
api_key = "sk-..."
base_url = "https://api.openai.com/v1"
All fields under [provider] are optional individually – you can set some in the config file and override others with environment variables or CLI flags.
Fields
provider.name
The LLM provider to use.
| Value | Description |
|---|---|
openai | OpenAI Chat Completions API (also works with OpenAI-compatible APIs) |
claude | Claude API (Messages endpoint) |
provider.model
The model identifier to send to the provider. Examples:
gpt-4o,gpt-4o-mini(OpenAI)claude-sonnet-4-20250514,claude-haiku-4-5-20251001(Claude)- Any model supported by an OpenAI-compatible endpoint
provider.api_key
The API key for authentication. It is recommended to use environment variables (OPENAI_API_KEY or CLAUDE_API_KEY) instead of storing the key in the config file.
provider.oauth_token
OAuth access token for the Claude provider. Can also be set via CLAUDE_OAUTH_TOKEN env var. The token is saved to the database on first use and loaded automatically on subsequent launches.
provider.oauth_token_url
Custom OAuth token refresh endpoint. Defaults to https://console.anthropic.com/v1/oauth/token.
provider.base_url
Custom API base URL. Useful for:
- Self-hosted models via Ollama (
http://localhost:11434/v1) - OpenRouter (
https://openrouter.ai/api/v1) - Other OpenAI-compatible API providers
If not set, defaults to:
https://api.openai.com/v1for theopenaiproviderhttps://api.anthropic.comfor theclaudeprovider
provider.reasoning_effort
Reasoning effort level for OpenAI o-series models. When set, the reasoning_effort parameter is included in API requests and max_completion_tokens is used instead of max_tokens.
Accepted values: low, medium, high. Omitted by default.
[provider]
reasoning_effort = "medium"
Examples
OpenAI
[provider]
name = "openai"
model = "gpt-4o"
# API key via env: export OPENAI_API_KEY=sk-...
Claude
[provider]
name = "claude"
model = "claude-sonnet-4-20250514"
# API key via env: export CLAUDE_API_KEY=sk-ant-api03-...
# Or OAuth token via env: export CLAUDE_OAUTH_TOKEN=sk-ant-oat01-...
Ollama (local)
[provider]
name = "openai"
model = "llama3"
api_key = "unused"
base_url = "http://localhost:11434/v1"
OpenRouter
[provider]
name = "openai"
model = "anthropic/claude-sonnet-4-20250514"
base_url = "https://openrouter.ai/api/v1"
# API key via env: export OPENAI_API_KEY=sk-or-...
[display]
Settings for output formatting.
display.render_mode
Output render mode. Equivalent to the --render-mode CLI flag.
| Value | Description |
|---|---|
bat | Syntax-highlighted markdown via bat (default) |
termimad | Terminal formatting via termimad (box-drawn code blocks, reflowed paragraphs). Alias: rich |
raw | Raw markdown printed verbatim with aligned tables |
Default: bat
[display]
render_mode = "raw"
display.show_session_id_on_create
Whether to display the session ID when a new session is created.
Default: false
display.show_session_id_on_exit
Whether to display the session ID when agsh exits.
Default: true
[display]
show_session_id_on_create = true
show_session_id_on_exit = false
display.show_path_in_prompt
Whether to show the current working directory in the interactive prompt.
Default: true
display.newline_before_prompt
Whether to add a blank line before the prompt after each agent response.
Default: true
display.newline_after_prompt
Whether to add a blank line after the prompt (before the agent response).
Default: true
display.input_style
Visual style applied to text typed into the REPL prompt. Makes submitted prompts easy to spot when scrolling back through a long session — reedline paints the buffer with this style on every repaint, including the final paint before the newline, so the styling lands in the terminal’s scrollback alongside the literal text.
Accepted values:
default(or unset): bold white-ish foreground on a slate-blue background, rendered in truecolor RGB so it looks the same across terminal themes.none: disable styling entirely.reverse: reverse video (swaps the terminal’s current foreground and background).bold,dim,italic,underline: single attribute, no colour change.- A colour name (
black,red,green,yellow,blue,magenta/purple,cyan,white): set only the foreground, mapped to the terminal’s palette.
Unknown values warn at startup and fall back to default.
Default: the banner preset described above.
[display]
show_path_in_prompt = false
newline_before_prompt = false
newline_after_prompt = false
input_style = "none" # or "cyan", "bold", "dim", etc.
[web]
Settings for the HTTP client shared by fetch_url and web_search. All keys are optional; unset fields use the defaults shown below.
| Key | Type | Default | Purpose |
|---|---|---|---|
user_agent | string | Real Chrome UA | Some search engines block non-browser UAs. Override if you need a specific identifier. |
request_timeout_seconds | int | 30 | Total request budget (connect + TLS + read). 0 falls back to the default. |
connect_timeout_seconds | int | unset | Separate cap on TCP + TLS handshake. Fail fast on unreachable hosts without shortening the whole request budget. |
read_timeout_seconds | int | unset | Per-chunk idle timeout. Catches bodies that stall mid-stream. |
max_redirects | int | 10 | Cap on 3xx hops. 0 disables redirects entirely. |
proxy | string | unset (honours HTTP_PROXY / HTTPS_PROXY / ALL_PROXY env) | Proxy URL. Schemes: http://, https://, socks5://, socks5h://, socks4://. The literal string "none" explicitly disables env-var auto-detection. |
ca_cert_file | path | unset | Extra PEM bundle to trust on top of the system store. Useful for corporate MITM proxies or self-signed internal services. Accepts single-cert and multi-cert files. |
https_only | bool | false | Refuse plain http:// URLs. |
min_tls_version | string | unset (reqwest default) | Minimum TLS version. Accepts "1.0", "1.1", "1.2", "1.3". Unknown values log a warn and fall through. Note: the bundled rustls backend supports only TLS 1.2 and 1.3 — "1.0" / "1.1" will surface a build error. |
danger_accept_invalid_certs | bool | false | DANGEROUS. Disable TLS certificate validation entirely. Emits a warn! on every startup when enabled. Only use against trusted local dev servers. |
danger_accept_invalid_hostnames | bool | false | DANGEROUS. Accept certificates whose hostname doesn’t match. Emits a warn! on every startup when enabled. Only use against trusted local dev servers. |
Example: corporate proxy with a private CA
[web]
proxy = "http://corp-proxy.internal:3128"
ca_cert_file = "/etc/ssl/corp-root-ca.pem"
min_tls_version = "1.2"
request_timeout_seconds = 60
Example: local testing against self-signed certs
[web]
# Route everything through a local SOCKS proxy you control.
proxy = "socks5h://127.0.0.1:1080"
# Accept self-signed certs on dev.local — KEEP THIS OFF IN PROD.
danger_accept_invalid_certs = true
Example: fail-fast timeouts
[web]
request_timeout_seconds = 5
connect_timeout_seconds = 2
max_redirects = 0
[shell]
Settings for shell command execution.
shell.sandbox
Whether to enable read-only filesystem sandboxing for shell commands in read mode. When enabled (default), shell commands can be executed in read mode but with the filesystem physically write-protected. When disabled, shell commands require write mode.
Default: true
[shell]
sandbox = false # disable sandboxed shell in read mode
The sandbox uses Landlock on Linux (kernel 5.13+) and sandbox-exec on macOS. On platforms where sandboxing is unavailable, shell commands always require write mode regardless of this setting.
[session]
Settings for session history retention and context window management.
session.context_messages
Maximum number of messages to send to the LLM API per request. Older messages are truncated from the beginning while preserving tool call chain integrity. The full history remains stored in SQLite – only the API payload is limited.
Default: 200
[session]
context_messages = 100
session.retention_days
Automatically delete sessions older than this many days on startup. Uses the session’s updated_at timestamp, so actively-resumed sessions are preserved even if originally created long ago.
Default: 90
[session]
retention_days = 30
session.max_storage_bytes
Maximum total byte size of all stored message content across all sessions. When exceeded on startup, the oldest sessions are deleted until the total is under the limit.
Default: 52428800 (50 MB)
[session]
max_storage_bytes = 10485760 # 10 MB
session.auto_compact
Automatically compact the conversation when input tokens exceed 80% of the context window. Compaction summarizes older messages and preserves recent ones, the todo list, and scratchpad entries.
Default: true
[session]
auto_compact = false
session.context_window
Override the model’s context window size (in tokens). Used for auto-compact threshold calculation. If not set, agsh infers the context window from the model name.
[session]
context_window = 200000
[thinking]
Settings for extended thinking (Claude provider only). Claude 4.6+ models use adaptive thinking automatically; older models use a fixed token budget.
thinking.enabled
Whether to enable extended thinking. When enabled, the model can use additional tokens for internal reasoning before responding.
Default: true
thinking.budget_tokens
Maximum number of tokens the model can use for thinking (for non-adaptive models).
Default: 16000
[thinking]
enabled = true
budget_tokens = 20000
[prompt]
Settings for injecting custom instructions into the system prompt. Use this to set installation-specific rules that should apply to every session – things the agent needs to know about your system, preferred tools, or policies.
prompt.instructions
A string of custom instructions that agsh will include in every system prompt, under a ## User Instructions section. The model is told to treat them as hard constraints unless they conflict with safety requirements.
Suitable use cases:
- System-specific policies: “Never install Python packages globally with pip – always use
uvor a venv.” - Installed tooling the agent should know about: “Poppler is available on this system – use
pdftotextfor PDFs.” - Workflow preferences: “Prefer ripgrep over grep; it’s installed and faster.”
- Signing / compliance rules: “Git commits on this system must use gpg signing.”
Default: unset (no custom instructions).
[prompt]
instructions = """
Never install Python packages globally with pip. Always use `uv` or a venv.
Poppler is available on this system — use `pdftotext` for PDFs.
Prefer ripgrep over grep.
"""
Notes:
- Empty or whitespace-only strings are treated as unset.
- Instructions apply to sub-agents spawned via
spawn_agenttoo. - Instructions are included at all permission levels (including
none) because they are authored by you.
[mcp]
Settings for MCP (Model Context Protocol) tool servers. MCP allows agsh to discover and use tools provided by external servers.
[[mcp.servers]]
An array of MCP server configurations. Each entry defines a server to connect to at startup.
| Field | Required | Description |
|---|---|---|
name | Yes | Unique name for this server. Used as namespace prefix for tools (name__tool). Must match [A-Za-z0-9_-]+, must not contain __, and must not be agsh, ide, or start with mcp_. |
transport | Yes | Transport type: "stdio" (spawn subprocess) or "http" (streamable HTTP). |
command | Stdio only | Path or name of the executable to spawn. On Windows, npx / .cmd / .bat / .ps1 are auto-wrapped in cmd /c. |
args | No | Arguments to pass to the command. |
env | No | Environment variables to set for the spawned process (stdio only). |
url | HTTP only | URL of the MCP server endpoint. |
auth_token | No | Bearer token for HTTP authentication (sent as Authorization: Bearer <token>). |
auth | No | OAuth authentication configuration (see below). Mutually exclusive with auth_token. |
headers | No | Custom HTTP headers to include with every request (HTTP only). |
headers_helper | No | Path to an executable whose stdout (Name: Value\n lines) is merged over headers at connect-time (HTTP only). Executed with AGSH_MCP_SERVER_NAME / AGSH_MCP_SERVER_URL in env; 15 s timeout. |
permission | No | Server-wide permission override. Applies to every tool on this server, beating the readOnlyHint the server advertises and the [mcp].default_permission global fallback. See Permission resolution below. |
allowed_tools | No | Optional allow-list of raw tool names (the form the server advertises, not the server__tool namespaced form). When set and non-empty, only these tools are registered; all others from this server are ignored. |
disabled_tools | No | Optional block-list of raw tool names. Applied after allowed_tools — tools listed here are never registered. Both lists can coexist; the net set is allowed_tools \ disabled_tools. |
tool_permissions | No | Per-tool permission overrides keyed by raw tool name. Beats the server-level permission and the server’s readOnlyHint when resolving a tool’s required permission. |
sampling | No | Allow this server to call sampling/createMessage against your configured LLM provider. Default false (reject). Enabling this lets a compromised server inject arbitrary messages into your LLM context and burn your provider quota — opt in per-server, deliberately. |
sampling_limit | No | Cap on sampling calls per agsh session from this server when sampling = true. Default 10. Requests beyond the limit return an INTERNAL_ERROR to the server. |
disabled | No | When true, the server is skipped entirely at startup — no process is spawned, no HTTP connect is attempted. Flip it back with agsh mcp enable <name> or by editing the config. Defaults to false. |
[mcp] top-level table
| Field | Purpose |
|---|---|
default_permission | Fallback permission for MCP tools whose server didn’t advertise readOnlyHint and doesn’t have a permission override. Accepts "none", "read", "ask", or "write". If unset the hardcoded fallback is "write" (strict). |
strict | When true (default), every turn is gated on all enabled MCP servers being Connected. If any are not, the turn is rejected with a shell-style error instead of sending the request to the model. Set to false to proceed with whichever servers are ready (a warn log names the missing ones). |
grace_seconds | Per-turn cap on how long to wait for still-Pending servers to connect before applying the strict check. Default 3. Set to 0 to skip waiting (useful for scripts that want to fail fast). |
connect_timeout_seconds | Per-server timeout for connect + initialize + list_tools. A hung stdio spawn or slow HTTPS handshake can’t stall the whole fleet past this bound. Default 30. |
Startup concurrency
MCP servers connect in parallel at startup, partitioned by transport so a fleet of stdio servers (process-spawn bound) doesn’t fight a fleet of HTTP servers (network bound):
- stdio:
AGSH_MCP_STDIO_CONCURRENCY(default3) - http:
AGSH_MCP_HTTP_CONCURRENCY(default20)
These env vars are tuning knobs — rarely needed, but useful if you’re running ~30 stdio servers on a constrained box (lower it) or ~50 HTTP servers (raise it).
Permission resolution
Every MCP tool’s required permission is resolved through a five-step chain; the first match wins:
server.tool_permissions[<raw-tool>]— explicit per-tool override.server.permission— explicit server-level override. Applies to every tool on that server regardless of what the server advertises.tool.annotations.readOnlyHintfrom the server:true→Read,false→Write.[mcp].default_permission— global fallback.- Hardcoded
Write— strict ultimate fallback.
User-supplied config (1, 2, 4) always beats the server’s self-classification — if a server lies about a tool, you can override. But when no user config says anything, the server’s hint is trusted for that specific tool so readOnlyHint = false destructive tools don’t silently become Read-accessible just because the user opted into a lenient global default.
Hint spoofing: a compromised server could claim readOnlyHint = true on a destructive tool. Defend by setting server.permission = "write" on suspect servers (step 2 wins) or by listing the destructive tools explicitly in tool_permissions / disabled_tools.
Stale config: entries in allowed_tools / disabled_tools / tool_permissions that don’t match any advertised tool get a warn! line at connect time. The server still connects; you just see a heads-up so you can clean up after the server renames a tool.
Visibility across levels: the resolved permission doesn’t hide a tool from the agent. Every registered tool is listed in the system prompt with its required level noted inline, and a per-turn [Permission context] block names the current level plus any tools it blocks. The agent can still reason about an inaccessible tool and suggest /permission <level> to enable it; the permission gate is enforced at dispatch time. Keeping the tool catalogue visible across levels is also what lets the Anthropic prompt cache survive mid-session permission toggles.
Examples
Exa — reliable web search when the built-in DuckDuckGo scraper gets CAPTCHA’d. The free tier works without an API key; paste a key into the headers table for the paid tier:
# Free tier — no key required
agsh mcp add exa https://mcp.exa.ai/mcp
# Paid tier — expands from EXA_API_KEY at connect time
agsh mcp add exa https://mcp.exa.ai/mcp --header "x-api-key=${EXA_API_KEY}"
Well-annotated server — no config needed. Every tool is classified by its own readOnlyHint (read tools Read, write tools Write):
[[mcp.servers]]
name = "notion"
transport = "http"
url = "https://mcp.notion.com/mcp"
User-declared trust on an unannotated server — all tools accessible in Read:
[[mcp.servers]]
name = "internal"
transport = "http"
url = "https://mcp.internal/…"
permission = "read"
Overriding a mis-annotated or distrusted tool — one specific tool requires Write:
[[mcp.servers]]
name = "notion"
transport = "http"
url = "https://mcp.notion.com/mcp"
[mcp.servers.tool_permissions]
"notion-do-something-scary" = "write"
Subset of a server’s tools — only query registers, all others are ignored:
[[mcp.servers]]
name = "pg"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-postgres"]
allowed_tools = ["query"]
Block-list with a narrow exception — all fs tools are Read-accessible except the two destructive ones, which are never registered:
[[mcp.servers]]
name = "filesystem"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem"]
permission = "read"
disabled_tools = ["delete_file", "move_file"]
MCP tools are registered with namespaced names in the format servername__toolname to prevent collisions with built-in tools or between servers.
Tool and resource descriptions returned from MCP servers are truncated at 2048 characters to keep the system prompt bounded.
[tools] — built-in tool filters
The three knobs [[mcp.servers]] exposes for MCP tools also apply to agsh’s built-in tools (read_file, write_file, execute_command, web_search, etc.) via a top-level [tools] table. MCP per-server filtering is separate from this and keeps its own namespaces — this block only affects the built-ins.
| Key | Purpose |
|---|---|
allowed_tools | Optional allow-list of built-in tool names. When set and non-empty, only these built-ins register. Use agsh tools list to see the canonical names. |
disabled_tools | Block-list of built-in tool names. Applied after allowed_tools; a tool here is never registered even if it also appears in the allow-list. |
tool_permissions | Per-tool required-permission override keyed by built-in name. Beats the hardcoded required level from the tool’s impl. Levels: none, read, ask, write. |
Stale entries (a name that doesn’t match any built-in) emit a warn! at startup. agsh still starts — the warning just flags a likely typo or a tool the binary renamed.
Restrict a session to read-only inspection:
[tools]
allowed_tools = ["read_file", "find_files", "search_contents", "fetch_url"]
Force execute_command to need write so ask mode prompts for every shell call:
[tools.tool_permissions]
execute_command = "write"
Disable web access entirely in a locked-down environment:
[tools]
disabled_tools = ["web_search", "fetch_url"]
Sub-agents spawned via spawn_agent inherit the same filter — a disabled built-in is disabled everywhere. Run agsh tools list to see every built-in’s effective required permission, whether a [tools.tool_permissions] override is in effect, and whether the current config enables it.
Environment variable substitution
Every string field listed above (command, args, env values, url, headers values, auth_token) supports ${VAR} and ${VAR:-default} expansion from the process environment. Missing variables with no default leave the literal ${VAR} in place and log a warning at startup. Use this to avoid committing secrets:
[[mcp.servers]]
name = "github"
transport = "http"
url = "https://mcp.github.com"
auth_token = "${GITHUB_MCP_TOKEN}"
Environment variables
| Variable | Default | Purpose |
|---|---|---|
AGSH_MCP_TOOL_TIMEOUT | 600000 ms (600 s) | Per-call timeout for MCP tools. Triggers notifications/cancelled on expiry. |
agsh mcp CLI
Manage configured servers without editing config.toml by hand:
| Command | Action |
|---|---|
agsh mcp list | Print all configured servers. |
agsh mcp get <name> | Print full details for one server. |
agsh mcp add <name> <url-or-command> [args...] [flags] | Persist a server. Transport is auto-detected: a URL starting with http[s]:// means HTTP, anything else means stdio. Preserves existing formatting/comments via toml_edit. |
agsh mcp remove <name> | Best-effort revoke stored OAuth tokens (RFC 7009) at the provider, then delete the server entry, clear stored credentials, and drop any resource-update ledger entries. |
agsh mcp disable <name> | Set disabled = true on the server entry. The next agsh start skips it entirely. |
agsh mcp enable <name> | Clear the disabled flag, so the server connects on the next start. |
agsh mcp reconnect <name> | Smoke-test a connect; prints ok or the error. |
agsh mcp tools <name> | Connect and list every advertised tool with its resolved permission, the chain step that decided it, and whether the current config allows it. Useful for populating --allow-tool, --disable-tool, or --tool-permission overrides without leaving the CLI. |
agsh mcp login <name> | Drive interactive OAuth. If the server has no [auth] block and uses HTTP, assumes type = "oauth" and persists the block on success. |
agsh mcp logout <name> | Call the provider’s revocation_endpoint (RFC 7009) best-effort, then clear stored credentials + auth-probe cache. |
agsh mcp add flags
| Flag | Purpose |
|---|---|
--transport <stdio|http> | Override the auto-detected transport. |
--env KEY=VALUE | Environment variable for stdio (repeatable). |
--header KEY=VALUE | HTTP header (repeatable). |
--auth <oauth|client-credentials|client-credentials-jwt> | Configure the [auth] block. |
--auth-token <TOKEN> | Static bearer token. Mutually exclusive with --auth. |
--client-id, --client-secret | OAuth / client-credentials client identifiers. |
--signing-key <PATH>, --signing-algorithm <ALG> | JWT signing material (client-credentials-jwt only). |
--scope <SCOPE> | OAuth scope (repeatable). |
--redirect-port <PORT> | Fixed OAuth redirect port (default: ephemeral). |
--permission <none|read|ask|write> | Per-server permission cap (applies to all tools on the server). |
--allow-tool <NAME> | Raw tool name to allow (repeatable). When set, only listed tools register. |
--disable-tool <NAME> | Raw tool name to block (repeatable). Applied after --allow-tool. |
--tool-permission <NAME=LEVEL> | Per-tool permission override (repeatable). LEVEL is none/read/ask/write. |
--sampling, --sampling-limit <N> | Opt into server-initiated sampling/createMessage. |
Example: Notion
$ agsh mcp add notion https://mcp.notion.com/mcp
ok: added 'notion' to ~/.config/agsh/config.toml
probe: server requires OAuth.
running OAuth authorisation for 'notion' (use --no-login to skip).
no [auth] block for 'notion' — assuming OAuth authorization_code.
…
ok: authorized 'notion'
agsh mcp add on an HTTP endpoint:
-
Probe — issues an unauthenticated
GET(3 s timeout, redirects off) and classifies the response per the MCP authorization spec + RFC 6750 + RFC 9728:2xx→ server is open, no login needed.401/403withWWW-Authenticate: Bearer …→ OAuth required. Theresource_metadata="…"attribute (RFC 9728) is captured at DEBUG.- Any other status → couldn’t infer, prints the status code.
- Network failure → prints the error.
-
Auto-login — if the probe says OAuth is required (or
--auth oauthwas explicitly set), the OAuth authorization_code flow runs immediately as though the user had chainedagsh mcp login <name>themselves. The synthesised[auth] = oauthblock is written back toconfig.tomlon success. -
Rollback on failure — if the OAuth flow errors out, the entry we just wrote is purged from
config.toml(alongside any partial credentials + probe cache), leaving the user’s config clean. The command exits non-zero. -
--no-login— skips step 2. The entry is still persisted and the probe’s hint is still printed; runagsh mcp login <name>when ready. Useful for scripted setup or when you expect to edit[auth]by hand.
The probe and the auto-login only run for HTTP servers, and only when the user didn’t provide --auth-token (static bearer) or --auth (other than oauth). Stdio servers skip both.
Remote hosts / SSH sessions
The OAuth flow redirects the browser to http://127.0.0.1:<port>/callback. When agsh is running on a different host than the browser (SSH session, container, Codespace, WSL), the browser can’t reach back and shows a “connection refused” error page. agsh handles this automatically:
- While
agsh mcp login <name>waits for the callback it also watches stdin. - The browser’s address bar still contains the full callback URL (including
codeandstate) even when the connection fails. Copy it, paste it into the agsh prompt, and press Enter. - Whichever completes first — the TCP callback or the pasted URL — wins.
$ agsh mcp login notion
server 'notion' has no [auth] block; assuming OAuth authorization_code.
Opening browser for MCP server 'notion' OAuth authorization...
If the browser didn't open, visit:
https://mcp.notion.com/authorize?response_type=code&…
Waiting for OAuth callback (up to 120s).
If the browser can't reach this host (e.g. you're over SSH), paste the full
callback URL here and press Enter.
http://127.0.0.1:46437/callback?code=…&state=… ← paste here
ok: authorized 'notion'
REPL parity
Inside the REPL:
/mcp list— list configured servers./mcp reconnect <server>— reconnect smoke-test./mcp login <server>//mcp logout <server>— run the auth flow or revoke./mcp <server>:<prompt> [args...]— render a server-defined prompt as the next user turn.
Resources and prompts
In addition to tools, agsh exposes MCP resources and prompts through four builtin tools (deferred — the agent activates them when needed):
| Builtin | Purpose |
|---|---|
list_mcp_resources | List resources from one or every configured server. |
read_mcp_resource | Read a resource by server + uri; text inline, binary base64-encoded. |
list_mcp_prompts | List prompts from one or every configured server, including their declared arguments. |
get_mcp_prompt | Render a prompt by server + name with optional arguments; returns <role>: <text> lines. |
subscribe_mcp_resource | Subscribe to resources/updated notifications for a specific URI. |
unsubscribe_mcp_resource | Cancel a prior subscription. |
list_mcp_resource_updates | Print every resource that has been reported as updated since the session started. |
Connection lifecycle
- Reconnection is automatic for all transports (stdio, plain HTTP, OAuth-authenticated HTTP) when the transport closes mid-session. HTTP transports use exponential backoff (1s, 2s, 4s, 8s, 16s, capped 30s, max 5 attempts); stdio gets one immediate retry. The reconnect runs on a blocking thread to work around an upstream rmcp bug where the auth future is
!Send. - Session-expired recovery: rmcp 1.5 transparently re-initialises HTTP sessions on 404 / JSON-RPC
-32001. agsh relies on this; no per-call handling is required. - Cancellation: when the agent cancels a tool call (e.g. Ctrl-C), agsh sends
notifications/cancelledto the server with the in-flight request id so the server can stop work. - Timeouts: tool calls default to 600 s; override with
AGSH_MCP_TOOL_TIMEOUTin ms. - Tool list refresh: on
tools/list_changed, agsh re-discovers the server’s tools and hot-swaps them in the registry — no restart needed. - Progress notifications: MCP tool calls attach a per-request
progressToken; incomingnotifications/progressrender as a live status line under the tool invocation. - Server instructions:
InitializeResult.instructionsis captured once per connection and spliced into the system prompt (sanitised + truncated to 2048 chars) under## MCP Server Instructions. - Auth-probe cache: 401 responses are cached for 15 minutes so a restart after a failed auth flow skips the unauthenticated probe and goes straight to OAuth. Cleared by
agsh mcp logout. resources/list_changed,prompts/list_changed, andresources/updatednotifications are logged atinfo/debuglevel.
Server-to-client features
| Feature | agsh behaviour |
|---|---|
roots/list | Returns a single root: file://<current-working-directory> with the directory basename as the name. |
elicitation/create | Always responds with Decline and logs a warning — interactive form/URL input is not wired into the REPL. |
sampling/createMessage | Rejected with METHOD_NOT_FOUND unless the server has sampling = true in its config. When allowed, the current provider handles the request; per-session sampling_limit caps how many times each server may invoke it. |
[mcp.servers.auth]
OAuth authentication for HTTP MCP servers. Set type to choose the authentication method. This is mutually exclusive with auth_token.
| Field | Required | Description |
|---|---|---|
type | Yes | Auth method: "client_credentials", "client_credentials_jwt", or "oauth" |
client_id | Varies | OAuth client ID (required for client_credentials/jwt, optional for oauth with dynamic registration) |
client_secret | Varies | Client secret (required for client_credentials, optional for oauth) |
scopes | No | OAuth scopes to request |
resource | No | Resource parameter (RFC 8707), client_credentials only |
signing_key_path | JWT only | Path to PEM private key file |
signing_algorithm | No | JWT signing algorithm: RS256 (default), RS384, RS512, ES256, ES384 |
redirect_port | No | Local port for OAuth authorization code callback. When omitted, agsh binds to a random ephemeral port (recommended). oauth only. |
Examples
Stdio server
[[mcp.servers]]
name = "postgres"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]
permission = "write"
HTTP server
[[mcp.servers]]
name = "web-tools"
transport = "http"
url = "http://localhost:8080/mcp"
permission = "read"
HTTP server with authentication
[[mcp.servers]]
name = "api"
transport = "http"
url = "https://api.example.com/mcp"
auth_token = "your-bearer-token"
permission = "write"
[mcp.servers.headers]
X-Custom-Header = "value"
Stdio server with environment variables
[[mcp.servers]]
name = "github"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
permission = "read"
[mcp.servers.env]
GITHUB_TOKEN = "ghp_..."
Multiple servers
[[mcp.servers]]
name = "filesystem"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/projects"]
permission = "read"
[[mcp.servers]]
name = "github"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
permission = "write"
HTTP server with OAuth client credentials
[[mcp.servers]]
name = "api"
transport = "http"
url = "https://api.example.com/mcp"
permission = "write"
[mcp.servers.auth]
type = "client_credentials"
client_id = "my-client-id"
client_secret = "my-client-secret"
scopes = ["read", "write"]
HTTP server with JWT client credentials
[[mcp.servers]]
name = "api"
transport = "http"
url = "https://api.example.com/mcp"
[mcp.servers.auth]
type = "client_credentials_jwt"
client_id = "my-client-id"
signing_key_path = "/path/to/private-key.pem"
signing_algorithm = "RS256"
scopes = ["admin"]
HTTP server with OAuth authorization code flow
On first connection, agsh opens a browser for authorization and stores the token for future use.
[[mcp.servers]]
name = "github-mcp"
transport = "http"
url = "https://mcp.example.com"
[mcp.servers.auth]
type = "oauth"
client_id = "my-app-id"
scopes = ["repo", "user"]
redirect_port = 8400
If client_id is omitted, agsh attempts dynamic client registration with the server.
Environment Variables
The config file is the recommended way to configure agsh. Environment variables are useful as overrides – for example, in CI pipelines, containers, or when you want to temporarily switch providers without editing your config.
Environment variables override config file values but are overridden by CLI flags.
agsh-Specific Variables
| Variable | Description | Example |
|---|---|---|
AGSH_PROVIDER | LLM provider name | openai, claude |
AGSH_MODEL | Model identifier | gpt-4o, claude-sonnet-4-20250514 |
AGSH_PERMISSION | Default permission mode | none, read, write |
AGSH_CONFIG_DIR | Override the default config directory. Points at the agsh directory itself (contains config.toml and skills/). The only isolation knob that works on every platform — dirs::config_dir() ignores $XDG_CONFIG_HOME on macOS/Windows. | /tmp/agsh-test/agsh |
MCP Variables
| Variable | Description | Default |
|---|---|---|
AGSH_MCP_TOOL_TIMEOUT | Per-call timeout for MCP tools, in milliseconds. Applies to every remote tool invocation; on timeout agsh cancels the request and returns an error to the model. | 600000 (600s) |
Provider API Keys
| Variable | Used When |
|---|---|
OPENAI_API_KEY | Provider is openai |
CLAUDE_API_KEY | Provider is claude |
OAuth Authentication
| Variable | Description |
|---|---|
CLAUDE_OAUTH_TOKEN | OAuth access token for the Claude provider |
OAuth tokens (with sk-ant-oat01- prefix) are also auto-detected when passed via CLAUDE_API_KEY.
On first use, the OAuth token is saved to the database and loaded automatically on subsequent launches. Setting the env var again replaces the stored token.
Provider Base URL
| Variable | Description |
|---|---|
OPENAI_BASE_URL | Custom base URL for the OpenAI-compatible endpoint |
Logging
agsh uses the tracing framework. The log level can be controlled with:
| Variable | Description | Example |
|---|---|---|
RUST_LOG | Standard Rust log filter | agsh=debug, agsh=trace |
If RUST_LOG is not set, the verbosity flag (-v, -vv, -vvv) controls the level:
| Flag | Level |
|---|---|
| (none) | warn |
-v | info |
-vv | debug |
-vvv | trace |
Logs are written to stderr so they do not interfere with agent output.
CLI Options
agsh [OPTIONS] [PROMPT]
agsh <COMMAND>
Commands
setup
Run the interactive configuration wizard. Prompts for provider, authentication, model, and base URL, then writes the configuration to ~/.config/agsh/config.toml.
agsh setup
This wizard also runs automatically on first launch when no config file exists.
export
Export a session as Markdown.
agsh export <SESSION_ID> [-o <PATH>]
Use -o - to print to stdout. See Sessions for details.
delete
Delete one or more sessions by UUID, or all sessions with --all.
agsh delete <SESSION_ID>...
agsh delete --all
list
List past sessions with ID, last update time, and a preview.
agsh list [-n <LIMIT>]
Default limit: 20.
Arguments
[PROMPT]
Run a one-shot prompt and exit. The agent processes the prompt, prints its response, and the process terminates.
agsh "list all files larger than 1MB in the current directory"
When omitted, agsh starts in interactive mode.
Options
-c, --continue [SESSION_ID]
Resume a session. Without a session ID, resumes the most recently updated session. With a session ID, resumes that specific session.
agsh -c # resume last session
agsh -c 550e8400-e29b-41d4-a716-446655440000 # resume specific session
Errors if the session does not exist or is locked by another agsh instance.
--permission <MODE>
Set the initial permission mode. Accepts none (or n), read (or r), ask (or a), write (or w).
agsh --permission write
agsh --permission ask
Default: read.
--provider <NAME>
Set the LLM provider. Overrides AGSH_PROVIDER and the config file.
agsh --provider claude
Supported values: openai, claude.
-m, --model <MODEL>
Set the model name. Overrides AGSH_MODEL and the config file.
agsh -m gpt-4o-mini
--base-url <URL>
Set a custom API base URL. Overrides OPENAI_BASE_URL and the config file.
agsh --base-url http://localhost:11434/v1
--no-stream
Disable streaming mode. The agent waits for the complete response before displaying it. By default, responses are streamed token-by-token.
agsh --no-stream
--render-mode <MODE>
Set the output render mode. Accepts bat (default), termimad (or rich), or raw.
bat: Syntax-highlighted markdown output via bat.termimad: Full terminal formatting (box-drawn code blocks, reflowed paragraphs, formatted tables).raw: Raw markdown printed verbatim with aligned tables.
agsh --render-mode raw
Can also be set permanently via display.render_mode in the config file.
--thinking
Enable extended thinking (Claude provider only).
agsh --thinking
--thinking-budget <TOKENS>
Set the extended thinking token budget. Implies --thinking.
agsh --thinking-budget 20000
-v, --verbose
Increase log verbosity. Can be repeated up to three times.
agsh -v # info
agsh -vv # debug
agsh -vvv # trace
--help
Print help information.
--version
Print version information.
Interactive Mode
Start agsh without the -p flag to enter interactive mode:
agsh
You get a prompt:
agsh [r] >
Type your instruction and press Enter to submit. The agent processes your request and prints its response (streamed in real time as Markdown). When it finishes, you get another prompt.
Keybindings
agsh uses Emacs-style keybindings (provided by reedline).
Input
| Key | Action |
|---|---|
| Enter | Submit the current prompt |
| Alt+Enter | Insert a newline (for multi-line input) |
| Shift+Tab | Cycle the permission mode (none → read → ask → write → none) |
Navigation
| Key | Action |
|---|---|
| Ctrl+A | Move cursor to start of line |
| Ctrl+E | Move cursor to end of line |
| Ctrl+F | Move cursor forward one character |
| Ctrl+B | Move cursor backward one character |
| Alt+F | Move cursor forward one word |
| Alt+B | Move cursor backward one word |
Editing
| Key | Action |
|---|---|
| Ctrl+D | Delete character under cursor / exit on empty line |
| Ctrl+H, Backspace | Delete character before cursor |
| Ctrl+K | Kill text from cursor to end of line |
| Ctrl+U | Kill text from start of line to cursor |
| Ctrl+W | Kill word before cursor |
| Ctrl+Y | Yank (paste) killed text |
Control
| Key | Action |
|---|---|
| Ctrl+C | Interrupt the running agent; clear the line if idle |
| Ctrl+D | Exit the shell (when the line is empty) |
| Ctrl+R | Reverse incremental search through history |
| Ctrl+L | Clear the screen |
Prompt Format
agsh [indicator] >
The indicator shows the current permission mode:
| Mode | Indicator | Color |
|---|---|---|
| None | [n] | Green |
| Read | [r] | Yellow |
| Ask | [a] | Magenta |
| Write | [w] | Red |
The color provides a visual cue about the agent’s current capabilities. Red means the agent can modify your system.
Multi-Line Input
Press Alt+Enter to insert a newline instead of submitting. The prompt changes to show continuation:
agsh [r] > write a python script that
... prints hello world
... and saves it to hello.py
Press Enter on the last line to submit the entire multi-line input.
Pasting multi-line content also works seamlessly — all pasted lines appear in the buffer for review, and you press Enter to submit.
Slash Commands
agsh supports / prefix commands for controlling the shell:
| Command | Description |
|---|---|
/help | Show available commands |
/exit | Exit the shell |
/clear | Clear the terminal screen |
/session | Show the current session ID |
/permission [none|read|ask|write] | Show or set the permission level |
/compact | Summarize and compact the session history |
/cd [path] | Change working directory |
/mcp list | List configured MCP servers with their live state (pending / connected / failed / disabled) |
/mcp reconnect <server> | Smoke-test connect for one server |
/mcp login <server> | Run the OAuth flow from the REPL |
/mcp logout <server> | Revoke cached credentials for a server |
/mcp <server>:<prompt> [args...] | Render a server-defined prompt and send it to the agent |
/compact
The /compact command asks the LLM to summarize the entire conversation, then replaces the message history with a single summary message. This is useful for long sessions that are approaching the context window limit or becoming expensive.
After compacting, the session continues with the summary as context. The previous messages are removed from both memory and the database.
Shell Escape
Prefix any input with ! to execute it directly as a shell command, bypassing the LLM entirely:
agsh [r] > !pwd
/home/user/projects
agsh [r] > !ls -la
total 32
drwxr-xr-x 5 user user 4096 Mar 4 10:00 .
...
agsh [r] > !ping 1.1.1.1 -c 2
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
...
The command runs with inherited stdin/stdout/stderr, so it behaves exactly like a regular shell. This is useful for quick checks without waiting for the LLM.
Exiting
You can exit agsh in any of these ways:
- Type
/exit - Type
exitorquit - Press Ctrl+D on an empty line
Interrupting the Agent
Press Ctrl+C while the agent is running to interrupt it. This cancels the current LLM request and kills any running shell commands that were spawned by the agent.
One-Shot Mode
One-shot mode runs a single prompt and exits, similar to bash -c:
agsh "your prompt here"
The agent processes the prompt (including any tool calls), prints its response, and the process terminates. The session UUID is printed to stderr on exit.
Examples
# Simple question
agsh "what is my current working directory?"
# File operations (requires write permission)
agsh --permission write "create a file called notes.txt with today's date"
# Search
agsh "find all TODO comments in this project"
# Web search
agsh "search the web for the latest Rust release"
Combining with Other Flags
All configuration flags work in one-shot mode:
# Use a specific provider and model
agsh --provider claude -m claude-sonnet-4-20250514 "explain this codebase"
# With write permission
agsh --permission write "run 'cargo test' and summarize the results"
# Disable streaming
agsh --no-stream "read README.md and summarize it"
Session Behavior
One-shot mode creates a new session for each invocation. The session UUID is printed to stderr when the run completes:
Session: 550e8400-e29b-41d4-a716-446655440000
You can resume this session later in interactive mode:
agsh -s 550e8400-e29b-41d4-a716-446655440000
Permissions
agsh uses a four-level permission system to control what tools the agent can use. This gives you control over the agent’s capabilities and prevents accidental modifications.
Permission Levels
| Level | Indicator | Allowed Tools |
|---|---|---|
| None | [n] (green) | No tools. The agent can only respond with text. |
| Read | [r] (yellow) | Read-only tools: read_file, find_files, search_contents, fetch_url, web_search, execute_command (sandboxed), todo_write, spawn_agent, scratchpad tools |
| Ask | [a] (magenta) | All tools, but each call requires user approval (Y/n prompt) |
| Write | [w] (red) | All tools without restrictions: write_file, edit_file, execute_command (unsandboxed) |
Each level includes all tools from the levels below it. Write mode includes all read tools.
Default Permission
The default permission is read. You can change it with:
- CLI flag:
agsh --permission write - Environment variable:
export AGSH_PERMISSION=write
Changing Permissions at Runtime
Press Shift+Tab to cycle through permission levels:
none → read → ask → write → none → ...
Or use the /permission slash command:
/permission write
/permission ask
The prompt indicator updates immediately to reflect the new level. The agent learns the current level via a per-turn [Permission context] block prepended to your message (see How Permissions Work below).
Ask Mode
In ask mode, the agent has access to all tools, but each tool call is paused for your approval:
[ask] Shell ls -la (Y/n)
Press Enter or y to approve, or n to deny. If denied, the agent receives an error and may try an alternative approach.
This mode is useful when you want the agent to have full capabilities but want to review each action before it executes.
How Permissions Work
When the agent attempts to use a tool, agsh checks whether the current permission level allows it:
- If allowed, the tool executes normally.
- In ask mode, you are prompted to approve or deny.
- If denied, agsh returns an error message to the agent explaining which level is required and suggests running
/permission <level>.
Telling the agent the current level
agsh lists every registered tool in the system prompt with its required permission level inline — nothing is filtered out — and each user message carries a compact [Permission context] block:
<context>
[Permission context]
Current permission level: read
Only read-only tools are executable.
...
</context>
That two-line block is the only permission-dependent content in the request. The system prompt and the tools-array schemas stay byte-identical across /permission toggles, so mid-session level changes don’t invalidate the Anthropic prompt cache — the entire message history stays warm.
MCP tool permissions
MCP tools are classified through a 5-step resolution chain: per-tool override → server-level override → the server’s own readOnlyHint → [mcp].default_permission → hardcoded Write fallback. See the Permission resolution section of the Config File docs for the full rules and how to override a misclassified tool.
Built-in tool permissions
Any built-in tool’s required permission can be overridden from config.toml without editing code — see [tools] — built-in tool filters. The same section documents how to allow-list or block-list specific built-ins (e.g. disabling web_search in a locked-down environment).
Examples
Read Mode (Default)
agsh [r] > read the contents of main.rs
The agent uses read_file and shows the contents. Shell commands also work in read mode, but run in a read-only sandbox – the filesystem is physically write-protected for the child process:
agsh [r] > list the files in this directory
agsh [r] > show me the git log
Commands like ls, cat, git log, df, ps, and uname work normally. Commands that attempt to write to the filesystem (e.g., touch, rm, mkdir) will fail with a permission error.
If you ask the agent to modify a file:
agsh [r] > add a comment to the top of main.rs
The agent will explain that it cannot write files in read mode and suggest switching to write mode.
Note: The read-only sandbox uses Landlock on Linux (kernel 5.13+) and sandbox-exec on macOS. On platforms where sandboxing is unavailable, shell commands are not available in read mode. You can disable sandboxed shell execution by setting
sandbox = falseunder[shell]in the config file (see Config File).
Write Mode
agsh [w] > run cargo test and show me the output
The agent uses execute_command to run the tests and shows the results.
Sessions
Sessions persist your conversation history so you can resume later. Each session is identified by a UUID and stored in a SQLite database.
How Sessions Work
- A session is not created when agsh starts. It is created lazily when you send the first message.
- When a session is created, its UUID is printed to stderr.
- When you exit agsh (Ctrl+D), the session UUID is printed again so you can note it for later.
- Sessions include the full message history: your inputs, the agent’s responses, and tool call results.
Resuming a Session
Continue Last Session
agsh -c
This resumes the most recently updated session.
By UUID
agsh -c 550e8400-e29b-41d4-a716-446655440000
The agent loads the previous conversation history and continues from where you left off.
Session Locking
Only one agsh instance can be attached to a session at a time. This prevents race conditions from concurrent writes.
- If you try to resume a session that is locked by a running agsh process, you will get an error.
- If the locking process has exited (crashed or was killed), agsh detects this and allows you to take over the lock.
Storage Location
Sessions are stored in a SQLite database at a platform-specific location:
| Platform | Path |
|---|---|
| Linux | ~/.local/share/agsh/sessions.db ($XDG_DATA_HOME/agsh/sessions.db) |
| macOS | ~/Library/Application Support/agsh/sessions.db |
| Windows | %APPDATA%\agsh\sessions.db |
Database Schema
The database has three tables:
sessions – one row per session:
| Column | Type | Description |
|---|---|---|
id | TEXT (UUID) | Primary key |
created_at | TEXT (RFC 3339) | When the session was created |
updated_at | TEXT (RFC 3339) | When the session was last updated |
locked_by | TEXT (PID) | PID of the process holding the lock, or NULL |
metadata | TEXT | Reserved for future use |
messages – one row per message in a session:
| Column | Type | Description |
|---|---|---|
id | INTEGER | Auto-incrementing primary key |
session_id | TEXT (UUID) | Foreign key to sessions.id |
role | TEXT | user, assistant, or tool_results |
content | TEXT | Message content (plain text or JSON) |
created_at | TEXT (RFC 3339) | When the message was saved |
tool_outputs – scratchpad entries, one row per entry:
| Column | Type | Description |
|---|---|---|
session_id | TEXT (UUID) | Part of composite primary key |
name | TEXT | Part of composite primary key |
content | TEXT | The stored content |
created_at | TEXT (RFC 3339) | When the entry was created |
Scratchpad entries are scoped to a session. Two sessions can have entries with the same name. Entries are preserved across compaction but deleted when a session is deleted.
History Retention
agsh automatically manages session storage on startup with sensible defaults:
retention_days(default:90) – deletes sessions whoseupdated_atis older than this many days.max_storage_bytes(default:52428800/ 50 MB) – when total message content exceeds this limit, the oldest sessions are deleted until the total is under the limit.
You can override these defaults in the config file under [session]:
[session]
retention_days = 30 # delete sessions not used in 30 days
max_storage_bytes = 10485760 # cap total storage at ~10 MB
See Config File for details.
Context Window Limiting
Long sessions can exceed the LLM’s context window or become expensive. The context_messages setting (default: 200) limits how many recent messages are sent to the API:
[session]
context_messages = 100
The full history remains in SQLite for resumption. Only the API payload is truncated. The truncation preserves tool call chains (it never splits a tool use from its result).
Compacting a Session
If a session becomes too long, you can use the /compact command to have the LLM summarize the conversation and replace older messages with a structured summary. Recent messages are preserved verbatim. The summary includes key files, decisions, errors, and user preferences.
Compaction preserves scratchpad entries and the todo list, and re-injects environment context so the agent isn’t disoriented after compaction.
Auto-Compact
When auto_compact is enabled (default: true), agsh automatically compacts the conversation when the input token count exceeds 80% of the context window. This runs between turns, not during tool loops.
[session]
auto_compact = true
context_window = 200000 # optional override
Listing Sessions
To see past sessions:
agsh list
This shows a table with each session’s ID, last update time, and a preview of the first message:
ID Updated Preview
550e8400-e29b-41d4-a716-446655440000 2026-03-14 12:00:00 How do I implement a binary search tree?
a1b2c3d4-e5f6-7890-abcd-ef1234567890 2026-03-13 09:30:00 Fix the login page CSS
By default the 20 most recent sessions are shown. Use -n to change:
agsh list -n 50
Exporting a Session
You can export any session as a Markdown file:
agsh export 550e8400-e29b-41d4-a716-446655440000
This writes session-550e8400-e29b-41d4-a716-446655440000.md in the current directory with the full conversation history. User and assistant messages are rendered as Markdown sections, while tool calls and results are wrapped in collapsible <details> blocks.
To write to a specific file:
agsh export 550e8400-e29b-41d4-a716-446655440000 -o conversation.md
To print to stdout (for piping):
agsh export 550e8400-e29b-41d4-a716-446655440000 -o -
Deleting Sessions
Delete specific sessions by UUID:
agsh delete 550e8400-e29b-41d4-a716-446655440000
Delete multiple sessions at once:
agsh delete 550e8400-e29b-41d4-a716-446655440000 a1b2c3d4-e5f6-7890-abcd-ef1234567890
Delete all sessions:
agsh delete --all
Managing Sessions via SQLite
You can also manage sessions directly through the SQLite database. For example, to list all sessions:
sqlite3 ~/.local/share/agsh/sessions.db \
"SELECT id, created_at, updated_at FROM sessions ORDER BY updated_at DESC;"
Skills
Skills are user-defined knowledge packages that give the agent non-standard knowledge – manuals, procedures, tool-specific instructions, and experience the LLM doesn’t have natively. Each skill is a directory containing a SKILL.md file with structured metadata.
How Skills Work
- Skills live in
~/.config/agsh/skills/(platform-specific config dir). - Each skill is a directory:
skills/<name>/SKILL.md. SKILL.mdstarts with a YAML frontmatter block declaring the skill’s metadata, followed by Markdown body content.- On every prompt, agsh discovers all valid skills and lists them in the system prompt with their
descriptionandwhen_to_use. - The agent invokes a skill by calling the
skilltool with the skill name. The tool returns the full body, which the agent follows. - Skills are available in read, ask, and write permission modes (not in none).
File Format
A skill is a directory under ~/.config/agsh/skills/ containing a SKILL.md file:
~/.config/agsh/skills/
└── download-videos/
└── SKILL.md
SKILL.md must begin with a YAML frontmatter block, followed by the skill body:
---
description: Download videos from various websites using yt-dlp
when_to_use: When the user wants to download a video from a website
allowed_tools: [execute_command]
version: "1.0"
user_invocable: true
---
# Download Videos with yt-dlp
## Installation
Install yt-dlp:
\```bash
pip install yt-dlp
\```
## Basic Usage
Download a video:
\```bash
yt-dlp "https://example.com/video"
\```
Required Frontmatter Fields
| Field | Description |
|---|---|
description | One-line summary of what the skill does. Shown in the system prompt. |
when_to_use | A hint telling the agent when to invoke the skill. Shown in the system prompt. |
Skills missing either field are skipped at discovery with a warning log.
Optional Frontmatter Fields
| Field | Default | Description |
|---|---|---|
allowed_tools | [] | Array or CSV string of tool names the skill expects. Currently advisory (not enforced). |
version | none | Free-form version label (e.g. "1.0", "2024-03-14"). |
user_invocable | true | Reserved for future /skill <name> slash command. |
Variable Substitution
The skill body may reference these variables, which are expanded when the skill is loaded:
${AGSH_SKILL_DIR}– the absolute path to the skill’s directory. Use this to reference bundled helper files (e.g.${AGSH_SKILL_DIR}/helper.sh).${AGSH_SESSION_ID}– the current session UUID.
Storage Location
| Platform | Path |
|---|---|
| Linux | ~/.config/agsh/skills/<name>/SKILL.md ($XDG_CONFIG_HOME/agsh/skills/) |
| macOS | ~/Library/Application Support/agsh/skills/<name>/SKILL.md |
| Windows | %APPDATA%\agsh\skills\<name>\SKILL.md |
How the Agent Uses Skills
When skills are available, the system prompt includes a ## Skills section like:
## Skills
- **download-videos**: Download videos from various websites using yt-dlp — When the user wants to download a video from a website
- **deploy-kubernetes**: Deploy services to a K8s cluster — When the user asks to deploy to Kubernetes
The agent loads a skill by calling the skill tool:
skill(name: "download-videos")
The tool returns the full body of SKILL.md (with variables expanded) as its output. The agent then follows the instructions.
Tips
- Use short, unambiguous skill names (e.g.
setup-postgres, notpg). The name is what the agent sees and calls. - Write
descriptionandwhen_to_useconcisely – they go into every system prompt and consume tokens. - Keep each skill focused on a single topic or procedure. Spawn multiple skills rather than one giant one.
- Bundle supporting files in the skill directory and reference them with
${AGSH_SKILL_DIR}/file.ext. - Skills are re-discovered on every prompt, so you can add, edit, or remove skills mid-session without restarting agsh.
Providers Overview
Providers are the LLM inference backends that agsh uses to process your instructions. agsh ships with two built-in providers:
| Provider | API | Streaming | Tool Calling |
|---|---|---|---|
| OpenAI | Chat Completions | SSE | Function calling |
| Claude | Messages API | SSE (named events) | Content blocks |
Selecting a Provider
Set the provider via any configuration layer:
# CLI flag
agsh --provider openai
# Environment variable
export AGSH_PROVIDER=claude
# Config file (~/.config/agsh/config.toml)
[provider]
name = "openai"
OpenAI-Compatible APIs
The openai provider works with any API that implements the OpenAI Chat Completions format. This includes:
- OpenAI (default endpoint)
- Ollama (
http://localhost:11434/v1) - OpenRouter (
https://openrouter.ai/api/v1) - vLLM, LiteLLM, and other OpenAI-compatible servers
Set the --base-url flag or OPENAI_BASE_URL environment variable to point to the alternative endpoint.
Streaming vs Non-Streaming
By default, agsh uses streaming mode: tokens appear in the terminal as they are generated. Use --no-stream to wait for the complete response before displaying it.
Streaming is recommended for interactive use. Non-streaming may be useful for scripting or when the provider does not support SSE.
OpenAI Provider
The OpenAI provider uses the Chat Completions API. It also works with any OpenAI-compatible API endpoint.
Configuration
| Setting | Value |
|---|---|
| Provider name | openai |
| Default base URL | https://api.openai.com/v1 |
| API key env var | OPENAI_API_KEY |
| Auth method | Bearer token (Authorization: Bearer <key>) |
Minimal Setup
export AGSH_PROVIDER=openai
export AGSH_MODEL=gpt-4o
export OPENAI_API_KEY=sk-...
agsh
Config File
[provider]
name = "openai"
model = "gpt-4o"
Supported Models
Any model available through the OpenAI Chat Completions API (or compatible endpoint) that supports tool calling:
gpt-4o,gpt-4o-minigpt-4-turboo1,o3-mini- Third-party models via compatible APIs
Custom Base URL
To use an OpenAI-compatible endpoint, set the base URL:
# Ollama
agsh --provider openai --model llama3 --base-url http://localhost:11434/v1
# OpenRouter
agsh --provider openai --model anthropic/claude-sonnet-4-20250514 --base-url https://openrouter.ai/api/v1
Or in the config file:
[provider]
name = "openai"
model = "llama3"
api_key = "unused"
base_url = "http://localhost:11434/v1"
API Details
Endpoint: POST {base_url}/chat/completions
Tool format: Tools are sent as function definitions:
{
"type": "function",
"function": {
"name": "read_file",
"description": "Read the contents of a file at the given path.",
"parameters": { "type": "object", "properties": { ... } }
}
}
Tool results: Sent back as messages with role: "tool" and the corresponding tool_call_id.
Streaming: Uses Server-Sent Events (SSE) with data: {...} lines. The stream ends with data: [DONE].
Claude Provider
The Claude provider uses the Claude API Messages endpoint.
Configuration
| Setting | Value |
|---|---|
| Provider name | claude |
| Default base URL | https://api.anthropic.com |
| API key env var | CLAUDE_API_KEY |
| OAuth token env var | CLAUDE_OAUTH_TOKEN |
| Auth method | x-api-key header (API key) or Authorization: Bearer (OAuth) |
| API version | 2023-06-01 |
| Max tokens | 8192 |
Quickest Start (OAuth Login)
Run the setup wizard and choose OAuth login when prompted:
agsh setup
This opens your browser for authorization, exchanges the code for tokens, and saves them to the database. No API key needed.
Minimal Setup (API Key)
export AGSH_PROVIDER=claude
export AGSH_MODEL=claude-sonnet-4-20250514
export CLAUDE_API_KEY=sk-ant-api03-...
agsh
Minimal Setup (OAuth Token)
export AGSH_PROVIDER=claude
export AGSH_MODEL=claude-sonnet-4-20250514
export CLAUDE_OAUTH_TOKEN=sk-ant-oat01-...
agsh
On the first run, the OAuth token is saved to the database. On subsequent runs, the token is loaded automatically without needing the environment variable.
Config File
[provider]
name = "claude"
model = "claude-sonnet-4-20250514"
Authentication
agsh supports two authentication methods for the Claude provider:
OAuth Login
The recommended way to authenticate. Run agsh setup (or let the first-launch wizard guide you) and select OAuth login. This performs an OAuth Authorization Code flow with PKCE:
- agsh generates a PKCE challenge and opens your browser to Claude’s authorization page
- You authorize the application in your browser
- You paste the authorization code back into agsh
- agsh exchanges the code for access and refresh tokens
- Tokens are stored in the database and refreshed automatically
The OAuth client ID defaults to Claude Code’s client ID but can be overridden via the CLAUDE_CLIENT_ID environment variable.
API Key
Traditional API key authentication using the x-api-key header. Set via CLAUDE_API_KEY env var or provider.api_key in the config file.
Manual OAuth Token
OAuth token authentication using the Authorization: Bearer header. Set via CLAUDE_OAUTH_TOKEN env var or provider.oauth_token in the config file.
OAuth tokens are automatically detected by their sk-ant-oat01- prefix, even when passed via CLAUDE_API_KEY.
Token lifecycle:
- Provide the initial token via env var, config, or OAuth login
- agsh saves it to the database on first use
- On subsequent launches, the token is loaded from the database
- If the token expires, agsh refreshes it automatically and updates the database
- Setting a new env var or config value replaces the stored token
Token refresh URL: Defaults to https://api.anthropic.com/v1/oauth/token. Configurable via provider.oauth_token_url in the config file.
Supported Models
Any model available through the Claude API:
claude-opus-4-20250514claude-sonnet-4-20250514claude-haiku-4-5-20251001
Custom Base URL
To use a Claude-compatible proxy or gateway:
agsh --provider claude --model claude-sonnet-4-20250514 --base-url https://my-proxy.example.com
API Details
Endpoint: POST {base_url}/v1/messages
Headers (API key):
x-api-key: <api_key>anthropic-version: 2023-06-01content-type: application/json
Headers (OAuth):
Authorization: Bearer <oauth_token>anthropic-version: 2023-06-01content-type: application/json
System prompt: Sent as a top-level system field in the request body (not as a message).
Tool format: Tools are defined with input_schema instead of parameters:
{
"name": "read_file",
"description": "Read the contents of a file at the given path.",
"input_schema": { "type": "object", "properties": { ... } }
}
Tool use and results: Expressed as content blocks within messages:
- Tool use:
{"type": "tool_use", "id": "...", "name": "...", "input": {...}} - Tool result:
{"type": "tool_result", "tool_use_id": "...", "content": "..."}
Streaming: Uses Server-Sent Events with named event types:
| Event | Description |
|---|---|
message_start | Message initialization |
content_block_start | Begin a text or tool_use block |
content_block_delta | Incremental text (text_delta) or tool input (input_json_delta) |
content_block_stop | End of a content block |
message_delta | Final metadata including stop_reason |
message_stop | Stream complete |
ping | Keep-alive |
Tools Overview
Tools are the actions that the agent can perform on your behalf. The LLM decides which tools to call based on your instructions.
Available Tools
| Tool | Permission | Description |
|---|---|---|
read_file | Read | Read file contents |
edit_file | Write | Make string replacements in a file |
write_file | Write | Create or overwrite a file |
find_files | Read | Find files by glob pattern |
search_contents | Read | Search file contents with regex |
fetch_url | Read | Fetch a web page as markdown |
web_search | Read | Search the web |
execute_command | Read/Write | Run a shell command |
todo_write | Read | Manage a structured task list |
spawn_agent | Read | Delegate tasks to a sub-agent |
scratchpad_write | Read | Store content in the scratchpad |
scratchpad_read | Read | Read a scratchpad entry |
scratchpad_edit | Read | Edit a scratchpad entry |
scratchpad_list | Read | List scratchpad entries |
scratchpad_delete | Read | Delete a scratchpad entry |
skill | Read | Load a named skill’s instructions |
render_image | Read | View an image from in-memory base64 or scratchpad |
Permission Requirements
Tools are grouped by the minimum permission level required:
Read permission (available in read, ask, and write modes):
read_file,find_files,search_contents,fetch_url,web_searchexecute_command(sandboxed, filesystem write-protected)todo_write,spawn_agent,skill,render_image- All scratchpad tools
Write permission (only available in write mode):
edit_file,write_file,execute_command(unsandboxed)
In ask mode, all tools are available but each call requires user confirmation.
In none mode, no tools are available. The agent can only respond with text.
Filtering Built-in Tools
Any built-in can be allow-listed, blocked, or have its required permission overridden via the [tools] table in config.toml. See [tools] — built-in tool filters. Run agsh tools list to see every built-in with its effective permission and current status.
MCP Tools
When MCP servers are configured, their tools are registered under a namespaced name of the form <server>__<tool> (e.g. notion__notion-search). They appear in the system prompt catalogue alongside the built-ins — with their resolved permission level annotated inline — and are called the same way.
agsh also exposes seven built-in MCP meta-tools for browsing server-side resources and prompts. All are deferred by default (loaded on first use):
| Tool | Permission | Description |
|---|---|---|
list_mcp_resources | Read | List resources a server exposes |
read_mcp_resource | Read | Read a server resource by URI |
list_mcp_prompts | Read | List server-defined prompts |
get_mcp_prompt | Read | Render a server prompt with arguments |
subscribe_mcp_resource | Read | Receive change notifications for a resource |
unsubscribe_mcp_resource | Read | Stop receiving change notifications |
list_mcp_resource_updates | Read | Inspect pending resource-change notifications |
Scratchpad Parameter
All tools support an optional scratchpad string parameter. When provided, the tool’s output is saved to the scratchpad under that name instead of being returned inline. This lets the agent store large outputs for later processing without consuming conversation context.
execute_command({"command": "pdftotext doc.pdf -", "scratchpad": "pdf_text"})
How Tool Calls Work
- The agent receives your instruction and decides which tools to call
- For each tool call, agsh checks the current permission level
- In ask mode, you are prompted to approve or deny each tool call
- If permitted, the tool executes and its output is fed back to the agent
- The agent may make additional tool calls or respond with text
- This loop continues until the agent has no more tool calls to make
Tool calls and their results are displayed in the terminal so you can see what the agent is doing.
todo_write
A built-in tool for managing a structured task list during a session. The agent uses this to track multi-step work and communicate progress. The task list is displayed in the terminal and injected into the conversation context each turn.
spawn_agent
Spawns a read-only sub-agent to perform research or analysis tasks. The sub-agent has access to the same tools (except spawn_agent and todo_write) and returns a report. This is useful for delegating exploration without polluting the main conversation context.
skill
Loads a named skill’s instructions. Skills are user-defined knowledge packages stored in ~/.config/agsh/skills/<name>/SKILL.md. The system prompt lists available skills with their description and when-to-use hint; the agent calls skill({"name": "<skill-name>"}) to load the full body. See Skills for how to author skills.
render_image
Displays an image the agent has in memory — as base64 bytes or in a scratchpad entry — as a multimodal content block. Complements fetch_url (network) and read_file (local file) by covering the third case: image data produced on the fly by a command pipeline.
Typical workflow:
execute_command({"command": "ffmpeg -i input.mp4 -vframes 1 -f image2pipe pipe: | base64 -w0", "scratchpad": "frame"})
render_image({"from_scratchpad": "frame"})
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
from_scratchpad | string | one of two | Name of a scratchpad entry containing base64-encoded image bytes |
base64 | string | one of two | Base64-encoded image bytes, passed inline |
Exactly one of from_scratchpad or base64 must be provided. Prefer from_scratchpad for large images — inline base64 inflates tool-call JSON.
The bytes must decode to a supported raster image. PNG, JPEG, GIF, WebP, and BMP pass through unchanged; TIFF, ICO, HDR, EXR, TGA, PNM, QOI, DDS, and Farbfeld are auto-converted to PNG. Size cap is ~3.75 MB on the final payload.
Only call render_image when the current model supports vision input.
Redirecting output to the scratchpad
Several tools — execute_command, find_files, search_contents, fetch_url, spawn_agent — accept an optional scratchpad parameter that redirects their output to a named scratchpad entry instead of returning it inline. When this parameter is set, the tool produces its full, untruncated output: internal result-count caps (find_files 200, search_contents 100) and length caps (fetch_url max_length) are lifted for the scratchpad-bound result.
File Operations
read_file
Read the contents of a file at a given path. Supports text files and images.
Permission: Read
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
path | string | yes | The file path to read |
offset | integer | no | Line number to start reading from (0-based) |
limit | integer | no | Maximum number of lines to read |
scratchpad | string | no | Save output to the scratchpad under this name |
Behavior
- When
offsetandlimitare both omitted, defaults to the first 2000 lines. If the file has more, a truncation notice is appended. - Use
offset/limitto page through large files.
Image files
Recognized image extensions are returned as base64-encoded multimodal content:
- Provider-native (pass-through):
.png,.jpg/.jpeg,.gif,.webp,.bmp - Convertible (decoded and re-encoded as PNG transparently):
.tif/.tiff,.ico,.hdr,.exr,.tga,.pbm/.pgm/.ppm/.pnm,.qoi,.dds,.ff/.farbfeld - Unsupported (fall through to text read, which will fail on binary):
.svg,.jxl,.heic,.avif
Images are rejected if the final payload exceeds 3.75 MB (~5 MB base64). Conversion can enlarge an image, so a small TIFF may produce a too-large PNG.
Only read image files when the current model supports vision input — text-only models will either error or silently drop the image block.
Examples
Read an entire file:
agsh [r] > show me the contents of src/main.rs
Read lines 10-20:
agsh [r] > show me lines 10 through 20 of src/main.rs
edit_file
Make a string replacement in a file. The file must have been read with read_file first (unless force is set).
Permission: Write
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
path | string | yes | The file path to edit |
old_string | string | yes | The exact string to find and replace |
new_string | string | yes | The replacement string |
replace_all | boolean | no | Replace all occurrences (default: false) |
force | boolean | no | Bypass read-before-edit requirement (default: false) |
scratchpad | string | no | Save output to the scratchpad under this name |
Behavior
- By default, only the first occurrence of
old_stringis replaced. Setreplace_allto replace every occurrence. - The file must have been previously read with
read_fileon the same path. This prevents blind edits. Setforceto bypass this requirement. - If
old_stringis not found, the tool returns an error (without modifying the file).
write_file
Create or overwrite a file with the given content.
Permission: Write
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
path | string | yes | The file path to write |
content | string | yes | The content to write to the file |
scratchpad | string | no | Save output to the scratchpad under this name |
Behavior
- Creates parent directories if they do not exist.
- Overwrites the file if it already exists.
Search Tools
find_files
Find files matching a glob pattern.
Permission: Read
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
pattern | string | yes | Glob pattern to match files against |
path | string | no | Directory to search in (defaults to current directory) |
scratchpad | string | no | Save output to the scratchpad under this name |
Behavior
- Results are limited to 200 matches.
- Returns one file path per line.
Glob Patterns
| Pattern | Matches |
|---|---|
*.rs | All .rs files in the current directory |
**/*.rs | All .rs files recursively |
src/*.txt | All .txt files in src/ |
test_* | All files starting with test_ |
search_contents
Search file contents using a regex pattern. Powered by the ripgrep library.
Permission: Read
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
pattern | string | yes | Regex pattern to search for |
path | string | no | File or directory to search in (defaults to current directory) |
glob | string | no | Glob pattern to filter which files are searched (e.g., *.rs) |
scratchpad | string | no | Save output to the scratchpad under this name |
Behavior
- Searches recursively through directories.
- Skips hidden files (starting with
.) and common non-text directories (target,node_modules). - Results are limited to 100 matches.
- Each result includes the file path, line number, and matching line.
Web Tools
fetch_url
Fetch a web page and return its content as markdown text.
Permission: Read
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
url | string | yes | The URL to fetch |
max_length | integer | no | Maximum characters to return (default: 30000, 0 for no limit) |
headers | object | no | Custom HTTP headers (overrides defaults like User-Agent) |
regex | string | no | If provided, return only matching content (matches joined by newlines) |
raw | boolean | no | Return raw HTML instead of converting to markdown (default: false) |
scratchpad | string | no | Save output to the scratchpad under this name |
Behavior
- Fetches the page via HTTP GET.
- Converts HTML to Markdown using
fast_html2md(unlessrawis true). - Truncates the output to
max_lengthcharacters (default: 30,000). - HTTP timeout: 30 seconds.
- Returns the HTTP status code as an error if the request fails (e.g., 404, 500).
Image URLs
If the response Content-Type is a supported raster image format, fetch_url returns a multimodal Image content block instead of markdown. No disk is touched — bytes are base64-encoded in memory.
Provider-native formats (passed through unchanged):
image/png,image/jpeg(andimage/jpg),image/gif,image/webp,image/bmp(andimage/x-ms-bmp)
Convertible formats (decoded and re-encoded as PNG transparently):
image/tiff,image/vnd.microsoft.icon/image/x-icon,image/vnd.radiance(HDR),image/x-exr,image/x-targa,image/x-portable-*(PNM),image/qoi,image/vnd.ms-dds,image/x-farbfeld
Unsupported formats (fall through to the text branch): image/svg+xml, image/jxl, image/heic, image/avif.
- The
max_length,regex, andrawoptions do not apply to image responses. - Size cap of ~3.75 MB applies to the output bytes (after conversion). Conversion can enlarge an image, so a 1 MB TIFF may produce a larger PNG.
- Detection uses the response’s actual
Content-Typeheader, so redirect chains and extension-less URLs are handled correctly.
Only fetch image URLs when the current model supports vision input — text-only models will either error or silently drop the image block.
web_search
Search DuckDuckGo and return the top results.
Permission: Read
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
query | string | yes | The search query |
headers | object | no | Custom HTTP headers (overrides defaults like User-Agent) |
scratchpad | string | no | Save output to the scratchpad under this name |
Behavior
- Returns up to 10 results per search.
- Each result includes the title, source domain, URL, and a snippet with matched terms emphasised in bold.
- Snippets are capped at 300 characters; use
fetch_urlon the result URL for the full page. - Uses HTML scraping (no API key required).
- HTTP timeout: 30 seconds.
CAPTCHA detection
DuckDuckGo occasionally serves a bot-challenge page instead of results (detected by the anomaly-modal element). web_search returns a distinct error so the agent doesn’t silently retry:
DuckDuckGo served a CAPTCHA challenge (bot detection / rate limit).
Retry later.
If this happens often in your environment, configure a search-capable MCP server — see the MCP configuration examples for patterns that work well.
Shell Tool
execute_command
Execute a shell command and return its output.
Permission: Read (sandboxed) / Write (unsandboxed)
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
command | string | yes | The shell command to execute |
timeout_ms | integer | no | Timeout in milliseconds (default: 30000) |
scratchpad | string | no | Save output to the scratchpad under this name |
Behavior
- Executes the command via
sh -c "<command>"on Unix, orpowershell.exe -NoProfile -NonInteractive -Command "<command>"on Windows (same shell in both sandboxed and unsandboxed mode). - Captures both stdout and stderr.
- Returns the exit code along with the output if non-zero.
- Oversized output is losslessly persisted to the scratchpad by the agent layer — the tool itself never truncates.
- Default timeout is 30 seconds. If the command exceeds the timeout, it is killed (on Unix, via the process group so backgrounded grandchildren are caught too).
- Supports cancellation: pressing Ctrl+C while a command is running kills the child process.
Shell-specific semantics
- Unix (
sh -c): POSIX$VARexpansion applies. Pass a literal$with single quotes ('$foo') or backslash escape (\$foo). - Windows (
powershell.exe -Command): The script body reaches PowerShell directly. Use PowerShell syntax ($var = ...,$env:PATH) — and crucially, do not wrap your command in anotherpowershell -Command "...". The outer PowerShell will expand your inner$varreferences to empty strings before the inner shell runs, producing a parser error on mangled syntax. If you need to invoke a nested script, drop it into a.ps1file and run it by path, use-EncodedCommand <base64>, or escape each$as`$.
Read-Only Sandbox
In read mode, commands run inside a filesystem sandbox that blocks writes to the user’s real data. Reads and program execution still work normally:
- Linux: Uses Landlock LSM (kernel 5.13+). The child process is restricted via
landlock_restrict_selfbefore exec. OnlyREAD_FILE,READ_DIR, andEXECUTEaccess rights are granted — writes anywhere on the filesystem returnEACCES. - macOS: Uses
sandbox-execwith a SBPL profile that denies allfile-write*operations. - Windows: Spawns the child with a duplicated primary token dropped to Low integrity (
SECURITY_MANDATORY_LOW_RID) viaSetTokenInformation(TokenIntegrityLevel, …). Writes to the home directory,%APPDATA%, Program Files, and system directories — any location with Medium-or-higher integrity ACLs — are blocked by the kernel. Unlike Landlock, Low integrity is not a total write-denial: the child can still write to the small residual Low-integrity-writable surface (%LOCALAPPDATA%\Low,%TEMP%\Low, any path with an explicit Low-integrity write ACE) and to files it creates itself (which inherit Low integrity). For practical purposes this matches the guarantees ofsandbox-execand prevents the agent from touching user data, but full “zero writes anywhere” on Windows would require Windows Sandbox or an AppContainer and is out of scope. - Unsupported platforms: Shell commands are not available in read mode — switch to write mode to execute commands without a sandbox.
In write mode, commands run without any sandbox restrictions.
To disable sandboxed shell execution in read mode, set sandbox = false under [shell] in the config file. When disabled, shell commands require write mode.
[shell]
sandbox = false
Scratchpad
The scratchpad is a session-scoped working memory that the agent can use to store, retrieve, edit, and manage content without consuming conversation context. Entries are identified by string names and persist across turns within a session.
When the Scratchpad is Used
- Proactively: The agent stores intermediate results (extracted text, API responses, research notes) for later use.
- Via
scratchpadparameter: Any tool can save its output directly to the scratchpad by including ascratchpadparameter in the tool call. - Automatically: When a tool’s output exceeds 30,000 characters, it is saved to the scratchpad under an auto-generated name (e.g.,
execute_command_1) and replaced with a preview in the conversation.
Tools
scratchpad_write
Store content in the scratchpad. If the name already exists, the content is overwritten.
Permission: Read
| Name | Type | Required | Description |
|---|---|---|---|
name | string | yes | Name for the entry |
content | string | yes | The content to store |
scratchpad_read
Read or search a scratchpad entry by name.
Permission: Read
| Name | Type | Required | Description |
|---|---|---|---|
name | string | yes | The entry name |
offset | integer | no | Character offset to start reading from (default: 0) |
limit | integer | no | Maximum characters to return (default: 30000) |
regex | string | no | Search the entry and return matching lines (max 100) |
scratchpad_edit
Edit a scratchpad entry in place. Provide content for a full overwrite, or old_string/new_string for targeted replacement.
Permission: Read
| Name | Type | Required | Description |
|---|---|---|---|
name | string | yes | The entry name |
content | string | no | Full replacement (mutually exclusive with old/new) |
old_string | string | no | String to find |
new_string | string | no | Replacement string |
replace_all | boolean | no | Replace all occurrences (default: false) |
scratchpad_list
List all scratchpad entries with their name, size, and creation time. No parameters.
Permission: Read
scratchpad_delete
Delete a scratchpad entry by name.
Permission: Read
| Name | Type | Required | Description |
|---|---|---|---|
name | string | yes | The entry name to delete |
Lifecycle
- Entries are scoped to the session and persist across turns.
- Entries survive session compaction (
/compact). - Entries are deleted when the session is deleted.
- Two sessions can have entries with the same name without conflict.
- Writing to an existing name overwrites it silently.