Fast, stateless LLM-powered assistant for your shell: qq answers; qa runs commands
qqqa is a two-in-one, stateless CLI tool that brings LLM assistance to the command line without ceremony.
The two binaries are:
- `qq` - ask a single question, e.g. `qq how can I recursively list all files in this directory` (qq stands for "quick question")
- `qa` - a single-step agent that can optionally use tools to finish a task: read a file, write a file, or execute a command with confirmation (qa stands for "quick agent")
qqqa runs on macOS, Linux, and Windows.
Out of the box the repo includes profiles for OpenRouter (the default), OpenAI, Groq, a local Ollama runtime, the Codex CLI (piggybacks on ChatGPT), and the Claude Code CLI (reuses your Claude subscription). An Anthropic profile stub exists in the config for future work but is not wired up yet.
Demo video: demo.mp4
qq means quick question. qa means quick agent. Both are easy to type rapidly on QWERTY keyboards with minimal finger movement. That makes interacting with LLMs faster and more natural during real work.
qqqa is deliberately stateless. There is no long-running session and no hidden conversation memory stored by the tool, so every run is largely independent and reproducible. For lightweight continuity you can set `"include_history": true` in config.json (or opt into history during the `qq --init` flow).
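A minimal fragment, assuming the flag sits at the top level of `~/.qq/config.json` (the `qq --init` flow writes it in the right place for you):

```json
{
  "include_history": true
}
```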
Why stateless is great:
- Simple and focused - Unix philosophy applied to LLM tools.
- Shell friendly - compose with pipes and files instead of interactive chats.
- Safe by default - qq is read-only and has no tool access; qa is built with security in mind and requires confirmation before running tools.
The tools may include transient context you choose to provide:
- `qq` can include the last few terminal commands as hints, plus piped stdin if present.
- `qa` can read files or run a specific command, but only once per invocation and with safety checks.
OpenRouter mirrors the OpenAI Chat Completions API, adds generous community-hosted models, and keeps openai/gpt-4.1-nano fast and inexpensive. qqqa talks to https://openrouter.ai/api/v1 out of the box and reads the API key from OPENROUTER_API_KEY, so your first run works as soon as you drop in a key.
If you need even more throughput, the bundled groq profile that targets openai/gpt-oss-20b and openai/gpt-oss-120b remains available, and you can still add any OpenAI-compatible provider by editing ~/.qq/config.json or creating a new profile.
Already paying for ChatGPT? Select the codex profile (during qq --init, via qq --profile codex, or by editing ~/.qq/config.json) and qqqa will shell out to the Codex CLI instead of hitting an HTTP endpoint. That lets you reuse an existing ChatGPT subscription with practically zero marginal cost.
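For instance, to route a single question through Codex without changing your default profile (the question itself is illustrative):

```sh
qq --profile codex "summarize the staged changes"
```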
What to know:
- Install the Codex CLI via the ChatGPT desktop app (Settings → Labs → Codex) or `pip install codex-cli`, then ensure `codex` is on your `PATH`.
- Streaming is unavailable; even without `--no-stream`, qqqa buffers the Codex response and prints it once.
- `qa` still expects JSON tool calls. When you need `read_file`, `write_file`, or `execute_command`, respond with `{ "tool": string, "arguments": object }` the same way you would on OpenRouter.
- If the binary is missing or exits with an error, qqqa surfaces the stderr/stdout so you can fix your environment quickly.
Example ~/.qq/config.json fragment that pins Codex as the default profile:
{
"default_profile": "codex",
"profiles": {
"codex": {
"model_provider": "codex",
"model": "gpt-5",
"reasoning_effort": "minimal"
}
}
}

Have a Claude subscription? Select the claude_cli profile and qqqa will use the `claude` binary. That keeps usage effectively free if you already pay for a Claude subscription.
What to know:
- Install Claude Code so the `claude` binary is on your `PATH`, then run `claude login` once.
- Claude Code streams responses the same way API-based LLMs do.
- Need to pin a different Claude desktop model? Add `"model_override": "claude-haiku-4-5"` under `model_providers.claude_cli.cli` in `~/.qq/config.json`. That override only applies to the Claude CLI; `qq -m/--model` still takes precedence per run.
Minimal config snippet:
{
"default_profile": "claude_cli",
"profiles": {
"claude_cli": {
"model_provider": "claude_cli",
"model": "claude-haiku-4-5"
}
},
"model_providers": {
"claude_cli": {
"cli": {
"model_override": "claude-haiku-4-5"
}
}
}
}

- OpenAI-compatible API client with streaming and non-streaming calls.
- Stateless, single shot workflow that plays well with pipes and scripts.
- Rich but simple formatting using XML like tags rendered to ANSI colors.
- Config driven providers and profiles with per profile model overrides.
- Safety rails for file access and command execution.
- Old-school and SERIOUS? Optional no-emoji mode persisted via `--no-fun` 🥸
brew install qqqa

Download a prebuilt archive from the GitHub Releases page, extract it, and place qq/qa somewhere on your PATH (e.g., /usr/local/bin).
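A sketch of the manual install on macOS/Linux, assuming the binaries extract to the current directory (pick the archive that matches your OS and CPU):

```sh
tar -xzf qqqa-*.tar.gz
sudo mv qq qa /usr/local/bin/
```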
Download the Windows archive from Releases (choose the architecture that matches your machine), extract qq.exe and qa.exe, and add them to your %PATH%.
On first run qqqa creates ~/.qq/config.json with safe permissions. For a smooth first interaction, run the init flow:
# Interactive setup (choose provider and set key)
qq --init
# or
qa --init

If ~/.qq/config.json already exists, the init command keeps it untouched and explains how to rerun after moving or deleting the file.
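If you do want a fresh setup, move the existing file aside and rerun init (a sketch):

```sh
mv ~/.qq/config.json ~/.qq/config.json.bak
qq --init
```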
The initializer lets you choose the default provider:
- OpenRouter + `openai/gpt-4.1-nano` (default, fast and inexpensive)
- Groq + `openai/gpt-oss-20b` (faster, cheap paid tier)
- OpenAI + `gpt-5-mini` (slower, a bit smarter)
- Anthropic + `claude-3-5-sonnet-20241022` (placeholder until the Anthropic integration lands)
- Ollama (runs locally, adjust port if needed)
- Codex CLI + `gpt-5` (wraps the `codex exec` binary so you can reuse a ChatGPT subscription; no API key needed, buffered output only)
- Claude Code CLI + `claude-haiku-4-5` (wraps the `claude` binary; `qq` streams live, `qa` buffers so it can parse tool calls)
- Need to force a different desktop model? Add `"model_override"` under the provider's `cli` block (supported for both Codex and Claude). That override wins over the profile default but still yields to the per-run `--model` flag.
It also offers to store an API key in the config (optional). If you prefer environment variables, leave it blank and set one of:
- `OPENROUTER_API_KEY` for OpenRouter (default)
- `GROQ_API_KEY` for Groq
- `OPENAI_API_KEY` for OpenAI
- `OLLAMA_API_KEY` (optional; any non-empty string works, even `local`, because the Authorization header cannot be blank)
- No API key is required for the Codex or Claude CLI profiles; their binaries handle auth (`codex login` / `claude login`).
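For example, in your shell profile (values are placeholders):

```sh
export OPENROUTER_API_KEY="..."   # OpenRouter (default)
export GROQ_API_KEY="..."         # Groq
export OPENAI_API_KEY="..."       # OpenAI
export OLLAMA_API_KEY="local"     # Ollama: any non-empty value works
```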
Defaults written to ~/.qq/config.json:
- Providers
  - `openrouter` → base `https://openrouter.ai/api/v1`, env `OPENROUTER_API_KEY`, default headers `HTTP-Referer=https://github.com/iagooar/qqqa` and `X-Title=qqqa`
  - `openai` → base `https://api.openai.com/v1`, env `OPENAI_API_KEY`
  - `groq` → base `https://api.groq.com/openai/v1`, env `GROQ_API_KEY`
  - `ollama` → base `http://127.0.0.1:11434/v1`, env `OLLAMA_API_KEY` (qqqa auto-injects a non-empty placeholder if you leave it unset)
  - `anthropic` → base `https://api.anthropic.com/v1`, env `ANTHROPIC_API_KEY` (present in the config schema for future support; not usable yet)
  - `codex` → mode `cli`, binary `codex` with base args `exec` (install the Codex CLI; auth handled by `codex login`; fails if the binary is missing). Optional `"model_override"` in the `cli` block forces a fallback ChatGPT model if OpenAI retires the default.
  - `claude_cli` → mode `cli`, binary `claude` (install `@anthropic-ai/claude-code`; auth handled by `claude login`). Optional `"model_override"` pins Claude Code's `--model` flag without touching your profile's model.
- Profiles
  - `openrouter` → model `openai/gpt-4.1-nano` (default)
  - `openai` → model `gpt-5-mini`
  - `groq` → model `openai/gpt-oss-20b`
  - `ollama` → model `llama3.1`
  - `anthropic` → model `claude-3-5-sonnet-20241022` (inactive placeholder until the Anthropic integration lands)
  - `codex` → model label `gpt-5` (only used for display; the Codex CLI picks the backing ChatGPT model)
- Optional per-profile `reasoning_effort` for GPT-5 family models. If you leave it unset, qqqa sends `"reasoning_effort": "minimal"` for any `gpt-5*` model to keep responses fast. Set it to `"low"`, `"medium"`, or `"high"` when you want deeper reasoning.
- (discouraged) Optional per-profile `temperature`. Most models default to `0.15` unless you set it in `~/.qq/config.json` or pass `--temperature <value>` for a single run. GPT-5 models ignore custom temperatures; qqqa forces them to `1.0`.
- (discouraged) You can change the timeout, e.g. `"timeout": "240"` under a model profile in `~/.qq/config.json`, to raise the per-request limit (qq + qa default to 180 s, which is SLOW; faster models are a better fix). See the sketch just below for where these keys sit.
Example override in ~/.qq/config.json:
{
"profiles": {
"openai": {
"model_provider": "openai",
"model": "gpt-5-mini",
"reasoning_effort": "medium"
}
}
}

- Optional flag: `no_emoji` (unset by default). Set via `qq --no-fun` or `qa --no-fun`.
- Optional auto-copy: `copy_first_command` (unset/false by default). Enable during `qq --init`, by running `qq --enable-auto-copy`, or by editing `~/.qq/config.json` so qq copies the first `<cmd>` block to your clipboard. Turn it off with `qq --disable-auto-copy`. Override per run with `--copy-command`/`--cc` or `--no-copy-command`/`--ncc` (also available as `-ncc`).
- Per-run control: `--no-stream` forces qq to wait for the full response before printing; streaming is the default.
Terminal history is off by default. During qq --init / qa --init you can opt in to sending the last 10 qq/qa commands along with each request. You can still override per run with --history (force on) or -n/--no-history (force off). Only commands whose first token is qq or qa are ever shared.
qq streams responses by default so you see tokens the moment they arrive. If you prefer the classic buffered output—for example when piping into another tool or copying the final answer as a whole—pass --no-stream to wait until the response completes before printing anything.
# simplest
qq "convert mp4 to mp3"
# stream tokens by default (formatted output)
qq "how do I kill a process by name on macOS"
# disable streaming and wait for the full formatted response
qq --no-stream "summarize today's git status"
# bump temperature for non GPT-5 models on a single run
qq --temperature 0.4 "draft a playful git commit message"
# include piped context
git status | qq "summarize what I should do next"
# pipe extra context and keep CLI question
printf '%s\n' "This is a sample context. My code is 4242" | qq "What is my code"
# pipe the question itself
printf '%s\n' "Show me the full contents of this directory" | qq
# raw text (no ANSI formatting)
qq -r "explain sed vs awk"
# include terminal history for this run
qq --history "find large files in the last day"
# disable emojis in responses (persists)
qq --no-fun "summarize this"
# auto-copy the first <cmd> block for fast pasting (alias: --cc)
qq --copy-command "list docker images"
# temporarily disable auto-copy even if enabled in config (alias: --ncc / -ncc)
qq --no-copy-command "print working directory"
# enable auto-copy for all future qq runs
qq --enable-auto-copy
# disable auto-copy persistently
qq --disable-auto-copy

Note: you can also run qq without quotes, which usually behaves the same as the quoted form.
# simplest
qq convert mp4 to mp3

You want to extract audio from a YouTube video but you do not remember the exact flags.
Ask with qq:
qq "how do I use ffmpeg to extract audio from a YouTube video into mp3"A typical answer will suggest installing the tools and then using yt-dlp to fetch audio and ffmpeg to convert it:
# macOS
brew install yt-dlp ffmpeg
# Debian or Ubuntu
sudo apt-get update && sudo apt-get install -y yt-dlp ffmpeg
# Download and extract audio to MP3 using ffmpeg under the hood
yt-dlp -x --audio-format mp3 "https://www.youtube.com/watch?v=VIDEO_ID"

Do it for me with qa:
qa "download audio as mp3 from https://www.youtube.com/watch?v=VIDEO_ID"The agent will propose a safe command like yt-dlp -x --audio-format mp3 URL, show it for confirmation, then run it. You can pass -y to auto approve.
qa can either answer in plain text or request one tool call in JSON. Supported tools:
- `read_file` with `{ "path": string }`
- `write_file` with `{ "path": string, "content": string }`
- `execute_command` with `{ "command": string, "cwd?": string }`
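On the wire that means the model answers with a single JSON object in the `{ "tool": ..., "arguments": ... }` shape, for example (the command itself is illustrative):

```json
{ "tool": "execute_command", "arguments": { "command": "ls -la src" } }
```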
Examples:
# read a file the safe way
qa "read src/bin/qq.rs and tell me what main does"
# write a file
qa "create a README snippet at notes/intro.md with a short summary"
# run a command with confirmation
qa "list Rust files under src sorted by size"
# pipe the task itself
printf '%s\n' "Show me the full contents of this directory" | qa
# auto approve tool execution for non interactive scripts
qa -y "count lines across *.rs"
# include recent qq/qa commands just for this run
qa --history "trace which git commands I ran recently"
# raise temperature for this run (non GPT-5 models only)
qa --temperature 0.3 "brainstorm fun git aliases"
# disable emojis in responses (persists)
qa --no-fun "format and lint the repo"
# run qa non-interactively with confirmation already granted
qa -y "count lines across *.rs"When qa runs a command while stdout is a terminal, output streams live; the structured [tool:execute_command] summary still prints afterward for easy copying.
execute_command prints the proposed command and asks for confirmation. It warns if the working directory is outside your home. Use -y to auto approve in trusted workflows.
The runner enforces a default allowlist (think ls, grep, find, rg, awk, etc.) and rejects pipelines, redirection, and other high-risk constructs. When a command is blocked, qa prompts you to add it to command_allowlist inside ~/.qq/config.json; approving once persists the choice and updates future runs. On Windows it automatically adapts to the active environment so built-ins like dir or Get-ChildItem keep working without extra flags.
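You can also pre-seed the allowlist by hand. A sketch, assuming `command_allowlist` is a flat array of program names at the top level of `~/.qq/config.json` (the entries shown are illustrative; check your generated config for the exact placement):

```json
{
  "command_allowlist": ["jq", "kubectl", "terraform"]
}
```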
Some OpenAI-compatible gateways (LiteLLM, local corporate proxies, etc.) terminate TLS with a self-signed CA. Add a per-provider tls block so qqqa trusts that CA in addition to the default Rustls bundle:
{
"model_providers": {
"litellm": {
"name": "LiteLLM",
"base_url": "https://proxy.local/v1",
"env_key": "LITELLM_API_KEY",
"tls": {
"ca_bundle_path": "certs/litellm-ca.pem",
"ca_bundle_env": "SSL_CERTFILE_PATH"
}
}
}
}

- `ca_bundle_path` accepts a PEM or DER file. Relative paths are resolved against `~/.qq/` so you can keep certificates next to the config.
- `ca_bundle_env` is optional; if set, qqqa reads that environment variable for the bundle path and falls back to `ca_bundle_path` when it is unset. This mirrors proxies that expose `SSL_CERTFILE_PATH` or similar knobs.
- Multiple certificates can live in the same file (concatenate PEM entries). qqqa appends them to the existing Rustls trust store, so standard public CAs continue to work.
With this configuration any provider—LiteLLM, Ollama over HTTPS, your company gateway, or another proxy—can authenticate with its custom CA without disabling TLS verification.
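A typical setup keeps the CA bundle next to the config and, optionally, points the environment variable at it (file names illustrative):

```sh
mkdir -p ~/.qq/certs
# multiple PEM certificates can be concatenated into one bundle
cat corp-root.pem corp-intermediate.pem > ~/.qq/certs/litellm-ca.pem
# only needed if you use ca_bundle_env
export SSL_CERTFILE_PATH="$HOME/.qq/certs/litellm-ca.pem"
```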
Pick the built-in ollama profile (or create your own) to talk to a local runtime. Override the API base when you expose the service on a different host/port:
qq --profile ollama --api-base http://127.0.0.1:11435/v1 "summarize build failures"
qa --profile ollama --api-base http://192.168.1.50:9000/v1 "apply the diff" -y

`qa --init` offers Ollama as an option and skips the API key warning; qqqa still sends a placeholder bearer token so OpenAI-compatible middleware keeps working. If you bypass the init flow and edit config.json manually, set either `"api_key": "local"` under the ollama provider or export `OLLAMA_API_KEY=local` so the Authorization header remains non-empty.
Example local setup: LM Studio on macOS driving `ollama run meta-llama-3.1-8b-instruct-hf` (Q4_K_M) on a MacBook Air M4/32 GB works fine, just slower than the hosted OpenRouter/Groq profiles. Adjust the model tag in your `ollama` profile accordingly.
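Assuming a stock Ollama install, pulling the model named in the default `ollama` profile and asking through it looks like this:

```sh
ollama pull llama3.1
qq --profile ollama "explain this stack trace"
```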
You can still override at runtime:
# choose profile
qq -p groq "what is ripgrep"
# override model for a single call
qq -m openai/gpt-oss-20b "explain this awk one-liner"

- File tools require paths to be inside your home or the current directory. Reads are capped to 1 MiB, and traversal/symlink escapes are blocked.
- Command execution uses a default allowlist (e.g. `ls`, `grep`, `rg`, `find`) plus your custom `command_allowlist` entries. Destructive patterns (`rm -rf /`, `sudo`, `mkfs`, etc.) are always blocked, and pipelines/redirection/newlines prompt for confirmation even with `--yes`.
- Commands run with a 120 s timeout and the agent performs at most one tool step; there is no loop.
- Config files are created with safe permissions. API keys come from environment variables unless you explicitly add a key to the config.
- `OPENROUTER_API_KEY` for the OpenRouter provider (default)
- `GROQ_API_KEY` for the Groq provider
- `OPENAI_API_KEY` for the OpenAI provider
Project layout:
- `src/bin/qq.rs` and `src/bin/qa.rs` entry points
- Core modules in `src/`: `ai.rs`, `config.rs`, `prompt.rs`, `history.rs`, `perms.rs`, `formatting.rs`
- Tools in `src/tools/`: `read_file.rs`, `write_file.rs`, `execute_command.rs`
- Integration tests in `tests/`
See CONTRIBUTING.md for guidelines on reporting issues and opening pull requests, building from source, and the release process.
- API error about missing key: run `qq --init` to set things up, or export the relevant env var, e.g. `export OPENROUTER_API_KEY=...`.
- No output while streaming: try `-d` to see debug logs, or rerun with `--no-stream` to fall back to buffered output (it can work better in some edge cases).
- Piped input not detected: ensure you are piping into `qq` and not running it in a subshell that swallows stdin.
Licensed under MIT.