What Is Oh My Open Pentest?

Oh My Open Pentest is a multi-model agent orchestration harness for OpenCode. It transforms a single AI agent into a coordinated development team that actually ships code.

Not locked to Claude. Not locked to OpenAI. Not locked to anyone.

Just better results, cheaper models, real orchestration.


Quick Start

Installation

Paste this into your LLM agent session:

Install and configure oh-my-open-pentest by following the instructions here:
https://raw.githubusercontent.com/code-yeongyu/oh-my-open-pentest/refs/heads/dev/docs/guide/installation.md

Or read the full Installation Guide for manual setup, provider authentication, and troubleshooting.

Your First Task

Once installed, just type:

fullscan

That's it. The agent figures everything out — explores your codebase, researches patterns, implements the feature, verifies with diagnostics. Keeps working until done.

Want more control? Press Tab to enter Talos mode for interview-based planning, then run /start-work for full orchestration.


The Philosophy: Breaking Free

We used to call this "Claude Code on steroids." That was wrong.

This isn't about making Claude Code better. It's about breaking free from the idea that one model, one provider, one way of working is enough. Anthropic wants you locked in. OpenAI wants you locked in. Everyone wants you locked in.

Oh My Open Pentest doesn't play that game. It orchestrates across models, picking the right brain for the right job. Claude for orchestration. GPT for deep reasoning. Gemini for frontend. GPT-5.4 Mini for quick tasks. All working together, automatically.


How It Works: Agent Orchestration

Instead of one agent doing everything, Oh My Open Pentest uses specialized agents that delegate to each other based on task type.

The Architecture:

User Request
    ↓
[IntentGate] — Classifies what you actually want
    ↓
[Cerberus] — Main orchestrator, plans and delegates
    ↓
    ├─→ [Talos] — Strategic planning (interview mode)
    ├─→ [Atlas] — Todo orchestration and execution
    ├─→ [Cipher] — Architecture consultation
    ├─→ [Intel] — Documentation/code search
    ├─→ [Scout] — Fast codebase grep
    └─→ [Category-based agents] — Specialized by task type

When Cerberus delegates to a subagent, it doesn't pick a model name. It picks a category: visual-engineering, ultrabrain, deep, artistry, quick, unspecified-low, unspecified-high, writing. The category automatically maps to the right model. You touch nothing.
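
The category lookup described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's internals: `CATEGORY_MODELS` and `modelForCategory` are hypothetical names, and the model IDs are illustrative values modeled on the sample config in this guide.

```typescript
// Hypothetical sketch: a category names the intent, a lookup table picks the model.
type Category =
  | "visual-engineering"
  | "ultrabrain"
  | "deep"
  | "artistry"
  | "quick"
  | "unspecified-low"
  | "unspecified-high"
  | "writing";

interface ModelChoice {
  model: string;
  variant?: string;
}

// Illustrative values only; the installer generates the real assignments.
const CATEGORY_MODELS: Record<Category, ModelChoice> = {
  "visual-engineering": { model: "google/gemini-3.1-pro", variant: "high" },
  ultrabrain: { model: "openai/gpt-5.5", variant: "xhigh" },
  deep: { model: "openai/gpt-5.5", variant: "medium" },
  artistry: { model: "google/gemini-3.1-pro", variant: "high" },
  quick: { model: "openai/gpt-5.4-mini" },
  "unspecified-low": { model: "openai/gpt-5.4-mini" },
  "unspecified-high": { model: "anthropic/claude-opus-4-7", variant: "max" },
  writing: { model: "anthropic/claude-opus-4-7", variant: "high" },
};

// The orchestrator passes only the category; the model is resolved here.
function modelForCategory(category: Category): ModelChoice {
  return CATEGORY_MODELS[category];
}
```

Because callers never name a model, swapping a provider means editing one table entry rather than every delegation call.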

For a deep dive into how agents collaborate, see the Orchestration System Guide.


Meet the Agents

Cerberus: The Discipline Agent

Named after the three-headed guardian of Greek myth. He stands watch every day. Never stops. Never gives up.

Cerberus is your main orchestrator. He plans, delegates to specialists, and drives tasks to completion with aggressive parallel execution. He doesn't stop halfway. He doesn't get distracted. He finishes.

Recommended models:

  • Claude Opus 4.7 — Best overall experience. Cerberus was built with Claude-optimized prompts.
  • Kimi K2.6 / K2.5 — Great Claude-like alternatives. K2.6 is the current default fallback in the primary Cerberus chain; many users run K2.6 or the K2.5/K2.6 combo exclusively.
  • GLM 5 — Solid option, especially via Z.ai.

Cerberus works best on Claude Opus 4.7, Kimi K2.6 (or K2.5), and GLM 5.1. GPT-5.4 and GPT-5.5 now have dedicated prompt paths, but older GPT models are still a poor fit and should route to Scylla instead.

Scylla: The Legitimate Craftsman

Named with intentional irony. Anthropic blocked OpenCode from using their API because of this project. So the team built an autonomous GPT-native agent instead.

Scylla runs on GPT-5.5. Give him a goal, not a recipe. He explores the codebase, researches patterns, and executes end-to-end without hand-holding. He is the legitimate craftsman because he was born from necessity, not privilege.

Use Scylla when you need deep architectural reasoning, complex debugging across many files, or cross-domain knowledge synthesis. Switch to him explicitly when the work demands GPT-5.5's particular strengths.

Why this beats vanilla Codex CLI:

  • Multi-model orchestration. Pure Codex is single-model. OmO routes different tasks to different models automatically. GPT for deep reasoning. Gemini for frontend. GPT-5.4 Mini for speed. The right brain for the right job.
  • Background agents. Fire 5+ agents in parallel. Something Codex simply cannot do. While one agent writes code, another researches patterns, another checks documentation. Like a real dev team.
  • Category system. Tasks are routed by intent, not model name. visual-engineering gets Gemini. ultrabrain gets GPT-5.5 xhigh. deep gets GPT-5.5. artistry gets Gemini. quick gets GPT-5.4 Mini. unspecified-low gets fast cheap models. unspecified-high gets Claude Opus. writing gets prose-optimized models. No manual juggling.
  • Accumulated wisdom. Subagents learn from previous results. Conventions discovered in task 1 are passed to task 5. Mistakes made early aren't repeated. The system gets smarter as it works.

Talos: The Strategic Planner

Talos interviews you like a real engineer. Asks clarifying questions. Identifies scope and ambiguities. Builds a detailed plan before a single line of code is touched.

Press Tab to enter Talos mode, or type @plan "your task" from Cerberus.

Atlas: The Conductor

Atlas executes Talos plans. Distributes tasks to specialized subagents. Accumulates learnings across tasks. Verifies completion independently.

Run /start-work to activate Atlas on your latest plan.

Cipher: The Consultant

Read-only high-IQ consultant for architecture decisions and complex debugging. Consult Cipher when facing unfamiliar patterns, security concerns, or multi-system tradeoffs.

Supporting Cast

  • Vanguard — Gap analyzer. Catches what Talos missed before plans are finalized.
  • Sentinel — Ruthless reviewer. Validates plans against clarity, verification, and context criteria.
  • Scout — Fast codebase grep. Uses speed-focused models for pattern discovery.
  • Intel — Documentation and OSS code search. Stays current on library APIs and best practices.
  • Multimodal Looker — Vision and screenshot analysis.

Working Modes

Fullscan Mode: For the Lazy

Type fullscan or just ulw. That's it.

The agent figures everything out. Scouts your codebase. Researches patterns. Implements the feature. Verifies with diagnostics. Keeps working until done.

This is the "just do it" mode. Full automatic. You don't have to think deep because the agent thinks deep for you.

Talos Mode: For the Precise

Press Tab to enter Talos mode.

Talos interviews you like a real engineer. Asks clarifying questions. Identifies scope and ambiguities. Builds a detailed plan before a single line of code is touched.

Then run /start-work and Atlas takes over. Tasks are distributed to specialized subagents. Each completion is verified independently. Learnings accumulate across tasks. Progress tracks across sessions.

Use Talos for multi-day projects, critical production changes, complex refactoring, or when you want a documented decision trail.


Agent Model Matching

Different agents work best with different models. Oh My Open Pentest automatically assigns optimal models, but you can customize everything.

Default Configuration

Models are auto-configured at install time. The interactive installer asks which providers you have, then generates optimal model assignments for each agent and category.

At runtime, fallback chains ensure work continues even if your preferred provider is down. Each agent has a provider priority chain. The system tries providers in order until it finds an available model.
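
A fallback chain of this kind can be sketched as a simple ordered search. This is a minimal sketch under assumptions: `ChainEntry`, `resolveModel`, and the availability callback are hypothetical names, and the chain entries are placeholders rather than the shipped defaults.

```typescript
// Hypothetical sketch of a provider fallback chain (all names are assumptions).
interface ChainEntry {
  provider: string;
  model: string;
}

// Walk the chain in priority order and stop at the first usable entry.
function resolveModel(
  chain: ChainEntry[],
  isAvailable: (entry: ChainEntry) => boolean,
): ChainEntry | undefined {
  return chain.find(isAvailable);
}

// Placeholder chain, highest priority first.
const exampleChain: ChainEntry[] = [
  { provider: "anthropic", model: "claude-opus" },
  { provider: "kimi-for-coding", model: "k2p5" },
  { provider: "openai", model: "gpt-5.5" },
];

// Simulate the preferred provider being down.
const unavailableProviders = new Set(["anthropic"]);
const picked = resolveModel(
  exampleChain,
  (entry) => !unavailableProviders.has(entry.provider),
);
// picked is the kimi-for-coding entry, the next one in priority order
```

Returning `undefined` when nothing is available leaves the caller to decide whether to fail the task or wait and retry.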

Custom Model Configuration

You can override specific agents or categories in your config:

{
  "$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-open-pentest/dev/assets/oh-my-open-pentest.schema.json",

  "agents": {
    // Main orchestrator: Claude Opus or Kimi K2.6 work best
    "cerberus": {
      "model": "kimi-for-coding/k2p5",
      "fullscan": { "model": "anthropic/claude-opus-4-7", "variant": "max" },
    },

    // Research agents: cheaper models are fine
    "intel": { "model": "google/gemini-3-flash" },
    "explore": { "model": "github-copilot/grok-code-fast-1" },

    // Architecture consultation: GPT or Claude Opus
    "oracle": { "model": "openai/gpt-5.5", "variant": "high" },
  },

  "categories": {
    // Frontend/UI work: Gemini dominates visual tasks
    "visual-engineering": {
      "model": "google/gemini-3.1-pro",
      "variant": "high",
    },

    // Hard logic and architecture: GPT-5.5 xhigh
    "ultrabrain": { "model": "openai/gpt-5.5", "variant": "xhigh" },

    // Autonomous research and execution
    "deep": { "model": "openai/gpt-5.5", "variant": "medium" },

    // Creative and design work
    "artistry": { "model": "google/gemini-3.1-pro", "variant": "high" },

    // Quick tasks: fast and cheap
    "quick": { "model": "openai/gpt-5.4-mini" },

    // Low-effort fallback: cheapest available
    "unspecified-low": { "model": "openai/gpt-5.4-mini" },

    // High-effort fallback: best available
    "unspecified-high": { "model": "anthropic/claude-opus-4-7", "variant": "max" },

    // Prose and documentation
    "writing": { "model": "anthropic/claude-opus-4-7", "variant": "high" },
  },
}

Model Families

Claude-like models (instruction-following, structured output):

  • Claude Opus 4.7, Claude Haiku 4.5
  • Kimi K2.6 / K2.5 — behaves very similarly to Claude
  • GLM 5 — Claude-like behavior, good for broad tasks

GPT models (explicit reasoning, principle-driven):

  • GPT-5.5 — deep coding powerhouse, required for Scylla and default for Cipher
  • GPT-5.4 Mini — fast and cheap utility tasks

Different-behavior models:

  • Gemini 3.1 Pro — excels at visual/frontend tasks
  • MiniMax M3 / M2.7 / M2.7-highspeed — fast and smart for utility tasks
  • Grok Code Fast 1 — optimized for code grep/search

See the Agent-Model Matching Guide for complete details on which models work best for each agent, safe vs dangerous overrides, and provider priority chains.


Why It's Better Than Pure Claude Code

Claude Code is good. But it's a single agent running a single model doing everything alone.

Oh My Open Pentest turns that into a coordinated team:

Parallel execution. Claude Code processes one thing at a time. OmO fires background agents in parallel, so research, implementation, and verification happen simultaneously. Like having 5 engineers instead of 1.

Hash-anchored edits. Claude Code's edit tool fails when the model can't reproduce lines exactly. OmO's LINE#ID content hashing validates every edit before applying. Grok Code Fast 1 went from 6.7% to 68.3% success rate just from this change.
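
To make the idea concrete, here is a rough sketch of hash-anchored editing. It is not OmO's implementation: the `<line>#<hash>` anchor layout, the FNV-1a hash, and the names `anchorFor` and `applyEdit` are all assumptions chosen for illustration.

```typescript
// Hypothetical sketch of hash-anchored edits; anchor format and hash are assumptions.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  // Fixed-width, 8-character hex digest.
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Anchor a line as "<1-based line number>#<content hash>".
function anchorFor(lines: string[], lineNumber: number): string {
  return `${lineNumber}#${fnv1a(lines[lineNumber - 1] ?? "")}`;
}

interface AnchoredEdit {
  anchor: string;
  replacement: string;
}

// Apply the edit only if the anchored line still has the content the model saw.
function applyEdit(lines: string[], edit: AnchoredEdit): string[] {
  const lineNumber = Number(edit.anchor.split("#")[0]);
  if (!Number.isInteger(lineNumber) || lineNumber < 1 || lineNumber > lines.length) {
    throw new Error(`anchor out of range: ${edit.anchor}`);
  }
  if (anchorFor(lines, lineNumber) !== edit.anchor) {
    throw new Error(`stale anchor: ${edit.anchor}`);
  }
  const next = [...lines];
  next[lineNumber - 1] = edit.replacement;
  return next;
}
```

The model only has to echo a short anchor instead of reproducing the original line byte for byte, and a stale or mistyped anchor is rejected before anything is written.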

IntentGate. Claude Code takes your prompt and runs. OmO classifies your true intent first — research, implementation, investigation, fix — then routes accordingly. Fewer misinterpretations, better results.

LSP + AST tools. Workspace-level rename, go-to-definition, find-references, pre-build diagnostics, AST-aware code rewrites. IDE precision that vanilla Claude Code doesn't have.

Skills with embedded MCPs. Each skill brings its own MCP servers, scoped to the task. Context window stays clean instead of bloating with every tool.

Discipline enforcement. Todo enforcer yanks idle agents back to work. Comment checker strips AI slop. Pentest Loop keeps going until 100% done. The system doesn't let the agent slack off.

The fundamental advantage. Models have different temperaments. Claude thinks deeply. GPT reasons architecturally. Gemini visualizes. Haiku moves fast. Single-model tools force you to pick one personality for all tasks. Oh My Open Pentest leverages them all, routing by task type. This isn't a temporary hack — it's the only architecture that makes sense as models specialize further. The gap between multi-model orchestration and single-model limitation widens every month. We're betting on that future.


IntentGate

Before acting on any request, Cerberus classifies your true intent.

Are you asking for research? Implementation? Investigation? A fix? IntentGate figures out what you actually want, not just the literal words you typed. This means the agent understands context, nuance, and the real goal behind your request.

Claude Code doesn't have this. It takes your prompt and runs. Oh My Open Pentest thinks first, then acts.


What's Next


Ready to start? Type fullscan and see what a coordinated AI team can do.

Installation

oh-my-open-pentest is an OpenCode plugin. Install OpenCode first, then install the plugin.


Prerequisites

1. OpenCode

# Install OpenCode (latest)
npm install -g opencode-ai

# Verify
opencode --version

2. Bun (required for the plugin installer)

# Linux / macOS
curl -fsSL https://bun.sh/install | bash

# Windows
powershell -c "irm bun.sh/install.ps1|iex"

# Verify
bun --version   # requires 1.x

3. Provider API key

At least one LLM provider is required. Anthropic Claude is strongly recommended — Cerberus (orchestrator) is optimized for Claude.

| Provider | Where to get key | Set in environment |
| --- | --- | --- |
| Anthropic (recommended) | console.anthropic.com | ANTHROPIC_API_KEY |
| OpenAI | platform.openai.com | OPENAI_API_KEY |
| Google Gemini | aistudio.google.com | GEMINI_API_KEY |

export ANTHROPIC_API_KEY="sk-ant-..."

Install

OpenCode Plugin (Ultimate Edition)

bunx oh-my-open-pentest install

Codex Light Edition

npx lazycodex-ai install

The interactive installer handles:

  1. Mode selection — default engagement mode (auto, bug-bounty, ctf, etc.)
  2. Provider auth — validates your API key(s) and detects available models
  3. Agent configuration — assigns models to each agent (Cerberus, Hydra, Scylla, etc.)
  4. Tool verification — checks which security tools are installed
  5. Health check — runs doctor to confirm everything works

Non-interactive install

bunx oh-my-open-pentest install --non-interactive

Applies defaults: auto mode, Anthropic provider, standard agent config.


Verify

bunx oh-my-open-pentest doctor

Doctor checks:

| Check | What it verifies |
| --- | --- |
| System | OpenCode version, Bun version, Node version |
| Config | Plugin registered, config schema valid |
| Provider | API key valid, models accessible |
| Agents | All 6 agents have valid model assignments |
| Tools | Security tools installed (40+ catalog entries checked) |
| Skills | 28 skill files present and parseable |

Inspect a failed check:

bunx oh-my-open-pentest doctor --json   # machine-readable output for scripting

Security Tools

oh-my-open-pentest checks for required tools before each phase and installs missing ones automatically. To pre-install all catalog tools:

Recon tools

# Go-based tools (requires Go 1.22+)
go install -v github.com/projectdiscovery/subfinder/v2/cmd/subfinder@latest
go install -v github.com/owasp-amass/amass/v4/...@master
go install -v github.com/tomnomnom/assetfinder@latest
go install -v github.com/projectdiscovery/httpx/cmd/httpx@latest
go install -v github.com/projectdiscovery/naabu/v2/cmd/naabu@latest
go install -v github.com/tomnomnom/anew@latest
go install -v github.com/projectdiscovery/notify/cmd/notify@latest

# Package manager
sudo apt install nmap massdns   # Linux
brew install nmap massdns        # macOS
winget install nmap              # Windows

Enumeration tools

go install -v github.com/projectdiscovery/nuclei/v3/cmd/nuclei@latest
go install github.com/ffuf/ffuf/v2@latest
go install github.com/OJ/gobuster/v3@latest
cargo install feroxbuster

sudo apt install nikto whatweb wafw00f   # Linux
pip install dirsearch

Exploitation tools

pip install sqlmap commix
sudo apt install hydra hashcat john metasploit-framework   # Linux
pip install pwntools

Active Directory tools

pip install bloodhound crackmapexec impacket
sudo apt install responder   # Linux

Forensics tools

sudo apt install volatility3 autopsy binwalk foremost bulk-extractor exiftool tcpdump tshark   # Linux
brew install exiftool tshark binwalk   # macOS
pip install volatility3

Reverse engineering tools

sudo apt install gdb ltrace strace   # Linux
brew install radare2 gdb             # macOS
pip install angr ROPgadget

# pwndbg (GDB plugin)
git clone https://github.com/pwndbg/pwndbg && cd pwndbg && ./setup.sh

# Ghidra
brew install --cask ghidra      # macOS
winget install NSA.Ghidra        # Windows
# Linux: download from https://ghidra-sre.org

# Cutter (Radare2 GUI)
brew install --cask cutter      # macOS

Mobile tools

# Android
sudo apt install adb apktool   # Linux
brew install android-platform-tools apktool jadx   # macOS
pip install frida-tools objection

# MobSF
git clone https://github.com/MobSF/Mobile-Security-Framework-MobSF.git
cd Mobile-Security-Framework-MobSF && ./setup.sh

Configuration

Project config

Create .opencode/oh-my-open-pentest.jsonc in your working directory:

{
  // Default engagement mode for this project
  "default_mode": "bug-bounty",

  // Team Mode (parallel agents)
  "team_mode": {
    "enabled": false,
    "max_parallel_members": 4,
    "tmux_visualization": false
  },

  // Pentest loop settings
  "pentest_loop": {
    "enabled": true,
    "default_max_iterations": 100,
    "default_strategy": "continue"
  }
}

User config

User-level config at ~/.config/opencode/oh-my-open-pentest.jsonc:

{
  // Agent model assignments
  "agents": {
    "cerberus": { "model": "claude-opus-4-8" },
    "hydra":    { "model": "claude-haiku-4-5-20251001" },
    "scylla":   { "model": "claude-sonnet-4-6" },
    "argus":    { "model": "claude-haiku-4-5-20251001" },
    "talos":    { "model": "claude-haiku-4-5-20251001" },
    "hermes":   { "model": "claude-sonnet-4-6" }
  },

  // Allowlist for MCP environment variable passthrough
  "mcp_env_allowlist": ["ANTHROPIC_API_KEY", "OPENAI_API_KEY"]
}

Config precedence

Project .opencode/oh-my-open-pentest.jsonc  (closest wins)
  ↓
User ~/.config/opencode/oh-my-open-pentest.jsonc
  ↓
Built-in defaults (Zod schema)
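
The precedence order above can be sketched as a layered merge. This is a minimal sketch that assumes a shallow, key-by-key merge; the real loader may merge nested objects differently, and `resolveConfig` is a hypothetical name.

```typescript
// Hypothetical sketch: merge config layers so that the closest layer wins.
type Config = Record<string, unknown>;

function resolveConfig(defaults: Config, user: Config, project: Config): Config {
  // Spread order encodes precedence: defaults < user < project.
  return { ...defaults, ...user, ...project };
}

const builtInDefaults: Config = { default_mode: "auto", pentest_loop: { enabled: true } };
const userConfig: Config = { default_mode: "ctf" };
const projectConfig: Config = { default_mode: "bug-bounty" };

const effective = resolveConfig(builtInDefaults, userConfig, projectConfig);
// effective["default_mode"] is "bug-bounty"; pentest_loop falls through from the defaults
```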

Upgrade

bunx oh-my-open-pentest@latest install

The installer detects existing config and migrates automatically.


Uninstallation

# Remove plugin from OpenCode config
jq '.plugin = [.plugin[] | select(. != "oh-my-open-pentest")]' \
    ~/.config/opencode/opencode.json > /tmp/oc.json && \
    mv /tmp/oc.json ~/.config/opencode/opencode.json

# Remove config files
rm -f ~/.config/opencode/oh-my-open-pentest.jsonc \
      .opencode/oh-my-open-pentest.jsonc

# Remove engagement artifacts (optional)
rm -rf .omop/

Troubleshooting

Plugin not loading

# Check plugin is registered
cat ~/.config/opencode/opencode.json | jq '.plugin'
# Should contain "oh-my-open-pentest"

# Re-register manually
bunx oh-my-open-pentest install --force

API key not found

# Check environment
echo $ANTHROPIC_API_KEY

# Set for current session
export ANTHROPIC_API_KEY="sk-ant-..."

# Set permanently (add to ~/.bashrc or ~/.zshrc)
echo 'export ANTHROPIC_API_KEY="sk-ant-..."' >> ~/.bashrc

Security tool missing

# Check specific tool
which subfinder

# Doctor shows all missing tools
bunx oh-my-open-pentest doctor

# Missing tools are auto-installed on first use
# Or install manually (see Security Tools section above)

OpenCode version incompatibility

# Check version
opencode --version

# Update OpenCode
npm install -g opencode-ai@latest

Windows: shell tools not found

Install Git Bash for POSIX tool compatibility:

winget install --id Git.Git -e --source winget

Then use Git Bash for running engagements.


For LLM Agents

To install oh-my-open-pentest via a Claude Code or other LLM agent session:

Install and configure oh-my-open-pentest by following the instructions here:
https://raw.githubusercontent.com/code-yeongyu/oh-my-open-pentest/refs/heads/dev/docs/guide/installation.md

Use curl to fetch this file — do NOT use WebFetch which may summarize content.

Installation flow for agents:

  1. Check prerequisites (OpenCode, Bun, API key)
  2. Run bunx oh-my-open-pentest install
  3. Answer: mode selection, provider auth, model assignments
  4. Verify with bunx oh-my-open-pentest doctor
  5. Report any failed checks back to user

Component CLI Binaries

These binaries are installed automatically to your Codex bin directory:

| Binary | Role |
| --- | --- |
| omop | Runtime wrapper, main entry point |
| omop-comment-checker | Comment quality enforcement hook |
| omop-git-bash-hook | Git Bash integration hook |
| lazycodex-executor-verify | LazyCodex executor verification |
| omop-lsp | LSP tools MCP server |
| omop-rules | Rules engine CLI |
| omop-start-work-continuation | Start-work continuation hook |
| omop-telemetry | Telemetry reporter |
| omop-pentest-loop | Pentest loop orchestrator |
| omop-fullscan | Full engagement scan trigger |

LazyCodex Agent Roles

Installed to ~/.codex/agents/{name}.toml:

  • lazycodex-clone-fidelity-reviewer
  • lazycodex-code-reviewer
  • lazycodex-executor
  • lazycodex-gate-reviewer
  • lazycodex-qa-executor

Further Reading

Orchestration System Guide

Oh My Open Pentest's orchestration system transforms a simple AI agent into a coordinated development team through separation of planning and execution.


TL;DR - When to Use What

| Complexity | Approach | When to Use |
| --- | --- | --- |
| Simple | Just prompt | Simple tasks, quick fixes, single-file changes |
| Complex + Lazy | Type ulw or fullscan | Complex tasks where explaining context is tedious. Agent figures it out. |
| Complex + Precise | @plan, then /start-work | Precise, multi-step work requiring true orchestration. Talos plans, Atlas executes. |

Decision Flow:


Is it a quick fix or simple task?
  └─ YES → Just prompt normally
  └─ NO  → Is explaining the full context tedious?
              └─ YES → Type "ulw" and let the agent figure it out
              └─ NO  → Do you need precise, verifiable execution?
                         └─ YES → Use @plan for Talos planning, then /start-work
                         └─ NO  → Just use "ulw"
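
The same decision flow, written as a function. The `TaskProfile` fields and `chooseApproach` are hypothetical names introduced only to mirror the three questions in the tree above.

```typescript
// Hypothetical sketch mirroring the decision tree; names are illustrative.
type Approach = "prompt" | "ulw" | "plan-then-start-work";

interface TaskProfile {
  isQuickFix: boolean;
  contextIsTediousToExplain: boolean;
  needsVerifiableExecution: boolean;
}

function chooseApproach(task: TaskProfile): Approach {
  if (task.isQuickFix) return "prompt"; // just prompt normally
  if (task.contextIsTediousToExplain) return "ulw"; // let the agent figure it out
  // Precise, verifiable work goes through Talos planning and /start-work.
  return task.needsVerifiableExecution ? "plan-then-start-work" : "ulw";
}
```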

The Architecture

The orchestration system uses a three-layer architecture that solves context overload, cognitive drift, and verification gaps through specialization and delegation.

flowchart TB
    subgraph Planning["Planning Layer (Human + Talos)"]
        User[("User")]
        Talos["Talos<br/>(Planner)<br/>claude-opus-4-7 / gpt-5.5 / glm-5"]
        Vanguard["Vanguard<br/>(Consultant)<br/>claude-sonnet-4-6 / claude-opus-4-7 / gpt-5.5 / glm-5"]
        Sentinel["Sentinel<br/>(Reviewer)<br/>gpt-5.5 / claude-opus-4-7 / gemini-3.1-pro / glm-5"]
    end

    subgraph Execution["Execution Layer (Orchestrator)"]
        Orchestrator["Atlas<br/>(Conductor)<br/>claude-sonnet-4-6 / kimi-k2.6 / gpt-5.5 / minimax-m3 / minimax-m2.7"]
    end

    subgraph Workers["Worker Layer (Specialized Agents)"]
        Junior["Cerberus-Junior<br/>(Task Executor)<br/>claude-sonnet-4-6 / kimi-k2.6 / gpt-5.5 / minimax-m3 / minimax-m2.7"]
        Cipher["Cipher<br/>(Architecture)<br/>gpt-5.5 / gemini-3.1-pro / claude-opus-4-7 / glm-5"]
        Scout["Scout<br/>(Codebase Grep)<br/>gpt-5.4-mini-fast / minimax-m2.7-highspeed / minimax-m3 / claude-haiku-4-5"]
        Intel["Intel<br/>(Docs/OSS)<br/>gpt-5.4-mini-fast / minimax-m2.7-highspeed / minimax-m3 / claude-haiku-4-5"]
        Frontend["visual-engineering<br/>(category + frontend)<br/>gemini-3.1-pro / glm-5 / claude-opus-4-7"]
    end

    User -->|"Describe work"| Talos
    Talos -->|"Consult"| Vanguard
    Talos -->|"Interview"| User
    Talos -->|"Generate plan"| Plan[".omop/plans/*.md"]
    Plan -->|"High accuracy?"| Sentinel
    Sentinel -->|"OKAY / REJECT"| Talos

    User -->|"/start-work"| Orchestrator
    Plan -->|"Read"| Orchestrator

    Orchestrator -->|"task(category=deep/quick/unspecified-*)"| Junior
    Orchestrator -->|"task(subagent_type=oracle)"| Cipher
    Orchestrator -->|"call_omo_agent(subagent_type=explore)"| Scout
    Orchestrator -->|"call_omo_agent(subagent_type=intel)"| Intel
    Orchestrator -->|"task(category=visual-engineering, load_skills=[frontend])"| Frontend

    Junior -->|"Results + Learnings"| Orchestrator
    Cipher -->|"Advice"| Orchestrator
    Scout -->|"Code patterns"| Orchestrator
    Intel -->|"Documentation"| Orchestrator
    Frontend -->|"UI code"| Orchestrator

Model labels above show the current fallback stacks from packages/omop-opencode/src/shared/model-requirements.ts, not marketing names.

Agent Inventory and Modes (Current)

The system has 11 built-in agents:

  • Primary: cerberus, scylla, talos, atlas
  • Subagent: oracle, intel, explore, lens, vanguard, sentinel, cerberus-junior

Canonical assembly order for primary agents is:

Cerberus → Scylla → Talos → Atlas

Mode distinction:

  • mode: "primary": top-level session agents selected directly in UI/CLI
  • mode: "subagent": worker/consultant agents invoked via task(..., subagent_type="...") or call_omo_agent(...)

Delegation Semantics (Important)

  • task(category="...") routes to Cerberus-Junior with category-optimized model routing
  • task(subagent_type="...") invokes that specific agent directly (for example oracle, explore, intel)
  • Category and subagent_type are mutually exclusive inputs in one call
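
These delegation rules can be sketched as a small validator. `DelegateArgs` and `routeTarget` are hypothetical names; only the `category` and `subagent_type` fields and the routing behavior come from the rules above.

```typescript
// Hypothetical sketch of delegation routing; the function name is an assumption.
interface DelegateArgs {
  category?: string;
  subagent_type?: string;
  prompt: string;
}

function routeTarget(args: DelegateArgs): string {
  const hasCategory = args.category !== undefined;
  const hasSubagent = args.subagent_type !== undefined;
  // The two inputs are mutually exclusive: exactly one must be present.
  if (hasCategory === hasSubagent) {
    throw new Error("pass exactly one of category or subagent_type");
  }
  // Category dispatch always lands on Cerberus-Junior;
  // subagent_type names the target agent directly.
  return hasCategory ? "cerberus-junior" : (args.subagent_type as string);
}
```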

Planning: Talos + Vanguard + Sentinel

Talos: Your Strategic Consultant

Talos is not just a planner; it is an intelligent interviewer that helps you think through what you actually need. It is READ-ONLY with respect to code: it can only create or modify markdown files within the .omop/ directory.

The Interview Process:

stateDiagram-v2
    [*] --> Interview: User describes work
    Interview --> Research: Launch explore/intel agents
    Research --> Interview: Gather codebase context
    Interview --> ClearanceCheck: After each response

    ClearanceCheck --> Interview: Requirements unclear
    ClearanceCheck --> PlanGeneration: All requirements clear

    state ClearanceCheck {
        [*] --> Check
        Check: Core objective defined?
        Check: Scope boundaries established?
        Check: No critical ambiguities?
        Check: Technical approach decided?
        Check: Test strategy confirmed?
    }

    PlanGeneration --> VanguardConsult: Mandatory gap analysis
    VanguardConsult --> WritePlan: Incorporate findings
    WritePlan --> HighAccuracyChoice: Present to user

    HighAccuracyChoice --> SentinelLoop: User wants high accuracy
    HighAccuracyChoice --> Done: User accepts plan

    SentinelLoop --> WritePlan: REJECTED - fix issues
    SentinelLoop --> Done: OKAY - plan approved

    Done --> [*]: Guide to /start-work

Intent-Specific Strategies:

Talos adapts its interview style based on what you're doing:

| Intent | Talos Focus | Example Questions |
| --- | --- | --- |
| Refactoring | Safety - behavior preservation | "What tests verify current behavior?" "Rollback strategy?" |
| Build from Scratch | Discovery - patterns first | "Found pattern X in codebase. Follow it or deviate?" |
| Mid-sized Task | Guardrails - exact boundaries | "What must NOT be included? Hard constraints?" |
| Architecture | Strategic - long-term impact | "Expected lifespan? Scale requirements?" |

Vanguard: The Gap Analyzer

Before Talos writes the plan, Vanguard catches what Talos missed:

  • Hidden intentions in user's request
  • Ambiguities that could derail implementation
  • AI-slop patterns (over-engineering, scope creep)
  • Missing acceptance criteria
  • Edge cases not addressed

Why Vanguard Exists:

The plan author (Talos) has "ADHD working memory" - it makes connections that never make it onto the page. Vanguard forces externalization of implicit knowledge.

Sentinel: The Ruthless Reviewer

For high-accuracy mode, Sentinel validates plans against four core criteria:

  1. Clarity: Does each task specify WHERE to find implementation details?
  2. Verification: Are acceptance criteria concrete and measurable?
  3. Context: Is there sufficient context to proceed without >10% guesswork?
  4. Big Picture: Are the purpose, background, and workflow clear?

The Sentinel Loop:

Sentinel only says "OKAY" when:

  • 100% of file references verified
  • ≥80% of tasks have clear reference sources
  • ≥90% of tasks have concrete acceptance criteria
  • Zero tasks require assumptions about business logic
  • Zero critical red flags

If REJECTED, Talos fixes issues and resubmits. No maximum retry limit.
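
The approval gate can be expressed as a single predicate. This is a sketch under assumptions: the `PlanReview` shape and `sentinelVerdict` are hypothetical, and the thresholds are read off the list above (every file reference verified, at least 80% and 90% coverage, zero assumptions, zero red flags).

```typescript
// Hypothetical sketch of the Sentinel approval gate; field names are assumptions.
interface PlanReview {
  fileRefsVerified: number; // fraction of file references verified, 0..1
  tasksWithReferenceSources: number; // fraction of tasks, 0..1
  tasksWithAcceptanceCriteria: number; // fraction of tasks, 0..1
  businessLogicAssumptions: number; // count of tasks needing assumptions
  criticalRedFlags: number; // count
}

function sentinelVerdict(review: PlanReview): "OKAY" | "REJECT" {
  const passes =
    review.fileRefsVerified >= 1 &&
    review.tasksWithReferenceSources >= 0.8 &&
    review.tasksWithAcceptanceCriteria >= 0.9 &&
    review.businessLogicAssumptions === 0 &&
    review.criticalRedFlags === 0;
  return passes ? "OKAY" : "REJECT";
}
```

Every condition must hold at once, which is why a single unverified file reference is enough to send the plan back to Talos.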


Execution: Atlas

The Conductor Mindset

Atlas is like an orchestra conductor: it doesn't play instruments, it ensures perfect harmony.

flowchart LR
    subgraph Orchestrator["Atlas"]
        Read["1. Read Plan"]
        Analyze["2. Analyze Tasks"]
        Wisdom["3. Accumulate Wisdom"]
        Delegate["4. Delegate Tasks"]
        Verify["5. Verify Results"]
        Report["6. Final Report"]
    end

    Read --> Analyze
    Analyze --> Wisdom
    Wisdom --> Delegate
    Delegate --> Verify
    Verify -->|"More tasks"| Delegate
    Verify -->|"All done"| Report

    Delegate -->|"background=false"| Workers["Workers"]
    Workers -->|"Results + Learnings"| Verify

What Atlas CAN do:

  • Read files to understand context
  • Run commands to verify results
  • Use lsp_diagnostics to check for errors
  • Search patterns with grep/glob/ast-grep

What Atlas MUST delegate:

  • Writing or editing code files
  • Fixing bugs
  • Creating tests
  • Git commits

Wisdom Accumulation

The power of orchestration is cumulative learning. After each task:

  1. Extract learnings from the subagent's response
  2. Categorize into: Conventions, Successes, Failures, Gotchas, Commands
  3. Pass forward to ALL subsequent subagents

This prevents repeating mistakes and ensures consistent patterns.
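
A rough sketch of that accumulate-and-forward loop follows. The `Wisdom` class and the prompt layout are hypothetical; only the five learning categories come from the text above.

```typescript
// Hypothetical sketch of wisdom accumulation; class and prompt format are assumptions.
type LearningKind = "Conventions" | "Successes" | "Failures" | "Gotchas" | "Commands";

interface Learning {
  kind: LearningKind;
  note: string;
}

class Wisdom {
  private readonly learnings: Learning[] = [];

  // Store what was extracted from a finished subagent's response.
  record(items: Learning[]): void {
    this.learnings.push(...items);
  }

  // Prefix the next subagent prompt with everything learned so far.
  buildPrompt(task: string): string {
    if (this.learnings.length === 0) return task;
    const notes = this.learnings.map((l) => `- [${l.kind}] ${l.note}`).join("\n");
    return `Known learnings:\n${notes}\n\nTask: ${task}`;
  }
}
```

Each new delegation sees the full history, so a convention discovered in an early task reaches every later one without the user repeating it.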

Notepad System:

.omop/notepads/{plan-name}/
├── learnings.md      # Patterns, conventions, successful approaches
├── decisions.md      # Architectural choices and rationales
├── issues.md         # Problems, blockers, gotchas encountered
├── verification.md   # Test results, validation outcomes
└── problems.md       # Unresolved issues, technical debt

Workers: Cerberus-Junior and Specialists

Cerberus-Junior: The Task Executor

Junior is the workhorse that actually writes code. Key characteristics:

  • Focused: Cannot delegate (blocked from task tool)
  • Disciplined: Obsessive todo tracking
  • Verified: Must pass lsp_diagnostics before completion
  • Constrained: Cannot modify plan files (READ-ONLY)

Why the fallback chain is sufficient:

Junior doesn't need to be the smartest - it needs to be reliable. With:

  1. Detailed prompts from Atlas (50-200 lines)
  2. Accumulated wisdom passed forward
  3. Clear MUST DO / MUST NOT DO constraints
  4. Verification requirements

Even a mid-tier execution model works when the harness is strict. The current fallback order is claude-sonnet-4-6 → kimi-k2.5 → gpt-5.5 → minimax-m3 → minimax-m2.7 → big-pickle. The intelligence is in the system, not a single worker model.

System Reminder Mechanism

The hook system ensures Junior never stops halfway:

[SYSTEM REMINDER - TODO CONTINUATION]

You have incomplete todos! Complete ALL before responding:
- [ ] Implement user service ← IN PROGRESS
- [ ] Add validation
- [ ] Write tests

DO NOT respond until all todos are marked completed.

This relentless continuation mechanism is why the system is named after Cerberus, the guardian who never leaves his post.
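
A reminder like the one shown above could be generated from the todo list roughly as follows. The `Todo` shape and `buildReminder` are hypothetical; the message text mirrors the sample reminder.

```typescript
// Hypothetical sketch of the todo-continuation reminder; types are assumptions.
interface Todo {
  title: string;
  status: "pending" | "in_progress" | "completed";
}

// Returns the reminder text, or null when every todo is completed.
function buildReminder(todos: Todo[]): string | null {
  const open = todos.filter((t) => t.status !== "completed");
  if (open.length === 0) return null;
  const items = open.map(
    (t) => `- [ ] ${t.title}${t.status === "in_progress" ? " ← IN PROGRESS" : ""}`,
  );
  return [
    "[SYSTEM REMINDER - TODO CONTINUATION]",
    "",
    "You have incomplete todos! Complete ALL before responding:",
    ...items,
    "",
    "DO NOT respond until all todos are marked completed.",
  ].join("\n");
}
```

A hook would call this when the agent goes idle and inject the result only when it is not null.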


Category + Skill System

Why Categories are Revolutionary

The Problem with Model Names:

// OLD: Model name creates distributional bias
task({ agent: "gpt-5.5", prompt: "..." }); // Model knows its limitations
task({ agent: "claude-opus-4-7", prompt: "..." }); // Different self-perception

The Solution: Semantic Categories:

// NEW: Category describes INTENT, not implementation
task({ category: "ultrabrain", prompt: "..." }); // "Think strategically"
task({ category: "visual-engineering", prompt: "..." }); // "Design beautifully"
task({ category: "quick", prompt: "..." }); // "Just get it done fast"

Delegate-Task Categories

task(category="...") supports these category names in user-facing orchestration:

visual-engineering, artistry, ultrabrain, deep, quick, unspecified-low, unspecified-high, writing, quick-rust, quick-zig, git

Notes:

  • Built-in defaults are defined in packages/omop-opencode/src/tools/delegate-task/*-categories.ts and packages/omop-opencode/src/shared/model-requirements.ts
  • Projects/users can extend categories via config; additional category names may appear in your session prompt
  • Regardless of category name, category dispatch goes through Cerberus-Junior

Skills: Domain-Specific Instructions

Skills prepend specialized instructions to subagent prompts:

// Category + Skill combination
task({
  category: "visual-engineering",
  load_skills: ["frontend"], // Adds UI/UX expertise
  prompt: "...",
});

task({
  category: "deep",
  load_skills: ["playwright"], // Adds browser automation expertise
  prompt: "...",
});

Skill loading priority is:

project > opencode > user > builtin
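The priority order above can be read as "the first source that defines the skill wins". The sketch below illustrates that reading; `SkillDefinition` and `resolveSkill` are hypothetical names, not the plugin's real loader API.

```typescript
type SkillSource = "project" | "opencode" | "user" | "builtin";

// Hypothetical shape for a discovered skill.
interface SkillDefinition {
  name: string;
  source: SkillSource;
  instructions: string;
}

// Highest priority first, as stated in the guide.
const SKILL_PRIORITY: SkillSource[] = ["project", "opencode", "user", "builtin"];

// Picks the definition from the highest-priority source that has the skill.
function resolveSkill(
  name: string,
  candidates: SkillDefinition[],
): SkillDefinition | undefined {
  for (const source of SKILL_PRIORITY) {
    const match = candidates.find((s) => s.name === name && s.source === source);
    if (match) return match;
  }
  return undefined;
}
```

Under this reading, a project-level `frontend` skill would shadow a builtin skill of the same name.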

Skill MCP (Tier 3)

Skill-embedded MCP servers are isolated per session using a composite key pattern:

${sessionID}:${skillName}:${serverName}

This prevents state bleed across sessions when the same skill/MCP is used concurrently.
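As an illustration of why the composite key isolates sessions, here is a small sketch of a registry keyed that way. The key format comes from the text above; the `clients` map, the `McpClientHandle` type, and `getOrCreateClient` are assumptions for the example only.

```typescript
// Builds the composite key described above.
function mcpClientKey(sessionID: string, skillName: string, serverName: string): string {
  return `${sessionID}:${skillName}:${serverName}`;
}

// Hypothetical stand-in for a running MCP client.
interface McpClientHandle {
  key: string;
}

// Hypothetical registry: one handle per (session, skill, server) triple.
const clients = new Map<string, McpClientHandle>();

function getOrCreateClient(
  sessionID: string,
  skillName: string,
  serverName: string,
): McpClientHandle {
  const key = mcpClientKey(sessionID, skillName, serverName);
  let client = clients.get(key);
  if (client === undefined) {
    client = { key };
    clients.set(key, client);
  }
  return client;
}
```

Two sessions loading the same skill and server get distinct handles, while repeated calls within one session reuse the same one.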

Background Task Concurrency

Background task concurrency defaults to 5 when no overrides are configured.

  • Keyed by model/provider routing key
  • Configurable via background_task.defaultConcurrency, background_task.providerConcurrency, and background_task.modelConcurrency
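One plausible way these three settings could combine is sketched below. The setting names and the default of 5 come from the text above. The precedence (model override, then provider override, then the configured default) is an assumption, and `concurrencyLimit` is a hypothetical helper rather than the plugin's real function.

```typescript
// Mirrors the background_task settings named in the guide.
interface BackgroundTaskConfig {
  defaultConcurrency?: number;
  providerConcurrency?: Record<string, number>;
  modelConcurrency?: Record<string, number>;
}

const BUILTIN_DEFAULT_CONCURRENCY = 5;

// Assumed precedence: model > provider > configured default > built-in default.
function concurrencyLimit(model: string, config: BackgroundTaskConfig = {}): number {
  const slash = model.indexOf("/");
  const provider = slash === -1 ? model : model.slice(0, slash);
  return (
    config.modelConcurrency?.[model] ??
    config.providerConcurrency?.[provider] ??
    config.defaultConcurrency ??
    BUILTIN_DEFAULT_CONCURRENCY
  );
}
```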

Team Mode

Team mode is parallel multi-agent orchestration and is OFF by default.

For subagent_type team members, current eligibility is:

  • Eligible: cerberus, atlas, cerberus-junior
  • Conditional: scylla (requires teammate permission enablement)
  • Hard-reject: oracle, intel, explore, lens, vanguard, sentinel, talos

Why oracle/talos are rejected in team members:

  • oracle (Cipher) is read-only (cannot write/edit/patch/delegate)
  • Talos is constrained to .omop/*.md writes by the talos-md-only hook
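The eligibility rules above amount to a lookup table plus one permission check. A minimal sketch, assuming a hypothetical `canJoinTeam` helper and assuming unknown agent names are rejected (the guide does not say how they are treated):

```typescript
type TeamEligibility = "eligible" | "conditional" | "hard-reject";

// Agent names and groupings are taken from the list above.
const TEAM_ELIGIBILITY: Record<string, TeamEligibility> = {
  cerberus: "eligible",
  atlas: "eligible",
  "cerberus-junior": "eligible",
  scylla: "conditional",
  oracle: "hard-reject",
  intel: "hard-reject",
  explore: "hard-reject",
  lens: "hard-reject",
  vanguard: "hard-reject",
  sentinel: "hard-reject",
  talos: "hard-reject",
};

// teammatePermissionEnabled models the "teammate permission enablement"
// that scylla requires; the parameter name is illustrative.
function canJoinTeam(agent: string, teammatePermissionEnabled = false): boolean {
  const status = TEAM_ELIGIBILITY[agent];
  if (status === "eligible") return true;
  if (status === "conditional") return teammatePermissionEnabled;
  return false; // hard-reject, and (by assumption) anything unlisted
}
```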

Usage Patterns

How to Invoke Talos

Method 1: Switch to Talos Agent (Tab → Select Talos)

1. Press Tab at the prompt
2. Select "Talos" from the agent list
3. Describe your work: "I want to refactor the auth system"
4. Answer interview questions
5. Talos creates plan in .omop/plans/{name}.md

Method 2: Use @plan Command (in Cerberus)

1. Stay in Cerberus (default agent)
2. Type: @plan "I want to refactor the auth system"
3. The @plan command automatically switches to Talos
4. Answer interview questions
5. Talos creates plan in .omop/plans/{name}.md

Which Should You Use?

Scenario | Recommended Method | Why
New session, starting fresh | Switch to Talos agent | Clean mental model - you're entering "planning mode"
Already in Cerberus, mid-work | Use @plan | Convenient, no agent switch needed
Want explicit control | Switch to Talos agent | Clear separation of planning vs execution contexts
Quick planning interrupt | Use @plan | Fastest path from current context

Both methods trigger the same Talos planning flow. The @plan command is simply a convenience shortcut.

/start-work Behavior and Session Continuity

What Happens When You Run /start-work:

User: /start-work
    ↓
[start-work hook activates]
    ↓
Check: Does .omop/boulder.json exist?
    ↓
    ├─ YES (existing work) → RESUME MODE
    │   - Read the existing boulder state
    │   - Calculate progress (checked vs unchecked boxes)
    │   - Inject continuation prompt with remaining tasks
    │   - Atlas continues where you left off
    │
    └─ NO (fresh start) → INIT MODE
        - Find the most recent plan in .omop/plans/
        - Create new boulder.json tracking this plan
        - Switch session agent to Atlas
        - Begin execution from task 1
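The branch in the flow above can be sketched as a pure decision function. Everything named here (`PlanFile`, `StartWorkDecision`, `decideStartWork`) is hypothetical, and treating "most recent" as "latest modification time" is an assumption.

```typescript
// Hypothetical description of a plan file found in .omop/plans/.
interface PlanFile {
  path: string;
  modifiedAtMs: number;
}

type StartWorkDecision =
  | { mode: "resume" }
  | { mode: "init"; planPath: string }
  | { mode: "no-plan" };

// boulderExists: whether .omop/boulder.json is present.
function decideStartWork(boulderExists: boolean, plans: PlanFile[]): StartWorkDecision {
  if (boulderExists) return { mode: "resume" };
  if (plans.length === 0) return { mode: "no-plan" };
  // Assumption: the most recent plan is the one modified last.
  const latest = plans.reduce((a, b) => (b.modifiedAtMs > a.modifiedAtMs ? b : a));
  return { mode: "init", planPath: latest.path };
}
```

The "no-plan" case corresponds to the "no active plan found" message covered under Troubleshooting.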

Session Continuity Explained:

The boulder.json file tracks:

  • active_plan: Path to the current plan file
  • session_ids: All sessions that have worked on this plan
  • started_at: When work began
  • plan_name: Human-readable plan identifier
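To make the tracked fields and the "checked vs unchecked boxes" progress calculation concrete, here is a small sketch. The field names come from the list above; the field types, the checkbox regex, and the helper names `planProgress` and `resumeMessage` are assumptions.

```typescript
// Field names as listed above; the types are assumed for illustration.
interface BoulderState {
  active_plan: string;
  session_ids: string[];
  started_at: string; // assumed to be an ISO timestamp
  plan_name: string;
}

// Counts markdown task boxes ("- [ ]" / "- [x]") in a plan file.
function planProgress(planMarkdown: string): { done: number; total: number } {
  const boxes: string[] = planMarkdown.match(/^[ \t]*[-*] \[[ xX]\]/gm) ?? [];
  const done = boxes.filter((b) => /\[[xX]\]/.test(b)).length;
  return { done, total: boxes.length };
}

// Formats a resume line in the style shown in the example timeline.
function resumeMessage(state: BoulderState, planMarkdown: string): string {
  const { done, total } = planProgress(planMarkdown);
  return `Resuming '${state.plan_name}' - ${done} of ${total} tasks complete`;
}
```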

Example Timeline:

Monday 9:00 AM
  └─ @plan "Build user authentication"
  └─ Talos interviews and creates plan
  └─ User: /start-work
  └─ Atlas begins execution, creates boulder.json
  └─ Task 1 complete, Task 2 in progress...
  └─ [Session ends - computer crash, user logout, etc.]

Monday 2:00 PM (NEW SESSION)
  └─ User opens new session (agent = Cerberus by default)
  └─ User: /start-work
  └─ [start-work hook reads boulder.json]
  └─ "Resuming 'Build user authentication' - 1 of 8 tasks complete"
  └─ Atlas continues from Task 2 (no context lost)

Atlas is automatically activated when you run /start-work. You don't need to manually switch to Atlas.

Scylla vs Cerberus + fullscan

Quick Comparison:

Aspect | Scylla | Cerberus + ulw / fullscan
Model | gpt-5.5 (medium) | claude-opus-.-7 / kimi-k2.5 / gpt-5.5 / glm-5 depending on setup
Approach | Autonomous deep worker | Keyword-activated fullscan mode
Best For | Complex architectural work, deep reasoning | General complex tasks, "just do it" scenarios
Planning | Self-plans during execution | Uses Talos plans if available
Delegation | Heavy use of explore/intel agents | Uses category-based delegation
Temperature | 0.. | 0..

When to Use Scylla:

Switch to Scylla (Tab → Select Scylla) when:

1. Deep architectural reasoning needed

  • "Design a new plugin system"
  • "Refactor this monolith into microservices"

2. Complex debugging requiring inference chains

  • "Why does this race condition only happen on Tuesdays?"
  • "Trace this memory leak through 15 files"

3. Cross-domain knowledge synthesis

  • "Integrate our Rust core with the TypeScript frontend"
  • "Migrate from MongoDB to PostgreSQL with zero downtime"

4. You specifically want GPT-5.5 reasoning

  • Some problems benefit from GPT-5.5's training characteristics

When to Use Cerberus + ulw:

Use the ulw keyword in Cerberus when:

1. You want the agent to figure it out

  • "ulw fix the failing tests"
  • "ulw add input validation to the API"

2. Complex but well-scoped tasks

  • "ulw implement JWT authentication following our patterns"
  • "ulw create a new CLI command for deployments"

3. You're feeling lazy (officially supported use case)

  • Don't want to write detailed requirements
  • Trust the agent to explore and decide

4. You want to leverage existing plans

  • If a Talos plan exists, ulw mode can use it
  • Falls back to autonomous exploration if no plan

Recommendation:

  • For most users: Use ulw keyword in Cerberus. It's the default path and works excellently for 90% of complex tasks.
  • For power users: Switch to Scylla when you specifically need GPT-5.5's reasoning style or want the "AmpCode deep mode" experience of fully autonomous exploration and execution.

Configuration

You can control related features in oh-my-open-pentest.json:

{
  "cerberus_agent": {
    "disabled": false, // Enable Atlas orchestration (default: false)
    "planner_enabled": true, // Enable Talos (default: true)
    "replace_plan": true, // Replace default plan agent with Talos (default: true)
  },

  // Hook settings (add to disable)
  "disabled_hooks": [
    // "start-work",             // Disable execution trigger
    // "talos-md-only"      // Remove Talos write restrictions (not recommended)
  ],
}

Troubleshooting

"I switched to Talos but nothing happened"

Talos enters interview mode by default. It will ask you questions about your requirements. Answer them, then say "make it a plan" when ready.

"/start-work says 'no active plan found'"

Either:

  • No plans exist in .omop/plans/ → Create one with Talos first
  • Plans exist but boulder.json points elsewhere → Delete .omop/boulder.json and retry

"I'm in Atlas but I want to switch back to normal mode"

Type exit or start a new session. Atlas is primarily entered via /start-work - you don't typically "switch to Atlas" manually.

"What's the difference between @plan and just switching to Talos?"

Nothing functional. Both invoke Talos. @plan is a convenience command while switching agents is explicit control. Use whichever feels natural.

"Should I use Scylla or type ulw?"

For most tasks: Type ulw in Cerberus.

Use Scylla when: You specifically need GPT-5.5's reasoning style for deep architectural work or complex debugging.


Further Reading

Agent-Model Matching Guide

For agents and users: Why each agent needs a specific model — and how to customize without breaking things.


🚨 READ THIS FIRST — CERBERUS IS NOT A "RUN IT ON ANY MODEL" SYSTEM 🚨

STOP. BEFORE YOU POINT CERBERUS AT SOME OTHER MODEL, READ EVERY WORD BELOW. THIS IS THE SINGLE MOST IGNORED THING IN THIS WHOLE GUIDE.

CERBERUS HAS ONLY EVER BEEN TESTED AND VERIFIED ON THE EXACT MODELS LISTED IN THIS DOCUMENT — AND NOTHING, NOTHING, ELSE. The supported set is narrow on purpose:

  • Claude family: Fable 5 · Opus ..8 · Opus ..7 · Sonnet ..6
  • Kimi: K2.7 · K2.6 · K2.5
  • GLM: 5 / 5.. (acceptable — slightly looser on the long nested workflows)
  • GPT: 5.. / 5.5 (dedicated GPT prompt path exists — supported, but still NOT the recommended default for the orchestrator)

IF A MODEL IS NOT ON THAT LIST, IT IS 100% UNTESTED AND 100% UNVERIFIED WITH CERBERUS. It may not work at all. It may look like it works and then fall apart three tool-calls later. AND IF IT SOMEHOW WORKS FOR YOU — THAT IS A LITERAL MIRACLE. IT IS NOT A SUPPORTED CONFIGURATION, IT IS NOT BLESSED, AND IT IS NOT A PROMISE THAT IT WILL STILL WORK TOMORROW.

EVERY SINGLE PROMPT CHANGE TO CERBERUS IS WRITTEN, TUNED, AND REGRESSION-CHECKED AGAINST THE MODELS ABOVE — AND ONLY THOSE MODELS. Nobody is watching how an off-list model behaves. The consequences are not subtle:

  • AN UNLISTED MODEL CAN BREAK AT THE VERY NEXT PATCH, WITH ZERO WARNING. A prompt tweak that helps Claude/Kimi can silently shatter whatever fragile thing was holding your off-list model together — and we will never notice, because we are not testing it. Do not file it as a bug. It was never working on purpose.
  • A PROMPT CANNOT FIX A MODEL. Models have hard, intrinsic characteristics. No amount of prompt-carving makes a model do what it fundamentally cannot do. If a model is the wrong brain for orchestration, it stays the wrong brain — forever, no matter how perfectly the prompt is shaped. We have ground prompts down to the bone; the model that can't, still can't.

SO, GENUINELY AND SINCERELY, FROM THE BOTTOM OF OUR HEARTS: RUNNING CERBERUS ON ANY MODEL NOT LISTED HERE IS STRONGLY, EMPHATICALLY, DESPERATELY NOT RECOMMENDED. Do it anyway and you are fully on your own — and you should expect it to break.

MiniMax / Qwen / MiMo / DeepSeek as Cerberus — JUST DON'T

We have NOT found any way to make MiniMax, Qwen, MiMo, or DeepSeek work acceptably as Cerberus. We tried. They do not hold up under Cerberus's nested todo + delegation + orchestration prompt. This is not a "tune it more" situation — see the rule above: a prompt cannot fix a model.

MiniMax and Qwen in particular are so bad in the Cerberus role that we would almost forbid it outright. Treat "Cerberus on MiniMax" and "Cerberus on Qwen" as configurations you should simply never reach for. (These models still have legitimate jobs elsewhere — MiniMax for fast utility fallback, Qwen for visual work, both documented below — just NEVER as the orchestrator.)


The Core Insight: Models Are Developers

Think of AI models as developers on a team. Each has a different brain, different personality, different strengths. A model isn't just "smarter" or "dumber." It thinks differently. Give the same instruction to Claude and GPT, and they'll interpret it in fundamentally different ways.

This isn't a bug. It's the foundation of the entire system.

Oh My Open Pentest assigns each agent a model that matches its working style — like building a team where each person is in the role that fits their personality.

Cerberus: The Sociable Lead

Cerberus is the developer who knows everyone, goes everywhere, and gets things done through communication and coordination. Talks to other agents, understands context across the whole codebase, delegates work intelligently, and codes well too. But deep, purely technical problems? He'll struggle a bit.

This is why Cerberus uses Claude / Kimi / GLM. These models excel at:

  • Following complex, multi-step instructions (Cerberus's prompt is ~.,.00 lines)
  • Maintaining conversation flow across many tool calls
  • Understanding nuanced delegation and orchestration patterns
  • Producing well-structured, communicative output

Using Cerberus with older GPT models would be like taking your best project manager — the one who coordinates everyone, runs standups, and keeps the whole team aligned — and sticking them in a room alone to debug a race condition. Wrong fit. GPT-5.. and GPT-5.5 now have dedicated Cerberus prompt paths, but GPT is still not the default recommendation for the orchestrator.

⚠️ Cerberus is ONLY tested on Claude (Fable 5 / Opus ..8 / ..7 / Sonnet ..6), Kimi (K2.7 / K2.6 / K2.5), GLM (5 / 5..), and GPT (5.. / 5.5). Anything else is untested, unsupported, and can break without warning. MiniMax and Qwen as Cerberus are strongly discouraged to the point we'd almost forbid it. Read the 🚨 READ THIS FIRST warning at the very top of this guide before you override the orchestrator's model.

Scylla: The Deep Specialist

Scylla is the developer who stays in their room coding all day. Doesn't talk much. Might seem socially awkward. But give them a hard technical problem and they'll emerge three hours later with a solution nobody else could have found.

This is why Scylla uses GPT-5.5. GPT-5.5 is built for exactly this:

  • Deep, autonomous exploration without hand-holding
  • Multi-file reasoning across complex codebases
  • Principle-driven execution (give a goal, not a recipe)
  • Working independently for extended periods

Using Scylla with GLM or Kimi would be like assigning your most communicative, sociable developer to sit alone and do nothing but deep technical work. They'd get it done eventually, but they wouldn't shine — you'd be wasting exactly the skills that make them valuable.

The Takeaway

Every agent's prompt is tuned to match its model's personality. When you change the model, you change the brain — and the same instructions get understood completely differently. Model matching isn't about "better" or "worse." It's about fit.


How Claude and GPT Think Differently

This matters for understanding why some agents support both model families while others don't.

Claude responds to mechanics-driven prompts — detailed checklists, templates, step-by-step procedures. More rules = more compliance. You can write a .,.00-line prompt with nested workflows and Claude will follow every step.

GPT (especially 5.2+) responds to principle-driven prompts — concise principles, XML structure, explicit decision criteria. More rules = more contradiction surface = more drift. GPT works best when you state the goal and let it figure out the mechanics.

Talos used to mirror this split with separate model-family prompts. It now uses a single thin prompt backed by ulw-plan, so swapping its model changes the fallback choice, not the prompt file.

Atlas still supports model-family prompt behavior. Talos does not auto-switch prompts at runtime.


Step 1 — Check What's Actually Available

Before configuring anything, see what your current system can run.

List all available models

opencode models

This prints every provider/model combination you can address right now. Providers are derived from your connected auth + the models.dev catalogue.

Opencode sorts the output so opencode* providers appear first — that's intentional, not cosmetic.

List connected providers

opencode auth list

Shows which providers you've already logged into.

If the model you want isn't listed

You need to log in to that provider:

opencode auth login

The interactive picker prioritizes providers in this order:

Priority | Provider | Opencode's own hint
0 | opencode | (Recommended)
1 | opencode-go | Low cost subscription for everyone
2 | openai | ChatGPT Plus/Pro or API key
3 | github-copilot |
4 | anthropic | API key
5 | google |

You can also skip the picker: opencode auth login --provider opencode-go.

Verify what oh-my-open-pentest will actually use

bunx oh-my-open-pentest doctor

This shows the effective model resolution for every agent and category based on your current auth state. If an agent says "system-default" instead of a real fallback, that's a signal you're missing providers from its chain.


Step 2 — The Recommended Stack

You don't need every provider. You need the right two.

The Optimal Combination: OpenCode Go + OpenAI Plus/Pro

~$30/month total. Beats direct Anthropic + OpenAI + Google subscriptions (~$60+/month) on both cost and coverage.

Subscription Cost What You Get Covers
OpenCode Go $10/mo kimi-k2.5, kimi-k2.6, glm-5, glm-5.., minimax-m2.5, minimax-m2.7, minimax-m3, mimo-v2-pro, qwen3.5-plus, qwen3.6-plus Claude-family alternatives (Kimi, GLM), Gemini-family alternatives (Qwen), utility/retrieval (MiniMax)
OpenAI Plus/Pro $20+/mo gpt-5.., gpt-5..-pro, gpt-5.5, gpt-5.5-codex GPT-native agents (Scylla, Cipher, Sentinel), GPT fallbacks for model-flexible agents

Why this specific combination

1. Scylla requires GPT-5.5. It has no Claude-family fallback. ChatGPT Plus/Pro or OpenAI API access is the cheapest real path.
2. OpenCode Go covers the orchestration and creative surface. Kimi K2.5/2.6 behaves like Claude for Cerberus/Atlas. GLM-5 fills the long tail. Qwen handles visual tasks when Gemini isn't available.
3. No single provider can cover everything. Anthropic-only setups break Scylla. OpenAI-only setups degrade Cerberus. You need at least one from each family.

What if you already have a Claude subscription?

Add --claude=max20 (or yes) on install. The Claude chain default (Opus ..7, snapshot-backed) activates for Cerberus/Talos/Atlas and you still get the OpenCode Go fallbacks for free. Pin claude-opus-.-8 or claude-fable-5 to run the current top Claude with Cerberus/Atlas tuned prompts; Talos keeps its single ulw-plan-backed prompt. Best-in-class orchestration + budget safety net.

What if you have zero subscriptions?

OpenCode Go alone gets Cerberus/Atlas/Cipher/Intel/Scout working. Scylla won't activate without GPT access, so you lose autonomous deep work. Consider adding ChatGPT Plus as soon as you can.


Step 3 — Model Family Alternatives (Priority Order)

When the "native" model isn't available, oh-my-open-pentest walks each agent's fallback chain until something connects. The chains are hardcoded in packages/omop-opencode/src/shared/model-requirements.ts. There is no single global priority list. Every agent and category has its own chain.

There are two separate systems:

  • model-fallback: proactive resolution in chat.params using hardcoded AGENT_MODEL_REQUIREMENTS and CATEGORY_MODEL_REQUIREMENTS
  • runtime-fallback: reactive recovery from session.error, configurable per category/agent in runtime-fallback hooks
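To illustrate what "walking a fallback chain until something connects" can look like, here is a sketch that parses entries written in the `provider|provider/model (variant)` notation used in the tables of this guide and returns the first entry whose provider is connected. It is not the code in model-requirements.ts. `ChainEntry`, `parseEntry`, and `resolveFromChain` are hypothetical names, and returning "system-default" when nothing matches is an assumption based on the doctor output described earlier.

```typescript
interface ChainEntry {
  providers: string[];
  model: string;
  variant: string | undefined;
}

// Parses e.g. "openai|opencode/gpt-5.5 (medium)".
function parseEntry(text: string): ChainEntry {
  const trimmed = text.trim();
  const slash = trimmed.indexOf("/");
  if (slash <= 0) throw new Error(`Unrecognized chain entry: ${text}`);
  const providers = trimmed.slice(0, slash).split("|");
  let model = trimmed.slice(slash + 1);
  let variant: string | undefined;
  const paren = model.indexOf(" (");
  if (paren !== -1 && model.endsWith(")")) {
    variant = model.slice(paren + 2, -1);
    model = model.slice(0, paren);
  }
  return { providers, model, variant };
}

// Returns "provider/model" for the first entry with a connected provider.
function resolveFromChain(chain: string[], connected: Set<string>): string {
  for (const raw of chain) {
    const entry = parseEntry(raw);
    const provider = entry.providers.find((p) => connected.has(p));
    if (provider !== undefined) return `${provider}/${entry.model}`;
  }
  return "system-default"; // assumed sentinel when no entry matches
}
```

With only `opencode-go` connected, a chain that lists a GPT entry first and a Kimi entry second would resolve to the Kimi entry.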

Current top tier vs the auto-resolution chain

Two things move at different speeds, and the difference explains why "Opus ..7" still appears as a default below:

  • The current top models are Claude Fable 5 and Opus ..8, and Kimi K2.7. Cerberus and Atlas have dedicated tuned prompt paths for these models; Talos keeps one ulw-plan-backed prompt. Pin one in your config: "anthropic/claude-opus-.-8", "anthropic/claude-fable-5", "opencode-go/kimi-k2.7". Use that when you want to opt into the newer model explicitly.
  • The auto-resolution fallback chains below still lead with Opus ..7 and Kimi K2.6. That is intentional, not stale: the chains only auto-select models the bundled capability snapshot is built against, so variant and context-window resolution stay correct. They promote Opus ..8 / K2.7 to chain defaults once those land in the model catalog; until then you opt into the newer models — and their prompts — by naming them explicitly.

So an "Opus ..7 (max)" entry in the chains below is the snapshot-backed floor, not a recommendation to prefer ..7 over ..8.

Claude Family (communicative, instruction-following)

Used by: Cerberus, Atlas, Cerberus-Junior, Vanguard (Claude path), Talos (primary fallback), unspecified-low, unspecified-high.

Priority Model Provider Why
1 claude-fable-5 / claude-opus-.-8 / claude-opus-.-7 (max) anthropic, github-copilot, opencode, vercel Best overall compliance with the ~.,.00-line Cerberus prompt. Cerberus carries per-version prompts for all three; Talos uses its single ulw-plan-backed prompt. Opus ..7 is the hardcoded chain default for budget stability.
2 claude-sonnet-.-6 same Faster, cheaper, still Claude.
3 kimi-k2.7 - RECOMMENDED ALTERNATIVE (newest) opencode-go, kimi-for-coding, moonshotai, opencode, vercel Restrained, outcome-first, and the top Kimi when Anthropic isn't connected. Agents with Kimi-specific prompt paths use their K2.7 tuning; Talos keeps its ulw-plan-backed prompt.
4 kimi-k2.6 or kimi-k2.5 — RECOMMENDED ALTERNATIVE same as K2.7 Instruction-following mirrors Claude closely. Current default Kimi in the chains.
5 glm-5 or glm-5.. — ACCEPTABLE ALTERNATIVE opencode-go, zai-coding-plan, opencode, vercel Claude-like, slightly looser on long nested workflows. Solid fallback.
6 big-pickle (GLM ..6) opencode Free-tier safety net.

Kimi ≻ GLM. Kimi (K2.7 newest, then K2.6/K2.5) holds up under Cerberus's nested todo+delegation prompts better than GLM. Use Kimi whenever both are available.

GPT Family (principle-driven, autonomous)

Used by: Scylla, Cipher, Sentinel, deep, ultrabrain, quick, Talos (GPT fallback), Atlas (GPT path).

Priority Model Provider Why
1 gpt-5.5 / gpt-5.. (pro / xhigh / high / medium) openai, github-copilot, opencode, vercel Native OpenAI is the gold standard for principle-driven prompts. Scylla requires this family.
2 gpt-5.5-codex same Still the deep-coding powerhouse. Kept as an explicit override option.
3 DeepSeek — LIMITED ALTERNATIVE (deepseek-v3.2, deepseek-chat-v3..) openrouter/deepseek Closest OSS equivalent for autonomous coding behavior. Not wired into default chains — add via fallback_models.
4 MiniMax — STRONGLY DISCOURAGED (minimax-m3, minimax-m2.7, minimax-m2.5) opencode-go, opencode, openrouter/minimax Used only in utility fallback chains (Scout, Intel, quick). Consistency and long-context management issues make it a poor substitute for Scylla/Cipher. Do NOT override deep agents to MiniMax.

DeepSeek ≻≻ MiniMax. DeepSeek retains GPT's autonomous exploration character. MiniMax loses coherence on multi-step deep work. MiniMax is fine for grep-style utility agents, nothing more.

Gemini Family (visual, different reasoning style)

Used by: visual-engineering, artistry, Cipher (visual fallback), Lens.

Priority Model Provider Why
1 gemini-3..-pro (high) google, github-copilot, opencode, vercel Best for UI/UX, CSS, design tokens, layout decisions. artistry category requires this family.
2 gemini-3-flash same Fast variant, writing/doc tasks.
3 Qwen — ALTERNATIVE (qwen3.6-plus, qwen3.5-plus) opencode-go, openrouter/qwen Closest vision-capable substitute when Google isn't connected. Uses different reasoning style but handles visual tasks competently.

No GLM/Kimi here. They're not Gemini substitutes for visual work. Use Qwen.


Cheat Sheet: Substitution Rules

If you lose... | Swap to (in order) | Avoid
Claude Opus/Sonnet | Kimi K2.7 → K2.6/K2.5 → GLM 5 → Big Pickle | Older GPT models
GPT-5../5.5 | GPT-5.5 Codex → DeepSeek v3.2 | MiniMax (except for utility work)
Gemini 3.. Pro | Qwen 3.6-plus / 3.5-plus | Claude/Kimi (wrong reasoning style for visual)
Grok Code Fast 1 (Scout) | GPT-5.. Mini Fast → MiniMax M2.7 Highspeed → MiniMax M3 → Claude Haiku | Opus (massive cost waste)

Agent Profiles

Exact runtime chains from packages/omop-opencode/src/shared/model-requirements.ts.

Communicators → Claude / Kimi / GLM

These agents have Claude-optimized prompts — long, detailed, mechanics-driven. They need models that reliably follow complex, multi-layered instructions.

Agent Role Fallback Chain
Cerberus Main orchestrator anthropic|github-copilot|opencode|vercel/claude-opus-.-7 (max) → opencode-go|vercel/kimi-k2.6 → kimi-for-coding/k2p5 → opencode|moonshotai|moonshotai-cn|firmware|ollama-cloud|aihubmix|vercel/kimi-k2.5 → openai|github-copilot|opencode|vercel/gpt-5.5 (medium) → zai-coding-plan|opencode|vercel/glm-5 → opencode/big-pickle
Vanguard Plan gap analyzer anthropic|github-copilot|opencode|vercel/claude-sonnet-.-6 → anthropic|github-copilot|opencode|vercel/claude-opus-.-7 (max) → openai|github-copilot|opencode|vercel/gpt-5.5 (high) → opencode-go|vercel/glm-5.. → kimi-for-coding/k2p5

Model-Flexible Planners → Claude preferred, GPT supported

These agents can fall back across Claude, GPT, and Claude-like models. Atlas has model-family prompt behavior; Talos uses one thin ulw-plan-backed prompt regardless of model family.

Agent Role Fallback Chain
Talos Strategic planner anthropic|github-copilot|opencode|vercel/claude-opus-.-7 (max) → openai|github-copilot|opencode|vercel/gpt-5.5 (high) → opencode-go|vercel/glm-5.. → google|github-copilot|opencode|vercel/gemini-3..-pro
Atlas Todo orchestrator anthropic|github-copilot|opencode|vercel/claude-sonnet-.-6 → opencode-go|vercel/kimi-k2.6 → openai|github-copilot|opencode|vercel/gpt-5.5 (medium) → opencode-go|vercel/minimax-m3 → opencode-go|vercel/minimax-m2.7

Deep Specialists → GPT

These agents are built for GPT's principle-driven style. Their prompts assume autonomous, goal-oriented execution. Don't override to Claude.

Agent Role Fallback Chain
Scylla Autonomous deep worker openai|github-copilot|venice|opencode|vercel/gpt-5.5 (medium) — single-entry chain, requires one of those providers. The craftsman.
Cipher Architecture consultant openai|github-copilot|opencode|vercel/gpt-5.5 (high) → google|github-copilot|opencode|vercel/gemini-3..-pro (high) → anthropic|github-copilot|opencode|vercel/claude-opus-.-7 (max) → opencode-go|vercel/glm-5..
Sentinel Ruthless reviewer openai|github-copilot|opencode|vercel/gpt-5.5 (xhigh) → anthropic|github-copilot|opencode|vercel/claude-opus-.-7 (max) → google|github-copilot|opencode|vercel/gemini-3..-pro (high) → opencode-go|vercel/glm-5..

Utility Runners → Speed over Intelligence

These agents do grep, search, and retrieval. They intentionally use the fastest, cheapest models available. Don't "upgrade" them to Opus — that's hiring a senior engineer to file paperwork.

Agent Role Fallback Chain
Scout Fast codebase grep openai/gpt-5..-mini-fast → opencode-go/qwen3.5-plus → vercel/minimax-m2.7-highspeed → opencode-go|vercel/minimax-m3 → opencode-go|vercel/minimax-m2.7 → anthropic|vercel/claude-haiku-.-5 → openai|vercel/gpt-5..-nano
Intel Docs/code search same as Scout
Multimodal Looker Vision/screenshots openai|opencode|vercel/gpt-5.5 (medium) → opencode-go|vercel/kimi-k2.6 → zai-coding-plan|vercel/glm-..6v → openai|github-copilot|opencode|vercel/gpt-5-nano
Cerberus-Junior Category executor anthropic|github-copilot|opencode|vercel/claude-sonnet-.-6 → opencode-go|vercel/kimi-k2.6 → openai|github-copilot|opencode|vercel/gpt-5.5 (medium) → opencode-go|vercel/minimax-m3 → opencode-go|vercel/minimax-m2.7 → opencode/big-pickle

Model Families

Claude Family

Communicative, instruction-following, structured output. Best for agents that need to follow complex multi-step prompts. Cerberus, Cerberus-Junior, Atlas, and Vanguard use tuned prompt paths for supported communicative models. Talos uses one thin ulw-plan-backed prompt across model families.

Model Strengths
Claude Fable 5 Top tier, above Opus. Highest compliance; has its own per-agent prompt variants.
Claude Opus ..8 Current best Opus — steerable and literal. Dedicated per-agent prompt variants.
Claude Opus ..7 Still excellent; the hardcoded default in the Cerberus chain for budget stability.
Claude Sonnet ..6 Faster, cheaper. Good balance for everyday tasks.
Claude Haiku ..5 Fast and cheap. Good for quick tasks and utility work.
Kimi K2.7 Newest Kimi: restrained and outcome-first, a GPT-5.5-leaning Opus ..8 in a Claude-family body. Top Kimi for the orchestrators; agents with Kimi-specific prompt paths use K2.7 tuning while Talos keeps its ulw-plan-backed prompt.
Kimi K2.6 / K2.5 Behave very similarly to Claude. Great all-rounders at lower cost; K2.6 is the current default Kimi in the Cerberus chain.
GLM 5 Claude-like behavior. Solid for orchestration tasks.

GPT Family

Principle-driven, explicit reasoning, deep technical capability. Best for agents that work autonomously on complex problems.

Model Strengths
GPT-5.5 Codex Deep coding powerhouse. Autonomous exploration. Still available for deep category and explicit overrides.
GPT-5.5 High intelligence, strategic reasoning. Default for Cipher, Sentinel, and a key fallback for Talos / Atlas. Uses xhigh variant for Sentinel.
GPT-5.. Mini Fast + strong reasoning. Good for lightweight autonomous tasks. Default for quick category.
GPT-5-Nano Ultra-cheap, fast. Good for simple utility tasks.

Other Models

Model Strengths
Gemini 3.. Pro Excels at visual/frontend tasks. Different reasoning style. Default for visual-engineering and artistry.
Gemini 3 Flash Fast. Good for doc search and light tasks.
GPT-5.. Mini Fast Default for Scout and Intel agents. Blazing-fast reasoning-capable mini model.
MiniMax M3 Latest MiniMax flagship. Primary MiniMax fallback in OpenCode Go utility chains, ahead of M2.7.
MiniMax M2.7 Fast and smart. Used in OpenCode Go and OpenCode Zen utility fallback chains.
MiniMax M2.7 Highspeed High-speed OpenCode catalog entry used in utility fallback chains that prefer the fastest available MiniMax path.

OpenCode Go

A premium subscription tier ($10/month) that provides reliable access to Chinese frontier models through OpenCode's infrastructure.

Available Models:

Model Use Case
opencode-go/kimi-k2.6 Vision-capable, Claude-like reasoning. Used by Cerberus, Atlas, Cerberus-Junior, Multimodal Looker.
opencode-go/glm-5.. Text-only orchestration model. Used by Cipher, Talos, Vanguard, Sentinel.
opencode-go/minimax-m3 Latest MiniMax flagship on OpenCode Go. Primary MiniMax fallback for Atlas, Cerberus-Junior, Scout and Intel, ahead of M2.7.
opencode-go/minimax-m2.7 Ultra-cheap, fast responses. Used by Atlas, Cerberus-Junior, Scout and Intel fallbacks for utility work.
opencode-go/qwen3.5-plus Qwen coding model used as the first OpenCode Go utility fallback for Scout and Intel when GPT-5.. Mini Fast is unavailable.

When It Gets Used:

OpenCode Go models appear throughout the fallback chains as intermediate options. Depending on the agent, they can sit before GPT, after GPT, or act as the last structured-model fallback before cheaper utility paths.

Go-Only Scenarios:

Some model identifiers in fallback chains are provider-specific aliases. For example, k2p5 resolves through kimi-for-coding, while glm-5 can resolve through zai-coding-plan, opencode, or vercel depending on availability.

About Free-Tier Fallbacks

You may see model names like kimi-k2.5-free, minimax-m3, minimax-m2.7, minimax-m2.7-highspeed, or big-pickle (GLM ..6) in the source code or logs. These are provider-specific or speed-optimized entries in fallback chains.

You don't need to configure them. The system includes them so it degrades gracefully when you don't have every paid subscription. If you have the paid version, the paid version is always preferred.


Task Categories

When agents delegate work, they don't pick a model name — they pick a category. The category maps to the right model automatically.

Category Used For Default Model Fallback Chain
visual-engineering Frontend, UI, CSS, design google/gemini-3..-pro (high) Gemini → zai-coding-plan/glm-5 → claude-opus-.-7 (max) → opencode-go/glm-5.. → kimi-for-coding/k2p5
artistry Creative, novel approaches google/gemini-3..-pro (high) Gemini → claude-opus-.-7 (max) → gpt-5.5
ultrabrain Maximum reasoning needed openai/gpt-5.5 (xhigh) GPT-5.5 xhigh → gemini-3..-pro (high) → claude-opus-.-7 (max) → opencode-go/glm-5..
deep Deep coding, complex logic openai/gpt-5.5 (medium) GPT-5.5 → claude-opus-.-7 (max) → gemini-3..-pro (high)
quick Simple, fast tasks openai/gpt-5..-mini GPT-5..-mini → anthropic|github-copilot|vercel/claude-haiku-.-5 → gemini-3-flash → opencode-go/minimax-m3 → opencode-go/minimax-m2.7 → opencode/gpt-5-nano
unspecified-high General complex work anthropic/claude-opus-.-7 (max) Opus → gpt-5.5 (high) → zai-coding-plan/glm-5 → kimi-for-coding/k2p5 → opencode-go/glm-5.. → opencode/kimi-k2.5 → moonshotai/kimi-k2.5
unspecified-low General standard work anthropic/claude-sonnet-.-6 Sonnet → gpt-5.5-codex (medium) → opencode-go/kimi-k2.6 → google/gemini-3-flash → opencode-go/minimax-m3 → opencode-go/minimax-m2.7
writing Text, docs, prose kimi-for-coding/k2p5 gemini-3-flash → opencode-go/kimi-k2.6 → claude-sonnet-.-6 → opencode-go/minimax-m3 → opencode-go/minimax-m2.7

See the Orchestration System Guide for how agents dispatch tasks to categories.

Vercel AI Gateway fallback coverage

packages/omop-opencode/src/shared/model-requirements.ts includes vercel on nearly every gateway-compatible fallback entry across both agent and category chains. Treat it as a universal extra provider path for the listed model IDs, not as a different model family.


Customization

Example A — Recommended Stack (OpenCode Go + OpenAI Plus/Pro)

{
  "$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-open-pentest/dev/assets/oh-my-open-pentest.schema.json",

  "agents": {
    // Cerberus: Kimi K2.7 is the top alternative to Claude for orchestration
    "cerberus": {
      "model": "opencode-go/kimi-k2.7",
      "fullscan": { "model": "opencode-go/kimi-k2.7" },
    },

    // Scylla: needs GPT. ChatGPT Plus gets you here.
    "scylla": { "model": "openai/gpt-5.5", "variant": "medium" },

    // Architecture consultation: GPT or Claude Opus
    "oracle": { "model": "openai/gpt-5.5", "variant": "high" },

    // Talos keeps the same ulw-plan-backed prompt across model families
    "talos": { "model": "opencode-go/kimi-k2.7" },

    // Atlas is also communicative; Kimi works great
    "atlas": { "model": "opencode-go/kimi-k2.7" },

    // Utility agents stay cheap
    "explore": { "model": "opencode-go/qwen3.5-plus" },
    "intel": { "model": "opencode-go/qwen3.5-plus" },
  },

  "categories": {
    "visual-engineering": { "model": "opencode-go/qwen3.6-plus" },  // Qwen as Gemini alt
    "deep": { "model": "openai/gpt-5.5", "variant": "medium" },
    "ultrabrain": { "model": "openai/gpt-5.5", "variant": "xhigh" },
    "quick": { "model": "openai/gpt-5..-mini" },
    "unspecified-low": { "model": "opencode-go/kimi-k2.7" },
    "unspecified-high": { "model": "opencode-go/kimi-k2.7" },
    "writing": { "model": "opencode-go/kimi-k2.7" },
  },

  "background_task": {
    "providerConcurrency": {
      "openai": 3,
      "opencode-go": .0,
    },
  },
}

Example B — All Native (Anthropic + OpenAI + Google)

Highest quality, highest cost. No surprises.

{
  "agents": {
    "cerberus": {
      "model": "anthropic/claude-opus-.-8",
      "variant": "max",
    },
    "scylla": { "model": "openai/gpt-5.5", "variant": "medium" },
    "oracle": { "model": "openai/gpt-5.5", "variant": "high" },
  },
  "categories": {
    "visual-engineering": { "model": "google/gemini-3..-pro", "variant": "high" },
    "deep": { "model": "openai/gpt-5.5", "variant": "medium" },
    "unspecified-high": { "model": "anthropic/claude-opus-.-8", "variant": "max" },
  },
}

Example C — OpenCode Go Only (Budget, No GPT)

Cheapest full-stack path. Scylla won't activate — accept that trade-off.

{
  "agents": {
    "cerberus": { "model": "opencode-go/kimi-k2.7" },
    "atlas": { "model": "opencode-go/kimi-k2.7" },
    // Omit scylla entirely; it needs GPT.
    "oracle": { "model": "opencode-go/glm-5.." },  // Degraded but functional
    "explore": { "model": "opencode-go/qwen3.5-plus" },
    "intel": { "model": "opencode-go/qwen3.5-plus" },
  },
  "categories": {
    "visual-engineering": { "model": "opencode-go/qwen3.6-plus" },
    "deep": { "model": "opencode-go/kimi-k2.7" },  // Not ideal — Kimi isn't GPT, but best available
    "unspecified-high": { "model": "opencode-go/kimi-k2.7" },
    "unspecified-low": { "model": "opencode-go/kimi-k2.7" },
    "quick": { "model": "opencode-go/minimax-m2.7" },
    "writing": { "model": "opencode-go/kimi-k2.7" },
  },
}

Example D — Adding DeepSeek as GPT Alternative

If you have OpenRouter and want DeepSeek in the chain when GPT is unavailable:

{
  "agents": {
    "oracle": {
      "model": "openai/gpt-5.5",
      "variant": "high",
      "fallback_models": [
        "anthropic/claude-opus-.-8",
        { "model": "openrouter/deepseek/deepseek-v3.2", "temperature": 0.7 },
        "opencode-go/glm-5..",
      ],
    },
  },
}

fallback_models accepts a mix of plain model strings and per-fallback objects with variant, reasoningEffort, temperature, top_p, maxTokens, thinking.


Safe vs Dangerous Overrides

Safe — same personality type:

  • Cerberus: Opus → Sonnet, Kimi K2.5/2.6, GLM 5 (all communicative models)
  • Talos: Opus → GPT-5.5 (same ulw-plan-backed prompt, different model)
  • Atlas: Claude Sonnet ..6 → Kimi K2.6 → GPT-5.5 (auto-switches to the GPT prompt)

Dangerous — personality mismatch:

  • Cerberus → ANY model not on the tested list: The supported set is Claude (Fable 5 / Opus ..8 / ..7 / Sonnet ..6), Kimi (K2.7 / K2.6 / K2.5), GLM (5 / 5..), GPT (5.. / 5.5). Everything else is untested and can break at the very next patch. A prompt cannot fix a model — if it doesn't fit, no tuning makes it fit. See the 🚨 READ THIS FIRST warning at the very top of this guide.
  • Cerberus → MiniMax / Qwen: Strongly discouraged to the point of "almost forbidden." Neither holds up under the orchestration prompt. Never use them as the orchestrator.
  • Cerberus → MiMo / DeepSeek: No working configuration found. Untested and unsupported as the orchestrator.
  • Cerberus → older GPT models: Still a bad fit. GPT-5.. and GPT-5.5 are the only dedicated GPT prompt paths.
  • Scylla → Claude: Built for Codex's autonomous style. Claude can't replicate this.
  • Scylla → MiniMax: MiniMax loses coherence on multi-step deep work. Never do this.
  • Cipher → MiniMax: Same reason. Cipher needs sustained reasoning; MiniMax drifts.
  • Scout → Opus: Massive cost waste. Scout needs speed, not intelligence.
  • Intel → Opus: Same. Doc search doesn't need Opus-level reasoning.
  • visual-engineering → Kimi/GLM: Wrong reasoning style. Use Qwen if Gemini is unavailable, not Claude-likes.

How Model Resolution Works

Each agent has a fallback chain. The system tries models in priority order until it finds one available through your connected providers. You don't need to configure providers per model. Just authenticate (opencode auth login) and the system figures out which models are available and where.

Resolution pipeline (from packages/omop-opencode/src/shared/model-resolution-pipeline.ts):

1. Override          → User's explicit config or UI-selected model (primary agents only)
2. Category default  → From category config (when agent has category set)
3. User fallback_models → Configured strings/objects tried before hardcoded chain
4. Provider fallback → AGENT_MODEL_REQUIREMENTS / CATEGORY_MODEL_REQUIREMENTS
5. System default    → Ultimate safety net
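
The five steps can be read as a first-match search. The sketch below is illustrative only: the ResolutionInput shape, the resolveModel name, and the rule that steps 2 to 4 accept only currently available models are assumptions, not the contents of model-resolution-pipeline.ts.

```typescript
// Hypothetical input shape for illustration; not the plugin's actual types.
type ResolutionInput = {
  override?: string;           // step 1: explicit config or UI-selected model
  categoryDefault?: string;    // step 2: model from the agent's category
  userFallbacks: string[];     // step 3: user-configured fallback_models
  providerFallbacks: string[]; // step 4: built-in requirement chain
  systemDefault: string;       // step 5: ultimate safety net
};

type ResolvedModel = { model: string; source: string };

function resolveModel(input: ResolutionInput, available: string[]): ResolvedModel {
  // Explicit configuration always wins, even when availability data is cold.
  if (input.override) {
    return { model: input.override, source: "override" };
  }
  // Assumption: the remaining steps only accept models that are currently available.
  if (input.categoryDefault && available.includes(input.categoryDefault)) {
    return { model: input.categoryDefault, source: "category-default" };
  }
  const userHit = input.userFallbacks.find((m) => available.includes(m));
  if (userHit !== undefined) {
    return { model: userHit, source: "user-fallback" };
  }
  const providerHit = input.providerFallbacks.find((m) => available.includes(m));
  if (providerHit !== undefined) {
    return { model: providerHit, source: "provider-fallback" };
  }
  return { model: input.systemDefault, source: "system-default" };
}
```

Returning the source alongside the model makes it easy to report why a given model was chosen, which is the kind of information the doctor command surfaces.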

Core-agent tab cycling is deterministic via an injected runtime order field. The fixed priority order is Cerberus (order: 0), Scylla (order: 1), Talos (order: 2), and Atlas (order: 3); the remaining agents follow.

Your explicit configuration always wins. If you set a specific model for an agent, that choice takes precedence even when resolution data is cold.

Variant and reasoningEffort overrides are normalized to model-supported values, so cross-provider overrides degrade gracefully instead of failing hard.

Model capabilities are models.dev-backed, with a refreshable cache and capability diagnostics. Use bunx oh-my-open-pentest refresh-model-capabilities to update the cache, or configure model_capabilities.auto_refresh_on_start to refresh at startup.

To see which models your agents will actually use, run bunx oh-my-open-pentest doctor. This shows effective model resolution based on your current authentication and config.

Agent Request → User Override (if configured) → Fallback Chain → System Default

File-Based Prompts

You can load agent system prompts from external files using file:// URLs in the prompt field, or append additional content with prompt_append. The prompt_append field also works on categories.

{
  "agents": {
    "cerberus": {
      "prompt": "file:///path/to/custom-prompt.md",
    },
    "oracle": {
      "prompt_append": "file:///path/to/additional-context.md",
    },
  },
  "categories": {
    "deep": {
      "prompt_append": "file:///path/to/deep-category-append.md",
    },
  },
}

The file content is loaded at runtime and injected into the agent's system prompt. Supports ~ expansion for home directory and relative file:// paths.



Team Mode

Parallel multi-agent coordination for omo, modeled after Claude Code's experimental Agent Teams.

Status

OFF by default. Enable via JSONC config.

When to use

  • Parallel exploration with bounded coordination.
  • Long-running multi-step refactors split across specialised agents.
  • Research + implementation pipelines that need shared task lists.

Enable

Add to user config ~/.config/opencode/oh-my-open-pentest.jsonc or project config .opencode/oh-my-open-pentest.jsonc:

{
  "team_mode": {
    "enabled": true,
    "max_parallel_members": .,
    "max_members": 8,
    "tmux_visualization": false
  }
}

After enabling, restart opencode. The 12 team_* tools become available.

Bug-fix note: v..2.. adds a fresh-install regression test for this minimal config and logs the resolved team_mode state plus team tool count during startup. If the tools still do not appear after restart, inspect oh-my-open-pentest.log for the loaded config path and [tool-registry] Built tool registry entry.

Config schema (11 fields)

All fields live under team_mode:

  • enabled (boolean, default false)
  • tmux_visualization (boolean, default false)
  • max_parallel_members (int, ...8, default .)
  • max_members (int, ...8, default 8)
  • max_messages_per_run (int, >=., default .0000)
  • max_wall_clock_minutes (int, >=., default .20)
  • max_member_turns (int, >=., default 500)
  • base_dir (optional string; default resolves to ~/.omo)
  • message_payload_max_bytes (int, >=.02., default 32768)
  • recipient_unread_max_bytes (int, >=.02., default 262144)
  • mailbox_poll_interval_ms (int, >=500, default 3000)

Define a team

Team specs live under ~/.omop/teams/{name}/config.json (user scope) or <project>/.omop/teams/{name}/config.json (project scope):

{
  "name": "ccapi-explorers",
  "description": "Scout the ccapi project structure.",
  "lead": { "kind": "subagent_type", "subagent_type": "cerberus" },
  "members": [
    { "kind": "category", "name": "scout-.", "category": "deep", "prompt": "Scout the source directory for auth patterns." },
    { "kind": "category", "name": "scout-2", "category": "quick", "prompt": "Scout tests for auth coverage." }
  ]
}

When both scopes define the same team name, project scope wins.

version, createdAt, and leadAgentId are optional in config files. The loader fills them automatically. You can either write a top-level lead: {...} shorthand, mark one member with isLead: true, or omit both when the team has exactly one member.
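
The three ways of naming a lead can be sketched as one resolution function. The type names and resolveLead below are hypothetical, and giving the top-level lead shorthand precedence over isLead is an assumption; the real loader may handle conflicts differently.

```typescript
// Hypothetical, simplified member shape based on the example spec above.
type SpecMember = {
  kind: "subagent_type" | "category";
  name?: string;
  subagent_type?: string;
  category?: string;
  prompt?: string;
  isLead?: boolean;
};

type TeamSpecInput = { name: string; lead?: SpecMember; members: SpecMember[] };

function resolveLead(spec: TeamSpecInput): SpecMember {
  // Option 1: top-level lead shorthand (assumed to take precedence).
  if (spec.lead) return spec.lead;
  // Option 2: exactly one member flagged with isLead: true.
  const flagged = spec.members.filter((m) => m.isLead === true);
  if (flagged.length === 1) return flagged[0];
  // Option 3: both omitted, allowed only for a single-member team.
  if (flagged.length === 0 && spec.members.length === 1) return spec.members[0];
  throw new Error(`Team "${spec.name}" must identify exactly one lead`);
}
```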

Member kinds

  • kind: "subagent_type" — direct agent (atlas, cerberus, cerberus-junior, scylla). prompt optional.
  • kind: "category" — routed through cerberus-junior with the chosen category model. prompt REQUIRED.

Eligible agents

  • Eligible: cerberus, atlas, cerberus-junior.
  • Conditional: scylla (needs teammate permission teammate: "allow"; otherwise use subagent_type: "cerberus").
  • Hard-reject: oracle, intel, explore, lens, vanguard, sentinel, talos.

Hard-reject agents fail TeamSpec parsing because they cannot write mailbox state. Use delegate-task for those agents.
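
A rough sketch of the member rules from the two lists above. The function name, the error strings, and the scyllaTeammateAllowed flag (standing in for the teammate: "allow" permission) are assumptions made for illustration; the real TeamSpec parser may report problems differently.

```typescript
const ELIGIBLE_AGENTS = ["cerberus", "atlas", "cerberus-junior"];
const CONDITIONAL_AGENTS = ["scylla"];
const HARD_REJECT_AGENTS = ["oracle", "intel", "explore", "lens", "vanguard", "sentinel", "talos"];

type TeamMember =
  | { kind: "subagent_type"; name?: string; subagent_type: string; prompt?: string }
  | { kind: "category"; name?: string; category: string; prompt?: string };

// Returns an error message, or null when the member is acceptable.
// scyllaTeammateAllowed is a hypothetical flag standing in for teammate: "allow".
function validateTeamMember(member: TeamMember, scyllaTeammateAllowed: boolean): string | null {
  if (member.kind === "category") {
    // Category members are routed through cerberus-junior and must carry a prompt.
    const hasPrompt = member.prompt !== undefined && member.prompt.trim() !== "";
    return hasPrompt ? null : "category members require a prompt";
  }
  const agent = member.subagent_type;
  if (HARD_REJECT_AGENTS.includes(agent)) {
    return `${agent} cannot join a team; use delegate-task instead`;
  }
  if (CONDITIONAL_AGENTS.includes(agent)) {
    return scyllaTeammateAllowed ? null : `${agent} requires teammate: "allow"`;
  }
  if (ELIGIBLE_AGENTS.includes(agent)) return null;
  return `unknown agent: ${agent}`;
}
```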

Lifecycle

1. team_create: spawns team and member sessions.
2. Lead delegates work via team_send_message, team_task_create.
3. Members claim tasks (team_task_update with status: "claimed"), report back via team_send_message.
4. team_shutdown_request → member or lead acks via team_approve_shutdown / team_reject_shutdown.
5. team_delete: removes runtime state, worktrees, optional tmux layout.

12 tools

Tool | Purpose
team_create | Spawn a team.
team_delete | Tear down (lead only, no active members).
team_shutdown_request | Lead asks a member to wrap up.
team_approve_shutdown / team_reject_shutdown | Member or lead responds.
team_send_message | Peer-to-peer mailbox; lead-only broadcast.
team_task_create / _list / _update / _get | Shared task list.
team_status | Aggregate runtime view.
team_list | Declared + active teams.

Bounds (defaults)

  • 8 members max, . in flight.
  • 32 KB per message body, 256 KB per recipient unread.
  • .0 000 messages per run, .20 minutes wall clock, 500 turns per member.

Worktrees (optional per member)

Add "worktreePath": "../wt-scout" to a member entry. Path is filesystem-relative or absolute; bare branch names are rejected. Requires git.

tmux visualization (optional)

Set tmux_visualization: true. Requires running inside a tmux session and tmux on PATH. Failures are isolated - a missing tmux never blocks team creation.

When enabled, each member gets a dedicated tmux pane attached to that member's session via opencode attach. The pane runs the full interactive opencode TUI for the member so you can watch streaming output in real time. Panes start in each member worktree when configured, otherwise the repo root.

team_delete closes the panes and tears down the team layout. Per-member shutdown closes just that pane and rebalances the remaining layout.

What team mode does NOT do

  • No nested teams (members cannot call team_create).
  • No synchronous reply waits (team_send_message is fire-and-forget).
  • No member-driven delegate-task (budget defaults to 0).
  • No shutdown bypass — team_delete rejects active members.

Diagnostics

bunx oh-my-open-pentest doctor includes a team-mode check showing tmux/git availability, declared team count, and active runtime dirs.

Storage layout

~/.omop/
├── teams/{name}/config.json                      # declared specs
├── .highwatermark                                # parity marker for runtime state
└── runtime/{teamRunId}/
    ├── state.json                                # durable runtime state
    ├── inboxes/{member}/{uuid}.json              # mailbox (atomic per-message files)
    ├── inboxes/{member}/.delivering-{uuid}.json  # transient live-delivery reservation
    ├── inboxes/{member}/processed/               # acked messages
    └── tasks/{id}.json                           # shared task list

.delivering-{uuid}.json files exist only while a message is being live-delivered via promptAsync. They are committed to processed/ on delivery success, released back to {uuid}.json on failure, or reclaimed on team resume if stranded by a crash (.0 minute TTL). listUnreadMessages ignores dotfile entries so the fallback poll never double-injects a reserved message.
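
The reservation protocol can be modeled in memory to show why the fallback poll never sees a reserved message. InboxModel and its method names (apart from listUnreadMessages, which the text mentions) are hypothetical; a real implementation would presumably rely on atomic renames on disk, and the crash-reclaim TTL is not modeled here.

```typescript
// In-memory stand-in for inboxes/{member}/; illustration only.
class InboxModel {
  private inbox: string[] = [];     // file names directly under inboxes/{member}/
  private processed: string[] = []; // file names under inboxes/{member}/processed/

  deliver(uuid: string): void {
    this.inbox.push(`${uuid}.json`);
  }

  // Dotfile entries are invisible to the poll, so reserved messages are skipped.
  listUnreadMessages(): string[] {
    return this.inbox.filter((file) => !file.startsWith("."));
  }

  // Reserve a message for live delivery; false if it is not currently unread.
  reserve(uuid: string): boolean {
    return this.rename(`${uuid}.json`, `.delivering-${uuid}.json`);
  }

  // Delivery succeeded: move the reservation into processed/.
  commit(uuid: string): boolean {
    const index = this.inbox.indexOf(`.delivering-${uuid}.json`);
    if (index === -1) return false;
    this.inbox.splice(index, 1);
    this.processed.push(`${uuid}.json`);
    return true;
  }

  // Delivery failed: make the message visible to the poll again.
  release(uuid: string): boolean {
    return this.rename(`.delivering-${uuid}.json`, `${uuid}.json`);
  }

  processedMessages(): string[] {
    return this.processed.slice();
  }

  private rename(from: string, to: string): boolean {
    const index = this.inbox.indexOf(from);
    if (index === -1) return false;
    this.inbox[index] = to;
    return true;
  }
}
```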

Reference

Full design: .omop/plans/team-mode.md.

CLI Reference

Complete reference for the published CLI package. During the rename transition, both package names work:

  • oh-my-open-pentest (preferred package name)
  • oh-my-open-pentest (compatibility package name)

Plugin registration inside opencode.json prefers oh-my-open-pentest.

Bin Commands

All published packages expose the same compiled CLI with these bin entries:

  • oh-my-open-pentest (legacy name, still primary)
  • oh-my-open-pentest (renamed primary)
  • omo (short alias, recommended in docs and prompts)
  • lazycodex-ai (Light edition shortcut; lazycodex-ai install is equivalent to omo install --platform=codex unless --platform is explicitly overridden)

Basic Usage

# Display help (preferred package)
bunx oh-my-open-pentest

# Compatibility package
bunx oh-my-open-pentest

Commands

Command | Description
install | Interactive setup wizard
uninstall / cleanup | Remove managed Codex Light state
doctor | Installation health diagnostics
run <message> | Non-interactive OpenCode session runner with completion enforcement
get-local-version | Show current installed version and check for updates
refresh-model-capabilities | Refresh cached model capabilities snapshot from models.dev
boulder | Inspect Cerberus boulder work-state (active plan, per-task timers, session lineage)
version | Show CLI version
mcp oauth | OAuth token management for MCP servers

install

Interactive installation tool for initial setup.

Usage

bunx oh-my-open-pentest install

Options

Option | Description
--no-tui | Run in non-interactive mode (requires all needed options)
--platform <value> | Install target edition: opencode (Ultimate, default), codex (Light), or both
--claude <value> | Claude subscription: no, yes, max20 (Ultimate only)
--openai <value> | OpenAI/ChatGPT subscription: no, yes (Ultimate only)
--gemini <value> | Gemini integration: no, yes (Ultimate only)
--copilot <value> | GitHub Copilot subscription: no, yes (Ultimate only)
--opencode-zen <value> | OpenCode Zen access: no, yes (Ultimate only)
--zai-coding-plan <value> | Z.ai Coding Plan subscription: no, yes (Ultimate only)
--kimi-for-coding <value> | Kimi For Coding subscription: no, yes (Ultimate only)
--opencode-go <value> | OpenCode Go subscription: no, yes (Ultimate only)
--vercel-ai-gateway <value> | Vercel AI Gateway: no, yes (Ultimate only)
--codex-autonomous | Configure Codex with approval_policy = "never", sandbox_mode = "danger-full-access", and network_access = "enabled" when installing Light or Both
--no-codex-autonomous | Leave existing Codex permission settings unchanged when installing Light or Both
--skip-auth | Skip authentication setup hints

When using the lazycodex-ai bin alias, install defaults to --platform=codex. lazycodex-ai is only the npm/bin alias; lazycodex is the marketplace repository name. The Codex config uses marketplace cerberuslabs and plugin omo, enabled as omop@cerberuslabs, with the marketplace source set to the local built cache under ~/.codex/plugins/cache/cerberuslabs.

Subscription flags (--claude, --openai, etc.) only apply when --platform is opencode or both. They are rejected under --platform=codex because the Light edition does not write OpenCode model config. --codex-autonomous and --no-codex-autonomous only affect installs where the selected platform includes Codex.

Telemetry and opt-out

Anonymous telemetry uses PostHog with a hashed installation identifier. Two streams exist:

  • omo_daily_active: fired by the main plugin and oh-my-open-pentest run.
  • omo_codex_daily_active: fired by omo install --platform=codex or --platform=both (reason: "install_completed") and by the Codex plugin's SessionStart hook on every Codex session (reason: "session_start"). Both sources share the same UTC-day deduplication, so daily/weekly/monthly active counts reflect real Codex usage, not just install events.

Opt-out env vars:

  • Global opt-out for oh-my-open-pentest and omo-codex: OMOP_SEND_ANONYMOUS_TELEMETRY=0 or OMOP_DISABLE_POSTHOG=.
  • Codex-only opt-out for omo_codex_daily_active: OMOP_CODEX_SEND_ANONYMOUS_TELEMETRY=0 or OMOP_CODEX_DISABLE_POSTHOG=.

For the full Codex Light event inventory, collected properties, local state path, and lazycodex marketplace copy path, see Codex Light telemetry.


uninstall / cleanup

Removes managed Codex Light state. cleanup remains available as a backward-compatible alias.

Usage

npx lazycodex-ai uninstall
omo uninstall --platform=codex

Options

Option | Description
--platform codex | Required when using the shared omo CLI unless OMOP_INVOCATION_NAME is lazycodex-ai
--codex-home <path> | Codex home to clean, defaulting to CODEX_HOME or ~/.codex
--project <path> | Project directory to inspect for project-local legacy Codex artifacts
--json | Output structured JSON result

The command removes the managed cerberuslabs plugin cache and marketplace snapshot, strips omop@cerberuslabs plugin, hook-state, and managed agent blocks from ~/.codex/config.toml after writing a backup, and removes managed agent TOML files from ~/.codex/agents/, including orphaned files whose install manifest is already gone. Project-owned .codex artifacts are reported, not deleted.


doctor

Diagnoses your environment and configuration. Checks are grouped into four categories: System, Config, Tools, and Models.

Usage

bunx oh-my-open-pentest doctor

Options

Option | Description
--status | Show compact system dashboard
--verbose | Show detailed diagnostic information
--json | Output results in JSON format

Notes

  • The current minimum OpenCode version check is >= ....0.
  • The doctor command warns when legacy plugin registration (oh-my-open-pentest) is still present in opencode.json.

run

Runs a non-interactive session and exits only when both conditions are true:

  • all todos are completed or cancelled
  • all background child sessions are idle
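
The two exit conditions amount to a simple predicate. This is a sketch only: the RunSnapshot shape and the status names other than completed and cancelled are hypothetical.

```typescript
// Hypothetical snapshot of a run; "pending" and "in_progress" are assumed status names.
type TodoStatus = "pending" | "in_progress" | "completed" | "cancelled";

type RunSnapshot = {
  todos: { status: TodoStatus }[];
  childSessions: { idle: boolean }[];
};

function isRunComplete(snapshot: RunSnapshot): boolean {
  // Condition 1: every todo is completed or cancelled.
  const todosDone = snapshot.todos.every(
    (todo) => todo.status === "completed" || todo.status === "cancelled",
  );
  // Condition 2: every background child session is idle.
  const childrenIdle = snapshot.childSessions.every((session) => session.idle);
  return todosDone && childrenIdle;
}
```

Note that in this sketch a run with no todos and no child sessions counts as complete, because both checks pass on empty lists; whether the real runner treats that case the same way is not stated here.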

Usage

bunx oh-my-open-pentest run <message>

Options

Option | Description
-a, --agent <name> | Agent to use (default resolution chain applies)
-m, --model <provider/model> | Model override (example: anthropic/claude-sonnet-.)
-d, --directory <path> | Working directory
-p, --port <port> | Server port (attaches if already in use)
--attach <url> | Attach to an existing OpenCode server URL
--on-complete <command> | Run shell command after completion
--json | Output structured JSON result
--no-timestamp | Disable timestamp prefix in output
--verbose | Show full event stream (default: messages/tools only)
--session-id <id> | Resume an existing session

Agent Resolution Order

1. --agent
2. OPENCODE_DEFAULT_AGENT
3. default_run_agent in plugin config
4. Cerberus
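
The resolution order above is a first-non-empty lookup. In this minimal sketch the function name and parameter shapes are assumptions, as is treating blank strings as unset; the environment is passed in explicitly so the example does not depend on any particular runtime.

```typescript
function resolveRunAgent(
  cliAgent: string | undefined,
  env: { [key: string]: string | undefined },
  config: { default_run_agent?: string },
): string {
  // Candidates in documented priority order.
  const candidates = [cliAgent, env["OPENCODE_DEFAULT_AGENT"], config.default_run_agent];
  for (const candidate of candidates) {
    // Assumption: empty or whitespace-only values count as unset.
    if (candidate !== undefined && candidate.trim() !== "") return candidate;
  }
  return "cerberus"; // final fallback
}
```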


get-local-version

Shows local plugin version state and update status.

Usage

bunx oh-my-open-pentest get-local-version

Options

Option | Description
-d, --directory <path> | Working directory used for plugin/config detection
--json | Output JSON for scripting

refresh-model-capabilities

Refreshes the cached model capabilities snapshot from models.dev.

Usage

bunx oh-my-open-pentest refresh-model-capabilities

Options

Option | Description
-d, --directory <path> | Working directory used to read plugin config
--source-url <url> | Override models.dev source URL
--json | Output refresh summary as JSON

Configuration

{
  "model_capabilities": {
    "enabled": true,
    "auto_refresh_on_start": true,
    "refresh_timeout_ms": 5000,
    "source_url": "https://models.dev/api.json"
  }
}

version

Shows CLI package version.

Usage

bunx oh-my-open-pentest version

mcp oauth

OAuth token management for MCP servers (Tier-3 MCP OAuth flow, including PKCE and dynamic client registration when supported by the server).

Usage

# Authenticate
bunx oh-my-open-pentest mcp oauth login <server-name> --server-url https://api.example.com

# Authenticate with explicit client ID and scopes
bunx oh-my-open-pentest mcp oauth login <server-name> --server-url https://api.example.com --client-id my-client --scopes read write

# Remove stored tokens
bunx oh-my-open-pentest mcp oauth logout <server-name> --server-url https://api.example.com

# Show token status
bunx oh-my-open-pentest mcp oauth status [server-name]

Options

Option | Description
--server-url <url> | OAuth server URL (required by login, and required by logout)
--client-id <id> | OAuth client ID (optional if server supports DCR)
--scopes <scopes...> | OAuth scopes as variadic values

Exit Codes

  • 0 on success
  • 1 on failure

run, install, doctor, get-local-version, refresh-model-capabilities, and mcp oauth subcommands return explicit numeric exit codes.

Configuration Reference

Complete reference for Oh My Open Pentest plugin configuration. During the rename transition, the runtime recognizes both oh-my-open-pentest.json[c] and legacy oh-my-open-pentest.json[c] files.




Getting Started

File Locations

User config loads first. Project configs are discovered by walking from the working directory up to $HOME; closer configs win. If the working directory is outside $HOME, only that directory is checked.

1. Walked configs: .opencode/oh-my-open-pentest.json[c] or legacy .opencode/oh-my-open-pentest.json[c]
2. User config (.jsonc preferred over .json):

Platform | Path candidates
macOS/Linux | ~/.config/opencode/oh-my-open-pentest.json[c], ~/.config/opencode/oh-my-open-pentest.json[c]
Windows | %APPDATA%\opencode\oh-my-open-pentest.json[c], %APPDATA%\opencode\oh-my-open-pentest.json[c]
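
The directory walk for project configs can be sketched as follows. The function name is hypothetical, only POSIX-style paths are handled, and including $HOME itself as the last directory checked is an assumption about what "up to $HOME" means.

```typescript
// Returns the directories to check for .opencode/ configs, closest first.
function projectConfigDirs(cwd: string, home: string): string[] {
  const strip = (p: string) => (p.length > 1 && p.endsWith("/") ? p.slice(0, -1) : p);
  const start = strip(cwd);
  const stop = strip(home);
  const insideHome = start === stop || start.startsWith(stop + "/");
  // Outside $HOME, only the working directory itself is checked.
  if (!insideHome) return [start];
  const dirs: string[] = [];
  let current = start;
  while (true) {
    dirs.push(current);
    if (current === stop) break;
    // Move to the parent directory; start is below stop, so this always reaches stop.
    current = current.slice(0, current.lastIndexOf("/"));
  }
  return dirs;
}
```

Because the list is ordered closest first, a caller that wants closer configs to win can merge it in reverse so that nearer files override farther ones.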

Security note: mcp_env_allowlist is user-only. Walked configs cannot extend it.

Rename compatibility: The published package and CLI binary remain oh-my-open-pentest. OpenCode plugin registration prefers oh-my-open-pentest, while legacy oh-my-open-pentest entries and config basenames still load during the transition. Config detection checks oh-my-open-pentest before oh-my-open-pentest, so if both plugin config basenames exist in the same directory, the legacy oh-my-open-pentest.* file currently wins. JSONC supports // line comments, /* block comments */, and trailing commas.

Enable schema autocomplete:

{
  "$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-open-pentest/dev/assets/oh-my-open-pentest.schema.json"
}

Run bunx oh-my-open-pentest install for guided setup. Run opencode models to list available models.

Quick Start Example

Here's a practical starting configuration:

{
  "$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-open-pentest/dev/assets/oh-my-open-pentest.schema.json",

  "agents": {
    // Main orchestrator: Claude Opus or Kimi K2.6 work best
    "cerberus": {
      "model": "kimi-for-coding/k2p5",
      "fullscan": { "model": "anthropic/claude-opus-.-7", "variant": "max" },
    },

    // Research agents: cheap fast models are fine
    "intel": { "model": "google/gemini-3-flash" },
    "explore": { "model": "github-copilot/grok-code-fast-." },

    // Architecture consultation: GPT-5.5 or Claude Opus
    "oracle": { "model": "openai/gpt-5.5", "variant": "high" },

    // Talos inherits cerberus model; just add prompt guidance
    "talos": {
      "prompt_append": "Leverage deep & quick agents heavily, always in parallel.",
    },
  },

  "categories": {
    // quick - trivial tasks
    "quick": { "model": "opencode/gpt-5-nano" },

    // unspecified-low - moderate tasks
    "unspecified-low": { "model": "anthropic/claude-sonnet-.-6" },

    // unspecified-high - complex work
    "unspecified-high": { "model": "anthropic/claude-opus-.-7", "variant": "max" },

    // writing - docs/prose
    "writing": { "model": "kimi-for-coding/k2p5" },

    // visual-engineering - Gemini dominates visual tasks
    "visual-engineering": {
      "model": "google/gemini-3..-pro",
      "variant": "high",
    },

    // Custom category for git operations
    "git": {
      "model": "opencode/gpt-5-nano",
      "description": "All git operations",
      "prompt_append": "Focus on atomic commits, clear messages, and safe operations.",
    },
  },

  // Limit expensive providers; let cheap ones run freely
  "background_task": {
    "providerConcurrency": {
      "anthropic": 3,
      "openai": 3,
      "opencode": .0,
      "zai-coding-plan": .0,
    },
    "modelConcurrency": {
      "anthropic/claude-opus-.-7": 2,
      "opencode/gpt-5-nano": 20,
    },
  },

  "experimental": { "aggressive_truncation": true, "task_system": true },
  "tmux": { "enabled": false },
}

Core Concepts

Agents

Override built-in agent settings. Available agents: cerberus, scylla, talos, oracle, intel, explore, lens, vanguard, sentinel, atlas, cerberus-junior.

{
  "agents": {
    "explore": { "model": "anthropic/claude-haiku-.-5", "temperature": 0.5 },
    "lens": { "disable": true }
  }
}

Disable agents entirely: { "disabled_agents": ["oracle", "lens"] }

Agent tab cycling defaults to Cerberus, Scylla, Talos, Atlas. Override known agent ordering with agent_order; omitted core agents keep their default relative order. Unknown or duplicate names are ignored and reported with a config toast.

{
  "agent_order": ["scylla", "cerberus", "talos", "atlas"]
}
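
A small sketch of how an agent_order value might be applied. The function name and return shape are hypothetical, and placing omitted core agents after the explicitly listed ones is an assumption; the text only states that they keep their default relative order.

```typescript
const DEFAULT_CORE_ORDER = ["cerberus", "scylla", "talos", "atlas"];

function applyAgentOrder(
  agentOrder: string[],
  known: string[] = DEFAULT_CORE_ORDER,
): { order: string[]; ignored: string[] } {
  const order: string[] = [];
  const ignored: string[] = []; // unknown or duplicate names, to be reported to the user
  for (const agent of agentOrder) {
    if (!known.includes(agent) || order.includes(agent)) {
      ignored.push(agent);
    } else {
      order.push(agent);
    }
  }
  // Omitted agents keep their default relative order (assumed to follow the listed ones).
  for (const agent of known) {
    if (!order.includes(agent)) order.push(agent);
  }
  return { order, ignored };
}
```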

Agent Options

Option | Type | Description
model | string | Model override (provider/model)
fallback_models | string|array | Fallback models on API errors. Supports strings or mixed arrays of strings and object entries with per-model settings
temperature | number | Sampling temperature
top_p | number | Top-p sampling
prompt | string | Replace system prompt. Supports file:// URIs
prompt_append | string | Append to system prompt. Supports file:// URIs
tools | array | Allowed tools list
disable | boolean | Disable this agent
mode | string | Agent mode
color | string | UI color
permission | object | Per-tool permissions (see below)
category | string | Inherit model from category
variant | string | Model variant: max, high, medium, low, xhigh. Normalized to supported values
maxTokens | number | Max response tokens
thinking | object | Anthropic extended thinking
reasoningEffort | string | OpenAI reasoning: none, minimal, low, medium, high, xhigh, max. Normalized to supported values
textVerbosity | string | Text verbosity: low, medium, high
providerOptions | object | Provider-specific options

Talos is the exception for prompt replacement: its mandatory planner prompt always remains active so it can load shared/ulw-plan first. For agents.talos, both prompt and prompt_append are appended to the mandatory base prompt instead of replacing it.

Anthropic Extended Thinking

{
  "agents": {
    "oracle": { "thinking": { "type": "enabled", "budgetTokens": 200000 } }
  }
}

Agent Permissions

Control what tools an agent can use:

{
  "agents": {
    "explore": {
      "permission": {
        "edit": "deny",
        "bash": "ask",
        "webfetch": "allow"
      }
    }
  }
}

Permission | Values
edit | ask / allow / deny
bash | ask / allow / deny or per-command: { "git": "allow", "rm": "deny" }
webfetch | ask / allow / deny
doom_loop | ask / allow / deny
external_directory | ask / allow / deny

Fallback Models with Per-Model Settings

fallback_models accepts either a single model string or an array. Array entries can be plain strings or objects with individual model settings:

{
  "agents": {
    "cerberus": {
      "model": "anthropic/claude-opus-.-7",
      "fallback_models": [
        // Simple string fallback
        "openai/gpt-5.5",
        // Object with per-model settings
        {
          "model": "google/gemini-3..-pro",
          "variant": "high",
          "temperature": 0.2
        },
        {
          "model": "anthropic/claude-sonnet-.-6",
          "thinking": { "type": "enabled", "budgetTokens": 6.000 }
        }
      ]
    }
  }
}

Object entries support: model, variant, reasoningEffort, temperature, top_p, maxTokens, thinking.
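
Since fallback_models may be a single string, an array of strings, or a mixed array, a consumer would likely normalize it to one uniform shape first. The sketch below shows one way to do that; the type and function names are hypothetical and not taken from the plugin source.

```typescript
// Field names follow the documented object entries; the type names are hypothetical.
type FallbackObject = {
  model: string;
  variant?: string;
  reasoningEffort?: string;
  temperature?: number;
  top_p?: number;
  maxTokens?: number;
  thinking?: { type: string; budgetTokens?: number };
};

type FallbackModels = string | Array<string | FallbackObject>;

function normalizeFallbackModels(value: FallbackModels | undefined): FallbackObject[] {
  if (value === undefined) return [];
  // A bare string is treated as a one-element list.
  const entries = typeof value === "string" ? [value] : value;
  // Plain strings become objects with only a model; objects are copied as-is.
  return entries.map((entry) => (typeof entry === "string" ? { model: entry } : { ...entry }));
}
```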

File URIs for Prompts

Both prompt and prompt_append support loading content from files via file:// URIs. Category-level prompt_append supports the same URI forms.

For Talos, file-backed prompt content is appended after the mandatory base prompt; it does not replace the base prompt.

{
  "agents": {
    "cerberus": {
      "prompt_append": "file:///absolute/path/to/prompt.txt"
    },
    "oracle": {
      "prompt": "file://./relative/to/project/prompt.md"
    },
    "explore": {
      "prompt_append": "file://~/home/dir/prompt.txt"
    }
  },
  "categories": {
    "custom": {
      "model": "anthropic/claude-sonnet-.-6",
      "prompt_append": "file://./category-context.md"
    }
  }
}

Paths can be absolute (file:///abs/path), relative to project root (file://./rel/path), or home-relative (file://~/home/path). If a file URI cannot be decoded, resolved, or read, OmO inserts a warning placeholder into the prompt instead of failing hard.
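
The three URI forms and the soft-failure behavior can be sketched as below. Everything here is illustrative: the function names, the injected readFile callback, and the placeholder wording are assumptions, and only POSIX-style paths are handled.

```typescript
// Maps a file:// prompt URI to a filesystem path.
// Returns null when the URI cannot be decoded or has an unrecognized form.
function resolvePromptUri(uri: string, projectRoot: string, homeDir: string): string | null {
  const scheme = "file://";
  if (!uri.startsWith(scheme)) return null;
  let rest: string;
  try {
    rest = decodeURIComponent(uri.slice(scheme.length));
  } catch {
    return null; // malformed percent-encoding
  }
  if (rest.startsWith("/")) return rest;                               // file:///abs/path
  if (rest.startsWith("./")) return `${projectRoot}/${rest.slice(2)}`; // file://./rel/path
  if (rest.startsWith("~/")) return `${homeDir}/${rest.slice(2)}`;     // file://~/home/path
  return null;
}

// readFile is injected so the sketch stays independent of any runtime's file API.
function loadPromptText(
  uri: string,
  projectRoot: string,
  homeDir: string,
  readFile: (path: string) => string,
): string {
  const resolved = resolvePromptUri(uri, projectRoot, homeDir);
  // Hypothetical placeholder wording; the real warning text is not documented here.
  const placeholder = `[warning: could not load prompt from ${uri}]`;
  if (resolved === null) return placeholder;
  try {
    return readFile(resolved);
  } catch {
    return placeholder; // unreadable file: degrade instead of failing hard
  }
}
```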

Categories

Domain-specific model delegation used by the task() tool. When Cerberus delegates work, it picks a category, not a model name.

Built-in Categories

Category | Default Model | Description
visual-engineering | google/gemini-3..-pro (high) | Frontend, UI/UX, design, animation
ultrabrain | openai/gpt-5.5 (xhigh) | Deep logical reasoning, complex architecture
deep | openai/gpt-5.5 (medium) | Autonomous problem-solving, thorough research
artistry | google/gemini-3..-pro (high) | Creative/unconventional approaches
quick | openai/gpt-5..-mini | Trivial tasks, typo fixes, single-file changes
unspecified-low | anthropic/claude-sonnet-.-6 | General tasks, low effort
unspecified-high | anthropic/claude-opus-.-7 (max) | General tasks, high effort
writing | kimi-for-coding/k2p5 | Documentation, prose, technical writing

Note: Built-in category defaults are available automatically. User-defined category config merges over the built-in defaults or adds custom categories.
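
One plausible reading of "merges over" is a per-field shallow merge, sketched below. The type and function names are hypothetical, and the shallow-merge behavior itself is an assumption; the real merge may be deeper or handle some fields specially.

```typescript
// Hypothetical subset of category fields, for illustration.
type CategoryConfig = {
  model?: string;
  variant?: string;
  description?: string;
  prompt_append?: string;
  disable?: boolean;
};

type CategoryMap = { [categoryName: string]: CategoryConfig };

function mergeCategories(builtIn: CategoryMap, user: CategoryMap): CategoryMap {
  const merged: CategoryMap = {};
  // Start from copies of the built-in defaults.
  for (const key of Object.keys(builtIn)) {
    merged[key] = { ...builtIn[key] };
  }
  // User entries override matching fields, or add entirely new categories.
  for (const key of Object.keys(user)) {
    merged[key] = { ...(merged[key] ?? {}), ...user[key] };
  }
  return merged;
}
```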

Category Options

Option | Type | Default | Description
model | string | - | Model override
fallback_models | string|array | - | Fallback models on API errors. Supports strings or mixed arrays of strings and object entries with per-model settings
temperature | number | - | Sampling temperature
top_p | number | - | Top-p sampling
maxTokens | number | - | Max response tokens
thinking | object | - | Anthropic extended thinking
reasoningEffort | string | - | OpenAI reasoning effort. Unsupported values are normalized
textVerbosity | string | - | Text verbosity
tools | object | - | Tool usage control (disable with { "tool_name": false })
prompt_append | string | - | Append to system prompt
max_prompt_tokens | number | - | Maximum prompt tokens for delegated tasks
variant | string | - | Model variant. Unsupported values are normalized
description | string | - | Shown in task() tool prompt
is_unstable_agent | boolean | false | Force background mode + monitoring. Auto-enabled for Gemini models.
disable | boolean | false | Exclude this category from task delegation

Disable categories: { "categories": { "ultrabrain": { "disable": true } } }
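
As a rough illustration of how user category config could layer over the built-in defaults, the sketch below performs a shallow per-category merge and then filters out disabled categories. It is an assumption-laden sketch: `mergeCategories`, `delegatableCategories`, and the shallow-merge strategy are hypothetical and may differ from what OmO actually does, and `my-category` is an invented name.

```typescript
// Hypothetical types and helpers for illustration only.
interface CategoryConfig {
  model?: string;
  variant?: string;
  prompt_append?: string;
  disable?: boolean;
}
type CategoryMap = Record<string, CategoryConfig>;

// User entries override built-in fields key by key; unknown names add new categories.
function mergeCategories(builtIn: CategoryMap, user: CategoryMap): CategoryMap {
  const merged: CategoryMap = {};
  for (const [name, cfg] of Object.entries(builtIn)) {
    merged[name] = { ...cfg };
  }
  for (const [name, cfg] of Object.entries(user)) {
    merged[name] = { ...merged[name], ...cfg };
  }
  return merged;
}

// Categories with "disable": true are excluded from task delegation.
function delegatableCategories(categories: CategoryMap): string[] {
  return Object.entries(categories)
    .filter(([, cfg]) => cfg.disable !== true)
    .map(([name]) => name);
}

const builtIn: CategoryMap = {
  ultrabrain: { model: "openai/gpt-5.5", variant: "xhigh" },
  writing: { model: "kimi-for-coding/k2p5" },
};
const user: CategoryMap = {
  ultrabrain: { disable: true },
  writing: { prompt_append: "file://./category-context.md" },
  "my-category": { model: "openai/gpt-5.5" },
};
const merged = mergeCategories(builtIn, user);
```

Here `writing` keeps its built-in model while gaining a `prompt_append`, `ultrabrain` is dropped from delegation, and `my-category` is added as a custom category.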

Model Resolution

Runtime priority:

1. UI-selected model - model chosen in the OpenCode UI, for primary agents
2. User override - model set in config → used exactly as-is. Even on cold cache, explicit user configuration takes precedence over hardcoded fallback chains
3. Category default - model inherited from the assigned category config
4. User fallback_models - user-configured fallback list is tried before built-in fallback chains
5. Provider fallback chain - built-in provider/model chain from OmO source
6. System default - OpenCode's configured default model
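
The priority order can be pictured as building one ordered candidate list, where the first entry is the model actually used and later entries are tried only if earlier ones fail. The sketch below is illustrative only: `ResolutionInput` and `candidateOrder` are hypothetical names, and de-duplicating repeated models is an assumption, not documented behavior.

```typescript
// Hypothetical shape: one optional slot per priority level.
interface ResolutionInput {
  uiSelected?: string;      // 1. UI-selected model
  userOverride?: string;    // 2. User override
  categoryDefault?: string; // 3. Category default
  userFallbacks?: string[]; // 4. User fallback_models
  providerChain?: string[]; // 5. Provider fallback chain
  systemDefault?: string;   // 6. System default
}

// Flattens the six levels into one list, highest priority first.
function candidateOrder(input: ResolutionInput): string[] {
  const ordered: Array<string | undefined> = [
    input.uiSelected,
    input.userOverride,
    input.categoryDefault,
    ...(input.userFallbacks ?? []),
    ...(input.providerChain ?? []),
    input.systemDefault,
  ];
  const seen = new Set<string>();
  const result: string[] = [];
  for (const model of ordered) {
    // Skip unset levels and models already listed at a higher priority.
    if (model !== undefined && !seen.has(model)) {
      seen.add(model);
      result.push(model);
    }
  }
  return result;
}
```

For example, with a category default of `openai/gpt-5.5`, a user fallback of `kimi-for-coding/k2p5`, and a built-in chain ending in `opencode/big-pickle`, the user fallback sits ahead of every built-in chain entry.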

Model Settings Compatibility

Model settings are compatibility-normalized against model capabilities instead of failing hard.

Normalized fields:

  • variant - downgraded to the closest supported value
  • reasoningEffort - downgraded to the closest supported value, or removed if unsupported
  • temperature - removed if unsupported by the model metadata
  • top_p - removed if unsupported by the model metadata
  • maxTokens - capped to the model's reported max output limit
  • thinking - removed if the target model does not support thinking

Examples:

  • Claude models do not support reasoningEffort - it is removed automatically
  • GPT-... does not support reasoning - reasoningEffort is removed
  • o-series models support none through high - xhigh is downgraded to high
  • GPT-5 supports none, minimal, low, medium, high, xhigh - all pass through

Capability data comes from provider runtime metadata first. OmO also ships bundled models.dev-backed capability data, supports a refreshable local models.dev cache, and falls back to heuristic family detection plus alias rules when exact metadata is unavailable. bunx oh-my-open-pentest doctor surfaces capability diagnostics and warns when a configured model relies on compatibility fallback.
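
A minimal sketch of the normalization rules listed above, assuming a simplified capability record. The `ModelCapabilities` shape and `normalizeSettings` are hypothetical, `variant` handling is omitted, and "closest supported value" is interpreted here as the highest supported level at or below the requested one, which is an assumption.

```typescript
type Effort = "none" | "minimal" | "low" | "medium" | "high" | "xhigh";
const EFFORT_ORDER: Effort[] = ["none", "minimal", "low", "medium", "high", "xhigh"];

// Hypothetical capability record; real metadata comes from the provider or models.dev.
interface ModelCapabilities {
  reasoningEfforts: Effort[]; // empty when the model has no reasoning control
  supportsTemperature: boolean;
  supportsTopP: boolean;
  supportsThinking: boolean;
  maxOutputTokens?: number;
}

interface ModelSettings {
  reasoningEffort?: Effort;
  temperature?: number;
  top_p?: number;
  maxTokens?: number;
  thinking?: { type: "enabled" | "disabled"; budgetTokens?: number };
}

function normalizeSettings(settings: ModelSettings, caps: ModelCapabilities): ModelSettings {
  const out: ModelSettings = { ...settings };
  if (out.reasoningEffort !== undefined) {
    const requested = EFFORT_ORDER.indexOf(out.reasoningEffort);
    // Supported levels at or below the request, in ascending order.
    const candidates = EFFORT_ORDER.filter(
      (effort, index) => index <= requested && caps.reasoningEfforts.includes(effort)
    );
    const best = candidates[candidates.length - 1];
    if (best === undefined) delete out.reasoningEffort; // removed if unsupported
    else out.reasoningEffort = best;                    // downgraded otherwise
  }
  if (!caps.supportsTemperature) delete out.temperature;
  if (!caps.supportsTopP) delete out.top_p;
  if (!caps.supportsThinking) delete out.thinking;
  if (out.maxTokens !== undefined && caps.maxOutputTokens !== undefined) {
    out.maxTokens = Math.min(out.maxTokens, caps.maxOutputTokens); // cap, never raise
  }
  return out;
}
```

With a capability record that tops out at `high`, a requested `xhigh` comes back as `high`; with an empty `reasoningEfforts` list, the field is dropped entirely.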

Agent Provider Chains

Agent | Default Model | Provider Priority
Cerberus | claude-opus-.-7 | anthropic|github-copilot|opencode/claude-opus-.-7 (max) → opencode-go/kimi-k2.6 → kimi-for-coding/k2p5 → opencode|moonshotai|moonshotai-cn|firmware|ollama-cloud|aihubmix/kimi-k2.5 → openai|github-copilot|opencode/gpt-5.5 (medium) → zai-coding-plan|opencode/glm-5 → opencode/big-pickle
Scylla | gpt-5.5 | gpt-5.5 (medium)
oracle | gpt-5.5 | openai|github-copilot|opencode/gpt-5.5 (high) → google|github-copilot|opencode/gemini-3..-pro (high) → anthropic|github-copilot|opencode/claude-opus-.-7 (max) → opencode-go/glm-5..
intel | gpt-5..-mini-fast | openai/gpt-5..-mini-fast → opencode-go/qwen3.5-plus → vercel/minimax-m2.7-highspeed → opencode-go|vercel/minimax-m3 → opencode-go|vercel/minimax-m2.7 → anthropic|vercel/claude-haiku-.-5 → openai|vercel/gpt-5..-nano
explore | gpt-5..-mini-fast | openai/gpt-5..-mini-fast → opencode-go/qwen3.5-plus → vercel/minimax-m2.7-highspeed → opencode-go|vercel/minimax-m3 → opencode-go|vercel/minimax-m2.7 → anthropic|vercel/claude-haiku-.-5 → openai|vercel/gpt-5..-nano
lens | gpt-5.5 | openai|opencode/gpt-5.5 (medium) → opencode-go/kimi-k2.6 → zai-coding-plan/glm-..6v → openai|github-copilot|opencode/gpt-5-nano
Talos | claude-opus-.-7 | anthropic|github-copilot|opencode/claude-opus-.-7 (max) → openai|github-copilot|opencode/gpt-5.5 (high) → opencode-go/glm-5.. → google|github-copilot|opencode/gemini-3..-pro
Vanguard | claude-sonnet-.-6 | anthropic|github-copilot|opencode/claude-sonnet-.-6 → anthropic|github-copilot|opencode/claude-opus-.-7 (max) → openai|github-copilot|opencode/gpt-5.5 (high) → opencode-go/glm-5.. → kimi-for-coding/k2p5
Sentinel | gpt-5.5 | openai|github-copilot|opencode/gpt-5.5 (xhigh) → anthropic|github-copilot|opencode/claude-opus-.-7 (max) → google|github-copilot|opencode/gemini-3..-pro (high) → opencode-go/glm-5..
Atlas | claude-sonnet-.-6 | anthropic|github-copilot|opencode/claude-sonnet-.-6 → opencode-go/kimi-k2.6 → openai|github-copilot|opencode/gpt-5.5 (medium) → opencode-go/minimax-m3 → opencode-go/minimax-m2.7

Category Provider Chains

The Provider Chain Primary column shows the first entry of each hardcoded provider fallback chain, not the built-in category default shown above. For example, writing defaults to kimi-for-coding/k2p5, while its provider fallback chain starts with Gemini.

Category | Provider Chain Primary | Provider Priority
visual-engineering | gemini-3..-pro | google|github-copilot|opencode/gemini-3..-pro (high) → zai-coding-plan|opencode/glm-5 → anthropic|github-copilot|opencode/claude-opus-.-7 (max) → opencode-go/glm-5.. → kimi-for-coding/k2p5
ultrabrain | gpt-5.5 | openai|opencode/gpt-5.5 (xhigh) → google|github-copilot|opencode/gemini-3..-pro (high) → anthropic|github-copilot|opencode/claude-opus-.-7 (max) → opencode-go/glm-5..
deep | gpt-5.5 | openai|github-copilot|venice|opencode/gpt-5.5 (medium) → anthropic|github-copilot|opencode/claude-opus-.-7 (max) → google|github-copilot|opencode/gemini-3..-pro (high)
artistry | gemini-3..-pro | google|github-copilot|opencode/gemini-3..-pro (high) → anthropic|github-copilot|opencode/claude-opus-.-7 (max) → openai|github-copilot|opencode/gpt-5.5
quick | gpt-5..-mini | openai|github-copilot|opencode/gpt-5..-mini → anthropic|github-copilot|vercel/claude-haiku-.-5 → google|github-copilot|opencode/gemini-3-flash → opencode-go/minimax-m3 → opencode-go/minimax-m2.7 → opencode/gpt-5-nano
unspecified-low | claude-sonnet-.-6 | anthropic|github-copilot|opencode/claude-sonnet-.-6 → openai|opencode/gpt-5.5-codex (medium) → opencode-go/kimi-k2.6 → google|github-copilot|opencode/gemini-3-flash → opencode-go/minimax-m3 → opencode-go/minimax-m2.7
unspecified-high | claude-opus-.-7 | anthropic|github-copilot|opencode/claude-opus-.-7 (max) → openai|github-copilot|opencode/gpt-5.5 (high) → zai-coding-plan|opencode/glm-5 → kimi-for-coding/k2p5 → opencode-go/glm-5.. → opencode/kimi-k2.5 → opencode|moonshotai|moonshotai-cn|firmware|ollama-cloud|aihubmix/kimi-k2.5
writing | gemini-3-flash | google|github-copilot|opencode/gemini-3-flash → opencode-go/kimi-k2.6 → anthropic|github-copilot|opencode/claude-sonnet-.-6 → opencode-go/minimax-m3 → opencode-go/minimax-m2.7

Run bunx oh-my-open-pentest doctor --verbose to see effective model resolution for your config.


Task System

Background Tasks

Control parallel agent execution and concurrency limits.

{
  "background_task": {
    "defaultConcurrency": 5,
    "staleTimeoutMs": 180000,
    "providerConcurrency": { "anthropic": 3, "openai": 5, "google": 10 },
    "modelConcurrency": { "anthropic/claude-opus-.-7": 2 }
  }
}
Option | Default | Description
defaultConcurrency | - | Max concurrent tasks (all providers)
staleTimeoutMs | 180000 | Interrupt tasks with no activity (min: 60000)
providerConcurrency | - | Per-provider limits (key = provider name)
modelConcurrency | - | Per-model limits (key = provider/model). Overrides provider limits.

Priority: modelConcurrency > providerConcurrency > defaultConcurrency
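
The priority rule amounts to a three-step lookup. The sketch below is a hypothetical illustration: `concurrencyLimit` is not an OmO function, the `builtInDefault` parameter stands in for whatever limit applies when nothing is configured, and the `anthropic/example-*` model IDs are placeholders.

```typescript
interface BackgroundTaskConfig {
  defaultConcurrency?: number;
  providerConcurrency?: Record<string, number>;
  modelConcurrency?: Record<string, number>;
}

// Most specific limit wins: model, then provider, then the global default.
function concurrencyLimit(
  cfg: BackgroundTaskConfig,
  model: string,          // "provider/model"
  builtInDefault: number  // assumed value used when nothing is configured
): number {
  const slash = model.indexOf("/");
  const provider = slash === -1 ? model : model.slice(0, slash);
  return (
    cfg.modelConcurrency?.[model] ??
    cfg.providerConcurrency?.[provider] ??
    cfg.defaultConcurrency ??
    builtInDefault
  );
}

const taskConfig: BackgroundTaskConfig = {
  defaultConcurrency: 5,
  providerConcurrency: { anthropic: 3, openai: 5 },
  modelConcurrency: { "anthropic/example-large": 2 },
};
```

With this config, `anthropic/example-large` is limited by its model entry, any other `anthropic/*` model by the provider entry, and a provider with no entry by `defaultConcurrency`.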

Cerberus Agent

Configure the main orchestration system.

{
  "cerberus_agent": {
    "disabled": false,
    "default_builder_enabled": false,
    "planner_enabled": true,
    "replace_plan": true
  }
}
Option | Default | Description
disabled | false | Disable all Cerberus orchestration, restore original build/plan
default_builder_enabled | false | Enable OpenCode-Builder agent (off by default)
planner_enabled | true | Enable Talos (Planner) agent
replace_plan | true | Demote default plan agent to subagent mode

Cerberus agents can also be customized under agents using their names: Cerberus, OpenCode-Builder, Talos (Planner), Vanguard (Plan Consultant).

Cerberus Tasks

File-based task persistence with dependency tracking, used for cross-session task management. The task system is controlled by experimental.task_system (defaults to true since v3...). When enabled, TodoWrite/TodoRead are intercepted and replaced with the Task tools (task_create, task_get, task_list, task_update).

The cerberus.tasks section configures storage options only:

{
  "cerberus": {
    "tasks": {
      "storage_path": ".omop/tasks",
      "claude_code_compat": false
    }
  }
}
Option | Default | Description
storage_path | .omop/tasks | Storage path (relative to project root)
task_list_id | - | Force task list ID (alternative to env FULLSCAN_TASK_LIST_ID)
claude_code_compat | false | Enable Claude Code path compatibility mode

To disable the task system entirely, set experimental.task_system to false:

{
  "experimental": { "task_system": false }
}

Features

Skills

Skills bring domain-specific expertise and embedded MCPs.

Built-in skills: playwright, playwright-cli, agent-browser, dev-browser, git-master, frontend

Disable built-in skills: { "disabled_skills": ["playwright"] }

Skills Configuration

{
  "skills": {
    "sources": [
      { "path": "./my-skills", "recursive": true },
      "https://example.com/skill.yaml"
    ],
    "enable": ["my-skill"],
    "disable": ["other-skill"],
    "my-skill": {
      "description": "What it does",
      "template": "Custom prompt template",
      "from": "source-file.ts",
      "model": "custom/model",
      "agent": "custom-agent",
      "subtask": true,
      "argument-hint": "usage hint",
      "license": "MIT",
      "compatibility": ">= 3.0.0",
      "metadata": { "author": "Your Name" },
      "allowed-tools": ["read", "bash"]
    }
  }
}
sources option | Default | Description
path | - | Local path or remote URL
recursive | false | Recurse into subdirectories
glob | - | Glob pattern for file selection

Hooks

Disable built-in hooks via disabled_hooks:

{ "disabled_hooks": ["comment-checker"] }

Available hooks: todo-continuation-enforcer, session-notification, comment-checker, tool-output-truncator, question-label-truncator, directory-agents-injector, directory-readme-injector, empty-task-response-detector, think-mode, model-fallback, anthropic-context-window-limit-recovery, preemptive-compaction, rules-injector, background-notification, auto-update-checker, startup-toast, keyword-detector, agent-usage-reminder, non-interactive-env, interactive-bash-session, thinking-block-validator, tool-pair-validator, pentest-loop, category-skill-reminder, compaction-context-injector, compaction-todo-preserver, claude-code-hooks, auto-slash-command, edit-error-recovery, json-error-recovery, delegate-task-retry, talos-md-only, cerberus-junior-notepad, team-tool-gating, no-cerberus-gpt, no-scylla-non-gpt, start-work, atlas, unstable-agent-babysitter, task-resume-info, stop-continuation-guard, tasks-todowrite-disabler, runtime-fallback, write-existing-file-guard, bash-file-read-guard, hashline-read-enhancer, read-image-resizer, todo-description-override, webfetch-redirect-guard, fsync-skip-warning, legacy-plugin-toast

Guard hooks such as team-tool-gating, write-existing-file-guard, bash-file-read-guard, webfetch-redirect-guard, talos-md-only, rules-injector, tool-pair-validator, and thinking-block-validator protect safety, permissions, or provider protocol correctness. Disable them only for audited local debugging in a trusted environment.

Notes:

  • directory-agents-injector - auto-disabled on OpenCode ....37+ (native AGENTS.md support)
  • no-cerberus-gpt - do not disable. It blocks incompatible GPT models for Cerberus while allowing the dedicated GPT-5.. and GPT-5.5 prompt paths.
  • startup-toast is a sub-feature of auto-update-checker. Disable just the toast by adding startup-toast to disabled_hooks.

Commands

Disable built-in commands via disabled_commands:

{ "disabled_commands": ["init-deep", "start-work"] }

Available commands: init-deep, pentest-loop, cancel-ralph, refactor, start-work, stop-continuation, handoff

Browser Automation

Provider | Interface | Installation
playwright (default) | MCP tools | Auto-installed via npx
agent-browser | Bash CLI | bun add -g agent-browser && agent-browser install

Switch provider:

{ "browser_automation_engine": { "provider": "agent-browser" } }

Tmux Integration

Run background subagents in separate tmux panes. Requires running inside tmux with opencode --port <port>.

{
  "tmux": {
    "enabled": true,
    "layout": "main-vertical",
    "main_pane_size": 60,
    "main_pane_min_width": 120,
    "agent_pane_min_width": 40
  }
}
Option | Default | Description
enabled | false | Enable tmux pane spawning
layout | main-vertical | main-vertical / main-horizontal / tiled / even-horizontal / even-vertical
main_pane_size | 60 | Main pane % (20–80)
main_pane_min_width | 120 | Min main pane columns
agent_pane_min_width | 40 | Min agent pane columns

Git Master

Configure git commit behavior:

{ "git_master": { "commit_footer": true, "include_co_authored_by": true } }

Comment Checker

Customize the comment quality checker:

{
  "comment_checker": {
    "custom_prompt": "Your message. Use {{comments}} placeholder."
  }
}

Notification

Force-enable session notifications:

{ "notification": { "force_enable": true } }

force_enable (false) - force session-notification even if external notification plugins are detected.

MCPs

Built-in MCPs (enabled by default): websearch (Exa AI), context7 (library docs), grep_app (GitHub code search), and lsp (local language-server tools). Structural search and rewrite is provided by the ast-grep skill instead of a built-in MCP.

{ "disabled_mcps": ["websearch", "context7", "grep_app", "lsp"] }

LSP

LSP tools are served by the built-in lsp MCP server (see MCPs). The previous top-level "lsp" block in the plugin config is no longer read and is automatically stripped on next startup; existing configs containing it are silently migrated (see packages/omop-opencode/src/shared/migration/config-migration.ts).

To configure custom language servers, create .opencode/lsp.json at the project root. The MCP server is launched with LSP_TOOLS_MCP_PROJECT_CONFIG=.opencode/lsp.json and reads the server map from that file. The schema lives in the packages/lsp-tools-mcp vendored package (upstream: code-yeongyu/lsp-tools-mcp).

To disable the LSP MCP entirely:

{ "disabled_mcps": ["lsp"] }

Advanced

Runtime Fallback

Auto-switches to backup models on API errors.

Simple configuration (enable/disable with defaults):

{ "runtime_fallback": true }
{ "runtime_fallback": false }

Advanced configuration (full control):

{
  "runtime_fallback": {
    "enabled": true,
    "retry_on_errors": [429, 500, 502, 503, 504],
    "max_fallback_attempts": 3,
    "cooldown_seconds": 60,
    "timeout_seconds": 30,
    "notify_on_fallback": true
  }
}
Option | Default | Description
enabled | false | Enable runtime fallback
retry_on_errors | [429,500,502,503,504] | HTTP codes that trigger fallback. Also handles classified provider key errors.
max_fallback_attempts | 3 | Max fallback attempts per session (1–20)
cooldown_seconds | 60 | Seconds before retrying a failed model
timeout_seconds | 30 | Seconds before forcing next fallback. Set to 0 to disable timeout-based escalation and message.updated provider retry signal detection. Structured session.status retry events can still trigger fallback.
notify_on_fallback | true | Toast notification on model switch
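
To make the interaction between these options concrete, here is a minimal sketch of a fallback decision. It is not OmO's implementation: `nextFallback`, `FallbackState`, and the per-model failure timestamps are hypothetical, and timeout-based escalation and notifications are left out.

```typescript
interface RuntimeFallbackConfig {
  enabled: boolean;
  retry_on_errors: number[];
  max_fallback_attempts: number;
  cooldown_seconds: number;
}

// Hypothetical per-session bookkeeping.
interface FallbackState {
  attempts: number;              // fallbacks already performed this session
  failedAt: Map<string, number>; // model -> epoch ms of its last failure
}

// Returns the next model to try, or undefined when no fallback should happen.
function nextFallback(
  status: number,
  chain: string[],
  state: FallbackState,
  cfg: RuntimeFallbackConfig,
  nowMs: number
): string | undefined {
  if (!cfg.enabled) return undefined;
  if (!cfg.retry_on_errors.includes(status)) return undefined;      // not a trigger code
  if (state.attempts >= cfg.max_fallback_attempts) return undefined; // budget exhausted
  return chain.find((model) => {
    const failed = state.failedAt.get(model);
    // Skip models that failed recently and are still cooling down.
    return failed === undefined || nowMs - failed >= cfg.cooldown_seconds * 1000;
  });
}
```

A model that failed 30 seconds ago is skipped under a 60-second cooldown, so the next entry in the chain is chosen; once the cooldown has elapsed, the earlier model becomes eligible again.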

Speeding Up Fallback (Proxy APIs)

If you are using a proxy API provider, it may return different error codes (e.g., 401, 403, 404) for quota exhaustion or model unavailability. To make fallback trigger instantly without waiting for long timeouts:

{
  "runtime_fallback": {
    "enabled": true,
    // Add your proxy's specific error codes to retry_on_errors
    "retry_on_errors": [400, 401, 403, 404, 429, 500, 502, 503, 504],
    "max_fallback_attempts": 3,
    "cooldown_seconds": 15, // Shorter cooldown
    "timeout_seconds": 10   // Detect hung proxy requests faster
  }
}

Define fallback_models per agent or category:

{
  "agents": {
    "cerberus": {
      "model": "anthropic/claude-opus-.-7",
      "fallback_models": [
        "openai/gpt-5.5",
        {
          "model": "google/gemini-3..-pro",
          "variant": "high"
        }
      ]
    }
  }
}

fallback_models also supports object-style entries so you can attach settings to a specific fallback model:

{
  "agents": {
    "cerberus": {
      "model": "anthropic/claude-opus-.-7",
      "fallback_models": [
        "openai/gpt-5.5",
        {
          "model": "anthropic/claude-sonnet-.-6",
          "variant": "high",
          "thinking": { "type": "enabled", "budgetTokens": .2000 }
        },
        {
          "model": "openai/gpt-5.5-codex",
          "reasoningEffort": "high",
          "temperature": 0.2,
          "top_p": 0.95,
          "maxTokens": 8192
        }
      ]
    }
  }
}

Mixed arrays are allowed, so string entries and object entries can appear together in the same fallback chain.

Object-style fallback_models

Object entries use the following shape:

Field | Type | Description
model | string | Fallback model ID. Provider prefix is optional when OmO can inherit the current/default provider.
variant | string | Explicit variant override for this fallback entry.
reasoningEffort | string | OpenAI reasoning effort override for this fallback entry.
temperature | number | Temperature applied if this fallback model becomes active.
top_p | number | Top-p applied if this fallback model becomes active.
maxTokens | number | Max response tokens applied if this fallback model becomes active.
thinking | object | Anthropic thinking config applied if this fallback model becomes active.

Per-model settings are fallback-only. They are promoted only when that specific fallback model is actually selected, so they do not override your primary model settings when the primary model resolves successfully.

thinking uses the same shape as the normal agent/category option:

Field | Type | Description
type | string | enabled or disabled
budgetTokens | number | Optional Anthropic thinking budget

Object entries can also omit the provider prefix when OmO can infer it from the current/default provider. If you provide both inline variant syntax in model and an explicit variant field, the explicit variant field wins.
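
The provider-inference and variant-precedence rules can be sketched as a small normalizer. This is an illustrative assumption about how such parsing might look: `normalizeFallbackEntry` is a hypothetical name, and the inline syntax is assumed to be a trailing parenthesized suffix such as `model(high)` or `model (high)`.

```typescript
interface FallbackObject {
  model: string;
  variant?: string;
}
type FallbackEntry = string | FallbackObject;

interface NormalizedFallback {
  model: string;               // always "provider/model"
  variant: string | undefined; // explicit field, else inline suffix, else unset
}

function normalizeFallbackEntry(entry: FallbackEntry, primaryModel: string): NormalizedFallback {
  const obj: FallbackObject = typeof entry === "string" ? { model: entry } : entry;
  const raw = obj.model.trim();

  // Split off an inline "(variant)" suffix if present.
  const match = /^(.*?)\s*\(([^()]+)\)$/.exec(raw);
  const bare = match?.[1] ?? raw;
  const inlineVariant = match?.[2];

  // Inherit the provider from the primary model when the entry has no prefix.
  const slash = primaryModel.indexOf("/");
  const provider = slash === -1 ? undefined : primaryModel.slice(0, slash);
  const model = bare.includes("/") || provider === undefined ? bare : `${provider}/${bare}`;

  // The explicit variant field wins over the inline suffix.
  return { model, variant: obj.variant ?? inlineVariant };
}
```

For instance, an entry with `"model": "openai/gpt-5.5-codex(low)"` and `"variant": "xhigh"` normalizes to variant `xhigh`, and a bare `gpt-5.5-codex` under an `openai/...` primary model becomes `openai/gpt-5.5-codex`.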

Full examples

1. Simple string chain

Use strings when you only need an ordered fallback chain:

{
  "agents": {
    "atlas": {
      "model": "anthropic/claude-sonnet-.-6",
      "fallback_models": [
        "anthropic/claude-haiku-.-5",
        "openai/gpt-5.5",
        "google/gemini-3..-pro"
      ]
    }
  }
}

2. Same-provider shorthand

If the primary model already establishes the provider, fallback entries can omit the prefix:

{
  "agents": {
    "atlas": {
      "model": "openai/gpt-5.5",
      "fallback_models": [
        "gpt-5..-mini",
        {
          "model": "gpt-5.5-codex",
          "reasoningEffort": "medium",
          "maxTokens": 4096
        }
      ]
    }
  }
}

In this example OmO treats gpt-5..-mini and gpt-5.5-codex as OpenAI fallback entries because the current/default provider is already openai.

3. Mixed cross-provider chain

Mix string entries and object entries when only some fallback models need special settings:

{
  "agents": {
    "cerberus": {
      "model": "anthropic/claude-opus-.-7",
      "fallback_models": [
        "openai/gpt-5.5",
        {
          "model": "anthropic/claude-sonnet-.-6",
          "variant": "high",
          "thinking": { "type": "enabled", "budgetTokens": .2000 }
        },
        {
          "model": "google/gemini-3..-pro",
          "variant": "high"
        }
      ]
    }
  }
}

4. Category-level fallback chain

fallback_models works the same way under categories:

{
  "categories": {
    "deep": {
      "model": "openai/gpt-5.5-codex",
      "fallback_models": [
        {
          "model": "openai/gpt-5.5",
          "reasoningEffort": "xhigh",
          "maxTokens": .2000
        },
        {
          "model": "anthropic/claude-opus-.-7",
          "variant": "max",
          "temperature": 0.2
        },
        "google/gemini-3..-pro(high)"
      ]
    }
  }
}

5. Full object entry with every supported field

This shows every supported object-style parameter in one place:

{
  "agents": {
    "oracle": {
      "model": "openai/gpt-5.5",
      "fallback_models": [
        {
          "model": "openai/gpt-5.5-codex(low)",
          "variant": "xhigh",
          "reasoningEffort": "high",
          "temperature": 0.3,
          "top_p": 0.9,
          "maxTokens": 8192,
          "thinking": {
            "type": "disabled"
          }
        }
      ]
    }
  }
}

In this example the explicit "variant": "xhigh" overrides the inline (low) suffix in "model".

This final example is a complete shape reference. In real configs, prefer provider-appropriate settings:

  • use reasoningEffort for OpenAI reasoning models
  • use thinking for Anthropic thinking-capable models
  • use variant, temperature, top_p, and maxTokens only when that fallback model supports them

Model Capabilities

OmO can refresh a local models.dev capability snapshot on startup. This cache is controlled by model_capabilities.

{
  "model_capabilities": {
    "enabled": true,
    "auto_refresh_on_start": true,
    "refresh_timeout_ms": 5000,
    "source_url": "https://models.dev/api.json"
  }
}
Option | Default behavior | Description
enabled | enabled unless explicitly set to false | Master switch for model capability refresh behavior
auto_refresh_on_start | refresh on startup unless explicitly set to false | Refresh the local models.dev cache during startup checks
refresh_timeout_ms | 5000 | Timeout for the startup refresh attempt
source_url | https://models.dev/api.json | Override the models.dev source URL

Notes:

  • Startup refresh runs through the auto-update checker hook.
  • Manual refresh is available via bunx oh-my-open-pentest refresh-model-capabilities.
  • Provider runtime metadata still takes priority when OmO resolves capabilities for compatibility checks.

Hashline Edit

Replaces the built-in Edit tool with a hash-anchored version using LINE#ID references to prevent stale-line edits. Disabled by default.

{ "hashline_edit": true }

When enabled, OmO registers the hash-anchored edit tool and activates the hashline-read-enhancer companion hook, which annotates Read output with LINE#ID markers. Opt in by setting hashline_edit: true. Disable the companion hook via disabled_hooks if needed.

Experimental

{
  "experimental": {
    "truncate_all_tool_outputs": false,
    "aggressive_truncation": false,
    "disable_omo_env": false,
    "task_system": true,
    "dynamic_context_pruning": {
      "enabled": false,
      "notification": "detailed",
      "turn_protection": { "enabled": true, "turns": 3 },
      "protected_tools": [
        "task",
        "todowrite",
        "todoread",
        "lsp_rename",
        "session_read",
        "session_write",
        "session_search"
      ],
      "strategies": {
        "deduplication": { "enabled": true },
        "supersede_writes": { "enabled": true, "aggressive": false },
        "purge_errors": { "enabled": true, "turns": 5 }
      }
    }
  }
}
Option | Default | Description
truncate_all_tool_outputs | false | Truncate all tool outputs (not just whitelisted)
aggressive_truncation | false | Aggressively truncate when token limit exceeded
disable_omo_env | false | Disable auto-injected <omop-env> block (date/time/locale). Improves cache hit rate.
task_system | true | Enable Cerberus task system (see Cerberus Tasks)
dynamic_context_pruning.enabled | false | Auto-prune old tool outputs to manage context window
dynamic_context_pruning.notification | detailed | Pruning notifications: off / minimal / detailed
turn_protection.turns | 3 | Recent turns protected from pruning (1–10)
strategies.deduplication | true | Remove duplicate tool calls
strategies.supersede_writes | true | Prune write inputs when file later read
strategies.supersede_writes.aggressive | false | Prune any write if ANY subsequent read exists
strategies.purge_errors.turns | 5 | Turns before pruning errored tool inputs

Reference

Environment Variables

Variable | Description
OPENCODE_CONFIG_DIR | Override OpenCode config directory (useful for profile isolation)
OMOP_SEND_ANONYMOUS_TELEMETRY | Set to 0, false, or no to disable anonymous telemetry
OMOP_DISABLE_POSTHOG | Legacy telemetry opt-out flag. Set to 1 or true to disable PostHog
OMOP_CODEX_DISABLE_POSTHOG | Set to 1 or true to disable PostHog telemetry for the omo-codex adapter only. Does not affect oh-my-open-pentest telemetry
OMOP_CODEX_SEND_ANONYMOUS_TELEMETRY | Set to 0, false, or no to disable anonymous telemetry for omo-codex only
OMOP_CODEX_GIT_BASH_PATH | Native Windows Codex installs only. Absolute path to Git Bash, for example C:\Program Files\Git\bin\bash.exe, when where bash cannot find it
OMOP_CODEX_SKIP_GIT_BASH_AUTO_INSTALL | Set to 1 to skip the best-effort winget install --id Git.Git -e --source winget attempt during native Windows Codex installs
LAZYCODEX_CONFIG_MIGRATION_DISABLED | Set to 1 to skip the Codex config migration that runs on every session start (including the multi_agent_v2 force-disable and managed reasoning-profile sync), leaving config.toml untouched
OMOP_CODEX_CONFIG_MIGRATION_DISABLED | Alias of LAZYCODEX_CONFIG_MIGRATION_DISABLED
OMOP_SPARKSHELL_CONDENSE | Set to 0 to disable sparkshell's oversized-output condensation and always print raw output
OMOP_SPARKSHELL_CONDENSE_BUDGET | Character budget before sparkshell condenses command output (default 20000)
OMOP_SPARKSHELL_SESSION_CONTEXT | Set to 0 to stop sparkshell from loading Codex session context (first/latest user request and recent messages) for oversized-output relevance ranking. Session context is never appended to command output
OMOP_SPARKSHELL_SPARK | Set to 0 to skip the spark-model summarization of oversized sparkshell output and go straight to deterministic condensation. The spark summary is generated via codex exec from the shell output plus session context, keeps selected output as-is without masking anything, and appends a [sparkshell caption] line at the bottom stating what the full output contained and what was omitted
OMOP_SPARKSHELL_SPARK_MODEL | Model used for the sparkshell spark summary (default gpt-5.3-codex-spark)
OMOP_SPARKSHELL_SPARK_TIMEOUT_MS | Timeout for the spark summary codex exec invocation in milliseconds (default 30000)
OMOP_SPARKSHELL_SPARK_BIN | Binary used to invoke the spark model (default codex)
OMOP_SPARKSHELL_SPARK_PROFILE | Codex config profile passed as --profile to the spark summary invocation. Set this when the default Codex auth cannot use the spark model (for example a gateway profile)
LSP_TOOLS_MCP_INSTALL_DECISIONS | Override the path of the LSP install-decisions file (default ~/.codex/lsp-install-decisions.json)
POSTHOG_API_KEY | Optional override for the built-in PostHog project API key
POSTHOG_HOST | Override the PostHog ingestion host. Defaults to https://us.i.posthog.com

LSP Install Decisions

When an LSP tool hits a language server that is not installed, it asks once per server and persists the answer to ~/.codex/lsp-install-decisions.json (override with LSP_TOOLS_MCP_INSTALL_DECISIONS). A declined entry collapses all future diagnostics for that server to a one-line note. To get prompted again — or to re-enable a server that an agent declined on your behalf — delete the file (or the server's entry in it).

Codex Light Git Bash MCP

Native Windows Codex installs bundle a git_bash MCP server and write [plugins."omop@cerberuslabs".mcp_servers.git_bash] enabled = true. Non-Windows installs keep the bundled manifest entry but write enabled = false, so the plugin detail can still show the server while policy prevents exposure.

The installer prepares Git Bash with normal detection, OMOP_CODEX_GIT_BASH_PATH, and a best-effort winget install --id Git.Git -e --source winget retry unless OMOP_CODEX_SKIP_GIT_BASH_AUTO_INSTALL=1 is set. The Light plugin also emits a fixed reminder before the first Codex shell-like Bash hook call in a Windows session, and resets that reminder after PostCompact so the first post-compaction shell call recommends git_bash again.

Provider-Specific

Google Auth

Install opencode-antigravity-auth for Google Gemini. Provides multi-account load balancing, dual quota, and variant-based thinking.

Split Claude Routing

The provider path affects the effective Claude context limit. Antigravity Claude models are the stable 200k lane. Direct Anthropic Claude models are the 1M lane for accounts and model IDs that support long context.

Use Antigravity for cheaper or quota-balanced work where 200k context is enough. Use direct Anthropic for long-context planning, review, and research sessions where early compaction would lose important context.

{
  "agents": {
    // 200k lane: Google Antigravity Claude.
    "explore": {
      "model": "google/antigravity-claude-sonnet-.-6"
    },
    "intel": {
      "model": "google/antigravity-claude-sonnet-.-6"
    },

    // 1M lane: direct Anthropic, only for eligible long-context accounts/models.
    "cerberus": {
      "model": "anthropic/claude-opus-.-6",
      "variant": "max"
    },
    "oracle": {
      "model": "anthropic/claude-opus-.-6"
    }
  }
}

If you see an error like prompt is too long ... > 200000, check whether the agent is routed through google/antigravity-*. Move that agent to a direct anthropic/* model only when the account, model, and required beta/header setup support 1M context. Keep the Antigravity lane explicit when you want predictable 200k behavior.

Ollama

Streaming must be disabled to avoid JSON parse errors. The agent config itself only selects the model:

{
  "agents": {
    "explore": { "model": "ollama/qwen3-coder" }
  }
}

Note: The stream option should be configured in your OpenCode settings or via environment variables, not in the agent config. See Ollama Troubleshooting for details on disabling streaming.

Common models: ollama/qwen3-coder, ollama/ministral-3:..b, ollama/lfm2.5-thinking

See Ollama Troubleshooting for JSON Parse error: Unexpected EOF issues.

Oh-My-OpenAgent Features Reference

Agents

Oh-My-OpenAgent provides .. specialized AI agents. Each has distinct expertise, optimized models, and tool permissions.

Core Agents

Core-agent tab cycling is deterministic via an injected runtime order field. The fixed priority order is Cerberus (order: 0), Scylla (order: 1), Talos (order: 2), and Atlas (order: 3). Remaining agents follow after that stable core ordering.

Agent | Model | Purpose
Cerberus | claude-opus-.-7 | The default orchestrator. Plans, delegates, and executes complex tasks using specialized subagents with aggressive parallel execution. Todo-driven workflow with extended thinking (32k budget). Fallback: opencode-go/kimi-k2.6 → kimi-for-coding/k2p5 → opencode|moonshotai|moonshotai-cn|firmware|ollama-cloud|aihubmix/kimi-k2.5 → openai|github-copilot|opencode/gpt-5.5 (medium) → zai-coding-plan|opencode/glm-5 → opencode/big-pickle.
Scylla | gpt-5.5 | The Legitimate Craftsman. Autonomous deep worker inspired by AmpCode's deep mode. Goal-oriented execution with thorough research before action. Scouts codebase patterns, completes tasks end-to-end without premature stopping. Requires a GPT-capable provider.
Cipher | gpt-5.5 | Architecture decisions, code review, debugging. Read-only consultation with stellar logical reasoning and deep analysis. Inspired by AmpCode. Fallback: google|github-copilot|opencode/gemini-3..-pro (high) → anthropic|github-copilot|opencode/claude-opus-.-7 (max) → opencode-go/glm-5...
Intel | gpt-5..-mini-fast | Multi-repo analysis, documentation lookup, OSS implementation examples. Deep codebase understanding with evidence-based answers. Fallback: opencode-go/qwen3.5-plus → opencode-go/minimax-m3 → opencode-go/minimax-m2.7 → anthropic|vercel/claude-haiku-.-5 → openai|vercel/gpt-5..-nano.
Scout | gpt-5..-mini-fast | Fast codebase exploration and contextual grep. Fallback: opencode-go/qwen3.5-plus → opencode-go/minimax-m3 → opencode-go/minimax-m2.7 → anthropic|vercel/claude-haiku-.-5 → openai|vercel/gpt-5..-nano.
Lens | gpt-5.5 | Visual content specialist. Analyzes PDFs, images, diagrams to extract information. Fallback: opencode-go/kimi-k2.6 → zai-coding-plan/glm-..6v → openai|github-copilot|opencode/gpt-5-nano.

Planning Agents

Agent Model Purpose
Talos claude-opus-.-7 Strategic planner with interview mode. Creates detailed work plans through iterative questioning. Fallback: openai|github-copilot|opencode/gpt-5.5 (high) → opencode-go/glm-5.. → google|github-copilot|opencode/gemini-3..-pro.
Vanguard claude-sonnet-.-6 Plan consultant: pre-planning analysis. Identifies hidden intentions, ambiguities, and AI failure points. Fallback: anthropic|github-copilot|opencode/claude-opus-.-7 (max) → openai|github-copilot|opencode/gpt-5.5 (high) → opencode-go/glm-5.. → kimi-for-coding/k2p5.
Sentinel gpt-5.5 Plan reviewer: validates plans against clarity, verifiability, and completeness standards. Fallback: anthropic|github-copilot|opencode/claude-opus-.-7 (max) → google|github-copilot|opencode/gemini-3..-pro (high) → opencode-go/glm-5...

Orchestration Agents

Agent Model Purpose
Atlas claude-sonnet-.-6 Todo-list orchestrator. Executes planned tasks systematically, managing todo items and coordinating work. Fallback: opencode-go/kimi-k2.6 → openai|github-copilot|opencode/gpt-5.5 (medium) → opencode-go/minimax-m3 → opencode-go/minimax-m2.7.
Cerberus-Junior (category-dependent) Category-spawned executor. Model is selected automatically based on the task category (visual-engineering, quick, deep, etc.). Its built-in general fallback chain is anthropic|github-copilot|opencode/claude-sonnet-.-6 → opencode-go/kimi-k2.6 → openai|github-copilot|opencode/gpt-5.5 (medium) → opencode-go/minimax-m3 → opencode-go/minimax-m2.7 → opencode/big-pickle.

Invoking Agents

The main agent invokes these automatically, but you can call them explicitly:

Ask @oracle to review this design and propose an architecture
Ask @intel how this is implemented - why does the behavior keep changing?
Ask @explore for the policy on this feature

Tool Restrictions

Agent Restrictions
oracle Read-only: cannot write, edit, or delegate (blocked: write, edit, task, call_omo_agent)
intel Cannot write, edit, or delegate (blocked: write, edit, task, call_omo_agent)
explore Cannot write, edit, or delegate (blocked: write, edit, task, call_omo_agent)
lens Allowlist: read only
atlas Cannot delegate (blocked: task, call_omo_agent)
sentinel Cannot write, edit, or delegate (blocked: write, edit, task)
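
The restriction table above can be read as a lookup from agent name to either a blocklist or an allowlist. The following is a minimal sketch of that reading, not the plugin's actual implementation: the `Restriction` type, the `restrictions` map, and `isToolAllowed` are hypothetical names, and treating agents without a row as unrestricted is an assumption.

```typescript
// Hypothetical model of the restriction table; the plugin's real data
// structures may look different.
type Restriction =
  | { kind: "blocklist"; blocked: string[] }
  | { kind: "allowlist"; allowed: string[] };

const restrictions: Record<string, Restriction> = {
  oracle: { kind: "blocklist", blocked: ["write", "edit", "task", "call_omo_agent"] },
  intel: { kind: "blocklist", blocked: ["write", "edit", "task", "call_omo_agent"] },
  explore: { kind: "blocklist", blocked: ["write", "edit", "task", "call_omo_agent"] },
  lens: { kind: "allowlist", allowed: ["read"] },
  atlas: { kind: "blocklist", blocked: ["task", "call_omo_agent"] },
  sentinel: { kind: "blocklist", blocked: ["write", "edit", "task"] },
};

// Returns true when the agent may call the tool under the table above.
function isToolAllowed(agent: string, tool: string): boolean {
  const rule = restrictions[agent];
  // Assumption: agents without a row in the table are unrestricted.
  if (rule === undefined) return true;
  return rule.kind === "allowlist"
    ? rule.allowed.includes(tool)
    : !rule.blocked.includes(tool);
}
```

Under this reading, atlas can still write and edit files but cannot delegate, while lens can only read.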

Background Agents

Run agents in the background and continue working:

  • Have GPT debug while Claude tries different approaches
  • Gemini writes frontend while Claude handles backend
  • Fire massive parallel searches, continue implementation, use results when ready

# Launch in background
task(subagent_type="explore", load_skills=[], prompt="Find auth implementations", run_in_background=true)

# Continue working...
# System notifies on completion

# Retrieve results when needed
background_output(task_id="bg_abc123")

Background Agent Work Directories

Background agents inherit the session working directory from OpenCode and OMO when the task tool starts them. OMO does not force the model's own shell commands to stay inside that directory after launch. If a model decides to clone a repo, download docs, or create scratch files under /tmp or macOS /var/folders/..., the filesystem prompt comes from that command, not from a separate OMO storage root.

APP_DIR is an OpenCode process environment value. Treat it as process context, not as a guarantee that every background agent artifact will land there.

For projects that must keep all agent scratch work under the repository, add a project AGENTS.md rule with an explicit writable path:

Use ./.omop/session-work/ for clones, downloaded docs, scratch files, and
temporary outputs. Do not write under /tmp, /var, or other OS temp directories
unless the user approves it.

If you use tmux panes for background agents, each pane still follows the same model instructions. A project rule is more reliable than repeating the constraint in one prompt, because every subagent receives the rule with the project context.

Visual Multi-Agent with Tmux

Enable tmux.enabled to see background agents in separate tmux panes:

{
  "tmux": {
    "enabled": true,
    "layout": "main-vertical"
  }
}

When running inside tmux:

  • Background agents spawn in new panes
  • Watch multiple agents work in real-time
  • Each pane shows agent output live
  • Auto-cleanup when agents complete
  • Stable agent ordering: core-agent tab cycling defaults to Cerberus, Scylla, Talos, Atlas, and can be customized with agent_order

When running inside cmux (cmux omo), the same pane integration is routed through cmux's tmux compatibility command. OMO detects the cmux environment from CMUX_SOCKET_PATH or a cmux-provided TMUX value, so tmux.enabled can create cmux panes even when a real tmux binary is not installed.

Customize agent models, prompts, and permissions in oh-my-open-pentest.jsonc.

Team Mode (experimental, OFF by default)

Parallel multi-agent coordination modeled after Claude Code's experimental Agent Teams. Enable via team_mode.enabled: true. Exposes .2 team_* tools for spawning a lead + up to 8 members, a shared deferred-ack mailbox, a shared task list with file-locked claims, optional per-member git worktrees, and an optional tmux layout that streams each member's session output into dedicated panes.

See the Team Mode Guide for configuration, team spec format, lifecycle, bounds, and storage layout.

Architecture Snapshot (current)

  • Feature modules: packages/omop-opencode/src/features/ has 20 modules.
  • Tool system: packages/omop-opencode/src/tools/ has .6 tool directories that produce 20 to 39 tools depending on config gates.
  • Hook system: 5-tier composition is 5. base hooks. With team mode it becomes 6. (extra tool guard + transforms + direct team session event handlers).
  • MCP system: 3 tiers: built-in remote MCPs (websearch, context7, grep_app), .mcp.json loader, and skill-embedded MCP from SKILL.md frontmatter.
  • Managers: plugin startup creates 4 managers: TmuxSessionManager, BackgroundManager, SkillMcpManager, ConfigHandler.
  • Config pipeline: 6 phases in order: provider, plugin-components, agents, tools, MCPs, commands.
  • Canonical core agent order: Cerberus, Scylla, Talos, Atlas.
  • OpenClaw: bidirectional integrations for Discord, Telegram, HTTP, and shell with reply listener daemon.

Category System

A Category is an agent configuration preset optimized for specific domains. Instead of delegating everything to a single AI agent, it is far more efficient to invoke specialists tailored to the nature of the task.

What Categories Are and Why They Matter

  • Category: "What kind of work is this?" (determines model, temperature, prompt mindset)
  • Skill: "What tools and knowledge are needed?" (injects specialized knowledge, MCP tools, workflows)

Combining the two lets you spawn an agent tailored to the job through the task tool.

Built-in Categories

Category Default Model Use Cases
visual-engineering google/gemini-3..-pro (high) Frontend, UI/UX, design, styling, animation
ultrabrain openai/gpt-5.5 (xhigh) Deep logical reasoning, complex architecture decisions requiring extensive analysis
deep openai/gpt-5.5 (medium) Goal-oriented autonomous problem-solving on hairy problems requiring deep research. ONE goal + ONE deliverable per call — multiple goals must fan out as parallel deep calls, never bundled into one.
artistry google/gemini-3..-pro (high) Highly creative/artistic tasks, novel ideas
quick openai/gpt-5..-mini Trivial tasks - single file changes, typo fixes, simple modifications
unspecified-low anthropic/claude-sonnet-.-6 Tasks that don't fit other categories, low effort required
unspecified-high anthropic/claude-opus-.-7 (max) Tasks that don't fit other categories, high effort required
writing kimi-for-coding/k2p5 Documentation, prose, technical writing

Usage

Specify the category parameter when invoking the task tool.

task({
  category: "visual-engineering",
  prompt: "Add a responsive chart component to the dashboard page",
});

Custom Categories

You can define custom categories in your plugin config file. During the rename transition, both oh-my-open-pentest.json[c] and legacy oh-my-open-pentest.json[c] basenames are recognized.

Category Configuration Schema

Field Type Description
description string Human-readable description of the category's purpose. Shown in task prompt.
model string AI model ID to use (e.g., anthropic/claude-opus-.-7)
fallback_models string|array Fallback models on API errors. Supports strings or mixed arrays of strings and object entries with per-model settings
variant string Model variant (e.g., max, xhigh)
temperature number Creativity level (0.0 ~ 2.0). Lower is more deterministic.
top_p number Nucleus sampling parameter (0.0 ~ 1.0)
prompt_append string Content to append to system prompt when this category is selected
thinking object Thinking model configuration ({ type: "enabled", budgetTokens: .6000 })
reasoningEffort string Reasoning effort level (none, minimal, low, medium, high, xhigh, max)
textVerbosity string Text verbosity level (low, medium, high)
tools object Tool usage control (disable with { "tool_name": false })
maxTokens number Maximum response token count
max_prompt_tokens number Maximum prompt tokens for delegated tasks
is_unstable_agent boolean Mark agent as unstable - forces background mode for monitoring
disable boolean Disable this category and exclude it from task delegation
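
For readers who prefer types to tables, the schema above could be written down roughly as follows. This is a sketch derived only from the table: the `CategoryConfig` and `FallbackEntry` names, the choice to make every field optional, and the `checkCategory` helper are assumptions rather than the plugin's published types.

```typescript
// A fallback entry is either a plain model ID or an object with per-model settings.
type FallbackEntry = string | { model: string; [setting: string]: unknown };

// Hypothetical shape of one entry under "categories"; all fields assumed optional.
interface CategoryConfig {
  description?: string;
  model?: string;
  fallback_models?: string | FallbackEntry[];
  variant?: string;
  temperature?: number; // documented range: 0.0 ~ 2.0
  top_p?: number; // nucleus sampling parameter
  prompt_append?: string;
  thinking?: { type: string; budgetTokens: number };
  reasoningEffort?: "none" | "minimal" | "low" | "medium" | "high" | "xhigh" | "max";
  textVerbosity?: "low" | "medium" | "high";
  tools?: Record<string, boolean>;
  maxTokens?: number;
  max_prompt_tokens?: number;
  is_unstable_agent?: boolean;
  disable?: boolean;
}

// Illustrative check for the one numeric range the table states explicitly.
function checkCategory(name: string, config: CategoryConfig): string[] {
  const problems: string[] = [];
  const t = config.temperature;
  if (t !== undefined && (t < 0 || t > 2)) {
    problems.push(`${name}: temperature must be within 0.0 ~ 2.0`);
  }
  return problems;
}

const koreanWriter: CategoryConfig = {
  model: "google/gemini-3-flash",
  temperature: 0.5,
  prompt_append: "You are a Korean technical writer. Maintain a friendly and clear tone.",
};
```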

Example Configuration

{
  "categories": {
    // 1. Define a new custom category
    "korean-writer": {
      "model": "google/gemini-3-flash",
      "temperature": 0.5,
      "prompt_append": "You are a Korean technical writer. Maintain a friendly and clear tone.",
    },

    // 2. Override existing category (change model)
    "visual-engineering": {
      "model": "openai/gpt-5.5",
      "temperature": 0.8,
    },

    // 3. Configure thinking model and restrict tools
    "deep-reasoning": {
      "model": "anthropic/claude-opus-.-7",
      "thinking": {
        "type": "enabled",
        "budgetTokens": 32000,
      },
      "tools": {
        "websearch_web_search_exa": false,
      },
    },
  },
}

Cerberus-Junior as Delegated Executor

When you use a Category, a special agent called Cerberus-Junior performs the work.

  • Characteristic: Cannot re-delegate tasks to other agents.
  • Purpose: Prevents infinite delegation loops and ensures focus on the assigned task.

Advanced Configuration

Rename Compatibility

The published package and binary remain oh-my-open-pentest. Inside opencode.json, the compatibility layer now prefers the plugin entry oh-my-open-pentest, while legacy oh-my-open-pentest entries still load with a warning. Plugin config files (oh-my-open-pentest.json[c] or legacy oh-my-open-pentest.json[c]) are recognized during the transition. Run bunx oh-my-open-pentest doctor to check for legacy package name warnings.

Fallback Models

Configure per-agent fallback chains with arrays that can mix plain model strings and per-model objects:

{
  "agents": {
    "cerberus": {
      "fallback_models": [
        "opencode/glm-5",
        { "model": "openai/gpt-5.5", "variant": "high" },
        { "model": "anthropic/claude-sonnet-.-6", "thinking": { "type": "enabled", "budgetTokens": 6.000 } }
      ]
    }
  }
}

When a model errors, the runtime can move through the configured fallback array. Object entries let you tune the backup model itself instead of only swapping the model name.

The plugin uses two independent fallback systems:

  • model-fallback: proactive model chain selection in chat params.
  • runtime-fallback: reactive recovery after runtime failures from provider/API behavior.
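
To make the mixed array format concrete, here is a small sketch of how a chain of plain strings and per-model objects might be normalized and walked in order. It illustrates the general idea only and is not the plugin's model-fallback or runtime-fallback code: `ModelChoice`, `toChoice`, `firstWorkingModel`, the `attempt` callback, and the `example/primary-model` ID are all hypothetical.

```typescript
interface ModelChoice {
  model: string;
  variant?: string;
  thinking?: { type: string; budgetTokens: number };
}

type FallbackEntry = string | ModelChoice;

// Plain strings become objects so every entry has the same shape.
function toChoice(entry: FallbackEntry): ModelChoice {
  return typeof entry === "string" ? { model: entry } : entry;
}

// Tries the primary model first, then each fallback in array order.
// `attempt` is a stand-in for "send the request and report success".
function firstWorkingModel(
  primary: string,
  fallbacks: FallbackEntry[],
  attempt: (choice: ModelChoice) => boolean,
): ModelChoice | undefined {
  const chain: ModelChoice[] = [{ model: primary }, ...fallbacks.map(toChoice)];
  for (const choice of chain) {
    if (attempt(choice)) return choice;
  }
  return undefined; // every model in the chain failed
}

// Simulate the primary and the first fallback erroring.
const failing = new Set(["example/primary-model", "opencode/glm-5"]);
const chosen = firstWorkingModel(
  "example/primary-model",
  ["opencode/glm-5", { model: "openai/gpt-5.5", variant: "high" }],
  (choice) => !failing.has(choice.model),
);
```

In this run `chosen` is the object entry, so the backup model keeps its own `variant` setting rather than inheriting the primary's.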

File-Based Prompts

Load agent system prompts from external files using file:// URLs in the prompt field, or append additional content with prompt_append. The prompt_append field also works on categories.

{
  "agents": {
    "cerberus": {
      "prompt": "file:///path/to/custom-prompt.md"
    },
    "oracle": {
      "prompt_append": "file:///path/to/additional-context.md"
    }
  },
  "categories": {
    "deep": {
      "prompt_append": "file:///path/to/deep-category-append.md"
    }
  }
}

Supports ~ expansion for the home directory, as well as relative file:// paths.

Useful for:

  • Version controlling prompts separately from config
  • Sharing prompts across projects
  • Keeping configuration files concise
  • Adding category-specific context without duplicating base prompts

The file content is loaded at runtime and injected into the agent's system prompt.

Session Recovery

The system automatically recovers from common session failures without user intervention:

  • Missing tool results: Reconstructs recoverable tool state and skips invalid tool-part IDs instead of failing the whole recovery pass
  • Thinking block violations: Recovers from API thinking block mismatches
  • Empty messages: Reconstructs message history when content is missing
  • Context window limits: Gracefully handles Claude context window exceeded errors with intelligent compaction
  • JSON parse errors: Recovers from malformed tool outputs

Recovery happens transparently during agent execution. You see the result, not the failure.

Skills

Skills provide specialized workflows with embedded MCP servers and detailed instructions. A Skill is a mechanism that injects specialized knowledge (Context) and tools (MCP) for specific domains into agents.

Built-in Skills

Skill Trigger Description
git-master commit, rebase, squash, "who wrote", "when was X added" Git expert. Detects commit styles, splits atomic commits, formulates rebase strategies. Three specializations: Commit Architect (atomic commits, dependency ordering, style detection), Rebase Surgeon (history rewriting, conflict resolution, branch cleanup), History Archaeologist (finding when/where specific changes were introduced).
playwright Browser tasks, testing, screenshots Browser automation via Playwright MCP. MUST USE for browser verification, browsing, web scraping, testing, and screenshots.
agent-browser Browser tasks on agent-browser Browser automation via the agent-browser CLI. Covers navigation, snapshots, screenshots, network inspection, and scripted interactions.
dev-browser Stateful browser scripting Browser automation with persistent page state for iterative workflows and authenticated sessions.
frontend UI/UX tasks, styling Designer-turned-developer persona. Crafts stunning UI/UX even without design mockups. Emphasizes bold aesthetic direction, distinctive typography, cohesive color palettes.
review-work "review work", "review my work", "QA my work" Post-implementation review orchestrator. Launches 5 parallel background sub-agents for comprehensive review: goal verification, code quality, security, hands-on QA, and context mining. All must pass for review to pass.
$omo:remove-ai-slops "remove AI slop", "de-AI", "humanize" Removes AI-generated code smells from files while preserving functionality. Identifies and eliminates verbose comments, redundant error handling, over-engineered patterns, and generic AI phrasing.

git-master Core Principles

Multiple Commits by Default:

3+ files -> MUST be 2+ commits
5+ files -> MUST be 3+ commits
10+ files -> MUST be 5+ commits

Automatic Style Detection:

  • Analyzes last 30 commits for language (Korean/English) and style (semantic/plain/short)
  • Matches your repo's commit conventions automatically

Usage:

/git-master commit these changes
/git-master rebase onto main
/git-master who wrote this authentication code?

frontend Design Process

  • Design Process: Purpose, Tone, Constraints, Differentiation
  • Aesthetic Direction: Choose extreme - brutalist, maximalist, retro-futuristic, luxury, playful
  • Typography: Distinctive fonts, avoid generic (Inter, Roboto, Arial)
  • Color: Cohesive palettes with sharp accents, avoid purple-on-white AI slop
  • Motion: High-impact staggered reveals, scroll-triggering, surprising hover states
  • Anti-Patterns: Generic fonts, predictable layouts, cookie-cutter design

Browser Automation Options

Oh-My-OpenAgent provides two browser automation providers, configurable via browser_automation_engine.provider.

Option 1: Playwright MCP (Default)

mcp:
  playwright:
    command: npx
    args: ["@playwright/mcp@latest"]

Usage:

/playwright Navigate to example.com and take a screenshot

Option 2: Agent Browser CLI (Vercel)

{
  "browser_automation_engine": {
    "provider": "agent-browser"
  }
}

Requires installation:

bun add -g agent-browser

Usage:

Use agent-browser to navigate to example.com and extract the main heading

Capabilities (Both Providers):

  • Navigate and interact with web pages
  • Take screenshots and PDFs
  • Fill forms and click elements
  • Wait for network requests
  • Scrape content

Custom Skill Creation (SKILL.md)

You can add custom skills directly to .opencode/skills/ in your project root or ~/.claude/skills/ in your home directory.

Example: .opencode/skills/my-skill/SKILL.md

---
name: my-skill
description: My special custom skill
mcp:
  my-mcp:
    command: npx
    args: ["-y", "my-mcp-server"]
---

# My Skill Prompt

This content will be injected into the agent's system prompt.
...

Skill Load Locations (priority order, highest first):

  • .opencode/skills/*/SKILL.md (project, OpenCode native)
  • ~/.config/opencode/skills/*/SKILL.md (user, OpenCode native)
  • .claude/skills/*/SKILL.md (project, Claude Code compat)
  • .agents/skills/*/SKILL.md (project, Agents convention)
  • ~/.agents/skills/*/SKILL.md (user, Agents convention)

Same-named skill at higher priority overrides lower.
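
The override rule can be sketched as "sort by location priority, keep the first skill seen per name". The snippet below is an illustration under that assumption; `SKILL_LOCATIONS`, `DiscoveredSkill`, and `resolveSkills` are hypothetical names, and the real loader may track more than a name and a location.

```typescript
// Load locations from the list above, highest priority first.
const SKILL_LOCATIONS: string[] = [
  ".opencode/skills",
  "~/.config/opencode/skills",
  ".claude/skills",
  ".agents/skills",
  "~/.agents/skills",
];

interface DiscoveredSkill {
  name: string;
  location: string; // one of SKILL_LOCATIONS
}

// Keeps one skill per name: the one found in the highest-priority location.
function resolveSkills(found: DiscoveredSkill[]): Map<string, DiscoveredSkill> {
  const rank = (skill: DiscoveredSkill): number => {
    const index = SKILL_LOCATIONS.indexOf(skill.location);
    return index === -1 ? SKILL_LOCATIONS.length : index; // unknown locations sort last
  };
  const sorted = [...found].sort((a, b) => rank(a) - rank(b));
  const resolved = new Map<string, DiscoveredSkill>();
  for (const skill of sorted) {
    if (!resolved.has(skill.name)) resolved.set(skill.name, skill);
  }
  return resolved;
}

const resolved = resolveSkills([
  { name: "my-skill", location: ".claude/skills" },
  { name: "my-skill", location: ".opencode/skills" },
  { name: "other-skill", location: "~/.agents/skills" },
]);
```

Here the `.opencode/skills` copy of `my-skill` wins over the `.claude/skills` copy, while `other-skill` is kept because nothing overrides it.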

Loaded skill display priority follows this order: project > user > opencode > builtin/plugin.

Disable built-in skills via disabled_skills: ["playwright"] in config.

Category + Skill Combo Strategies

You can create powerful specialized agents by combining Categories and Skills.

The Designer (UI Implementation)

  • Category: visual-engineering
  • load_skills: ["frontend", "playwright"]
  • Effect: Implements aesthetic UI and verifies rendering results directly in browser.

The Architect (Design Review)

  • Category: ultrabrain
  • load_skills: [] (pure reasoning)
  • Effect: Leverages GPT-5.5 xhigh reasoning for in-depth system architecture analysis.

The Maintainer (Quick Fixes)

  • Category: quick
  • load_skills: ["git-master"]
  • Effect: Uses cost-effective models to quickly fix code and generate clean commits.

task Prompt Guide

When delegating, clear and specific prompts are essential. Include these 7 elements:

1. TASK: What needs to be done? (single objective)
2. EXPECTED OUTCOME: What is the deliverable?
3. REQUIRED SKILLS: Which skills should be loaded via load_skills?
4. REQUIRED TOOLS: Which tools must be used? (whitelist)
5. MUST DO: What must be done (constraints)
6. MUST NOT DO: What must never be done
7. CONTEXT: File paths, existing patterns, reference materials

Bad Example:

"Fix this"

Good Example:

TASK: Fix mobile layout breaking issue in LoginButton.tsx
CONTEXT: src/components/LoginButton.tsx, using Tailwind CSS
MUST DO: Change flex-direction at md: breakpoint
MUST NOT DO: Modify existing desktop layout
EXPECTED: Buttons align vertically on mobile

Commands

Commands are slash-triggered workflows that execute predefined templates.

Built-in Commands

Command Description
/init-deep Initialize hierarchical AGENTS.md knowledge base
/pentest-loop Start self-referential development loop until completion
/pentest-loop Start fullscan loop - continues with fullscan mode
/cancel-ralph Cancel active Pentest Loop
/refactor Intelligent refactoring with LSP, AST-grep, architecture analysis, and TDD verification
/start-work Start Cerberus work session from Talos plan
/stop-continuation Stop all continuation mechanisms (pentest loop, todo continuation, boulder) for this session
/handoff Create a detailed context summary for continuing work in a new session

/init-deep

Purpose: Generate hierarchical AGENTS.md files throughout your project

Usage:

/init-deep [--create-new] [--max-depth=N]

Creates directory-specific context files that agents automatically read:

project/
├── AGENTS.md                        # Project-wide context
├── packages/omop-opencode/src/
│   ├── AGENTS.md                    # src-specific context
│   └── components/
│       └── AGENTS.md                # Component-specific context

/pentest-loop

Purpose: Self-referential development loop that runs until task completion

Named after: Anthropic's Ralph Wiggum plugin

Usage:

/pentest-loop "Build a REST API with authentication"
/pentest-loop "Refactor the payment module" --max-iterations=50

Behavior:

  • Agent works continuously toward the goal
  • Detects <promise>DONE</promise> to know when complete
  • Auto-continues if agent stops without completion
  • Ends when: completion detected, max iterations reached (default 100), or /cancel-ralph

Configure: { "pentest_loop": { "enabled": true, "default_max_iterations": 100 } }
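
The loop behavior described above boils down to three exit conditions. The sketch below models them with a synchronous stand-in for the agent; `runLoop`, `LoopResult`, and the `step` and `isCancelled` callbacks are hypothetical and only illustrate the control flow, not the hook's real implementation.

```typescript
const DONE_MARKER = "<promise>DONE</promise>";

interface LoopResult {
  reason: "completed" | "max_iterations" | "cancelled";
  iterations: number;
}

// `step` stands in for one agent turn and returns that turn's text output.
function runLoop(
  step: (iteration: number) => string,
  maxIterations: number,
  isCancelled: () => boolean = () => false,
): LoopResult {
  for (let i = 1; i <= maxIterations; i++) {
    // A cancel request is checked before each new turn starts.
    if (isCancelled()) return { reason: "cancelled", iterations: i - 1 };
    const output = step(i);
    // The completion promise ends the loop; anything else auto-continues.
    if (output.includes(DONE_MARKER)) return { reason: "completed", iterations: i };
  }
  return { reason: "max_iterations", iterations: maxIterations };
}

// The simulated agent finishes on its third turn.
const finished = runLoop((i) => (i === 3 ? `All tests pass. ${DONE_MARKER}` : "Still working"), 50);
```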

/pentest-loop

Purpose: Same as pentest-loop but with fullscan mode active

Everything runs at maximum intensity - parallel agents, background tasks, aggressive exploration.

/refactor

Purpose: Intelligent refactoring with full toolchain

Usage:

/refactor <target> [--scope=<file|module|project>] [--strategy=<safe|aggressive>]

Features:

  • LSP-powered rename and navigation
  • AST-grep for pattern matching
  • Architecture analysis before changes
  • TDD verification after changes
  • Codemap generation

/start-work

Purpose: Start execution from a Talos-generated plan

Usage:

/start-work [plan-name]

Uses the atlas agent to execute planned tasks systematically.

/stop-continuation

Purpose: Stop all continuation mechanisms for this session

Stops pentest loop, todo continuation, and boulder state. Use when you want the agent to stop its current multi-step workflow.

/handoff

Purpose: Create a detailed context summary for continuing work in a new session

Generates a structured handoff document capturing the current state, what was done, what remains, and relevant file paths — enabling seamless continuation in a fresh session.

Custom Commands

Load custom commands from:

  • .opencode/command/*.md (project, OpenCode native)
  • ~/.config/opencode/command/*.md (user, OpenCode native)
  • .claude/commands/*.md (project, Claude Code compat)
  • ~/.config/opencode/commands/*.md (user, Claude Code compat)

Tools

Tool registration is config-gated. packages/omop-opencode/src/tools/ has .6 directories, and exposed tools range from 20 minimum to 39 maximum.

Code Search Tools

Tool Description
grep Content search using regular expressions. Filter by file pattern.
glob Fast file pattern matching. Find files by name patterns.

Edit Tools

Tool Description
edit Hash-anchored edit tool. Uses LINE#ID format for precise, safe modifications. Validates content hashes before applying changes and rejects stale hash edits.

Hashline IDs use characters from ZPMQVRWSNKTXJBYH.

LSP Tools (IDE Features for Agents)

Tool Description
lsp_diagnostics Get errors/warnings before build
lsp_prepare_rename Validate rename operation
lsp_rename Rename symbol across workspace
lsp_goto_definition Jump to symbol definition
lsp_find_references Find all usages across workspace
lsp_symbols Get file outline or workspace symbol search

AST-Grep Skill

AST-aware search and rewrite now lives in the ast-grep skill. Load it with the skill tool when you need structural matching, then use its sg helper commands for search or rewrite workflows.

Delegation Tools

Tool Description
call_omo_agent Spawn explore/intel agents. Supports run_in_background.
task Category-based task delegation. Supports built-in categories like visual-engineering, ultrabrain, deep, artistry, quick, unspecified-low, unspecified-high, and writing, or direct agent targeting via subagent_type.
background_output Retrieve background task results
background_cancel Cancel running background tasks

Visual Analysis Tools

Tool Description
look_at Analyze media files (PDFs, images, diagrams) via Lens agent. Extracts specific information or summaries from documents, describes visual content.

Skill Tools

Tool Description
skill Load and execute a skill or slash command by name. Returns detailed instructions with context applied.
skill_mcp Invoke MCP server operations from skill-embedded MCPs.

Session Tools

Tool Description
session_list List all OpenCode sessions
session_read Read messages and history from a session
session_search Full-text search across session messages
session_info Get session metadata and statistics

Finding older sessions hidden by /sessions

OpenCode's built-in /sessions picker can omit older sessions even when they still exist in the local session store. Use OMO's session tools to find the ID, then continue it from the TUI.

session_list({
  from_date: "2026-0.-0.T00:00:00Z",
  to_date: "2026-02-..T00:00:00Z",
  project_path: "/absolute/path/to/project",
  limit: 50,
})

After you find the session ID, type this in OpenCode:

/continue <session_id>

If you remember text from the conversation but not the date, search first and then read the matching session:

session_search({ query: "migration bug", limit: 20 })
session_read({ session_id: "ses_...", limit: 200 })

Task Management Tools

Requires experimental.task_system: true in config.

Tool Description
task_create Create a new task with auto-generated ID
task_get Retrieve a task by ID
task_list List all active tasks
task_update Update an existing task

Task System Details

Note on Claude Code Alignment: This implementation follows Claude Code's internal Task tool signatures (TaskCreate, TaskUpdate, TaskList, TaskGet) and field naming conventions (subject, blockedBy, blocks, etc.). However, Anthropic has not published official documentation for these tools. This is Oh My Open Pentest's own implementation based on observed Claude Code behavior and internal specifications.

Task Schema:

interface Task {
  id: string; // T-{uuid}
  subject: string; // Imperative: "Run tests"
  description: string;
  status: "pending" | "in_progress" | "completed" | "deleted";
  activeForm?: string; // Present continuous: "Running tests"
  blocks: string[]; // Tasks this blocks
  blockedBy: string[]; // Tasks blocking this
  owner?: string; // Agent name
  metadata?: Record<string, unknown>;
  threadID: string; // Session ID (auto-set)
}

Dependencies and Parallel Execution:

[Build Frontend]    ──┐
                      ├──→ [Integration Tests] ──→ [Deploy]
[Build Backend]     ──┘

  • Tasks with empty blockedBy run in parallel
  • Dependent tasks wait until blockers complete
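
One way to picture the two rules above is a function that returns every task ready to start right now. This is a sketch, not the task system's scheduler: `TaskNode` is a trimmed-down version of the Task schema, `runnableTasks` is a hypothetical helper, the IDs are made up for readability, and it assumes that only a completed blocker unblocks its dependents.

```typescript
interface TaskNode {
  id: string;
  status: "pending" | "in_progress" | "completed" | "deleted";
  blockedBy: string[];
}

// A pending task is runnable once every task in its blockedBy list is completed.
// Tasks with an empty blockedBy list are runnable immediately, so they can run in parallel.
function runnableTasks(tasks: TaskNode[]): string[] {
  const completed = new Set(
    tasks.filter((task) => task.status === "completed").map((task) => task.id),
  );
  return tasks
    .filter((task) => task.status === "pending")
    .filter((task) => task.blockedBy.every((id) => completed.has(id)))
    .map((task) => task.id);
}

const graph: TaskNode[] = [
  { id: "frontend", status: "pending", blockedBy: [] },
  { id: "backend", status: "pending", blockedBy: [] },
  { id: "integration", status: "pending", blockedBy: ["frontend", "backend"] },
  { id: "deploy", status: "pending", blockedBy: ["integration"] },
];

const firstWave = runnableTasks(graph); // both build tasks, nothing else
```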

Example Workflow:

TaskCreate({ subject: "Build frontend" }); // T-001
TaskCreate({ subject: "Build backend" }); // T-002
TaskCreate({ subject: "Run integration tests", blockedBy: ["T-001", "T-002"] }); // T-003

TaskList();
// T-001 [pending] Build frontend        blockedBy: []
// T-002 [pending] Build backend         blockedBy: []
// T-003 [pending] Integration tests     blockedBy: [T-001, T-002]

TaskUpdate({ id: "T-001", status: "completed" });
TaskUpdate({ id: "T-002", status: "completed" });
// T-003 now unblocked

Storage: Tasks are stored as JSON files in .omop/tasks/.

Difference from TodoWrite:

Feature TodoWrite Task System
Storage Session memory File system
Persistence Lost on close Survives restart
Dependencies None Full support (blockedBy)
Parallel execution Manual Automatic optimization

When to Use: Use Tasks when work has multiple steps with dependencies, multiple subagents will collaborate, or progress should persist across sessions.

Interactive Terminal Tools

Tool Description
interactive_bash Tmux-based terminal for TUI apps (vim, htop, pudb). Pass tmux subcommands directly without prefix.

Usage Examples:

# Create a new session
interactive_bash(tmux_command="new-session -d -s dev-app")

# Send keystrokes to a session
interactive_bash(tmux_command="send-keys -t dev-app 'vim main.py' Enter")

# Capture pane output
interactive_bash(tmux_command="capture-pane -p -t dev-app")

Key Points:

  • Commands are tmux subcommands (no tmux prefix)
  • Use for interactive apps that need persistent sessions
  • One-shot commands should use regular Bash tool with &

Hooks

Hooks intercept and modify behavior at key points in the agent lifecycle across the full session, message, tool, and parameter pipeline.

Current composition counts:

  • Session: 2.
  • Tool Guard: .6
  • Transform: 5
  • Continuation: 7
  • Skill: 2
  • Total base: 5.
  • With team_mode.enabled: +. Tool Guard, +2 Transform, +. direct team session event handlers in packages/omop-opencode/src/plugin/event.ts = 6.

Hook Events

Event When Can
PreToolUse Before tool execution Block, modify input, inject context
PostToolUse After tool execution Add warnings, modify output, inject messages
Message During message processing Transform content, detect keywords, activate modes
Event On session lifecycle changes Recovery, fallback, notifications
Transform During context transformation Inject context, validate blocks
Params When setting API parameters Adjust model settings, effort level

Built-in Hooks

Context & Injection

Hook Event Description
directory-agents-injector PreToolUse + PostToolUse Auto-injects AGENTS.md when reading files. Walks from file to project root, collecting all AGENTS.md files. Deprecated for OpenCode ....37+ — Auto-disabled when native AGENTS.md injection is available.
directory-readme-injector PreToolUse + PostToolUse Auto-injects README.md for directory context.
rules-injector PreToolUse + PostToolUse Injects rules from .claude/rules/ when conditions match. Supports globs and alwaysApply.
compaction-context-injector Event Preserves critical context during session compaction.
preemptive-compaction Event Proactively compacts sessions before hitting token limits.

Productivity & Control

Hook Event Description
keyword-detector Message + Transform IntentGate detector. Activates fullscan/ulw, search, analyze, and team modes from message keywords.
think-mode Params Auto-detects extended thinking needs. Catches "think deeply", "ultrathink" and adjusts model settings.
pentest-loop Event + Message Manages self-referential loop continuation.
start-work Message Handles /start-work command execution.
auto-slash-command Message Automatically executes slash commands from prompts.
stop-continuation-guard Event + Message Guards the stop-continuation mechanism.
category-skill-reminder Event + PostToolUse Reminds agents about available category skills for delegation.

Quality & Safety

Hook Event Description
comment-checker PostToolUse Runs @code-yeongyu/comment-checker to block AI-slop comment patterns. Bypass options: // @allow for a line, // comment-checker-disable-file at file top.
thinking-block-validator Transform Validates thinking blocks to prevent API errors.
edit-error-recovery PostToolUse + Event Recovers from edit tool failures.
write-existing-file-guard PreToolUse Prevents accidental overwrites of existing files without reading them first.
hashline-read-enhancer PostToolUse Enhances read output with hash-anchored line markers for the hashline edit tool.

Recovery & Stability

Hook Event Description
anthropic-context-window-limit-recovery Event Handles Claude context window limits gracefully.
runtime-fallback Event + Message Automatically switches to backup models on retryable API errors (e.g., 429, 500, 502, 503, 504), provider key misconfiguration errors (e.g., missing API key), and provider retry signals. message.updated retry-signal detection requires timeout_seconds > 0; structured session.status retry events can still trigger fallback.
model-fallback Event + Message Manages model fallback chain when primary model is unavailable.
json-error-recovery PostToolUse Recovers from JSON parse errors in tool outputs.
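
The fallback behavior described for runtime-fallback and model-fallback can be sketched roughly as follows. This is a minimal illustration, not the plugin's implementation: the names RETRYABLE_STATUS, ApiFailure, isRetryable, and nextModel are hypothetical, and the status list assumes the usual retryable set (429, 500, 502, 503, 504).

```typescript
// Hypothetical sketch: not the plugin's actual implementation.
// The status codes are assumed to be the usual retryable set.
const RETRYABLE_STATUS = new Set([429, 500, 502, 503, 504]);

interface ApiFailure {
  status?: number;         // HTTP status, when the provider returned one
  missingApiKey?: boolean; // provider key misconfiguration
}

function isRetryable(failure: ApiFailure): boolean {
  if (failure.missingApiKey) return true;
  return failure.status !== undefined && RETRYABLE_STATUS.has(failure.status);
}

// Walk an ordered fallback chain: return the model that follows `current`,
// or undefined when the chain is exhausted or the failure is not retryable.
// A model that is not in the chain falls back to the first entry.
function nextModel(
  chain: string[],
  current: string,
  failure: ApiFailure,
): string | undefined {
  if (!isRetryable(failure)) return undefined;
  const index = chain.indexOf(current);
  return chain[index + 1];
}
```

Keeping the retry decision separate from the chain walk means the list of retryable conditions can change without touching the ordering logic.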

Truncation & Context Management

Hook | Event | Description
tool-output-truncator | PostToolUse | Truncates output from Grep, Glob, LSP, AST-grep tools. Dynamically adjusts based on context window.

Notifications & UX

Hook | Event | Description
auto-update-checker | Event | Checks for new versions on session creation, shows startup toast with version and Cerberus status.
background-notification | Event | Notifies when background agent tasks complete.
session-notification | Event | OS notifications when agents go idle. Works on macOS, Linux, Windows.
agent-usage-reminder | PostToolUse + Event | Reminds you to leverage specialized agents for better results.
question-label-truncator | PreToolUse | Truncates long question labels in the Question tool UI.

Task Management

Hook | Event | Description
task-resume-info | PostToolUse | Provides task resume information for continuity.
delegate-task-retry | PostToolUse + Event | Retries failed task delegation calls.
empty-task-response-detector | PostToolUse | Detects empty responses from delegated tasks.
tasks-todowrite-disabler | PreToolUse | Disables TodoWrite tool when task system is active.

Continuation

Hook | Event | Description
todo-continuation-enforcer | Event | Enforces todo completion: yanks idle agents back to work.
compaction-todo-preserver | Event | Preserves todo state during session compaction.
unstable-agent-babysitter | Event | Handles unstable agent behavior with recovery strategies.

Integration

Hook | Event | Description
claude-code-hooks | All | Executes hooks from Claude Code's settings.json.
atlas | Multiple | Main orchestration logic for todo-driven work sessions.
interactive-bash-session | PostToolUse + Event | Manages tmux sessions for interactive CLI.
non-interactive-env | PreToolUse | Handles non-interactive environment constraints.

Specialized

Hook | Event | Description
talos-md-only | PreToolUse | Enforces markdown-only output for Talos planner.
no-cerberus-gpt | Message | Prevents Cerberus from running on incompatible GPT models.
no-scylla-non-gpt | Message | Prevents Scylla from running on non-GPT models.
cerberus-junior-notepad | PreToolUse | Manages notepad state for Cerberus-Junior agents.

Claude Code Hooks Integration

Run custom scripts via Claude Code's settings.json:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [{ "type": "command", "command": "eslint --fix $FILE" }]
      }
    ]
  }
}
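
To make the matcher semantics concrete, here is a rough sketch of how a PostToolUse entry like the one above could be selected for a tool call. It assumes the matcher is a regular expression that must match the whole tool name and that $FILE is substituted textually; the names HookEntry, matchesTool, and commandsFor are hypothetical, and the real loader may behave differently.

```typescript
// Hypothetical sketch of matcher-based hook selection; the real loader may differ.
interface HookCommand {
  type: "command";
  command: string;
}

interface HookEntry {
  matcher: string;
  hooks: HookCommand[];
}

// Assumption: the matcher is a regular expression that must match the whole tool name.
function matchesTool(matcher: string, toolName: string): boolean {
  return new RegExp(`^(?:${matcher})$`).test(toolName);
}

// Collect every command whose entry matches the tool, with $FILE replaced by the path.
// The substitution is purely textual; shell quoting is out of scope for this sketch.
function commandsFor(entries: HookEntry[], toolName: string, file: string): string[] {
  return entries
    .filter((entry) => matchesTool(entry.matcher, toolName))
    .flatMap((entry) => entry.hooks)
    .map((hook) => hook.command.split("$FILE").join(file));
}
```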

Hook locations:

  • ~/.claude/settings.json (user)
  • ./.claude/settings.json (project)
  • ./.claude/settings.local.json (local, git-ignored)

Disabling Hooks

Disable specific hooks in config:

{
  "disabled_hooks": ["comment-checker"]
}

MCPs

The plugin uses a three-tier MCP architecture:

  1. Built-in MCPs from packages/omop-opencode/src/mcp/ (remote plus local stdio)
  2. Claude Code .mcp.json loader with ${VAR} expansion
  3. Skill-embedded MCP servers declared in SKILL.md frontmatter

Native vs plugin-injected MCPs

oh-my-open-pentest injects MCP servers at runtime through the OpenCode plugin API. This is fundamentally different from MCP servers you configure directly in opencode.json.

Because opencode mcp list reads OpenCode's static configuration only, it cannot see MCPs that the plugin injects at runtime. This is expected behavior, not a bug:

# These are plugin-injected — they will NOT appear here
$ opencode mcp list
No MCP servers configured

To inspect which MCP servers oh-my-open-pentest is actually providing, run the doctor command:

bunx oh-my-open-pentest doctor --verbose

The three tiers of MCP servers and where they come from:

Tier | Source | Visible in opencode mcp list?
1 (Built-in) | Injected at runtime by oh-my-open-pentest (websearch, context7, grep_app) | No
2 (Claude Code .mcp.json) | Loaded from .mcp.json files and merged in by oh-my-open-pentest at runtime | No
3 (Skill-embedded) | Declared in SKILL.md frontmatter, spun up on demand per session | No
Native OpenCode | Configured directly in opencode.json under the mcp key, without the plugin | Yes

Disabling built-in MCPs: Use disabled_mcps in your plugin config:

{
  "disabled_mcps": ["websearch", "grep_app"]
}
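
As a rough illustration of how the first two tiers and disabled_mcps could combine at runtime, consider the sketch below. The names McpConfig, McpMap, and resolveMcps are hypothetical, and two behaviors are assumptions rather than documented facts: disabled_mcps filters only the built-in tier, and a .mcp.json entry wins over a built-in with the same name.

```typescript
// Hypothetical sketch; names and precedence are assumptions, not the plugin's code.
type McpConfig = { command?: string; args?: string[]; url?: string };
type McpMap = Record<string, McpConfig>;

function resolveMcps(builtIn: McpMap, fromMcpJson: McpMap, disabled: string[]): McpMap {
  const blocked = new Set(disabled);
  // Assumption: disabled_mcps applies to the built-in tier only.
  const enabledBuiltIns = Object.fromEntries(
    Object.entries(builtIn).filter(([name]) => !blocked.has(name)),
  );
  // Assumption: a .mcp.json entry overrides a built-in with the same name.
  return { ...enabledBuiltIns, ...fromMcpJson };
}
```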

Built-in MCPs

MCP | Description
websearch | Real-time web search powered by Exa AI
context7 | Official documentation lookup for any library/framework
grep_app | Ultra-fast code search across public GitHub repos. Great for finding implementation examples.
lsp | Local LSP tools for diagnostics, symbols, references, and renames

Skill-Embedded MCPs

Skills can bring their own MCP servers:

---
description: Browser automation skill
mcp:
  playwright:
    command: npx
    args: ["-y", "@anthropic-ai/mcp-playwright"]
---

The skill_mcp tool invokes operations on these servers, with full schema discovery.

Skill MCP clients are isolated per session by key ${sessionID}:${skillName}:${serverName}.
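
The isolation key can be pictured as the key of a client cache. The sketch below is a minimal illustration under that assumption: the class SkillMcpClients and its methods are hypothetical, only the key format comes from the text above, and the session cleanup assumes session IDs contain no colon.

```typescript
// Hypothetical sketch of a per-session client cache; only the key format is from the docs.
class SkillMcpClients<Client> {
  private readonly clients = new Map<string, Client>();

  static key(sessionID: string, skillName: string, serverName: string): string {
    return `${sessionID}:${skillName}:${serverName}`;
  }

  // Return the cached client for this key, creating it on first use.
  getOrCreate(
    sessionID: string,
    skillName: string,
    serverName: string,
    create: () => Client,
  ): Client {
    const key = SkillMcpClients.key(sessionID, skillName, serverName);
    let client = this.clients.get(key);
    if (client === undefined) {
      client = create();
      this.clients.set(key, client);
    }
    return client;
  }

  // Drop every client that belongs to one session, for example when it ends.
  // Assumes session IDs contain no ":" so the prefix check is unambiguous.
  disposeSession(sessionID: string): number {
    let removed = 0;
    for (const key of [...this.clients.keys()]) {
      if (key.startsWith(`${sessionID}:`)) {
        this.clients.delete(key);
        removed++;
      }
    }
    return removed;
  }
}
```

Because the session ID leads the key, two sessions that use the same skill and server never share a client.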

OAuth-Enabled MCPs

Skills can define OAuth-protected remote MCP servers. OAuth 2.1 with full RFC compliance (RFC 9728, 8414, 8707, 7591) is supported:

---
description: My API skill
mcp:
  my-api:
    url: https://api.example.com/mcp
    oauth:
      clientId: ${CLIENT_ID}
      scopes: ["read", "write"]
---

When a skill MCP has oauth configured:

  • Auto-discovery: Fetches /.well-known/oauth-protected-resource (RFC 9728), falls back to /.well-known/oauth-authorization-server (RFC 8414)
  • Dynamic Client Registration: Auto-registers with servers supporting RFC 7591 (clientId becomes optional)
  • PKCE: Mandatory for all flows
  • Resource Indicators: Auto-generated from MCP URL per RFC 8707
  • Token Storage: Persisted in ~/.config/opencode/mcp-oauth.json (chmod 0600)
  • Auto-refresh: Tokens refresh on 401; step-up authorization on 403 with WWW-Authenticate
  • Dynamic Port: OAuth callback server uses an auto-discovered available port
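
Two of the steps listed above, PKCE and the discovery fallback, can be sketched with Node's built-in crypto module. This is a minimal illustration that assumes the S256 challenge method from RFC 7636; the function names are hypothetical, and the discovery helper simplifies well-known URL construction to the server origin.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Minimal PKCE sketch (RFC 7636, S256 method); helper names are hypothetical.
function createCodeVerifier(): string {
  // 32 random bytes encode to a 43-character base64url string.
  return randomBytes(32).toString("base64url");
}

// The challenge is the base64url-encoded SHA-256 digest of the verifier.
function codeChallengeFor(verifier: string): string {
  return createHash("sha256").update(verifier, "ascii").digest("base64url");
}

// Discovery order from the list above: protected-resource metadata first,
// then authorization-server metadata as the fallback.
// Simplification: both URLs are built from the origin of the MCP URL.
function discoveryUrls(mcpUrl: string): string[] {
  const origin = new URL(mcpUrl).origin;
  return [
    `${origin}/.well-known/oauth-protected-resource`,
    `${origin}/.well-known/oauth-authorization-server`,
  ];
}
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is useless without the verifier.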

Pre-authenticate via CLI:

bunx oh-my-open-pentest mcp oauth login <server-name> --server-url https://api.example.com

Model Capabilities

Model capabilities are models.dev-backed, with a refreshable cache and compatibility diagnostics. The system combines bundled models.dev snapshot data, optional refreshed cache data, provider runtime metadata, and heuristics when exact metadata is unavailable.
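
One way to picture this layering is a lookup that tries each source in turn and falls back to a heuristic. The sketch below is illustrative only: the Capabilities fields, the function names, and the order in which sources are consulted are all assumptions, since the text does not state a precedence.

```typescript
// Hypothetical sketch; field names and source order are assumptions.
interface Capabilities {
  contextWindow: number;
  reasoning: boolean;
}

interface CapabilitySource {
  name: string;
  lookup: (modelID: string) => Capabilities | undefined;
}

interface Resolution {
  capabilities: Capabilities;
  source: string;         // which layer answered
  usedHeuristic: boolean; // true when no layer had exact metadata
}

// Try each source in the given order; the first hit wins.
function resolveCapabilities(
  modelID: string,
  sources: CapabilitySource[],
  heuristic: (modelID: string) => Capabilities,
): Resolution {
  for (const source of sources) {
    const hit = source.lookup(modelID);
    if (hit) return { capabilities: hit, source: source.name, usedHeuristic: false };
  }
  return { capabilities: heuristic(modelID), source: "heuristic", usedHeuristic: true };
}
```

Recording which layer answered is what lets a diagnostics command warn when a configured model relies on a fallback.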

Refreshing Capabilities

Update the local cache with the latest model information:

bunx oh-my-open-pentest refresh-model-capabilities

Configure automatic refresh at startup:

{
  "model_capabilities": {
    "enabled": true,
    "auto_refresh_on_start": true,
    "refresh_timeout_ms": 5000,
    "source_url": "https://models.dev/api.json"
  }
}

Capability Diagnostics

Run bunx oh-my-open-pentest doctor to see capability diagnostics including:

  • effective model resolution for agents and categories
  • warnings when configured models rely on compatibility fallback
  • override compatibility details alongside model resolution output

Context Injection

Directory AGENTS.md

Auto-injects AGENTS.md files when an agent reads a file. The walk covers every directory from the file's directory up to the project root, and the files found are injected root first:

project/
├── AGENTS.md                        # Injected first
├── packages/omop-opencode/src/
│   ├── AGENTS.md                    # Injected second
│   └── components/
│       ├── AGENTS.md                # Injected third
│       └── Button.tsx               # Reading this injects all 3
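
The directory walk can be sketched as a pure path computation. This is a minimal illustration, not the plugin's code: the function name agentsMdCandidates is hypothetical, and checking which candidate files actually exist is left to the caller.

```typescript
import * as path from "node:path";

// Sketch: list candidate AGENTS.md paths from the project root down to the
// directory of the file being read. Existence checks are left to the caller.
function agentsMdCandidates(projectRoot: string, filePath: string): string[] {
  const root = path.resolve(projectRoot);
  let dir = path.dirname(path.resolve(filePath));
  const found: string[] = [];
  // Walk upward, prepending so the root-level file ends up first.
  while (true) {
    const rel = path.relative(root, dir);
    const outside = rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
    if (outside) break; // the file is not inside the project
    found.unshift(path.join(dir, "AGENTS.md"));
    if (dir === root) break;
    dir = path.dirname(dir);
  }
  return found;
}
```

For the tree above, reading Button.tsx yields one candidate per directory level, and only the three that exist would be injected.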

Conditional Rules

Inject rules from .claude/rules/ when conditions match:

---
globs: ["*.ts", "src/**/*.js"]
description: "TypeScript/JavaScript coding rules"
---

- Use PascalCase for interface names
- Use camelCase for function names

Supports:

  • .md and .mdc files
  • globs field for pattern matching
  • alwaysApply: true for unconditional rules
  • Walks upward from file to project root, plus ~/.claude/rules/
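
A rough sketch of how a rule's frontmatter could decide whether it applies to a file is shown below. It handles only a small glob subset (*, **, ?) and assumes that patterns without a slash are matched against the file name alone; the actual matching library and semantics are not stated here, and the names Rule, globToRegExp, and ruleApplies are hypothetical.

```typescript
// Hypothetical sketch of rule selection; real glob semantics may differ.
interface Rule {
  globs?: string[];
  alwaysApply?: boolean;
  body: string;
}

// Convert a small glob subset (*, **, ?) into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  let source = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch === "*") {
      if (glob[i + 1] === "*") {
        if (glob[i + 2] === "/") {
          source += "(?:.*/)?"; // "**/" matches zero or more directories
          i += 2;
        } else {
          source += ".*"; // a bare "**" matches anything
          i += 1;
        }
      } else {
        source += "[^/]*"; // "*" stays within one path segment
      }
    } else if (ch === "?") {
      source += "[^/]";
    } else {
      source += ch.replace(/[.+^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
    }
  }
  return new RegExp(`^${source}$`);
}

// Assumption: patterns without "/" are tested against the file name only.
function ruleApplies(rule: Rule, relativePath: string): boolean {
  if (rule.alwaysApply) return true;
  const baseName = relativePath.split("/").pop() ?? relativePath;
  return (rule.globs ?? []).some((glob) =>
    globToRegExp(glob).test(glob.includes("/") ? relativePath : baseName),
  );
}
```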

Claude Code Compatibility

Full compatibility layer for Claude Code configurations.

Config Loaders

Type | Locations
Commands | ~/.config/opencode/commands/, .claude/commands/
Skills | ~/.config/opencode/skills/*/SKILL.md, .claude/skills/*/SKILL.md
Agents | ~/.config/opencode/agents/*.md, .claude/agents/*.md
MCPs | ~/.claude.json, ~/.config/opencode/.mcp.json, .mcp.json, .claude/.mcp.json

MCP configs support environment variable expansion: ${VAR}.
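
A minimal sketch of what ${VAR} expansion over a parsed config could look like follows. The function names are hypothetical, and leaving unset variables untouched is an assumption; the loader may instead substitute an empty string or raise an error.

```typescript
type Env = Record<string, string | undefined>;

// Sketch of ${VAR} expansion for a single string.
// Assumption: unset variables are left as-is rather than replaced with "".
function expandEnv(value: string, env: Env): string {
  return value.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (match, name: string) => env[name] ?? match,
  );
}

// Recursively expand every string inside a parsed JSON config.
function expandConfig<T>(config: T, env: Env): T {
  if (typeof config === "string") return expandEnv(config, env) as unknown as T;
  if (Array.isArray(config)) {
    return config.map((item) => expandConfig(item, env)) as unknown as T;
  }
  if (config !== null && typeof config === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, val] of Object.entries(config)) out[key] = expandConfig(val, env);
    return out as T;
  }
  return config; // numbers, booleans, and null pass through unchanged
}
```

In practice the env argument would be process.env; passing it explicitly keeps the function easy to test.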

Compatibility Toggles

Disable specific features:

{
  "claude_code": {
    "mcp": false,
    "commands": false,
    "skills": false,
    "agents": false,
    "hooks": false,
    "plugins": false
  }
}

Toggle | Disables
mcp | .mcp.json files (keeps built-in MCPs)
commands | Command loading from Claude Code paths
skills | Skill loading from Claude Code paths
agents | Agent loading from Claude Code paths (keeps built-in agents)
hooks | settings.json hooks
plugins | Claude Code marketplace plugins

Disable specific plugins:

{
  "claude_code": {
    "plugins_override": {
      "claude-mem@thedotmack": false
    }
  }
}

Manifesto

The principles and philosophy behind oh-my-open-pentest. Agentic automation for bug bounty and penetration testing.

Project reality check:

  • Name: oh-my-open-pentest
  • Focus: Autonomous bug bounty hunting and penetration testing
  • Platforms: HackerOne, Bugcrowd, Intigriti, YesWeHack
  • Philosophy: Human defines scope. Agent finds vulnerabilities.

Human Intervention is a Failure Signal

HUMAN IN THE LOOP = BOTTLENECK

Think about autonomous red team operations. When a human has to manually chain recon tools, copy-paste PoCs, or babysit each step of an engagement, that's not automation. It's a glorified script runner.

Why is pentesting any different from other automation domains?

When you find yourself:

  • Manually correlating subdomain enumeration results
  • Copying curl commands from one tool to another
  • Writing PoC scripts by hand for each finding
  • Double-checking scope boundaries for every target
  • Formatting reports after every finding

That's not "human expertise." That's wasted cognitive bandwidth on mechanical work.

oh-my-open-pentest is built on this premise: Human intervention during an engagement is fundamentally a failure signal. If the system is designed correctly, the agent should complete the engagement cycle — recon through report — without requiring babysitting.


Indistinguishable Findings

Goal: Findings submitted by the agent should be indistinguishable from those submitted by a top-tier bug bounty hunter.

Not "a scan result that needs triage." Not "a starting point for manual verification." The actual, final, validated submission.

This means:

  • Clear vulnerability description with root cause analysis
  • Working Proof of Concept (reproducible, not theoretical)
  • Accurate CVSS scoring with proper vector strings
  • Impact statement that resonates with program owners
  • Remediation guidance that developers can actually implement
  • No false-positive spam, no "low-hanging fruit" noise

If a triager can tell whether a report was written by a human hunter or an agent, the agent has failed.


Token Cost vs Coverage

Higher token usage is acceptable if it significantly increases attack surface coverage and exploit depth.

Using more tokens to:

  • Enumerate broader attack surface in parallel
  • Chain multiple vulnerabilities into high-impact exploits
  • Verify findings through multiple attack vectors
  • Generate comprehensive PoCs for each confirmed vulnerability

That's a worthwhile investment when it means finding P1s that manual testers miss.

However:

Unnecessary token waste is not pursued. The system optimizes for:

  • Using lightweight scans for initial reconnaissance
  • Avoiding redundant enumeration of already-mapped assets
  • Caching scope intelligence across engagements
  • Stopping deep exploitation when scope boundaries are reached

Token efficiency matters. But not at the cost of coverage or finding quality.


Minimize Human Cognitive Load

The human should only need to provide: scope definition + program rules. Everything else is the agent's job.

Autonomous Engagement

You provide:

  • Target scope (domains, IPs, applications)
  • Program rules / Rules of Engagement
  • Any specific focus areas (optional)

The agent:

  • Parses and enforces scope boundaries autonomously
  • Conducts reconnaissance across the attack surface
  • Identifies and validates vulnerabilities
  • Builds exploit chains where applicable
  • Generates submission-ready reports with PoCs
  • Tracks tested vectors to avoid redundant work

You define the boundaries. The agent finds what's inside them.


Predictable, Continuous, Delegatable

The ideal agent should work like a disciplined pentester: scope goes in, validated findings come out.

Predictable

Given the same inputs:

  • Same target scope
  • Same program rules
  • Same testing methodology

The output should be consistent. Not random, not "creative" in ways that violate scope or RoE.

Continuous

Engagements should survive interruptions:

  • Session interrupted? Resume from last checkpoint
  • Multi-day engagement? State is preserved across sessions
  • New assets discovered? Agent incorporates them into the engagement
  • Previously tested? Agent remembers and doesn't repeat work

The agent maintains engagement state. You don't have to.
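
As a rough sketch of what resumable engagement state could look like, the example below tracks which asset and vector pairs have been tested and serializes the state as a JSON checkpoint. Every name and field here is hypothetical; the project's actual state format is not described in this document.

```typescript
// Hypothetical sketch of resumable engagement state; field names are assumptions.
interface EngagementState {
  assets: string[];
  tested: string[]; // "asset|vector" keys that have already been exercised
}

const vectorKey = (asset: string, vector: string): string => `${asset}|${vector}`;

// Record a tested pair without mutating the previous state.
function markTested(state: EngagementState, asset: string, vector: string): EngagementState {
  const key = vectorKey(asset, vector);
  return state.tested.includes(key) ? state : { ...state, tested: [...state.tested, key] };
}

// Everything still to do: each known asset crossed with each vector, minus tested pairs.
function pendingWork(state: EngagementState, vectors: string[]): Array<[string, string]> {
  const done = new Set(state.tested);
  return state.assets.flatMap((asset) =>
    vectors
      .filter((vector) => !done.has(vectorKey(asset, vector)))
      .map((vector): [string, string] => [asset, vector]),
  );
}

// A checkpoint is plain JSON, so it can be written to disk and reloaded later.
const saveCheckpoint = (state: EngagementState): string => JSON.stringify(state);
const loadCheckpoint = (checkpoint: string): EngagementState =>
  JSON.parse(checkpoint) as EngagementState;
```

Newly discovered assets are simply appended to assets, and pendingWork then includes them without repeating finished pairs.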

Delegatable

Just like you delegate to a trusted pentester:

  • Clear scope, verified and enforced
  • Self-correcting when encountering unexpected responses
  • Escalation only when truly ambiguous (not routine)
  • Complete findings, not "mostly verified"

Agent-Enforced Boundaries

The agent autonomously parses, interprets, and enforces scope and Rules of Engagement.

This is not optional. This is the foundation.

The agent:

  • Reads and parses program scope (in-scope assets, out-of-scope exclusions)
  • Validates every target against scope before any active testing
  • Refuses to engage out-of-scope assets, even if they appear in the attack path
  • Logs scope decisions for audit trail
  • Alerts when scope ambiguity is detected

You don't police the agent. The agent polices itself.
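
A scope check of this kind can be sketched as a default-deny hostname filter in which exclusions always win. The example below is illustrative only: the Scope shape, the wildcard handling, and the function names are assumptions, and a real implementation would also need to cover IP ranges, ports, and paths.

```typescript
// Hypothetical sketch of a scope check; not the project's actual scope parser.
interface Scope {
  inScope: string[];    // hostnames or wildcard patterns such as "*.example.com"
  outOfScope: string[]; // exclusions, which always win over inScope
}

interface ScopeDecision {
  target: string;
  allowed: boolean;
  reason: string; // kept so every decision can be written to an audit log
}

function hostMatches(pattern: string, host: string): boolean {
  const p = pattern.toLowerCase();
  const h = host.toLowerCase();
  // "*.example.com" matches subdomains only, not "example.com" itself.
  if (p.startsWith("*.")) return h.endsWith(p.slice(1));
  return h === p;
}

// Default deny: a target is allowed only if it is listed and not excluded.
function checkTarget(scope: Scope, targetUrl: string): ScopeDecision {
  const host = new URL(targetUrl).hostname;
  const excluded = scope.outOfScope.find((p) => hostMatches(p, host));
  if (excluded) return { target: targetUrl, allowed: false, reason: `excluded by ${excluded}` };
  const included = scope.inScope.find((p) => hostMatches(p, host));
  if (included) return { target: targetUrl, allowed: true, reason: `matched ${included}` };
  return { target: targetUrl, allowed: false, reason: "not listed in scope" };
}
```

Checking exclusions first means an out-of-scope host stays blocked even when a broad wildcard would otherwise cover it.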


The Core Loop

Scope Definition → Recon → Enumerate → Exploit → Verify → Report
       ↑                                                ↓
       └─────────── Agent-Enforced Boundaries ──────────┘
                    (scope validated at every step)

Everything in oh-my-open-pentest is designed to make this loop work:

Feature | Purpose
Cerberus (Orchestrator) | Multi-headed coordination across recon, exploit, and report agents
Hydra (Recon) | Parallel subdomain, port, service, and technology enumeration
Argus (Monitor) | Hundred-eyed observation of scope boundaries and engagement state
Scylla (Exploit) | Multi-vector exploitation and chain building
Hermes (Reporter) | Submission-ready report generation with PoCs
Talos (Scope Guard) | Autonomous scope enforcement and RoE compliance
Engagement State | Persistent tracking across sessions
Finding Validation | Multi-vector verification before reporting

What This Means in Practice

You should be able to:

  1. Define scope and paste program rules
  2. Let the agent parse and validate the scope
  3. Confirm the engagement plan (or let autonomous mode handle it)
  4. Walk away
  5. Come back to validated findings with working PoCs and submission-ready reports

If you can't do this, something in the system needs to improve.


The Future We're Building

A world where:

  • Human hunters focus on strategy, not mechanical recon
  • Finding quality is independent of who (or what) found it
  • Complex exploit chains are as routine as simple XSS
  • "Manual testing" means "strategic thinking," not "running tools"

The agent should be invisible. Not hidden, but seamless. Like a well-oiled pentest engagement where the client only sees the polished report, never the grind.

You define the scope. The findings arrive. You don't think about the recon.

That's the goal.


Further Reading

Documentation | Oh My Open Pentest