
Imagine an autonomous red team that requires you to manually chain recon tools, copy-paste PoCs, and babysit every step. Would you call that "automation"? No. You'd call it a glorified script runner.
Why is pentesting any different from other automation domains?
We've accepted a paradigm where "AI security testing" means a chatbot that runs a scan, then waits for you to validate findings. That's not automation; that's micromanagement. You are still:
- Manually correlating subdomain enumeration results
- Copying curl commands from one tool to another
- Writing PoC scripts by hand for each finding
- Double-checking scope boundaries for every target
That's not "human expertise" — that's wasted cognitive bandwidth on mechanical work.
oh-my-open-pentest is built on the premise that the human should define scope and rules of engagement, not babysit recon tools.
Indistinguishable Findings
Findings submitted by the agent should be indistinguishable from those submitted by a top-tier bug bounty hunter.
"If a triager can tell whether a report was written by a human hunter or an agent, the agent has failed."
Token Cost vs. Coverage
We don't care about token usage. We care about attack surface coverage and exploit depth. If spending tokens means finding P1s that manual testers miss, that's a worthwhile investment. More tokens let the agent:
- Enumerate broader attack surface in parallel
- Chain multiple vulnerabilities into high-impact exploits
- Verify findings through multiple attack vectors
However...
We optimize for efficiency where it counts. Not by crippling coverage, but by:
- Using lightweight scans for initial reconnaissance
- Avoiding redundant enumeration of mapped assets
- Caching scope intelligence across engagements
- Stopping deep exploitation at scope boundaries
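As a rough illustration of the efficiency rules above, here is a minimal sketch of how a recon queue could drop duplicates, already-mapped assets, and out-of-scope hosts before any scan runs. The function name `plan_recon`, its parameters, and the `in_scope` predicate are hypothetical and are not taken from the oh-my-open-pentest codebase.

```python
def plan_recon(candidates, already_mapped, in_scope):
    """Return targets worth a lightweight scan, in first-seen order.

    All names here are illustrative assumptions, not the project's API.
    """
    # Assets mapped in an earlier pass are treated as already seen.
    seen = {host.lower() for host in already_mapped}
    queue = []
    for target in candidates:
        key = target.lower()
        # Skip duplicates, previously mapped assets, and anything out of scope.
        if key in seen or not in_scope(key):
            continue
        seen.add(key)
        queue.append(key)
    return queue


if __name__ == "__main__":
    queue = plan_recon(
        ["API.example.com", "api.example.com", "old.example.com", "cdn.other.net"],
        already_mapped={"old.example.com"},
        in_scope=lambda host: host.endswith(".example.com"),
    )
    print(queue)
```

Filtering before scanning is what keeps the savings from costing coverage: nothing in scope and unmapped is dropped, only repeated or forbidden work.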
Minimize Human Cognitive Load
The human should only need to provide scope definition and program rules. Everything else is the agent's job.
Define scope. Walk away. Get findings.
- Parses and enforces scope boundaries
- Conducts reconnaissance across attack surface
- Identifies and validates vulnerabilities
- Builds exploit chains where applicable
- Generates submission-ready reports with PoCs
The agent polices itself.
Talos (Scope Guard)
Reads and parses program scope. Validates every target before active testing. Refuses out-of-scope assets even when they appear as intermediate steps in an attack path.
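To make the scope-guard idea concrete, here is a minimal sketch of a target check, assuming scope is expressed as hostname wildcards and CIDR ranges and that explicit exclusions always win. The `ScopeGuard` class, its method names, and the pattern format are assumptions for illustration, not the actual Talos implementation.

```python
from fnmatch import fnmatchcase
from ipaddress import ip_address, ip_network


class ScopeGuard:
    """Hypothetical scope check: explicit exclusions win over inclusions."""

    def __init__(self, in_scope, out_of_scope=()):
        self.in_scope = list(in_scope)
        self.out_of_scope = list(out_of_scope)

    @staticmethod
    def _matches(target, pattern):
        # IP targets are compared against CIDR patterns; if either side is
        # not an IP/network, fall back to a case-insensitive hostname glob.
        try:
            return ip_address(target) in ip_network(pattern, strict=False)
        except ValueError:
            return fnmatchcase(target.lower(), pattern.lower())

    def is_allowed(self, target):
        if any(self._matches(target, p) for p in self.out_of_scope):
            return False
        return any(self._matches(target, p) for p in self.in_scope)

    def path_allowed(self, hops):
        # An attack path is only acceptable if every hop is in scope.
        return all(self.is_allowed(hop) for hop in hops)


if __name__ == "__main__":
    guard = ScopeGuard(
        in_scope=["*.example.com", "10.0.0.0/24"],
        out_of_scope=["admin.example.com"],
    )
    for target in ["api.example.com", "admin.example.com", "10.0.0.5"]:
        print(target, guard.is_allowed(target))
```

Checking the exclusion list first means a broad wildcard can never re-admit an asset the program explicitly forbids, and `path_allowed` applies the same rule to every hop of a chain rather than only its final target.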
Argus (Monitor)
Hundred-eyed observation of scope boundaries and engagement state. Logs scope decisions for audit trail. Alerts on ambiguity.
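One way to picture the audit trail is an append-only log where every scope decision is recorded and ambiguous ones are also surfaced as alerts. The sketch below assumes three verdicts ("allow", "deny", "ambiguous") and a JSON Lines export; the `AuditLog` and `ScopeDecision` names and fields are hypothetical, not the actual Argus design.

```python
import json
import time
from dataclasses import asdict, dataclass

VERDICTS = {"allow", "deny", "ambiguous"}  # assumed verdict vocabulary


@dataclass
class ScopeDecision:
    target: str
    verdict: str
    reason: str
    timestamp: float


class AuditLog:
    """Hypothetical in-memory audit trail for scope decisions."""

    def __init__(self):
        self.entries = []
        self.alerts = []

    def record(self, target, verdict, reason):
        if verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {verdict!r}")
        decision = ScopeDecision(target, verdict, reason, time.time())
        self.entries.append(decision)
        # Ambiguous decisions are also queued for human attention.
        if verdict == "ambiguous":
            self.alerts.append(decision)
        return decision

    def to_jsonl(self):
        # One JSON object per line keeps the trail easy to append and grep.
        return "\n".join(json.dumps(asdict(d), sort_keys=True) for d in self.entries)


if __name__ == "__main__":
    log = AuditLog()
    log.record("api.example.com", "allow", "matches *.example.com")
    log.record("shared-cdn.example.net", "ambiguous", "third-party host, not listed")
    print(log.to_jsonl())
```

Keeping alerts as a subset of the full log means the human sees only the ambiguous cases, while the complete record stays available for review.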

Predictable
Same scope + same RoE = consistent output. No random deviations or scope violations.

Continuous
Survives interruptions. Engagement state preserved across sessions. No redundant re-testing.
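A minimal sketch of what surviving interruptions could look like: completed (target, check) pairs are written to disk after each step, so a resumed session can skip them. The `EngagementState` class, the JSON file layout, and the check names are assumptions for illustration only.

```python
import json
from pathlib import Path


class EngagementState:
    """Hypothetical on-disk record of completed (target, check) pairs."""

    def __init__(self, path):
        self.path = Path(path)
        self.done = set()
        # Resume from a previous session if a state file already exists.
        if self.path.exists():
            self.done = {tuple(item) for item in json.loads(self.path.read_text())}

    def is_done(self, target, check):
        return (target, check) in self.done

    def mark_done(self, target, check):
        self.done.add((target, check))
        # Write to a temporary file, then rename, so an interruption
        # mid-write does not leave a truncated state file behind.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(sorted(self.done)))
        tmp.replace(self.path)


if __name__ == "__main__":
    state = EngagementState("engagement-state.json")
    if not state.is_done("api.example.com", "port-scan"):
        # ... the actual check would run here ...
        state.mark_done("api.example.com", "port-scan")
    print(sorted(state.done))
```

Persisting after every completed step trades a little I/O for the guarantee that a crash loses at most the step in flight.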

Delegatable
Clear scope verified and enforced. Self-correcting on unexpected responses. Escalation only on true ambiguity.
The Core Loop
↻ Agent-Enforced Boundaries
Multi-headed orchestration across recon, exploit, and report agents
Parallel subdomain, port, service, and technology enumeration
Hundred-eyed observation of scope boundaries and engagement state
Multi-vector exploitation and chain building
Submission-ready report generation with PoCs
Autonomous scope enforcement and RoE compliance
Persistent tracking across sessions, no redundant work
Multi-vector verification before reporting
The Future We're Building
"You define the scope. The findings arrive. You don't think about the recon."