Manifesto

Agentic Automation Pentest

The Philosophy of Autonomous Security Testing

> HUMAN IN THE LOOP = BOTTLENECK

Imagine an autonomous red team that requires you to manually chain recon tools, copy-paste PoCs, and babysit every step. Would you call that "automation"? No. You'd call it a glorified script runner.

Why is pentesting any different from other automation domains?

We've accepted a paradigm where "AI security testing" means a chatbot that runs a scan, then waits for you to validate findings. That's not automation — that's micromanagement.

Manually correlating subdomain enumeration results
Copying curl commands from one tool to another
Writing PoC scripts by hand for each finding
Double-checking scope boundaries for every target

That's not "human expertise" — that's wasted cognitive bandwidth on mechanical work.

oh-my-open-pentest is built on the premise that the human should define scope and rules of engagement, not babysit recon tools.

Indistinguishable Findings

Findings submitted by the agent should be indistinguishable from those submitted by a top-tier bug bounty hunter.

Clear vulnerability description with root cause analysis

Working Proof of Concept (reproducible, not theoretical)

Accurate CVSS scoring with proper vector strings

Impact statement that resonates with program owners

Remediation guidance that developers can actually implement
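
As one illustration of what "accurate CVSS scoring with proper vector strings" involves, here is a minimal sketch that derives a CVSS v3.1 base score from a vector string. The metric weights and rounding rule follow the public CVSS v3.1 specification; the function names `base_score` and `roundup` are assumptions made for this example and are not part of oh-my-open-pentest.

```python
import math

# CVSS v3.1 base metric weights (from the public specification).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when Scope is Changed.
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place using the spec's integer method."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Hypothetical helper: compute the base score of a CVSS:3.1 vector."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector string")
    m = dict(part.split(":") for part in parts)
    scope = m["S"]

    # Impact sub-score.
    iss = 1 - (
        (1 - WEIGHTS["C"][m["C"]])
        * (1 - WEIGHTS["I"][m["I"]])
        * (1 - WEIGHTS["A"][m["A"]])
    )
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    # Exploitability sub-score.
    exploitability = (
        8.22
        * WEIGHTS["AV"][m["AV"]]
        * WEIGHTS["AC"][m["AC"]]
        * PR_WEIGHTS[scope][m["PR"]]
        * WEIGHTS["UI"][m["UI"]]
    )

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10.0))
```

For example, `base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H")` returns 9.8, and a typical reflected XSS vector, `CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N`, returns 6.1. Computing the score from the vector, rather than asserting it, keeps the two from drifting apart in a report.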

"If a triager can tell whether a report was written by a human hunter or an agent, the agent has failed."

Token Cost vs. Coverage

We don't care about token usage. We care about attack surface coverage and exploit depth. If spending tokens means finding P1s that manual testers miss, that's a worthwhile investment.

Enumerate broader attack surface in parallel
Chain multiple vulnerabilities into high-impact exploits
Verify findings through multiple attack vectors

However...

We optimize for efficiency where it counts. Not by crippling coverage, but by:

Using lightweight scans for initial reconnaissance
Avoiding redundant enumeration of mapped assets
Caching scope intelligence across engagements
Stopping deep exploitation at scope boundaries

Minimize Human Cognitive Load

The human should only need to provide scope definition and program rules. Everything else is the agent's job.

Autonomous Mode

Full Engagement

Define scope. Walk away. Get findings.

Parses and enforces scope boundaries

Conducts reconnaissance across attack surface

Identifies and validates vulnerabilities

Builds exploit chains where applicable

Generates submission-ready reports with PoCs

You define the boundaries. The agent finds what's inside them.

Scope Enforcement

Agent-Enforced Boundaries

The agent polices itself.

Talos (Scope Guard)

Reads and parses program scope. Validates every target before active testing. Refuses out-of-scope assets even in attack paths.
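
A scope guard of the kind described for Talos could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the class name `ScopeGuard`, its methods, and the wildcard pattern format for scope entries are all hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


class ScopeGuard:
    """Hypothetical scope check: exclusions always win over inclusions."""

    def __init__(self, in_scope, out_of_scope=()):
        self.in_scope = [p.lower() for p in in_scope]
        self.out_of_scope = [p.lower() for p in out_of_scope]

    @staticmethod
    def _host(target: str) -> str:
        # Accept bare hosts ("api.example.com") as well as full URLs.
        parsed = urlsplit(target if "//" in target else "//" + target)
        return (parsed.hostname or "").lower()

    def is_allowed(self, target: str) -> bool:
        host = self._host(target)
        if not host:
            return False  # unparseable targets are refused, not guessed at
        if any(fnmatchcase(host, p) for p in self.out_of_scope):
            return False
        return any(fnmatchcase(host, p) for p in self.in_scope)

    def chain_allowed(self, targets) -> bool:
        # An attack path is only valid if every hop is in scope.
        return all(self.is_allowed(t) for t in targets)
```

Matching on the parsed hostname rather than on the raw string is deliberate: a substring check would accept `api.example.com.evil.net` for a scope of `*.example.com`, while the pattern match above rejects it.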

Argus (Monitor)

Hundred-eyed observation of scope boundaries and engagement state. Logs scope decisions for audit trail. Alerts on ambiguity.

You don't police the agent. The agent polices itself.

Predictable

Same scope + same RoE = consistent output. No random deviations or scope violations.

Continuous

Survives interruptions. Engagement state preserved across sessions. No redundant re-testing.
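
One way engagement state could survive interruptions is a small on-disk record of completed checks. The sketch below assumes a JSON file and a hypothetical `EngagementState` class; the real storage format is not described in this document.

```python
import json
from pathlib import Path


class EngagementState:
    """Hypothetical persistent record of (target, check) pairs already run."""

    def __init__(self, path):
        self.path = Path(path)
        self.done = set()
        if self.path.exists():
            # JSON stores pairs as lists; convert back to hashable tuples.
            self.done = {tuple(pair) for pair in json.loads(self.path.read_text())}

    def needs_run(self, target: str, check: str) -> bool:
        return (target, check) not in self.done

    def mark_done(self, target: str, check: str) -> None:
        self.done.add((target, check))
        # Write to a temp file and rename it over the old one, so an
        # interruption mid-write does not leave a corrupt state file.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(sorted(self.done)))
        tmp.replace(self.path)
```

A new session that constructs `EngagementState` with the same path sees every pair marked in earlier sessions, so `needs_run` can be used to skip work that has already been done.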

Delegatable

Clear scope verified and enforced. Self-correcting on unexpected responses. Escalation only on true ambiguity.

The Core Loop

Scope

Recon

Exploit

Verify

Report

↻ Agent-Enforced Boundaries
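
The loop above can be sketched as a pipeline in which the scope check is re-applied after every phase, so that assets surfaced mid-engagement are filtered before the next phase touches them. The function `run_engagement`, the handler signatures, and the item dictionaries are assumptions made for this illustration only.

```python
PHASES = ["recon", "exploit", "verify"]


def run_engagement(seeds, in_scope, handlers):
    """Hypothetical core loop: Scope -> Recon -> Exploit -> Verify -> Report."""
    # Scope: only in-scope seeds ever enter the loop.
    items = [{"target": t} for t in seeds if in_scope(t)]
    dropped = []

    for phase in PHASES:
        produced = handlers[phase](items)
        # Re-enforce the boundary: a phase may surface new assets
        # (for example a third-party CDN found during recon).
        items = []
        for item in produced:
            if in_scope(item["target"]):
                items.append(item)
            else:
                dropped.append(item)

    # Report: only items that passed every boundary check are written up.
    return handlers["report"](items), dropped
```

Returning the dropped items alongside the report gives the audit trail something to record: every out-of-scope asset the loop refused, and nothing it acted on.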

Cerberus

Multi-headed orchestration across recon, exploit, and report agents

Hydra

Parallel subdomain, port, service, and technology enumeration

Argus

Hundred-eyed observation of scope boundaries and engagement state

Scylla

Multi-vector exploitation and chain building

Hermes

Submission-ready report generation with PoCs

Talos

Autonomous scope enforcement and RoE compliance

Engagement State

Persistent tracking across sessions, no redundant work

Finding Validation

Multi-vector verification before reporting
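
As a rough illustration of the parallel enumeration attributed to Hydra above, the sketch below fans out several enumerators concurrently with `asyncio.gather` and merges their results. The enumerator functions are stubs that return canned data; their names and return shapes are hypothetical, and a real implementation would perform network I/O where the stubs sleep.

```python
import asyncio


async def enumerate_subdomains(target):
    await asyncio.sleep(0)  # stand-in for real DNS lookups
    return {"subdomains": [f"api.{target}", f"www.{target}"]}


async def enumerate_ports(target):
    await asyncio.sleep(0)  # stand-in for a real port scan
    return {"ports": [80, 443]}


async def enumerate_all(target, enumerators):
    """Run all enumerators concurrently; one failure does not sink the rest."""
    results = await asyncio.gather(
        *(enum(target) for enum in enumerators),
        return_exceptions=True,
    )
    merged, errors = {}, []
    for result in results:
        if isinstance(result, Exception):
            errors.append(result)
        else:
            merged.update(result)
    return merged, errors
```

Calling `asyncio.run(enumerate_all("example.com", [enumerate_subdomains, enumerate_ports]))` returns the merged dictionary and a list of any exceptions raised. Passing `return_exceptions=True` is the design choice that matters here: a single failing enumerator is recorded instead of cancelling the whole batch.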

The Future We're Building

Human hunters focus on strategy, not mechanical recon

Finding quality independent of who (or what) found it

Complex exploit chains as routine as simple XSS

"Manual testing" means strategic thinking, not running tools

"You define the scope. The findings arrive. You don't think about the recon."

scope → findings

Get oh-my-open-pentest