I've spent the last year building HostileReview — and as proof it works, I ran it against itself.
158 findings. 3 critical. All fixed.
That's the pitch.
HostileReview fields 100+ specialized adversarial AI agents that find vulnerabilities by actively trying to break or exploit code, not just pattern-match it. Every agent has a job. Every job is to find something wrong.
The market moment
AI is writing an increasing share of production code. GitHub Copilot, Cursor, Claude, GPT-4 — developers are shipping faster than ever, and the code is getting through review because the reviewers are the same AI that wrote it. Nobody is checking the work from the outside.
Every enterprise that adopts AI-assisted development is also silently accumulating AI-generated vulnerabilities. The tools that create the code have no incentive to be honest about its flaws. The tools that review the code are built on the same models with the same blind spots.
Security has always lagged development velocity. AI just made that gap much wider, much faster.
2026 is when AI writes your code. 2027 is when you pay for it in breaches.
Why I built HostileReview
I built this after repeatedly watching AI generate broken or unsafe code with full confidence — failed tests described as passing, insecure shortcuts presented as production-ready, secrets exposed in plaintext, and self-reviews that concluded "looks good" when they absolutely should not have.
I used to run structured multi-AI review loops: one AI writes, two others review, repeat until everything passes. It's rigorous. It still misses things. I wrote articles on how to do it.
You can't ask the author to be the adversary. AI isn't lying. It's blind to its own mistakes. The structured collab approach taught me something important: AI is often just as blind to another AI's mistakes as it is to its own.
The only way to close that gap is to introduce a system whose entire purpose is to attack the code — one that has no relationship to how it was written and no incentive to protect it.
HostileReview is the adversary.
What it's found
I scanned an enterprise browser's distributed Linux installer and found 54 vulnerabilities, including plaintext credentials for their private APT repository. The credentials were confirmed live, and they gave read access to all four release channels: stable, beta, canary, and unstable.
This wasn't a CTF. It was a real production product used by enterprises. The kind of codebase that had presumably been reviewed by humans and passed through a normal development process. HostileReview found what that process missed.
That's one example. There are more at hostilereview.com/published — real scans, real codebases, published reports anyone can read. Hundreds more aren't public. The pattern is consistent: codebases that look clean on the surface have structural vulnerabilities that only show up when something is actively trying to find them.
Each report separates AI-confirmed threats from noise, with severity ratings, root cause analysis, and a full audit trail — so a security team, a CTO, or an investor can read it and judge the findings themselves without taking our word for it.
What it does
Submit a diff, connect a GitHub or GitLab repo, scan a PR, or upload a zip. HostileReview handles the rest.
The system breaks the codebase into chunks and routes each chunk through a panel of specialized agents — one focused on injection vulnerabilities, one on secrets and credential exposure, one on authentication logic, one on dependency risks, and so on across 100+ specializations. Each agent approaches the code from a different threat model.
Findings go through consensus filtering: if only one agent flags something, it's treated differently than if six agents independently flag the same issue. This is how the system reduces false positives without softening its findings — low-confidence noise gets filtered out, high-confidence threats get escalated.
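The routing-and-consensus step can be sketched in a few lines. Everything below is a hypothetical illustration: the agent names, the toy pattern checks, and the two-agent threshold are assumptions for the sketch, not HostileReview's actual logic.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    chunk_id: str   # which code chunk the finding is in
    issue: str      # normalized issue key, so agents' findings are comparable

AGENTS = ["injection", "secrets", "authn", "deps"]  # stand-ins for 100+ specializations

def run_agent(agent: str, chunk_id: str, code: str) -> list[Finding]:
    """Toy agent: each specialization notices different patterns in the chunk."""
    hits = []
    if "password = '" in code and agent in ("secrets", "authn"):
        hits.append(Finding(chunk_id, "hardcoded-credential"))
    if "os.system(" in code and agent == "injection":
        hits.append(Finding(chunk_id, "command-injection"))
    return hits

def review(chunks: dict[str, str], min_agents: int = 2):
    """Route every chunk through every agent, then filter findings by consensus."""
    votes: dict[Finding, set[str]] = defaultdict(set)
    for chunk_id, code in chunks.items():
        for agent in AGENTS:
            for f in run_agent(agent, chunk_id, code):
                votes[f].add(agent)
    escalated = [f for f, a in votes.items() if len(a) >= min_agents]
    filtered = [f for f, a in votes.items() if len(a) < min_agents]
    return escalated, filtered

chunks = {
    "config.py": "password = 'hunter2'",
    "tasks.py": "os.system(cmd)",
}
escalated, filtered = review(chunks)
# Two agents independently flagged the credential, so it escalates; the
# single-agent command-injection hit is held back as low confidence.
print([f.issue for f in escalated])   # ['hardcoded-credential']
print([f.issue for f in filtered])    # ['command-injection']
```

The point of the threshold is that agreement between agents with different threat models is itself a signal; raising `min_agents` trades recall for precision.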
Domino analysis traces the downstream consequences of each vulnerability — a single insecure function call can cascade into a critical exposure once you follow the data flow. HostileReview maps that cascade instead of just flagging the origin point.
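A minimal sketch of that cascade idea, with an invented data-flow graph and function names (none of this is HostileReview's actual analysis):

```python
from collections import deque

DATA_FLOW = {  # edge A -> B means data produced in A reaches B
    "parse_user_input": ["build_query"],
    "build_query": ["db.execute"],
    "db.execute": ["render_report"],
    "render_report": [],
}

def blast_radius(origin: str) -> list[str]:
    """Follow data flow from a flagged origin; collect every node it can taint."""
    seen, queue = {origin}, deque([origin])
    while queue:
        for nxt in DATA_FLOW.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(origin)
    return sorted(seen)

# One unsanitized parser cascades into the query builder, the database call,
# and the report renderer:
print(blast_radius("parse_user_input"))  # ['build_query', 'db.execute', 'render_report']
```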
The output is a private web report with a 1-click Fix Workflow — a structured, ordered guide that walks you or your AI coding assistant through every fix in sequence, with context on why each one matters and what breaks if you skip it. Owners can optionally publish their report to hostilereview.com/published, making it visible as a public demonstration of security diligence — useful for teams that want to show customers or auditors that their code was independently reviewed.
The goal is not just detection. The goal is getting vulnerable code fixed.
Why this is hard to replicate
The platform is not a wrapper around existing tools. I built the underlying infrastructure specifically for this kind of workload: custom hot/warm/cold memory handling for agent continuity, high-speed indexing tuned for this access pattern, and semantic compression to keep large multi-agent review runs efficient on modest hardware. That infrastructure is SAIQL — a database engine I built for LLM-era workloads before HostileReview existed.
Independent benchmarks against the ClickBench workload showed QIPI, SAIQL's index engine, doing point lookups in 6 microseconds, roughly 1000x faster than SQLite, on a consumer Linux workstation. On the i7-14700F that HostileReview runs on today, projected latency drops to ~3 microseconds. That index engine is what keeps 100+ agents moving without waiting on each other. Worth noting: the benchmark used standard analytical data, not the LLM-era workload QIPI was built for. We gave away the edge before the test even started.
A competitor starting today wouldn't just need to build the agents — they'd need to build the infrastructure those agents run on. That's not a prompt engineering problem. It's a systems problem that took a year of full-time work to solve.
Beyond infrastructure, the system's effectiveness comes from calibration that only comes with iteration. The consensus thresholds, the agent specializations, the domino analysis logic, the Fix Workflow format — these are the product of running real scans against real codebases and tuning based on what actually worked. That iteration history is a moat that can't be copied, only earned.
That architecture is why I was able to get a live multi-agent product running on consumer hardware before raising outside capital. [Intel Core i7-14700F | NVIDIA RTX 3090 | 96 GB RAM]
The raise
I'm raising $1.5M on a SAFE at a $6M valuation cap with a 20% discount.
Delaware C-Corp forming prior to close. $15K founder capital in, plus a year of full-time labor.
This is not a headcount-heavy raise. It's an infrastructure-and-scale raise.
We're open to raising beyond $1.5M with the right partners — though term adjustments will be required to ensure total investor equity does not exceed 25% at conversion.
1. Hardware buildout: dedicated inference hardware to move toward full local AI execution, eliminating per-scan API cost and external dependency
2. Infrastructure scale: expanding scan capacity to handle concurrent enterprise workloads without degradation
3. Go-to-market: first enterprise sales conversations, security community presence, and the operational overhead that comes with moving from product to company
Local AI execution is the long-term cost structure that makes this business work at scale. Right now each scan has a per-token API cost. Running inference locally on owned hardware converts that variable cost into a fixed one — which changes the unit economics significantly as scan volume grows.
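A toy version of that break-even arithmetic, with invented numbers (the pitch does not disclose actual per-scan API costs or hardware pricing):

```python
# All figures below are illustrative assumptions, not real HostileReview costs.
api_cost_per_scan = 4.00     # variable cost paid to an external LLM API, per scan
local_cost_per_scan = 0.40   # residual variable cost of local inference (power, upkeep)
hardware_cost = 60_000.00    # fixed, one-time cost of dedicated inference hardware

# Local inference pays for itself once cumulative savings cover the hardware.
break_even_scans = hardware_cost / (api_cost_per_scan - local_cost_per_scan)
print(round(break_even_scans))  # 16667
```

Past the break-even point, every additional scan costs cents instead of dollars, which is the unit-economics shift the raise is funding.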
If the company reaches a point where a Series A is optional rather than necessary, I would strongly consider offering early SAFE holders a cash buyout path instead of requiring them to wait on a future acquisition or IPO.
What I'm looking for
I'm a builder first. I built the system, the product, and the proof. What I haven't done is raise money, navigate investor relationships, or take an infrastructure product through an enterprise sales cycle. I'm not pretending otherwise.
The right partner understands early-stage infrastructure, has seen what DevSecOps or AI tooling companies look like at this stage, and can help a technical founder figure out what he doesn't know. That might mean introductions, it might mean guidance, it might mean both.
If you've invested in security tooling, developer infrastructure, or AI-adjacent products — or if you work at a company that buys those things — I'd like to talk. Angels who want to participate without leading a round are welcome.
Houston preferred. I like to meet in person.
Include your LinkedIn and tell me about yourself. I'll send over the full pitch deck.
One email. That's all it takes.