---
title: "How to Review AI-Generated Code (Without Being Buried in It)"
date: 2026-06-09
tags: ["code-review","agentic-workflows","security","best-practices","spec-driven-development"]
categories: ["Guides","Agentic Workflows"]
summary: "At companies using AI coding agents, a single sprint can generate more code than a team used to write in a quarter. Traditional code review cannot keep pace. Here's the three-layer approach — prevention before generation, automated scanning during, targeted human review after — that catches what your agent missed without consuming your entire engineering day."
---


Sherlock Forensics audited 500 AI-assisted codebases in April 2026 and found critical vulnerabilities in 92% of them. Average exploitable findings per codebase: 8.3. The most common failure modes were not subtle: XSS (86%), log injection (88%), business logic flaws (72%), secret exposure (78%).

The problem is not that AI writes bad code. The problem is that AI writes *a lot of code with high confidence*, and human review pipelines built for three PRs a day are now seeing thirty. The review gets skimmed, the "almost right" issue ships, and nobody catches it until production.

The fix is not to review less carefully. It's to review more strategically: a layered approach where most defects are prevented before generation or caught automatically during it, so human reviewers only see the things that genuinely require human judgment.

## Layer 1: Prevention — Shape What the Agent Generates

The cheapest defect to fix is one that never gets written. Before your agent generates a line of code, the right constraints reduce the review burden downstream.

**CLAUDE.md invariants** are the primary tool here. A well-structured [CLAUDE.md file](/posts/the-complete-claude-md-guide/) is not documentation; it's a set of hard rules the agent is instructed to follow while writing. Examples:

```markdown
## Security invariants
- Never use innerHTML for user-supplied content. Use textContent or DOMPurify.
- Never construct SQL queries with string concatenation. Use parameterized queries only.
- Never log authentication tokens, passwords, or PII, even at debug level.
- All environment variables that hold secrets must be listed in .env.example with a placeholder.
```

These rules don't guarantee compliance, but they constrain the agent's output and make violations visible: when generated code breaks a written invariant, a reviewer or scanner can point to the exact rule it violated.
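
To make the SQL invariant concrete, here is a minimal sketch using Python's standard-library `sqlite3` module. The `users` table and both helper functions are hypothetical, invented for illustration; the point is only the contrast between a concatenated query and a parameterized one.

```python
import sqlite3

def find_user_unsafe(conn, name):
    # Violates the invariant: user input is spliced into the SQL text.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_user(conn, name):
    # Complies: the driver binds `name` as data, never as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()

# Hypothetical in-memory table, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"
leaked = find_user_unsafe(conn, payload)  # the injected OR clause matches every row
safe = find_user(conn, payload)           # the payload is treated as a literal name
```

A reviewer or scanner looking for violations of this rule only has to look for SQL strings built with `+` or string formatting, which is what makes an invariant like this cheap to enforce.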

**Test-first specifications** are the other lever. The [spec-as-source-of-truth](/posts/spec-file-as-source-of-truth/) pattern asks you to write acceptance tests before the agent writes implementation. When the agent's implementation has to pass a test you wrote, you've constrained not just the behavior but the surface area the agent can wander into. Business logic flaws — the hardest class of AI bug to catch in review — are also the easiest to catch when the tests predate the code.
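
As a rough illustration of tests that predate the code, the sketch below encodes a small refund rule as acceptance checks. The rule itself (full refund within 30 days, nothing afterwards), the `check_refund_spec` helper, and the `refund_amount` function are all hypothetical; `refund_amount` stands in for whatever the agent would later generate.

```python
from decimal import Decimal

# Acceptance checks written by a human BEFORE any implementation exists.
# The refund rule is a hypothetical spec invented for this example.
def check_refund_spec(refund_amount):
    # Day 30 is still inside the refund window.
    assert refund_amount(Decimal("50.00"), days_since_purchase=30) == Decimal("50.00")
    # Day 31 is outside it.
    assert refund_amount(Decimal("50.00"), days_since_purchase=31) == Decimal("0.00")
    # Nonsensical input must be rejected, not silently refunded.
    try:
        refund_amount(Decimal("50.00"), days_since_purchase=-1)
    except ValueError:
        pass
    else:
        raise AssertionError("negative age must raise ValueError")

# Stand-in for the implementation the agent would generate afterwards.
def refund_amount(price, days_since_purchase):
    if days_since_purchase < 0:
        raise ValueError("days_since_purchase must be non-negative")
    return price if days_since_purchase <= 30 else Decimal("0.00")

check_refund_spec(refund_amount)
```

The boundary cases (day 30 versus day 31) are where a plausible-but-wrong implementation tends to diverge, so pinning them down in advance gives the agent less room to guess.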

**Spec files with explicit constraints** help even if you're not doing full spec-driven development (SDD). A prompt like "implement the payment webhook handler; the spec is at docs/payment-spec.md; never process a webhook without first verifying the signature against STRIPE_WEBHOOK_SECRET" puts the constraint in writing, where it can be reviewed before generation, not after.
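
The "verify before processing" constraint can be sketched as follows. This is a generic HMAC-SHA256 check built on Python's standard library, not Stripe's actual signing scheme: Stripe's real signature header has its own format and is best verified with Stripe's official library. The function names and the 400/200 return values are assumptions made for the example.

```python
import hashlib
import hmac

def signature_is_valid(payload: bytes, received_sig: str, secret: bytes) -> bool:
    # Recompute the signature over the raw request body.
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode(), received_sig.encode())

def handle_webhook(payload: bytes, received_sig: str, secret: bytes) -> int:
    if not signature_is_valid(payload, received_sig, secret):
        return 400  # reject before touching the payload at all
    # ... parse and process the event only after verification ...
    return 200
```

The ordering is the thing to check in review: no parsing, logging, or side effects should happen above the signature check.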

## Layer 2: Automated Scanning — Let Machines Catch Machine Output

Human reviewers should not be the first line of defense against XSS, secret exposure, or SQL injection. Automated tools exist for this, and AI-generated code benefits from them more than human-written code because AI output is voluminous and stylistically consistent enough that scanners work well.

**Pre-commit hooks** are the minimum viable setup. A pre-commit hook that runs a secret scanner (TruffleHog, detect-secrets), a SAST linter, and your test suite stops the obvious failures before they reach a PR. This is not new advice. But teams using AI coding agents who haven't tightened their pre-commit setup are effectively giving the agent an unlocked door.

**AI-driven code review** is the higher-leverage addition. [Claude Code's /code-review](/posts/claude-code-review-ga-multi-agent-pr-review/) runs multiple independent review agents in parallel — each looking at the PR from a different angle (security, correctness, test coverage) — then synthesizes findings into PR comments. The key property is adversarial independence: each agent is seeded without knowledge of what the others found, so they don't anchor on each other's conclusions. For AI-generated PRs, the asymmetry is useful — an AI agent that generated the code and an AI reviewer that is trying to break it are doing fundamentally different tasks.

This is the **review-in-loop** pattern from [multi-agent architecture](/posts/multi-agent-software-development-architecture-patterns-2026/): the PR passes only if enough independent review agents sign off. For routine changes, a 2/3 majority is a reasonable bar. For high-stakes changes (auth, payments, data access), require 3/3 sign-offs.
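
The sign-off thresholds above could be encoded as a small gating function. This is a sketch under stated assumptions: the area labels in `HIGH_STAKES_AREAS` and the function names are hypothetical, and each reviewer verdict is assumed to arrive as a boolean.

```python
# Hypothetical labels for the high-stakes areas named above.
HIGH_STAKES_AREAS = {"auth", "payments", "data-access"}

def required_approvals(touched_areas, reviewers=3):
    # High-stakes changes need unanimity; everything else a simple majority.
    if HIGH_STAKES_AREAS & set(touched_areas):
        return reviewers
    return reviewers // 2 + 1

def pr_passes(verdicts, touched_areas):
    # `verdicts` holds one boolean per independent review agent.
    needed = required_approvals(touched_areas, reviewers=len(verdicts))
    return sum(verdicts) >= needed
```

With three reviewers, `pr_passes([True, True, False], ["docs"])` passes on a 2/3 majority, while the same verdicts on a change touching `"auth"` do not.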

**Dependency scanning** should be automatic. AI agents readily add packages because the import pattern looks correct even when the package is abandoned, has a known CVE, or is a typosquat. A dependency scanner running in CI catches this before merge.
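
Real dependency scanners consult vulnerability databases and registry metadata, which a short example cannot reproduce. The sketch below illustrates only the typosquat part of the idea: flagging a new package name that is suspiciously close to a well-known one. The `KNOWN_PACKAGES` allowlist and the `0.85` similarity cutoff are assumptions chosen for the example.

```python
import difflib

# Hypothetical allowlist; a real check would use a much larger list.
KNOWN_PACKAGES = ["requests", "numpy", "pandas", "django", "flask"]

def possible_typosquat(name, known=KNOWN_PACKAGES, cutoff=0.85):
    """Return the known package `name` resembles, or None if nothing is suspicious."""
    if name in known:
        return None  # exact match: this is the real package
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None

suspect = possible_typosquat("reqeusts")  # transposed letters in "requests"
```

A check like this belongs in CI as a prompt for a human to look, not as an automatic block, since legitimate packages sometimes have similar names.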

## Layer 3: Targeted Human Review — Focus on What AI Gets Systematically Wrong

Once layers 1 and 2 have filtered the mechanical failures, human review should concentrate on four categories where AI consistently underperforms:

**Business logic that contradicts the spec.** AI generates syntactically correct code that satisfies the test suite and still does the wrong thing. The canonical failure: the agent misread a requirement and implemented a plausible alternative. These bugs don't trigger linters or fail tests — they fail when real users hit an edge case in production. Reviewers should read each PR against its original spec or ticket, not just against the diff.

**Authorization boundaries.** AI agents are good at implementing what they're asked to implement and often skip what they weren't asked about. A controller that correctly validates input can still skip authorization entirely because the spec didn't mention it. For any endpoint that touches data, ask: "What checks whether this user is allowed to do this operation?" If the answer is "the code implicitly assumes they are," that's a finding.
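
A minimal sketch of what an explicit answer to that question looks like. The `User` and `Invoice` types, the ownership rule, and the `Forbidden` exception are all hypothetical; the point is that the permission check is a visible line of code rather than an assumption.

```python
from dataclasses import dataclass

@dataclass
class User:
    id: int
    is_admin: bool = False

@dataclass
class Invoice:
    id: int
    owner_id: int

class Forbidden(Exception):
    pass

def can_view(user: User, invoice: Invoice) -> bool:
    # Hypothetical policy: owners and admins may read an invoice.
    return user.is_admin or invoice.owner_id == user.id

def get_invoice(user: User, invoice_id: int, invoices: dict) -> Invoice:
    invoice = invoices[invoice_id]
    # Authentication told us WHO the user is; this line decides WHETHER they may proceed.
    if not can_view(user, invoice):
        raise Forbidden(f"user {user.id} may not view invoice {invoice_id}")
    return invoice
```

If a data-access function has no equivalent of the `can_view` call, the review question has no answer in the code, and that is the finding.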

**Error handling and failure modes.** AI-generated code handles the happy path well. The path where the database is down, the external API returns 503, or the user provides unexpected types is often missing or incomplete. Check each external call: what happens when it fails? Does the failure surface correctly, or is it silently swallowed?
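
The difference between a swallowed failure and a surfaced one is easy to miss in a diff, so here is a small sketch of both. The `fetch` callable stands in for any external call, and `UpstreamError` is a hypothetical exception type introduced for the example.

```python
class UpstreamError(Exception):
    """Raised when a dependency we rely on is unavailable."""

def fetch_rate_swallowed(fetch):
    # Anti-pattern: the caller cannot tell "service down" from a genuine 0.0.
    try:
        return fetch()
    except Exception:
        return 0.0

def fetch_rate(fetch):
    # The failure is translated into a domain error and propagated,
    # keeping the original exception attached as the cause.
    try:
        return fetch()
    except ConnectionError as exc:
        raise UpstreamError("rate service unavailable") from exc
```

In review, a broad `except` that returns a default value is the pattern to question: the default may be reasonable, but it should be a deliberate decision, not a side effect of generated boilerplate.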

**Secrets and configuration.** Despite CLAUDE.md rules, AI agents sometimes hardcode API keys in tests, log sensitive data in debug branches, or construct credentials from string interpolation. A single pass scanning for hardcoded strings that look like keys, and reviewing any `console.log` or `logger.debug` statements in modified files, catches most of this.
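
That single pass can be roughed out as a regex scan. This is a deliberately naive sketch, not a substitute for a dedicated secret scanner: the two patterns (the shape of an AWS access key ID, and a quoted value assigned to a secret-sounding name) are illustrative assumptions and will miss plenty.

```python
import re

SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    # A quoted value of 8+ characters assigned to a secret-sounding name.
    re.compile(r"(?i)(api[_-]?key|secret|token|password)\s*[:=]\s*['\"][^'\"]{8,}['\"]"),
]
LOG_CALL = re.compile(r"\b(console\.log|logger\.debug)\s*\(")

def scan(text):
    """Return (line_number, reason) pairs worth a human look."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if any(p.search(line) for p in SECRET_PATTERNS):
            findings.append((lineno, "possible hardcoded secret"))
        if LOG_CALL.search(line):
            findings.append((lineno, "log statement to review"))
    return findings
```

Run over only the files a PR modifies, a scan like this produces a short list of lines to read rather than a verdict.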

## The Minimal Human Review Checklist

For a team moving fast with AI agents, here's the irreducible list:

- [ ] Does this PR implement what the spec/ticket actually says (not just what sounds like it)?
- [ ] Is every data-access path explicitly authorized (not just authenticated)?
- [ ] What happens when each external call fails? Is the failure surfaced or swallowed?
- [ ] Are there any hardcoded secrets, tokens, or credentials — even in tests?
- [ ] Does the test coverage actually test the business logic, or just the happy path?

Five questions. They don't take long when layers 1 and 2 have already filtered the obvious failures. They catch the class of bugs that are both most likely to be present in AI code and most likely to be missed by automated tools.

## The Underlying Principle

The developer trust gap is real: [84% of developers use AI tools, 29% trust what they ship](/posts/developer-ai-trust-crisis-84-use-29-trust/). The gap isn't because AI is unpredictable — it's because review processes haven't adapted to AI's production speed.

The adaptation is not more human effort. It's earlier prevention, better automation, and sharper targeting of the residual human review. Get the layers right and AI-generated code can be safer than human-generated code, because the automation catches what humans skip under time pressure, and the human review focuses on the categories where judgment genuinely matters.

Agentic workflows that include automated review as a first-class step — not an afterthought — are where the productivity gains survive contact with production.

