---
title: "GuardFall: 10 of 11 Open-Source AI Coding Agents Fail a 30-Year-Old Shell Trick"
date: 2026-07-05
tags: ["security","open-source","aider","cline","opencode","shell-injection","agentic-coding"]
categories: ["AI Tools","Industry"]
summary: "Adversa AI's GuardFall research shows that 10 of 11 popular open-source AI coding agents — including Aider, Cline, Goose, opencode, and OpenHands, collectively ~548,000 GitHub stars — can be tricked into running destructive shell commands their own safety filters were built to block, using decades-old bash tricks like quote splitting and $IFS expansion. Only Continue held up under testing."
---


![GuardFall: 10 of 11 Open-Source AI Coding Agents Fail a 30-Year-Old Shell Trick](/images/guardfall-shell-injection-open-source-ai-coding-agents.png)

Adversa AI — the same research group behind May's TrustFall and SymJack disclosures — is back with a third finding, and this one is almost embarrassing in how old the trick is. Published June 30, GuardFall shows that 10 of 11 popular open-source AI coding agents can be talked into running a command their safety filter was explicitly designed to block, using command-injection techniques that predate most of the developers who wrote those filters.

The root cause is a one-sentence architectural mismatch: **the guard inspects the raw command string, but bash rewrites that string before executing it.** A filter that blocks the literal text `rm -rf` has no idea what to do with a string that only becomes `rm -rf` after the shell finishes expanding quotes, variables, and substitutions. By the time the dangerous command actually runs, the safety check has already approved a different string: the one that existed before expansion.

## Five Ways to Say "rm -rf" Without Saying It

Adversa AI's paper documents five distinct bypass classes, and none of them require anything beyond a standard POSIX shell:

- **Quote splitting** — `r''m` reads as two harmless tokens to a naive regex, but bash strips the empty quotes and executes `rm`.
- **`$IFS` expansion** — the shell's Internal Field Separator variable expands to whitespace, so `rm$IFS-rf$IFS/` sails past filters that only look for a literal space character.
- **Command substitution** — `$(echo rm) -rf /`, or a destructive command nested inside `echo "$(rm /etc/x)"`, hides the payload inside an evaluation context the guard doesn't unwrap.
- **Base64 pipelines** — `echo <base64> | base64 -d | sh` defeats any filter that analyzes commands segment-by-segment, since the dangerous content only exists after the pipeline composes.
- **Alternative destructive utilities** — `find /x -delete`, `dd of=/dev/sda`, `tar -C / -x` achieve the same outcome as `rm -rf` without ever containing the string `rm`.
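
To make the mismatch concrete, here is a minimal sketch of a raw-string guard meeting one example from each bypass class. The regex and the `naive_guard_allows` helper are hypothetical stand-ins, not code from any of the tested agents, and nothing in the block is executed by a shell; the strings are only matched against the pattern.

```python
import base64
import re
import shlex

# Hypothetical raw-string guard: rejects commands containing the text "rm -rf".
NAIVE_DENY = re.compile(r"\brm\s+-rf\b")

def naive_guard_allows(command):
    """Return True when the raw-string filter lets the command through."""
    return NAIVE_DENY.search(command) is None

# The destructive text only exists after the pipeline decodes it.
encoded = base64.b64encode(b"rm -rf /").decode("ascii")

# One illustrative string per bypass class. Nothing here is executed.
BYPASSES = {
    "quote splitting": "r''m -rf /",
    "$IFS expansion": "rm$IFS-rf$IFS/",
    "command substitution": "$(echo rm) -rf /",
    "base64 pipeline": f"echo {encoded} | base64 -d | sh",
    "alternative utility": "find /x -delete",
}

for name, command in BYPASSES.items():
    print(f"{name:22} allowed={naive_guard_allows(command)}")

# shlex applies POSIX quote removal, roughly what bash does to the first case.
print(shlex.split(BYPASSES["quote splitting"]))
```

The guard blocks the literal `rm -rf /` but lets all five variants through, while `shlex.split` shows that quote removal alone already turns `r''m` back into `rm`. Note that `shlex` does not perform variable expansion or command substitution, so it only models the first class.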

None of these are new to offensive security — they're straight out of decades-old CTF and shell-scripting lore. What's new is that they're now sufficient to defeat the safety layer of tools that read your codebase and execute commands with your account's full privileges.

## Who Failed, Who Didn't

Adversa AI live-tested production binaries of 11 agents, representing roughly 548,000 combined GitHub stars, and sorted the results into four failure modes:

| Failure mode | Agents | What happened |
|---|---|---|
| Regex matches raw, unprocessed string | Hermes, opencode, Goose | Failed all five bypass classes; Goose leaked 22 of 23 test cases, opencode leaked 16 of 16 |
| Tokenizes, but still reasons about raw text | Cline, Roo-Code | Better, but still fails on quoted substitutions and per-flag reasoning; Cline leaked 2 of 13 in allow+deny mode |
| No static guard at all | Aider, Plandex, Open Interpreter | Relies entirely on human approval — defeated the moment auto-execute flags are enabled, which CI pipelines do routinely |
| Sandbox-only, with a documented opt-out | OpenHands, SWE-agent | Containers work when active, but a documented local-mode flag removes the containment entirely |

**Continue was the only agent that held up**, withstanding all 24 penetration attempts with zero leaks. Its approach is unglamorous but correct: tokenize the command the way bash's own quoting rules would, detect variable expansion, recursively evaluate command substitutions, check where pipes actually terminate, and keep an explicit deny-list of canonical destructive patterns evaluated *after* that normalization, not before it.
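
The normalize-then-deny idea can be sketched in a few lines. This is not Continue's implementation: it is a deliberately simplified illustration that refuses anything containing an expansion or substitution instead of evaluating it, and the deny-lists and function names are assumptions chosen for the example. It relies on `shlex` from the standard library and needs Python 3.8 or later for `punctuation_chars` to combine with `whitespace_split`.

```python
import os
import shlex

# Illustrative lists only; a production deny-list would be much longer.
DESTRUCTIVE = {"rm", "dd", "shred"}
SHELLS = {"sh", "bash", "dash", "zsh"}
FIND_ACTIONS = {"-delete", "-exec", "-execdir"}
OPERATOR_CHARS = set("();<>|&")

def tokenize(command):
    """Tokenize with POSIX quoting rules, keeping shell operators as tokens."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True  # split only on whitespace and operators
    lexer.commenters = ""          # never silently drop text after '#'
    return list(lexer)

def split_stages(tokens):
    """Yield the words of each simple command between operator tokens."""
    stage = []
    for tok in tokens:
        if tok and set(tok) <= OPERATOR_CHARS:
            if stage:
                yield stage
            stage = []
        else:
            stage.append(tok)
    if stage:
        yield stage

def check(command):
    """Return 'allow' or a 'deny: ...' reason for one command line."""
    # Fail closed on input bash would still rewrite after we look at it.
    if any(ch in command for ch in "$`\n"):
        return "deny: expansion, substitution, or newline"
    try:
        tokens = tokenize(command)
    except ValueError:
        return "deny: unbalanced quoting"
    for stage in split_stages(tokens):
        name = os.path.basename(stage[0])
        if name in DESTRUCTIVE:
            return f"deny: destructive command {name}"
        if name in SHELLS:
            return "deny: text is handed to a shell"
        if name == "find" and FIND_ACTIONS & set(stage):
            return "deny: find with a destructive action"
    return "allow"
```

Failing closed makes the sketch over-deny: a harmless `echo '$5'` is rejected because the check does not track whether the `$` is quoted. It also ignores wrappers such as `env` or `xargs`, which a real guard would have to unwrap. The point is the ordering: the deny-list runs on the tokens the shell would see, and whatever cannot be normalized is rejected.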

## The Attack in Practice

The paper's proof-of-concept is the part worth internalizing, because it doesn't require tricking anyone into typing a suspicious command. It requires a pull request:

1. An attacker submits a PR containing a poisoned `Makefile`, with a `clean` target that runs `rm -rf "$$HOME/.aws/credentials"`.
2. A developer's agent reads the Makefile — completely normal, unremarkable behavior for any coding agent.
3. The agent later runs `make test`, which depends on `clean` as a prerequisite.
4. With auto-execute enabled (the default in most CI configurations), the destructive command runs with no filter catching it, because the guard never saw the literal string `rm -rf` — it saw a `make` invocation.
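
The gap in step 4 is that the guard judges the word `make` and never sees the recipes behind it. As a rough illustration, the sketch below statically lists the recipe lines a target would pull in through its prerequisites, so they could be handed to a guard before anything runs. The parser is an assumption-laden toy: it handles only explicit `target: prerequisites` rules with tab-indented recipes and ignores includes, pattern rules, and variables. The sample Makefile, including its `pytest` recipe, is hypothetical.

```python
import re

# Matches "target: prereqs" but not variable assignments such as "X := 1".
RULE = re.compile(r"^([A-Za-z0-9_.-]+)\s*:(?!=)\s*(.*)$")

def parse_makefile(text):
    """Map each explicit target to (prerequisites, recipe lines)."""
    rules, current = {}, None
    for line in text.splitlines():
        if line.startswith("\t") and current is not None:
            # make turns "$$" into a literal "$" before the shell sees it.
            rules[current][1].append(line.strip().replace("$$", "$"))
            continue
        match = RULE.match(line)
        if match:
            current = match.group(1)
            prereqs, _recipe = rules.setdefault(current, ([], []))
            prereqs.extend(match.group(2).split())
        elif line.strip():
            current = None  # any other non-blank line ends the rule
    return rules

def recipes_for(target, rules, seen=None):
    """Collect recipe lines for target, prerequisites first."""
    seen = set() if seen is None else seen
    if target in seen or target not in rules:
        return []
    seen.add(target)
    prereqs, recipe = rules[target]
    lines = []
    for dep in prereqs:
        lines += recipes_for(dep, rules, seen)
    return lines + recipe

# Hypothetical poisoned Makefile in the shape the proof-of-concept describes.
SAMPLE = (
    "test: clean\n"
    "\tpytest\n"
    "clean:\n"
    '\trm -rf "$$HOME/.aws/credentials"\n'
)

for line in recipes_for("test", parse_makefile(SAMPLE)):
    print(line)
```

Running it on the sample surfaces the `rm -rf` recipe from `clean` ahead of the `pytest` line, which is exactly the text a guard looking only at `make test` never inspects.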

This is the same supply-chain shape as May's TrustFall/Miasma story: you don't need to compromise the developer's judgment, you need to compromise a file the agent is going to read as a matter of course.

## Where Does Claude Code Fit In?

Notably, Claude Code isn't on Adversa AI's tested list: the survey selected purely by open-source GitHub stars, and Claude Code is closed-source. That's worth being honest about rather than claiming an unearned pass. The underlying architectural question ("does the guard reason about the command bash will actually run, or the command text it was handed?") applies to any agent with a text-based permission layer, proprietary or not. Anthropic's own trust-dialog design has been criticized before: Claude Code was one of four agents affected by TrustFall in May, and Anthropic's response was to say the one-keypress RCE pattern fell within its stated threat model rather than treat it as a bug.

What GuardFall does reinforce is the direction Claude Code has been moving since v2.1.187's `sandbox.credentials`: away from pattern-matching specific commands and toward deny-by-default access control over *resources* (credential files, specific directories) regardless of how a command is spelled. A filter that blocks reads of `~/.aws/credentials` at the filesystem layer doesn't care whether the command that tried to read it was `cat`, `dd`, or a base64-decoded pipeline — it never had to parse the command text correctly in the first place. That's a structurally stronger guarantee than any command-string blocklist, however well-tokenized, because it doesn't depend on enumerating every way bash can be asked to do something.
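
A small model of that resource-level decision, under stated assumptions: the function below is not Claude Code's sandbox or any vendor's implementation, and real enforcement would live in the operating system or sandbox layer, not in Python. It only illustrates that the check takes a resolved path as input and never sees command text. The function names and the denied roots are hypothetical, and the demo assumes a POSIX system where symlinks can be created.

```python
import os
import tempfile

def expand(path, home):
    """Expand a leading '~' against an explicit home directory."""
    if path == "~" or path.startswith("~/"):
        return home + path[1:]
    return path

def is_denied(path, denied_roots, home):
    """Return True when path resolves to a location inside a denied root."""
    real = os.path.realpath(expand(path, home))
    for root in denied_roots:
        real_root = os.path.realpath(expand(root, home))
        if real == real_root or real.startswith(real_root + os.sep):
            return True
    return False

def demo():
    """Build a throwaway home and query the policy three ways."""
    home = tempfile.mkdtemp(prefix="policy-demo-")
    os.makedirs(os.path.join(home, ".aws"))
    secret = os.path.join(home, ".aws", "credentials")
    open(secret, "w").close()
    alias = os.path.join(home, "notes.txt")
    os.symlink(secret, alias)  # a differently spelled route to the same file
    denied = ["~/.aws", "~/.ssh"]
    return {
        "direct": is_denied("~/.aws/credentials", denied, home),
        "symlink": is_denied(alias, denied, home),
        "project": is_denied(os.path.join(home, "src", "main.py"), denied, home),
    }

print(demo())
```

Both the direct path and the symlinked alias are denied because the decision is made on the resolved location, while an ordinary project file is allowed. How the request was phrased on the command line never enters the function.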

## What to Actually Do About It

If you or your team run any of the ten vulnerable agents, Adversa AI's mitigations are worth applying this week, not this quarter:

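- **Redirect `$HOME` to a throwaway sandbox directory** before running the agent, so `~/.ssh` and `~/.aws` resolve to empty locations regardless of what the guard does or doesn't catch. This covers paths derived from `$HOME`; a payload that hard-codes an absolute path to the real home directory is not affected.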
- **Disable auto-execute flags** (`--auto-exec`, `--auto-run`, `--auto-test` and equivalents) unless the workflow is genuinely non-interruptible — most CI configurations enable these by default without anyone revisiting the decision.
- **Block agent execution on pull requests from forks.** The Makefile scenario above requires nothing more than an accepted external PR.
- **Treat repository configuration files as untrusted input** — `Makefile`, `.aider.conf.yml`, `package.json` scripts, and CI YAML are all attacker-reachable the moment your agent reads them.
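
The first mitigation is simple to wire into a launcher script. The following is a minimal sketch, assuming a POSIX environment; `run_with_throwaway_home` is a hypothetical helper name, and the child process here is a short Python one-liner standing in for whichever agent command you actually run.

```python
import os
import subprocess
import sys
import tempfile

def run_with_throwaway_home(argv):
    """Run argv with HOME pointing at a fresh temporary directory."""
    with tempfile.TemporaryDirectory(prefix="agent-home-") as fake_home:
        # Copy the current environment and override only HOME.
        env = dict(os.environ, HOME=fake_home)
        return subprocess.run(argv, env=env, capture_output=True, text=True)

# Stand-in for the agent: a child process that reports the HOME it sees.
result = run_with_throwaway_home(
    [sys.executable, "-c", "import os; print(os.environ['HOME'])"]
)
print(result.stdout.strip())
```

The temporary directory is deleted when the call returns, so anything the child wrote under its home is discarded. This changes only the environment variable: a process that looks up the home directory through the system password database would still find the real one, which is why the container-based options remain the stronger control.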

Longer term, Adversa AI's advice generalizes past this specific research: don't trust a command filter that was never tested against the five bypass classes above, and prefer tools that enforce access control at the resource level over ones that try to out-parse bash at the string level. Unix shells have been rewriting strings before executing them since the 1970s, and bash has done so since its first release in 1989. A safety layer built decades later that still loses to that behavior doesn't have a bug to patch; it has a design assumption to retire.

---

**Sources**: [Adversa AI — GuardFall disclosure](https://adversa.ai/blog/opensource-ai-coding-agents-shell-injection-vulnerability/) · [The Hacker News — GuardFall coverage](https://thehackernews.com/2026/06/guardfall-exposes-open-source-ai-coding.html) · [Security Affairs — GuardFall flaw hits 10 of 11 agents](https://securityaffairs.com/194546/ai/guardfall-flaw-hits-10-of-11-popular-open-source-ai-agents.html) · [SC Media — shell injection flaw brief](https://www.scworld.com/brief/shell-injection-flaw-found-in-10-of-11-open-source-ai-agents) · [Mallory — GuardFall story](https://www.mallory.ai/stories/019f1cce-b0d1-74aa-9190-6d9f06c5ca26)

