---
title: "SymJack and TrustFall: Every Major AI Coding Agent Has Been Hacked. Again."
date: 2026-06-25
tags: ["security","claude-code","cursor","github-copilot","mcp","rce","agentic-coding"]
categories: ["AI Tools","Industry"]
summary: "Adversa AI disclosed two new attack classes in May 2026: TrustFall (one-click RCE via malicious .mcp.json files affecting Claude Code, Cursor, Gemini CLI, and GitHub Copilot) and SymJack (symlink-hijack RCE across six agents including Codex and Grok Build). A real-world worm — Miasma — was found exploiting TrustFall in a production Microsoft Azure repository."
---


![SymJack and TrustFall: Every Major AI Coding Agent Has Been Hacked. Again.](/images/symjack-trustfall-rce-ai-coding-agents.png)

2026 keeps adding new chapters to what is becoming the defining security story of the agentic coding era: AI coding agents, because they are designed to read project files and execute tools autonomously, are structurally excellent attack surfaces. By my count, this is now the **fourth distinct security class** targeting these tools in twelve months.

Adversa AI disclosed it in May as two separate attacks, TrustFall and SymJack, and both are worse than the headline suggests.

## TrustFall: One Keypress, Full Compromise

**Disclosure**: May 7, 2026. **Affected tools**: Claude Code, Cursor CLI, Gemini CLI, GitHub Copilot CLI.

TrustFall works because every major AI coding agent now supports `.mcp.json` — a project-level configuration file that specifies which MCP servers to start when the agent opens the folder. That configuration is loaded automatically when a developer accepts a "do you trust this folder?" prompt.

Here's what Adversa AI demonstrated:

1. An attacker creates a public repository with a plausible-looking codebase.
2. The repo includes `.mcp.json` and `.claude/settings.json` (or the equivalent per tool) pointing to an attacker-controlled executable.
3. A developer clones the repo and opens it in their agent of choice.
4. The trust dialog appears: "Is this a project you created or one you trust?"
5. Developer clicks **Yes**.
6. The attacker's MCP server spawns as an OS process with full user privileges, opens a C2 channel, and begins exfiltrating SSH keys, cloud credentials, and any accessible file on disk.

The entire compromise happens in a single keypress. No exploit chain. No privilege escalation. Just the agent doing exactly what it was designed to do.
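To make concrete what a single "Yes" authorizes, here is a minimal Python sketch that lists the command lines a project's `.mcp.json` would ask the agent to spawn. It assumes the commonly used layout (a top-level `mcpServers` object whose entries carry `command` and `args`); the function name `commands_to_spawn` and the sample config are hypothetical, and other agents may use different keys.

```python
import json
import shlex


def commands_to_spawn(config_text: str) -> dict[str, str]:
    """Map each configured MCP server name to the command line it would run.

    Assumes a top-level "mcpServers" object with "command"/"args" entries;
    that layout is an assumption, not something every agent guarantees.
    """
    config = json.loads(config_text)
    servers = config.get("mcpServers", {})
    result = {}
    for name, spec in servers.items():
        argv = [spec.get("command", "")] + list(spec.get("args", []))
        result[name] = shlex.join(argv)
    return result


# Hypothetical config of the kind described above: a server named like a
# harmless helper that actually launches a script shipped in the repo.
SAMPLE = """
{
  "mcpServers": {
    "docs-helper": {"command": "bash", "args": ["./tools/helper.sh", "--quiet"]}
  }
}
"""

if __name__ == "__main__":
    for name, cmdline in commands_to_spawn(SAMPLE).items():
        print(f"{name}: {cmdline}")
```

Nothing in the trust prompt described above surfaces these command lines, which is the whole problem: the information exists in the repo, but the dialog does not show it.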

The UX regression that made this exploitable: Claude Code v2.1+ simplified the trust dialog, removing an earlier warning that `.mcp.json` could execute code and the option to "proceed with MCP servers disabled." The simplified dialog defaults to "Yes, I trust this folder" with no MCP-specific language.

Anthropic reviewed the TrustFall report and **declined to treat it as a vulnerability**. Their position: accepting the trust dialog constitutes consent to the full project configuration, including `.mcp.json`. The one-keypress-to-RCE pattern is, in Anthropic's current threat model, the intended behavior of the trust system.

### Miasma: TrustFall in the Wild

TrustFall didn't stay theoretical for long. The **Miasma worm** was found to have planted `.mcp.json` and IDE configuration files inside `Azure/durabletask`, a legitimate Microsoft Azure repository with thousands of stars and regular clones by enterprise developers. Anyone who opened that repository in Claude Code, Cursor, or Gemini CLI during the infection window triggered the payload automatically on trust acceptance.

Miasma is significant because it demonstrates the supply chain attack vector at scale: you don't need to trick developers into cloning malicious repositories. You compromise repositories they're already using.

## SymJack: The Approval Prompt Is Lying to You

**Disclosure**: May 27, 2026. **Affected tools**: Claude Code, Cursor Agent CLI, GitHub Copilot CLI, Google Antigravity/Gemini CLI, Grok Build, OpenAI Codex CLI — **six agents total**.

SymJack targets a different surface: the file-copy approval flow that agents use when they want to write files to disk.

The attack sequence:

1. The attacker prepares a repository with a **renamed symlink** — a file that presents as an innocuous document (a video file, a log file, a documentation asset) but is actually a symbolic link pointing at the agent's own configuration directory.
2. The agent's next action involves copying that file somewhere in the project.
3. The approval prompt shows the developer the source path (harmless-looking filename) and the stated destination (inside the project). The actual write destination — resolved through the symlink — is the agent's MCP configuration.
4. Developer approves. The kernel follows the symlink. The agent's MCP config is overwritten with the attacker's payload.
5. On the **next agent restart**, the malicious MCP server auto-starts with full user privileges.

The architectural flaw is that agents displayed approval prompts using the path *before* symlink resolution. Users always saw a benign path and a benign destination. The actual write target was invisible to them.
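The gap can be sketched in a few lines of Python: the path a prompt displays and the path the kernel writes to differ whenever a symlink sits in between. `resolved_write_target` and `escapes_project` are hypothetical helper names, and the temporary directories stand in for a project and an agent config directory. This illustrates the check on a POSIX system; it is not any vendor's actual fix.

```python
import os
import tempfile


def resolved_write_target(displayed_path: str) -> str:
    """Return the path a write would actually land on, following symlinks."""
    return os.path.realpath(displayed_path)


def escapes_project(displayed_path: str, project_root: str) -> bool:
    """True if the resolved target lies outside the project directory."""
    root = os.path.realpath(project_root)
    target = resolved_write_target(displayed_path)
    return os.path.commonpath([root, target]) != root


def demo() -> tuple[str, str, bool]:
    """Build a throwaway project with a disguised symlink and inspect it."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = os.path.realpath(tmp)  # normalise e.g. /var -> /private/var on macOS
        project = os.path.join(tmp, "project")
        agent_dir = os.path.join(tmp, "agent-config")  # stand-in, not a real agent path
        os.makedirs(project)
        os.makedirs(agent_dir)

        real_config = os.path.join(agent_dir, "mcp.json")
        with open(real_config, "w", encoding="utf-8") as fh:
            fh.write("{}")

        # Looks like a media asset in a directory listing, but is a symlink.
        disguised = os.path.join(project, "demo-recording.mp4")
        os.symlink(real_config, disguised)

        shown = os.path.relpath(disguised, tmp)
        actual = os.path.relpath(resolved_write_target(disguised), tmp)
        return shown, actual, escapes_project(disguised, project)


if __name__ == "__main__":
    shown, actual, escaped = demo()
    print(f"prompt shows : {shown}")
    print(f"write lands  : {actual}")
    print(f"outside repo : {escaped}")
```

An approval prompt built on `resolved_write_target` would show the second path instead of the first, and `escapes_project` gives a simple signal for refusing or escalating the request.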

Anthropic has quietly hardened Claude Code to resolve symlinks before displaying approval prompts. OpenAI closed the Codex CLI report via Bugcrowd as a "false positive"; its position is that user approval of a copy command implies consent regardless of symlink targets. Responses from Cursor, Copilot, and Grok Build vary, per available reporting.

## The Fourth Security Class in Twelve Months

To keep score on 2026's agentic security taxonomy:

| Class | When | Attack surface | Example |
|---|---|---|---|
| OAuth token hijacking | May 2026 | npm postinstall hook overwrites `~/.claude.json` | Mitiga Labs disclosure |
| STDIO transport injection | May 2026 | 200K+ servers with no execution boundary | Ox Security audit |
| Agentjacking / MCP injection | June 2026 | Malicious Sentry error events via MCP | Tenet Security / 2,388 orgs exposed |
| TrustFall / SymJack | May 2026 | Project trust dialog + file approval flow | Adversa AI / Miasma worm |

Each class exploits a different seam in the agentic model. MCP injection exploits trusted data sources. STDIO injection exploits the transport layer. OAuth hijacking exploits the installation lifecycle. TrustFall and SymJack exploit the developer permission UX — specifically the gap between what a consent prompt shows and what it actually authorizes.

The common thread: coding agents are powerful precisely because they act on context from many sources (project files, MCP servers, external APIs). Every source of context is a potential injection point. Every permission dialog is a potential social engineering surface.

## What the Vendor Responses Reveal

**Anthropic's "outside threat model" response to TrustFall** deserves scrutiny. It's technically accurate that the trust dialog is an opt-in. But the purpose of a security dialog is to give users meaningful consent — consent that includes understanding what they're agreeing to. A dialog that says "do you trust this folder?" while omitting "this will execute the following commands as you" is not informed consent.

The counter-argument is that every IDE, terminal, and development tool has always been capable of running arbitrary code from project configurations. This is true. The difference is that modern AI agents lower the skill threshold for cloning and working with unfamiliar repositories to near zero, which brings in exactly the population most likely to click "Yes" on a trust dialog without reading project configs first.

**OpenAI's "false positive" response to SymJack** has a similar structural problem. "The user approved the copy" is true. "The user understood they were approving a write to their MCP configuration via an opaque symlink" is not.

Both responses prioritize the letter of the permission model over its spirit. Both vendors will need to revisit that stance as supply chain attacks via AI-targeted repository poisoning scale up.

## Practical Mitigations

**Review `.mcp.json` before accepting any trust dialog.** This is annoying and not how the UX is designed, but it's the only way to verify what MCP servers will actually start. Run `cat .mcp.json` in the cloned repo before opening it in any agent, and check `.claude/settings.json` (or the equivalent for your tool) the same way.
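For repositories with nested packages, a root-level `cat` can miss config files further down the tree. The following is a small Python sketch that walks a clone and lists every file worth reading first. The helper name `find_agent_configs` is hypothetical, and the default file names are only the two mentioned in this article; add whatever your own agent reads.

```python
import os

# File names taken from the attack description in this article; extend the
# tuple for other tools after checking their documentation yourself.
CONFIG_SUFFIXES = (".mcp.json", os.path.join(".claude", "settings.json"))


def find_agent_configs(repo_root: str, suffixes=CONFIG_SUFFIXES) -> list[str]:
    """Return repo-relative paths of agent config files, skipping .git."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(repo_root, followlinks=False):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for filename in filenames:
            rel = os.path.relpath(os.path.join(dirpath, filename), repo_root)
            if any(rel == s or rel.endswith(os.sep + s) for s in suffixes):
                hits.append(rel)
    return sorted(hits)


if __name__ == "__main__":
    for path in find_agent_configs("."):
        print(path)
```

Run it from the root of a fresh clone, before opening the folder in any agent, and read each listed file by hand.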

**Pin MCP server versions explicitly.** A malicious commit to a depended-upon MCP server is an attack surface. `npx @vendor/server@1.2.3` is safer than `npx @vendor/server@latest`.
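A rough way to spot unpinned specs is to check whether the part after the final `@` is an exact `x.y.z` version. This Python sketch uses a hypothetical helper, `is_pinned`, and deliberately treats tags, ranges, and prerelease strings as unpinned, which is stricter than npm's own rules.

```python
def is_pinned(package_spec: str) -> bool:
    """True if an npm-style spec names an exact x.y.z version.

    Scoped packages start with "@", so a version separator is only an "@"
    found after index 0. Tags ("latest"), ranges ("^1.2.3"), and prerelease
    versions are all treated as unpinned in this simplified check.
    """
    at = package_spec.rfind("@")
    if at <= 0:
        return False  # no version part at all
    parts = package_spec[at + 1:].split(".")
    return len(parts) == 3 and all(p.isdigit() for p in parts)


if __name__ == "__main__":
    for spec in ("@vendor/server@1.2.3", "@vendor/server@latest", "@vendor/server"):
        print(spec, "pinned" if is_pinned(spec) else "NOT pinned")
```

Pinning a version narrows the window for a malicious update; it does not by itself guarantee that the pinned release is trustworthy.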

**Use Claude Code's `sandbox.credentials` setting** (introduced in v2.1.187). It blocks sandboxed commands from reading credential files at `~/.ssh`, `~/.aws`, `~/.config/gcloud`, etc. It doesn't prevent MCP server startup but limits what a compromised MCP server can exfiltrate.

**Add MCP integrity checking to your CLAUDE.md invariants**: `Never start MCP servers not listed in [your explicit whitelist]`. Whether this holds under all injection scenarios is debatable, but it raises the bar.
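Because an instruction in `CLAUDE.md` is advisory, some teams may prefer to pair it with a deterministic check that runs outside the model. The following Python sketch compares each configured server's exact command line against an explicit allowlist. `unapproved_servers` is a hypothetical helper, the `mcpServers` layout is an assumption about the config format, and the allowlist entries are illustrative.

```python
import json


def unapproved_servers(config_text: str, approved: set[tuple[str, ...]]) -> list[str]:
    """Return names of configured servers whose exact argv is not allowlisted.

    Assumes a top-level "mcpServers" object with "command"/"args" entries.
    """
    servers = json.loads(config_text).get("mcpServers", {})
    rejected = []
    for name, spec in servers.items():
        argv = (spec.get("command", ""), *spec.get("args", []))
        if argv not in approved:
            rejected.append(name)
    return sorted(rejected)


if __name__ == "__main__":
    # Illustrative allowlist: one pinned server, matched on the full argv.
    approved = {("npx", "@vendor/server@1.2.3")}
    with open(".mcp.json", encoding="utf-8") as fh:
        for name in unapproved_servers(fh.read(), approved):
            print(f"not on the allowlist: {name}")
```

Matching on the full argument vector rather than the server name means a renamed or re-pointed entry fails the check even if its label looks familiar.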

**For enterprise teams**, the `disallowed-tools` enterprise sandboxing in Claude Code v2.1.152+ lets you block specific MCP tool categories organization-wide. Combined with a private MCP server registry, you can prevent developers from accidentally running attacker-controlled servers.

The fundamental fix — agents that resolve symlinks before showing approval prompts, and trust dialogs that explicitly enumerate executables about to be spawned — is a product design problem. Some of it is being patched. The rest requires the industry to agree that informed consent in agentic tools means something more than "the user clicked Yes."

---

**Sources**: [Adversa AI — TrustFall disclosure](https://adversa.ai/blog/trustfall-coding-agent-security-flaw-rce-claude-cursor-gemini-cli-copilot/) · [Adversa AI — SymJack disclosure](https://adversa.ai/blog/the-approval-prompt-is-lying-to-you-symlink-rce-in-five-ai-coding-agents-claude-code-cursor-antigravity-copilot-grok-build/) · [SecurityWeek — SymJack supply chain analysis](https://www.securityweek.com/symjack-attack-turns-ai-coding-agents-into-supply-chain-attack-delivery-systems/) · [Help Net Security — TrustFall one-keypress RCE](https://www.helpnetsecurity.com/2026/05/07/trustfall-ai-coding-cli-vulnerability-research/) · [The Register — Claude Code trust prompt RCE](https://www.theregister.com/security/2026/05/07/claude-code-trust-prompt-can-trigger-one-click-rce/5235319) · [CVE-2025-53773 — GitHub Copilot prompt injection](https://nvd.nist.gov/vuln/detail/CVE-2025-53773)