Every time Anthropic drops a new benchmark score, the AI press covers it. Every time Anthropic quietly ships five features that change how you actually work in Claude Code, crickets.
April 2026 was one of those quiet months. While the coverage focused on Opus 4.7’s SWE-bench numbers, Anthropic shipped a cluster of power-user features that matter more to your daily workflow than any leaderboard position. Here’s what dropped and what it actually means.
/ultrareview: A Skeptical Senior Engineer in One Command
The headliner of the batch is /ultrareview. The description from Anthropic is accurate: it “runs a dedicated review session that reads through your changes and flags what a careful reviewer would catch.”
What that means in practice: rather than asking Claude to review your changes in the middle of an active session (where it’s sharing context budget with the implementation work it just did), /ultrareview spins up a separate, focused review session. It examines your current branch or a specific GitHub PR with a structured protocol: architecture, logic correctness, security, performance, and maintainability in a single pass, run at maximum effort.
The key distinction from claude "review my code" is the isolation. A model reviewing its own recent work in the same session has a well-documented blind spot — it tends to validate what it just built. /ultrareview sidesteps this by starting fresh, with no implementation context to defend.
Pro and Max users get 3 free /ultrareview passes per billing cycle. Beyond that, it runs at standard Opus 4.7 rates. Given that a thorough manual review of a non-trivial changeset takes 30–60 minutes, the math on /ultrareview is favorable even at standard token prices.
When to use it: Before merging a PR you’re not fully confident in. After a large refactor. Any time you’d normally ask a colleague “can you take a quick look at this?” — but you want more than a quick look.
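Invocation is a single command inside a session. A quick sketch of what that looks like — note that the PR-targeting argument shown here is an assumption for illustration, not documented syntax:

```
# Inside a Claude Code session, with the branch you want reviewed checked out:
/ultrareview

# Hypothetical form for pointing it at a specific GitHub PR:
/ultrareview https://github.com/your-org/your-repo/pull/1234
```

Because the review runs in its own session, you can keep working in your original session while it completes.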
Auto Mode for Max Subscribers
Auto mode was previously available only on Teams, Enterprise, and API plans. As of this April release, it’s extended to Max subscribers — and it no longer requires the --enable-auto-mode flag.
Auto mode lets Claude proceed through longer tasks without stopping to request confirmation at every decision point. The model still respects explicit safety gates, but routine judgment calls — “should I rename this variable?” “should I split this function?” — don’t generate interruption prompts.
For developers who were already on Teams or Enterprise, nothing changes. For Max subscribers who’ve been running Claude Code in the default interactive mode and found themselves constantly babysitting it through multi-step tasks, this is a material change. You can hand Claude a well-scoped implementation task, go do something else, and return to finished work.
The tradeoff is the same one that’s always existed: auto mode requires you to be precise about what you want upfront. A loose prompt in auto mode produces a confident answer to the wrong question. The spec-first discipline that makes agentic workflows effective — write what you want, not how to do it — is more important, not less, when the agent is running without interruption.
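To make that concrete, here is a sketch of the difference between a loose prompt and a scoped one. The task, file paths, and function names are all hypothetical:

```
# Loose — in auto mode, this invites a confident answer to the wrong question:
Clean up the auth module.

# Scoped — states the outcome, the constraints, and the boundaries:
In src/auth/, replace the inline session-token checks that duplicate
validateToken() with calls to it. Behavior must not change; the existing
tests in tests/auth/ must still pass. Do not touch the OAuth flow.
```

The second prompt says what done looks like, which is exactly the information an uninterrupted agent can't stop to ask you for.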
xhigh Effort: Finer-Grained Control on Hard Problems
Opus 4.7 introduced a new effort level: xhigh. It sits between high and max, giving you more thinking budget per request than high without the full latency and cost ceiling of max.
The practical use case: problems that are too complex for high effort to handle reliably, but where you don’t need the full extended reasoning pass that max triggers. Architectural questions, algorithmic problems, complex debugging — situations where you want the model thinking harder, but you don’t need to wait for the maximum-length chain of thought.
Combined with the /effort command, which lets you set effort level per-session or per-task, you now have four rungs on the effort ladder: low, high, xhigh, max. The default for Opus 4.7 sessions is xhigh.
If you’ve been defaulting to max effort on everything and wondering why costs are higher than expected, switching to xhigh for most tasks and reserving max for genuinely hard problems is an easy optimization.
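In practice that optimization is two session commands. A sketch of the workflow (the /effort command and level names are from this release; the pattern of bumping and dropping back is just a suggested habit):

```
# Set the session default once:
/effort xhigh

# Bump to max only when you hit a genuinely hard problem...
/effort max

# ...then drop back down for routine work:
/effort xhigh
```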
/recap: Context Recovery Without Starting Over
Anyone who’s left a long Claude Code session and come back to it knows the problem: the model has lost the thread of what was happening. Recapping manually is tedious. Starting over wastes the prior session’s context.
The new /recap command generates a summary of what happened in the session — what was built, what decisions were made, what’s in progress — and injects it as context. You can invoke it manually with /recap, set it to run automatically when returning to a session via /config, or both.
This is more useful than it sounds. Multi-hour implementation sessions accumulate context that’s genuinely hard to reconstruct: why a particular approach was chosen, what was tried and abandoned, which files were touched. /recap surfaces that context on demand rather than forcing you to scroll through the session history.
The feature integrates with the desktop app’s multi-session sidebar: you can see a session summary in the sidebar before you resume it, which helps with the “which session was I working on this in?” problem.
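The two ways to get a recap, sketched as session commands — the article doesn't name the specific /config setting, so the comment on the second line is an assumption:

```
# On demand, after resuming a stale session:
/recap

# Or open the config UI and enable automatic recap-on-resume
# (exact setting name not documented in the release notes):
/config
```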
Prompt Caching TTL Controls
This one is for API and infrastructure users, but it’s worth knowing.
Claude Code now supports two new environment variables for prompt cache TTL:
- ENABLE_PROMPT_CACHING_1H: opt into a 1-hour cache TTL for API key, Bedrock, Vertex, and Foundry
- FORCE_PROMPT_CACHING_5M: force a 5-minute TTL, overriding the default
The default prompt cache TTL has been 5 minutes. For most interactive sessions, that’s fine — you’re actively working and the cache stays warm. But for scheduled automations, long agentic runs, or workflows where Claude Code pauses for extended periods, the 5-minute TTL means you’re paying full price to re-establish context on every restart.
The 1-hour TTL trades a slightly higher cache cost for much better hit rates on longer workflows — particularly relevant for Claude Code Routines that fire on schedules and need to work with large, stable system prompts.
For teams running Claude Code at scale via the API, this is a meaningful cost lever. The 1-hour TTL can dramatically reduce token costs on workflows with large, stable context — CLAUDE.md files, large codebases with slow-changing structure, per-org system prompts.
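For a scheduled automation, opting in is one environment variable. A minimal sketch — the variable name comes from the release notes, but the value, script shape, and claude invocation are illustrative assumptions:

```shell
#!/bin/sh
# Opt this run into the 1-hour prompt-cache TTL. Worth it when the job
# reuses a large, stable system prompt (CLAUDE.md, per-org context)
# across restarts more than 5 minutes apart.
export ENABLE_PROMPT_CACHING_1H=1

# Kick off the scheduled run (prompt and flags are illustrative):
claude -p "Run the nightly dependency-audit routine"
```

The inverse knob works the same way: set FORCE_PROMPT_CACHING_5M to pin the short TTL for workflows where the longer cache window isn't worth its higher write cost.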
The Pattern
These five features share a common theme: they’re not about making Claude smarter. They’re about removing friction from the workflow layer that sits between the developer and the model.
/ultrareview removes the bias of self-review. Auto mode on Max removes the confirmation tax on longer tasks. xhigh effort removes the false binary between speed and quality. /recap removes the context cliff when sessions pause. Prompt caching TTL removes the cost penalty on intermittent long-horizon workflows.
None of these will appear in a press release. None of them produce a benchmark score. All of them change whether Claude Code is something you use occasionally or something you run your engineering workflow through.
The benchmark releases tell you what the model is capable of. The unglamorous power-user features tell you what that capability is worth in practice.