# Claude

- [Agentic Coding 101: When Your AI Plans, Builds, Tests, and Ships](https://sdd.sh/2026/05/agentic-coding-101-when-your-ai-plans-builds-tests-and-ships.md): Most engineers still think of AI coding as an advanced autocomplete. They're missing the paradigm shift. Agentic coding is fundamentally different — the AI plans the work, writes the code, runs the tests, fixes the failures, and iterates until the task is done.
- [Cursor Security Review vs. Claude Security: Two Betas, One Week, Opposite Architectures](https://sdd.sh/2026/05/cursor-security-review-vs.-claude-security-two-betas-one-week-opposite-architectures.md): On April 30, 2026, Cursor and Anthropic both shipped AI-powered security products. The features look similar on paper. The architectures could not be more different, and that difference tells you everything about where each company thinks AI coding is headed.
- [OpenAI Lands on Amazon Bedrock — The Cloud That Already Houses Claude](https://sdd.sh/2026/04/openai-lands-on-amazon-bedrock-the-cloud-that-already-houses-claude.md): After Microsoft's exclusivity expired on April 27, OpenAI moved its models, Codex agent, and a new jointly built Bedrock Managed Agents runtime onto AWS. Amazon now hosts both Anthropic and OpenAI. Here's what the infrastructure power shift means for the AI coding landscape.
- [DeepSeek V4 Ships: Frontier-Class Coding at 1/6th the Cost](https://sdd.sh/2026/04/deepseek-v4-ships-frontier-class-coding-at-1/6th-the-cost.md): DeepSeek V4-Pro hits 80.6% on SWE-bench Verified and 93.5% on LiveCodeBench, matching or exceeding most closed models, while costing one-sixth as much as Claude Opus 4.7 and shipping under the MIT license. Here's what actually matters, and what the benchmarks don't tell you.
- [Claude Design Is Not a Figma Clone. It's the Missing First Half of Your Agentic Stack.](https://sdd.sh/2026/04/claude-design-is-not-a-figma-clone.-its-the-missing-first-half-of-your-agentic-stack..md): Anthropic's Claude Design launched April 17 as a research preview. It's not a Figma alternative — it's the upstream half of the Claude Code shipping pipeline, and the handoff mechanism changes the conversation entirely.
- [Claude Opus 4.7 Is Your New API Default on April 23. Here's What Changes.](https://sdd.sh/2026/04/claude-opus-4.7-is-your-new-api-default-on-april-23.-heres-what-changes..md): On April 23, the 'opus' API alias switches to Opus 4.7. Same price, one-third the tool errors, best SWE-bench Pro score on the market. If your pipeline uses the bare alias, you're upgrading automatically. Here's what that actually means.
- [Claude Opus 4.7: 87.6% SWE-bench, Implicit-Need Tests, Same Price](https://sdd.sh/2026/04/claude-opus-4.7-87.6-swe-bench-implicit-need-tests-same-price.md): Anthropic shipped Claude Opus 4.7 on April 16, 2026. SWE-bench Verified jumps nearly 7 points to 87.6%, SWE-bench Pro leaps from 53.4% to 64.3%, and the model is the first Claude to pass implicit-need tests. Pricing stays flat at $5/$25 per million tokens.
- [81% vs. 46%: The AI Coding Benchmark That's Been Lying to You](https://sdd.sh/2026/04/81-vs.-46-the-ai-coding-benchmark-thats-been-lying-to-you.md): SWE-bench Verified — the benchmark that put every frontier model above 80% — is contaminated. OpenAI stopped reporting it in February. Here's what actually happened, what SWE-bench Pro replaces it with, and why 46% is a more honest number than 81%.
- [Claude Managed Agents: Anthropic Just Built the Agent Loop You Were Going to Write Anyway](https://sdd.sh/2026/04/claude-managed-agents-anthropic-just-built-the-agent-loop-you-were-going-to-write-anyway.md): Anthropic launched Claude Managed Agents on April 8 — a managed API that handles the agent loop, sandboxing, checkpointing, and tool orchestration you'd otherwise build yourself. Here's what it actually offers, how the pricing model works, and why it matters for teams shipping production agents.
- [Claude Mythos Goes Official: Project Glasswing and the Zero-Day Reckoning](https://sdd.sh/2026/04/claude-mythos-goes-official-project-glasswing-and-the-zero-day-reckoning.md): Anthropic officially unveiled Claude Mythos Preview on April 7, confirming what the March leak hinted at: a model that autonomously found thousands of zero-days across every major OS and browser. Their response — Project Glasswing — grants restricted access to a select group of tech giants to use Mythos as a defensive weapon. This is the most consequential 'too dangerous to release' moment in AI history.
- [Claude's 1M Context Window Is Now Standard: What Actually Changes for Agentic Coding](https://sdd.sh/2026/04/claudes-1m-context-window-is-now-standard-what-actually-changes-for-agentic-coding.md): On March 13, Anthropic made the 1M token context window standard on Sonnet 4.6 and Opus 4.6 — no beta header, no pricing premium above 200K. Here is what that actually changes for coding agents, how it compares to the competition, and what it still cannot solve.
- [The SWE-bench Plateau: Three Frontier Models Walk In, All Score 80% — Now What?](https://sdd.sh/2026/04/the-swe-bench-plateau-three-frontier-models-walk-in-all-score-80-now-what.md): Gemini 3.1 Pro, Claude Opus 4.6, and GPT-5.3-Codex are all within 0.8 percentage points of each other on SWE-bench Verified. When every frontier model aces the exam, the exam stops being useful. Here's what actually differentiates them.
- [Claude Mythos: The Leaked Model That Scared the Security World](https://sdd.sh/2026/03/claude-mythos-the-leaked-model-that-scared-the-security-world.md): A CMS misconfiguration at Anthropic accidentally revealed 'Claude Mythos' — a model tier above Opus 4.6 that Anthropic itself calls an unprecedented cybersecurity risk. Here's what leaked, what it means for agentic coding, and why the security industry noticed immediately.