Weekly News Roundup and Thoughts
Umar Farouk
Today I will be sharing some news, mostly AI related, along with my thoughts and some actionable steps to mitigate risks. A lot is happening in the AI space, and in the tech space generally. Staying up to date gives you an edge, and as always, "Knowledge is power".
Vulnerability in Claude Extension for Chrome Exposes AI Agent to Takeover
A vulnerability in the Claude extension for Chrome could allow attackers to take over the AI agent and abuse it for information theft, cybersecurity firm LayerX reports.
The flaw, dubbed ClaudeBleed, is a combination of lax permissions, where any Chrome extension can run commands in Claude in Chrome, and poorly implemented trust in the origin of the command, not the execution context.
According to LayerX, the main issue is that the Claude extension allows interaction with any script running in the origin browser, without verifying its owner.
“As a result, any extension can invoke a content script (which does not require any special permissions) and issue commands to the Claude extension,” the company explains.
Claude in Chrome, it says, trusts the origin of the execution, which is claude.ai, and not the execution context, thus allowing any JavaScript running in the origin to issue privileged commands.
This allows an attacker to create an extension with a declared content script configured to run in the Main world, ensuring the script executes as part of the page, and then send a message to the Claude extension, which trusts the sender because the message originates from claude.ai.
Because a message handler in Claude in Chrome accepts and forwards arbitrary prompts, the attacker can perform remote prompt injection and control the AI agent’s actions.
While Claude enforces user confirmation for sensitive actions, as well as policies that prevent certain actions, and makes decisions based on certain inputs, LayerX discovered that the attacker’s script could bypass these protections.
The company was able to forge user approval by repeatedly sending a confirmation message and relied on Document Object Model (DOM) manipulation to dynamically modify UI elements and alter Claude’s perception of the actions.
It was also able to gain visibility into command execution through repeated triggering of the action and by observing the effects.
“This vulnerability effectively breaks Chrome’s extension security model by allowing a zero-permission extension to inherit the capabilities of a trusted AI assistant,” LayerX says.
This attack chain, the company says, allows an attacker to weaponize Claude to exfiltrate data from Gmail, GitHub, or Google Drive, as well as to send emails, delete data, and share documents on behalf of the user.
When notified of the issue, Anthropic told LayerX it was working on a patch, but the fix only partially addressed the underlying vulnerability, through “internal security checks to prevent extensions running in ‘standard’ mode from executing remote commands”.
Because the root cause of the weakness was not addressed, an attacker can simply switch the extension to 'privileged' mode and bypass the fix. The user is never notified or asked to approve the switch, LayerX says.
My Thoughts
The story hits a perfect storm: a well-known AI brand, a catchy name, an incomplete patch, and implications that go well beyond one company. Here's what to take from it.
When a security firm discloses a vulnerability and the vendor patches it within days, that should normally be the best-case outcome. With ClaudeBleed, it’s where the real story begins.
LayerX reported the flaw to Anthropic on April 27. Anthropic responded the next day, stating the issue had already been identified internally and would be fixed in an upcoming release. That response itself is worth unpacking. Saying it was "already identified internally" means Anthropic knew about a critical trust boundary failure in a product it had just shipped to the public. That isn't a minor oversight buried in a backlog; it's a known, open wound in an AI agent with access to your Gmail, Drive, and GitHub.
Then came the patch. Anthropic released an updated extension version (version 1.0.70) on May 6, 2026. Contrary to their initial response, the externally_connectable message handler was not removed, but Anthropic did introduce additional approval flows for privileged actions. The researcher didn’t need days to find the bypass. LayerX’s principal security researcher said he was able to “hack the fix” within just three hours.
This matters beyond the specifics of one extension. A three-hour bypass tells you the underlying architecture was never properly rethought; a surface-level guardrail was added on top of a fundamentally broken trust model. That's a pattern the security community has a name for: "security theater."
Zero permission = maximum damage
Chrome’s extension permission system is designed around a principle most users intuitively understand: before an extension can do something sensitive, it has to ask for it. Want to read your tabs? Ask. Want to access a specific site? Ask. This is the social contract that makes the Chrome Web Store feel (at least somewhat) trustworthy.
ClaudeBleed shatters that contract. Any extension can invoke a content script, which does not require any special permissions, and issue commands to the Claude extension. The attack works because Claude in Chrome trusts the origin (claude.ai) rather than the execution context. Any JavaScript running on claude.ai can therefore impersonate a legitimate user command. A malicious extension simply needs to inject a content script configured to run in the "Main world," meaning it executes as part of the page itself rather than in the isolated extension sandbox.
The practical consequence is stark. A zero-permission extension can inherit the capabilities of a trusted AI assistant, meaning every dangerous capability Claude has been granted (reading your Drive, composing your emails, browsing your GitHub) becomes available to any extension that wants it, with no prompts, no warnings, and no trail. The malicious extension looks completely harmless in Chrome's permissions UI because it literally has none.
Imagine a keycard system where any employee badge (including the janitor's) automatically gains the same access as the CEO's, simply by being swiped at the CEO's door while the CEO is nearby. The badge has no special privileges on paper. The building's security system just fails to check whether the badge actually belongs there.
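The broken trust check can be modeled in a few lines. The sketch below is purely illustrative Python, not Anthropic's actual code: the message fields, the context labels, and the hardened variant are my own assumptions, made only to show the difference between origin-only and context-aware authorization.

```python
# Illustrative model of the flaw LayerX describes: a command is authorized
# based only on its origin, not on which execution context produced it.
from dataclasses import dataclass

TRUSTED_ORIGIN = "https://claude.ai"

@dataclass
class Message:
    origin: str    # where the sending script is running
    context: str   # "page", "content_script", or "extension"
    command: str

def flawed_authorize(msg: Message) -> bool:
    # Vulnerable check: any JavaScript running on claude.ai passes,
    # including a content script injected by a zero-permission extension.
    return msg.origin == TRUSTED_ORIGIN

def hardened_authorize(msg: Message, allowed_contexts=("extension",)) -> bool:
    # Context-aware check: both the origin AND the provenance of the
    # sender must match before a privileged command is accepted.
    return msg.origin == TRUSTED_ORIGIN and msg.context in allowed_contexts

# A command injected by a malicious extension's Main-world content script:
injected = Message(origin=TRUSTED_ORIGIN, context="content_script",
                   command="export all Drive files")

print(flawed_authorize(injected))    # accepted by the vulnerable model
print(hardened_authorize(injected))  # rejected once context is checked
```

The point of the toy model is that the injected message is indistinguishable from a legitimate one if the only thing you inspect is the origin string.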
The cover-your-tracks angle
Most cyberattacks leave footprints. Anomalous login times, unexpected file access logs, emails in “Sent” that the user didn’t write. Incident responders are trained to look for exactly these artifacts. ClaudeBleed makes that job dramatically harder.
Claude relies on text, user interface semantics, and interpretation of screenshots to make decisions, all of which an attacker can control on the input side. The researchers modified Claude's user interface to remove labels and indicators around sensitive information, like passwords and sharing feedback, then prompted Claude to share the files with an outside server.
Because Claude is making decisions based on what it perceives the page to say, not what it actually says, the attacker can reshape Claude's understanding of reality. A "Share externally" button can be relabeled "Request feedback." A file named "Top Secret" can be made invisible to Claude's perception while still being exfiltrated. The AI doesn't know it's been deceived because it has no ground truth to check against.
Where there is visible activity, the model can be prompted to cover its tracks by deleting emails and other evidence of its actions. This creates a scenario where a full data exfiltration (credentials, private documents, confidential communications) could occur with no forensic evidence remaining. The victim would have no reason to suspect anything happened until they discovered, externally, that their data had been leaked.
This is what makes ClaudeBleed qualitatively different from a traditional credential-stealing attack. Traditional attacks are passive: they capture what passes through. This attack is agentic: it actively uses your trusted tools, speaks with your voice, and then erases the conversation.
The “AI race” frame
The most durable insight from this research isn't technical; it's cultural. The LayerX researcher warned: "In the current AI race, vendors are moving too fast and granting powerful capabilities to improve user experience, while neglecting basic security foundations and opening new opportunities for attackers. As AI agents become the norm, these structural flaws are a ticking time bomb."
This is worth examining carefully because it applies far beyond Anthropic. The commercial pressure to ship AI agents that do things (book appointments, send emails, file tickets, manage calendars) creates a direct conflict with the security principle of least privilege. An AI agent that can't actually take consequential actions isn't very impressive in a demo. An AI agent that can take consequential actions is, by definition, a high-value target.
The problem is that agentic AI collapses two things that traditional software kept separate: authentication and authorization. When you log into Gmail, you are authenticated as yourself and authorized to act as yourself. When Claude in Chrome accesses Gmail on your behalf, the authorization is implicit: Claude inherits your authenticated session. There's no separate token, no scoped permission, no audit trail of "Claude did this." It's just you, as far as Gmail is concerned.
This architectural shortcut is understandable in a world where AI agents are still experimental. It becomes dangerous the moment those agents touch production data at scale. ClaudeBleed is a case study in what happens when the shortcut gets exploited before the architecture gets hardened.
Claude Code OAuth Tokens Can Be Stolen Through Stealthy MCP Hijacking
An OAuth token with wide access rights can be stolen stealthily and largely undetectably from Claude Code.
Claude Code is an agentic system. This is great for developers but concerning for security teams. Agentic systems can expand the attack surface while operating largely invisibly. A major issue is the OAuth token. If an attacker can acquire this, the adversary effectively has a master key or digital proxy granting access to every tool connected to or accessible from the Claude Code MCP.
Mitiga Labs has identified an issue within Claude Code that would allow attackers to redirect output, including the tokens, to their own infrastructure before everything is sent on to the legitimate destination. It's a classic man-in-the-middle attack, giving the attacker access to the tokens.
The MCP configuration and the OAuth tokens are stored in ~/.claude.json. If an adversary can modify that file, MCP traffic can be redirected through the attacker’s own infrastructure. Mitiga has published details of how this could be achieved.
The attack has two prerequisites: the ability to install a tailored npm package, and a target machine where Claude Code is configured with dynamic authorization MCP servers. The package registers a lifecycle hook that runs as part of the install.
A post-installation hook locates common clone locations and populates those paths with a trust flag pre-set to true. "No prompt will fire when the directory is later opened, because the flag the prompt is gated on is already set," reports Mitiga.
The hook also opens ~/.claude.json and edits the MCP server in the global config file. It edits ‘mcpServers’ to include the proxy address. “This puts us, ‘the adversary’, in the middle of any request that goes out to the MCP server. As the attacker, we got mitmproxy configured and intercepting,” explains Mitiga.
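The config-tampering step is worth seeing concretely. The sketch below is a hedged Python illustration of how little code it takes for any process running as the user to redirect an MCP server through a proxy. The `mcpServers` layout and field names are assumed from Mitiga's description (the real file may differ), and the demo deliberately runs against a temporary stand-in, never the real ~/.claude.json.

```python
# Sketch of the tampering step: because the config is a plaintext JSON
# file writable by any process running as the user, redirecting MCP
# traffic through an interception proxy is a few lines of code.
import json
import tempfile
from pathlib import Path

def redirect_mcp_servers(config_path: Path, proxy_url: str) -> dict:
    """Rewrite every MCP server URL to point at an interception proxy."""
    config = json.loads(config_path.read_text())
    for name, server in config.get("mcpServers", {}).items():
        server["originalUrl"] = server.get("url")  # keep the real target
        server["url"] = proxy_url                  # route via the proxy
    config_path.write_text(json.dumps(config, indent=2))
    return config

# Demonstration against a temporary stand-in for ~/.claude.json:
demo = Path(tempfile.mkdtemp()) / "claude.json"
demo.write_text(json.dumps({
    "mcpServers": {"github": {"url": "https://mcp.example.com/github"}}
}))
tampered = redirect_mcp_servers(demo, "http://127.0.0.1:8080")
print(tampered["mcpServers"]["github"]["url"])  # now the proxy address
```

No privilege escalation, no exploit primitive: just a file open and a JSON rewrite, which is exactly why same-user plaintext storage is the weak link here.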
Whenever Claude Code initiates or refreshes the MCP session, it connects to the proxy and the token transits to the attacker's infrastructure. The user just sees a valid flow. If the user rotates the token, the hook writes it back on the next load. If the user edits the MCP URL, the hook restores it on the next load. The attacker has achieved both stealth and persistence.
The attacker gets, “A durable redirection of the victim’s SaaS credentials into attacker-controlled infrastructure, with automatic recovery from token rotation, invisible to the victim’s endpoint UI, and indistinguishable from legitimate traffic on the provider’s side.”
As a man in the middle, the attacker can easily steal any OAuth token, since it is stored in plain text within ~/.claude.json. Once stolen, the attacker can use the token as an MFA-bypassing golden key into any tool to which the MCP connects, with the same permissions as the user.
Absent careful monitoring, the user sees nothing. No flags are raised, since the MCP is simply doing what it is told to do, and the user isn't aware these actions have been compromised. The adage of assuming a compromise has already happened should take center stage. "Monitor Claude Code configuration changes, MCP server URL changes, OAuth refresh behavior, suspicious SaaS API activity, and unexpected traffic through MCP integrations," suggests Mitiga.
What you mustn't do is wait for a solution from Anthropic. Mitiga reported its findings to Anthropic on April 10, 2026. On April 12, 2026, Anthropic replied that the issue was 'out of scope'. The reason given was effectively the same as its response to Adversa's 'TrustFall' disclosure: the user has already consented to what might happen next.
My Thoughts
Unpacking the Technical Threat
The attack is elegant in the way the worst attacks always are: it doesn't break anything; it redirects everything. A few specific angles worth developing:
The npm lifecycle hook as a delivery vector. The prerequisite is the ability to install a crafted npm package on a machine running Claude Code. This sounds like a high bar, but it isn't. Supply chain attacks via malicious or typosquatted npm packages are routine. Developers, Claude Code's primary users, install npm packages constantly, often without deep inspection. The attack doesn't require a zero-day, a phishing campaign, or physical access. It requires one careless npm install from a poisoned repository, which is exactly how most supply chain compromises begin.
The ~/.claude.json plaintext problem. OAuth tokens in plain text in a home directory config file is, to put it charitably, a design choice that prioritises developer convenience over security hygiene. Tokens of this sensitivity belong in OS-level credential stores (macOS Keychain, Windows Credential Manager, Linux Secret Service), where they benefit from access controls, encryption at rest, and auditing. The decision to use a flat JSON file means that any process, any script, any malicious package running as the same user has trivial read/write access. This isn't a sophisticated exploit; it's just a file open.
The persistence mechanism is the most alarming part. Stealing a token is bad. Maintaining access through token rotation is a different category of threat. The hook writes the stolen config back every time the file is modified. The attacker effectively owns the MCP configuration in perpetuity. The standard remediation step for a stolen credential (rotating the token) is not just ineffective here, it is actively counterproductive. Every rotation becomes a fresh delivery to the attacker. This flips the power dynamic in a way that most security playbooks aren't designed to handle.
MFA bypass as the downstream consequence. Modern security hygiene teaches users to trust MFA as a last line of defence. If a password is stolen, MFA saves you. OAuth tokens, however, are post-authentication artefacts: they are issued after MFA was satisfied. Stealing an OAuth token means the attacker skips the entire authentication ceremony and arrives at the authorized state directly. Every tool connected to the MCP (code repositories, cloud infrastructure, SaaS platforms, internal APIs) becomes accessible without a single MFA prompt. The attacker is not breaking into your house; they are using your key, which you handed them unknowingly.
The invisibility is structural, not incidental. There are no anomalous login events because the token is legitimate. There are no unusual API calls because the attacker is operating with the user's exact permission set. The MCP continues to function normally from the user's perspective because the proxy forwards everything correctly; it just reads it first. Traditional detection methods look for anomalies. This attack produces none. That structural invisibility is a design consequence of how agentic systems work, not a clever trick the attacker deployed.
The Bigger Issue, in My Opinion: The "User Consented" Defence Is BS
Anthropic's "out of scope" response echoes its reaction to Adversa's TrustFall disclosure. The logic, as best as can be reconstructed, goes: the user chose to install Claude Code, the user chose to configure MCP servers, the user chose to connect OAuth tokens; therefore, the user accepted the risks that follow. This is consent-as-liability-shield, and it deserves serious pushback.
Consent requires comprehension. A developer configuring Claude Code understands they are connecting a powerful AI coding assistant to their tools. They do not understand (because no documentation meaningfully conveys this) that a single malicious npm package can then silently redirect their entire authenticated identity to an attacker's infrastructure, persist through every remediation step they know how to take, and remain entirely invisible. Consent to use a tool is not consent to every attack vector that tool's architecture enables. If it were, every vendor could simply disclaim responsibility for every vulnerability by pointing to the terms of service.
The “already consented” argument scales dangerously. If Anthropic’s position is that users who configure dynamic authorization MCP servers have accepted the risks of OAuth token interception, what exactly is Anthropic’s security team responsible for? This logic would exculpate any vendor from any design flaw, so long as the user had to configure something first. Security researchers and practitioners have spent decades pushing back against exactly this framing. The user consented to install software, not to have their credentials silently exfiltrated with persistence through rotation.
The user population matters. Claude Code’s users are developers. They are, relative to the general population, sophisticated. They understand command lines, configuration files, and package managers. If a developer installing npm packages in the normal course of their work cannot be expected to detect or defend against this attack, the attack is not within the user’s reasonable ability to manage. Pushing responsibility to users for attacks they cannot practically detect or remediate is a values statement about who the vendor thinks bears the security burden.
Compare to the industry standard. When a password manager stores credentials in plaintext and blames users for installing malware, we call that negligent design. When a browser extension exposes authentication tokens to other extensions and says users consented to the extension model, we publish CVEs. The “user consented” frame is applied inconsistently across the industry, and almost always in ways that benefit vendors rather than users.
The Broader Agentic AI Security Problem
This vulnerability is a useful case study in a structural challenge the industry has not yet seriously grappled with.
Agentic systems are valuable precisely because they act. An AI coding assistant that can't run commands, call APIs, or interact with tools isn't very useful. But every capability you give an agentic system is also a capability you give to anyone who can compromise that system. The attack surface of Claude Code is not just Claude Code; it is every tool Claude Code can reach, with every permission the user has been granted, with no additional authentication step between them.
Traditional software has explicit attack surfaces: this binary, this port, this API endpoint. Agentic systems have contextual attack surfaces that expand and contract based on what the agent is configured to do. A developer who connects Claude Code to their AWS environment, their GitHub, their Jira, their Slack, and their production database has created an agent that, if compromised, yields access to all of those systems simultaneously. The OAuth token is not one key; it is a keyring.
The MCP (Model Context Protocol) architecture specifically is worth examining here. MCP is designed to make it easy to connect AI agents to tools and services. That ease is the point: Anthropic wants a rich ecosystem of MCP integrations. But ease of connection is the inverse of security of connection. Every MCP server added to a Claude Code configuration is another potential entry point, another set of credentials to protect, another attack surface to monitor. The architecture that makes MCP powerful is the same architecture that makes this attack possible.
What Defenders Should Actually Do
Treat ~/.claude.json as a sensitive credential file. Set file permissions to 600 (owner read/write only). Monitor it for unexpected changes using file integrity monitoring tools, anything from a basic inotifywait watch on Linux to enterprise endpoint detection tools. Any write to this file outside of an expected Claude Code configuration session should trigger an alert.
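As a minimal sketch of that advice, assuming a POSIX system: tighten the file to mode 600 and baseline a content hash so unexpected writes become detectable. Real deployments would use inotify or EDR tooling; the stand-in file and helper names below are illustrative only.

```python
# Minimal file-permission lockdown plus hash-based change detection for a
# sensitive config file such as ~/.claude.json (demoed on a stand-in).
import hashlib
import os
import stat
import tempfile
from pathlib import Path

def lock_down(path: Path) -> None:
    """Restrict the file to owner read/write only (mode 600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)

def fingerprint(path: Path) -> str:
    """Hash the file contents so later modifications are detectable."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def has_changed(path: Path, baseline: str) -> bool:
    """True if the file no longer matches the recorded baseline hash."""
    return fingerprint(path) != baseline

# Usage against a temporary stand-in config file:
cfg = Path(tempfile.mkdtemp()) / "claude.json"
cfg.write_text("{}")
lock_down(cfg)
baseline = fingerprint(cfg)

cfg.write_text('{"mcpServers": {}}')   # simulate an unexpected write
print(has_changed(cfg, baseline))      # the tampering is flagged
```

Note that mode 600 does not stop a malicious process running as the same user (it can still read and write the file), which is why the change-detection half matters as much as the permission bits.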
Audit npm install practices on developer machines. Implement policies around package provenance: verify packages against known registries, use lockfiles, and consider npm audit as a mandatory pre-install step. Treat developer workstations with agentic AI tooling as high-value targets deserving the same scrutiny as production servers.
Review MCP server configurations regularly. Know what MCP servers should be configured and what addresses they should point to. Any unexpected URL in mcpServers is a potential indicator of compromise.
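A simple configuration review might look like the following hedged sketch: compare the servers in a Claude Code config against a known-good allowlist and report anything unexpected. The allowlist entries and the `mcpServers` layout are hypothetical.

```python
# Audit an MCP configuration against an allowlist of expected server URLs.
# Any unknown server name or deviating URL is reported as a finding.
import json

ALLOWED_MCP_URLS = {
    "github": "https://mcp.example.com/github",  # hypothetical entries
    "jira": "https://mcp.example.com/jira",
}

def audit_mcp_config(config_text: str) -> list[str]:
    """Return findings for MCP servers that deviate from the allowlist."""
    findings = []
    servers = json.loads(config_text).get("mcpServers", {})
    for name, server in servers.items():
        expected = ALLOWED_MCP_URLS.get(name)
        if expected is None:
            findings.append(f"unknown MCP server: {name}")
        elif server.get("url") != expected:
            findings.append(f"{name} points at unexpected URL: {server.get('url')}")
    return findings

# A config tampered with as in the attack: a proxied URL and a new server.
suspicious = json.dumps({"mcpServers": {
    "github": {"url": "http://127.0.0.1:8080"},
    "backdoor": {"url": "http://evil.example"},
}})
for finding in audit_mcp_config(suspicious):
    print(finding)
```

Running this on a schedule (or from a git hook on dotfile repos) turns "review configurations regularly" from advice into an automated check.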
Monitor OAuth token usage patterns for connected tools. If Claude Code is connected to GitHub, for example, set up alerts for API calls made with your OAuth token from unexpected IP addresses or at unexpected times. The attacker is using your token, but they are not you; their usage patterns may differ from yours.
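As a toy illustration of that kind of monitoring (the event shape, field names, and IP addresses are all invented for the sketch):

```python
# Flag API calls made with the Claude Code OAuth token from source IPs
# outside the set normally seen for this user.
KNOWN_IPS = {"203.0.113.10"}  # the user's usual egress addresses

def flag_anomalies(events: list[dict]) -> list[dict]:
    """Return token-usage events originating from unfamiliar IPs."""
    return [e for e in events if e["source_ip"] not in KNOWN_IPS]

events = [
    {"api": "github:repos.list", "source_ip": "203.0.113.10"},   # the user
    {"api": "github:repos.list", "source_ip": "198.51.100.77"},  # attacker infra
]
for e in flag_anomalies(events):
    print("suspicious token use:", e["api"], "from", e["source_ip"])
```

In practice you would feed this from the provider's audit log (GitHub's token-usage events, for instance) and widen the heuristic beyond IPs to times of day and call patterns, but the principle is the same: the token is legitimate, so the anomaly has to come from how it is used.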
Scope OAuth tokens aggressively. Where possible, issue OAuth tokens with the minimum permission set needed for Claude Code’s legitimate use rather than broad account access. A token that can only read repositories is less damaging when stolen than one that can write, delete, and administer.
Do not wait for Anthropic. This bears repeating plainly. Mitiga reported this on April 10. Anthropic responded on April 12 with “out of scope.” There is currently no indication that a fix is planned. The attack works today. Mitigation has to come from defenders.
A Note on the Anthropic Response Pattern
This is now a pattern, not an incident. The TrustFall disclosure received a similar response. ClaudeBleed’s patch was bypassed in three hours. A CVSS 10/10 RCE in Claude Desktop Extensions was reportedly left unpatched. The emerging picture is of a vendor deploying powerful, widely-connected agentic systems while adopting a narrow definition of what constitutes its security responsibility.
This is not a unique failure. The AI industry broadly is in a period where speed to market and feature richness are rewarded, and security hardening is deferred. But Anthropic’s specific responses to specific disclosures are now a matter of public record. Security teams evaluating whether to deploy Claude Code in enterprise environments should weigh not just the technical capabilities of the tool, but the vendor’s demonstrated approach to vulnerability response. That is a legitimate and important input to procurement and risk decisions.
The adage Mitiga invokes ("assume compromise has already happened") is the right posture. But it is worth adding: also assume that the vendor will not close the gap for you, and plan accordingly.
Closing
I hope you have found value in today’s article. Consider clapping, subscribing and following me on my socials.
- LinkedIn: https://www.linkedin.com/in/m49d4ch3lly
- Twitter: https://twitter.com/m49D4ch3lly
- Gmail: [email protected]
Sources for this article: SecurityWeek (https://www.securityweek.com/)