Researchers at AI security startup Zenity demonstrated how several widely used enterprise AI assistants can be abused by threat actors to steal or manipulate data.
The Zenity researchers showcased their findings on Wednesday at the Black Hat conference. They shared several examples of how AI assistants can be leveraged — in some cases without any user interaction — to do the attacker’s bidding.
Enterprise tools are increasingly integrated with generative AI to boost productivity, but this also opens cybersecurity holes that could be highly valuable to threat actors.
For instance, security experts demonstrated in the past how the integration between Google’s Gemini gen-AI and Google Workspace productivity tools can be abused through prompt injection attacks for phishing.
Researchers at Zenity showed last year how they could hijack Microsoft Copilot for M365 by planting specially crafted instructions in emails, Teams messages or calendar invites that the attacker assumed would get processed by the chatbot.
This year, Zenity’s experts disclosed similar attack methods — dubbed AgentFlayer — targeting ChatGPT, Copilot, Cursor, Gemini, and Salesforce Einstein.
In the case of ChatGPT, the researchers targeted its integration with Google Drive, which enables users to query and analyze files stored on Drive. The attack involved sharing a specially crafted file — one containing hidden instructions for ChatGPT — with the targeted user (this requires only knowing the victim’s email address).
When the AI assistant was instructed by the victim to process the malicious file, the attacker’s instructions would be executed, without any interaction from the victim. Zenity demonstrated the risks by getting ChatGPT to search the victim’s Google Drive for API keys and exfiltrate them.
In the case of Copilot Studio agents that engage with the internet — over 3,000 instances have been found — the researchers showed how an agent could be hijacked to exfiltrate information that is available to it. Copilot Studio is used by some organizations for customer service, and Zenity showed how it can be abused to obtain a company’s entire CRM.
When Cursor is integrated with Jira MCP, an attacker can create malicious Jira tickets that instruct the AI agent to harvest credentials and send them to the attacker. This is dangerous in the case of email systems that automatically open Jira tickets — hundreds of such instances have been found by Zenity.
In a demonstration targeting Salesforce’s Einstein, the attacker can target instances with case-to-case automations — again hundreds of instances have been found. The threat actor can create malicious cases on the targeted Salesforce instance that hijack Einstein when they are processed by it. The researchers showed how an attacker could update the email addresses for all cases, effectively rerouting customer communication through a server they control.
In a Gemini attack demo, the experts showed how prompt injection can be leveraged to get the gen-AI tool to display incorrect information. In Zenity’s example, the attacker got Gemini to provide a bank account owned by the attacker when the victim requested a certain customer’s account.
The ChatGPT and Copilot Studio weaknesses have been patched, but the rest have been flagged as ‘won’t fix’ by vendors, according to Zenity.
UPDATE: “We have recently deployed new, layered defenses that fix this type of issue. Having a layered defense strategy against prompt injection attacks is crucial – see our recent blog post with detail on the protections we’ve deployed to keep our users safe.” – a Google spokesperson told SecurityWeek.
Google also said it takes prompt injection attack defenses seriously, but pointed out that currently this is largely an area of intensive academic research involving hypothetical attacks, and the technique is rarely seen in the wild as adversarial activity.
Salesforce also told SecurityWeek that it has patched the flaw. A Salesforce spokesperson said, “Salesforce is aware of the vulnerability reported by Zenity and fixed the specific issue on July 11, 2025. The fix has been tested and this issue is no longer exploitable. The security landscape for prompt injection remains a complex and evolving area, and we will continue to invest in strong security controls and work closely with the research community to help protect our customers as these types of issues surface. For more details on how to maintain trust and security with Agentforce actions, see [help page].”
Related: Vibe Coding: When Everyone’s a Developer, Who Secures the Code?
Related: AI Guardrails Under Fire: Cisco’s Jailbreak Demo Exposes AI Weak Points
Related: Google Gemini Tricked Into Showing Phishing Message Hidden in Email

