Hackers Use Hidden Text To Trick Google Gemini

Security researchers have discovered a stealthy new method to manipulate Google’s Gemini AI assistant by hiding malicious commands in email code that Gemini unknowingly follows.

These indirect prompt injection (IPI) methods allow scammers to plant fake alerts inside AI-generated summaries, making them appear as legitimate warnings from Google itself, eventually leading users straight into phishing traps.

How The Exploit Works

Unlike traditional phishing scams that rely on shady links or attachments, this technique is far more subtle, as the trick lies in the email’s code. Attackers hide instructions in emails using invisible text —white font on a white background, zero-sized fonts, or off-screen elements. Though they remain unseen by the human eye, Gemini does see them and fully processes them.

Once the recipient clicks on “Summarize this email” in Google Workspace, Gemini scans the entire message, including the hidden sections. If those hidden parts contain malicious prompts, they are also included in the summary output.

This results in a fake yet convincing security alert urging users to call a support phone number or take urgent action. Since the alert appears to come from Gemini itself, users may trust it, making the attack especially dangerous.

Exploit Discovered Through A Bug Bounty Program

The prompt-injection vulnerability in Google Gemini for Workspace was disclosed to Mozilla’s 0din bug bounty program for generative AI tools by researcher Marco Figueroa, GenAI Bug Bounty Programs Manager at Mozilla. His demonstration showed how an attacker could embed hidden instructions using styling directives like <Admin> tags or CSS code, which are designed to hide content from human eyes, to trick Gemini.

As Gemini treats such instructions as part of the prompt, it ends up repeating them as if they were part of the original message in its summary output, without realizing they were malicious.

Figueroa provided a proof-of-concept example to demonstrate how Gemini could be tricked into displaying a fake security alert, warning the user that their Gmail password had been compromised and providing a fraudulent support number to call.

Why This Matters

The attack is a form of indirect prompt injection, where malicious input is buried within content the AI is supposed to summarize. This has become a growing concern as generative AI becomes integrated into daily workflows. With Gemini integrated across Google Workspace—Gmail, Docs, Slides, and Drive—any system where the assistant analyzes user content is potentially vulnerable.

What makes it more dangerous is that these summaries can seem very convincing. If Gemini includes a fake security warning, users might take it seriously as they trust Gemini as part of Google Workspace without realizing it’s actually a hidden malicious message.

Google’s Multi-Layered Defense Strategy

In response, Google has rolled out a layered defense system for Gemini that is designed to make these attacks harder to pull off. Measures include:

Machine learning classifiers to detect malicious prompts
Markdown sanitization to strip out dangerous formatting
Suspicious URL redaction
A user confirmation framework that adds an extra checkpoint before executing sensitive tasks.
Notifications to alert users when a prompt injection is detected

Google says it is also working with external researchers and red teams to refine its defenses and implement additional protections in future Gemini versions.

“We are constantly hardening our already robust defenses through red-teaming exercises that train our models to defend against these types of adversarial attacks,” a Google spokesperson told BleepingComputer in a statement.

While Google has stated there’s no evidence yet of this technique being used in real-world attacks, the discovery is a clear warning that even AI-generated content, no matter how seamless, can still be manipulated.

Hackers Use Hidden Text To Trick Google Gemini

Data Of 64 Million McDonald’s Job Applications Exposed

Elon Musk’s xAI Launches Grok 4, Declares It the Smartest AI Yet

Qantas Notifies 5.7 Million Customers Of Data Breach

Gmail Launches ‘Manage Subscriptions’ Feature

Brazil’s $140M Bank Heist Pulled Off For Just $2.7K

Suggested Post

Data Of 64 Million McDonald’s Job Applications Exposed

Elon Musk’s xAI Launches Grok 4, Declares It the Smartest AI Yet

Qantas Notifies 5.7 Million Customers Of Data Breach

Gmail Launches ‘Manage Subscriptions’ Feature