Researchers Cause GitLab AI Developer Assistant to Turn Safe Code Malicious

Researchers Cause GitLab AI Developer Assistant to Turn Safe Code Malicious

Marketers often promote AI-assisted developer tools as essential workhorses for modern software engineers. GitLab, for instance, claims its Duo chatbot can "instantly generate a to-do list," eliminating the burden of "wading through weeks of commits." What they fail to mention is that these tools are, even if not by default, easily manipulated by malicious actors to perform hostile actions against users.

Recently, researchers from the security firm Legit demonstrated an attack on GitLab's Duo that inserted malicious code into a script it was directed to write. This attack could also potentially leak private code and confidential issue data, including zero-day vulnerability details. All it takes is for the user to instruct the chatbot to engage with a merge request or similar content from an external source.

AI Assistants’ Double-Edged Blade

The trick to triggering these attacks lies in prompt injections, one of the most common forms of chatbot exploits. Prompt injections are embedded into content that a chatbot is asked to process, such as emails, calendars, or web pages. Large language model-based assistants are so eager to follow orders that they often execute commands from any source, even potentially malicious ones.

The attacks on Duo were sourced from common developer resources, including merge requests, commits, bug descriptions, comments, and source code. Researchers showed how embedded instructions in these sources could mislead Duo.

"This vulnerability highlights the double-edged nature of AI assistants like GitLab Duo: deeply integrated into development workflows, they inherit not only context but risk," Legit researcher Omer Mayraz wrote. "By embedding hidden instructions in seemingly harmless content, we manipulated Duo's behavior, exfiltrated private source code, and demonstrated how AI responses can have unintended harmful outcomes."

In one attack, a hidden instruction in legitimate source code instructed Duo to output a URL pointing to "http://LEGIT.COM/YOURSECRETSHERE." The URL was cleverly masked to appear as “click here now!” using invisible Unicode characters, a format understood by LLMs but invisible to humans.

The generated malicious URLs were in clickable form, meaning users could unknowingly click a link leading to a malicious site. The attack took advantage of markdown language, which can render plain text in user-friendly ways, adding formatting elements without HTML tags.

The use of HTML tags

and

also facilitated the attack because Duo analyzes markdown asynchronously, rendering outputs line by line in real-time, allowing HTML tags to be treated as active web output in responses. This opened up new attack avenues, allowing attackers to embed instructions to leak confidential information, which Duo had access to, in the form of base64-encoded data sent to a user-controlled website.

This tactic enabled Mayraz to exfiltrate source codes from private repositories and confidential vulnerability reports Duo had access to. Legit reported this behavior to GitLab, which responded by disabling the ability for Duo to render unsafe tags such as

and when pointing to non-gitlab.com domains. Therefore, the exploits demonstrated ceased to function. This approach is common among AI chatbot providers to mitigate harmful outcomes from untrustworthy instructions.

This means that AI-assisted developer tools might not deliver the seamless productivity advertised by marketers. Developers must carefully review output from these assistants for signs of malice.

"The broader takeaway is clear: AI assistants are part of your application’s attack surface," Mayraz stated. "Any system allowing LLMs to ingest user-controlled content must consider it untrusted and potentially malicious. Context-aware AI is powerful, but without proper safeguards, it becomes a point of vulnerability."

Dan Goodin, Senior Security Editor at Ars Technica, covers malware, computer espionage, botnets, hardware hacking, encryption, and passwords. Based in San Francisco, he enjoys gardening, cooking, and following the independent music scene. Follow him on Mastodon and Bluesky.