
Backdoors in ML - The Dark Side of Hugging Face

Be wary when trying out new ML models...

New machine learning models are an exciting area of research. Hugging Face is the leader in this space, allowing people to upload and download open-source ML projects.

At the time of writing, over half a million open-source models are available on Hugging Face. But innovative threat actors are using the hype around AI as cover to hack victims' computers.

In this post, we will use a case study to examine a malicious machine learning model. We will look at how the exploit works, why it poses a threat, and how it can be defended against in the future.




Contents

Hugging Face - An Attacker’s Playground

The Pickle Problem

Case Study - Malicious Machine Learning Model

Python Reverse Shell

Defending Against Malicious ML Models

Final Thoughts - Looking To The Future


Hugging Face - An Attacker’s Playground

Hugging Face is the #1 platform for people to upload and share their ML projects. Think GitHub for AI.

Anyone can upload a model to Hugging Face, making it a natural location for attackers to host viruses. In response, Hugging Face scans every file for indicators of malware.

However, the platform simply flags these models as unsafe; users can still download the malicious files:

[Image: Hugging Face warning for models flagged as unsafe by pickle scanning]

Rest assured, I will circle back to this point later…

The Pickle Problem

Machine learning models are commonly written in Python with a framework called PyTorch. These models often contain binary files in the “pickle” format. Pickle is used to serialize Python objects (convert them to bytes for easier sending or storage).

When a victim loads a pickled file, the Python objects are deserialized. In the process, any code embedded in the file can be executed, letting an attacker perform arbitrary operations on the target computer.

This is known as an insecure deserialization exploit - you can learn more here.
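To see why this is dangerous, here is a minimal, deliberately harmless sketch of the mechanism. Pickle lets a serialized object define __reduce__, which names a callable to run at load time - exactly what an attacker abuses (the command here just echoes a string):

import os
import pickle

class Payload:
    # pickle calls __reduce__ to learn how to rebuild this object;
    # an attacker abuses it to return an arbitrary callable plus arguments.
    def __reduce__(self):
        # Harmless stand-in for a real payload
        return (os.system, ("echo 'this ran during unpickling'",))

malicious_bytes = pickle.dumps(Payload())

# The victim only has to LOAD the data for the command to run
pickle.loads(malicious_bytes)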

Case Study - Malicious Machine Learning Model

On February 27th, 2024, security company JFrog published an article examining a malicious Hugging Face model uploaded by a new user named “baller423”. They used a tool called fickling to reverse engineer the malicious file and examine the injected payload.
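You do not need specialist tooling to take a first look at a suspicious pickle: Python's standard pickletools module can disassemble it without executing anything (fickling goes further, decompiling the payload back to readable Python). A minimal sketch, with a hypothetical file name:

import pickletools

# Disassemble the pickle's opcodes WITHOUT executing them.
# GLOBAL/STACK_GLOBAL and REDUCE opcodes that pull in modules such as
# os, subprocess or socket are red flags for code execution on load.
with open("suspicious_model.pkl", "rb") as f:  # hypothetical file name
    pickletools.dis(f)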

Python Reverse Shell

The injected script attempts to create a “reverse shell” on the target machine. A reverse shell is when the target computer initiates a connection back to an attacker-controlled machine, letting the attacker execute arbitrary commands.

Here are some key pieces of functionality:

  1. RHOST & RPORT - Define the IP address and port of the attacker's server to connect to

  2. if platform != 'win32' - Check whether the target machine is Unix-like rather than Windows, and choose the shell accordingly

  3. pty.spawn("/bin/sh") - Creates a shell on Unix-based machines

  4. subprocess.Popen(["powershell.exe"]) - Uses PowerShell instead if the target is Windows-based

  5. while True - Infinite loop to continually retry connecting to the attacker-controlled server

Significance

If you are new to Python and offensive security, the code may look intimidating. But to an expert eye, this payload is basic.

In just 46 lines, the attacker can gain full access to most computers. All the target has to do is unwittingly load this model, and their machine will be compromised.

Traditionally, the most effective attack vector for gaining remote code execution on a target is email - sending malicious attachments and socially engineering recipients into clicking them. Organizations are well aware of this, spending billions of dollars on firm-wide phishing training and bleeding-edge email filtering software.

But AI/ML is a very new field, with enterprises keen to capitalize on the associated hype. Many organizations do not have the same levels of security controls around Hugging Face ML models, making them the perfect Trojan horse to disguise dangerous Python reverse shells.

Defending Against Malicious ML Models

Here are some steps different stakeholders can take to mitigate the threat of malicious machine learning models:

Hugging Face - DON’T allow malicious uploads!

Currently, Hugging Face comprehensively scans uploaded files and flags them as potentially malicious. Instead of just flagging them, Hugging Face should block these files completely. This would eradicate the threat of a user ignoring a warning and getting pwned!

Organizations - Keep TIGHT security controls around AI/ML

Organizations should keep attack vectors like this in mind, ensuring they are treated with the same level of scrutiny as more established technologies. Hugging Face should be blocklisted by default and pickled object downloads should be monitored.

Individuals - Be AWARE of what happens when you load an untrusted ML model

When you load a machine learning model, you could be executing untrusted code! Be aware of this risk and make sure the developer is reputable.
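As a concrete habit, prefer loading formats and options that cannot run code. The sketch below assumes a PyTorch checkpoint and uses illustrative file names; weights_only=True is an option on torch.load in recent PyTorch versions, and safetensors is a code-free tensor format:

# Safer loading habits for untrusted checkpoints (file names are illustrative).
import torch
from safetensors.torch import load_file

# weights_only=True restricts unpickling to tensors and primitive types,
# refusing the arbitrary objects a malicious pickle relies on.
state_dict = torch.load("pytorch_model.bin", map_location="cpu", weights_only=True)

# Better still: prefer models distributed in the safetensors format,
# which stores raw tensors and cannot execute code on load.
state_dict = load_file("model.safetensors")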

Final Thoughts - Looking To The Future

In summary, innovative threat actors are using AI as cover to carry out traditional attacks. In the next five years, I believe this method will become more prevalent, leading to compromises of both individuals and organizations.

This attack vector is so effective because people lower their guards when working with AI. The focus is on the latest features and tools, and security is often disregarded. By educating everyone about cybersecurity with posts like these, we can draw attention to upcoming threats, cultivating a safer AI future.

If this article interests you, check out my piece on Unjailbreakable Large Language Models below. Thanks for reading.

