For years, CISOs have dreamed of truly automated penetration testing that quickly finds critical bugs in key applications. While that dream isn’t fully here yet, VulnHuntr is an LLM-based code analysis package that promises to “find and explain complex, multistep vulnerabilities”. In this post, we’ll look at what VulnHuntr is, how it works, and whether it lives up to that bold claim.
Contents
What is VulnHuntr?
How Does It Work?
Getting Started
Vulnerability Scanning
Limitations
Final Thoughts - The Future
What is VulnHuntr?
VulnHuntr is an LLM-powered static code analysis tool by ProtectAI that specializes in uncovering complex vulnerabilities in Python applications. Over a dozen zero-day vulnerabilities have been found in popular GitHub repos using this tool, showcasing its immediate value for bug bounty hunters.
VulnHuntr is free to use and, at the time of writing, works with Anthropic, OpenAI, or Ollama.
How Does It Work?
VulnHuntr uses Jedi to parse Python code, starting by analyzing the codebase’s user entry point for vulnerabilities. If the LLM finds any potential bugs, it searches other files for references to code objects in a recursive call chain, until the full path from user input to server output is mapped out.
This call chain allows the tool to ingest all context relevant to a vulnerability without needing to parse the entire codebase, dramatically improving its accuracy and minimizing the required tokens. VulnHuntr analyzes all the context and outputs a final report, POC, and confidence rating for each vulnerability.
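To make the call-chain idea more concrete, here is a minimal sketch of the general technique. It is not VulnHuntr's implementation: it swaps Jedi for Python's standard ast module, only follows plain function calls, and runs on a tiny made-up codebase (the CODEBASE dictionary and every function name in it are hypothetical). Starting from an entry point, it recursively collects only the functions that are actually reachable, which is the context an LLM would then be given.

```python
import ast

# Hypothetical mini codebase: module name -> source text (illustration only).
CODEBASE = {
    "views": (
        "def render(request):\n"
        "    template = load_template(request['name'])\n"
        "    return evaluate(template)\n"
    ),
    "templates": (
        "def load_template(name):\n"
        "    return open(name).read()\n"
        "\n"
        "def unused_helper():\n"
        "    return 1\n"
    ),
    "engine": (
        "def evaluate(template):\n"
        "    return eval(template)\n"
    ),
}


def index_functions(codebase):
    """Map each top-level function name to (module, FunctionDef node)."""
    index = {}
    for module, source in codebase.items():
        for node in ast.parse(source).body:
            if isinstance(node, ast.FunctionDef):
                index[node.name] = (module, node)
    return index


def called_names(func_node):
    """Return the sorted names of plain function calls inside func_node."""
    names = set()
    for node in ast.walk(func_node):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            names.add(node.func.id)
    return sorted(names)


def collect_call_chain(entry, index, seen=None):
    """Recursively gather the functions reachable from the entry point."""
    if seen is None:
        seen = []
    if entry in seen or entry not in index:
        return seen  # already visited, or a builtin/external call
    seen.append(entry)
    _, node = index[entry]
    for name in called_names(node):
        collect_call_chain(name, index, seen)
    return seen


index = index_functions(CODEBASE)
chain = collect_call_chain("render", index)

# Only the reachable functions become context; unused_helper is left out.
context = "\n".join(ast.unparse(index[name][1]) for name in chain)
print(chain)
print(context)
```

The point of the sketch is the shape of the search: context grows only along the path from user input to the dangerous sink, so unrelated code never reaches the prompt.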
Getting Started
The quickest way to install VulnHuntr is on Linux. I used the following commands to get the tool up and running, taken from a Huntr blog post:
Add deadsnakes PPA and install Python 3.10:
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.10 python3.10-venv python3.10-dev
Install pip specifically for 3.10:
curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10
Now you can install pipx using Python 3.10:
python3.10 -m pip install --user pipx
python3.10 -m pipx ensurepath
Install Vulnhuntr:
pipx install git+https://github.com/protectai/vulnhuntr.git --python python3.10
To run the tool, you need to obtain an API key from your favourite LLM provider. I used OpenAI to do this, and set the key like so:
export OPENAI_API_KEY="your_key_here"
VulnHuntr only works on Python applications, so I decided to find a suitable bug bounty target from the Huntr platform. Read my post below for more information on AI Bug Bounties!
I chose Apache Airflow as my target because of its web UI and Python architecture. Finally, I located the user entry point and ran the following command to begin my scan:
vulnhuntr -l gpt -r airflow -a ./airflow/www/views.py
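If you plan to run several scans, it can help to wrap the command above in a small script. The sketch below is one possible approach rather than anything shipped with the tool: the helper names build_scan_command and run_scan are my own, and it uses only the -l, -r, and -a flags shown above. It checks that the API key and the entry-point file exist before spending any credit.

```python
import os
import shutil
import subprocess


def build_scan_command(llm, repo_root, entry_file):
    """Assemble the argument list for the scan command shown above."""
    return ["vulnhuntr", "-l", llm, "-r", repo_root, "-a", entry_file]


def run_scan(llm, repo_root, entry_file):
    """Run the scan after basic pre-flight checks; returns the exit code."""
    # The key name matches the export used earlier in this post.
    if llm == "gpt" and not os.environ.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set")
    if shutil.which("vulnhuntr") is None:
        raise RuntimeError("vulnhuntr is not on PATH")
    if not os.path.isfile(entry_file):
        raise FileNotFoundError(entry_file)
    command = build_scan_command(llm, repo_root, entry_file)
    return subprocess.run(command, check=False).returncode


# Building the command is free; run_scan() is what actually calls the tool.
command = build_scan_command("gpt", "airflow", "./airflow/www/views.py")
print(" ".join(command))
```

Calling run_scan("gpt", "airflow", "./airflow/www/views.py") would then launch the same scan as the command above.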
Vulnerability Scanning
VulnHuntr performed a scan and came up with some very interesting findings! It was very confident about the existence of an RCE vulnerability in the /rendered_templates endpoint, referencing arbitrary Python execution with a POC payload.
Unfortunately, /rendered_templates is a non-existent endpoint on the target application, so I was unable to reproduce the issue. Furthermore, the IDs it references as injection points are not accessible to end users, making this finding a hallucination.
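A cheap first triage step for a finding like this is to check whether the reported endpoint appears anywhere in the codebase at all. The sketch below is a rough illustration under simple assumptions: the function name find_route_mentions is hypothetical, the demo file is made up, and a plain string search will miss routes that are built dynamically, so an empty result is a strong hint rather than proof.

```python
from pathlib import Path
import tempfile


def find_route_mentions(repo_root, route):
    """Return (file, line number, line) for every .py line containing route."""
    hits = []
    for path in sorted(Path(repo_root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if route in line:
                relative = str(path.relative_to(repo_root))
                hits.append((relative, number, line.strip()))
    return hits


# Demo on a throwaway directory containing a hypothetical views file.
with tempfile.TemporaryDirectory() as demo_root:
    views = Path(demo_root) / "views.py"
    views.write_text(
        '@expose("/home")\ndef home():\n    return "ok"\n', encoding="utf-8"
    )
    existing = find_route_mentions(demo_root, "/home")
    missing = find_route_mentions(demo_root, "/rendered_templates")

print(existing)  # one hit: the route is defined in views.py
print(missing)   # no hits: the reported endpoint deserves suspicion
```

Pointing the same function at a real checkout and the endpoint named in a report takes seconds, and it filters out the most obvious hallucinations before you start building a proof of concept.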
It’s worth noting that other people have had success using this tool. Dan McInerney from Huntr was able to find a Local File Inclusion vulnerability in the gpt_academic repository, as shown below:
Limitations
First, as shown above, VulnHuntr is susceptible to hallucination. Most findings made by the tool will turn out to be invalid, preventing fully automated vulnerability discovery.
Next, the tool is expensive to run! Two scans cost me $5 of API credit, and that adds up quickly across repeated scans.
Finally, VulnHuntr only works on Python codebases. This makes its application very limited and means codebases in popular languages like Java cannot be scanned.
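Returning to the cost point, a quick back-of-the-envelope calculation shows how the numbers scale. This is only a rough sketch: it assumes cost grows linearly with the number of scans, which will not hold exactly because the real figure depends on the model and the size of the code being analyzed, and the figure of 40 planned scans is a made-up example.

```python
def estimate_cost(total_spent, runs_observed, planned_runs):
    """Scale an observed spend linearly to a planned number of scans."""
    per_run = total_spent / runs_observed
    return per_run * planned_runs


# Observed in this post: $5 of API credit across 2 scans.
cost_per_scan = estimate_cost(5.0, 2, 1)

# Hypothetical workload: 40 scans, e.g. 10 targets with 4 entry points each.
cost_for_forty = estimate_cost(5.0, 2, 40)

print(f"Per scan: ${cost_per_scan:.2f}")
print(f"Forty scans: ${cost_for_forty:.2f}")
```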
Final Thoughts - The Future
VulnHuntr is a valuable tool for penetration testers that leverages the power of LLMs to find vulnerabilities. While it has already had success in certain scenarios, the hallucination rate is still very high, making it another tool for pentesters as opposed to a full replacement.
The future of AI pentesting is exciting. In principle, VulnHuntr could be adapted to scan codebases in other languages, which would make it far more versatile. An even more exciting development would be linking it to a web application proxy, allowing it to test payloads on the fly and iteratively craft working exploits!
On the other side of the equation, VulnHuntr is open-source, meaning threat actors can leverage this to develop new attacks as well. Secure coding will be more important than ever as these tools improve, and I look forward to seeing their advancements for better or worse in the next 5 years.
Check out my article below to learn more about the AI Goat. Thanks for reading.