fuzzdrivergpt is a GPT-based fuzz driver generator.
It aims to generate effective fuzz drivers for anyone who wants to fuzz library APIs.
Ideally, an effective fuzz driver is a piece of high-quality API usage code that tests the given APIs sufficiently without raising any false positives (bugs caused by the driver code itself).
Currently, fuzzdrivergpt:
- generates prompts utilizing multiple sources of API usage knowledge
- can validate the effectiveness of the generated fuzz drivers
- can iteratively fix/improve the generated drivers
fuzzdrivergpt has been tested on 86 APIs collected from 30 C projects in OSS-Fuzz. The results are promising: it generates correct fuzz drivers for 55 APIs (64%) fully automatically, and for 23 more APIs (78 in total, 91%) when manually configured semantic correctness checkers are used to filter the generated drivers.
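The two percentages are cumulative over the same 86 APIs. A quick arithmetic check, using only the counts quoted above (the variable names are illustrative):

```python
# Counts quoted in the evaluation summary above.
total_apis = 86
fully_automatic = 55
with_manual_checkers = 23  # additional APIs covered once checkers are configured

automatic_rate = fully_automatic / total_apis
cumulative_rate = (fully_automatic + with_manual_checkers) / total_apis

print(f"fully automatic: {automatic_rate:.0%}")
print(f"including manually configured checkers: {cumulative_rate:.0%}")
```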
The following demo shows the process of using an iterative query strategy to generate a valid fuzz driver for md_html from the md4c project.
In the first iteration, fuzzdrivergpt generated a fuzz driver with a compilation error; in the following iteration, it fixed that error using code fix prompts.
The examples provide several drivers that fuzzdrivergpt generated using iterative query strategies.
- python 3 (python 3.8 is recommended)
- docker (see the latest docker install steps)
- run install-pre.sh:

      bash install-pre.sh YOUR-OPENAI-KEY YOUR-OPENAI-ORGID

After finishing the above, activate the Python environment for running fuzzdrivergpt:

    . venv/bin/activate
Before generating drivers, you need to prepare the analysis environment and the execution environment. Currently, we provide environments for 30 OSS-Fuzz C projects. We are refining this step to make environment preparation more general and painless.
For supported projects, run the following command (we use md4c as an example):

    python prepareOSSFuzzImage.py -t fuzzdrivergpt-env md4c

The following command generates a fuzz driver for md_html from the md4c project using the gpt-4-0314 model and the ITER-BA generation strategy. It tries up to 3 rounds (-MR 3) to generate a valid fuzz driver. In each round, a driver is first generated and then iteratively fixed up to 20 times (-MI 20). Details and results of each round are saved as test_roundX.json.

    python main.py -l c -m gpt-4-0314 -t md4c -f md_html -q ITER-BA -MI 20 -o test.json

In our evaluation, ITER-BA and ITER-ALL are the superior generation strategies. See the technical overview for more details.
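The round and iteration limits described above can be pictured as two nested loops. The following is only a minimal sketch under stated assumptions, not fuzzdrivergpt's actual code: `generate_driver` and its `generate`, `validate`, and `fix` callables are hypothetical stand-ins for the model query, the driver validation, and the code fix prompt, and it assumes that the 20-fix budget is counted per round.

```python
from typing import Callable, Optional, Tuple


def generate_driver(
    generate: Callable[[], str],
    validate: Callable[[str], Tuple[bool, str]],
    fix: Callable[[str, str], str],
    max_rounds: int = 3,       # mirrors the "-MR 3" limit described above
    max_iterations: int = 20,  # mirrors the "-MI 20" limit described above
) -> Optional[str]:
    """Return the first driver that passes validation, or None if none does.

    All three callables are hypothetical placeholders for illustration.
    """
    for _ in range(max_rounds):
        driver = generate()              # each round starts from a fresh driver
        ok, error = validate(driver)
        fixes_used = 0
        while not ok and fixes_used < max_iterations:
            driver = fix(driver, error)  # request a repaired driver for this error
            ok, error = validate(driver)
            fixes_used += 1
        if ok:
            return driver
    return None


if __name__ == "__main__":
    # Toy stand-ins: the first attempt "fails to compile", one fix repairs it.
    result = generate_driver(
        generate=lambda: "driver-v0",
        validate=lambda d: (d.endswith("-fixed"), "compilation error"),
        fix=lambda d, err: d + "-fixed",
    )
    print(result)
```

Passing the three steps in as callables keeps the sketch independent of any particular model API or build setup; the toy run mirrors the md_html demo, where a compilation error from the first iteration is fixed in the next one.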
All results, along with the details of the prompts, validations, and driver code, can be found in the output JSON files.
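Until the output format is documented, one way to get oriented is to list the top-level keys of each output file. This is a minimal sketch that assumes nothing about the schema: the helper name `top_level_keys` is hypothetical, and the `test_round*.json` pattern assumes the files were produced with `-o test.json` and sit in the current directory.

```python
import glob
import json
import os
from typing import Dict, List


def top_level_keys(directory: str = ".", pattern: str = "test_round*.json") -> Dict[str, List[str]]:
    """Map each matching output file to the sorted top-level keys it contains."""
    summary: Dict[str, List[str]] = {}
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        with open(path, "r", encoding="utf-8") as handle:
            data = json.load(handle)
        if isinstance(data, dict):
            keys = sorted(data.keys())
        else:
            # Non-object documents have no keys; report the container type instead.
            keys = [f"<{type(data).__name__}>"]
        summary[os.path.basename(path)] = keys
    return summary


if __name__ == "__main__":
    for name, keys in top_level_keys().items():
        print(name, keys)
```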
- Detailed doc for output json format (coming soon)
- View local websites (coming soon)
Tested APIs lists the -t and -f option values you can try.
- Better usability
  - Log refinement
  - More documentation
  - New process/interface for API targets outside OSS-Fuzz projects
  - Automation for API usage collection
  - Refine the heavy, manual prerequisites installation process
- Greater functionality
  - Human experts in the loop (integrating expert feedback for better generation!)
  - Driver enhance mode (start from a working driver instead of generating from scratch!)
  - Generate fuzz drivers for closed-source APIs (such as binaries, macOS/Windows SDKs!)
- More programming languages
Note that this tool is for educational purposes and the author does not condone any illegal use. See LICENSE.txt for more details.
Suggestions, contributions, and discussions about fuzzdrivergpt are highly appreciated.
Contributors: