The Department of Homeland Security reports that over 90% of cyberattacks stem from security vulnerabilities in software, costing the U.S. $109 billion in damages in 2016 alone according to the White House. As NIST estimates that today's software contains 25 bugs for every 1,000 lines of code, the prompt discovery of security flaws is now vital to mitigating the next major cyberattack. Over the last decade, the software industry has overwhelmingly turned to a lightweight defect discovery approach known as fuzzing: automated testing that uncovers program bugs through repeated injection of randomly mutated test cases. Academic and industry efforts have long exploited the semantic richness of open-source software to enhance fuzzing with fast and fine-grained code coverage feedback, as well as fuzzing-enhancing code transformations facilitated through lightweight compiler-based instrumentation. However, the world's increasing reliance on closed-source software (i.e., commercial, proprietary, and legacy software) demands analogous advances in automated security vetting beyond open-source contexts.
Unfortunately, the semantic gaps between source code and opaque binary code leave fuzzing nowhere near as effective on closed-source targets. The difficulty of balancing coverage feedback speed and precision in binary executables leaves fuzzers frequently bottlenecked and orders of magnitude slower at uncovering security vulnerabilities in closed-source software. Moreover, the challenges of analyzing and modifying binary executables at scale leave closed-source software fuzzing unable to fully leverage the sophisticated enhancements that have long accelerated open-source software vulnerability discovery. As the U.S. Cybersecurity and Infrastructure Security Agency reports that closed-source software makes up over 80% of the top routinely exploited software today, combating the ever-growing threat of cyberattacks demands new practical, precise, and performant fuzzing techniques unrestricted by the availability of source code.
This thesis answers the following research questions toward enabling fast, effective fuzzing of closed-source software:
1. Can common-case fuzzing insights be exploited to achieve low-overhead, fine-grained code coverage feedback irrespective of access to source code?
2. What properties of binary instrumentation are needed to extend performant fuzzing-enhancing program transformation to closed-source software fuzzing?
In answering these questions, this thesis produces the following key innovations:
A. The first code coverage techniques to enable fuzzing speed and code coverage greater than source-level fuzzing for closed-source software targets (Chapter 3).
B. The first instrumentation platform to extend both compiler-quality code transformation and compiler-level speed to closed-source fuzzing contexts (Chapter 4).
/ Doctor of Philosophy /
The Department of Homeland Security reports that over 90% of cyberattacks stem from security vulnerabilities in software, costing the U.S. $109 billion in damages in 2016 alone according to the White House. As NIST estimates that today's software contains 25 bugs for every 1,000 lines of code, the prompt discovery of security flaws is now vital to mitigating the next major cyberattack. Over the last decade, the software industry has overwhelmingly turned to lightweight defect discovery through automated testing, uncovering program bugs through the repeated injection of randomly mutated test cases. Academic and industry efforts have long exploited the semantic richness of open-source software (i.e., software whose full internals are publicly available, interpretable, and changeable) to enhance testing with fast and fine-grained exploration feedback, as well as testing-enhancing program transformations facilitated during the process by which program executables are generated. However, the world's increasing reliance on closed-source software (i.e., software whose internals are opaque to anyone but its original developer), such as commercial, proprietary, and legacy programs, demands analogous advances in automated security vetting beyond open-source contexts.
Unfortunately, the challenges of understanding programs without their full source information leave testing nowhere near as effective on closed-source programs. The difficulty of balancing exploration feedback speed and precision in program executables leaves testing frequently bottlenecked and orders of magnitude slower at uncovering security vulnerabilities in closed-source software. Moreover, the challenges of analyzing and modifying program executables at scale leave closed-source software testing unable to fully leverage the sophisticated enhancements that have long accelerated open-source software vulnerability discovery. As the U.S. Cybersecurity and Infrastructure Security Agency reports that closed-source software makes up over 80% of the top routinely exploited software today, combating the ever-growing threat of cyberattacks demands new practical, precise, and performant software testing techniques unrestricted by the availability of programs' source code.
This thesis answers the following research questions toward enabling fast, effective fuzzing of closed-source software:
1. Can common-case testing insights be exploited to achieve low-overhead, fine-grained exploration feedback irrespective of access to programs' source code?
2. What properties of program modification techniques are needed to extend performant testing-enhancing program transformations to closed-source programs?
In answering these questions, this thesis produces the following key innovations:
A. The first techniques enabling testing of closed-source programs with speed and exploration higher than on open-source programs (Chapter 3).
B. The first platform to extend high-speed program transformations from open-source programs to closed-source ones (Chapter 4).
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/110334 |
Date | 25 May 2022 |
Creators | Nagy, Stefan |
Contributors | Computer Science, Hicks, Matthew, Meng, Na, Yao, Danfeng, Kim, Taesoo, Wang, Gang |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Language | English |
Detected Language | English |
Type | Dissertation |
Format | ETD, application/pdf |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |