As software continues to envelop traditional industries, the need for increased attention to cybersecurity is higher than ever. Software security helps protect businesses and governments from financial losses due to cyberattacks and data breaches, as well as from reputational damage. In theory, securing software is relatively straightforward: it involves following established best practices and guidelines. In practice, however, software security is often much more complicated. It requires a deep understanding of the underlying system and code (potentially including legacy code), as well as a comprehensive understanding of the threats and vulnerabilities that could be present. Software security also involves implementing strategies to protect against those threats and vulnerabilities, which may combine technologies, processes, and procedures. In fact, many real cyberattacks are caused not by zero-day vulnerabilities but by known issues that have not been addressed, so real software security also requires ongoing monitoring and maintenance to ensure critical systems remain secure.
This thesis presents a series of novel techniques that together form an enhanced software maintenance methodology, from initial bug reporting all the way through patch deployment. We begin by introducing Ad Hoc Test Generation, a novel testing technique for the case where a security vulnerability or other critical bug is not detected by the developers’ test suite and is instead discovered post-deployment. In that case, developers must quickly devise a new test that reproduces the buggy behavior. They then need to test whether their candidate patch indeed fixes the bug without breaking other functionality, all while racing to deploy before attackers pounce on exposed user installations. This work builds on record-replay and binary rewriting to automatically generate and run targeted tests for candidate patches significantly faster and more efficiently than traditional test suite generation techniques like symbolic execution.
Our prototype of this concept is called ATTUNE.
To construct patches, developers maintaining software may in some instances be forced to work directly with the binary because the source code is no longer available. For these instances, this work presents a transformer-based model called DIRECT that provides semantics-related names for variables and functions whose names have been lost, giving developers the opportunity to work with a facsimile of the source code that would otherwise be unavailable. In the event developers need even more support deciphering the decompiled code, we provide another tool called REINFOREST that allows developers to search for similar code, which they can use to further understand the code in question and as a reference when developing a patch.
After patches have been written, deployment remains a challenge. In some instances, deploying a patch for the buggy behavior may require supporting legacy systems where software cannot be upgraded without causing compatibility issues. To support these updates, this work introduces the concept of binary patch decomposition, which breaks a software release down into its component parts and allows software administrators to apply only the critical portions without breaking functionality.
We present a novel software patching methodology with which we can recreate bugs, develop patches, and deploy updates in the presence of the typical challenges that arise when patching production software, including deficient test suites, lack of source code, lack of documentation, compatibility issues, and the difficulties associated with patching binaries directly.
Identifier | oai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/cqem-n106 |
Date | January 2024 |
Creators | Saieva, Anthony |
Source Sets | Columbia University |
Language | English |
Detected Language | English |
Type | Theses |