Program synthesis and vulnerability injection using a Grammar VAE

The ability to automatically detect and repair vulnerabilities in code before deployment has become the subject of increasing attention. Some approaches to this problem rely on machine learning techniques; however, the lack of datasets (code samples labeled as containing a vulnerability or not) presents a barrier to performance. We design and implement a deep neural network based on the recently developed Grammar Variational Autoencoder (VAE) architecture to generate an arbitrary number of unique C functions, each labeled as vulnerable or not. We make several improvements on the original Grammar VAE: we guarantee that every vector in the neural network's latent space decodes to a syntactically valid C function; we extend the Grammar VAE to a context-sensitive environment; and we implement a semantic repair algorithm that transforms syntactically valid C functions into semantically valid C functions that compile and execute. Users can control the semantic qualities of output functions with our constraint system, which allows them to modify the return type, change control flow structures, inject vulnerabilities into generated code, and more. We demonstrate the advantages of our model over other program synthesis models targeting similar applications. We also explore alternative applications for our model, including code plagiarism detection and compiler fuzzing, testing, and optimization.
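
To illustrate the general idea behind grammar-constrained decoding, the following is a minimal sketch and not the thesis implementation. It assumes a toy grammar (the `RULES` table), a hypothetical `decode` function, and plain Python lists standing in for the per-step rule scores a neural decoder would emit. At each step, only productions that expand the current nonterminal are eligible, so any score sequence yields a string in the grammar. The shortest-rule fallback used once the scores run out is an assumption that happens to terminate for this toy grammar; it is not a general guarantee.

```python
# Toy context-free grammar (hypothetical; far smaller than a real C grammar).
# Each production is (left-hand nonterminal, right-hand symbols).
RULES = [
    ("FUNC", ["int", "f", "(", ")", "{", "STMT", "}"]),
    ("STMT", ["return", "EXPR", ";"]),
    ("EXPR", ["NUM"]),
    ("EXPR", ["NUM", "+", "EXPR"]),
    ("NUM", ["0"]),
    ("NUM", ["1"]),
]
NONTERMINALS = {lhs for lhs, _ in RULES}


def decode(logits, max_steps=50):
    """Turn per-step rule scores into a token list that the grammar accepts.

    `logits[t][i]` is the score of RULES[i] at decoding step t.
    """
    stack = ["FUNC"]  # symbols still to be expanded, top of stack at the end
    tokens = []
    step = 0
    while stack:
        symbol = stack.pop()
        if symbol not in NONTERMINALS:
            tokens.append(symbol)  # terminals are emitted as-is
            continue
        # Mask: only rules whose left-hand side is the current nonterminal.
        valid = [i for i, (lhs, _) in enumerate(RULES) if lhs == symbol]
        if step < len(logits) and step < max_steps:
            scores = logits[step]
            choice = max(valid, key=lambda i: scores[i])
        else:
            # Out of scores: pick the shortest expansion so that this toy
            # grammar's derivation terminates (an assumption of the sketch).
            choice = min(valid, key=lambda i: len(RULES[i][1]))
        step += 1
        # Push the right-hand side reversed so its first symbol is popped next.
        stack.extend(reversed(RULES[choice][1]))
    return tokens


print(" ".join(decode([])))  # prints "int f ( ) { return 0 ; }"
```

Because invalid productions are never candidates, syntactic validity does not depend on how well the scores were learned; semantic validity (for example, declared variables and matching types) is a separate concern that masking alone does not address.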

https://hdl.handle.net/2144/37102

Identifier	oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/37102
Date	09 August 2019
Creators	Kosta, Leonard Raymond
Contributors	Xi, Hongwei
Source Sets	Boston University
Language	en_US
Detected Language	English
Type	Thesis/Dissertation