
Contributions to the Interface between Experimental Design and Machine Learning

In data science, machine learning methods such as deep learning and other artificial intelligence (AI) algorithms are widely used in many applications. These methods often have complicated model structures with a large number of model parameters and a set of hyperparameters, and they are data-driven in nature. It is therefore difficult to evaluate their performance comprehensively with respect to data quality and the hyperparameters of the algorithms. In the statistical literature, design of experiments (DoE) is a set of systematic methods for efficiently investigating the effects of input factors in complex systems. Although an AI algorithm is naturally a complex system, few works have used DoE methodology for the quality assurance of AI algorithms. Understanding the quality of AI algorithms is important for deploying them confidently in real applications such as cybersecurity, healthcare, and autonomous driving. In this dissertation, I aim to develop a set of novel methods at the interface between experimental design and machine learning, providing a systematic framework for applying DoE methodology to AI algorithms.

This dissertation contains six chapters. Chapter 1 provides a general introduction to design of experiments, machine learning, and surrogate modeling. Chapter 2 investigates the robustness of AI classification algorithms by conducting a comprehensive set of mixture experiments. Chapter 3 proposes the Do-AIQ framework, which uses DoE to evaluate the quality assurance of AI algorithms. I establish a design-of-experiments framework to construct an efficient space-filling design in a high-dimensional constrained space and develop an effective surrogate model using an additive Gaussian process to enable the quality assessment of AI algorithms. Chapter 4 introduces a framework for generating continual learning (CL) datasets for cybersecurity applications. Chapter 5 presents a variable selection method under the cumulative exposure model for time-to-event data with time-varying covariates. Chapter 6 summarizes the entire dissertation.

Doctor of Philosophy

Artificial intelligence (AI) techniques, including machine learning and deep learning algorithms, are widely used in various applications in the era of big data. While these algorithms have impressed the public with their remarkable performance, their underlying mechanisms are often highly complex and difficult to interpret. As a result, it is challenging to evaluate the overall performance and quality of these algorithms comprehensively. Design of experiments (DoE) offers a valuable set of tools for studying and understanding the underlying mechanisms of complex systems, thereby facilitating improvements. DoE has been applied successfully in diverse areas such as manufacturing, agriculture, and healthcare, where it has played a crucial role in enhancing processes and ensuring high quality. However, few works have used DoE methodology for the quality assurance of AI algorithms, even though an AI algorithm can naturally be considered a complex system. This dissertation aims to develop innovative methodologies at the interface between experimental design and machine learning. The research conducted in this dissertation can serve as a set of practical tools for applying DoE methodology to AI algorithms.

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/115953
Date: 31 July 2023
Creators: Lian, Jiayi
Contributors: Statistics; Deng, Xinwei; Hong, Yili; Freeman, Laura J.; Kim, Inyoung
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Language: English
Detected Language: English
Type: Dissertation
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
