
A Comprehensive Approach to Evaluating Usability and Hyperparameter Selection for Synthetic Data Generation

<p dir="ltr">Data is the key component of every machine-learning algorithm. Without sufficient quantities of quality data, the vast majority of machine learning algorithms fail to perform. Acquiring the data necessary to feed algorithms, however, is a universal challenge. Recently, synthetic data production methods have become increasingly relevant as a method of addressing a variety of data issues. Synthetic data allows researchers to produce supplemental data from an existing dataset. Furthermore, synthetic data anonymizes data without losing functionality. To advance the field of synthetic data production, however, measuring the quality of produced synthetic data is an essential step. Although there are existing methods for evaluating synthetic data quality, the methods tend to address finite aspects of the data quality. Furthermore, synthetic data evaluation from one study to another varies immensely, adding further challenge to the quality comparison process. Finally, although tools exist to automatically tune hyperparameters, the tools fixate on traditional machine learning applications. Thus, identifying ideal hyperparameters for individual synthetic data generation use cases is also an ongoing challenge.</p>

DOI: 10.25394/pgs.26339674.v1
Identifier: oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/26339674
Date: 20 July 2024
Creators: Adriana Louise Watson (19180771)
Source Sets: Purdue University
Detected Language: English
Type: Text, Thesis
Rights: CC BY 4.0
Relation: https://figshare.com/articles/thesis/A_Comprehensive_Approach_to_Evaluating_Usability_and_Hyperparameter_Selection_for_Synthetic_Data_Generation/26339674
