Global ETD Search


A Comprehensive Approach to Evaluating Usability and Hyperparameter Selection for Synthetic Data Generation

<p dir="ltr">Data is the key component of every machine learning algorithm. Without sufficient quantities of quality data, the vast majority of machine learning algorithms fail to perform. Acquiring the data necessary to feed these algorithms, however, is a universal challenge. Recently, synthetic data production methods have become increasingly relevant as a way of addressing a variety of data issues. Synthetic data allows researchers to produce supplemental data from an existing dataset. Furthermore, synthetic data anonymizes data without losing functionality. To advance the field of synthetic data production, however, measuring the quality of the synthetic data produced is an essential step. Although methods for evaluating synthetic data quality exist, they tend to address only limited aspects of data quality. Furthermore, synthetic data evaluation varies immensely from one study to another, which makes comparing quality across studies even more challenging. Finally, although tools exist to automatically tune hyperparameters, these tools focus on traditional machine learning applications. Thus, identifying ideal hyperparameters for individual synthetic data generation use cases is also an ongoing challenge.</p>

10.25394/pgs.26339674.v1

Data quality

Adversarial machine learning

Identifier	oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/26339674
Date	20 July 2024
Creators	Adriana Louise Watson (19180771)
Source Sets	Purdue University
Detected Language	English
Type	Text, Thesis
Rights	CC BY 4.0
Relation	https://figshare.com/articles/thesis/A_Comprehensive_Approach_to_Evaluating_Usability_and_Hyperparameter_Selection_for_Synthetic_Data_Generation/26339674

