Towards Robust Machine Learning Models for Data Scarcity

abstract: Well-designed, well-trained neural networks now yield state-of-the-art results across many domains, including data mining, computer vision, and medical image analysis. Progress has been limited, however, for tasks where labels are difficult or impossible to obtain. This reliance on exhaustive labeling is a critical obstacle to the rapid deployment of neural networks. Moreover, current methods scale poorly to a large number of unseen concepts and depend on being passively spoon-fed data and supervision.

To overcome these data scarcity and generalization issues, in this dissertation I first propose two unsupervised conventional machine learning algorithms, hyperbolic stochastic coding and multi-resemble multi-target low-rank coding, to address the incomplete-data and missing-label problems. I further introduce a deep multi-domain adaptation network that leverages the power of deep learning by transferring rich knowledge from a large labeled source dataset. I also devise a novel time-sequence dynamically hierarchical network that adaptively simplifies the network to cope with scarce data.

For learning a large number of unseen concepts, lifelong machine learning offers many advantages, including abstracting knowledge from prior learning and using that experience to help future learning, regardless of how much data is currently available. To incorporate this capability and make it versatile, I propose deep multi-task weight consolidation, which accumulates knowledge continuously and significantly reduces data requirements in a variety of domains. Inspired by recent breakthroughs in automatically learning suitable neural network architectures (AutoML), I develop a nonexpansive AutoML framework to train an online model without abundant labeled data. This framework automatically expands the network to increase model capability when necessary, then compresses the model to maintain efficiency.

In my ongoing work, I propose an alternative method of supervised learning that does not require direct labels. It can use various forms of supervision derived from an image or object as target values for supervising the target tasks without labels, and it turns out to be surprisingly effective. The proposed method requires only few-shot labeled data to train, can learn the information it needs in a self-supervised manner, and generalizes to datasets not seen during training.

Dissertation/Thesis / Doctoral Dissertation Computer Science 2020

http://hdl.handle.net/2286/R.I.57014

Medical Image Analysis

Modeling Disease Progression

Sparse Coding

Identifier	oai:union.ndltd.org:asu.edu/item:57014
Date	January 2020
Contributors	Zhang, Jie (Author), Wang, Yalin (Advisor), Liu, Huan (Committee member), Stonnington, Cynthia (Committee member), Liang, Jianming (Committee member), Yang, Yezhou (Committee member), Arizona State University (Publisher)
Source Sets	Arizona State University
Language	English
Detected Language	English
Type	Doctoral Dissertation
Format	172 pages
Rights	http://rightsstatements.org/vocab/InC/1.0/
