Robust Approaches for Learning with Noisy Labels

Deep neural networks (DNNs) have achieved remarkable success in data-intensive applications, but this success relies heavily on massive, carefully labeled data. In practice, obtaining large-scale datasets with correct labels is often expensive, time-consuming, and sometimes impossible. Common approaches to constructing datasets involve error-prone processes such as automatic labeling or crowdsourcing, which inherently introduce noisy labels. Noisy labels have been observed to severely degrade the generalization performance of classifiers, especially overparameterized (deep) neural networks. Studying noisy labels and developing techniques for training accurate classifiers in their presence is therefore of great practical significance.

In this thesis, we conduct a thorough study of learning with noisy labels (LNL) and provide a comprehensive error decomposition that reveals its core issue: the empirical risk minimizer is unreliable, i.e., DNNs are prone to overfitting noisy labels during training. To reduce the resulting learning errors, we propose five different methods:

1) Co-matching: a framework consisting of two networks that prevents the model from memorizing noisy labels;
2) SELC: a simple method that progressively corrects noisy labels and refines the model;
3) NAL: a regularization method that automatically identifies mislabeled samples and prevents the model from memorizing them;
4) EM-enhanced loss: a family of robust loss functions that mitigates the influence of noisy labels while avoiding the underfitting problem;
5) MixNN: a framework that trains the model on new synthetic samples to mitigate the impact of noisy labels.

Our experimental results demonstrate that the proposed approaches achieve performance comparable to or better than state-of-the-art approaches on benchmark datasets with simulated label noise and on large-scale datasets with real-world label noise.

Dissertation / Doctor of Philosophy (PhD)

Machine learning has been highly successful in data-intensive applications but is often hampered when datasets contain noisy labels. Learning with Noisy Labels (LNL) has recently been proposed to tackle this problem: using LNL techniques, models can still generalize well even when trained on data containing noisy supervised information. In this thesis, we study this crucial problem and provide a comprehensive analysis that reveals the core issue of LNL. We then propose five methods to effectively reduce the learning errors in LNL, and show that our approaches achieve performance comparable to or better than state-of-the-art approaches on benchmark datasets with simulated label noise and on real-world noisy datasets.
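The benchmark experiments mentioned above use datasets with simulated label noise. As a rough illustration of how such noise is commonly generated (the thesis's exact noise models, rates, and datasets are not detailed in this record, so the function below is only a hedged sketch with illustrative names and defaults), one standard scheme flips each label to a different, uniformly chosen class with a fixed probability:

```python
import numpy as np

def inject_symmetric_noise(labels, noise_rate, num_classes, seed=0):
    """Flip each label to a different, uniformly chosen class with
    probability `noise_rate` (a common way to simulate label noise).
    Names and defaults are illustrative assumptions, not the thesis's
    exact protocol."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    flip = rng.random(labels.shape[0]) < noise_rate
    # Draw an offset in [1, num_classes) and add it modulo num_classes,
    # so a flipped label is guaranteed to differ from the original.
    offsets = rng.integers(1, num_classes, size=flip.sum())
    labels[flip] = (labels[flip] + offsets) % num_classes
    return labels

# Example: corrupt roughly 40% of 10-class labels.
clean = np.random.default_rng(1).integers(0, 10, size=1000)
noisy = inject_symmetric_noise(clean, noise_rate=0.4, num_classes=10)
print("actual corruption rate:", (clean != noisy).mean())
```

Training an overparameterized network directly on such corrupted labels typically drives the training error toward zero while test accuracy degrades, which is the memorization/overfitting behavior the proposed methods aim to prevent.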

Identifier: oai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/27915
Date: January 2022
Creators: Lu, Yangdi
Contributors: He, Wenbo, Computing and Software
Source Sets: McMaster University
Language: English
Detected Language: English
Type: Thesis
