Learning with constraints on processing and supervision

Collecting a sufficient amount of data and centralizing it are both costly and raise privacy concerns. These concerns stem from the communication costs between data-collecting devices and from the personal nature of the data, such as an end user's text messages. The goal is to train generalizable machine learning models under such data constraints, without sharing or transferring the data.

In this thesis, we present solutions to several aspects of learning with data constraints, such as constraints on processing and supervision. We focus on federated learning, online learning, and learning generalizable representations, and provide setting-specific training recipes.

In the first scenario, we tackle a federated learning problem where data is distributed across different users and should not be centralized. Traditional approaches either ignore the heterogeneity problem or increase communication costs to handle it. Our solution addresses the heterogeneity of user data by imposing a dynamic regularizer that adapts to each user's heterogeneity without extra transmission costs. Theoretically, we establish convergence guarantees. We extend our ideas to personalized federated learning, where the model is customized to each end user, and to heterogeneous federated learning, where users support different model architectures.

In the next scenario, we consider online meta-learning, where there is only one user and the user's data distribution changes over time. The goal is to adapt to new data distributions with very little labeled data from each distribution. A naive approach is to store data from the different distributions and train a model from scratch once sufficient data is available. Our solution efficiently summarizes the information from each task's data so that the memory footprint does not scale with the number of tasks.

Lastly, we aim to train generalizable representations given a dataset. We consider a setting where we have access to a powerful (more complex) teacher model. Traditional methods do not distinguish between data points and force the model to learn all the information from the teacher. Our proposed method focuses on the learnable part of the input space and distills the attainable information from the teacher model, discarding information that exceeds the model's capacity.
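For orientation, the traditional baseline that this paragraph contrasts against can be sketched as a standard temperature-scaled distillation loss applied uniformly to every data point. This is a minimal illustration of generic knowledge distillation, not the method proposed in the thesis; the function names and the temperature value are assumptions chosen for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic distillation objective (assumed for illustration only).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def batch_distillation_loss(student_batch, teacher_batch, temperature=2.0):
    """Uniform average over the batch: every point is weighted equally."""
    losses = [
        distillation_loss(s, t, temperature)
        for s, t in zip(student_batch, teacher_batch)
    ]
    return sum(losses) / len(losses)
```

In this sketch, `batch_distillation_loss` gives every example the same weight, which is the "do not distinguish between data points" behavior described above. A selective scheme would replace the uniform average with per-point weights, and the abstract does not specify how those are chosen.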

We compare our methods with state-of-the-art methods in each setup and show significant performance improvements. Finally, we discuss potential directions for future work.

https://hdl.handle.net/2144/46632

Artificial intelligence

Identifier	oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/46632
Date	30 August 2023
Creators	Acar, Durmuş Alp Emre
Contributors	Saligrama, Venkatesh
Source Sets	Boston University
Language	en_US
Detected Language	English
Type	Thesis/Dissertation
Rights	Attribution-NonCommercial-ShareAlike 4.0 International, http://creativecommons.org/licenses/by-nc-sa/4.0/
