Towards maintainable machine learning development through continual and modular learning

Ostapenko, Oleksiy
As machine learning models continue to grow in size and complexity, their maintainability has become a critical concern, especially as they are increasingly deployed in dynamic, real-world environments. This thesis addresses the challenges of efficient knowledge retention, integration, and transfer in multitask and continual multitask learning, focusing on improving the maintainability of machine learning systems. Central to this work is the exploration of modular methods and the strategic use of foundation models (FMs) to facilitate continual learning (CL) and efficient model management.

This thesis first investigates how modularity can be leveraged to enable continual learning. The first article, “Continual Learning via Local Module Composition”, introduces local module composition (LMC), an approach that uses module-specific local routing to achieve automatic task inference, mitigate forgetting, and allow the merging of independently trained LMC models. The local-routing principle has since been extended and refined in subsequent research; a minimal sketch of the idea follows.
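To make the local-routing idea concrete, here is a minimal sketch, not the thesis's exact architecture: each module carries its own small router that scores how well the current input matches it, and a layer mixes module outputs according to those purely local scores, so no task identity is needed at inference time. All class names, layer sizes, and the softmax mixing rule below are illustrative assumptions (PyTorch).

```python
# Hypothetical sketch of per-module local routing in the spirit of LMC.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalModule(nn.Module):
    """A functional module paired with its own local router.

    The router scores how relevant this module is for the current input,
    so routing requires no task label at test time (automatic task inference).
    """
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU())
        # Local structural component: scores input relevance for this module only.
        self.router = nn.Linear(d_in, 1)

    def forward(self, x):
        return self.f(x), self.router(x)  # (output, unnormalized relevance score)

class ModularLayer(nn.Module):
    """Composes module outputs using locally computed relevance scores."""
    def __init__(self, n_modules: int, d_in: int, d_out: int):
        super().__init__()
        self.modules_ = nn.ModuleList(LocalModule(d_in, d_out) for _ in range(n_modules))

    def forward(self, x):
        outs, scores = zip(*(m(x) for m in self.modules_))
        weights = F.softmax(torch.cat(scores, dim=-1), dim=-1)  # (batch, n_modules)
        stacked = torch.stack(outs, dim=-1)                     # (batch, d_out, n_modules)
        return (stacked * weights.unsqueeze(1)).sum(dim=-1)

layer = ModularLayer(n_modules=3, d_in=32, d_out=64)
y = layer(torch.randn(8, 32))  # routing is decided per input, no task label needed
```

Because each router only sees its own module, modules trained independently (e.g., on different task sequences) can in principle be dropped into the same layer and composed, which is what enables merging separately trained LMC models.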
The second article, “Continual Learning with Foundation Models: An Empirical Study of Latent Replay”, questions the necessity of complicated continual learning methods in the era of foundation models. It explores the potential of performing CL directly on the encoded features of pre-trained foundation models. This latent CL approach demonstrates that, depending on the task and data characteristics, latent replay can effectively and efficiently match the performance of traditional end-to-end CL, especially as the alignment between the pre-training and downstream data distributions improves (see the first sketch below).

The third article, “Towards Modular LLMs by Building and Reusing a Library of LoRAs”, examines the practical implementation of a hybrid approach that combines modularity and foundation models. This work proposes building a library of LoRA adapters, enabling these experts to be reused and combined across tasks, facilitated by a novel routing technique called Arrow (see the second sketch below).

This thesis contributes to the field by demonstrating how modularity and foundation models can work in tandem to create adaptive, efficient, and maintainable machine learning systems. It also outlines future directions, emphasizing the need to minimize model retraining through modular architectures and to address open challenges in modular system management.
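First, a minimal sketch of the latent replay setup: inputs are encoded once by a frozen pre-trained model, the latent features (not raw inputs) are stored in a buffer, and only a lightweight head is trained continually, mixing current and replayed latents. The encoder stand-in, buffer policy, and dimensions below are illustrative assumptions, not the study's exact protocol.

```python
# Hypothetical sketch of latent replay with a frozen foundation-model encoder (PyTorch).
import random
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 256)).eval()  # stand-in for a frozen FM
for p in encoder.parameters():
    p.requires_grad_(False)

head = nn.Linear(256, 10)                    # the only part trained continually
opt = torch.optim.SGD(head.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
buffer: list[tuple[torch.Tensor, torch.Tensor]] = []  # stores (latent, label), never raw inputs

def train_task(loader, replay_per_batch: int = 16):
    for x, y in loader:
        with torch.no_grad():
            z = encoder(x)                   # encode once; raw x is discarded
        # Mix current latents with replayed latents from earlier tasks.
        if buffer:
            zb, yb = zip(*random.sample(buffer, min(replay_per_batch, len(buffer))))
            z = torch.cat([z, torch.stack(zb)])
            y = torch.cat([y, torch.stack(yb)])
        loss = loss_fn(head(z), y)
        opt.zero_grad(); loss.backward(); opt.step()
        buffer.extend(zip(z[: x.size(0)].detach(), y[: x.size(0)]))
```

Because the expensive encoder is frozen and features are cached, each replayed example costs only a forward pass through the small head, which is where the efficiency of latent CL comes from.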
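Second, a sketch of how an Arrow-style router might select experts from a LoRA library. The reading of Arrow used here, taking the top right-singular vector of each adapter's low-rank update as a routing prototype and scoring hidden states by their absolute projection onto it, is a simplification of the paper's method, and all shapes and names are illustrative assumptions.

```python
# Hypothetical sketch of Arrow-style zero-shot routing over a library of LoRA experts.
import torch

def arrow_prototype(A: torch.Tensor, B: torch.Tensor) -> torch.Tensor:
    """A: (r, d_in), B: (d_out, r); the LoRA update is dW = B @ A.
    The top right singular vector of dW (a d_in-direction) serves as this
    expert's prototype: the input direction the adapter amplifies the most."""
    _, _, Vh = torch.linalg.svd(B @ A, full_matrices=False)
    return Vh[0]

def arrow_route(h: torch.Tensor, prototypes: torch.Tensor, k: int = 2):
    """h: (batch, d_in) hidden states; prototypes: (n_experts, d_in).
    Score each expert by |h . v| (the sign of a singular vector is arbitrary),
    keep the top-k experts, and softmax the kept scores into mixing weights."""
    scores = (h @ prototypes.T).abs()        # (batch, n_experts)
    top, idx = scores.topk(k, dim=-1)
    return torch.softmax(top, dim=-1), idx   # mixing weights and expert indices

# Illustrative usage with random adapters (d_in=16, rank=4, d_out=16, 5 experts):
d_in, r, d_out, n = 16, 4, 16, 5
protos = torch.stack([arrow_prototype(torch.randn(r, d_in), torch.randn(d_out, r))
                      for _ in range(n)])
weights, idx = arrow_route(torch.randn(3, d_in), protos, k=2)
```

The appeal of this style of routing is that the prototypes are computed from the adapter weights alone, so new experts can be added to the library and routed to without any additional router training.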