1 |
Computer Vision and Building Envelopes. Anani-Manyo, Nina K. 29 April 2021 (has links)
No description available.
|
2 |
SPARSE DEEP LEARNING FOR TIME SERIES DATA AND MAGNITUDE PRUNING OF LARGE PRETRAINED TRANSFORMER MODELS AND TEMPERING LEARNING. Mingxuan Zhang (21215987) 02 May 2025 (has links)
<p dir="ltr">Sparse deep learning has proven to be an effective technique for improving the performance of deep neural networks in areas such as uncertainty quantification, variable selection, and large-scale model compression. While most existing research has focused on settings with independent and identically distributed (i.i.d.) observations, there has been limited exploration of scenarios involving dependent data, such as time series and sequential data in natural language processing (NLP). This work addresses this gap by establishing a theoretical foundation for sparse deep learning with dependent data. It demonstrates that sparse recurrent neural networks (RNNs) can be consistently estimated and that their predictions are asymptotically normally distributed under suitable conditions, enabling accurate quantification of prediction uncertainty. Experimental results show that sparse deep learning outperforms state-of-the-art methods, such as conformal prediction, in quantifying uncertainty for time series data. Additionally, the method consistently identifies autoregressive orders in time series and surpasses existing approaches in large-scale model compression, with practical applications in fields such as finance, healthcare, and energy.</p><p dir="ltr">The success of pruning techniques in RNN-based language models has inspired further exploration of their applicability to modern large language models. Pretrained transformer models have revolutionized NLP with their state-of-the-art performance but face challenges in real-world deployment due to their massive parameter counts. To tackle this issue, parameter pruning strategies have been explored, including magnitude- and sensitivity-based approaches. However, traditional magnitude pruning has shown limitations, particularly in transfer learning scenarios for modern NLP tasks. A novel pruning algorithm, Mixture Gaussian Prior Pruning (MGPP), is introduced to address these challenges.
By employing a mixture Gaussian prior for regularization, MGPP prunes non-expressive weights while retaining the model's expressive capabilities. Extensive evaluations on a variety of NLP tasks, including natural language understanding, question answering, and natural language generation, demonstrate the effectiveness of MGPP, particularly in high-sparsity settings. Theoretical analysis further supports the consistency of sparse transformers, providing insight into the success of this approach. These advancements contribute to optimizing large-scale language models for real-world applications, improving efficiency while maintaining performance.</p><p dir="ltr">State-space modeling has recently emerged as a powerful technique across various fields, including biology, finance, and engineering. However, its potential for training deep neural networks (DNNs) and its applicability to generative modeling remain underexplored. In this part of the dissertation, we introduce tempering learning, a novel algorithm that leverages state-space modeling to train deep neural networks. By manually constructing a tempering ladder, we transform the original learning problem into a data assimilation problem. Beyond its optimization advantages, tempering learning extends to one-step image generation through a diffusion-like process. Extensive experiments demonstrate the effectiveness of our approach across classical machine learning tasks, while also showcasing its promise for one-step unconditional image generation on the CIFAR-10 and ImageNet datasets.</p>
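As background for the pruning discussion in this abstract, the classical one-shot magnitude-pruning baseline that MGPP is contrasted against can be sketched as follows; the function name and global-threshold logic are illustrative assumptions, not code from the dissertation:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitudes.

    weights:  a NumPy array of model parameters
    sparsity: fraction in [0, 1) of entries to set to zero
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # Global magnitude threshold: the k-th smallest absolute value.
    threshold = np.partition(flat, k - 1)[k - 1]
    # Keep only weights strictly above the threshold.
    mask = np.abs(weights) > threshold
    return weights * mask
```

A prior-based method such as MGPP replaces this hard magnitude cutoff with a regularizer that shrinks non-expressive weights during training, which is why it behaves better in transfer-learning settings.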
|
3 |
GENERAL AVIATION AIRCRAFT FLIGHT STATUS IDENTIFICATION FRAMEWORK. Qilei Zhang (18284122) 01 April 2024 (has links)
<p dir="ltr">The absence or limited availability of operational statistics at general aviation airports prevents airport managers and operators from assessing comprehensive operational data. The traditional manual compilation of operational statistics is labor-intensive and lacks the depth and accuracy needed to depict a holistic picture of a general aviation airport’s operations. This research developed a reliable and efficient approach to address the problem by providing a comprehensive and versatile flight status identification framework. </p><p dir="ltr">Leveraging the BlueSky flight simulation module, the research generated a synthetic flight database that emulates real-world general aviation flight scenarios. Two neural network architectures, an RNN-GAN network and a refined Seq2Seq network, were explored to examine their capability to reconstruct flight trajectories. The Seq2Seq network, which demonstrated better performance, was further employed to estimate different metrics of the simulated aircraft, such as internal mechanical metrics and flight phase. Additionally, this research applied a range of tailored evaluation techniques to assess the efficacy of the flight status predictions and conducted comparative analyses between various configurations. </p><p dir="ltr">Furthermore, the research concluded by discussing the future development of the framework, emphasizing its potential for generalization across various flight data applications and scenarios. The enhanced methodology for collecting operational statistics and the accompanying analysis tool will enable airport managers and regulators to gain a comprehensive view of an airport’s operations, facilitating airport planning and development.</p>
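To make "flight phase" concrete: phase labels of the kind a learned model like the Seq2Seq network above predicts are often illustrated with a simple kinematic rule baseline. The thresholds below are hypothetical and are not taken from this research:

```python
def label_flight_phase(altitude_agl_ft, vertical_speed_fpm, ground_speed_kt):
    """Rule-based flight-phase labeler for a single trajectory sample.

    All thresholds are hypothetical illustrations of a kinematic baseline,
    not values used by the framework described in the abstract.
    """
    # On the ground: slow and essentially at field elevation.
    if ground_speed_kt < 30 and altitude_agl_ft < 50:
        return "ground"
    # Sustained positive vertical speed indicates a climb.
    if vertical_speed_fpm > 300:
        return "climb"
    # Sustained negative vertical speed indicates a descent.
    if vertical_speed_fpm < -300:
        return "descent"
    # Otherwise treat the sample as level flight.
    return "cruise"
```

A neural approach replaces hand-tuned cutoffs like these with labels learned from trajectory data, which is what allows it to generalize across aircraft types and airport environments.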
|