1 |
<b>PROCESSING IN MEMORY DESIGN AND OPTIMIZATIONS FOR MACHINE LEARNING INFERENCE</b> Mingxuan He (19759866) 22 October 2024
<p dir="ltr">Advances in machine learning (ML) have ignited hardware innovations for efficient execution of ML models, many of which are memory-bound (e.g., long short-term memories, multi-layer perceptrons, and recurrent neural networks). Specifically, inference using these ML models with small batches, as would be the case at the Cloud edge, has little reuse of the large filters and is deeply memory-bound. Simultaneously, processing-in or -near memory (PIM or PNM) promises an unprecedented high-bandwidth connection between compute and memory. Fortunately, the memory-bound ML models are a good fit for PIM. We focus on digital PIM, which provides higher bandwidth than PNM and does not incur the reliability issues of analog PIM. Previous PIM and PNM approaches advocate full processor cores, which do not conform to PIM’s severe area and power constraints. This thesis is composed of three major projects: Newton, activation folding (AcF), and ESPIM. Newton is SK hynix’s first accelerator-in-memory (AiMX) product for machine learning, AcF improves the performance of Newton by achieving more compute-row access overlap, and ESPIM incorporates sparse neural network models into PIM.</p>
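<p dir="ltr">The claim that small-batch inference has little weight reuse can be illustrated with a rough arithmetic-intensity estimate. The sketch below is not taken from the thesis: the function name, the 4096 x 4096 layer size, the 2-byte values, and the assumption that every weight, input, and output value is moved exactly once are all illustrative choices.</p>

```python
def arithmetic_intensity(rows, cols, batch, bytes_per_value=2):
    """Estimate FLOPs per byte for a (rows x cols) weight matrix times a (cols x batch) input.

    Assumes an idealized memory system in which each weight, input, and
    output value crosses the memory interface exactly once.
    """
    # One multiply and one add per weight for every item in the batch.
    flops = 2 * rows * cols * batch
    # Weights + input activations + output activations.
    values_moved = rows * cols + cols * batch + rows * batch
    return flops / (bytes_per_value * values_moved)


if __name__ == "__main__":
    # Hypothetical fully connected layer; sizes chosen only for illustration.
    for batch in (1, 8, 64, 512):
        print(batch, round(arithmetic_intensity(4096, 4096, batch), 2))
```

<p dir="ltr">Under these assumptions, a batch of one performs fewer than one floating-point operation per byte moved, so memory bandwidth rather than compute is likely to bound throughput; larger batches amortize each weight fetch over more work.</p>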
|
2 |
<b>Accelerating AI-driven scientific discovery with end-to-end learning and random projection</b> Md Nasim (19471057) 23 August 2024
<p dir="ltr">Scientific discovery of new knowledge from data can enhance our understanding of the physical world and lead to the innovation of new technologies. AI-driven methods can greatly accelerate scientific discovery and are essential for analyzing and identifying patterns in huge volumes of experimental data. However, the current AI-driven scientific discovery pipeline suffers from several inefficiencies, including but not limited to a lack of <b>precise modeling</b>, a lack of <b>efficient learning methods</b>, and a lack of <b>human-in-the-loop integrated frameworks</b> in the scientific discovery loop. Such inefficiencies increase resource requirements, such as expensive computing infrastructure and significant human expert effort, and subsequently slow down scientific discovery.</p><p dir="ltr">In this thesis, I introduce a collection of methods to address the lack of precise modeling, the lack of efficient learning methods, and the lack of human-in-the-loop integrated frameworks in the AI-driven scientific discovery workflow. These methods include automatic physics model learning from partially annotated noisy video data, accelerated partial differential equation (PDE) physics model learning, and an integrated AI-driven platform for rapid analysis of experimental video data. <b>My research has led to the discovery of a new size fluctuation property of material defects</b> exposed to high-temperature and high-irradiation environments, such as those inside nuclear reactors. Such a discovery is essential for designing strong materials that are critical for energy applications.</p><p dir="ltr">To address the lack of precise modeling of physics learning tasks, I developed NeuraDiff, an end-to-end method for learning phase field physics models from noisy video data. In previous learning approaches involving multiple disjoint steps, errors in one step can propagate to another, thus affecting the accuracy of the learned physics models. 
Trial-and-error simulation methods for learning physics model parameters are inefficient, depend heavily on expert intuition, and may not yield reasonably accurate physics models even after many trial iterations. By encoding the physics model equations directly into learning, the end-to-end NeuraDiff framework can provide <b>~100%</b> accurate tracking of material defects and yield correct physics model parameters.</p><p dir="ltr">To address the lack of efficient methods for PDE physics model learning, I developed Rapid-PDE and Reel. The key idea behind these methods is the random-projection-based compression of system change signals, which are sparse either in the value domain (Rapid-PDE) or in both the value and frequency domains (Reel). Experiments show that PDE model training times can be reduced significantly using our Rapid-PDE (<b>50-70%</b>) and Reel (<b>70-98%</b>) methods.</p><p dir="ltr">To address the lack of human-in-the-loop integrated frameworks for high-volume experimental data analysis, I developed an integrated framework with an easy-to-use annotation tool. Our interactive AI-driven annotation tool can reduce video annotation times by <b>50-75%</b> and enables material scientists to scale up the analysis of experimental videos.</p><p dir="ltr"><b>Our framework for analyzing experimental data has been deployed in the real world</b> for scaling up in-situ irradiation experiment video analysis and has played a crucial role in the discovery of size fluctuation of material defects under extreme heat and irradiation.</p>
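<p dir="ltr">The abstract above mentions random-projection-based compression of sparse signals. The following is a minimal sketch of generic Gaussian random projection only; it is not the Rapid-PDE or Reel algorithm, whose details are not given here. All names, the signal length, the sparsity level, and the compressed size are hypothetical choices for illustration.</p>

```python
import math
import random


def random_projection_matrix(k, n, seed=0):
    """Build a k x n matrix of Gaussian entries scaled by 1/sqrt(k).

    With this scaling, the projection preserves vector norms in expectation.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(k)]


def project(matrix, signal):
    """Compress a length-n signal into k values by matrix-vector product."""
    return [sum(w * x for w, x in zip(row, signal)) for row in matrix]


def norm(vector):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in vector))


# Hypothetical sparse "change signal": 1000 entries, only 20 of them nonzero.
n, k = 1000, 200
rng = random.Random(1)
signal = [0.0] * n
for idx in rng.sample(range(n), 20):
    signal[idx] = rng.uniform(-1.0, 1.0)

compressed = project(random_projection_matrix(k, n), signal)
ratio = norm(compressed) / norm(signal)

if __name__ == "__main__":
    print(len(signal), "->", len(compressed), "values; norm ratio", round(ratio, 3))
```

<p dir="ltr">In this sketch the compressed vector is five times shorter than the original while its norm stays close to that of the sparse signal, which is the general property that makes learning on projected data plausible. How the thesis exploits sparsity in the value or frequency domain is not specified in the abstract.</p>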
|