Global ETD Search

1	<b>MACHINE LEARNING FOR THE DESIGN OF OPTICS/PHOTONICS DEVICES AND SYSTEMS</b> Yingheng Tang (17841722) 25 January 2024 (has links) <p dir="ltr">Machine learning research has recently made impressive progress across disciplines such as computer vision and natural language processing, as well as in scientific fields including materials and molecule discovery and chip and circuit design. In photonics and optics, conventional design and optimization methods typically demand substantial time and extensive computing resources; machine learning approaches hold the potential to significantly improve and expedite these processes. Conversely, machine learning algorithms can benefit from optical/photonics-based neuromorphic computing systems because of their strengths in power consumption and parallelization. This talk will focus on implementing machine learning algorithms to optimize optical/photonic devices (ML for photonics) as well as building optical computing systems for ML applications (photonics for ML). First, I will discuss my work using a probabilistic generative model (CVAE) to design nanopatterned photonic power splitters with arbitrary splitting ratios. The model incorporates adversarial censoring and active learning to increase the quality of the generated devices. Next, I will report a physics-guided and physics-explainable recurrent neural network for time-dynamics discovery in optical resonances, which can precisely forecast the time-domain response of resonance features from a very short portion of the initial input. The model is trained in a two-step multi-fidelity framework for high-accuracy forecasting. 
Finally, I will present our progress in developing free-space reconfigurable optical computing systems for scientific computing: an optical general matrix multiplication (GEMM) hardware accelerator built by engineering a spatially reconfigurable array made from chalcogenide phase-change materials. A device-system co-design methodology was implemented for GEMM system optimization. The device has been demonstrated on a variety of ML applications.</p> Applications in physical sciences Neural networks Machine Learning Accelerator photonic devices application generative model
2	Accelerator Architecture for Secure and Energy Efficient Machine learning Samavatian, Mohammad Hossein 12 September 2022 (has links) No description available. Computer Science Computer Engineering
3	Toward Energy-Efficient Machine Learning: Algorithms and Analog Compute-In-Memory Hardware Indranil Chakraborty (11180610) 26 July 2021 (has links) <div>The ‘Internet of Things’ has increased the demand for artificial intelligence (AI)-based edge computing in applications ranging from healthcare monitoring systems to autonomous vehicles. However, the growing complexity of machine learning workloads requires rethinking how to make AI amenable to resource-constrained environments such as edge devices. To that effect, the entire stack of machine learning, from algorithms to hardware primitives, has been explored to enable energy-efficient intelligence at the edge. </div><div><br></div><div>From the algorithmic aspect, model compression techniques such as quantization are powerful tools to address the growing computational cost of ML workloads. However, quantization in particular can result in a substantial loss of performance for complex image classification tasks. To address this, a principal component analysis (PCA)-driven methodology is proposed to identify the important layers of a binary network and design mixed-precision networks. The proposed Hybrid-Net achieves a significant improvement in classification accuracy over binary networks such as XNOR-Net for ResNet and VGG architectures on the CIFAR-100 and ImageNet datasets, while still achieving remarkable energy efficiency. </div><div><br></div><div>Having explored compressed neural networks, there is a need to investigate suitable computing systems to further improve energy efficiency. Memristive crossbars have been extensively explored as an alternative to traditional CMOS-based systems for deep learning accelerators due to their high on-chip storage density and efficient matrix-vector multiplication (MVM) compared to digital CMOS. However, the analog nature of computing poses significant issues due to various non-idealities, such as parasitic resistances and the non-linear I-V characteristics of the memristor device. 
To address this, a simplified equation-based model of the non-ideal behavior of crossbars is developed and, correspondingly, a modified technology-aware training algorithm is proposed. To address the drawbacks of equation-based modeling, a Generalized Approach to Emulating Non-Ideality in Memristive Crossbars using Neural Networks (GENIEx) is proposed, in which a neural network is trained on HSPICE simulation data to learn the transfer characteristics of the non-ideal crossbar. Next, a functional simulator was developed that includes key architectural facets such as tiling and bit-slicing to analyze the impact of non-idealities on the classification accuracy of large-scale neural networks.</div><div><br></div><div>To truly realize the benefits of hardware primitives and the algorithms on top of the stack, it is necessary to build efficient devices that mimic the behavior of the fundamental units of a neural network, namely, neurons and synapses. However, efforts have largely been invested in electrical-domain implementations, which face potential limitations such as switching speed and functional errors due to analog computing. As an alternative, a purely photonic operation of an Integrate-and-Fire Spiking neuron is proposed, based on the phase change dynamics of Ge2Sb2Te5 (GST) embedded on top of a microring resonator, which alleviates the energy constraints of PCMs in the electrical domain. Further, the inherent parallelism of wavelength-division multiplexing (WDM) was leveraged to propose a photonic dot-product engine. The proposed computing platform was used to emulate an SNN inferencing engine for image-classification tasks. These explorations at different levels of the stack can enable energy-efficient machine learning for edge intelligence. 
</div><div><br></div><div>Having explored various domains to design efficient DNN models and studied various hardware primitives based on emerging technologies, we focus on silicon implementation of compute-in-memory (CIM) primitives for machine learning acceleration based on the more widely available CMOS technology. CIM primitives enable efficient matrix-vector multiplications (MVM) through parallelized multiply-and-accumulate operations inside the memory array itself. As CIM primitives deploy bit-serial computing, the computations are exposed to the bit-level sparsity of inputs and weights in an ML model. To that effect, we present an energy-efficient, sparsity-aware, reconfigurable-precision CIM 8T-SRAM macro for machine learning (ML) applications. Standard 8T-SRAM arrays are repurposed to enable MAC operations using selective current flow through the read-port transistors. The proposed macro dynamically leverages workload sparsity by reconfiguring the output precision in the peripheral circuitry without degrading application accuracy. Specifically, we propose a new energy-efficient reconfigurable-precision SAR ADC design with the ability to form (n+m)-bit precision using n-bit and m-bit ADCs. Additionally, the transimpedance amplifier (TIA), which is required to convert the summed current into voltage before conversion, is reconfigured based on sparsity to improve sense margin at lower output precision. The proposed macro, fabricated in 65 nm technology, provides 35.5-127.2 TOPS/W as the ADC precision varies from 6-bit to 2-bit, respectively. Building on top of the fabricated macro, we next design a hierarchical CIM core micro-architecture that addresses the existing CIM scaling challenges. The proposed CIM core micro-architecture consists of 32 of the proposed sparsity-aware CIM macros. The 32 macros are divided into 4 matrix-vector multiplication units (MVMUs) consisting of 8 macros each. 
The core has three unique features: i) it can adaptively reconfigure ADC precision to achieve energy efficiency and lower latency based on input and weight sparsity, as determined by a sparsity controller; ii) it deploys a row-gating feature to maintain SNR requirements for accurate DNN computations; and iii) it provides hardware support for load balancing to offset latency mismatches arising from different ADC precisions in different compute units. Besides the CIM macros, the core micro-architecture consists of input, weight, and output memories, along with instruction memory and control circuits. The instruction set architecture allows for flexible dataflows and mapping in the proposed core micro-architecture. The sparsity-aware processing core is scheduled to be taped out next month. The proposed CIM demonstrations, complemented by our previous analysis of analog CIM systems, advance our understanding of this emerging paradigm as it pertains to ML acceleration.</div> Computer Engineering Machine Learning Hardware Artificial Intelligence Compute-in-Memory Accelerator Machine Learning Accelerator Photonic Neural Networks Neural Networks Phase Change Materials
