  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Design and Optimization of Resistive RAM-based Storage and Computing Systems

January 2019
abstract: Resistive Random Access Memory (ReRAM) is an emerging non-volatile memory technology with attractive attributes, including excellent scalability (< 10 nm), low programming voltage (< 3 V), fast switching speed (< 10 ns), high OFF/ON ratio (> 10), good endurance (up to 10^12 cycles) and great compatibility with silicon CMOS technology [1]. However, compared to Dynamic Random Access Memory (DRAM), ReRAM suffers from higher write latency and energy as well as reliability issues. To improve the energy efficiency, latency and reliability of ReRAM storage systems, a low-cost cross-layer approach that spans the device, circuit, architecture and system levels is proposed. For the 1T1R 2D ReRAM system, the effect of both retention and endurance errors on ReRAM reliability is considered. The proposed approach is to design circuit-level and architecture-level techniques that reduce the raw Bit Error Rate significantly and then to employ low-cost Error Control Coding to achieve the desired lifetime. For the 1S1R 2D ReRAM system, a cross-point array with "multi-bit per access" per subarray is designed for high energy efficiency and good reliability. The errors due to cell-level as well as array-level variations are analyzed, and a low-cost scheme that maintains reliability and latency with low energy consumption is proposed. For the 1S1R 3D ReRAM system, access schemes that activate multiple subarrays with multiple layers in a subarray are used to achieve high energy efficiency by activating fewer subarrays, and good reliability is achieved through innovative data organization. Finally, a novel ReRAM-based accelerator design is proposed to support multiple Convolutional Neural Network (CNN) topologies, including VGGNet, AlexNet and ResNet. The multi-tiled architecture consists of 9 processing elements per tile, where each tile implements the dot product operation using ReRAM as the computation unit. 
The processing elements operate in a systolic fashion, thereby maximizing input feature map reuse and minimizing interconnection cost. The system-level evaluation on several network benchmarks shows that the proposed architecture can improve computation efficiency and energy efficiency compared to a state-of-the-art ReRAM-based accelerator. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2019
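
The abstract above mentions computing dot products with ReRAM as the computation unit. As a rough illustration of the general idea only, the sketch below models an ideal crossbar: input voltages drive the rows, each cell contributes a current V * G, and each column sums its currents. The conductance range, the differential (G_plus, G_minus) weight mapping and all function names are assumptions made for this example; they are not taken from the dissertation, and device non-idealities are ignored.

```python
# Hypothetical conductance range in siemens (an assumption, not from the source).
G_MIN, G_MAX = 1e-6, 1e-4


def weights_to_conductances(weights, w_max):
    """Map signed weights to a differential pair of conductance matrices.

    weights[i][j] is the weight from input row i to output column j.
    Positive parts go to g_plus, negative parts to g_minus.
    """
    scale = (G_MAX - G_MIN) / w_max
    g_plus = [[G_MIN + scale * max(w, 0.0) for w in row] for row in weights]
    g_minus = [[G_MIN + scale * max(-w, 0.0) for w in row] for row in weights]
    return g_plus, g_minus, scale


def crossbar_mvm(voltages, g_plus, g_minus, scale):
    """Ideal crossbar read-out: column current I_j = sum_i V_i * G_ij.

    The difference of the two column currents, divided by the scale,
    recovers the signed dot product for each output column.
    """
    n_cols = len(g_plus[0])
    outputs = []
    for j in range(n_cols):
        i_plus = sum(v * row[j] for v, row in zip(voltages, g_plus))
        i_minus = sum(v * row[j] for v, row in zip(voltages, g_minus))
        outputs.append((i_plus - i_minus) / scale)
    return outputs


# Usage: two inputs, two outputs.
weights = [[1.0, -2.0], [0.5, 3.0]]
g_plus, g_minus, scale = weights_to_conductances(weights, w_max=3.0)
result = crossbar_mvm([0.2, 0.4], g_plus, g_minus, scale)
print(result)
```

The differential pair is one common way to represent signed weights with strictly positive conductances; a real design would also have to account for quantized conductance levels and read-out circuitry.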
2

Developing RRAM-Based Approaches for Security and Provisioning of ICs

Hanna, Drew E. 28 June 2021
No description available.
3

Algorithm and Hardware Design for Efficient Deep Learning Inference

January 2018
abstract: Deep learning (DL) has proved itself to be one of the most important developments to date, with far-reaching impacts in numerous fields such as robotics, computer vision, surveillance, speech processing, machine translation and finance. Deep learning algorithms are now widely used for countless applications because of their ability to generalize to real-world data, their robustness to noise in previously unseen data and their high inference accuracy. With the ability to learn useful features from raw sensor data, deep learning algorithms have outperformed traditional AI algorithms and pushed the boundaries of what can be achieved with AI. In this work, we demonstrate the power of deep learning by developing a neural network to automatically detect cough instances from audio recorded in unconstrained environments. For this, 24-hour-long recordings from 9 different patients are collected and carefully labeled by medical personnel. A pre-processing algorithm is proposed to convert the event-based cough dataset into a more informative dataset with the start and end of coughs, and data augmentation is introduced to regularize the training procedure. The proposed neural network achieves 92.3% leave-one-out accuracy on data captured in the real world. Deep neural networks are composed of multiple layers that are compute- and memory-intensive. This makes it difficult to execute these algorithms in real time with low power consumption using existing general-purpose computers. In this work, we propose hardware accelerators for a traditional AI algorithm based on random forest trees and for two representative deep convolutional neural networks (AlexNet and VGG). With the proposed acceleration techniques, a ~30x performance improvement was achieved compared to a CPU for random forest trees. 
For deep CNNs, we demonstrate that much higher performance can be achieved with architecture space exploration using any optimization algorithm, with system-level performance and area models for hardware primitives as inputs and the goal of minimizing latency under given resource constraints. With this method, ~30 GOPs performance was achieved for Stratix V FPGA boards. Hardware acceleration of DL algorithms alone is not always the most efficient way, nor is it always sufficient, to achieve the desired performance. There is huge headroom available for performance improvement provided the algorithms are designed with the hardware limitations and bottlenecks in mind. This work achieves hardware-software co-optimization for the Non-Maximal Suppression (NMS) algorithm. Using the proposed algorithmic changes and hardware architecture [...] With CMOS scaling coming to an end and memory bandwidth bottlenecks increasing, CMOS-based systems might not scale enough to accommodate the requirements of more complicated and deeper neural networks in the future. In this work, we explore RRAM crossbars and arrays as a compact, high-performing and energy-efficient alternative to CMOS accelerators for deep learning training and inference. We propose and implement RRAM periphery read and write circuits and achieve a ~3000x performance improvement in online dictionary learning compared to a CPU. This work also examines realistic RRAM devices and their non-idealities. We do an in-depth study of the effects of RRAM non-idealities on inference accuracy when a pretrained model is mapped to RRAM-based accelerators. To mitigate this issue, we propose Random Sparse Adaptation (RSA), a novel scheme aimed at tuning the model to account for the faults of the RRAM array on which it is mapped. Our proposed method can achieve much higher inference accuracy than the traditional Read-Verify-Write (R-V-W) method. RSA can also recover lost inference accuracy 100x to 1000x faster than R-V-W. 
Using 32-bit high-precision RSA cells, we achieved ~10% higher accuracy using faulty RRAM arrays compared to what can be achieved by mapping a deep network to a 32-level RRAM array with no variations. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2018
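
The abstract above refers to Non-Maximal Suppression (NMS). For readers unfamiliar with it, the sketch below shows the standard greedy, software-only form of NMS on axis-aligned boxes. It is the textbook baseline, not the modified algorithm or hardware architecture proposed in the dissertation; the box format `(x1, y1, x2, y2)`, the function names and the default threshold are assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only if
    it does not overlap any already-kept box by more than the threshold.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Usage: the second box overlaps the first heavily and is suppressed.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # prints [0, 2]
```

The sequential dependence on previously kept boxes is what makes this greedy form awkward to parallelize, which is one reason NMS is a natural target for algorithm and hardware co-design.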
4

System Level Exploration of RRAM for SRAM Replacement

Dogan, Rabia January 2013
Effective use of chip area plays an essential role in System-on-Chip (SOC) designs. Nowadays, on-chip memories take up more than 50% of the total die area and are responsible for more than 40% of the total energy consumption. Cache memory alone occupies 30% of the on-chip area in the latest microprocessors. This thesis project, "System Level Exploration of RRAM for SRAM Replacement", describes a Resistive Random Access Memory (RRAM) based memory organization for Coarse Grained Reconfigurable Array (CGRA) processors. Compared to the conventional Static Random Access Memory (SRAM) based memory organization, the RRAM-based memory organization offers benefits in terms of energy and area requirements. Due to the ever-growing problems faced by conventional memories with Dynamic Voltage Scaling (DVS), emerging memory technologies have gained importance. RRAM is typically seen as a possible candidate to replace existing non-volatile memory (NVM) as Flash approaches its scaling limits. Replacing SRAM with RRAM in the lowest layers of the memory hierarchy in embedded systems is a very attractive research topic: RRAM technology offers reduced energy and area requirements, but it has limitations with regard to endurance and write latency. Because technological limitations and restrictions make it hard to solve RRAM write-related issues, it becomes beneficial to explore memory access schemes that tolerate the longer write times. Since RRAM write time cannot realistically be reduced, we have to derive instruction memory and data memory access schemes that tolerate the longer write times. We present an instruction memory access scheme to cope with these problems. In addition to the modified instruction memory architecture, we investigate the effect of the longer write times on the data memory. 
The experimental results show that the proposed architectural modifications can reduce read energy consumption by a significant margin without any performance penalty.
5

Study on Resistive Switching Phenomenon in Metal Oxides for Nonvolatile Memory / 不揮発性メモリに向けた金属酸化物における抵抗スイッチング現象に関する研究

Iwata, Tatsuya 24 March 2014
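Kyoto University / 0048 / New-system doctorate by coursework / Doctor of Engineering / Kō No. 18285 / Kōhaku No. 3877 / 新制||工||1595 (University Library) / 31143 / Department of Electronic Science and Engineering, Graduate School of Engineering, Kyoto University / (Chief examiner) Professor Tsunenobu Kimoto (木本 恒暢), Professor Shizuo Fujita (藤田 静雄), Associate Professor Itsuhiro Kakeya (掛谷 一弘) / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Philosophy (Engineering) / Kyoto University / DFAM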
6

Variants of Ferroelectric Hafnium Oxide based Nonvolatile Memories

Mikolajick, T., Mulaosmanovic, H., Hoffmann, M., Max, B., Mittmann, T., Schroeder, U., Slesazeck, S. 26 January 2022
Ferroelectricity is very attractive for nonvolatile memories since it pairs non-volatility with a field-driven switching mechanism, enabling a very low-power write operation. Nonvolatile memories based on ferroelectric lead-zirconium-titanate (PZT) (see fig. 1a) have been available on the market for more than a quarter of a century [1]. Yet they are limited to niche applications due to the compatibility issues of the ferroelectric material with CMOS processes and the associated limited scalability [2]. The discovery of ferroelectricity in doped hafnium oxide has revived the activities towards a variety of scalable ferroelectric nonvolatile memory devices
