21 |
Architectures de circuits nanoélectroniques neuro-inspirée
Chabi, Djaafar 09 March 2012
New nanometre-scale fabrication techniques such as self-assembly and nanoimprint make it possible to build regular arrays (crossbars) that reach extreme densities (up to 10¹² nanocomponents/cm²) while limiting fabrication cost. However, these technologies are expected to bring a significant increase in the number of defects and in the dispersion of device characteristics. The ability to exploit these crossbars therefore depends on developing new computing techniques that can specialize them and tolerate a high density of defects. In this context, the neuromimetic approach, which makes it possible both to configure the nanodevices and to tolerate their defects and characteristic dispersions, appears especially relevant. The objective of this thesis is to demonstrate the effectiveness of such an approach and to quantify the reliability obtained with a neuromimetic architecture based on a memristor crossbar, or neurocrossbar (NC). First, the thesis introduces algorithms for learning logic functions on an NC. It then characterizes the tolerance of the NC model to defects and to variations in memristor characteristics. Probabilistic analytical models that predict NC convergence are proposed and compared against Monte Carlo simulations. They take into account the impact of each type of defect and dispersion. These analytical models make it possible to extrapolate the study to very large NC circuits. Finally, the effectiveness of the proposed methods is demonstrated experimentally through the learning of logic functions by an NC composed of optically gated carbon nanotube transistors (OG-CNTFET).
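As a rough illustration of learning a logic function on a defective array, the sketch below trains a single threshold neuron with the perceptron rule while some weights are frozen. The function names, the "stuck weight" defect model, and all parameter values are assumptions made for this example; they are not the algorithms or defect models of the thesis.

```python
import random

def neuron_output(weights, x):
    """Threshold neuron: the last weight is the bias (its input is fixed at 1)."""
    s = sum(w * xi for w, xi in zip(weights, list(x) + [1]))
    return 1 if s > 0 else 0

def train_with_defects(truth_table, n_inputs, defect_rate=0.1,
                       epochs=500, lr=0.1, seed=0):
    """Perceptron rule on one neuron whose weights stand in for memristor
    conductances. Hypothetical defect model: a 'stuck' weight keeps its
    initial value and ignores every update."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    stuck = [rng.random() < defect_rate for _ in weights]

    for _ in range(epochs):
        errors = 0
        for x, target in truth_table:
            delta = target - neuron_output(weights, x)
            if delta == 0:
                continue
            errors += 1
            for i, xi in enumerate(list(x) + [1]):
                if not stuck[i]:          # defective devices cannot be tuned
                    weights[i] += lr * delta * xi
        if errors == 0:                   # every row of the truth table is correct
            return True, weights
    return False, weights

AND_TABLE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
converged, learned = train_with_defects(AND_TABLE, n_inputs=2, defect_rate=0.0)
```

Repeating the call over many seeds with a non-zero `defect_rate` gives a crude Monte Carlo estimate of how often learning still converges, which is the kind of quantity the abstract says the analytical models predict.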
|
22 |
The Development And Hardware Implementation Of A High-speed Adaptable Packet Switch Fabric
Akbaba, Erdem Eyup 01 February 2013
Routers must be fast enough to keep pace with rising traffic data rates, driven by the growing need for network bandwidth and processing. The switch fabric of a router is a combination of hardware and software that moves incoming packets to the outgoing ports. Access of the input ports to the switch fabric is controlled by a scheduler, which, together with the fabric design, affects overall performance. In this thesis we investigate two switch fabric and scheduler architectures: the well-known iSlip fabric scheduler and the Byte-Focal switch. We observe that the two architectures behave differently under different input traffic load ranges. The novel contribution of this thesis is a combined switch architecture in which both are implemented and run in parallel, selectively forwarding the packets with lower delay to the outputs to achieve a lower overall average delay. The combined switch is designed on FPGA and simulated. Our results show that the combined architecture has 100% throughput and a lower average delay than both the Byte-Focal switch and the input-queued switch with iSlip. On the other hand, the combined switch uses more FPGA resources than the individual iSlip and Byte-Focal switches.
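For readers unfamiliar with iSlip, the following is a minimal sketch of one request/grant/accept iteration as the algorithm is usually described for input-queued switches. It is a textbook-style illustration only; the function name and data layout are assumptions, and it says nothing about the FPGA design or the combined architecture of the thesis.

```python
def islip_iteration(requests, grant_ptr, accept_ptr):
    """One iSlip iteration for an n x n input-queued switch.

    requests[i] is the set of outputs for which input i has queued cells.
    grant_ptr[j] / accept_ptr[i] are the round-robin pointers of output j
    and input i; they are updated in place. Returns {input: output}.
    """
    n = len(requests)

    # Grant: each output grants the first requesting input at or after its pointer.
    grants = {}                                  # input -> outputs that granted it
    for out in range(n):
        for k in range(n):
            inp = (grant_ptr[out] + k) % n
            if out in requests[inp]:
                grants.setdefault(inp, []).append(out)
                break

    # Accept: each input accepts the first granting output at or after its pointer.
    matches = {}
    for inp, outs in grants.items():
        for k in range(n):
            out = (accept_ptr[inp] + k) % n
            if out in outs:
                matches[inp] = out
                # Pointers move one past the match, and only when a grant is accepted.
                grant_ptr[out] = (inp + 1) % n
                accept_ptr[inp] = (out + 1) % n
                break
    return matches

# Example traffic: input 0 wants outputs 0 and 1, input 1 wants 0, input 2 wants 1.
requests = [{0, 1}, {0}, {1}]
grant_ptr, accept_ptr = [0, 0, 0], [0, 0, 0]
first = islip_iteration(requests, grant_ptr, accept_ptr)
second = islip_iteration(requests, grant_ptr, accept_ptr)
```

In the first call both outputs grant input 0, so only one match is made; in the second call the pointers have moved apart and two matches are made, which is the desynchronization effect that lets iSlip reach high throughput under load.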
|
23 |
IN-MEMORY COMPUTING WITH CMOS AND EMERGING MEMORY TECHNOLOGIES
Shubham Jain (7464389) 17 October 2019
Modern computing workloads such as machine learning and data analytics perform simple computations on large amounts of data. Traditional von Neumann computing systems, which consist of separate processor and memory subsystems, are inefficient in realizing modern computing workloads due to frequent data transfers between these subsystems that incur significant time and energy costs. In-memory computing embeds computational capabilities within the memory subsystem to alleviate the fundamental processor-memory bottleneck, thereby achieving substantial system-level performance and energy benefits. In this dissertation, we explore a new generation of in-memory computing architectures that are enabled by emerging memory technologies and new CMOS-based memory cells. The proposed designs realize Boolean and non-Boolean computations natively within memory arrays.
For Boolean computing, we leverage the unique characteristics of emerging memories that allow multiple word lines within an array to be simultaneously enabled, opening up the possibility of directly sensing functions of the values stored in multiple rows using a single access. We propose Spin-Transfer Torque Compute-in-Memory (STT-CiM), a design for in-memory computing with modifications to peripheral circuits that leverage this principle to perform logic, arithmetic, and complex vector operations. We address the challenge of reliable in-memory computing under process variations by using error detecting and correcting codes to control errors during CiM operations. We demonstrate how STT-CiM can be integrated within a general-purpose computing system and propose architectural enhancements to processor instruction sets and on-chip buses for in-memory computing.
For non-Boolean computing, we explore crossbar arrays of resistive memory elements, which are known to compactly and efficiently realize a key primitive operation involved in machine learning algorithms, i.e., vector-matrix multiplication. We highlight a key challenge involved in this approach: the actual function computed by a resistive crossbar can deviate substantially from the desired vector-matrix multiplication operation due to a range of device and circuit level non-idealities. It is essential to evaluate the impact of the errors introduced by these non-idealities at the application level. There has been no study of the impact of non-idealities on the accuracy of large-scale workloads (e.g., Deep Neural Networks [DNNs] with millions of neurons and billions of synaptic connections), in part because existing device and circuit models are too slow to use in application-level evaluation. We propose a Fast Crossbar Model (FCM) to accurately capture the errors arising due to crossbar non-idealities while being four-to-five orders of magnitude faster than circuit simulation. We also develop RxNN, a software framework to evaluate DNN inference on resistive crossbar systems. Using RxNN, we evaluate a suite of large-scale DNNs developed for the ImageNet Challenge (ILSVRC). Our evaluations reveal that the errors due to resistive crossbar non-idealities can degrade the overall accuracy of DNNs considerably, motivating the need for compensation techniques. Subsequently, we propose CxDNN, a hardware-software methodology that enables the realization of large-scale DNNs on crossbar systems with minimal degradation in accuracy by compensating for errors due to non-idealities. CxDNN comprises (i) an optimized mapping technique to convert floating-point weights and activations to crossbar conductances and input voltages, (ii) a fast re-training method to recover accuracy loss due to this conversion, and (iii) low-overhead compensation hardware to mitigate dynamic and hardware-instance-specific errors. Unlike previous efforts that are limited to small networks and require the training and deployment of hardware-instance-specific models, CxDNN presents a scalable compensation methodology that can address large DNNs (e.g., ResNet-50 on ImageNet), and enables a common model to be trained and deployed on many devices.
For non-Boolean computing, we also propose TiM-DNN, a programmable hardware accelerator that is specifically designed to execute ternary DNNs. TiM-DNN supports various ternary representations including unweighted (-1,0,1), symmetric weighted (-a,0,a), and asymmetric weighted (-a,0,b) ternary systems. TiM-DNN is an in-memory accelerator designed using TiM tiles, specialized memory arrays that perform massively parallel signed vector-matrix multiplications on ternary values per access. TiM tiles are in turn composed of Ternary Processing Cells (TPCs), new CMOS-based memory cells that function as both ternary storage units and signed scalar multiplication units. We evaluate an implementation of TiM-DNN in 32 nm technology using an architectural simulator calibrated with SPICE simulation and RTL synthesis. TiM-DNN achieves a peak performance of 114 TOPs/s, consumes 0.9 W of power, and occupies 1.96 mm² of chip area, representing a 300X improvement in TOPS/W compared to a state-of-the-art NVIDIA Tesla V100 GPU. In comparison to popular quantized DNN accelerators, TiM-DNN achieves 55.2X-240X and 160X-291X improvement in TOPS/W and TOPS/mm², respectively.
In summary, the dissertation proposes new in-memory computing architectures and addresses the need for scalable modeling frameworks and compensation techniques for resistive crossbar based in-memory computing fabrics. Our evaluations show that in-memory computing architectures are promising for realizing modern machine learning and data analytics workloads, and can attain orders-of-magnitude improvement in system-level energy and performance over traditional von Neumann computing systems.
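To make the crossbar primitive concrete, the sketch below computes the ideal vector-matrix product of a resistive crossbar, with each signed weight mapped to a differential pair of conductances. The mapping, the function names, and the conductance range are assumptions for illustration; the sketch deliberately ignores the device and circuit non-idealities that FCM and RxNN are described as modelling, and it is not the CxDNN mapping technique.

```python
def weights_to_conductances(weights, g_min=1e-6, g_max=1e-4):
    """Map each signed weight to a (G+, G-) conductance pair in siemens.

    Hypothetical linear mapping: the positive part of a weight goes to G+,
    the negative part to G-, and both sit on top of a floor g_min.
    """
    w_max = max(abs(w) for row in weights for w in row) or 1.0
    scale = (g_max - g_min) / w_max
    g_plus = [[g_min + scale * max(w, 0.0) for w in row] for row in weights]
    g_minus = [[g_min + scale * max(-w, 0.0) for w in row] for row in weights]
    return g_plus, g_minus, scale

def crossbar_vmm(voltages, g_plus, g_minus):
    """Ideal column currents: I_j = sum_i V_i * (G+_ij - G-_ij)."""
    n_cols = len(g_plus[0])
    return [
        sum(v * (gp[j] - gm[j]) for v, gp, gm in zip(voltages, g_plus, g_minus))
        for j in range(n_cols)
    ]

# Ternary weights (-1, 0, 1): rows are inputs, columns are outputs.
weights = [[1, -1], [0, 1]]
g_plus, g_minus, scale = weights_to_conductances(weights)
currents = crossbar_vmm([0.2, 0.1], g_plus, g_minus)
# Dividing by the mapping scale recovers the numerical product of inputs and weights.
products = [i / scale for i in currents]
```

In this ideal model the g_min floor cancels in the differential pair, so the recovered products equal the exact vector-matrix product up to floating-point rounding; any deviation in real hardware would come from the non-idealities the abstract discusses.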
|
24 |
Optimisation de mémoires PCRAM pour générations sub-40 nm : intégration de matériaux alternatifs et structures innovantes
Hubert, Quentin 17 December 2013
In recent years, the ever-growing demand for high-performance non-volatile memories has led to the development of NOR Flash and NAND Flash technologies, which today dominate the non-volatile memory market. However, the scaling of these technologies, which made it possible to reduce their cost, is now approaching its limits. As a result, alternative and emerging memories are being developed, and among them phase-change memory, or PCRAM, is one of the most promising candidates both to replace Flash memories, NOR Flash in particular, and to reach new markets such as the SCM market. Nevertheless, to be fully competitive with other memory technologies, some aspects of PCRAM performance still need to be improved. In this thesis we therefore seek to obtain higher-performance PCRAM devices. Among the results presented, we reduce the programming currents and power consumption of the devices while increasing data retention at high temperature. To do so, we modify the device structure or use an alternative phase-change material. In addition, using innovative solutions, we enable PCRAM devices to retain data during a possible soldering step of the memory chip. Finally, we designed, developed, and validated a fabrication process that integrates a silicon PN selection diode in series with a PCRAM resistive element, demonstrating the value of this vertical selector as the selection element of a PCRAM cell integrated in a crossbar architecture.
|
25 |
Konstrukce multifunkčního obráběcího centra / Design of multi-functional machining centre
Cvejn, Jiří January 2013
The purpose of this diploma thesis is the design of the frame, crossbar, transverse feed, and sliding feed of a multifunctional machining centre. The first part gives a brief survey of the history of machine tools, the classification of machining centres, materials for frame construction, and alternative drives for the sliding axes. I then analyse the parameters of competing machine tools, from which I selected the parameters of our machine. The work covers the design of the machine frame, the drives of the X and Y axes, and the kinematic coupling of the X, Y, and Z axes. The machine frame is analysed by the Finite Element Method. Beyond the scope of this work, I suggest a solution for covering the linear axes and for measuring their actual position. A 3D model of the frame and drives of the machine is included. The complete assembly has been brought into an immersive virtual reality environment.
|
26 |
Memristors for Neuromorphic Logic
Petropoulos, Dimitrios Petros January 2022
Novel devices are being investigated as artificial synapse candidates for neuromorphic computing. These memory devices share the characteristics of an electronic element called the memristor. The memristor can be regarded as a resistor with a history-dependent resistance, which mimics the plasticity of a biological synapse. The present work reviews various types of candidate devices that have been proposed in neuromorphic research and describes how they mimic a biological synapse and how they can be employed in artificial neural network architectures.
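The idea of a history-dependent resistance can be illustrated with a toy state-variable model. The sketch below assumes a simple linear drift model in which an internal state x between 0 and 1 moves in proportion to the current; the model choice, the function name, and every parameter value are assumptions for this example and do not come from the thesis.

```python
def simulate_memristor(voltages, dt=1e-3, r_on=100.0, r_off=16000.0,
                       k=1e4, x0=0.1):
    """Toy memristor: resistance depends on a state x that integrates current.

    R(x) = r_on * x + r_off * (1 - x), and dx/dt = k * i(t), with x clamped
    to [0, 1]. All parameter values are illustrative, not measured.
    Returns the resistance seen at each time step.
    """
    x = x0
    resistances = []
    for v in voltages:
        r = r_on * x + r_off * (1.0 - x)
        resistances.append(r)
        i = v / r                                  # Ohm's law at this instant
        x = min(1.0, max(0.0, x + k * i * dt))     # state drifts with the current
    return resistances

# 100 positive pulses followed by 100 negative pulses of equal amplitude.
trace = simulate_memristor([1.0] * 100 + [-1.0] * 100)
```

Positive pulses lower the resistance step by step and negative pulses raise it again, which is loosely analogous to strengthening and weakening a synapse.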
|
27 |
Etude et développement de points mémoires résistifs polymères pour les architectures Cross-Bar / Development and Study of Organic Polymer Resistive Memories For Crossbar Architectures
Charbonneau, Micaël 19 January 2012
Over the past decade, non-volatile Flash storage technologies have played a major role in the development of mobile and multimedia electronics (MP3 players, smartphones, USB drives, ultraportable computers, etc.). To further improve performance, increase capacity, and reduce manufacturing costs, new technological solutions are now being studied to complement or replace Flash technology. Cited by the ITRS, polymer resistive memories show very promising characteristics: low-cost processing and the possibility of high-density integration above the CMOS interconnect levels or on flexible substrates. This PhD work was devoted to the development and study of organic resistive memories based on poly-methyl-methacrylate (PMMA) polymer and C60 fullerene molecules. Three research axes were pursued in parallel: the development and physico-chemical characterization of composite materials, the integration of the organic material in specific test structures and advanced devices, and the detailed characterization of the electrical operation of the devices and of memory performance.
|
28 |
Automated Generation of Round-robin Arbitration and Crossbar Switch Logic
Shin, Eung Seo 25 November 2003
The objective of this thesis is to automate the design of round-robin arbiter logic. The resulting arbitration logic is more than 1.8X faster than the fastest prior state-of-the-art arbitration logic the author could find reported in the literature. The generated arbiter, implemented in a single chip, is fast enough in 0.25 µm CMOS technology to achieve terabit switching with a single-chip computer network switch. Moreover, this arbiter is applicable to crossbar (Xbar) arbitration logic. The generated Xbar, customized according to user specifications, provides multiple communication paths among masters and slaves.
As the number of transistors on a single chip increases rapidly, there is a productivity gap between the number of transistors available in a chip and the number of transistors per hour a chip designer designs. One solution to reduce this productivity gap is to increase the use of Silicon Intellectual Property (SIP) cores. However, a SIP core should be customized before being used in a system different than the one for which it was designed. Thus, to reconfigure the SIP core, either an engineer must spend significant effort altering the core by hand or else an enhanced CAD tool can automatically customize the core according to customer specifications. In this thesis, we present SIP generator tools for arbiter and Xbar generation.
First, we introduce a Round-robin Arbiter Generator (RAG). The RAG can generate a hierarchical Bus Arbiter (BA) and a hierarchical Switch Arbiter (SA), each of which is faster than all known previous approaches. Using a 0.25 µm TSMC standard cell library from LEDA Systems, we show the arbitration time of a 32x32 SA and demonstrate that our SA meets the time constraint to achieve terabit throughput. Furthermore, using a novel token-passing hierarchical arbitration scheme, our 32x32 SA performs better than the Ping-Pong Arbiter and Programmable Priority Encoder by factors of 1.8X and 2.3X, respectively, with less power dissipation.
Finally, we present an Xbar switch Generator (X-Gt) tool that automatically configures a crossbar for a multiprocessor System-on-a-Chip (SoC). An Xbar is generated in Register Transfer Level (RTL) Verilog HDL.
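As background for the arbitration logic discussed above, here is a behavioural sketch of a plain (flat) round-robin arbiter. It only illustrates the fairness policy; the class name and interface are assumptions, and it is neither the hierarchical token-passing scheme of the thesis nor generated RTL.

```python
class RoundRobinArbiter:
    """Behavioural model of a flat round-robin arbiter with n requesters."""

    def __init__(self, n):
        self.n = n
        self.pointer = 0          # requester with the highest priority next cycle

    def arbitrate(self, requests):
        """requests: sequence of n booleans. Returns the granted index or None."""
        for k in range(self.n):
            idx = (self.pointer + k) % self.n
            if requests[idx]:
                # The winner gets the lowest priority on the following cycle.
                self.pointer = (idx + 1) % self.n
                return idx
        return None               # nobody requested, pointer stays where it is

arbiter = RoundRobinArbiter(4)
pattern = [True, False, True, True]
grants = [arbiter.arbitrate(pattern) for _ in range(4)]
```

With requesters 0, 2, and 3 asserting continuously, the grant rotates among them in order, so no requester can starve; a crossbar switch arbiter applies the same policy per output port.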
|
29 |
Wafer-level heterogeneous integration of MEMS actuators
Braun, Stefan January 2010
This thesis presents methods for the wafer-level integration of shape memory alloy (SMA) and electrostatic actuators to functionalize MEMS devices. The integration methods are based on heterogeneous integration, which is the integration of different materials and technologies. Background information about the actuators and the integration method is provided. SMA microactuators offer the highest work density of all MEMS actuators; however, SMA is not yet a standard MEMS material, partially due to the lack of proper wafer-level integration methods. This thesis presents methods for the wafer-level heterogeneous integration of bulk SMA sheets and wires with silicon microstructures. First concepts and experiments are presented for integrating SMA actuators with knife gate microvalves, which are introduced in this thesis. These microvalves feature a gate moving out-of-plane to regulate a gas flow, and first measurements indicate outstanding pneumatic performance in relation to the consumed silicon footprint area. This part of the work also includes a novel technique for the footprint- and thickness-independent selective release of Au-Si eutectically bonded microstructures based on localized electrochemical etching. Electrostatic actuators are presented to functionalize MEMS crossbar switches, which are intended for the automated reconfiguration of copper-wire telecommunication networks and must allow a number of input lines to be interconnected to a number of output lines in any desired combination. Following the concepts of heterogeneous integration, the device is divided into two parts which are fabricated separately and then assembled. One part contains an array of double-pole single-throw S-shaped actuator MEMS switches. The other part contains a signal line routing network which is interconnected by the switches after assembly of the two parts. The assembly is based on patterned adhesive wafer bonding and results in wafer-level encapsulation of the switch array. 
During operation, the switches in these arrays must be individually addressable. Instead of controlling each element with individual control lines, this thesis investigates a row/column addressing scheme to individually pull in or pull out single electrostatic actuators in the array with maximum operational reliability, determined by the statistical parameters of the pull-in and pull-out characteristics of the actuators. / QC20100729
|