Global ETD Search

71	Implementation of Fast Fourier Transformation on Transport Triggered Architecture / Implementation of Fast Fourier Transformation on Transport Triggered Architecture Žádník, Jakub January 2017 (has links) V této práci je navrhnut energeticky úsporný procesor typu TTA (Transport Triggered Architecture) pro výpočet rychlé Fourierovy transformace (FFT). Návrh procesoru byl vytvořen na míru použitému algoritmu pomocí speciáoních funkčních jednotek. Algoritmus byl realizován jako posloupnost instrukcí tak, že většina výpočtu probíhá ve smyčce obrahující pouze jedionu paralelní instrukci. Tato instrukce je umístěna do instrukčního bufferu, odkud je potom volána místo instrukční paměti. Díky tomu se dá docílit nižší spotřeby, neboť volání z instrukčního bufferu je efektivnější než volání z instrukční paměti. Program byl zkompilován na časovém modelu procesoru a časová simulace potvrdila správnost návrhu. Součástí práce jsou rovněž pomocné programy v Pythonu, které slouží ke generaci referenčních výsledků a automatické simulaci a porovnání výsledků simulace s referencí.
72	Application Specific Customization and Scalability of Soft Multiprocessors Unnikrishnan, Deepak C 01 January 2009 (has links) (PDF) Soft multiprocessor systems exploit the plentiful computational resources available in field programmable devices. By virtue of their adaptability and ability to support coarse grained parallelism, they serve as excellent platforms for rapid prototyping and design space exploration of embedded multiprocessor applications. As complex applications emerge, careful mapping, processor and interconnect customization are critical to the overall performance of the multiprocessor system. In this thesis, we have developed an automated scalable framework to efficiently map applications written in a high-level programmer-friendly language to customizable soft-cores. The framework allows the user to specify the application in a high-level language called Streamit. After an initial analysis of the application, a soft multiprocessor system is generated automatically using a set of customizable SPREE processors which communicate with each other over point-to-point FIFO connections. Several micro-architectural features of the processors are then automatically customized on a per-application basis to improve system area, performance and power consumption. The efficiency and scalability of this approach has been validated using a diverse set of eight audio, video and signal processing benchmarks on soft multiprocessor systems consisting of one to sixteen processors. Results show that generated soft multiprocessor systems consisting of sixteen processors can offer up to 6x speedup over a conventional single processor system. Our experiments with soft multiprocessor interconnection networks show that point-to-point topologies perform approximately 2x better than mesh topologies. Finally, we demonstrate that application-specific customizations on the instruction set, memory size, and inter-processor buffer size can improve the area and performance of the generated soft multiprocessor systems. The developed framework facilitates rapid design space exploration of soft multiprocessors. soft processor FPGA multi core automatic synthesis application specific customization scalability Electrical engineering Computer and Systems Architecture Computer Engineering Digital Circuits Hardware Systems
73	Low-power ASIC design with integrated multiple sensor system Jafarian, Hossein 08 1900 (has links) Indiana University-Purdue University Indianapolis (IUPUI) / A novel method of power management and sequential monitoring of several sensors is proposed in this work. Application specific integrated circuits (ASICs) consisting of analog and digital sub-systems forming a system on chip (SoC) has been designed using complementary metal-oxide-semiconductor (CMOS) technology. The analog sub-system comprises the sensor-drivers that convert the input voltage variations to output pulse-frequency. The digital sub-system includes the system management unit (SMU), counter, and shift register modules. This performs the power-usagemanagement, sensor-sequence-control, and output-data-frame-generation functions. The SMU is the key unit within the digital sub-system is that enables or disables a sensor. It captures the pulse waves from a sensor for 3 clocks out of a 16-clock cycle, and transmits the signal to the counter modules. As a result, the analog sub-system is at on-state for only 3/16th fraction (18 %) of the time, leading to reduced power consumption. Three cycles is an optimal number selected for the presented design as the system is unstable with less than 3 cycles and higher clock cycles results in increased power consumption. However, the system can achieve both higher sensitivity and better stability with increased on-state clock cycles. A current-starved-ring-oscillator generates pulse waves that depend on the sensor input parameter. By counting the number of pulses of a sensor-driver in one clock cycle, a sensor input parameter is converted to digital. The digital sub-system constructs a 16-bit frame consisting of 8-bit sensor data, start and stop bits, and a parity bit. Ring oscillators that drive capacitance and resistance-based sensors use an arrangement of delay elements with two levels of control voltages. A bias unit which provides these two levels of control voltages consists of CMOS cascade current mirror to maximize voltage swing for control voltage level swings which give the oscillator wider tuning range and lower temperature induced variations. The ring oscillator was simulated separately for 250 nm and 180 nm CMOS technologies. The simulation results show that when the input voltage of the oscillator is changed by 1 V, the output frequency changes linearly by 440 MHz for 180 nm technology and 206 MHz for 250 nm technology. In a separate design, a temperature sensitive ring oscillator with symmetrical load and temperature dependent input voltage was implemented. When the temperature in the simulation model was varied from -50C to 100C the oscillator output frequency reduced by 510 MHz for the 250 nm and by 810 MHz for 180 nm CMOS technologies, respectively. The presented system does not include memory unit, thus, the captured sensor data has to be instantaneously transmitted to a remote station, e.g. end user interface. This may result in a loss of sensor data in an event of loss of communication link with the remote station. In addition, the presented design does not include transmitter and receiver modules, and thus necessitates the use of separate modules for the transfer of the data. Low-power Asic design. Multiple sensor system Digital electronics Sensor networks Voltage-controlled oscillators Integrated circuits Analog-to-digital converters Digital-to-analog converters
74	Automatic synthesis of application-specific processors Mutigwe, Charles January 2012 (has links) Thesis (D. Tech. (Engineering: Electrical)) -- Central University of technology, Free State, 2012 / This thesis describes a method for the automatic generation of appli- cation speci_c processors. The thesis was organized into three sepa- rate but interrelated studies, which together provide: a justi_cation for the method used, a theory that supports the method, and a soft- ware application that realizes the method. The _rst study looked at how modern day microprocessors utilize their hardware resources and it proposed a metric, called core density, for measuring the utilization rate. The core density is a function of the microprocessor's instruction set and the application scheduled to run on that microprocessor. This study concluded that modern day microprocessors use their resources very ine_ciently and proposed the use of subset processors to exe- cute the same applications more e_ciently. The second study sought to provide a theoretical framework for the use of subset processors by developing a generic formal model of computer architecture. To demonstrate the model's versatility, it was used to describe a number of computer architecture components and entire computing systems. The third study describes the development of a set of software tools that enable the automatic generation of application speci_c proces- sors. The FiT toolkit automatically generates a unique Hardware Description Language (HDL) description of a processor based on an application binary _le and a parameterizable template of a generic mi- croprocessor. Area-optimized and performance-optimized custom soft processors were generated using the FiT toolkit and the utilization of the hardware resources by the custom soft processors was character- ized. The FiT toolkit was combined with an ANSI C compiler and a third-party tool for programming _eld-programmable gate arrays (FPGAs) to create an unconstrained C-to-silicon compiler. Compilers (Computer programmes) Field programmable gate arrays Computer architecture Application software - Automatic control
75	Application Of Alpha Power Law Models To The PLL Design Methodology Using Behavioral Models Balssubramanian, Suresh 04 1900 (has links) (PDF) No description available. Computer Aided Design Integrated Circuits Phased Locked Loop PLL Design Digital Signal Processing (DSP) Phase Frequency Detector (PFD) Voltage Controlled Oscillators (VCO) Electronic Engineering
76	Fault Tolerant Cryptographic Primitives for Space Applications Juliato, Marcio January 2011 (has links) Spacecrafts are extensively used by public and private sectors to support a variety of services. Considering the cost and the strategic importance of these spacecrafts, there has been an increasing demand to utilize strong cryptographic primitives to assure their security. Moreover, it is of utmost importance to consider fault tolerance in their designs due to the harsh environment found in space, while keeping low area and power consumption. The problem of recovering spacecrafts from failures or attacks, and bringing them back to an operational and safe state is crucial for reliability. Despite the recent interest in incorporating on-board security, there is limited research in this area. This research proposes a trusted hardware module approach for recovering the spacecrafts subsystems and their cryptographic capabilities after an attack or a major failure has happened. The proposed fault tolerant trusted modules are capable of performing platform restoration as well as recovering the cryptographic capabilities of the spacecraft. This research also proposes efficient fault tolerant architectures for the secure hash (SHA-2) and message authentication code (HMAC) algorithms. The proposed architectures are the first in the literature to detect and correct errors by using Hamming codes to protect the main registers. Furthermore, a quantitative analysis of the probability of failure of the proposed fault tolerance mechanisms is introduced. Based upon an extensive set of experimental results along with probability of failure analysis, it was possible to show that the proposed fault tolerant scheme based on information redundancy leads to a better implementation and provides better SEU resistance than the traditional Triple Modular Redundancy (TMR). The fault tolerant cryptographic primitives introduced in this research are of crucial importance for the implementation of on-board security in spacecrafts. Security Cryptography Fault Tolerance Trusted Platform Space Systems Hash Functions Secure Hash Algorithm SHA-2 Keyed-Hash Message Authentication Code HMAC Hardware Implementation FPGA Application Specific Processor Custom Instruction HW/SW Partitioning Electrical and Computer Engineering
77	Macromodeling and simulation of linear components characterized by measured parameters Zhang, Mingyang, 1981- January 2008 (has links) Recently, microelectronics designs have reached extremely high operating frequencies as well as very small die and package sizes. This has made signal integrity an important bottleneck in the design process, and resulted in the inclusion of signal integrity simulation in the computer aided design flow. However, such simulations are often difficult because in many cases it is impossible to derive analytical models for certain passive elements, and the only available data are frequency-domain measurements or full-wave simulations. Furthermore, at such high frequencies these components are distributed in nature and require a large number of poles to be properly characterized. Simple lumped equivalent circuits are therefore difficult to obtain, and more systematic approaches are required. In this thesis we study the Vector Fitting techniques for obtaining such equivalent model and propose a more streamlined approach for preserving passivity while maintaining accuracy.
78	Fault Tolerant Cryptographic Primitives for Space Applications Juliato, Marcio January 2011 (has links) Spacecrafts are extensively used by public and private sectors to support a variety of services. Considering the cost and the strategic importance of these spacecrafts, there has been an increasing demand to utilize strong cryptographic primitives to assure their security. Moreover, it is of utmost importance to consider fault tolerance in their designs due to the harsh environment found in space, while keeping low area and power consumption. The problem of recovering spacecrafts from failures or attacks, and bringing them back to an operational and safe state is crucial for reliability. Despite the recent interest in incorporating on-board security, there is limited research in this area. This research proposes a trusted hardware module approach for recovering the spacecrafts subsystems and their cryptographic capabilities after an attack or a major failure has happened. The proposed fault tolerant trusted modules are capable of performing platform restoration as well as recovering the cryptographic capabilities of the spacecraft. This research also proposes efficient fault tolerant architectures for the secure hash (SHA-2) and message authentication code (HMAC) algorithms. The proposed architectures are the first in the literature to detect and correct errors by using Hamming codes to protect the main registers. Furthermore, a quantitative analysis of the probability of failure of the proposed fault tolerance mechanisms is introduced. Based upon an extensive set of experimental results along with probability of failure analysis, it was possible to show that the proposed fault tolerant scheme based on information redundancy leads to a better implementation and provides better SEU resistance than the traditional Triple Modular Redundancy (TMR). The fault tolerant cryptographic primitives introduced in this research are of crucial importance for the implementation of on-board security in spacecrafts. Security Cryptography Fault Tolerance Trusted Platform Space Systems Hash Functions Secure Hash Algorithm SHA-2 Keyed-Hash Message Authentication Code HMAC Hardware Implementation FPGA Application Specific Processor Custom Instruction HW/SW Partitioning Electrical and Computer Engineering
79	Polymorphic ASIC : For Video Decoding Adarsha Rao, S J January 2013 (has links) (PDF) Video applications are becoming ubiquitous in recent times due to an explosion in the number of devices with video capture and display capabilities. Traditionally, video applications are implemented on a variety of devices with each device targeting a speciﬁc application. However, the advances in technology have created a need to support multiple applications from a single device like a smart phone or tablet. Such convergence of applications necessitates support for interoperability among various applications, scalable performance meet the requirements of different applications and a high degree of reconﬁgurability to accommodate rapid evolution in applications features. In addition, low power consumption requirement is also very stringent for many video applications. The conventional custom hardware implementations of video applications deliver high performance at low power consumption while the recent MPSoC implementations enable high degree of interoperability and are useful to support application evolution. In this thesis, we combine the best features of custom hardware and MPSoC approaches to design a Polymorphic ASIC. A Polymorphic ASIC is an integrated circuit designed to meet the requirements of several applications belonging to a particular domain. A polymorphic ASIC consists of a fabric of computation, storage and communication resources, using which applications are composed dynamically. Although different video applications differ widely in the internal de-tails of operation, at the heart of almost every video application is a video codec (encoder and decoder). The requirements of scalability, high performance and low power consumption are very stringent for video decoding. Therefore this thesis focuses mainly on the architectural design of a Polymorphic ASIC for video decoding. We present an uniﬁed software and hardware architecture (USHA) for Polymorphic ASIC. USHA is a tiled architecture which uses loosely coupled processor and hardware tiles that are software programmable and hardware reconﬁgurable respectively. The distinctive feature of Polymorphic ASIC is the static partitioning of the application and dynamic mapping of ap-plication processes onto the computational tiles. Depending on the application scenarios, a process may be mapped onto one of the hardware or processor tiles. Polymorphic ASIC incor-porates a network–on–chip (NoC) to achieve ﬂexible communication across different tiles. Formulation of a programming framework for Polymorphic ASIC requires an implementation model that captures the structure of video decoder applications as well as the properties of the Polymorphic ASIC architecture. We derive an implementation model based on a combination of parametric polyhedral process networks, stream based functions and windowed dataﬂow models of computation. The implementation model leads to a process network oriented compilation ﬂow that achieves realization agnostic application partitioning and enables seamless migration across uniprocessor, multi–processor, semi hardware and full hardware conﬁgurations of a video decoder. The thesis also presents an application QoS aware scheduler that selects a decoder conﬁguration that best meets the application performance requirements, thereby enabling dynamic performance scaling. The memory hierarchy of Polymorphic ASIC makes use of an application speciﬁc cache. Through a combined analysis of miss rate and external memory bandwidth, we show that the degradation in decoder performance due to memory stall cycles depends on the properties of the video being decoded as well as the behavior of the external memory interface. Based on this observation, we present the design of a reconﬁgurable 2–D cache architecture which can adjust its parameters in accordance with the characteristics of the video stream being decoded. We validate the Polymorphic ASIC using a proof–of–concept implementation on an FPGA. The performance of H.264 decoder on Polymorphic ASIC is evaluated for uniprocessor, multi processor, hardware accelerated and full hardware conﬁgurations. The scaling in performance delivered by these conﬁgurations shows that the Polymorphic ASIC enables the application to achieve super linear speedups [1]. The experimental results show that different implementations of a H.264 video decoder on the Polymorphic ASIC can deliver performance comparable to a wide spectrum of devices ranging from embedded processor like ARM 9 to MPSoCs like IBM Cell. We also present the energy consumption of various conﬁgurations of video decoders on Polymorphic ASIC and an application to conﬁguration mapping aimed at minimizing the overall energy consumption of a Polymorphic ASIC. Video Decoding Polymorphic ASIC Architecture Multiprocessor System-On-Chip (MPSoC) H.264 Decoder Video Encoding H.264 Decoders H.264 Decoding Computer Science
80	Polymorphic ASIC : For Video Decoding Adarsha Rao, S J January 2013 (has links) (PDF) Video applications are becoming ubiquitous in recent times due to an explosion in the number of devices with video capture and display capabilities. Traditionally, video applications are implemented on a variety of devices with each device targeting a speciﬁc application. However, the advances in technology have created a need to support multiple applications from a single device like a smart phone or tablet. Such convergence of applications necessitates support for interoperability among various applications, scalable performance meet the requirements of different applications and a high degree of reconﬁgurability to accommodate rapid evolution in applications features. In addition, low power consumption requirement is also very stringent for many video applications. The conventional custom hardware implementations of video applications deliver high performance at low power consumption while the recent MPSoC implementations enable high degree of interoperability and are useful to support application evolution. In this thesis, we combine the best features of custom hardware and MPSoC approaches to design a Polymorphic ASIC. A Polymorphic ASIC is an integrated circuit designed to meet the requirements of several applications belonging to a particular domain. A polymorphic ASIC consists of a fabric of computation, storage and communication resources, using which applications are composed dynamically. Although different video applications differ widely in the internal de-tails of operation, at the heart of almost every video application is a video codec (encoder and decoder). The requirements of scalability, high performance and low power consumption are very stringent for video decoding. Therefore this thesis focuses mainly on the architectural design of a Polymorphic ASIC for video decoding. We present an uniﬁed software and hardware architecture (USHA) for Polymorphic ASIC. USHA is a tiled architecture which uses loosely coupled processor and hardware tiles that are software programmable and hardware reconﬁgurable respectively. The distinctive feature of Polymorphic ASIC is the static partitioning of the application and dynamic mapping of ap-plication processes onto the computational tiles. Depending on the application scenarios, a process may be mapped onto one of the hardware or processor tiles. Polymorphic ASIC incor-porates a network–on–chip (NoC) to achieve ﬂexible communication across different tiles. Formulation of a programming framework for Polymorphic ASIC requires an implementation model that captures the structure of video decoder applications as well as the properties of the Polymorphic ASIC architecture. We derive an implementation model based on a combination of parametric polyhedral process networks, stream based functions and windowed dataﬂow models of computation. The implementation model leads to a process network oriented compilation ﬂow that achieves realization agnostic application partitioning and enables seamless migration across uniprocessor, multi–processor, semi hardware and full hardware conﬁgurations of a video decoder. The thesis also presents an application QoS aware scheduler that selects a decoder conﬁguration that best meets the application performance requirements, thereby enabling dynamic performance scaling. The memory hierarchy of Polymorphic ASIC makes use of an application speciﬁc cache. Through a combined analysis of miss rate and external memory bandwidth, we show that the degradation in decoder performance due to memory stall cycles depends on the properties of the video being decoded as well as the behavior of the external memory interface. Based on this observation, we present the design of a reconﬁgurable 2–D cache architecture which can adjust its parameters in accordance with the characteristics of the video stream being decoded. We validate the Polymorphic ASIC using a proof–of–concept implementation on an FPGA. The performance of H.264 decoder on Polymorphic ASIC is evaluated for uniprocessor, multi processor, hardware accelerated and full hardware conﬁgurations. The scaling in performance delivered by these conﬁgurations shows that the Polymorphic ASIC enables the application to achieve super linear speedups [1]. The experimental results show that different implementations of a H.264 video decoder on the Polymorphic ASIC can deliver performance comparable to a wide spectrum of devices ranging from embedded processor like ARM 9 to MPSoCs like IBM Cell. We also present the energy consumption of various conﬁgurations of video decoders on Polymorphic ASIC and an application to conﬁguration mapping aimed at minimizing the overall energy consumption of a Polymorphic ASIC. Video Decoding Polymorphic ASIC Architecture Multiprocessor System-On-Chip (MPSoC) H.264 Decoder Video Encoding H.264 Decoders H.264 Decoding Computer Science

Search results