221 |
Design and Evaluation of a Single Instruction Processor / Design och utveckling av en eninstruktions processor
Mu, Rongzeng January 2003
This thesis describes a new approach to DSP processor design, using the design of an FFT processor as an example. The approach is an innovative concept developed by the Electronic Systems Division of the Department of Electrical Engineering at Linköping University. The project is to design a Sande-Tukey FFT processor step by step, going through every step from the simplest MATLAB specification to the final synthesizable VHDL specification. The steps should be as small as possible in order to avoid errors, and MATLAB should be used as far as possible.
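As an illustration of the kind of simple starting specification the abstract mentions, here is a minimal sketch of a radix-2 Sande-Tukey (decimation-in-frequency) FFT. It is written in Python rather than MATLAB, and the function names `fft_dif` and `naive_dft` are assumptions made for this example; nothing here is taken from the thesis itself.

```python
import cmath


def fft_dif(x):
    """Radix-2 decimation-in-frequency (Sande-Tukey) FFT.

    Illustrative sketch only; len(x) must be a power of two.
    """
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    half = n // 2
    # Butterflies come first in the DIF form: the sums feed the even-indexed
    # outputs, the twiddled differences feed the odd-indexed outputs.
    top = [x[k] + x[k + half] for k in range(half)]
    bottom = [
        (x[k] - x[k + half]) * cmath.exp(-2j * cmath.pi * k / n)
        for k in range(half)
    ]
    out = [0j] * n
    out[0::2] = fft_dif(top)
    out[1::2] = fft_dif(bottom)
    return out


def naive_dft(x):
    """Direct O(n^2) DFT, used only as a reference for checking fft_dif."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
```

Comparing `fft_dif` against `naive_dft` on short inputs is one way to check each small refinement step before moving closer to a hardware description.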
|
222 |
Behavioral Model of an Instruction Decoder of Motorola DSP56000 Processor
Krishna Kumar, Guda January 2006
This thesis is part of an effort to make a scalable behavioral model of the Central Processing Unit and instruction set compatible with the DSP56000 Processor. The goal of this design is to reduce the critical path, silicon area, and power consumption of the instruction decoder. The instruction decoder consists of three different types of operations: instruction fetching, decoding, and execution. Using these three steps, an efficient model has to be designed to achieve the shortest critical path, reduced silicon area, and low power consumption.
|
223 |
Bit Serial Systolic Architectures for Multiplicative Inversion and Division over GF(2<sup>m</sup>)
Daneshbeh, Amir January 2005
Systolic architectures are capable of achieving high throughput by maximizing pipelining and by eliminating global data interconnects. Recursive algorithms with regular data flows are suitable for systolization. The computation of multiplicative inversion using algorithms based on the EEA (Extended Euclidean Algorithm) is particularly suitable for systolization. Implementations based on the EEA present a high degree of parallelism and pipelinability at the bit level, which can be easily optimized to achieve local data flow and to eliminate the global interconnects that represent the most important bottleneck in today's sub-micron design process. The net result is a high clock rate and performance based on efficient systolic architectures.
This thesis examines high performance but also scalable implementations of multiplicative inversion or field division over Galois fields <i>GF</i>(2<i><sup>m</sup></i>) in the specific case of cryptographic applications where the field dimension <i>m</i> may be very large (greater than 400) and either <i>m</i> or the defining irreducible polynomial may vary. For this purpose, many inversion schemes with different basis representations are studied and, most importantly, variants of the EEA and binary (Stein's) GCD computation implementations are reviewed. A set of common as well as contrasting characteristics of these variants is discussed. As a result, a generalized and optimized variant of the EEA is proposed which can compute division, and multiplicative inversion as its subset, with the divisor in either <i>polynomial</i> or <i>triangular</i> basis representation. Further results regarding Hankel matrix formation for double-basis inversion are provided. The validity of using the same architecture to compute field division with polynomial or triangular basis representation is proved.
Next, a scalable unidirectional bit serial systolic array implementation of the proposed EEA variant is developed. Its complexity measures are defined and compared against the best known architectures. It is shown that, under the requirements specified above, the proposed architecture may achieve a higher clock rate than other designs while being more flexible and reliable and requiring a minimum number of inter-cell interconnects.
The main contribution at the system architecture level is the substitution of all counter or adder/subtractor elements with a simpler distributed structure that is free of carry propagation delays. Further, a novel restoring mechanism for the result sequences of the EEA is proposed using a double delay element implementation.
Finally, using this systolic architecture, a CMD (Combined Multiplier Divider) datapath is designed and used as the core of a novel systolic elliptic curve processor. This EC processor uses affine coordinates to compute scalar point multiplication, which results in a control unit that is very small and negligible with respect to the datapath for all practical values of <i>m</i>. The throughput of this EC processor, based on the bit serial systolic architecture, is comparable with that of previously reported designs many times larger.
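For readers unfamiliar with EEA-based inversion, the following sketch shows the textbook software form of inversion and division over GF(2^m) in polynomial basis. It is not the systolic architecture or the generalized EEA variant proposed in the thesis. Field elements are encoded as Python integers (bit i holds the coefficient of x^i), and the names `gf2m_mul`, `gf2m_inv`, and `gf2m_div` are assumptions made for this example.

```python
def gf2m_mul(a, b, f):
    """Multiply a*b modulo the irreducible polynomial f over GF(2).

    Assumes deg(a) < m, where m = deg(f).
    """
    m = f.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a          # add the current shifted copy of a
        b >>= 1
        a <<= 1                  # multiply a by x
        if (a >> m) & 1:
            a ^= f               # reduce modulo f when the degree reaches m
    return result


def gf2m_inv(a, f):
    """Inverse of a nonzero element via the extended Euclidean algorithm.

    Loop invariants: g1*a = u (mod f) and g2*a = v (mod f).
    """
    if a == 0:
        raise ZeroDivisionError("zero has no multiplicative inverse")
    u, v = a, f
    g1, g2 = 1, 0
    while u != 1:
        j = u.bit_length() - v.bit_length()   # deg(u) - deg(v)
        if j < 0:
            u, v = v, u
            g1, g2 = g2, g1
            j = -j
        u ^= v << j              # cancel the leading term of u
        g1 ^= g2 << j            # keep the invariant for g1
    return g1


def gf2m_div(a, b, f):
    """Field division a/b, computed here as a * b^(-1)."""
    return gf2m_mul(a, gf2m_inv(b, f), f)
```

In this form inversion is a special case of division with a numerator of 1, which mirrors the abstract's remark that multiplicative inversion is a subset of division. Note that the loop is driven by degree comparisons, which are the counter-like operations the thesis replaces in hardware.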
|
224 |
A Lightweight Processor Core for Application Specific Acceleration
Grant, David January 2004
Advances in configurable logic technology have permitted the development of low-cost, high-speed configurable devices, allowing one or more soft processor cores to be introduced into a configurable computing system. Soft processor cores offer logic-area savings and reduced configuration times when compared to the hardware-only implementations typically used for application specific acceleration. Programs for a soft processor core are small and simple compared to the design of a hardware core, but can leverage custom hardware within the processor core to provide greater acceleration for specific applications. This thesis presents several configurable system models, and implements one such model on a Nios Embedded Processor Development Board. A software programmable and hardware configurable lightweight processor core known as the FAST CPU is introduced. The configurable system implementation attaches several FAST CPUs to a standard Nios processor to create a system for experimentation with application specific acceleration. This system incorporating the FAST CPUs was tested for bus utilization behaviour, computing performance, and execution times for a minheap application. Experimental results are compared to the performance of a software-only solution, and also with previous research results. Experimental results verify that the theory and models used to predict bus utilization are correct. Performance testing shows that the FAST CPU is approximately 25% slower than a general purpose processor, which is expected. The FAST CPU, however, is 31% smaller in terms of logic area than the general purpose processor, and is 8% smaller than the design of a hardware-only implementation of a minheap for application specific acceleration. The results verify that it is possible to move functionality from a general purpose processor to a lightweight processor, and further, to realize an increase in performance when a task is parallelized across multiple FAST CPUs. 
The experimentation uses a procedure by which a set of equations can be derived for predicting bus utilization and deriving a cost-benefit curve for a coprocessing entity. They are applied to a specific system in this research, but the methods are generalizable to any coprocessing entity.
|
225 |
High Performance Elliptic Curve Cryptographic Co-processor
Lutz, Jonathan January 2003
In FIPS 186-2, NIST recommends several finite fields to be used in the elliptic curve digital signature algorithm (ECDSA). Of the ten recommended finite fields, five are binary extension fields with degrees ranging from 163 to 571. The fundamental building block of the ECDSA, like any ECC based protocol, is elliptic curve scalar multiplication. This operation is also the most computationally intensive. In many situations it may be desirable to accelerate the elliptic curve scalar multiplication with specialized hardware.
In this thesis a high performance elliptic curve processor is developed which is optimized for the NIST binary fields. The architecture is built from the bottom up starting with the field arithmetic units. The architecture uses a field multiplier capable of performing a field multiplication over the extension field with degree 163 in 0.060 microseconds. Architectures for squaring and inversion are also presented. The co-processor uses Lopez and Dahab's projective coordinate system and is optimized specifically for Koblitz curves. A prototype of the processor has been implemented for the binary extension field with degree 163 on a Xilinx XCV2000E FPGA. The prototype runs at 66 MHz and performs an elliptic curve scalar multiplication in 0.233 msec on a generic curve and 0.075 msec on a Koblitz curve.
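To show why scalar multiplication dominates the cost, here is a minimal sketch of the generic left-to-right double-and-add method. It is not the thesis's Lopez-Dahab or Koblitz-specific algorithm. The group operations are passed in as functions, and the stand-in group (integers modulo 1009 under addition) and all names are assumptions chosen only so the example runs without any curve arithmetic.

```python
def scalar_multiply(k, point, add, double, identity):
    """Compute k*point with the left-to-right binary (double-and-add) method."""
    if k < 0:
        raise ValueError("k must be non-negative in this sketch")
    result = identity
    for bit in bin(k)[2:]:          # scan the scalar from its most significant bit
        result = double(result)
        if bit == "1":
            result = add(result, point)
    return result


def operation_counts(k):
    """Return (doublings, additions) performed by scalar_multiply for k.

    The counts include the first doubling and addition, which act on the identity.
    """
    bits = bin(k)[2:]
    return len(bits), bits.count("1")


# Stand-in group for demonstration only: integers modulo 1009 under addition.
TOY_MODULUS = 1009


def toy_add(a, b):
    return (a + b) % TOY_MODULUS


def toy_double(a):
    return (2 * a) % TOY_MODULUS
```

For a scalar of about 163 bits this method performs roughly 163 point doublings plus one point addition per set bit, and every point operation in turn needs several field multiplications. That is why a fast field multiplier matters so much for the overall time.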
|
226 |
XQuery Query Processing in Relational Systems
Chen, Yingwen January 2004
With the rapid growth of XML as a popular and major medium for storing and interchanging data on the Web, there is increasing interest in using existing relational database techniques to store and/or query XML data. Since XQuery is becoming a standard XML query language, significant effort has been made to develop an efficient and comprehensive XQuery-to-SQL query processor.
In this thesis, we design and implement an <em>XQuery-to-SQL Query Processor</em> based on the <em>Dynamic Intervals</em> approach. We also provide a comprehensive translation for XQuery basic operations and FLWR expressions. The query processor is able to translate a complex XQuery query, which might include arbitrarily composed and nested FLWR expressions, basic functions, and element constructors, into a single SQL query for RDBMS and a physical plan for the <em>XQuery-enhanced Relational Engine</em>.
In order to produce efficient and concise SQL queries, succinct XQuery to SQL translation templates and the optimization algorithms for the SQL query generation are proposed and implemented. The preferable <em>merge-join</em> approach is also proposed to avoid the inefficient <em>nested-loop</em> evaluation for FLWR expressions. <em>Merge-join</em> patterns and query rewriting rules are designed to identify XQuery fragments that can utilize the efficient <em>merge-join</em> evaluation. Proofs of correctness of the approach are provided in the thesis. Experimental results justify the correctness of our work.
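The contrast between nested-loop and merge-join evaluation can be sketched with a generic equi-join over (key, value) tuples. This is not the thesis's interval-based merge-join patterns or its rewriting rules; the function names and the tuple layout are assumptions made for this example.

```python
def nested_loop_join(left, right):
    """Compare every pair of rows: O(len(left) * len(right)) comparisons."""
    return [(l, r) for l in left for r in right if l[0] == r[0]]


def merge_join(left, right):
    """Sort both inputs by key, then advance two cursors in a single pass."""
    left = sorted(left, key=lambda row: row[0])
    right = sorted(right, key=lambda row: row[0])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == left_key:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Emit the cross product of the two runs, then skip past them.
            for l in left[i:i_end]:
                for r in right[j:j_end]:
                    out.append((l, r))
            i, j = i_end, j_end
    return out
```

Both functions return the same set of pairs; the merge version avoids rescanning the right-hand input for every left-hand row once the inputs are ordered.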
|
227 |
A Serial Bitstream Processor for Smart Sensor Systems
Cai, Xin January 2010
A full custom integrated circuit design of a serial bitstream processor is proposed for remote smart sensor systems. This dissertation describes details of the architectural exploration, circuit implementation, algorithm simulation, and testing results. The design is fabricated and demonstrated to be a successful working processor for basic algorithm functions. In addition, the energy performance of the processor, in terms of energy per operation, is evaluated. Compared to the multi-bit sensor processor, the proposed sensor processor provides improved energy efficiency for serial sensor data processing tasks, and also features low transistor count and area reduction advantages.
Operating in long-term, low data rate sensing environments, the serial bitstream processor developed is targeted at low-cost smart sensor systems with serial I/O communication through wireless links. This processor is an attractive option because of its low transistor count, easy on-chip integration, and programming flexibility for low data duty cycle smart sensor systems, where longer battery life, long-term monitoring, and sensor reliability are critical.
The processor can be programmed for sensor processing algorithms such as delta-sigma processing, calibration, and self-test algorithms. It can also be modified to utilize Coordinate Rotation Digital Computer (CORDIC) algorithms. The applications of the proposed sensor processor include wearable or portable biomedical sensors for health care monitoring and autonomous environmental sensors.
Dissertation
|
228 |
Architectural Support for Protecting Memory Integrity and Confidentiality
Shi, Weidong 10 May 2006
This dissertation describes the efficient design of a tamper-resistant secure processor and a cryptographic memory protection model that will strengthen the security of a computing system. The thesis proposes cryptographic and security features integrated into the general purpose processor and computing platform to protect the confidentiality and integrity of digital content stored in a computing system's memory. System designers can take advantage of the proposed security model to build future security systems, such as systems with strong anti-reverse-engineering capability, digital content protection systems, or trusted computing systems with strong tamper-proof protection.
The thesis explores architecture level optimizations and design trade-offs for supporting a high performance tamper-resistant memory model and micro-processor architecture. It expands the research of previous studies on tamper-resistant processor design on several fronts. It offers some new architecture and design optimization techniques to further reduce the overhead of memory protection over the previous approaches documented in the literature. Those techniques include prediction based memory decryption and efficient memory integrity verification approaches. It compares different encryption modes applicable to memory protection and evaluates their pros and cons. In addition, the thesis tries to solve some of the security issues that have been largely ignored in the prior art. It presents a detailed investigation of how to integrate confidentiality protection and integrity protection into the out-of-order processor architecture both efficiently and securely. Furthermore, the thesis expands the coverage of protection from single processor to multi-processor systems.
|
229 |
Design and Implementation of an Air Conditioner Adaptive Compressor Driver with Sine PWM and Current Feedback
Hung, De-Shian 27 October 2010
This thesis uses the TMS320LF2407A DSP from Texas Instruments as the control kernel. It proposes a sensorless, variable-speed drive method with current feedback for a rotary compressor. By detecting the back electromotive force signals, the rotor position can be determined during the commutation process, and the speed can also be estimated. To make the system more robust and to reduce power consumption, an adaptive controller and a closed-loop structure are adopted. Finally, an experimental system is built, and the efficiency improvement of the system with the sensorless driver, speed feedback, and current feedback is verified by experiment.
|
230 |
Transmission Modeling with Simulink and FPGA implementation of 3072-point FFT for the Homeplug AV system
Sun, Wei-Cheng 20 July 2011
The rapid growth of communication technology, together with the success of the Internet, has brought huge profits and great convenience to our daily life. Computer networks can be built using either wired or wireless technologies, so selecting a transmission medium is an important issue. Wired Ethernet has been the traditional choice in most networks. However, it requires Ethernet cables to be deployed. For wired networks, power line communication (PLC) technology is an alternative choice. In this wire-line communication system, the power line network is used as the transmission medium. Computer networks can therefore operate over the existing power line system, and no new transmission infrastructure is needed. So far, several PLC standards are available, such as X-10, CEBus (Consumer Electronic Bus), Echonet, and Homeplug. This thesis studies the Homeplug AV specification developed by the Homeplug Powerline Alliance. Using MATLAB/Simulink, we build a PLC baseband transmission model and simulation platform, and we evaluate the system-level baseband transmission performance of Homeplug AV on this platform. Homeplug AV adopts a 3072-point FFT, and 3072 is not a power of two, which makes the design of the 3072-point FFT processor a challenge. Here, we use Xilinx System Generator to design and implement the 3072-point FFT processor. The functional verification of the implemented 3072-point FFT processor for the Homeplug AV system is carried out by simulation.
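Since 3072 = 3 × 1024, one common way to handle such a length is to split the input into three interleaved subsequences, run a 1024-point power-of-two FFT on each, and combine them with a radix-3 stage. The sketch below illustrates that decomposition in Python. It is an assumption about one possible structure, not the architecture implemented in the thesis, and all function names are made up for this example.

```python
import cmath


def fft_pow2(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_pow2(x[0::2])
    odd = fft_pow2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft_3_times_pow2(x):
    """FFT for lengths of the form 3 * 2^p, for example 3072 = 3 * 1024."""
    n = len(x)
    m = n // 3
    if m == 0 or n != 3 * m or m & (m - 1):
        raise ValueError("length must be 3 times a power of two")
    # Three interleaved subsequences x[r], x[r+3], x[r+6], ... for r = 0, 1, 2.
    subs = [fft_pow2(x[r::3]) for r in range(3)]
    out = []
    for k in range(n):
        # Radix-3 combination: the sub-FFTs are periodic in k with period m.
        acc = 0j
        for r in range(3):
            acc += cmath.exp(-2j * cmath.pi * r * k / n) * subs[r][k % m]
        out.append(acc)
    return out


def naive_dft(x):
    """Direct O(n^2) DFT, used only to check short inputs."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
```

The same check against `naive_dft` on short lengths such as 12 or 24 exercises the radix-3 stage without the cost of a direct 3072-point DFT.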
|