Global ETD Search

11	Systematic Generation of Instruction Test Patterns Based on Architectural Parameters Mu, Peter 30 August 2001 Surveys of hardware design groups find that 60 to 80 percent of their effort is now dedicated to verification. Generating test patterns from instruction set architecture information is a feasible and reasonable way to verify the function of a microprocessor. In this thesis, we present a method for generating instruction test patterns for microprocessors based on the instruction set architecture. It helps users generate instruction test patterns efficiently. The generation flow in this thesis contains three major flows: individual instruction, instruction pair, and manual generation, each used for different verification cases. The "individual instruction" flow can be used to verify the function of each implemented instruction. The "instruction pair" flow can be used to verify the interaction of instruction execution in the pipeline of an HDL implementation of a microprocessor. The "manual generation" flow can be used to verify corner cases (behaviors) of the microprocessor. To assess the quality of our test patterns, we generate patterns for 32-bit instructions (the ARM and SPARC instruction sets) and use them to verify a synthesizable RTL core. Together with some handwritten test patterns (34.7%), our automatic generation method approaches 100% HDL code coverage of the microprocessor design. We use HDL code coverage as the measure of test pattern quality. Because our generation method is based on instruction fields, most instruction sets can be described to the generator. Hence, our generation method can be retargeted to most instruction set architectures without modifying the generator. Besides RISC instructions, even CISC instructions can be generated. Test Pattern Instruction Set Architecture Code Coverage Simulation-Based Verification
12	An Embedded 16-bit Low Power and Low Cost Microprocessor in Information Appliance Wang, Chuen-You 10 September 2002 In an embedded system, system resources are limited, so small size is the most important feature of an embedded system. In this thesis, we propose a fast way to design a 16-bit microprocessor by reducing a 32-bit RISC CPU based on the ARMv4T instruction set to a 16-bit RISC Thumb microprocessor. By building the programming model, we save the design time of developing a compiler and assembler and keep the existing software environment. thumb instruction set microprocessor high code density low cost
13	Implementation of face detection algorithm with parallel extended-MMX instruction set Tzeng, Hua-Yi 20 August 2008 Face detection has many technical applications. Considering both accuracy and the regularity of data arrangement, we select a recognition algorithm based on a neural network for implementation. The implementation is divided into three parts: the Modified Census Transform, computing hypotheses, and drawing a square frame to mark the face. The Modified Census Transform is a regular computation over regularly arranged data and is well suited to SIMD execution, but the other parts have irregular data arrangement and are not easy to execute in parallel. This paper uses a SIMD processor architecture developed in our laboratory to implement the Modified Census Transform and exploit its multi-data streaming property. The picture is divided into four parts that are executed at the same time, and the execution mode is changed according to the algorithm, so that data fetching is smooth and data movement is reduced. We add a new instruction that uses four MMX registers holding 16-bit data to perform a 4×4 matrix transpose. Another new instruction loads data and performs sign or zero extension at the same time. Both accelerate parallel execution in multi-data streaming. We also support multi-data streams that are not contiguous: a striping mode fetches data items separated by the same distance, so that such multi-data streams can be computed. In addition, we use hypotheses to distinguish between people when we want to find only one person. We compare two hypotheses: if the hypotheses of two different pictures differ by less than 0.3%, the pictures show the same person. Finally, we verify functional correctness on the UMVP-2500 platform. We compare efficiency with MMX and XScale and analyze the benefits of the multi-data streaming SIMD architecture. 
Compared with MMX, we achieve a speedup of 373%; compared with XScale, a speedup of 345%. These results show that the multi-data streaming SIMD architecture is faster than other SIMD architectures. The multi-data streaming SIMD architecture adds a new instruction that performs a 4×4 matrix transpose. Because the 4×4 transpose exchanges rows and columns, it gives us a new abstraction. Ordinary computation is like a line, whereas the new abstraction becomes a phase. MMX and XScale do not have this abstraction. multi-data streaming face detection extended-MMX instruction set MMX
14	A high speed 16-bit RISC processor chip / Chen, Wan-Fu. January 1994 Thesis (M.S.)--Rochester Institute of Technology, 1994. / Typescript. Includes bibliographical references (leaf 170).
15	Emulace CPU pro výuku asemblerů / A CPU Emulator for Course of Assembly Languages Charvát, Lukáš January 2011 This master's thesis discusses the design of an emulator of a CPU architecture and instruction set intended for an assembly language course. While most current emulators are architecture-specific, the emulator proposed in this thesis aims at education and a better understanding of assembly languages. The emulator is not limited to a single CPU; it makes it easy to define a purpose-specific architecture and instruction set, perform operations on it, and display its current state.
16	Development of Classroom Tools for a RISC-V Embedded System Phillips, Lucas 01 May 2022 RISC-V is an open-source instruction set that has been gaining popularity in recent years, and, with support from large chip manufacturers like Intel and the benefits of its open-source nature, RISC-V devices are likely to continue gaining momentum. Many courses in a computer science program involve development on an embedded device. Usually, this device is of the ARM architecture, like a Raspberry Pi. With the increasing use of RISC-V, it may be beneficial to use a RISC-V embedded device in one of these classroom environments. This research intends to assist development on the SiFive HiFive1 RevB, which is a RISC-V embedded device. This device was chosen because of its ease of use, functionality-rich API, and affordability. In order to make developing with this board very approachable for a student, this research involved the development of a small suite of tools. These tools support common functionality like: building a source file into an executable ELF file, converting that ELF executable into an Intel HEX executable format that is required to run on the device, uploading the Intel HEX executable onto the device, and attaching a debug session to the program that is running on the device. With the help of this toolchain, developing on this RISC-V embedded device should be very approachable for most students. RISC-V ISA Embedded System Instruction Set Architecture Systems Architecture
17	Applications of information sharing for code generation in process virtual machines Kyle, Stephen Christopher January 2016 As the backbone of many computing environments today, it is important that process virtual machines be both performant and robust in mobile, personal desktop, and enterprise applications. This thesis focusses on code generation within these virtual machines, particularly addressing situations where redundant work is being performed. The goal is to exploit information sharing in order to improve the performance and robustness of virtual machines that are accelerated by native code generation. First, the thesis investigates the potential to share generated code between multiple threads in a dynamic binary translator used to perform instruction set simulation. This is done through a code generation design that allows native code to be executed by any simulated core and adding a mechanism to share native code regions between threads. This is shown to improve the average performance of multi-threaded benchmarks by 1.4x when simulating 128 cores on a quad-core host machine. Secondly, the ahead-of-time code generation system used for executing Android applications is improved through the use of profiling. The thesis investigates the potential for profiles produced by individual users of applications to be shared and merged together to produce a generic profile that still provides a lot of benefit for a new user who is then able to skip the expensive profiling phase. These profiles can not only be used for selective compilation to reduce code-size and installation time, but can also be used for focussed optimisation on vital code regions of an application in order to improve overall performance. 
With selective compilation applied to a set of popular Android applications, code-size can be reduced by 49.9% on average, while installation time can be reduced by 31.8%, with only an average 8.5% increase in the amount of sequential runtime required to execute the collected profiles. The thesis also shows that, among the tested users, the use of a crowd-sourced and merged profile does not significantly affect their estimated performance loss from selective compilation (0.90x-0.92x) in comparison to when they perform selective compilation with their own unique profile (0.93x). Furthermore, by proposing a new, more powerful code generator for Android’s virtual machine, these same profiles can be used to perform focussed optimisation, which preliminary results show to increase runtime performance across a set of common Android benchmarks by 1.46x-10.83x. Finally, in such a situation where a new code generator is being added to a virtual machine, it is also important to test the code generator for correctness and robustness. The methods of execution of a virtual machine, such as interpreters and code generators, must share a set of semantics about how programs must be executed, and this can be exploited in order to improve testing. This is done through the application of domain-aware binary fuzzing and differential testing within Android’s virtual machine. The thesis highlights a series of actual code generation and verification bugs that were found in Android’s virtual machine using this testing methodology, as well as comparing the proposed approach to other state-of-the-art fuzzing techniques. 005.4
18	Design automation methodologies for extensible processor platform Cheung, Newton, Computer Science & Engineering, Faculty of Engineering, UNSW January 2005 This thesis addresses two ubiquitous trends in the embedded system world - the increasing importance of design turnaround time as a design metric, and the move towards closing the design productivity gap. Adopting the right choice of design approach has been recognised as an integral part of the design flow in order to meet desired characteristics such as increasing software content, satisfying the growing complexities of an application, reusing off-the-shelf components, and exploring design metrics tradeoff, which closes the design productivity gap. The importance of design turnaround time is motivated by the intensive competition between manufacturers, especially makers of mainstream electronic consumer products, which shrinks the product life cycle and requires faster time-to-market to maximise economic benefits. This thesis presents a suite of design automation methodologies to automatically design embedded systems for an application in the state-of-the-art design approach - the extensible processor platform. These design automation methodologies systematise the extensible processor platform's design flow, with particular emphasis on solving four challenging design problems: i) code segment identification; ii) instruction generation; iii) architectural customisation selection; and iv) processor evaluation. Our suite of design automation methodologies includes: i) a semi-automatic design system - to design an extensible processor that maximises the application performance while satisfying the area constraint. 
By specifying a fitting function to identify suitable code segments within an application, a two-level hierarchy selection algorithm is used to first select a predefined processor and then select the right instruction, and a performance estimator is used to estimate an application's performance; ii) a tool to match instructions - to automatically match the pre-designed instructions with computationally intensive code segments, reducing verification time and effort; iii) an instructions estimation model - to estimate the area overhead, latency, and power consumption of extensible instructions, exploring a larger design space; and iv) an instructions generation tool - to generate new extensible instructions that maximise the speedup while minimising power dissipation. A number of techniques, such as system decomposition, combinational equivalence checking, and regression analysis, have been heavily relied upon in the creation of the final design system. This thesis shows results at every stage to demonstrate the efficacy of our design methodologies in the creation of extensible processors. The methodologies and results presented in this thesis demonstrate that automating the design process for an extensible processor platform results in a significant performance increase - on average, an increase of 4.74x (up to 15.71x) compared to the original base processor. Our system achieves significant design turnaround time savings (2.5% of the full simulation time for the entire design space) with the majority of Pareto points obtained (91% on average), and can lead to fewer and faster design iterations. Our instruction matching tool is 7.3x faster on average compared to the best known approaches to the problem (partial simulations). Our estimation model has a mean absolute error as small as 3.4% (6.7% max.) for area overhead, 5.9% (9.4% max.) for latency, and 4.2% (7.2% max.) 
for power consumption, compared to estimation through the time-consuming synthesis and simulation steps using commercial tools. Finally, the instruction generation tool reduces energy consumption by a further 5.8% on average (up to 17.7%) compared to extensible instructions generated by previous approaches. data processing design and construction integrated circuits design automation extensible processor
19	Hardware mechanisms and their implementations for secure embedded systems Qin, Jian January 2005 Security, in one form or another, is becoming a requirement for an increasing number of embedded systems. Such systems, which will be used to capture, store, manipulate, and access sensitive data, pose several unique and urgent challenges. These challenges require new approaches to security covering all aspects of embedded system design, from architecture and implementation to methodology. However, embedded system designers always treat security as the addition of features, such as a specific cryptographic algorithm or security protocol. This paper is intended to draw the attention of both software and hardware designers to treating security as a mainstream concern during the design of embedded systems. We intend to show why hardware options have been taken into consideration and how hardware mechanisms and key features of processor architecture can be implemented at the hardware level (through modification of the processor architecture, for example) to deal with various potential attacks unique to embedded systems. Security Hardware mechanism Instruction set Embedded system Information technology
20	Design and Implementation of Single Issue DSP Processor Core Ravinath, Vinodh January 2007 DSP processors are microprocessors built specifically for digital signal processing. DSP is one of the core technologies in rapidly growing applications like communications and audio processing. The estimated growth of DSP processors in the last 6 years is over 40%. The variety of DSP-capable processors for various applications has also increased with the rising popularity of DSP processors. The design flow and architecture of such processors are not commonly available to students for learning. This report presents a structured approach to the design and implementation of an embedded DSP processor core for voice, audio and video codecs. The report focuses on the design requirement specification, senior instruction set and assembly manual release, microarchitecture design and implementation of the core. Details about the core verification are also included in this report. The instruction set of this processor supports running basic kernels of BDTI benchmarking. DSP processor codec Instruction set Microarchitecture FSM RTL Computer engineering
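
Note on entry 11: the abstract describes generating test patterns from instruction fields, with separate "individual instruction" and "instruction pair" flows. The following is a minimal Python sketch of that general idea only, not the thesis's generator. The 16-bit format, the field names, and the helper functions `encode`, `individual_patterns`, and `pair_patterns` are all hypothetical and chosen for illustration.

```python
from itertools import product

# Hypothetical 16-bit instruction format (an assumption, not from the thesis):
# each entry is (field name, bit width, values to exercise for that field).
FORMAT = [
    ("opcode", 4, [0x0, 0xF]),
    ("rd", 4, [0, 15]),
    ("rs", 4, [0, 15]),
    ("imm", 4, [0, 7, 15]),
]

def encode(fields, values):
    """Pack field values into one instruction word, first field in the top bits."""
    word = 0
    for (name, width, _), value in zip(fields, values):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word

def individual_patterns(fields):
    """Yield one encoded word per combination of the listed field values."""
    for values in product(*(choices for _, _, choices in fields)):
        yield encode(fields, values)

def pair_patterns(fields):
    """Return every ordered pair of individual patterns (back-to-back instructions)."""
    words = list(individual_patterns(fields))
    return product(words, repeat=2)

patterns = list(individual_patterns(FORMAT))
pairs = list(pair_patterns(FORMAT))
```

Because the format is plain data, describing a different instruction set in this sketch means editing `FORMAT` rather than the functions, which mirrors the retargeting claim in the abstract.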
