Global ETD Search

21	Implementace procesoru MicroBlaze v jazyce CodAL / MicroBlaze processor implementation using CodAL language Hájek, Radek January 2016 (has links) The diploma thesis contains theoretical basis, classification and function of processors. It summarizes the principle of pipelined instruction processing and the types of hazards in the microarchitecture of the processor. It also introduces design of processors using CodAL language developed by Codasip company. In the practical part of the thesis the model of MicroBlaze core developed by Xilinx company was described in the CodAL language. Designed model was tested and implemented into the FPGA device as practical example.
22	Implementa??o da t?cnica de software pipelining na rede em chip IPNoSyS Medeiros, Aparecida Lopes de 21 February 2014 (has links) Made available in DSpace on 2014-12-17T15:48:10Z (GMT). No. of bitstreams: 1 AparecidaLM_DISSERT.pdf: 8059053 bytes, checksum: a243ee0772a785a00c8a0670955a7cae (MD5) Previous issue date: 2014-02-21 / Coordena??o de Aperfei?oamento de Pessoal de N?vel Superior / Alongside the advances of technologies, embedded systems are increasingly present in our everyday. Due to increasing demand for functionalities, many tasks are split among processors, requiring more efficient communication architectures, such as networks on chip (NoC). The NoCs are structures that have routers with channel point-to-point interconnect the cores of system on chip (SoC), providing communication. There are several networks on chip in the literature, each with its specific characteristics. Among these, for this work was chosen the Integrated Processing System NoC (IPNoSyS) as a network on chip with different characteristics compared to general NoCs, because their routing components also accumulate processing function, ie, units have functional able to execute instructions. With this new model, packets are processed and routed by the router architecture. This work aims at improving the performance of applications that have repetition, since these applications spend more time in their execution, which occurs through repeated execution of his instructions. Thus, this work proposes to optimize the runtime of these structures by employing a technique of instruction-level parallelism, in order to optimize the resources offered by the architecture. The applications are tested on a dedicated simulator and the results compared with the original version of the architecture, which in turn, implements only packet level parallelism / Com os avan?os tecnol?gicos os sistemas embarcados est?o cada vez mais presentes em nosso cotidiano. Devido a crescente demanda por funcionalidades, as fun??es s?o distribu?das entre os processadores, demandando arquiteturas de comunica??o mais eficientes, como as redes em chip (Network-on-Chip - NoC). As NoCs s?o estruturas que possuem roteadores com canais ponto-a-ponto que interconectam os cores do SoC (System-on-Chip), provendo comunica??o. Existem diversas redes em chip na literatura, cada uma com suas caracter?sticas espec?ficas. Dentre essas, para este trabalho foi a escolhida a IPNoSyS (Integrated Processing NoC System) por ser uma rede em chip com caracter?sticas diferenciadas em rela??o ?s NoCs em geral, pois seus componentes de roteamento acumulam tamb?m a fun??o de processamento, ou seja, possuem unidades funcionais capazes de executar instru??es. Com esse novo modelo, pacotes s?o processados e roteados pela arquitetura do roteador. Este trabalho visa melhorar o desempenho das aplica??es que possuem repeti??o, pois essas aplica??es gastam um tempo maior na sua execu??o, o que se d? pela repetida execu??o de suas instru??es. Assim, este trabalho prop?e otimizar o tempo de execu??o dessas estruturas, atrav?s do emprego de uma t?cnica de paralelismo em n?vel de instru??es, visando melhor aproveitar os recursos oferecidos pela arquitetura. As aplica??es s?o testadas em um simulador dedicado, e seus resultados comparados com a vers?o original da arquitetura, a qual prov? paralelismo apenas em n?vel de pacotes
23	Suporte especializado de hardware para geração automática de loop pipelining em FPGAS Souza, Guilherme Stefano Silva de 19 November 2014 (has links) Submitted by Daniele Amaral (daniee_ni@hotmail.com) on 2016-09-13T20:06:59Z No. of bitstreams: 1 DissGSSS.pdf: 12761989 bytes, checksum: 9e4c2b4e76a2502af072064ed081eec1 (MD5) / Approved for entry into archive by Marina Freitas (marinapf@ufscar.br) on 2016-09-15T13:34:53Z (GMT) No. of bitstreams: 1 DissGSSS.pdf: 12761989 bytes, checksum: 9e4c2b4e76a2502af072064ed081eec1 (MD5) / Approved for entry into archive by Marina Freitas (marinapf@ufscar.br) on 2016-09-15T13:35:23Z (GMT) No. of bitstreams: 1 DissGSSS.pdf: 12761989 bytes, checksum: 9e4c2b4e76a2502af072064ed081eec1 (MD5) / Made available in DSpace on 2016-09-15T13:35:30Z (GMT). No. of bitstreams: 1 DissGSSS.pdf: 12761989 bytes, checksum: 9e4c2b4e76a2502af072064ed081eec1 (MD5) Previous issue date: 2014-11-19 / Não recebi financiamento / Loop pipelining is a technique that may offer significant performance improvements, being employed not only in conventional compilation targeting microprocessors, but also by High Level Synthesis (HLS) tools, targeting heterogeneous architectures and hardware accelerators. This work presents a specialized hardware support aiming at facilitate compilation tasks for HLS tools, along with potential advantages in execution performance and total silicon area employed. Two specialized hardware modules are presented: a queue register file and an instruction predication control module. / O desempenho na execução de programas, que é cada vez mais uma prioridade, pode ter uma melhora significativa por meio do uso de paralelismo em nível de instrução (ILP). Uma técnica que utiliza o ILP e propicia ganhos de desempenho significativos é o loop pipelining, sendo usado não apenas por compiladores para microprocessadores, mas também por ferramentas de Síntese de Alto Nível (HLS), visando arquiteturas heterogêneas e aceleradores de hardware. Neste trabalho é apresentado o projeto e implementação de estruturas de hardware especializadas, objetivando-se em solucionar o problema de sobreposição de valores que ocorre no loop pipelining, facilitar tarefas de compilaçãoo em ferramentas HLS e diminuir a repetição de código. Além disso, ganhos potenciais de desempenho e área de silício total podem ser alcançados como resultado do uso das estruturas propostas. Serão apresentados: um arquivo de registradores baseado em filas e um módulo de controle para a execução de instruções predicadas. Loop Pipelining Software Pipelining Queued Register File Modulo Scheduling Register Files Queue Predicated Instructions Escalonamento de Módulo QRF Arquivos de Registradores Filas Instruções Predicadas
24	Suitability of FPGA-based computing for cyber-physical systems Lauzon, Thomas Charles 18 August 2010 (has links) Cyber-Physical Systems theory is a new concept that is about to revolutionize the way computers interact with the physical world by integrating physical knowledge into the computing systems and tailoring such computing systems in a way that is more compatible with the way processes happen in the physical world. In this master’s thesis, Field Programmable Gate Arrays (FPGA) are studied as a potential technological asset that may contribute to the enablement of the Cyber-Physical paradigm. As an example application that may benefit from cyber-physical system support, the Electro-Slag Remelting process - a process for remelting metals into better alloys - has been chosen due to the maturity of its related physical models and controller designs. In particular, the Particle Filter that estimates the state of the process is studied as a candidate for FPGA-based computing enhancements. In comparison with CPUs, through the designs and experiments carried in relationship with this study, the FPGA reveals itself as a serious contender in the arsenal of v computing means for Cyber-Physical Systems, due to its capacity to mimic the ubiquitous parallelism of physical processes. / text Cyber-physical systems Particle filter Electro-slag remelting FPGA CPU Sequential Monte-Carlo methods Pipelining Bluespec.
25	Utilisation du modèle polyédrique pour la synthèse d'architectures pipelinées / Synthesis of pipelined architectures using the polyhedral model Morvan, Antoine 28 June 2013 (has links) Grâce aux progrès réalisés dans le domaine des semi-conducteurs, les plateformes matérielles embarquées sont capables de satisfaire les contraintes de performances d'applications de plus en plus complexes. Cette augmentation conduit à une explosion des coûts de conception, ce qui pousse les concepteurs de ces plateformes à utiliser des outils travaillant à des niveaux d’abstraction plus élevés. Aujourd’hui, les outils de synthèse de haut niveau opèrent sur des descriptions C/C++ pour en générer des accélérateurs matériels spécialisés. Ces outils offrent des gains en productivité significatifs par rapport à la génération précédente, qui opérait sur des descriptions structurelles de l’architecture en VHDL ou Verilog. Ces descriptions algorithmiques doivent être retravaillées pour que les outils puissent générer des circuits performants. Pour faciliter cette tâche, une solution consiste à mettre en œuvre une boite à outils pour des transformations source-à-source orientées synthèse de haut niveau. En particulier, cette thèse s’intéresse aux transformations de boucles, avec pour objectif d’améliorer les performances en exposant des boucles parallèles et en améliorant la localité des accès mémoire. En nous appuyant sur une représentation des boucles dans le modèle polyédrique, nous proposons une approche qui améliore l’applicabilité du pipeline de nids de boucles en vérifiant sa légalité de manière plus précise que les approches existantes. De plus, lorsque la vérification échoue, nous proposons une technique de correction qui insère statiquement des états d’attente pour assurer la légalité du pipeline. Enfin, ce pipeline est mis en œuvre en utilisant une technique de génération de code qui met les nids de boucles à plat. Ces contributions ont été implémentées dans l’infrastructure de compilation source-à-source Gecos, avant d’être appliquées à un ensemble de benchmarks représentatifs des noyaux de calculs cibles de la synthèse de haut niveau. Les résultats montrent un gain en performances significatif, avec un surcoût en surface modéré. / Due to the advances in semiconductor technologies, embedded hardware is capable of satisfying the performance constraints of increasingly complex applications. This leads to a design cost explosion, thus pushing the hardware designers to use tools working with higher levels of abstractions. High-Level Synthesis tools generate custom hardware accelerators out of C/C++ specifications. They offer significant productivity gains compared to the previous generation of tools that worked at the level of hardware description languages, such as VHDL or Verilog. These higher level specifications have to be reworked in order for the High-Level Synthesis tools to generate efficient hardware accelerators. To ease this task, one solution is to provide a source-to-source transformation toolbox targeting High-Level Synthesis. Specifically, this thesis explores loop transformations in order to improve performance by exposing parallel loops and improving the locality of memory accesses. Using polyhedral representation of loop nests, we propose an approach to improve the applicability of nested loop pipelining by verifying its legality in a more precise way than existing approaches. Moreover, we propose a correction mechanism that statically inserts wait states for enforcing the pipeline legality for cases when the verification fails. The resulting pipeline is implemented using a code generation technique that flattens the loop nests. These contributions have been implemented within the GeCoS source-to-source compilation infrastructure, and applied to a set of benchmarks targeted towards High-Level Synthesis. Results show significant performance improvement at the price of a moderate area overhead. Synthèse de haut niveau Pipeline de nid de boucles High level synthesis Loop pipelining Optimizing compilation
26	Certifying Loop Pipelining Transformations in Behavioral Synthesis Puri, Disha 20 March 2017 (has links) Due to the rapidly increasing complexity in hardware designs and competitive time to market trends in the industry, there is an inherent need to move designs to a higher level of abstraction. Behavioral Synthesis is the process of automatically compiling such Electronic System Level (ESL) designs written in high-level languages such as C, C++ or SystemC into Register-Transfer Level (RTL) implementation in hardware description languages such as Verilog or VHDL. However, the adoption of this flow is dependent on designers' faith in the correctness of behavioral synthesis tools. Loop pipelining is a critical transformation employed in behavioral synthesis process, and ubiquitous in commercial and academic behavioral synthesis tools. It improves the throughput and reduces the latency of the synthesized hardware. It is complex and error-prone, and a small bug can result in faulty hardware with expensive ramifications. Therefore, it is critical to certify the loop pipelining transformation so that designers can trust the behaviorally synthesized pipelined designs. Certifying a loop pipelining transformation is however, a major research challenge because there is a huge semantic gap between the input sequential design and the output pipelined implementation, making it infeasible to verify their equivalence with automated sequential equivalence checking (SEC) techniques. Complex loop pipelining transformations can be certified by a combination of theorem proving and SEC: (1) creating a certified pipelining algorithm which generates a reference pipeline model by exploiting pipeline generation information from the synthesis flow (e.g. the iteration interval of a generated pipeline) and (2) conduct SEC between the synthesized pipeline and this reference model. However, a key and arguably, the most complex component of this approach is the development of a formal, mechanically verifiable loop pipelining algorithm. We show how to systematically construct such an algorithm, and carry out its verification using the ACL2 theorem prover. We propose a framework of certified pipelining primitives which are essential for designing pipelining algorithms. Using our framework, we build a certified loop pipelining algorithm. We also propose a key invariant in certifying this algorithm, which links sequential loops with their pipelined counterparts. This is unlike other invariants that have been used in proofs of microprocessor pipelines so far. This dissertation provides a framework for creating certified pipelining algorithms utilizing a mechanical theorem prover. Using this framework, we have developed a certified loop pipelining algorithm. This certified algorithm is essential in the overall approach to certify behaviorally synthesized pipelined designs. We demonstrate the scalability and robustness of our algorithm on several ESL designs across various domains. Algorithms Pipelining (Electronics) Computer systems -- Verification Computer Sciences Theory and Algorithms
27	ADAPT : architectural and design exploration for application specific instruction-set processor technologies Shee, Seng Lin, Computer Science & Engineering, Faculty of Engineering, UNSW January 2007 (has links) This thesis presents design automation methodologies for extensible processor platforms in application specific domains. The work presents first a single processor approach for customization; a methodology that can rapidly create different processor configurations by the removal of unused instructions sets from the architecture. A profile directed approach is used to identify frequently used instructions and to eliminate unused opcodes from the available instruction pool. A coprocessor approach is next explored to create an SoC (System-on-Chip) to speedup the application while reducing energy consumption. Loops in applications are identified and accelerated by tightly coupling a coprocessor to an ASIP (Application Specific Instruction-set Processor). Latency hiding is used to exploit the parallelism provided by this architecture. A case study has been performed on a JPEG encoding algorithm; comparing two different coprocessor approaches: a high-level synthesis approach and our custom coprocessor approach. The thesis concludes by introducing a heterogenous multi-processor system using ASIPs as processing entities in a pipeline configuration. The problem of mapping each algorithmic stage in the system to an ASIP configuration is formulated. We proposed an estimation technique to calculate runtimes of the configured multiprocessor system without running cycle-accurate simulations, which could take a significant amount of time. We present two heuristics to efficiently search the design space of a pipeline-based multi ASIP system and compare the results against an exhaustive approach. In our first approach, we show that, on average, processor size can be reduced by 30%, energy consumption by 24%, while performance is improved by 24%. In the coprocessor approach, compared with the use of a main processor alone, a loop performance improvement of 2.57x is achieved using the custom coprocessor approach, as against 1.58x for the high level synthesis method, and 1.33x for the customized instruction approach. Energy savings are 57%, 28% and 19%, respectively. Our multiprocessor design provides a performance improvement of at least 4.03x for JPEG and 3.31x for MP3, for a single processor design system. The minimum cost obtained using our heuristic was within 0.43% and 0.29% of the optimum values for the JPEG and MP3 benchmarks respectively. Hardware/software co-design. Embedded systems. Estimation. Partitioning. Performance optimization. Pipelining. System-on-chip.
28	VHDL Implementation of CORDIC Algorithm for Wireless LAN Lashko, Anastasia, Zakaznov, Oleg January 2004 (has links) <p>This work is focused on the CORDIC algorithm for wireless LAN. The primary task is to create a VHDL description for CORDIC vector rotation algorithm. </p><p>The basic research has been carried out in MATLAB. The VHDL implementation of the CORDIC algorithm is based on the results obtained from the MATLAB simulation. Mentor Graphics FPGA Advantage© for Xilinx 4010XL FPGA has been used for the hardware implementation.</p> Electronics FPGA CORDIC implementation wireless LAN intearliving pipelining Elektronik Electronics Elektronik
29	A Multiprocessor Architecture Using Modular Arithmetic for Very High Precision Computation Wu, Henry M. 01 April 1989 (has links) We outline a multiprocessor architecture that uses modular arithmetic to implement numerical computation with 900 bits of intermediate precision. A proposed prototype, to be implemented with off-the-shelf parts, will perform high-precision arithmetic as fast as some workstations and mini- computers can perform IEEE double-precision arithmetic. We discuss how the structure of modular arithmetic conveniently maps into a simple, pipelined multiprocessor architecture. We present techniques we developed to overcome a few classical drawbacks of modular arithmetic. Our architecture is suitable to and essential for the study of chaotic dynamical systems. modular arithmetic computer architecture multiprocessor sresidue number system computer arithmetic pipelining chaos
30	Verification of Pipelined Ciphers Lam, Chiu Hong January 2009 (has links) The purpose of this thesis is to explore the formal verification technique of completion functions and equivalence checking by verifying two pipelined cryptographic circuits, KASUMI and WG ciphers. Most of current methods of communications either involve a personal computer or a mobile phone. To ensure that the information is exchanged in a secure manner, encryption circuits are used to transform the information into an unintelligible form. To be highly secure, this type of circuits is generally designed such that it is hard to analyze. Due to this fact, it becomes hard to locate a design error in the verification of cryptographic circuits. Therefore, cryptographic circuits pose significant challenges in the area of formal verification. Formal verification use mathematics to formulate correctness criteria of designs, to develop mathematical models of designs, and to verify designs against their correctness criteria. The results of this work can extend the existing collection of verification methods as well as benefiting the area of cryptography. In this thesis, we implemented the KASUMI cipher in VHDL, and we applied the optimization technique of pipelining to create three additional implementations of KASUMI. We verified the three pipelined implementations of KASUMI with completion functions and equivalence checking. During the verification of KASUMI, we developed a methodology to handle the completion functions efficiently based on VHDL generic parameters. We implemented the WG cipher in VHDL, and we applied the optimization techniques of pipelining and hardware re-use to create an optimized implementation of WG. We verified the optimized implementation of WG with completion functions and equivalence checking. During the verification of WG, we developed the methodology of ``skipping" that can decrease the number of verification obligations required to verify the correctness of a circuit. During the verification of WG, we developed a way of applying the completion functions approach such that it can deal with a circuit that has been optimized with hardware re-use. verification cryptography pipelining cipher kasumi welch-gong fpga hardware vhdl completion functions Electrical and Computer Engineering

Search results