  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
171

Some results on FPGAs, file transfers, and factorizations of graphs.

January 1998
by Pan Jiao Feng. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1998. / Includes bibliographical references (leaves 89-93). / Abstract also in Chinese. / Abstract / Acknowledgments / List of Tables / List of Figures / Chapter 1 --- Introduction / Chapter 1.1 --- Graph definitions / Chapter 1.2 --- The S box graph / Chapter 1.3 --- The file transfer graph / Chapter 1.4 --- (g, f)-factor and (g, f)-factorization / Chapter 1.5 --- Thesis contributions / Chapter 1.6 --- Organization of the thesis / Chapter 2 --- On the Optimal Four-way Switch Box Routing Structures of FPGA Greedy Routing Architectures / Chapter 2.1 --- Introduction / Chapter 2.1.1 --- FPGA model and S box model / Chapter 2.1.2 --- FPGA routing / Chapter 2.1.3 --- Problem formulation / Chapter 2.2 --- Definitions and terminology / Chapter 2.2.1 --- General terminology / Chapter 2.2.2 --- Graph definitions / Chapter 2.2.3 --- The S box graph / Chapter 2.3 --- Properties of the S box graph and side-to-side graphs / Chapter 2.3.1 --- On the properties of the S box graph / Chapter 2.3.2 --- The properties of side-to-side graphs / Chapter 2.4 --- Conversion of the four-way FPGA routing problem / Chapter 2.4.1 --- Conversion of the S box model / Chapter 2.4.2 --- Conversion of the DAAA model / Chapter 2.4.3 --- Conversion of the DADA model / Chapter 2.4.4 --- Conversion of the DDDA model / Chapter 2.5 --- Lower bounds of routing switches / Chapter 2.5.1 --- The lower bound of the DAAA model / Chapter 2.5.2 --- The lower bound of the DADA model / Chapter 2.5.3 --- The lower bound of the DDDA model / Chapter 2.6 --- Optimal structure of one-side predetermined four-way FPGA routing / Chapter 2.7 --- Optimal structures of two-side and three-side predetermined four-way FPGA routing / Chapter 2.7.1 --- Optimal structure of two-side predetermined four-way FPGA routing / Chapter 2.7.2 --- Optimal structure of three-side predetermined four-way FPGA routing / Chapter 2.8 --- Conclusion / Appendix / Chapter 3 --- Application of (0, f)-Factorization on the Scheduling of File Transfers / Chapter 3.1 --- Introduction / Chapter 3.1.1 --- (0,f)-factorization / Chapter 3.1.2 --- File transfer model and its graph / Chapter 3.1.3 --- Previous results / Chapter 3.1.4 --- Our results and outline of the chapter / Chapter 3.2 --- NP-completeness / Chapter 3.3 --- Some lemmas / Chapter 3.4 --- Bounds of file transfer graphs / Chapter 3.5 --- Comparison / Chapter 3.6 --- Conclusion / Chapter 4 --- Decomposition Graphs into (g,f)-Factors / Chapter 4.1 --- Introduction / Chapter 4.1.1 --- (g,f)-factors and (g,f)-factorizations / Chapter 4.1.2 --- Previous work / Chapter 4.1.3 --- Our results / Chapter 4.2 --- Proof of Theorem 2 / Chapter 4.3 --- Proof of Theorem 3 / Chapter 4.4 --- Proof of Theorem 4 / Chapter 4.5 --- Related previous results / Chapter 4.6 --- Conclusion / Chapter 5 --- Conclusion / Chapter 5.1 --- About graph-based approaches / Chapter 5.2 --- FPGA routing / Chapter 5.3 --- The scheduling of file transfer / Bibliography / Vita
172

Efficient Elliptic Curve Processor Architectures for Field Programmable Logic

Orlando, Gerardo 27 March 2002
Elliptic curve cryptosystems offer security comparable to that of traditional asymmetric cryptosystems, such as those based on the RSA encryption and digital signature algorithms, with smaller keys and computationally more efficient algorithms. These two properties are among the main reasons why elliptic curve cryptography has become popular. As the popularity of elliptic curve cryptography increases, so does the need for efficient hardware solutions that accelerate the computation of elliptic curve point multiplications. This dissertation introduces elliptic curve processor architectures suitable for the computation of point multiplications for curves defined over fields GF(2^m) and curves defined over fields GF(p). Each of the processor architectures presented here allows designers to tailor the performance and hardware requirements according to their performance and cost goals. Moreover, these architectures are well suited for implementation in modern field programmable gate arrays (FPGAs). This point was demonstrated with prototyped implementations. The fastest prototyped GF(2^m) processor can compute an arbitrary point multiplication for curves defined over fields GF(2^167) in 0.21 milliseconds, and the prototyped processor for the field GF(2^192-2^64-1) is capable of computing a point multiplication in about 3.6 milliseconds. The most critical component of an elliptic curve processor is its arithmetic unit. A typical arithmetic unit includes an adder/subtractor, a multiplier, and possibly a squarer. Some of the architectures presented in this work are based on multiplier and squarer architectures developed as part of this dissertation. The GF(2^m) least significant bit super-serial multiplier architecture, the GF(2^m) most significant bit super-serial multiplier architecture, and a new GF(p) Montgomery multiplier architecture were developed as part of this work, together with a new squaring architecture for GF(2^m).
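
For readers unfamiliar with the operation these processors accelerate, the following is a minimal software sketch of point multiplication over GF(p) using the textbook double-and-add method. It is not the dissertation's architecture: the toy curve y^2 = x^3 + 2x + 2 over GF(17), the base point (5, 1), and the function names are assumptions chosen only so the arithmetic can be followed by hand.

```python
# Toy curve y^2 = x^3 + A*x + B over GF(P). These parameters are illustrative
# assumptions, far too small for real cryptography.
P, A, B = 17, 2, 2
G = (5, 1)        # base point; it has order 19 on this toy curve
INFINITY = None   # the point at infinity (group identity)


def point_add(p1, p2):
    """Add two affine points on the curve (handles doubling and identity)."""
    if p1 is INFINITY:
        return p2
    if p2 is INFINITY:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INFINITY  # p2 is the negative of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow(2 * y1, -1, P) % P  # tangent line
    else:
        slope = (y2 - y1) * pow(x2 - x1, -1, P) % P          # chord line
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point by scanning the bits of k from least significant."""
    result = INFINITY
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)  # doubling step
        k >>= 1
    return result
```

Each loop iteration needs one point doubling and possibly one point addition, and each of those reduces to a handful of field multiplications and one inversion, which is why the abstract singles out the arithmetic unit as the critical component. The modular inverse via `pow(x, -1, P)` requires Python 3.8 or later.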
173

Channel coding on a nano-satellite platform

Shumba, Angela-Tafadzwa January 2018
Thesis (Master of Engineering in Electrical Engineering)--Cape Peninsula University of Technology, 2017. / The concept of forward error correction (FEC) coding introduced the capability of achieving near-Shannon-limit digital transmission with bit error rates (BER) approaching 10^-9 for signal-to-noise (Eb/No) values as low as 0.7. This made it possible to transmit large amounts of data at high rates over noisy communication channels. In nano-satellites, however, power constraints limit the energy that can be allocated to data transmission and significantly reduce communication system performance. One effect of these constraints is a restriction on the type of channel coding technique that can be implemented in these communication systems. Another limiting factor is the limited space available due to the compact nature of these satellites, where numerous complex systems are tightly packed into a space as small as 10x10x10 cm. With the miniaturisation of integrated circuit (IC) technology and the affordability of field-programmable gate arrays (FPGAs) with reduced power consumption, complex circuits can now be implemented within small form factors and at low cost. This thesis describes the design, implementation and cost evaluation of a rate-1/2 convolutional encoder and the corresponding Viterbi decoder on an FPGA for nano-satellite applications. The code for the FPGA implementation is described in VHDL and implemented on devices from the Artix-7 (Xilinx), Cyclone V (Intel FPGA), and IGLOO2 (Microsemi) families. The implemented channel code has a coding gain of ~3 dB at a BER of 10^-3. The implementation of the encoder is quite straightforward; the main challenge lies in the implementation of the decoder.
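
The contrast between the simple encoder and the harder decoder can be seen in a small software model. The sketch below is a rate-1/2 convolutional encoder with a hard-decision Viterbi decoder; the constraint length K = 3 and the generator polynomials (7, 5 in octal) are assumptions made for illustration, since the abstract does not state the code parameters, and the thesis itself targets VHDL rather than Python.

```python
K = 3                      # assumed constraint length (not given in the abstract)
GENERATORS = (0b111, 0b101)  # assumed generator polynomials, (7, 5) in octal
N_STATES = 1 << (K - 1)


def parity(value):
    return bin(value).count("1") & 1


def step(state, bit):
    """Shift one input bit into the register; return (next_state, 2 output bits)."""
    register = (bit << (K - 1)) | state
    symbols = tuple(parity(register & g) for g in GENERATORS)
    return register >> 1, symbols


def encode(bits):
    """Encode a bit list, appending K-1 zero tail bits to return to state 0."""
    state, out = 0, []
    for bit in list(bits) + [0] * (K - 1):
        state, symbols = step(state, bit)
        out.extend(symbols)
    return out


def viterbi_decode(received):
    """Hard-decision Viterbi decoding of a zero-terminated code sequence."""
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)   # the encoder starts in state 0
    paths = [[] for _ in range(N_STATES)]
    for i in range(0, len(received), 2):
        pair = received[i:i + 2]
        new_metrics = [inf] * N_STATES
        new_paths = [None] * N_STATES
        for state in range(N_STATES):
            if metrics[state] == inf:
                continue
            for bit in (0, 1):
                nxt, symbols = step(state, bit)
                # Branch metric: Hamming distance to the received pair.
                dist = int(symbols[0] != pair[0]) + int(symbols[1] != pair[1])
                metric = metrics[state] + dist
                if metric < new_metrics[nxt]:
                    new_metrics[nxt] = metric
                    new_paths[nxt] = paths[state] + [bit]
        metrics, paths = new_metrics, new_paths
    return paths[0][:-(K - 1)]  # survivor ending in state 0, tail bits removed
```

The encoder is a shift register and two parity computations per input bit, whereas the decoder must track a path metric and a survivor path for every state at every step. In hardware that survivor storage and the add-compare-select logic are what dominate the cost.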
174

High speed DSP implementation in run-time partially reconfigurable FPGAs / High speed digital signal processing implementation in run-time partially reconfigurable field programmable gate arrays

McBride, Justin D. (Justin Donald), 1980- January 2003
Thesis (M.Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003. / Includes bibliographical references (leaves 99-100). / This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. / This thesis investigates the feasibility of utilizing a run-time partially reconfigurable FPGA to implement a sequence of high-speed digital signal processing filters. Rather than reconfiguring the entire device to modify part of a configuration, a modular architecture is designed to allow smaller segments of the device to be individually reconfigured while the remainder of the device continues to operate. This document describes the design, implementation, simulation, and benchmarking of a five-socket modular DSP architecture and compares the results to the performance of alternative digital signal processing methods, particularly that of software DSP subroutines run on a PowerPC processor. The result is a highly flexible architecture that supports the use of timing-verified hardware subroutines that can be partially reconfigured onto the FPGA within 3 ms. The highly parallel processing power of the FPGA design yields a performance of 5.825 billion multiply-and-accumulate operations per second in simulation at 72.8 MHz, more than 76 times faster than similar calculations measured on an MPC7410 processor. / by Justin D. McBride. / M.Eng. and S.B.
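
As a rough reading aid, the throughput figures in the abstract can be turned into per-cycle terms. This is only arithmetic on the numbers quoted above; how the parallelism is distributed across the five sockets is not stated in the abstract and is not assumed here.

```latex
\frac{5.825 \times 10^{9}\ \text{MAC/s}}{72.8 \times 10^{6}\ \text{cycles/s}} \approx 80\ \text{MAC per clock cycle},
\qquad
\frac{5.825 \times 10^{9}\ \text{MAC/s}}{76} \approx 7.7 \times 10^{7}\ \text{MAC/s}
```

The second figure is an upper estimate of the measured MPC7410 rate implied by "more than 76 times faster".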
175

FPGA technology mapping optimization by rewiring algorithms.

January 2005
Tang Wai Chung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references (leaves 40-41). / Abstracts in English and Chinese. / Abstract / Acknowledgement / Chapter 1 --- Introduction / Chapter 2 --- Rewiring Algorithms / Chapter 2.1 --- REWIRE / Chapter 2.2 --- RAMFIRE / Chapter 2.3 --- GBAW / Chapter 3 --- FPGA Technology Mapping / Chapter 3.1 --- Problem Definition / Chapter 3.2 --- Network-flow-based Algorithms for FPGA Technology Mapping / Chapter 3.2.1 --- FlowMap / Chapter 3.2.2 --- FlowSYN / Chapter 3.2.3 --- CutMap / Chapter 4 --- LUT Minimization by Rewiring / Chapter 4.1 --- Greedy Decision Heuristic for LUT Minimization / Chapter 4.2 --- Experimental Result / Chapter 5 --- Conclusion / Bibliography
176

On FPGA implementations for bioinformatics, neural prosthetics and reinforcement learning problems.

January 2005
Mak Sui Tung Terrence. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references (leaves 132-142). / Abstracts in English and Chinese. / Abstract / List of Tables / List of Figures / Acknowledgements / Chapter 1 --- Introduction / Chapter 1.1 --- Bioinformatics / Chapter 1.2 --- Neural Prosthetics / Chapter 1.3 --- Learning in Uncertainty / Chapter 1.4 --- The Field Programmable Gate Array (FPGAs) / Chapter 1.5 --- Scope of the Thesis / Chapter 2 --- A Hybrid GA-DP Approach for Searching Equivalence Sets / Chapter 2.1 --- Introduction / Chapter 2.2 --- Equivalence Set Criterion / Chapter 2.3 --- Genetic Algorithm and Dynamic Programming / Chapter 2.3.1 --- Genetic Algorithm Formulation / Chapter 2.3.2 --- Bounded Mutation / Chapter 2.3.3 --- Conditioned Crossover / Chapter 2.3.4 --- Implementation / Chapter 2.4 --- FPGAs Implementation of GA-DP / Chapter 2.4.1 --- System Overview / Chapter 2.4.2 --- Parallel Computation for Transitive Closure / Chapter 2.4.3 --- Genetic Operation Realization / Chapter 2.5 --- Discussion / Chapter 2.6 --- Limitation and Future Work / Chapter 2.7 --- Conclusion / Chapter 3 --- An FPGA-based Architecture for Maximum-Likelihood Phylogeny Evaluation / Chapter 3.1 --- Introduction / Chapter 3.2 --- Maximum-Likelihood Model / Chapter 3.3 --- Hardware Mapping for Pruning Algorithm / Chapter 3.3.1 --- Related Works / Chapter 3.3.2 --- Number Representation / Chapter 3.3.3 --- Binary Tree Representation / Chapter 3.3.4 --- Binary Tree Traversal / Chapter 3.3.5 --- Maximum-Likelihood Evaluation Algorithm / Chapter 3.4 --- System Architecture / Chapter 3.4.1 --- Transition Probability Unit / Chapter 3.4.2 --- State-Parallel Computation Unit / Chapter 3.4.3 --- Error Computation / Chapter 3.5 --- Discussion / Chapter 3.5.1 --- Hardware Resource Consumption / Chapter 3.5.2 --- Delay Evaluation / Chapter 3.6 --- Conclusion / Chapter 4 --- Field Programmable Gate Array Implementation of Neuronal Ion Channel Dynamics / Chapter 4.1 --- Introduction / Chapter 4.2 --- Background / Chapter 4.2.1 --- Analog VLSI Model for Hebbian Synapse / Chapter 4.2.2 --- A Unifying Model of Bi-directional Synaptic Plasticity / Chapter 4.2.3 --- Non-NMDA Receptor Channel Regulation / Chapter 4.3 --- FPGAs Implementation / Chapter 4.3.1 --- FPGA Design Flow / Chapter 4.3.2 --- Digital Model of NMDA and AMPA receptors / Chapter 4.3.3 --- Synapse Modification / Chapter 4.4 --- Results / Chapter 4.4.1 --- Simulation Results / Chapter 4.5 --- Discussion / Chapter 4.6 --- Conclusion / Chapter 5 --- Continuous-Time and Discrete-Time Inference Networks for Distributed Dynamic Programming / Chapter 5.1 --- Introduction / Chapter 5.2 --- Background / Chapter 5.2.1 --- Markov decision process (MDPs) / Chapter 5.2.2 --- Learning in the MDPs / Chapter 5.2.3 --- Bellman Optimal Criterion / Chapter 5.2.4 --- Value Iteration / Chapter 5.3 --- A Computational Framework for Continuous-Time Inference Network / Chapter 5.3.1 --- Binary Relation Inference Network / Chapter 5.3.2 --- Binary Relation Inference Network for MDPs / Chapter 5.3.3 --- Continuous-Time Inference Network for MDPs / Chapter 5.4 --- Convergence Consideration / Chapter 5.5 --- Numerical Simulation / Chapter 5.5.1 --- Example 1: Random Walk / Chapter 5.5.2 --- Example 2: Random Walk on a Grid / Chapter 5.5.3 --- Example 3: Stochastic Shortest Path Problem / Chapter 5.5.4 --- Relationships Between λ and γ / Chapter 5.6 --- Discrete-Time Inference Network / Chapter 5.6.1 --- Results / Chapter 5.7 --- Conclusion / Chapter 6 --- On Distributed Q-Learning Network / Chapter 6.1 --- Introduction / Chapter 6.2 --- Distributed Q-Learning Network / Chapter 6.2.1 --- Distributed Q-Learning Network / Chapter 6.2.2 --- Q-Learning Network Architecture / Chapter 6.3 --- Experimental Results / Chapter 6.3.1 --- Random Walk / Chapter 6.3.2 --- The Shortest Path Problem / Chapter 6.4 --- Discussion / Chapter 6.4.1 --- Related Work / Chapter 6.5 --- FPGAs Implementation / Chapter 6.5.1 --- Distributed Registering Approach / Chapter 6.5.2 --- Serial BRAM Storing Approach / Chapter 6.5.3 --- Comparison / Chapter 6.5.4 --- Discussion / Chapter 6.6 --- Conclusion / Chapter 7 --- Summary / Bibliography / Appendix A --- Simplified Floating-Point Arithmetic / Appendix B --- Logarithm, Exponential and Division Implementation / Appendix B.1 --- Introduction / Appendix B.2 --- Approximation Scheme / Appendix B.2.1 --- Logarithm / Appendix B.2.2 --- Exponentiation / Appendix B.2.3 --- Division / Appendix C --- Analog VLSI Implementation / Appendix C.1 --- Site Function / Appendix C.1.1 --- Multiplication Cell / Appendix C.2 --- The Unit Function / Appendix C.3 --- The Inference Network Computation / Appendix C.4 --- Layout / Appendix C.5 --- Fabrication / Appendix C.5.1 --- Testing and Characterization
177

Enhancing routing architecture and routing algorithm for improving FPGAs performance. / CUHK electronic theses & dissertations collection

January 2007
(I) Architectural revisions: Probably for historical reasons, the programmable switches of conventional FPGA architectures are divided into two kinds of substructures: connection boxes (C-boxes) and switch boxes (S-boxes). C-boxes connect logic/pad pins to the wire segments that cross them, and S-boxes connect the wire segments of surrounding routing channels. In this work, we question whether this split into C- and S-boxes is really necessary and explore a new experimental architecture that adopts only one kind of switching component, the connection-switch box (CS-box). Extensive experiments on MCNC benchmark circuits are conducted to evaluate its impact on architectural performance. The results show that the CS-box based FPGA outperforms the conventional FPGA in terms of channel width, circuit delay, and segment usage. Besides a reduction of over 20% in the total number of manufactured switches needed, circuit delay improves by 10% with the same pin assignments and router. / (II) New EDA technique/flow: By applying circuit rewiring, logic perturbations can be carried out by shifting logic resources from potentially costly areas external to Look-Up Tables (LUTs) to cost-free LUT-internal areas, or from critical to non-critical paths. This work presents a simple yet effective, low-overhead post-layout logic perturbation scheme for improving LUT-based FPGA routing without altering placement. A rewiring-based logic perturbation technique is used to improve upon a timing-driven FPGA P&R tool, TVPR. Compared with the already high-quality pure TVPR results, our approach reduces critical path delay by up to 31.74% (avg. 11%) without disturbing the placement or sacrificing chip area, and only 4% of the nets are perturbed. The complexity of our algorithm is linear in the total number of nets of the circuit.
The experimental results show that the CPU time used by the rewiring engine is only 5% of the total time consumed by TVPR placement and routing. / Based on these studies, we believe there is still considerable room for FPGA performance improvement in both architecture and EDA. On the EDA side, we have also applied logic perturbations to both technology mapping and routing to investigate their effectiveness in a larger context. The results show that the best technology mapping does not always lead to the best final routing, which suggests that an ideal FPGA EDA flow should give more consideration to trade-offs between stages. To the best of our knowledge, this is the first work exploring the power of logic perturbations applied across multiple physical design stages for LUT-based FPGAs. The encouraging hardware improvement shown by our proposed CS-box based FPGAs suggests a new design direction for FPGA routing architectures. / With the advent of deep submicron technologies, the extremely high design and mask costs incurred for ASICs have made FPGAs an increasingly popular hardware implementation option. However, it has been shown that the underlying programmable routing structure contributes over 60% of the signal delay and as much as 90% of the total chip area. As a result, current FPGAs still cannot meet the performance requirements of many high-end applications. To address this issue, we propose new solutions along two major tracks: (I) architectural revisions (hardware) and (II) new EDA technique/flow (software). / Zhou Lin. / "October 2007." / Adviser: Yu-Liang Wu. / Source: Dissertation Abstracts International, Volume: 69-08, Section: B, page: 4953. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2007. / Includes bibliographical references (p. 101-108). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader.
Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307.
178

A microcoded elliptic curve cryptographic processor.

January 2001
Leung Ka Ho. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. / Includes bibliographical references (leaves [85]-90). / Abstracts in English and Chinese. / Abstract / Acknowledgments / List of Figures / List of Tables / Chapter 1 --- Introduction / Chapter 1.1 --- Motivation / Chapter 1.2 --- Aims / Chapter 1.3 --- Contributions / Chapter 1.4 --- Thesis Outline / Chapter 2 --- Cryptography / Chapter 2.1 --- Introduction / Chapter 2.2 --- Foundations / Chapter 2.3 --- Secret Key Cryptosystems / Chapter 2.4 --- Public Key Cryptosystems / Chapter 2.4.1 --- One-way Function / Chapter 2.4.2 --- Certification Authority / Chapter 2.4.3 --- Discrete Logarithm Problem / Chapter 2.4.4 --- RSA vs. ECC / Chapter 2.4.5 --- Key Exchange Protocol / Chapter 2.4.6 --- Digital Signature / Chapter 2.5 --- Secret Key vs. Public Key Cryptography / Chapter 2.6 --- Summary / Chapter 3 --- Mathematical Background / Chapter 3.1 --- Introduction / Chapter 3.2 --- Groups and Fields / Chapter 3.3 --- Finite Fields / Chapter 3.4 --- Modular Arithmetic / Chapter 3.5 --- Polynomial Basis / Chapter 3.6 --- Optimal Normal Basis / Chapter 3.6.1 --- Addition / Chapter 3.6.2 --- Squaring / Chapter 3.6.3 --- Multiplication / Chapter 3.6.4 --- Inversion / Chapter 3.7 --- Summary / Chapter 4 --- Literature Review / Chapter 4.1 --- Introduction / Chapter 4.2 --- Hardware Elliptic Curve Implementation / Chapter 4.2.1 --- Field Processors / Chapter 4.2.2 --- Curve Processors / Chapter 4.3 --- Software Elliptic Curve Implementation / Chapter 4.4 --- Summary / Chapter 5 --- Introduction to Elliptic Curves / Chapter 5.1 --- Introduction / Chapter 5.2 --- Historical Background / Chapter 5.3 --- Elliptic Curves over R^2 / Chapter 5.3.1 --- Curve Addition and Doubling / Chapter 5.4 --- Elliptic Curves over Finite Fields / Chapter 5.4.1 --- Elliptic Curves over F_p with p > 3 / Chapter 5.4.2 --- Elliptic Curves over F_{2^n} / Chapter 5.4.3 --- Operations of Elliptic Curves over F_{2^n} / Chapter 5.4.4 --- Curve Multiplication / Chapter 5.5 --- Elliptic Curve Discrete Logarithm Problem / Chapter 5.6 --- Public Key Cryptography / Chapter 5.7 --- Elliptic Curve Diffie-Hellman Key Exchange / Chapter 5.8 --- Summary / Chapter 6 --- Design Methodology / Chapter 6.1 --- Introduction / Chapter 6.2 --- CAD Tools / Chapter 6.3 --- Hardware Platform / Chapter 6.3.1 --- FPGA / Chapter 6.3.2 --- Reconfigurable Hardware Computing / Chapter 6.4 --- Elliptic Curve Processor Architecture / Chapter 6.4.1 --- Arithmetic Logic Unit (ALU) / Chapter 6.4.2 --- Register File / Chapter 6.4.3 --- Microcode / Chapter 6.5 --- Parameterized Module Generator / Chapter 6.6 --- Microcode Toolkit / Chapter 6.7 --- Initialization by Bitstream Reconfiguration / Chapter 6.8 --- Summary / Chapter 7 --- Results / Chapter 7.1 --- Introduction / Chapter 7.2 --- Elliptic Curve Processor with Serial Multiplier (p = 1) / Chapter 7.3 --- Projective versus Affine Coordinates / Chapter 7.4 --- Elliptic Curve Processor with Parallel Multiplier (p > 1) / Chapter 7.5 --- Summary / Chapter 8 --- Conclusion / Chapter 8.1 --- Recommendations for Future Research / Bibliography / Appendix A --- Elliptic Curves in Characteristics 2 and 3 / Appendix A.1 --- Introduction / Appendix A.2 --- Derivations / Appendix A.3 --- Elliptic Curves over Finite Fields of Characteristic ≠ 2,3 / Appendix A.4 --- Elliptic Curves over Finite Fields of Characteristic = 2 / Appendix B --- Examples of Curve Multiplication / Appendix B.1 --- Introduction / Appendix B.2 --- Numerical Results
179

Cost-effective dynamic repair for FPGAs in real-time systems / Reparo dinâmico de baixo custo para FPGAs em sistemas tempo-real

Santos, Leonardo Pereira January 2016
Field-Programmable Gate Arrays (FPGAs) are widely used in digital systems due to characteristics such as flexibility, low cost and high density. These characteristics stem from the use of SRAM cells in the configuration memory, which makes these devices susceptible to radiation-induced errors such as single-event upsets (SEUs). Triple modular redundancy (TMR) is the most widely used mitigation technique, but its high cost in both area and energy restricts its use in low-cost and/or low-power applications. As an alternative to TMR, we propose the use of dual modular redundancy (DMR) combined with scrubbing, a repair mechanism for the FPGA configuration memory. Repairing FPGAs in real-time systems presents a specific set of challenges. Besides guaranteeing the correct computation of data, the computation must be carried out entirely within the available time (time slot) and finish before a time limit (deadline). The difference between the computation time and the deadline is called the slack, and it is the time available to repair the system. This work uses dynamic shifted scrubbing, which aims to maximize the probability of repairing the FPGA configuration memory within the available slack, based on an error diagnosis. Shifted scrubbing has previously been used with fine-grained diagnostic techniques (NAZAR, 2015). This work proposes coarse-grained diagnostic techniques for shifted scrubbing as a way to avoid the performance penalties and area costs associated with fine-grained techniques. Circuits from the MCNC suite were protected with the proposed techniques and subjected to error-injection campaigns (NAZAR; CARRO, 2012a). The data obtained were analyzed and the best scrubbing starting positions for each circuit were calculated. Failure-in-Time (FIT) rates were calculated to compare the proposed diagnostic techniques. The results confirmed the initial hypothesis of this work: reducing the number of sensitive bits while keeping clock-period degradation low yields a lower FIT rate than fine-grained diagnostic techniques. Finally, the three proposed techniques are compared in terms of performance and the area costs associated with each.
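
The idea of choosing a scrubbing starting position to fit the slack can be illustrated with a simplified model. The sketch below is not the thesis's algorithm. It assumes the configuration memory is a ring of frames scrubbed one at a time in order, that each frame takes the same time to rewrite, and that a diagnosis step supplies a hypothetical per-frame weight proportional to the likelihood that the fault lies in that frame. The function name and all parameters are illustrative.

```python
def best_scrub_start(fault_weights, frame_time, slack):
    """Pick the frame at which scrubbing should start.

    fault_weights[i]: assumed relative likelihood that the fault is in frame i.
    frame_time:       assumed time to rewrite one frame.
    slack:            time available before the deadline.
    Returns (start_frame, probability_of_repair_within_slack).
    """
    n = len(fault_weights)
    total = sum(fault_weights)
    # Number of consecutive frames that can be rewritten before the deadline.
    window = min(n, int(slack // frame_time))
    if total == 0 or window == 0:
        return 0, 0.0
    # Slide a circular window over the frames and keep the heaviest one.
    current = sum(fault_weights[:window])
    best_start, best_weight = 0, current
    for start in range(1, n):
        current += fault_weights[(start + window - 1) % n] - fault_weights[start - 1]
        if current > best_weight:
            best_start, best_weight = start, current
    return best_start, best_weight / total
```

In this model a coarser diagnosis simply produces flatter weights over larger groups of frames. The trade-off discussed in the abstract is that such a diagnosis is cheaper in area and clock period even though it narrows down the fault location less precisely.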
180

The Hybrid Architecture Parallel Fast Fourier Transform (HAPFFT)

Palmer, Joseph M. 16 June 2005
The FFT is an efficient algorithm for computing the DFT. It drastically reduces the cost of implementing the DFT on digital computing systems. Nevertheless, the FFT is still computationally intensive, and continued technological advances of computers demand larger and faster implementations of this algorithm. Past attempts at producing high-performance, and small FFT implementations, have focused on custom hardware (ASICs and FPGAs). Ultimately, the most efficient have been single-chipped, streaming I/O, pipelined FFT architectures. These architectures increase computational concurrency through the use of hardware pipelining. Streaming I/O, pipelined FFT architectures are capable of accepting a single data sample every clock cycle. In principle, the maximum clock frequency of such a circuit is limited only by its critical delay path. The delay of the critical path may be decreased by the addition of pipeline registers. Nevertheless this solution gives diminishing returns. Thus, the streaming I/O, pipelined FFT is ultimately limited in the maximum performance it can provide. Attempts have been made to map the Parallel FFT algorithm to custom hardware. Yet, the Parallel FFT was formulated and optimized to execute on a machine with multiple, identical, processing elements. When executed on such a machine, the FFT requires a large expense on communications. Therefore, a direct mapping of the Parallel FFT to custom hardware results in a circuit with complex control and global data movement. This thesis proposes the Hybrid Architecture Parallel FFT (HAPFFT) as an alternative. The HAPFFT is an improved formulation for building Parallel FFT custom hardware modules. It provides improved performance, efficient resource utilization, and reduced design time. The HAPFFT is modular in nature. It includes a custom front-end parallel processing unit which produces intermediate results. The intermediate results are sent to multiple, independent FFT modules. 
These independent modules form the back-end of the HAPFFT and are generic, meaning that any preexisting FFT architecture may be used. With P back-end modules, a speedup of P is achieved in comparison to an FFT built from a single such module. Furthermore, the HAPFFT defines the front-end processing unit as a function of P. It hides the high communication costs typically seen in Parallel FFTs. Reductions in control complexity, memory demands, and logic resources are achieved. An extraordinary result of the HAPFFT formulation is sublinear area-time growth. This phenomenon is often also called superlinear speedup; the two terms are equivalent. This thesis will subsequently use the term superlinear speedup to refer to the HAPFFT's outstanding speedup behavior. A further benefit of the HAPFFT formulation is reduced design time. Because the HAPFFT defines only the front-end module, and because the back-end parallel modules may be composed of any preexisting FFT modules, total design time for a HAPFFT is greatly reduced.
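
The front-end/back-end split described in the abstract resembles a standard decimation-in-frequency factorization of the DFT, sketched below in software. This is an assumption about the general structure, not the thesis's formulation: a front end combines P input samples at a time and applies twiddle factors, and each of the P resulting streams is handed to an independent (N/P)-point transform. The function names are hypothetical, and a naive DFT stands in for "any preexisting FFT module".

```python
import cmath


def dft(x):
    """Naive reference DFT; stands in for an arbitrary back-end FFT module."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def split_fft(x, p, backend=dft):
    """N-point DFT as a P-way front end followed by P independent N/P-point DFTs."""
    n = len(x)
    if n % p:
        raise ValueError("transform length must be a multiple of p")
    m = n // p
    # Front end: for each sample index i, a P-point DFT across the inputs
    # x[i], x[i + m], ..., then a twiddle factor that depends on (i, r).
    streams = [[0j] * m for _ in range(p)]
    for i in range(m):
        column = [x[i + q * m] for q in range(p)]
        for r in range(p):
            acc = sum(column[q] * cmath.exp(-2j * cmath.pi * q * r / p)
                      for q in range(p))
            streams[r][i] = acc * cmath.exp(-2j * cmath.pi * i * r / n)
    # Back end: stream r is transformed independently and yields the
    # outputs X[p*k + r], so no data moves between back-end modules.
    out = [0j] * n
    for r in range(p):
        y = backend(streams[r])
        for k in range(m):
            out[p * k + r] = y[k]
    return out
```

Because the back-end transforms never exchange data, the only global data movement is in the front end, which is consistent with the abstract's claim that the communication cost is hidden there.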
