1 |
Bit Serial Systolic Architectures for Multiplicative Inversion and Division over GF(2<sup>m</sup>)Daneshbeh, Amir January 2005 (has links)
Systolic architectures are capable of achieving high throughput by maximizing pipelining and by eliminating global data interconnects. Recursive algorithms with regular data flows are suitable for systolization. The computation of multiplicative inversion using algorithms based on EEA (Extended Euclidean Algorithm) are particularly suitable for systolization. Implementations based on EEA present a high degree of parallelism and pipelinability at bit level which can be easily optimized to achieve local data flow and to eliminate the global interconnects which represent most important bottleneck in todays sub-micron design process. The net result is to have high clock rate and performance based on efficient systolic architectures.
This thesis examines high performance but also scalable implementations of multiplicative inversion or field division over Galois fields <i>GF</i>(2<i><sup>m</sup></i>) in the specific case of cryptographic applications where field dimension <i>m</i> may be very large (greater than 400) and either <i>m</i> or defining irreducible polynomial may vary. For this purpose, many inversion schemes with different basis representation are studied and most importantly variants of EEA and binary (Stein's) GCD computation implementations are reviewed. A set of common as well as contrasting characteristics of these variants are discussed. As a result a generalized and optimized variant of EEA is proposed which can compute division, and multiplicative inversion as its subset, with divisor in either <i>polynomial</i> or <i>triangular</i> basis representation. Further results regarding Hankel matrix formation for double-basis inversion is provided. The validity of using the same architecture to compute field division with polynomial or triangular basis representation is proved.
Next, a scalable unidirectional bit serial systolic array implementation of this proposed variant of EEA is implemented. Its complexity measures are defined and these are compared against the best known architectures. It is shown that assuming the requirements specified above, this proposed architecture may achieve a higher clock rate performance w. r. t. other designs while being more flexible, reliable and with minimum number of inter-cell interconnects.
The main contribution at system level architecture is the substitution of all counter or adder/subtractor elements with a simpler distributed and free of carry propagation delays structure. Further a novel restoring mechanism for result sequences of EEA is proposed using a double delay element implementation.
Finally, using this systolic architecture a CMD (Combined Multiplier Divider) datapath is designed which is used as the core of a novel systolic elliptic curve processor. This EC processor uses affine coordinates to compute scalar point multiplication which results in having a very small control unit and negligible with respect to the datapath for all practical values of <i>m</i>. The throughput of this EC based on this bit serial systolic architecture is comparable with designs many times larger than itself reported previously.
|
2 |
Bit Serial Systolic Architectures for Multiplicative Inversion and Division over GF(2<sup>m</sup>)Daneshbeh, Amir January 2005 (has links)
Systolic architectures are capable of achieving high throughput by maximizing pipelining and by eliminating global data interconnects. Recursive algorithms with regular data flows are suitable for systolization. The computation of multiplicative inversion using algorithms based on EEA (Extended Euclidean Algorithm) are particularly suitable for systolization. Implementations based on EEA present a high degree of parallelism and pipelinability at bit level which can be easily optimized to achieve local data flow and to eliminate the global interconnects which represent most important bottleneck in todays sub-micron design process. The net result is to have high clock rate and performance based on efficient systolic architectures.
This thesis examines high performance but also scalable implementations of multiplicative inversion or field division over Galois fields <i>GF</i>(2<i><sup>m</sup></i>) in the specific case of cryptographic applications where field dimension <i>m</i> may be very large (greater than 400) and either <i>m</i> or defining irreducible polynomial may vary. For this purpose, many inversion schemes with different basis representation are studied and most importantly variants of EEA and binary (Stein's) GCD computation implementations are reviewed. A set of common as well as contrasting characteristics of these variants are discussed. As a result a generalized and optimized variant of EEA is proposed which can compute division, and multiplicative inversion as its subset, with divisor in either <i>polynomial</i> or <i>triangular</i> basis representation. Further results regarding Hankel matrix formation for double-basis inversion is provided. The validity of using the same architecture to compute field division with polynomial or triangular basis representation is proved.
Next, a scalable unidirectional bit serial systolic array implementation of this proposed variant of EEA is implemented. Its complexity measures are defined and these are compared against the best known architectures. It is shown that assuming the requirements specified above, this proposed architecture may achieve a higher clock rate performance w. r. t. other designs while being more flexible, reliable and with minimum number of inter-cell interconnects.
The main contribution at system level architecture is the substitution of all counter or adder/subtractor elements with a simpler distributed and free of carry propagation delays structure. Further a novel restoring mechanism for result sequences of EEA is proposed using a double delay element implementation.
Finally, using this systolic architecture a CMD (Combined Multiplier Divider) datapath is designed which is used as the core of a novel systolic elliptic curve processor. This EC processor uses affine coordinates to compute scalar point multiplication which results in having a very small control unit and negligible with respect to the datapath for all practical values of <i>m</i>. The throughput of this EC based on this bit serial systolic architecture is comparable with designs many times larger than itself reported previously.
|
3 |
Systolic design space exploration of EEA-based inversion over binary and ternary fieldsHazmi, Ibrahim 29 August 2018 (has links)
Cryptographic protocols are implemented in hardware to ensure low-area, high speed and reduced power consumption especially for mobile devices. Elliptic Curve Cryptography (ECC) is the most commonly used public-key cryptosystem and its performance depends heavily on efficient finite field arithmetic hardware. Finding the multiplicative inverse (inversion) is the most expensive finite field operation in ECC. The two predominant algorithms for computing finite field inversion are Fermat’s Little Theorem (FLT) and Extended Euclidean Algorithm (EEA). EEA is reported to be the most efficient inversion algorithm in terms of performance and power consumption.
This dissertation presents a new reformulation of EEA algorithm, which allows for speedup and optimization techniques such as concurrency and resource sharing. Modular arithmetic operations over GF(p) are introduced for small values of p, observing interesting figures, particularly for modular division. Whereas, polynomial arithmetic operations over GF(pm) are discussed adequately in order to examine the potential for processes concurrency. In particular, polynomial division and multiplication are revisited in order to derive their iterative equations, which are suitable for systolic array implementation. Consequently, several designs are proposed for each individual process and their complexities are analyzed and compared. Subsequently, a concurrent divider/multiplier-accumulator is developed, while the resulting systolic architecture is utilized to build the EEA-based inverter.
The processing elements of our systolic architectures are created accordingly, and enhanced to accommodate data management throughout our reformulated EEA algorithm. Meanwhile, accurate models for the complexity analysis of the proposed inverters are developed. Finally, a novel, fast, and compact inverter over binary fields is proposed and implemented on FPGA. The proposed design outperforms the reported inverters in terms of area and speed. Correspondingly, an EEA-based inverter over ternary fields is built, showing the lowest area-time complexity among the reported inverters. / Graduate
|
4 |
Elliptic curve cryptosystem over optimal extension fields for computationally constrained devicesAbu-Mahfouz, Adnan Mohammed 08 June 2005 (has links)
Data security will play a central role in the design of future IT systems. The PC has been a major driver of the digital economy. Recently, there has been a shift towards IT applications realized as embedded systems, because they have proved to be good solutions for many applications, especially those which require data processing in real time. Examples include security for wireless phones, wireless computing, pay-TV, and copy protection schemes for audio/video consumer products and digital cinemas. Most of these embedded applications will be wireless, which makes the communication channel vulnerable. The implementation of cryptographic systems presents several requirements and challenges. For example, the performance of algorithms is often crucial, and guaranteeing security is a formidable challenge. One needs encryption algorithms to run at the transmission rates of the communication links at speeds that are achieved through custom hardware devices. Public-key cryptosystems such as RSA, DSA and DSS have traditionally been used to accomplish secure communication via insecure channels. Elliptic curves are the basis for a relatively new class of public-key schemes. It is predicted that elliptic curve cryptosystems (ECCs) will replace many existing schemes in the near future. The main reason for the attractiveness of ECC is the fact that significantly smaller parameters can be used in ECC than in other competitive system, but with equivalent levels of security. The benefits of having smaller key size include faster computations, and reduction in processing power, storage space and bandwidth. This makes ECC ideal for constrained environments where resources such as power, processing time and memory are limited. The implementation of ECC requires several choices, such as the type of the underlying finite field, algorithms for implementing the finite field arithmetic, the type of the elliptic curve, algorithms for implementing the elliptic curve group operation, and elliptic curve protocols. Many of these selections may have a major impact on overall performance. In this dissertation a finite field from a special class called the Optimal Extension Field (OEF) is chosen as the underlying finite field of implementing ECC. OEFs utilize the fast integer arithmetic available on modern microcontrollers to produce very efficient results without resorting to multiprecision operations or arithmetic using polynomials of large degree. This dissertation discusses the theoretical and implementation issues associated with the development of this finite field in a low end embedded system. It also presents various improvement techniques for OEF arithmetic. The main objectives of this dissertation are to --Implement the functions required to perform the finite field arithmetic operations. -- Implement the functions required to generate an elliptic curve and to embed data on that elliptic curve. -- Implement the functions required to perform the elliptic curve group operation. All of these functions constitute a library that could be used to implement any elliptic curve cryptosystem. In this dissertation this library is implemented in an 8-bit AVR Atmel microcontroller. / Dissertation (MEng (Computer Engineering))--University of Pretoria, 2006. / Electrical, Electronic and Computer Engineering / unrestricted
|
Page generated in 0.0865 seconds