Thanks to the development of the Internet in recent years, we can see more and more applications on E-commerce in the world. At the same time, we have to prevent our personal information to be leaked out during the transaction. Therefore, topic on researching network security becomes increasingly popular. It is well-known that an encryption system can be applied to consolidate the network security. RSA encryption algorithm is a special kind of asymmetric cryptography, commonly used in public key encryption system on the network, by using two prime numbers as the two keys to encrypt and decrypt. These two keys are called public key and private key, and the key length is at least 512 bits. As a public key encryption, the only way to decrypt is using the private key. As long as the private key is not revealed, it is very difficult to get the private key from the public key even using the reverse engineering. Therefore, RSA encryption algorithm can be regarded as a very safe encryption and decryption algorithm. As the minimum key length has to be greater than 512 bits to ensure information security, using software to execute RSA encryption and decryption will be very slow so that the real time requirement may not be satisfied. Hence we will have to implement RSA encryption system with a hardware circuit to meet the real time requirement on the network.
Modular exponentiation (i.e., ME mod N) in RSA cryptosystem is usually achieved by repeated modular multiplications on large integers. A famous approach to implement the modular multiplication into hardware circuits is based on the Montgomery modular multiplication algorithm, which replaces the trial division by modulus with a series of addition and shift operations. However, a large amount of clock cycle is still required to complete a modular multiplication. For example, Montgomery multiplication algorithm will take 512 clock cycles to complete an A․B mod N. As a result, performing one modular exponentiation ME mod N in RSA cryptosystm will need 512․512 clock cycles.
To counter the above disadvantage, we employ radix-4 algorithm to reduce 50% of clock cycle number for each A•B mod N. In addition, we also modify the architecture of conventional in order to achieve the radix-4 algorithm to reduce its critical path delay so that the performance can be improved further.
Experimental results show that the proposed 1024-bit radix-4 modular multiplier (Our-Booth-Radix-4) before performing as pipeline is 70% faster than the radix-2 multiplier with 24% area overhead. Furthermore, it is 20% faster than traditional radix-4 modular multiplier with 12% area reduction. Therefore, its AT is smaller than the previous architectures.
Identifer | oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0830111-180506 |
Date | 30 August 2011 |
Creators | Hsu, Hong-Yi |
Contributors | Yu-Hong, XIAO, Li-Yang, WANG, Xian-Rong, KUANG, Pei-Yan, CHEN, Shun-Zhi, CHEN |
Publisher | NSYSU |
Source Sets | NSYSU Electronic Thesis and Dissertation Archive |
Language | Cholon |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0830111-180506 |
Rights | user_define, Copyright information available at source archive |
Page generated in 0.0024 seconds