1171 |
High performance reconfigurable architectures for biological sequence alignment
Isa, Mohammad Nazrin January 2013
Bioinformatics and computational biology (BCB) is a rapidly developing multidisciplinary field that encompasses a wide range of domains, including genomic sequence alignment. Sequence alignment is a fundamental tool in molecular biology for searching for homology between sequences. It is currently gaining close attention because of its impact on quality of life, for example by facilitating early disease diagnosis, identifying the characteristics of a newly discovered sequence, and supporting drug engineering. With the vast growth of genomic data, searching for sequence homology over huge databases (often measured in gigabytes) cannot produce results within a realistic time, hence the need for acceleration. Since the exponential growth of biological databases that followed the Human Genome Project (HGP), supercomputers and other parallel architectures such as special-purpose Very Large Scale Integration (VLSI) chips, Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs) have become popular acceleration platforms. Nevertheless, there is always a trade-off between area, speed, power, cost, development time and reusability when selecting an acceleration platform. FPGAs generally offer more flexibility, higher performance and lower overheads. However, they suffer from a relatively low-level programming model compared with off-the-shelf processors such as standard microprocessors and GPUs. Because of this limitation, optimized FPGA core implementations are crucial for the technology to become viable in high performance computing (HPC). This research proposes the use of state-of-the-art reprogrammable system-on-chip technology on FPGAs to accelerate three widely used sequence alignment algorithms: the Smith-Waterman algorithm with affine gap penalty, the profile hidden Markov model (HMM) algorithm and the Basic Local Alignment Search Tool (BLAST) algorithm.
The research has three novel aspects. First, the algorithms are designed and implemented in hardware, with each core achieving the highest performance compared to the state of the art. Second, an efficient scheduling strategy based on the double buffering technique is adopted in the hardware architectures. Here, the alignment matrix computation is overlapped with processing element (PE) configuration in a folded systolic array, which significantly increases the overall throughput of the core. This is possible because the PE configuration time is bounded and the PEs are configured in parallel, irrespective of the number of PEs in the systolic array. In addition, the use of only two configuration elements in the PE optimizes hardware resources and enables the scalability of PE systolic arrays without relying on restricted onboard memory resources. Finally, a new performance metric is devised, which facilitates the effective comparison of design performance between different FPGA devices and families. The normalized performance indicator (speed-up per area per process technology) factors out the advantages of the area and lithography technology of any FPGA, resulting in fairer comparisons. The cores were designed in Verilog HDL and prototyped on the Alpha Data ADM-XRC-5LX card with the Virtex-5 XC5VLX110-3FF1153 FPGA. The implementation results show that the proposed architectures achieved giga cell updates per second (GCUPS) performances of 26.8, 29.5 and 24.2 for the acceleration of the Smith-Waterman algorithm with affine gap penalty, the profile HMM algorithm and the BLAST algorithm, respectively. In terms of speed-up, the designed cores were compared against their corresponding software and against reported FPGA implementations. Compared with equivalent software execution, hardware acceleration of the optimal alignment algorithm yielded an average speed-up of 269x over the SSEARCH 35 software.
For the profile HMM-based sequence alignment, the designed core achieved speed-ups of 103x and 8.3x against HMMER 2.0 and the latest version of HMMER (version 3.0), respectively. The hardware implementation of gapped BLAST with the two-hit method achieved a greater than tenfold speed-up compared to the latest NCBI BLAST software. For comparison against other reported FPGA implementations, the proposed normalized performance indicator was used to evaluate the designed architectures fairly. The results showed that the first architecture achieved more than 50 percent improvement, while hardware acceleration of profile HMM sequence alignment gained a normalized speed-up of 1.34. In the case of gapped BLAST with the two-hit method, the designed core achieved an 11x speed-up after factoring out the advantages of the Virtex-5 FPGA. Further analysis of cost and power performance showed that the core achieved 0.46 MCUPS per dollar spent and 958.1 MCUPS per watt. This shows that FPGAs can be an attractive platform for high performance computation, offering a smaller area footprint and an economical ‘green’ solution compared to other acceleration platforms. Higher throughput can be achieved by redeploying the cores on newer, bigger and faster FPGAs with minimal design effort.
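To make the Smith-Waterman recurrence with affine gap penalty and the GCUPS figure concrete, the following is a minimal software sketch, not the hardware design described in the abstract. The scoring values (`match`, `mismatch`, `gap_open`, `gap_extend`) and the function names are illustrative assumptions; the abstract does not state which parameters the cores use. A gap of length k is assumed to cost `gap_open + (k - 1) * gap_extend`.

```python
def smith_waterman_affine(query, target, match=2, mismatch=-1,
                          gap_open=3, gap_extend=1):
    """Return (best local alignment score, number of matrix cells updated).

    Scoring values are illustrative assumptions, not taken from the thesis.
    """
    n, m = len(query), len(target)
    neg_inf = float("-inf")
    h_prev = [0] * (m + 1)        # H row i-1: best score ending at each cell
    f_prev = [neg_inf] * (m + 1)  # F row i-1: best score ending in a vertical gap
    best = 0
    for i in range(1, n + 1):
        h_cur = [0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        e = neg_inf               # E: best score ending in a horizontal gap
        for j in range(1, m + 1):
            # Either extend an existing gap or open a new one.
            e = max(e - gap_extend, h_cur[j - 1] - gap_open)
            f_cur[j] = max(f_prev[j] - gap_extend, h_prev[j] - gap_open)
            s = match if query[i - 1] == target[j - 1] else mismatch
            # Local alignment: scores are clamped at zero.
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best, n * m


def gcups(cell_updates, seconds):
    """Giga cell updates per second for a given amount of matrix work."""
    return cell_updates / seconds / 1e9


if __name__ == "__main__":
    score, cells = smith_waterman_affine("AAAATTTT", "AAAACTTTT")
    print(score, cells)
```

Each inner-loop iteration is one cell update, so an n-by-m comparison performs n*m updates. A systolic array computes the cells along an anti-diagonal in parallel, which is where hardware throughput of the order reported above comes from.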
|
1172 |
Génération rapide d'accélérateurs matériels par synthèse d'architecture sous contraintes de ressources [Fast generation of hardware accelerators by architectural synthesis under resource constraints]
Prost-Boucle, A. 08 January 2014
Although FPGAs are very attractive for their performance and low power consumption, their use as hardware accelerators remains marginal. Existing development tools are accessible only to experts in circuit design. To push back these limits, a new generation methodology based on architectural synthesis is proposed. By applying successive transformations to an initial solution, the process converges quickly and strictly meets hardware constraints, in particular resource constraints. A demonstrator tool, AUGH, has been built, and experiments have been carried out on several well-known applications. The proposed methodology is very close to the compilation process for microprocessors, which allows it to be used even by users who are not specialists in digital circuit design.
|
1173 |
Conception sur mesure d'un FPGA durci aux radiations à base de mémoires magnétiques [Full-custom design of a radiation-hardened FPGA based on magnetic memories]
Goncalves, Olivier 19 June 2013
The aim of this thesis was to show that MRAM memory cells offer many advantages as configuration memory for reconfigurable architectures, and in particular for FPGAs (Field Programmable Gate Arrays). This type of component is programmable and makes it possible to design a digital circuit simply by programming memory cells that define its functionality. An FPGA consists mainly of memory cells. They therefore largely determine its characteristics, such as area and power consumption, and influence its performance, such as speed. MRAM memories are made of Magnetic Tunnel Junctions (MTJs), which store information in the form of a magnetization. An MTJ consists of three layers: two layers of ferromagnetic material separated by an insulating layer. One of the two ferromagnetic layers has its magnetization fixed in a given direction (reference layer), while the magnetization of the other can be switched between two directions (storage layer). The propagation of electrons thus changes depending on whether the two magnetizations are parallel or antiparallel; that is, the electrical resistance of the junction changes with the relative orientation of the magnetizations. It is low when the magnetizations are parallel and high when they are antiparallel. Writing an MTJ therefore consists of changing the orientation of the magnetization of the storage layer, while reading consists of determining whether the resistance is high or low. Its strengths make the MTJ a good candidate for a so-called universal memory, although research efforts are still needed. It nevertheless already has many advantages, such as non-volatility, speed and low write energy compared with Flash memory, as well as resistance to radiation.
Thanks to these advantages, it can already be used in certain applications, in particular in the space domain. Use in this domain exploits all the advantages of the MTJ, because it is intrinsically immune to radiation and non-volatile. It therefore makes it possible to build a radiation-resistant FPGA with low power consumption and new functionality. The thesis work was carried out over three years. The first year was devoted to the state of the art, in order to learn how MTJs work, the architecture of FPGAs, radiation-hardening and low-power techniques, and the operation of the tools used in microelectronics. At the end of the first year, a new FPGA architecture concept was proposed. The second and third years were devoted to realizing this innovation: finding the best circuit structure, implementing a basic FPGA circuit, and designing and then fabricating a demonstrator. The demonstrator was tested successfully and proved the concept. The new FPGA circuit architecture showed that using MRAM as FPGA configuration memory is advantageous, in particular for future technologies.
|
1174 |
Digital implementation of an upstream DOCSIS QAM modulator and channel emulator
2015 June 1900
The concept of cable television, originally called community antenna television (CATV), began in the 1940s. The information and services provided by cable operators have changed drastically since the early days. Cable service providers no longer simply provide their customers with broadcast television; they provide a multi-purpose, two-way link to the digital world. Custom programming, telephone service, radio, and high-speed internet access are just a few of the services offered by cable service providers in the 21st century.
At the dawn of the internet the dominant mode of access was through telephone lines. Despite advances in dial-up modem technology, the telephone system was unable to keep pace with the demand for data throughput. In the late 1990s an industry consortium known as Cable Television Laboratories, Inc. developed a standard protocol for providing high-speed internet access through the existing CATV infrastructure. This protocol is known as Data Over Cable Service Interface Specification (DOCSIS) and it helped to usher in the era of the information superhighway.
CATV systems use different parts of the radio frequency (RF) spectrum for communication to and from the user. The downstream portion (data destined for the user) consumes the bulk of the spectrum and is located at relatively high frequencies. The upstream portion (data destined to the network from the user) of the spectrum is smaller and located at the low end of the spectrum. This lower frequency region of the RF spectrum is particularly prone to impairments such as micro-reflections, which can be viewed as a type of multipath interference. Upstream data transfer in the presence of these impairments is therefore problematic and requires complex signal correction algorithms to be employed in the receiver.
The quality of a receiver is largely determined by how well it mitigates the signal impairments introduced by the channel. For this reason, engineers developing a receiver require a piece of equipment that can emulate the channel impairments in any permutation in order to test their receiver. The conventional test methodology uses a hardware RF channel emulator connected between the transmitter and the receiver under test. This method requires not only an expensive RF channel emulator but also a functioning analog front-end. Of these two problems, the expense of the hardware emulator is likely less important than the delay in development caused by waiting for a functional analog front-end. Receiver design is an iterative, time-consuming process that requires that the receiver's digital signal processing (DSP) algorithms be tested as early as possible to reduce the time-to-market.
This thesis presents a digital implementation of a DOCSIS-compliant channel emulator whereby cable micro-reflections and thermal noise at the analog front-end of the receiver are modelled digitally at baseband. The channel emulator and the modulator are integrated into a single hardware structure to produce a compact circuit that, during receiver testing, resides inside the same field programmable gate array (FPGA) as the receiver. This approach removes the dependence on the analog front-end, allowing it to be developed concurrently with the receiver's DSP circuits and thus reducing the time-to-market.
The approach taken in this thesis produces a fully programmable channel emulator that can be loaded onto FPGAs as needed by engineers working independently on different receiver designs. The channel emulator uses 3 independent data streams to produce a 3-channel signal, whereby a main channel with micro-reflections is flanked on either side by adjacent channels. Thermal noise normally generated by the receiver's analog front-end is emulated and injected into the signal. The resulting structure utilizes 43 dedicated multipliers and 401.125 KB of RAM, and achieves a modulation error ratio (MER) of 55.29 dB.
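The baseband channel model described above (micro-reflections as delayed, attenuated echoes, plus injected thermal noise, evaluated by MER) can be sketched in a few lines of floating-point software. This is only an illustrative model under stated assumptions, not the thesis's fixed-point FPGA structure: the function names, the four-point constellation, the single echo and the noise power are all hypothetical choices.

```python
import math
import random


def apply_micro_reflections(samples, echoes):
    """Add delayed, scaled copies of the signal (a simple multipath model).

    `echoes` is a list of (delay_in_samples, complex_gain) pairs (assumed form).
    """
    out = list(samples)
    for delay, gain in echoes:
        for n in range(delay, len(samples)):
            out[n] += gain * samples[n - delay]
    return out


def add_thermal_noise(samples, noise_power, rng):
    """Add complex white Gaussian noise with the given total power."""
    sigma = math.sqrt(noise_power / 2.0)  # power split evenly between I and Q
    return [s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for s in samples]


def mer_db(ideal, received):
    """Modulation error ratio: signal power over error power, in dB."""
    signal = sum(abs(a) ** 2 for a in ideal)
    error = sum(abs(r - a) ** 2 for a, r in zip(ideal, received))
    return 10.0 * math.log10(signal / error)


if __name__ == "__main__":
    rng = random.Random(0)
    points = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]  # illustrative constellation
    tx = [rng.choice(points) for _ in range(1000)]
    rx = apply_micro_reflections(tx, [(3, 0.05 + 0.02j)])
    rx = add_thermal_noise(rx, noise_power=1e-3, rng=rng)
    print(round(mer_db(tx, rx), 2), "dB")
```

Here MER is measured against the transmitted symbols without any equalization, so it reflects the impairment the emulator injects rather than the quality of a receiver.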
|
1175 |
Nouvelles Architectures Hybrides : Logique / Mémoires Non-Volatiles et technologies associées. [New hybrid architectures: logic / non-volatile memories and associated technologies]
Palma, Giorgio 29 November 2013
New approaches to memory technology will allow so-called back-end integration, in which the elementary storage cells are fabricated during the last steps of large-scale circuit manufacturing. These innovative approaches are often based on active materials that exhibit two distinct resistance states. Switching from one state to the other is controlled by current or voltage, giving rise to a hysteretic I-V characteristic. Our resistive memories are composed of silver as the electrochemically active metal and an amorphous sulfide acting as the electrolyte. Their operation relies on the reversible formation and dissolution of a conductive filament. The application potential of these new devices is not limited to ultra-high-density memories but extends to embedded circuits. By stacking these memories in the third dimension at the interconnect level of CMOS logic circuits, new and innovative hybrid architectures become possible. It would then be possible to exploit low-energy operation, high write/read speed and high performance in terms of endurance and retention. In this thesis, focusing on the memory technology aspects with a view to developing new architectures, the introduction of non-volatile functionality at the logic level is demonstrated through three hybrid circuits: non-volatile routing switches in a Field Programmable Gate Array, a non-volatile 6T-SRAM, and stochastic neurons for a neural network. To improve existing solutions, the performance limitations of the memory devices are identified and addressed with new stacks or by providing fault-tolerant circuits.
|
1176 |
FPGA implementation of an enhanced digital detection algorithm for medium range RFID readers / Francois Dominicus Muller
Muller, Francois Dominicus January 2008
The School of Electrical, Electronic and Computer Engineering of the North-West University is conducting research on RFID (radio frequency identification) medium range reader systems for an international company, iPico. The focus area of the present research is the development of a robust tag detection algorithm for noisy environments.
During the past three years a digital detection algorithm was developed. This digital detection algorithm delivered significant improvements in the detection of RFIDs over its analogue counterpart, especially in noisy environments. However, the digital detection algorithm was found to be very sensitive to data rate deviations.
Although the latter algorithm improved the detection of RFIDs, ghost (absent) tags were now also detected. The objectives of this project are to develop an enhanced detection algorithm that is less sensitive to frequency deviations and to eliminate the appearance of the so-called ghost tags.
The proposed enhanced algorithm will be implemented on an FPGA (field programmable gate array), more specifically the Altera Cyclone EP1CT144C6 FPGA. / Thesis (M.Ing. (Computer and Electronical Engineering))--North-West University, Potchefstroom Campus, 2009.
|
1178 |
Circuit partitioning for application-dependent FPGA testing
Feng, Rui Zhen 30 August 2007
Application-dependent FPGA testing is performed to ensure that a particular user-defined application is implemented on fault-free areas of an FPGA. Applying this type of test technique leads to yield increases and cost reductions in the use of FPGAs.
In this thesis, we propose a novel application-dependent FPGA testing strategy, in which a recursive circuit partitioning algorithm is employed to obtain a testing configuration solution for a user-specific application. This algorithm is implemented and the experimental results are analyzed to demonstrate the effectiveness of the proposed testing strategy.
Our experimental results show that the circuit partitioning method can be used to provide a reasonable solution for an arbitrary application with significantly improved fault coverage and an approximately minimized number of cut points.
|
1179 |
Embedded reconfigurable solutions for cryptography
Chu, Chi-Chun (Ambrose) 16 July 2008
We first propose a reconfigurable processor, which consists of a MicroBlaze processor augmented with a Field-Programmable Gate Array (FPGA), to reduce the computing time of public-key cryptography algorithms. We consider the Virtex-II Pro from Xilinx to analyze the potential of a Field-Programmable Custom Computing Machine (FCCM) composed of a MicroBlaze augmented with a Virtex-II FPGA. We then propose a cryptography-oriented reconfigurable array, called CryptoRA, that efficiently supports long-word integer addition, subtraction and comparison. As a result, a RISC processor can potentially be augmented with the CryptoRA rather than the Virtex-II. The three main features of CryptoRA are: (i) an increased granularity of the logic tile, (ii) the extension of the dedicated carry chain over the horizontal direction, and (iii) the incremental splitting Look-Up Table. According to our simulations, the CryptoRA-based FCCM provides a significant performance improvement over an optimized pure-software solution at an acceptable cost.
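The long-word integer addition, subtraction and comparison that public-key cryptography relies on can be illustrated in software by splitting operands into fixed-width words and propagating a carry or borrow from word to word. The sketch below is only a reference model of that arithmetic; it does not describe CryptoRA's logic tiles or carry chain. The 32-bit word size and all function names are assumptions made for illustration.

```python
WORD_BITS = 32                      # assumed word width, for illustration only
MASK = (1 << WORD_BITS) - 1


def to_limbs(value, n_limbs):
    """Split a non-negative integer into n_limbs words, least significant first."""
    return [(value >> (WORD_BITS * i)) & MASK for i in range(n_limbs)]


def from_limbs(limbs):
    """Reassemble an integer from its words."""
    return sum(limb << (WORD_BITS * i) for i, limb in enumerate(limbs))


def add_limbs(a, b):
    """Word-wise addition; the carry ripples from each word to the next."""
    out, carry = [], 0
    for x, y in zip(a, b):
        total = x + y + carry
        out.append(total & MASK)
        carry = total >> WORD_BITS
    return out, carry


def sub_limbs(a, b):
    """Word-wise subtraction a - b; returns the words and the final borrow."""
    out, borrow = [], 0
    for x, y in zip(a, b):
        diff = x - y - borrow
        out.append(diff & MASK)     # wraps modulo 2**WORD_BITS when negative
        borrow = 1 if diff < 0 else 0
    return out, borrow


def compare_limbs(a, b):
    """Return -1, 0 or 1, scanning from the most significant word down."""
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            return -1 if x < y else 1
    return 0


if __name__ == "__main__":
    a, b = to_limbs(2**100 + 12345, 4), to_limbs(2**99 + 2**64 - 1, 4)
    total, carry = add_limbs(a, b)
    print(hex(from_limbs(total)), carry, compare_limbs(a, b))
```

The serial dependence on `carry` in `add_limbs` is the reason long operands are slow on a narrow datapath, and it is the kind of cost that a longer dedicated carry chain in hardware is meant to reduce.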
|
1180 |
FPGA interconnection networks with capacitive boosting in strong and weak inversion
Eslami, Fatemeh 22 August 2012
Designers of Field-Programmable Gate Arrays (FPGAs) are always striving to improve the speed of their designs. The propagation delay of FPGA interconnection networks is a major challenge and continues to grow with newer technologies. FPGA interconnection networks are implemented using NMOS pass-transistor-based multiplexers followed by buffers. The threshold voltage drop across an NMOS device degrades the high logic value and results in unbalanced rising and falling edges, static power consumption due to crowbar currents, and reduced noise margins. In this work, circuit design techniques for constructing interconnection circuits with capacitive boosting are proposed. By using capacitive boosting in FPGA interconnection networks, the signal transitions are accelerated and the crowbar currents of downstream buffers are reduced. In addition, buffers can be non-skewed or slightly skewed to improve the noise immunity of the interconnection network. Results indicate that the presented circuit design technique reduces the propagation delay by at least 10% versus prior art, at the expense of a slight increase in silicon area.
In addition, in a bid to reduce power consumption in reconfigurable arrays, operation in the weak inversion region has been suggested. Current programmable interconnections cannot be used directly in this region because of very poor propagation delay and sensitivity to Process-Voltage-Temperature (PVT) variations. This work also focuses on designing a common structure for FPGA interconnection networks that can operate in both strong and weak inversion. We propose to use capacitive boosting together with a new circuit design technique, called Twins transmission gates, in implementing FPGA interconnect multiplexers. We also propose to use capacitive boosting in designing buffers. This way, the operating region of the interconnection circuitry is shifted away from weak inversion toward strong inversion, resulting in improved speed and enhanced tolerance to PVT variations. Simulation results indicate that using capacitive boosting to implement the interconnection network can have a significant influence on delay and tolerance to variations. The interconnection network with capacitive boosting is at least 34% faster than prior art in weak inversion. / Graduate
|