Global ETD Search

61	Design and evaluation of on-line arithmetic modules and networks for signal processing applications on FPGAs Galli, Reto 07 June 2001 Several papers propose the use of on-line arithmetic for signal processing applications implemented on FPGAs. Although those papers provide reasonable arguments for the use of on-line arithmetic, they give only inadequate or incomplete comparisons of the proposed on-line designs to other state-of-the-art solutions on FPGAs. In this thesis, the design, implementation and evaluation of on-line modules and networks for DSP applications, using FPGAs as the target technology, are shown. The presented designs of the modules are highly optimized for the target hardware, which allows a significant increase in efficiency compared to standard on-line designs. The design process for the networks of on-line modules is described in detail, and a methodology to analyze the dataflow and timing is presented. A comparison of on-line signal processing solutions with other approaches that are available as IP building blocks or components is given. It is shown that on-line designs are better in terms of latency but that they cannot compete in terms of throughput and area for basic applications like FIR filters. However, it is also shown that on-line designs are able to overtake other approaches as the applications become more sophisticated, e.g. when data dependencies exist, or when non-constant multiplicands restrict the use of other approaches, such as serial distributed arithmetic. For these applications, on-line arithmetic shows, compared to other designs, a lower latency and a significant area reduction, while maintaining a high throughput. Several properties of algorithms for which on-line arithmetic is advantageous are identified in this thesis. With this information, it is possible to determine if an on-line solution for an application should be considered. 
The conclusions are based on experimental data collected using CAD tools for the Xilinx XC4000 family of chips. All the designs are synthesized for the same type of device for comparison, avoiding rough estimates of the system performance. This generates a more reliable comparison, allowing designers to decide between on-line and conventional approaches for their DSP designs. / Graduation date: 2002 Digital communications Computer arithmetic Algorithms
62	Improved architectures for fused floating-point arithmetic units Sohn, Jongwook 05 November 2013 Most general purpose processors (GPP) and application specific processors (ASP) use floating-point arithmetic due to its wide and precise number system. However, floating-point operations require complex processes such as alignment, normalization and rounding. To reduce this overhead, fused floating-point arithmetic units have been introduced. In this dissertation, improved architectures for three fused floating-point arithmetic units are proposed: 1) a fused floating-point add-subtract unit, 2) a fused floating-point two-term dot product unit, and 3) a fused floating-point three-term adder. The three fused floating-point units are implemented for both single and double precision and evaluated in terms of area, power consumption, latency and throughput. To improve the performance of the fused floating-point add-subtract unit, a new alignment scheme, fast rounding, two dual-path algorithms and pipelining are applied. The improved fused floating-point two-term dot product unit applies several optimizations: a new alignment scheme, early normalization and fast rounding, four-input leading zero anticipation (LZA), a dual-path algorithm and pipelining. The proposed fused floating-point three-term adder applies a new exponent compare and significand alignment scheme, double reduction, early normalization and fast rounding, three-input LZA and pipelining to improve the performance. Floating-point arithmetic Fused floating-point operation High speed computer arithmetic Add-subtract unit Dot product unit Three-term adder
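The numerical benefit of fusing can be illustrated in software. The sketch below is not the hardware architecture from the dissertation; it only contrasts a two-term dot product that is rounded once at the end (the behaviour a fused unit aims for) with the conventional sequence in which each product and the sum are rounded separately. The function names are hypothetical, and exact rational arithmetic from the Python standard library stands in for a wide internal datapath.

```python
from fractions import Fraction

def dot2_unfused(a: float, b: float, c: float, d: float) -> float:
    """a*b + c*d with three roundings: two products, then one sum."""
    return a * b + c * d

def dot2_fused(a: float, b: float, c: float, d: float) -> float:
    """a*b + c*d computed exactly, then rounded once to binary64.
    (Illustrative model only; a real fused unit uses a wide datapath.)"""
    exact = Fraction(a) * Fraction(b) + Fraction(c) * Fraction(d)
    return float(exact)  # single rounding at the very end

# a*b equals 1 - 2**-60 exactly, which is not representable in binary64.
a, b = 1.0 + 2.0**-30, 1.0 - 2.0**-30
c, d = -1.0, 1.0

print(dot2_unfused(a, b, c, d))  # product rounds to 1.0, so the sum cancels
print(dot2_fused(a, b, c, d))    # the small term -2**-60 survives
```

With these inputs the separately rounded version loses the result entirely, while the single-rounding version keeps it, which is the accuracy argument usually made for fused operations.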
63	Low Power and Low Complexity Shift-and-Add Based Computations Johansson, Kenny January 2008 The main aim of this thesis is to minimize the energy consumption per operation for the arithmetic parts of DSP circuits, such as digital filters. More specifically, the focus is on single- and multiple-constant multiplications, which are realized using shift-and-add based computations. The possibilities to reduce the complexity, i.e., the chip area, and the energy consumption are investigated. Both serial and parallel arithmetic are considered. The main difference, which is of interest here, is that shift operations in serial arithmetic require flip-flops, while shifts can be hardwired in parallel arithmetic. The number of possible ways to connect a given number of adders is limited. Thus, for single-constant multiplication, the number of shift-and-add structures is finite. We show that it is possible to save both adders and shifts compared to traditional multipliers. Two algorithms for multiple-constant multiplication using serial arithmetic are proposed. For both algorithms, the total complexity is decreased compared to one of the best-known algorithms designed for parallel arithmetic. Furthermore, the impact of the digit-size, i.e., the number of bits to be processed in parallel, is studied for FIR filters implemented using serial arithmetic. Case studies indicate that the minimum energy consumption per sample is often obtained for a digit-size of around four bits. The energy consumption is proportional to the switching activity, i.e., the average number of transitions between the two logic levels per clock cycle. To achieve low power designs, it is necessary to develop accurate high-level models that can be used to estimate the switching activity. A method for computing the switching activity in bit-serial constant multipliers is proposed. For parallel arithmetic, a detailed complexity model for constant multiplication is introduced. 
The model counts the required number of full and half adder cells. It is shown that the complexity can be significantly reduced by considering the interconnection between the adders. A main factor for energy consumption in constant multipliers is the adder depth, i.e., the number of cascaded adders. The reason for this is that the switching activity will increase when glitches are propagated to subsequent adders. We propose an algorithm where all multiplier coefficients are guaranteed to be realized at the theoretically lowest depth possible. Implementation examples show that the energy consumption is significantly reduced using this algorithm compared to solutions with fewer word-level adders. For most applications, the input data are correlated since real-world signals are processed. A data-dependent switching activity model is derived for ripple-carry adders. Furthermore, a switching activity model for the single adder multiplier is proposed. This is a good starting point for accurate modeling of shift-and-add based computations using more adders. Finally, a method to rewrite an arbitrary function as a sum of weighted bit-products is presented. It is shown that for many elementary functions, a majority of the bit-products can be neglected while still maintaining reasonably high accuracy, since the weights are significantly smaller than the allowed error. The function approximation algorithms can be implemented using a low complexity architecture, which can easily be pipelined to an arbitrary degree for increased throughput. FIR filters Function approximation Digital circuits Computer arithmetic Constant multiplication Addition Low power Switching activity estimation Electronics Elektronik
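As a rough illustration of shift-and-add constant multiplication, the sketch below recodes a constant into canonical signed-digit (CSD) form and multiplies using only shifts, additions and subtractions. CSD recoding is a standard textbook technique and is used here as an assumption; it is not necessarily one of the algorithms proposed in the thesis, and the function names are hypothetical.

```python
def csd_digits(c: int) -> list[int]:
    """Canonical signed-digit form of c >= 0, least significant digit first.
    Each digit is -1, 0 or +1, and no two adjacent digits are nonzero."""
    digits = []
    while c != 0:
        if c & 1:
            d = 2 - (c & 3)   # +1 if c % 4 == 1, -1 if c % 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits

def shift_add_multiply(x: int, c: int) -> tuple[int, int]:
    """Return (x * c, number of adders/subtractors a direct realization needs)."""
    acc = 0
    terms = 0
    for shift, d in enumerate(csd_digits(c)):
        if d:
            acc += d * (x << shift)   # shifted copy of x, added or subtracted
            terms += 1
    return acc, max(terms - 1, 0)

print(csd_digits(7))              # 7 = 8 - 1
print(shift_add_multiply(5, 7))   # one subtractor instead of two adders
```

For the constant 7, plain binary needs two adders (4 + 2 + 1), whereas the signed-digit form 8 - 1 needs a single subtractor; savings of this kind are what adder-count minimization builds on.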
64	High-performance coarse operators for FPGA-based computing / Opérateurs grossiers haute performance pour l'informatique basée FPGA Istoan, Matei Valentin 06 April 2017 Les FPGA (Field Programmable Gate Arrays) constituent un type de circuit reprogrammable qui, sous certaines conditions, peuvent avoir de meilleures performances que les microprocesseurs classiques. Les FPGA utilisent le circuit comme paradigme de programmation, ce qui permet d'effectuer des calculs parallèles propres à l'application visée. Ils permettent aussi d’atteindre l’efficacité arithmétique: un bit ne doit être calculé que s'il est utile dans le résultat final. Pour ce faire, l’arithmétique utilisée par les FPGA ne peut se limiter qu’à des fonctions conçues pour les microprocesseurs. Cette thèse se propose d’étudier les méthodes pour l’implémentation des fonctions gros-grain pour les FPGA à travers trois voies. De nouvelles méthodes pour évaluer des fonctions trigonométriques, telles que le sinus, cosinus et arc tangente ont été développées dans cette thèse. Chaque méthode est optimisée dans son contexte, de la manière la plus flexible et la plus souple possible. Pour que les méthodes aboutissent à leur efficacité arithmétique, il est nécessaire de procéder à une analyse d'erreurs, ainsi qu’à un choix attentif des paramètres de la méthode et à une fine compréhension des algorithmes utilisés. Les filtres numériques constituent une famille importante d’opérateurs arithmétiques qui rassemble des fonctions élémentaires. Ils peuvent être spécifiés à un niveau élevé d'abstraction, à travers une fonction de transfert avec des contraintes sur le rapport signal/bruit. Ils peuvent être ensuite implémentés comme des chemins de données basés sur des additions et des multiplications. Le principal résultat est donc une méthode qui transforme une spécification de haut niveau en une implémentation d’une façon automatique. 
La première étape se rapporte au développement d'une méthode pour le calcul des produits par des constantes. Des filtres FIR et IIR peuvent être construits à l'aide de cette brique de base. Pour que les opérateurs arithmétiques atteignent leur performance maximale, on a besoin d’un pipeline correspondant au contexte donné. Même si les connaissances du développeur s’avèrent d’un grand avantage pendant le processus de création d'un pipeline d'un chemin de données, cette étape demeure complexe et facilement susceptible à des erreurs. Une méthode automatique, contrôlée par le développeur, a donc été développée. Cette thèse fournit un générateur des opérateurs arithmétiques de haute qualité prêts à l'emploi, et qui propagent le domaine des calculs sur des FPGA à un pas plus proche de l’adoption générale. Les cœurs arithmétiques font partie d'un générateur open-source, où les fonctions peuvent être décrites par une spécification de haut niveau, comme par exemple une formule mathématique. / Field-Programmable Gate Arrays (FPGAs) have been shown to sometimes outperform mainstream microprocessors. The circuit paradigm enables efficient application-specific parallel computations. FPGAs also enable arithmetic efficiency: a bit is only computed if it is useful to the final result. To achieve this, FPGA arithmetic shouldn’t be limited to basic arithmetic operations offered by microprocessors. This thesis studies the implementation of coarser operations on FPGAs, in three main directions: New FPGA-specific approaches for evaluating the sine, cosine and the arctangent have been developed. Each function is tuned for its context and is as versatile and flexible as possible. Arithmetic efficiency requires error analysis and parameter tuning, and a fine understanding of the algorithms used. 
Digital filters are an important family of coarse operators resembling elementary functions: they can be specified at a high level as a transfer function with constraints on the signal/noise ratio, and then be implemented as an arithmetic datapath based on additions and multiplications. The main result is a method which transforms a high-level specification into a filter in an automated way. The first step is building an efficient method for computing sums of products by constants. Based on this, FIR and IIR filter generators are constructed. For arithmetic operators to achieve maximum performance, context-specific pipelining is required. Even if the designer’s knowledge is of great help when building and pipelining an arithmetic datapath, this remains complex and error-prone. A user-directed, automated method for pipelining has been developed. This thesis provides a generator of high-quality, ready-made operators for coarse computing cores, which brings FPGA-based computing a step closer to mainstream adoption. The cores are part of an open-ended generator, where functions are described as high-level objects such as mathematical expressions. Informatique Arithmétique de l'ordinateur Architecture des circuits Filtres numériques Computer science Computer arithmetic Computer architecture Digital filters 004.015 107 2
65	Methods to evaluate accuracy-energy trade-off in operator-level approximate computing / Méthodes d'évaluation du compromis précision-énergie pour le calcul approximatif niveau opérateur Barrois, Benjamin 11 December 2017 Les limites physiques des circuits à base de silicium étant en passe d'être atteintes, de nouveaux moyens doivent être trouvés pour outrepasser la fin de la loi de Moore. Beaucoup d'applications peuvent tolérer des approximations dans leurs calculs à différents niveaux, sans dégrader la qualité de leur sortie, ou en la dégradant de manière acceptable. Cette thèse se concentre sur les architectures arithmétiques approximatives afin de saisir cette opportunité. Tout d'abord, une étude critique de l'état de l'art des additionneurs et multiplieurs approximatifs est présentée. Ensuite, un modèle de propagation d'erreur virgule-fixe mettant en œuvre la densité spectrale de puissance est proposé, suivi d'un modèle de propagation du taux d'erreur binaire positionnel des opérateurs approximatifs. Les opérateurs approximatifs sont ensuite utilisés pour la reproduction des effets de la VOS dans les opérateurs arithmétiques exacts. Grâce à notre outil de travail open-source ApxPerf et ses bibliothèques synthétisables C++ apx_fixed pour les opérateurs approximatifs et ct_float pour l'arithmétique flottante basse consommation, deux études consécutives sont proposées, basées sur des applications de traitement du signal complexes. Tout d'abord, les opérateurs approximatifs sont comparés à l'arithmétique virgule-fixe, et la supériorité de la virgule-fixe est soulignée. Enfin, la virgule fixe est comparée aux petits flottants dans des conditions équivalentes. En fonction des conditions applicatives, la virgule-flottante montre une compétitivité inattendue face à la virgule-fixe. 
Les résultats et discussions de cette thèse donnent un regard nouveau sur l'arithmétique approximative et suggèrent de nouvelles directions pour le futur des architectures efficaces en énergie. / As the physical limits of silicon-based computing are being reached, new ways have to be found to overcome the predicted end of Moore's law. Many applications can tolerate approximations in their computations at several levels without degrading the quality of their output, or degrading it in an acceptable way. This thesis focuses on approximate arithmetic architectures to seize this opportunity. Firstly, a critical study of state-of-the-art approximate adders and multipliers is presented. Then, a model for fixed-point error propagation leveraging power spectral density is proposed, followed by a model for bitwise-error rate propagation of approximate operators. Approximate operators are then used for the reproduction of voltage over-scaling effects in exact arithmetic operators. Leveraging our open-source framework ApxPerf and its synthesizable template-based C++ libraries apx_fixed for approximate operators, and ct_float for low-power floating-point arithmetic, two consecutive studies based on complex signal processing applications are presented. Firstly, approximate operators are compared to fixed-point arithmetic, and the superiority of fixed-point is highlighted. Secondly, fixed-point is compared to small-width floating-point in equivalent conditions. Depending on the application conditions, floating-point shows an unexpected competitiveness compared to fixed-point. The results and discussions of this thesis offer a fresh look at approximate arithmetic and suggest new directions for the future of energy-efficient architectures. Arithmétique interne des ordinateurs Arithmétique en virgule fixe Arithmétique en virgule flottante Computer arithmetic Fixed-Point arithmetic Floating-Point arithmetic
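To make the notion of an approximate operator concrete, the sketch below models a lower-part-OR adder, a well-known approximate adder from the literature, and measures its error exhaustively against exact addition. It is an assumption-laden illustration: it does not use ApxPerf, apx_fixed or ct_float, whether this particular adder is among those studied in the thesis is not stated in the abstract, and the function names are hypothetical.

```python
def loa_add(a: int, b: int, k: int) -> int:
    """Lower-part-OR addition of non-negative ints: the k low bits are approximated."""
    if k == 0:
        return a + b
    mask = (1 << k) - 1
    low = (a | b) & mask                          # bitwise OR replaces the low-order adder
    carry = (a >> (k - 1)) & (b >> (k - 1)) & 1   # carry guess from the top approximated bit
    high = (a >> k) + (b >> k) + carry            # exact adder on the upper bits
    return (high << k) | low

def error_stats(width: int, k: int) -> tuple[float, int]:
    """Exhaustive mean-squared error and worst-case absolute error for width-bit operands."""
    sq_sum, worst, n = 0, 0, 0
    for a in range(1 << width):
        for b in range(1 << width):
            err = loa_add(a, b, k) - (a + b)
            sq_sum += err * err
            worst = max(worst, abs(err))
            n += 1
    return sq_sum / n, worst

for k in range(0, 5):
    print(k, error_stats(8, k))
```

Sweeping `k` trades accuracy for a shorter exact carry chain; an accuracy-energy study would pair error figures like these with hardware cost estimates, which this sketch does not attempt to model.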
66	Approche arithmétique RNS de la cryptographie asymétrique / RNS arithmetic approach of asymmetric cryptography Eynard, Julien 28 May 2015 Cette thèse se situe à l'intersection de la cryptographie et de l'arithmétique des ordinateurs. Elle traite de l'amélioration de primitives cryptographiques asymétriques en termes d'accélération des calculs et de protection face aux attaques par fautes par le biais particulier de l'utilisation des systèmes de représentation des nombres par les restes (RNS). Afin de contribuer à la sécurisation de la multiplication modulaire, opération centrale en cryptographie asymétrique, un nouvel algorithme de réduction modulaire doté d'une capacité de détection de faute est présenté. Une preuve formelle garantit la détection des fautes sur un ou plusieurs résidus pouvant apparaître au cours d'une réduction. De plus, le principe de cet algorithme est généralisé au cas d'une arithmétique dans un corps fini non premier. Ensuite, les RNS sont exploités dans le domaine de la cryptographie sur les réseaux euclidiens. L'objectif est d'importer dans ce domaine certains avantages des systèmes de représentation par les restes dont l'intérêt a déjà été montré pour une arithmétique sur GF(p) notamment. Le premier résultat obtenu est une version en représentation hybride RNS-MRS de l'algorithme du « round-off » de Babai. Puis une technique d'accélération est introduite, permettant d'aboutir dans certains cas à un algorithme entièrement RNS pour le calcul d'un vecteur proche. / This thesis is at the crossroads between cryptography and computer arithmetic. It deals with the enhancement of cryptographic primitives with regard to computation acceleration and protection against fault injections through the use of residue number systems (RNS) and their associated arithmetic. 
To help secure modular multiplication, which is a core operation for many asymmetric cryptographic primitives, a new modular reduction algorithm with fault detection capability is presented. A formal proof guarantees that faults affecting one or more residues during a modular reduction are detected. Furthermore, this approach is generalized to an arithmetic dedicated to non-prime finite fields F_{p^s}. Afterwards, RNS are used in the area of lattice-based cryptography. The aim is to exploit the acceleration properties enabled by RNS, as is widely done for finite field arithmetic. As a first result, a new version of Babai’s round-off algorithm based on a hybrid RNS-MRS representation is presented. Then, a new and specific acceleration technique makes it possible to create a full RNS algorithm computing a close lattice vector. Cryptographie asymétrique Arithmétique des ordinateurs RNS Attaque par faute Réseaux euclidiens Asymmetric cryptography Computer arithmetic 005.8
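For readers unfamiliar with residue number systems, the following minimal sketch shows why they are attractive for acceleration: a multiplication splits into small independent channels, and the integer result is recovered with the Chinese remainder theorem. The moduli are illustrative assumptions and far too small for cryptographic use; the fault-detecting modular reduction and the Babai round-off variants from the thesis are not reproduced here.

```python
from math import prod

MODULI = (13, 17, 19, 23)   # pairwise coprime; illustrative values only
M = prod(MODULI)            # integers in [0, M) have a unique residue representation

def to_rns(x: int) -> tuple:
    """Represent x by its residues modulo each base element."""
    return tuple(x % m for m in MODULI)

def rns_mul(xs: tuple, ys: tuple) -> tuple:
    """Channel-wise product: no carry propagates between residues."""
    return tuple((x * y) % m for x, y, m in zip(xs, ys, MODULI))

def from_rns(residues: tuple) -> int:
    """Chinese remainder theorem reconstruction of the value in [0, M)."""
    total = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)   # pow(..., -1, m): inverse mod m (Python 3.8+)
    return total % M

x, y = 123, 456
print(from_rns(rns_mul(to_rns(x), to_rns(y))), x * y)
```

The result is only correct while the true product stays below `M`; operations such as comparison or modular reduction need extra machinery (base extensions, mixed-radix conversion), which is where the algorithms summarized above come in.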
67	Towards a modern floating-point environment / Vers l'environnement flottant moderne Kupriianova, Olga 11 December 2015 Cette thèse fait une étude sur deux moyens d'enrichir l'environnement flottant courant : le premier est d'obtenir plusieurs versions d'implantation pour chaque fonction mathématique, le deuxième est de fournir des opérations de la norme IEEE754, qui permettent de mélanger les entrées et la sortie dans les bases différentes. Comme la quantité de versions différentes pour chaque fonction mathématique est énorme, ce travail se concentre sur la génération du code. Notre générateur de code adresse une large variété de fonctions: il produit les implantations paramétrées pour les fonctions définies par l'utilisateur. Il peut être vu comme un générateur de fonctions boîtes-noires. Ce travail inclut un nouvel algorithme pour le découpage de domaine et une tentative de remplacer les branchements pendant la reconstruction par un polynôme. Le nouveau découpage de domaines produit moins de sous-domaines et les degrés polynomiaux sur les sous-domaines adjacents ne varient pas beaucoup. Pour fournir les implantations vectorisables il faut éviter les branchements if-else pendant la reconstruction. Depuis la révision de la norme IEEE754 en 2008, il est devenu possible de mélanger des nombres de différentes précisions dans une opération. Par contre, il n'y a aucun mécanisme qui permettrait de mélanger les nombres dans des bases différentes dans une opération. La recherche dans l'arithmétique en base mixte a commencé par les pires cas pour le FMA. Un nouvel algorithme pour convertir une suite de caractères décimaux de longueur arbitraire en nombre flottant binaire est présenté. Il est indépendant du mode d'arrondi actuel et produit un résultat correctement arrondi. / This work investigates two ways of enlarging the current floating-point environment. 
The first is to support several implementation versions of each mathematical function (elementary such as exp or log and special such as erf or Γ), the second one is to provide IEEE754 operations that mix the inputs and the output of different radices. As the number of possible implementations for each mathematical function is large, this work is focused on code generation. Our code generator supports a wide variety of functions: it generates parametrized implementations for user-specified functions. So it may be considered as a black-box function generator. This work contains a novel algorithm for domain splitting and an approach to replace branching on reconstruction by a polynomial. This new domain splitting algorithm produces fewer subdomains, and the polynomial degrees on adjacent subdomains do not change much. To produce vectorizable implementations, if-else statements on the reconstruction step have to be avoided. Since the revision of the IEEE754 Standard in 2008 it is possible to mix numbers of different precisions in one operation. However, there is no mechanism that allows users to mix numbers of different radices in one operation. This research starts an examination of mixed-radix arithmetic with the worst cases search for FMA. A novel algorithm to convert a decimal character sequence of arbitrary length to a binary floating-point number is presented. It is independent of the currently set rounding mode and produces correctly rounded results. Arithmétique des ordinateurs Virgule flottante Fonctions élémentaires Générateur de code Metalibm Arithmétique en base mixte Computer arithmetic Floating-point numbers Elementary functions 004
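What "correctly rounded" means for decimal-to-binary conversion can be shown with a slow reference computation. The sketch below is not the algorithm from the thesis (which the abstract does not detail); it is an oracle built on exact rational arithmetic, it only covers round-to-nearest-even, and it relies on the assumption that CPython's integer true division is correctly rounded. The function name is hypothetical.

```python
from fractions import Fraction

def decimal_to_binary64(text: str) -> float:
    """Reference conversion of a decimal string of any length to binary64.
    Assumption: round-to-nearest-even only; other rounding modes are not modelled."""
    exact = Fraction(text)   # parses strings such as "0.1" or "1.5e-3" without error
    return float(exact)      # one rounding, performed on the exact rational value

samples = [
    "0.1",
    "1e23",
    "3.14159265358979323846264338327950288",
    "123456789012345678901234567890.123456789",
]
for s in samples:
    print(s, decimal_to_binary64(s))
```

Because the whole input is kept exactly before the single rounding step, extra digits beyond the 17 significant digits of binary64 can still influence the result, which is precisely the situation an arbitrary-length conversion algorithm has to handle efficiently.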
68	Implementation of adaptive digital FIR and reprogrammable mixed-signal filters using distributed arithmetic Huang, Walter 12 November 2009 When computational resources are limited, especially multipliers, distributed arithmetic (DA) is used in lieu of the typical multiplier-based filtering structures. However, DA is not well suited for adaptive applications. The bottleneck is updating the memory table. Several attempts have been made to accelerate updating the memory, but at the expense of additional memory usage and of convergence speed. To develop an adaptive DA filter with an uncompromised convergence rate, the memory table must be fully updated. In this research, an efficient method for fully updating a DA memory table is proposed. The proposed update method is based on exploiting the temporal locality of the stored data and subexpression sharing. The proposed update method reduces the computational workload and requires no additional memory resources. DA using the proposed update method is called conjugate distributed arithmetic. Filters can also be constructed from analog components. Often, for lower precision computations, analog circuits use less power and less chip area than their digital counterparts. However, digital components are often used because of their ease of reprogrammability. Achieving such reprogrammability in analog is possible, but at the expense of additional chip area. A reprogrammable mixed-signal DA finite impulse response (FIR) filter is proposed to address the issues with reprogrammable analog FIR filters, namely the difficulty of constructing compact reprogrammable filtering structures, non-symmetric and imprecise filter coefficients, inconsistent sampling of the input data, and input sample data corruption. These issues are successfully addressed using distributed arithmetic, digital registers, and epots. 
Also, a mixed-signal DA second-order section (SOS), which is used as the building block for higher order infinite impulse response filters, is proposed. The issues with an analog SOS filter are similar to those of an analog FIR filter: the lack of a compact reprogrammable filtering structure, the imprecise filter coefficients, the inconsistent sampling of the data, and the corruption of the data samples. These issues are successfully addressed using distributed arithmetic and digital registers. Mixed-signal implementations Reprogrammable Distributed arithmetic Adaptive filtering implementations Adaptive filters Adaptive signal processing Computer arithmetic and logic units Computer algorithms
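The basic distributed arithmetic scheme that the abstract builds on can be sketched as follows: all partial sums of the coefficients are precomputed in a table, and the inner product is then accumulated one input bit-plane at a time, with no multiplier. This is the textbook multiplier-free formulation under assumed integer coefficients and two's-complement samples; it does not implement the proposed table-update method or the mixed-signal circuits, and the function names are hypothetical.

```python
def build_da_table(coeffs):
    """table[addr] = sum of the coefficients whose bit is set in addr."""
    n = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << n)]

def da_inner_product(coeffs, samples, bits):
    """Inner product of integer coeffs with two's-complement samples of width `bits`."""
    table = build_da_table(coeffs)
    acc = 0
    for j in range(bits):
        # Bit j of every sample forms the table address for this bit-plane.
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> j) & 1) << k
        term = table[addr] << j
        # The most significant bit carries negative weight in two's complement.
        acc = acc - term if j == bits - 1 else acc + term
    return acc

coeffs = [3, -5, 7, 2]
samples = [10, -20, 30, -128]
print(da_inner_product(coeffs, samples, bits=8))
print(sum(c * x for c, x in zip(coeffs, samples)))
```

The table has 2^N entries for N taps and depends on every coefficient, so an adaptive filter that changes its coefficients must rewrite it; that cost is the bottleneck the abstract refers to.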
69	Algorithmes et arithmétique pour l'implémentation de couplages cryptographiques / Algorithms and arithmetic for the implementation of cryptographic pairings Estibals, Nicolas 30 October 2013 Pairings are cryptographic primitives that are now used in many protocols. It is therefore necessary to study their computation and their efficient implementation. To this end, we rely on an algorithmic and arithmetic study of these mathematical functions. Pairings are bilinear maps defined on algebraic curves, more specifically, in the case of interest here, elliptic and hyperelliptic curves. We chose to focus on a subfamily of these: supersingular curves, whose properties yield both symmetric pairings and efficient algorithms for computing them. We then describe a unified approach for deriving a wide variety of algorithms that compute pairings. In particular, we apply it to the construction of a new algorithm for computing pairings on supersingular curves of genus 2 in characteristic 2. The computations required by the pairings we describe rely on the implementation of fast arithmetic for finite fields of small characteristic: multiplication is the critical operation to optimize. We therefore present an exhaustive search algorithm for multiplication formulas. Finally, we apply all of the preceding methods to the design and implementation of several hardware accelerators for computing pairings on different curves, with architectures optimized either for speed or for compactness. cryptography pairing hardware operator arithmetic computer architecture
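Since multiplication in finite fields of small characteristic is named as the critical operation, a minimal software model of multiplication in a binary field GF(2^m) may help. The sketch uses the schoolbook shift-and-XOR method with interleaved reduction; the field GF(2^8) and the reduction polynomial x^8 + x^4 + x^3 + x + 1 in the usage lines are assumptions chosen only because they give a well-known check value, not the fields or the optimized multiplication formulas targeted by the thesis.

```python
def gf2m_mul(a: int, b: int, poly: int, m: int) -> int:
    """Multiply a and b in GF(2^m); bit i of an int is the coefficient of x^i.
    `poly` encodes the degree-m irreducible reduction polynomial."""
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a      # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1              # multiply a by x
        if a >> m:           # degree reached m: subtract (XOR) the modulus
            a ^= poly
    return result

# Illustrative field: GF(2^8) with x^8 + x^4 + x^3 + x + 1 (0x11B).
print(hex(gf2m_mul(0x57, 0x83, 0x11B, 8)))
```

This bit-serial loop costs m iterations per product; subquadratic multiplication formulas, of the kind an exhaustive search can discover, reduce the number of sub-multiplications needed when the operands are split into smaller pieces.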
70	Tools for the Design of Reliable and Efficient Functions Evaluation Libraries / Outils pour la conception de bibliothèques de calcul de fonctions efficaces et fiables Torres, Serge 22 September 2016 La conception des bibliothèques d’évaluation de fonctions est une activité complexe qui requiert beaucoup de soin et d’application, particulièrement lorsque l’on vise des niveaux élevés de fiabilité et de performances. En pratique et de manière habituelle, on ne peut se livrer à ce travail sans disposer d’outils qui guident le concepteur dans le dédale d’un espace de solutions étendu et complexe mais qui lui garantissent également la correction et la quasi-optimalité de sa production. Dans l’état actuel de l’art, il nous faut encore plutôt raisonner en termes de « boite à outils » d’où le concepteur doit tirer et combiner des mécanismes de base, au mieux de ses objectifs, plutôt qu’imaginer que l’on dispose d’un dispositif à même de résoudre automatiquement tous les problèmes. Le présent travail s’attache à la conception et la réalisation de tels outils dans deux domaines : ∙ la consolidation du test d’arrondi de Ziv utilisé, jusqu’à présent de manière plus ou moins empirique, dans l’implantation des approximations de fonction ; ∙ le développement d’une implantation de l’algorithme SLZ dans le but de résoudre le « Dilemme du fabricant de table » dans le cas de fonctions ayant pour opérandes et pour résultat approché des nombres flottants en quadruple précision (format Binary128 selon la norme IEEE-754). / The design of function evaluation libraries is a complex task that requires great care and dedication, especially when one wants to satisfy high standards of reliability and performance. In actual practice, it cannot be correctly performed, as a routine operation, without tools that not only help the designer find a way through a complex and extended solution space but also guarantee that the solutions are correct and (almost) optimal. 
In the present state of the art, one has to think in terms of a “toolbox” from which one can smartly mix and match the tools that best fit one's goals, rather than expect to have at hand a solve-all automatic device. The work presented here is dedicated to the design and implementation of such tools in two realms: ∙ the consolidation of Ziv’s rounding test, which has so far been used in a more or less empirical way in the implementation of function approximations; ∙ the development of an implementation of the SLZ algorithm in order to solve the Table Maker's Dilemma for functions with quad-precision floating-point (IEEE-754 Binary128 format) arguments and images. Arithmétique des ordinateurs Approximation de fonctions Arithmétique flottante Dilemme du fabricant de tables Arrondi correct Computer arithmetic Function approximation Floating-point arithmetic Table Maker's Dilemma Correct rounding
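The idea behind a Ziv-style rounding test can be sketched in a few lines: evaluate the function with some working precision and a known error bound, and accept the result only if both ends of the error interval round to the same floating-point number; otherwise retry with more precision. The sketch below is a loose software model under stated assumptions (Python's `decimal` module as the multi-precision evaluator, binary64 as the target format, exp as the function); it is not the consolidated test from the thesis, and the function name is hypothetical.

```python
from decimal import Decimal, localcontext
from fractions import Fraction

def correctly_rounded_exp(x: float, start_prec: int = 20, max_prec: int = 320) -> float:
    """Round-to-nearest exp(x) via a retry loop on the working precision."""
    prec = start_prec
    while prec <= max_prec:
        with localcontext() as ctx:
            ctx.prec = prec
            y = Decimal(x).exp()                     # approximation at `prec` digits
            lo, hi = y.next_minus(), y.next_plus()   # interval assumed to enclose exp(x)
        r_lo, r_hi = float(Fraction(lo)), float(Fraction(hi))
        if r_lo == r_hi:      # rounding test passed: the result is unambiguous
            return r_lo
        prec *= 2             # hard case: the interval straddles a rounding boundary
    raise ArithmeticError("precision limit reached before rounding could be decided")

print(correctly_rounded_exp(1.0))
print(correctly_rounded_exp(0.0))
```

How much precision the retry loop can ever need is exactly the Table Maker's Dilemma: worst-case searches such as those performed with the SLZ algorithm aim to bound it in advance, so that `max_prec` becomes a proven limit instead of a guess.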
