1 |
MIMO Multiplierless FIR System
Imran, Muhammad; Khursheed, Khursheed January 2009
The main aim of this thesis is to minimize the number of operations, and the energy consumed per operation, in the arithmetic part of DSP circuits such as finite impulse response (FIR) filters, the discrete cosine transform (DCT), and the discrete Fourier transform (DFT). More specifically, the focus is on eliminating the most frequent common sub-expression (CSE) in the binary, canonic signed digit (CSD), two's complement, or signed digit representation of the coefficients of a non-recursive multiple-input multiple-output (MIMO) FIR system, which can be realized using shift-and-add operations only. The possibilities for reducing the complexity, i.e. the chip area, and the energy consumption have been investigated.
We propose an algorithm that finds the most common sub-expression in the binary, CSD, two's complement, or signed digit representation of the coefficients of non-recursive MIMO multiplierless FIR systems, and we have implemented it in MATLAB. We also propose different tie-breakers for selecting the most frequent common sub-expression when several patterns are equally frequent; the choice affects the complexity (area and power consumption) of the overall system. One tie-breaker selects the pattern that results in the minimum number of delay elements, which reduces the area of the overall system. Another selects the pattern that results in the minimum adder depth (the number of cascaded adders). Minimum adder depth results in the fewest glitches, which are the main contributor to power consumption in MIMO multiplierless FIR systems. Switching activity increases when glitches propagate to subsequent adders, which happens when the adder depth is high. Because power consumption is proportional to the switching activity, we use the sub-expression that results in the lowest adder depth for the overall system.
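The pattern search described in this abstract can be illustrated with a minimal Python sketch (the thesis itself used MATLAB). It converts integer coefficients to CSD and counts two-digit patterns, keyed by the distance between the two non-zero digits and their relative sign. The function names, the example coefficients, and the pattern key are assumptions made for illustration; the sketch ignores delays, overlapping occurrences, and the tie-breakers, so it is not the authors' algorithm.

```python
from collections import Counter


def to_csd(value):
    """Return the CSD digits (-1, 0, 1) of an integer, least significant first."""
    digits = []
    while value != 0:
        if value % 2 == 0:
            digits.append(0)
        else:
            # +1 when value mod 4 == 1, -1 when value mod 4 == 3;
            # this choice guarantees that the next digit is zero.
            d = 2 - (value % 4)
            digits.append(d)
            value -= d
        value //= 2
    return digits


def pattern_counts(coeffs):
    """Count candidate two-digit patterns over all coefficients.

    A pattern is keyed by (distance between the digits, product of their signs),
    so a pattern and its negation count as the same sub-expression.
    Overlapping occurrences are all counted (a simplification).
    """
    counts = Counter()
    for c in coeffs:
        nonzero = [(i, d) for i, d in enumerate(to_csd(c)) if d != 0]
        for a in range(len(nonzero)):
            for b in range(a + 1, len(nonzero)):
                (i, di), (j, dj) = nonzero[a], nonzero[b]
                counts[(j - i, di * dj)] += 1
    return counts


if __name__ == "__main__":
    coeffs = [5, 21, 37, 7]  # hypothetical integer coefficients
    print(pattern_counts(coeffs).most_common(1))
```

A pattern such as `(2, 1)` stands for the sub-expression `x + (x << 2)`; computing it once and reusing it is what saves adders.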
|
3 |
Video Processing using multiplierless 2D-DCT with Algebraic Integers and MR-DCT
Nimmalapalli, Sushmabhargavi January 2018
No description available.
|
4 |
Optimal, Multiplierless Implementations of the Discrete Wavelet Transform for Image Compression Applications
Kotteri, Kishore 12 May 2004
The use of the discrete wavelet transform (DWT) for the JPEG2000 image compression standard has sparked interest in the design of fast, efficient hardware implementations of the perfect reconstruction filter bank used for computing the DWT. The accuracy and efficiency with which the filter coefficients are quantized in a multiplierless implementation impacts the image compression and hardware performance of the filter bank. A high precision representation ensures good compression performance, but at the cost of increased hardware resources and processing time. Conversely, lower precision in the filter coefficients results in smaller, faster hardware, but at the cost of poor compression performance. In addition to filter coefficient quantization, the filter bank structure also determines critical hardware properties such as throughput and power consumption.
This thesis first investigates filter coefficient quantization strategies and filter bank structures for the hardware implementation of the biorthogonal 9/7 wavelet filters in a traditional convolution-based filter bank. Two new filter bank properties—"no-distortion-mse" and "deviation-at-dc"—are identified as critical to compression performance, and two new "compensating" filter coefficient quantization methods are developed to minimize degradation of these properties. The results indicate that the best performance is obtained by using a cascade form for the filters with coefficients quantized using the "compensating zeros" technique. The hardware properties of this implementation are then improved by developing a cascade polyphase structure that increases throughput and decreases power consumption.
Next, this thesis investigates implementations of the lifting structure—an orthogonal structure that is more robust to coefficient quantization than the traditional convolution-based filter bank in computing the DWT. Novel, optimal filter coefficient quantization techniques are developed for a rational and an irrational set of lifting coefficients. The results indicate that the best quantized lifting coefficient set is obtained by starting with the rational coefficient set and using a "lumped scaling" and "gain compensation" technique for coefficient quantization.
Finally, the image compression properties and hardware properties of the convolution-based and lifting-based DWT implementations are compared. Although the lifting structure requires fewer computations, the cascaded arrangement of the lifting filters requires significant hardware overhead. Consequently, the results show that the convolution-based cascade polyphase structure (with "z₁-compensated" coefficients) gives the best performance in terms of image compression and hardware metrics such as throughput, latency, and power consumption. / Master of Science
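One commonly cited reason the lifting structure tolerates coefficient quantization is that each lifting step can be undone exactly, whatever value its coefficient takes. The Python sketch below shows this with a single predict/update pair whose coefficients are rounded to a few fractional bits. The coefficient values, the periodic boundary handling, and all function names are illustrative assumptions; this is not the 9/7 filter bank or the quantization methods developed in the thesis.

```python
from fractions import Fraction


def quantize(c, bits):
    """Round c to the nearest multiple of 2**-bits (a shift-friendly value)."""
    return Fraction(round(c * 2**bits), 2**bits)


def forward(x, p, u):
    """One predict step and one update step on an even-length signal."""
    even, odd = x[0::2], x[1::2]
    n = len(odd)
    # Predict: detail = odd sample minus p times its two even neighbours.
    d = [odd[i] - p * (even[i] + even[(i + 1) % n]) for i in range(n)]
    # Update: smooth = even sample plus u times its two detail neighbours.
    s = [even[i] + u * (d[i] + d[(i - 1) % n]) for i in range(n)]
    return s, d


def inverse(s, d, p, u):
    """Undo the update step, then the predict step, in reverse order."""
    n = len(d)
    even = [s[i] - u * (d[i] + d[(i - 1) % n]) for i in range(n)]
    odd = [d[i] + p * (even[i] + even[(i + 1) % n]) for i in range(n)]
    x = [None] * (2 * n)
    x[0::2], x[1::2] = even, odd
    return x


if __name__ == "__main__":
    signal = [3, 7, 1, 8, 2, 9, 4, 6]
    for bits in (2, 4, 8):
        p = quantize(1.586, bits)   # arbitrary illustrative values,
        u = quantize(-0.053, bits)  # not taken from the thesis
        s, d = forward(signal, p, u)
        print(bits, inverse(s, d, p, u) == signal)
```

Coarser quantization changes the sub-band outputs `s` and `d`, and therefore the compression behaviour, but reconstruction stays exact. That trade-off is what the quantization study above addresses.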
|
5 |
Design of digital filters using genetic algorithms
Ahmad, Sabbir U. 17 December 2008
In recent years, genetic algorithms (GAs) have begun to be used in many disciplines, such as pattern recognition, robotics, biology, and medicine, to name just a few. GAs are based on Darwin's principle of natural selection, which happens to be a slow process and, as a result, these algorithms tend to require a large amount of computation. However, they also offer certain advantages over classical gradient-based optimization algorithms such as steepest-descent and Newton-type algorithms. For example, having located local suboptimal solutions, they can discard them in favor of more promising local solutions and, therefore, they are more likely to obtain better solutions in multimodal problems. By contrast, classical optimization algorithms, though very efficient, are not equipped to discard inferior local solutions in favor of better ones.
This dissertation is concerned with the design of several types of digital filters by using GAs as detailed below.
In Chap. 2, two approaches for the design of fractional delay (FD) filters based on a GA are developed. The approaches exploit the advantages of a global search technique to determine the coefficients of FD FIR and allpass-IIR filters based on the so-called Farrow structure. The GA approach was compared with a least-squares approach and was found to lead to improvements in the amplitude response and/or delay characteristic.
In Chap. 3, a GA-based approach is developed for the design of delay equalizers. In this approach, the equalizer coefficients are optimized using an objective function based on the passband filter-equalizer group delay. The required equalizer is built by adding new second-order sections until the desired accuracy in terms of the flatness of the group delay with respect to the passband is achieved. With this approach stable delay equalizers satisfying arbitrary prescribed specifications with the desired degree of group-delay flatness can easily be obtained.
In Chap. 4, a GA-based approach for the design of multiplierless FIR filters is developed. A recently-introduced GA, called orthogonal GA (OGA) based on the so-called experimental design technique, is exploited to obtain fixed-point implementations of linear-phase FIR filters. In this approach, the effects of finite word length are minimized by considering the filter as a cascade of two sections. The OGA leads to an improved amplitude response relative to that of an equivalent direct-form cascade filter obtained using the Remez exchange algorithm.
In Chap. 5, a multiobjective GA for the design of asymmetric FIR filters is proposed. This GA uses a specially tailored elitist nondominated sorting GA (ENSGA) to obtain so-called Pareto-optimal solutions for the problem at hand. Flexibility is introduced in the design by imposing phase-response linearity only in the passband instead of the entire baseband as in conventional designs. Three objective functions based on the amplitude-response error and the flatness of the group-delay characteristic are explored in the design examples considered. When compared with a WLS design method, the ENSGA was found to lead to improvements in the amplitude response and passband group-delay characteristic.
In Chap. 6, a hybrid approach for the design of IIR filters using a GA along with a quasi-Newton (QN) algorithm is developed. The hybrid algorithm, referred to as the genetic quasi-Newton (GQN) algorithm, combines the flexibility and reliability inherent in the GA with the fast convergence and precision of the QN algorithm. The GA is used as a global search tool to explore different regions in the parameter space, whereas the QN algorithm exploits the efficiency of a gradient-based algorithm in locating local solutions. The GQN algorithm works well with an arbitrary random initialization, and filters that satisfy prescribed amplitude-response specifications can easily be designed.
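To make the Chap. 4 idea concrete, the following Python sketch runs a plain elitist GA over fixed-point coefficients (integers scaled by 2^-bits) of a symmetric, linear-phase FIR low-pass filter. It is a generic GA, not the orthogonal GA (OGA) used in the dissertation. The band edges, population settings, mutation rule, and function names are all assumptions chosen for illustration.

```python
import math
import random


def amplitude(half, w):
    """Zero-phase response of an odd-length symmetric FIR filter.

    half[0] is the centre tap; half[k] is the tap at distance k from it.
    """
    return half[0] + 2 * sum(h * math.cos(k * w) for k, h in enumerate(half) if k > 0)


def cost(genes, scale, grid):
    """Largest deviation from the target response on the frequency grid."""
    half = [g / scale for g in genes]
    return max(abs(amplitude(half, w) - target) for w, target in grid)


def run_ga(n_half=4, bits=6, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    scale = 1 << bits
    # Assumed specification: passband [0, 0.3*pi], stopband [0.6*pi, pi].
    grid = [(i * 0.03 * math.pi, 1.0) for i in range(11)]
    grid += [(0.6 * math.pi + i * 0.04 * math.pi, 0.0) for i in range(11)]
    pop = [[rng.randint(-scale, scale) for _ in range(n_half)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda g: cost(g, scale, grid))
        history.append(cost(pop[0], scale, grid))
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_half)
            child = a[:cut] + b[cut:]            # one-point crossover
            i = rng.randrange(n_half)
            child[i] += rng.choice((-1, 1))      # mutate one tap by one LSB
            children.append(child)
        pop = elite + children
    best = min(pop, key=lambda g: cost(g, scale, grid))
    history.append(cost(best, scale, grid))
    return best, history


if __name__ == "__main__":
    best, history = run_ga()
    print("integer taps:", best, "max error:", history[-1])
```

Because the elite is carried over unchanged, the best cost never gets worse from one generation to the next; how close it gets to a good design depends on the assumed settings.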
|
6 |
Zjednodušené násobení v konvolučních neuronových sítích / Simplified Multiplication in Convolutional Neural Networks
Juhaňák, Pavel January 2019
This thesis provides an introduction to classical and convolutional neural networks. It describes how hardware multiplication is conventionally performed and optimized. A simplified multiplication method is proposed, namely multiplierless multiplication. This method is implemented and integrated into the TypeCNN library. The cost of the hardware solution of both conventional and simplified multipliers is estimated. The thesis also introduces software tools developed to work with convolutional neural networks and datasets used to test them in the image classification task. Test architectures and experimentation methodology are proposed. The results are evaluated, and both the classification accuracy and cost of the hardware solution are discussed.
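As a rough illustration of what multiplierless multiplication can look like, the Python sketch below approximates an integer constant by a limited number of signed powers of two and then multiplies using only shifts, additions, and subtractions. The greedy decomposition and the function names are assumptions made for this example; the abstract does not specify the method used in the thesis or how it is integrated into TypeCNN.

```python
def approximate_constant(c, max_terms):
    """Greedily express integer c as a sum of at most max_terms signed powers of two.

    Returns a list of (sign, shift) pairs meaning sign * 2**shift.
    """
    terms = []
    residual = c
    while residual != 0 and len(terms) < max_terms:
        sign = 1 if residual > 0 else -1
        mag = abs(residual)
        shift = mag.bit_length() - 1  # floor(log2(mag))
        # Use the next power of two if it is strictly closer to mag.
        if (1 << (shift + 1)) - mag < mag - (1 << shift):
            shift += 1
        terms.append((sign, shift))
        residual -= sign * (1 << shift)
    return terms


def shift_add_multiply(x, terms):
    """Multiply integer x by the constant described by terms without using '*' on x."""
    acc = 0
    for sign, shift in terms:
        shifted = x << shift
        acc = acc + shifted if sign > 0 else acc - shifted
    return acc


if __name__ == "__main__":
    exact = approximate_constant(115, 8)   # 128 - 16 + 2 + 1
    coarse = approximate_constant(115, 2)  # 128 - 16 = 112
    print(exact, shift_add_multiply(3, exact))
    print(coarse, shift_add_multiply(3, coarse))
```

Limiting `max_terms` trades accuracy for fewer adders, which mirrors the accuracy-versus-hardware-cost comparison the abstract describes.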
|
7 |
Hardware Implementation and Applications of Deep Belief Networks
Imbulgoda Liyangahawatte, Gihan Janith Mendis January 2016
No description available.
|
8 |
Multiple Constant Multiplication Optimization Using Common Subexpression Elimination and Redundant Numbers
Al-Hasani, Firas Ali Jawad January 2014
The multiple constant multiplication (MCM) operation is a fundamental operation in digital signal processing (DSP) and digital image processing (DIP). Examples of the MCM are in finite impulse response (FIR) and infinite impulse response (IIR) filters, matrix multiplication, and transforms.
The aim of this work is to minimize the complexity of the MCM operation using the common subexpression elimination (CSE) technique and redundant number representations. The CSE technique searches for and eliminates common digit patterns (subexpressions) among the MCM coefficients. More common subexpressions can be found by representing the MCM coefficients using redundant number representations.
A CSE algorithm is proposed that works on a type of redundant numbers called the zero-dominant set (ZDS). The ZDS is an extension of the representations with the minimum number of non-zero digits, called minimum Hamming weight (MHW) representations. Using the ZDS improves the performance of CSE algorithms compared with using the MHW representations. The disadvantage of using the ZDS is that it increases the possibility of overlapping patterns (digit collisions). In this case, one or more digits are shared between a number of patterns, and eliminating one pattern results in losing other patterns because the common digits are eliminated. A pattern preservation algorithm (PPA) is developed to resolve the overlapping patterns in the representations.
Tree and graph encoders are proposed to generate a larger space of number representations. The algorithms generate redundant representations of a value for a given digit set, radix, and wordlength. The tree encoder is modified to search for common subexpressions while the representation tree is being generated. A complexity measure is proposed to compare the subexpressions at each node. The algorithm stops generating the rest of the representation tree when it finds subexpressions with maximum sharing. This reduces the search space while minimizing the hardware complexity.
A combinatorial model of the MCM problem is proposed in this work. The model is obtained by enumerating all possible solutions of the MCM, which form a graph called the demand graph. Arc routing on this graph gives the solutions of the MCM problem. A similar form of arc routing arises in capacitated arc routing problems such as the winter salting problem. An ant colony optimization (ACO) metaheuristic is proposed to traverse the demand graph. The ACO is simulated on a PC using the Python programming language to verify the correctness of the model and the operation of the ACO. A parallel simulation of the ACO is carried out on a multi-core supercomputer using the C++ Boost Graph Library.
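The redundant representations that this abstract's encoders generate can be illustrated by brute force. The Python sketch below enumerates every radix-2 representation of a value over the digit set {-1, 0, 1} for a given wordlength and then picks out those with minimum Hamming weight. The exhaustive search and the function names are assumptions for illustration only; the tree and graph encoders, the ZDS, and the ACO model are not reproduced here.

```python
from itertools import product


def signed_digit_representations(value, wordlength):
    """All digit tuples (least significant first) over {-1, 0, 1} that encode value."""
    return [
        digits
        for digits in product((-1, 0, 1), repeat=wordlength)
        if sum(d * (1 << i) for i, d in enumerate(digits)) == value
    ]


def hamming_weight(digits):
    """Number of non-zero digits in a representation."""
    return sum(1 for d in digits if d != 0)


def minimum_hamming_weight(reps):
    """Keep only the representations with the fewest non-zero digits."""
    best = min(hamming_weight(r) for r in reps)
    return [r for r in reps if hamming_weight(r) == best]


if __name__ == "__main__":
    reps = signed_digit_representations(7, 4)
    for r in reps:
        print(r, "weight", hamming_weight(r))
    print("MHW:", minimum_hamming_weight(reps))
```

The search visits 3^wordlength candidates, so it is only practical for short wordlengths, which is part of why pruned encoders such as those proposed above are of interest.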
|