  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

The Cell Processor

Hoefler, Torsten 07 March 2006
Mainstream processor development is mostly driven by compatibility and continuity. As a result, x86-compatible CPUs have dominated the processor market for more than two decades. Several new concepts have tried to gain market share, but none has displaced the established, compatibility-driven designs. A group of three corporations is now trying a different route into the market with a new idea, the Cell design. The Cell processor is a new attempt to use the growing number of transistors per die efficiently. It targets the game console and consumer electronics markets, where it is meant to improve the quality of these devices. This should lead to wide adoption, and once everybody has two or more Cell processors in a TV, game console, or PDA, an interesting question arises: what can I do with these processors? This paper gives a short overview of the architecture and of several programming ideas that help to exploit the full processing power of the Cell processor.
82

Analytical Query Processing Using Heterogeneous SIMD Instruction Sets

Ungethüm, Annett 30 October 2020
Numerous applications gather increasing amounts of data, which have to be managed and queried. Different hardware developments help to meet this challenge. The growing capacity of main memory enables database systems to keep all their data in memory. Additionally, the hardware landscape is becoming more diverse. A plethora of homogeneous and heterogeneous co-processors is available, where heterogeneity refers not only to different computing power but also to different instruction set architectures. For instance, modern Intel® CPUs offer different instruction sets supporting the Single Instruction Multiple Data (SIMD) paradigm, e.g. SSE, AVX, and AVX512. Database systems have started to exploit SIMD to increase performance. However, this is still a challenging task, because existing algorithms were mainly developed for scalar processing and because there is a huge variety of instruction sets, which were never standardized and have no unified interface. Porting a system to another hardware architecture therefore requires completely rewriting the source code, even if the architectures are not fundamentally different and are designed by the same company. Moreover, operations on large registers, which are the core principle of SIMD processing, behave counter-intuitively in several cases. This is especially true for analytical query processing, where different memory access patterns and data dependencies caused by the compression of data challenge the limits of the SIMD principle. Finally, there are physical constraints on the use of such instructions that affect CPU frequency scaling, which is further influenced by the use of multiple cores. This is because the supply power of a CPU is limited, such that not all transistors can be powered at the same time. Hence, there is a complex relationship between performance and power, and therefore also between performance and energy consumption.
This thesis addresses the specific challenges introduced by the application of SIMD in general and by the heterogeneity of SIMD ISAs in particular. Its goal is to exploit the potential of heterogeneous SIMD ISAs to increase both performance and energy efficiency.
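To make the "no unified interface" problem concrete, the following is a minimal sketch in pure Python that emulates a `column < bound` scan written once and parameterized by lane count, so the same logic can stand in for 128-bit, 256-bit, and 512-bit registers. The names `make_filter_kernel` and `KERNELS` are hypothetical and do not come from the thesis; the lists only imitate vector loads and compares and say nothing about real performance.

```python
from typing import Callable, Dict, List


def make_filter_kernel(lanes: int) -> Callable[[List[int], int], List[int]]:
    """Build a 'value < bound' scan that handles `lanes` values per step.

    Hypothetical helper: each slice stands in for one vector register.
    """
    def kernel(column: List[int], bound: int) -> List[int]:
        hits: List[int] = []
        full = len(column) - len(column) % lanes
        for base in range(0, full, lanes):
            block = column[base:base + lanes]      # emulates one vector load
            mask = [v < bound for v in block]      # emulates one vector compare
            hits.extend(base + i for i, m in enumerate(mask) if m)
        for i in range(full, len(column)):         # scalar loop for the remainder
            if column[i] < bound:
                hits.append(i)
        return hits
    return kernel


# Lane counts for 32-bit values in 128-, 256-, and 512-bit registers.
KERNELS: Dict[str, Callable[[List[int], int], List[int]]] = {
    "128-bit": make_filter_kernel(4),
    "256-bit": make_filter_kernel(8),
    "512-bit": make_filter_kernel(16),
}

if __name__ == "__main__":
    column = [5, 12, 3, 40, 7, 7, 19, 1, 0, 33, 8]
    for name, kernel in KERNELS.items():
        print(name, kernel(column, 8))
```

All three kernels return the same row positions. Only the lane count differs, which is the property a unified interface would need to preserve across real instruction sets.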
83

Bit-parallel and SIMD alignment algorithms for biological sequence analysis

Loving, Joshua 21 November 2017
High-throughput next-generation sequencing techniques have hugely decreased the cost and increased the speed of sequencing, resulting in an explosion of sequencing data. This motivates the development of high-efficiency sequence alignment algorithms. In this thesis, I present multiple bit-parallel and Single Instruction Multiple Data (SIMD) algorithms that greatly accelerate the processing of biological sequences. The first chapter describes the BitPAl bit-parallel algorithms for global alignment with general integer scoring, which assigns integer weights for match, mismatch, and insertion/deletion. The bit-parallel approach represents individual cells in an alignment scoring matrix as bits in computer words and emulates the calculation of scores by a series of logic operations. Bit-parallelism has previously been applied to other pattern matching problems, producing fast algorithms. In timed tests, we show that BitPAl runs 7 to 25 times faster than a standard iterative algorithm. The second part involves two approaches to alignment with substitution scoring, which assigns a potentially different substitution weight to every pair of alphabet characters, better representing the relative rates of different mutations. The first approach extends the existing BitPAl method. The second approach is a new SIMD algorithm that uses partial sums of adjacent score differences. I present a simple partial sum method as well as one that uses parallel scan for additional acceleration. Results demonstrate that these algorithms are significantly faster than existing SIMD dynamic programming algorithms. Finally, I describe two extensions to the partial sums algorithm. The first adds support for affine gap penalty scoring. Affine gap scoring reflects the biological observation that gaps are more likely to be contiguous than to be distributed throughout a region; it does so by introducing a gap opening penalty and a gap extension penalty.
The second extension is an algorithm that uses the partial sums method to calculate the tandem alignment of a pattern against a text sequence using a single pattern copy. Next generation sequencing data provides a wealth of information to researchers. Extracting that information in a timely manner increases the utility and practicality of sequence analysis algorithms. This thesis presents a family of algorithms which provide alignment scores in less time than previous algorithms.
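The bit-parallel idea described above, storing matrix cells as bits and updating a whole column with a few logic operations, can be illustrated with a short sketch. This is not BitPAl: it is the well-known Myers/Hyyrö bit-vector algorithm for unit-cost edit distance, chosen here only because it is compact. The function name is hypothetical, and a single Python integer plays the role of the computer word.

```python
def bit_parallel_edit_distance(pattern: str, text: str) -> int:
    """Unit-cost edit distance using the Myers/Hyyro bit-vector recurrences.

    Each bit of vp/vn records whether the vertical score difference between
    adjacent cells in the current matrix column is +1 or -1.
    """
    m = len(pattern)
    if m == 0:
        return len(text)
    mask = (1 << m) - 1          # keeps every vector m bits wide
    last = 1 << (m - 1)          # bit for the bottom row of the matrix
    peq = {}                     # per-character match masks for the pattern
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)

    vp, vn, score = mask, 0, m   # first column: differences are all +1
    for ch in text:
        eq = peq.get(ch, 0)
        d0 = ((((eq & vp) + vp) ^ vp) | eq | vn) & mask   # diagonal zeros
        hp = (vn | ~(d0 | vp)) & mask                      # horizontal +1
        hn = d0 & vp                                       # horizontal -1
        if hp & last:
            score += 1
        elif hn & last:
            score -= 1
        hp = ((hp << 1) | 1) & mask   # top row grows by 1 per text character
        hn = (hn << 1) & mask
        vp = (hn | ~(d0 | hp)) & mask
        vn = hp & d0
    return score


if __name__ == "__main__":
    print(bit_parallel_edit_distance("kitten", "sitting"))
```

Every text character costs a fixed number of word operations regardless of pattern length (as long as the pattern fits in a word), which is where the speedup over cell-by-cell dynamic programming comes from.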
84

Akcelerace detekce objektů pomocí klasifikátorů / Acceleration of Object Detection Using Classifiers

Juránek, Roman January 2012
Object detection in computer vision is a complex task. A very popular and widespread detection method uses statistical classifiers and scanning windows. The AdaBoost algorithm (or a modification of it) is often used to train the classifiers, because it achieves a high detection rate and a low number of false detections and is suitable for real-time detection. Object detection can be implemented in various ways, and the properties of a particular architecture can be exploited to speed up detection. Acceleration can rely on graphics processors, multi-core architectures, SIMD instructions, or programmable hardware. This thesis presents an optimization method that improves object detection performance with respect to a user-specified cost function. The method splits a pre-trained classifier into several different implementations so that the total cost of classification is minimized. The method is verified in a basic experiment in which the classifier is split between a preprocessing unit in an FPGA and a unit in a standard PC.
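As a rough illustration of splitting a classifier between two implementations under a cost function, the sketch below brute-forces the split point of a staged classifier. It is a toy model and not the method of the thesis: the function `best_split`, the per-stage costs, the survival fractions, and the transfer cost are all hypothetical inputs invented for the example.

```python
from typing import List, Tuple


def best_split(cost_a: List[float], cost_b: List[float],
               reach: List[float], transfer: float) -> Tuple[int, float]:
    """Return (k, cost): stages [0, k) run on unit A, stages [k, n) on unit B.

    reach[i] is the assumed fraction of scanning windows still alive when
    stage i starts; `transfer` is the assumed cost of handing one surviving
    window from A to B. All quantities are hypothetical model inputs.
    """
    n = len(cost_a)
    best_k, best_cost = 0, float("inf")
    for k in range(n + 1):
        total = sum(reach[i] * cost_a[i] for i in range(k))
        total += sum(reach[i] * cost_b[i] for i in range(k, n))
        if 0 < k < n:                     # a hand-over happens only for a real split
            total += transfer * reach[k]
        if total < best_cost:
            best_k, best_cost = k, total
    return best_k, best_cost


if __name__ == "__main__":
    # Made-up numbers: early stages are cheap on A, later stages cheaper on B.
    k, cost = best_split(cost_a=[1, 1, 8, 8], cost_b=[4, 4, 4, 4],
                         reach=[1.0, 0.5, 0.1, 0.01], transfer=2.0)
    print(k, round(cost, 2))
```

Swapping in a different cost model (for example energy instead of time) only changes the input lists, which mirrors the idea of optimizing against a cost function supplied by the user.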
85

A Multiple Associative Computing Model to Support the Execution of Data Parallel Branches Using the Manager-worker Paradigm

Chantamas, Wittaya 01 December 2009
No description available.
86

Supporting Applications Involving Irregular Accesses and Recursive Control Flow on Emerging Parallel Environments

Huo, Xin 14 November 2014
No description available.
87

Vectorization and Register Reuse in High Performance Computing

Stock, Kevin Alan January 2014
No description available.
88

A PAIRWISE COMPARISON OF DNA SEQUENCE ALIGNMENT USING AN OPENMP IMPLEMENTATION OF THE SWAMP PARALLEL SMITH-WATERMAN ALGORITHM

Cuevas, Tristan Lee 22 April 2015
No description available.
89

Parallel ILU Preconditioning for Structured Grid Matrices

Eisenlohr, John Merrick 20 May 2015
No description available.
90

Enabling Task Parallelism on Hardware/Software Layers using the Polyhedral Model

Kong, Martin Richard 09 June 2016
No description available.
