291 |
Paralelní trénování neuronových sítí pro rozpoznávání řeči / Parallel Training of Neural Networks for Speech Recognition. Veselý, Karel. January 2010
This thesis deals with several ways of parallelizing the training procedure for artificial neural networks. The networks are trained as phoneme-state acoustic descriptors for speech recognition. Two effective parallelization strategies were implemented and compared. The first strategy is data parallelization, where the training is split across several POSIX threads. The second strategy is node parallelization, which uses the CUDA framework for general-purpose computing on modern graphics cards. The first strategy showed a 4x speed-up, while the second achieved a speed-up of nearly 10x. The Stochastic Gradient Descent algorithm with error backpropagation was used for the training. After a short introduction, the second chapter of the thesis presents the motivation and places neural networks in the context of speech recognition. The third chapter is theoretical: it discusses the anatomy of a neural network and the training method used. The following chapters focus on the design and implementation of the project and describe the phases of its iterative development. The last, extensive chapter describes the setup of the testing system and reports the experimental results. Finally, the results are summarized and possible extensions of the project are proposed.
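The data-parallel strategy described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the linear model, the squared-error loss, the helper names (`shard_gradient`, `parallel_gradient`, `sgd_step`), the toy data, and the use of `ThreadPoolExecutor` in place of raw POSIX threads are all assumptions made for illustration. Each worker computes the gradient on its own shard of a mini-batch, and the partial gradients are summed before a single weight update.

```python
from concurrent.futures import ThreadPoolExecutor


def shard_gradient(weights, shard):
    """Sum of squared-error gradients for a linear model over one data shard."""
    grad = [0.0] * len(weights)
    for features, target in shard:
        prediction = sum(w * x for w, x in zip(weights, features))
        error = prediction - target
        for i, x in enumerate(features):
            grad[i] += error * x
    return grad


def parallel_gradient(weights, batch, n_workers=4):
    """Split the batch into shards, compute shard gradients in threads, sum them."""
    shards = [batch[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda s: shard_gradient(weights, s), shards))
    return [sum(column) for column in zip(*partials)]


def sgd_step(weights, batch, learning_rate=0.01, n_workers=4):
    """One SGD update using the mean gradient over the mini-batch."""
    grad = parallel_gradient(weights, batch, n_workers)
    return [w - learning_rate * g / len(batch) for w, g in zip(weights, grad)]


# Hypothetical toy data generated from y = 2*x0 - 1*x1.
batch = [((float(a), float(b)), 2.0 * a - 1.0 * b) for a in range(4) for b in range(4)]
weights = [0.0, 0.0]
for _ in range(500):
    weights = sgd_step(weights, batch, learning_rate=0.05)
```

In CPython the global interpreter lock prevents pure-Python threads from running this arithmetic truly in parallel, so the sketch shows only the structure of the approach (shard, compute, reduce, update), not the speed-up reported in the abstract.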
|
292 |
Optimalizace parametrů sekundárního chlazení plynulého odlévání oceli / Optimization of Secondary Cooling Parameters of Continuous Steel Casting. Klimeš, Lubomír. January 2014
Continuous casting is the dominant production technology in steelmaking and currently accounts for more than 95% of world steel production. Mathematical modelling and optimal control of the casting machine are crucial tasks in continuous steel casting: they directly influence the productivity and quality of the produced steel, the competitiveness of steelworks, the safety of casting machine operation and its impact on the environment. This thesis is concerned with the development and implementation of a numerical model of the temperature field for continuously cast steel billets and its use for optimal control of the casting machine. The numerical model was developed and implemented in MATLAB. Because of its computational demands, the model was parallelized to run on NVIDIA graphics processing units with the CUDA architecture. The model was validated and verified using operational data from the Trinecke zelezarny steelworks. It was then used as part of a model-based predictive control system developed for the optimal control of dynamic situations in casting machine operation. The behaviour of the control system was examined on dynamic model situations, which confirmed that the implemented system can optimally control dynamic operations of the continuous casting machine. Both the numerical model of the temperature field and the model-based predictive control system have been implemented so that they can be adapted to any casting machine, which allows for their prospective commercial application.
|
293 |
Real-time vizualizace povětrnostních vlivů v terénu / Realtime Weather in a Landscape Visualisation. Vlček, Adam. January 2009
Thanks to increasing computational power, the complexity and dynamism of virtual reality are continuously improving. This work examines the influence of weather on a landscape and the means to simulate and dynamically visualize it in real time on current personal computer hardware. The main goal is to find fast, good-looking approximations rather than a complex, physically correct simulation. The work covers using a modern programmable GPU not only for visualization but also as a powerful simulation instrument. The main topic is the movement of water in the terrain and its effects, such as erosion, snow melting and the impact of moisture on vegetation. This requires dynamic terrain texturing and algorithms that support fast updates of geometry and normals.
|
294 |
Využití GPU pro algoritmy grafiky a zpracování obrazu / Exploitation of GPU in graphics and image processing algorithms. Jošth, Radovan. January 2015
This thesis describes several selected algorithms that were originally developed for CPUs; given the high demand for their improvement, we decided to adapt them for GPGPU (the processors of the graphics adapter). Modifying these algorithms was also the goal of our research, which was carried out using the CUDA interface. The thesis is organized around the three groups of algorithms we studied: real-time object detection, spectral image analysis, and real-time line detection. For the research on real-time object detection we chose to use LRD and LRP features. The research on spectral image analysis was carried out using the PCA and NTF algorithms. For real-time line detection we used two different variants of a modified accumulation scheme of the Hough transform. Before the part of the thesis devoted to the specific algorithms and the subject of the research, the introductory chapters, immediately after the chapter explaining the reasons for studying the chosen topic, give a brief overview of GPU and GPGPU architecture. The final chapters focus on specifying the author's own contribution, its focus, the results achieved and the approach chosen to achieve them. The results include several developed products.
|
295 |
Využití paralelizace při numerickém řešení úloh nelineární dynamiky / The exploitation of parallelization to numerical solutions regarding problems in nonlinear dynamics. Rek, Václav. January 2018
The main aim of this thesis is to explore the potential use of parallelism in numerical computations in the field of nonlinear dynamics. The last decade has seen a dramatic rise of multicore and multiprocessor systems, combined with the possibilities that modern computer networks now provide. The complexity and size of the investigated models are constantly increasing, and computational tasks in the dynamics and statics of structures have high computational complexity, mainly because of the nonlinear character of the solved models. Any possibility of speeding up such calculations is therefore more than desirable. This is a relatively new branch of science, so specific algorithms and parallel implementations are still at the stage of research and development, driven by the rapid advances in computer hardware. More questions are raised about how best to utilize the available computing power. The proposed parallel model is based on the explicit form of the finite element method, which naturally lends itself to efficient parallelization. The possibilities of multicore processors are investigated, as well as a hybrid parallel model that combines multicore processors with parallelism over a computer network. The designed approaches are then examined on the numerical analysis of contact/impact phenomena of shell structures.
|
296 |
Hardware Acceleration of a Neighborhood Dependent Component Feature Learning (NDCFL) Super-Resolution Algorithm. Mathari Bakthavatsalam, Pagalavan. 22 May 2013
No description available.
|
297 |
TAMING IRREGULAR CONTROL-FLOW WITH TARGETED COMPILER TRANSFORMATIONS. Charitha Saumya Gusthinna Waduge (15460634). 15 May 2023
Irregular control-flow structures like deeply nested conditional branches are common in real-world software applications. Improving the performance and efficiency of such programs is often challenging because it is difficult to analyze and optimize programs with irregular control flow. We observe that real-world programs contain similar or identical computations within different code paths of the conditional branches. Compilers can merge similar code to improve performance or code size. However, existing compiler optimizations like code hoisting/sinking and tail merging do not fully exploit this opportunity. We propose a new technique called Control-Flow Melding (CFM) that can merge similar code sequences at the control-flow region level. We evaluate CFM in two applications. First, we show that CFM reduces the control divergence in GPU programs and improves their performance. Second, we apply CFM to CPU programs and show its effectiveness in reducing code size without sacrificing performance. In the next part of this dissertation, we investigate how CFM can be extended to improve dynamic test generation techniques like Dynamic Symbolic Execution (DSE). DSE suffers from the path explosion problem when many conditional branches are present in the program. We propose a non-semantics-preserving branch elimination transformation called CFM-SE that reduces the number of symbolic branches in a program. We also provide a framework for detecting and reasoning about false positive bugs that might be added to the program by non-semantics-preserving transformations like CFM-SE. Furthermore, we evaluate CFM-SE on real-world applications and show its effectiveness in improving DSE performance and code coverage.
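The idea of merging similar code from the two sides of a branch can be illustrated by hand in Python. This sketch is not CFM itself, which is a compiler transformation; the function names and the arithmetic are hypothetical and chosen only for illustration. Both branches of `divergent` perform the same multiply-add sequence on different operands, while `melded` runs that sequence once and keeps only the operand selection conditional, which is the kind of shape that reduces divergence when GPU threads take different sides of a branch.

```python
def divergent(flag, a, b, c, d):
    """Original shape: two branches that repeat the same computation pattern."""
    if flag:
        t = a * a + b
        result = t * 2 + 1
    else:
        t = c * c + d
        result = t * 2 - 1
    return result


def melded(flag, a, b, c, d):
    """Hand-melded shape: operands are selected, the shared sequence runs once."""
    x = a if flag else c          # select an operand instead of branching over the body
    y = b if flag else d
    offset = 1 if flag else -1    # the only part that truly differs between the paths
    t = x * x + y                 # shared instruction sequence
    return t * 2 + offset


# Check that melding preserved the behaviour on a few hypothetical inputs.
inputs = [(f, a, b, a + 1, b - 2) for f in (True, False) for a in range(3) for b in range(3)]
results_match = all(divergent(*args) == melded(*args) for args in inputs)
```

Unlike this semantics-preserving example, the CFM-SE transformation mentioned in the abstract is explicitly non-semantics-preserving, so an equivalence check of this kind would not apply to it.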
|
298 |
Methods for 3D Structured Light Sensor Calibration and GPU Accelerated Colormap. Kurella, Venu. January 2018
In manufacturing, metrological inspection is a time-consuming process. The higher the required precision in inspection, the longer the inspection time. This is due to both slow devices that collect measurement data and slow computational methods that process the data. The goal of this work is to propose methods to speed up some of these processes. Conventional measurement devices like Coordinate Measuring Machines (CMMs) have high precision but low measurement speed while new digitizer technologies have high speed but low precision. Using these devices in synergy gives a significant improvement in the measurement speed without loss of precision. The method of synergistic integration of an advanced digitizer with a CMM is discussed. Computational aspects of the inspection process are addressed next. Once a part is measured, measurement data is compared against its model to check for tolerances. This comparison is a time-consuming process on conventional CPUs. We developed and benchmarked some GPU accelerations. Finally, naive data fitting methods can produce misleading results in cases with non-uniform data. Weighted total least-squares methods can compensate for non-uniformity. We show how they can be accelerated with GPUs, using plane fitting as an example. / Thesis / Doctor of Philosophy (PhD)
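Weighted plane fitting can be sketched in a few lines of Python. This is a simplified, CPU-only illustration and not the thesis method: it assumes one scalar weight per point (a full weighted total least-squares formulation may use per-coordinate uncertainties), and the function name, the sample data and the power-iteration eigen-solver are choices made for this example. The plane passes through the weighted centroid, and its normal is the eigenvector of the weighted scatter matrix with the smallest eigenvalue.

```python
import math


def fit_plane_weighted(points, weights, iterations=500):
    """Fit a plane n.(p - c) = 0 minimising weighted orthogonal distances.

    Returns (centroid, unit_normal).
    """
    total = sum(weights)
    centroid = [sum(w * p[k] for p, w in zip(points, weights)) / total for k in range(3)]

    # Weighted 3x3 scatter matrix of the centred points.
    scatter = [[0.0] * 3 for _ in range(3)]
    for p, w in zip(points, weights):
        d = [p[k] - centroid[k] for k in range(3)]
        for i in range(3):
            for j in range(3):
                scatter[i][j] += w * d[i] * d[j]

    # The smallest eigenvector of `scatter` is the largest eigenvector of
    # trace*I - scatter, which plain power iteration can find.
    trace = scatter[0][0] + scatter[1][1] + scatter[2][2]
    shifted = [[(trace if i == j else 0.0) - scatter[i][j] for j in range(3)] for i in range(3)]

    normal = [1.0, 0.5, 0.25]  # arbitrary start vector (assumed not orthogonal to the answer)
    for _ in range(iterations):
        nxt = [sum(shifted[i][j] * normal[j] for j in range(3)) for i in range(3)]
        length = math.sqrt(sum(v * v for v in nxt))
        normal = [v / length for v in nxt]
    return centroid, normal


# Hypothetical, non-uniformly weighted samples of the plane z = 2x + 3y + 1.
points = [(x, y, 2.0 * x + 3.0 * y + 1.0) for x in range(4) for y in range(4)]
weights = [1.0 + 0.5 * i for i in range(len(points))]
centroid, normal = fit_plane_weighted(points, weights)
```

The accumulation of the centroid and the scatter matrix is a sum over independent points, which is the part that maps naturally onto a parallel reduction on a GPU; the 3x3 eigenproblem that follows is tiny by comparison.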
|
299 |
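Une Méthodologie pour le Développement d'Applications Hautes Performances sur des Architectures GPGPU: Application à la Simulation des Machines Éléctriques / A Methodology for the Development of High-Performance Applications on GPGPU Architectures: Application to the Simulation of Electrical Machines. Antonio Wendell, De Oliveira Rodrigues. 26 January 2012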
Complex physical phenomena can be simulated numerically using mathematical techniques that are often based on discretizing the partial differential equations governing those phenomena. Such simulations can lead to solving very large systems. Parallelizing numerical simulation codes, that is, adapting them to the architectures of parallel computers, is therefore necessary to complete these simulations in a reasonable time. Parallelism has become established at the level of processor architectures, and graphics cards are now used for general-purpose computing, also called "General-Purpose computation on Graphics Processing Unit (GPGPU)", with the obvious advantage of an excellent performance/price ratio. This thesis addresses the design of such high-performance applications for the simulation of electrical machines. We provide a methodology based on Model-Driven Engineering (MDE) that makes it possible to model an application and the architecture on which it runs in order to generate OpenCL code. Our goal is to help specialists in numerical simulation algorithms create efficient code that runs on GPGPU architectures. To this end, we provide a model compilation chain that takes into account several aspects of the OpenCL programming model. In addition, to make the generated code reasonably efficient compared with hand-written code, we provide model transformations that address levels of optimization based on the characteristics of the architecture (the memory level, for example). As experimental validation, the methodology is applied to build an application that solves a linear system arising from the Finite Element Method for the simulation of electrical machines. In this case we show, among other things, that the methodology can scale through a simple modification of the multiplicity of the available GPU units.
|
300 |
Detekce pohyblivého objektu ve videu na CUDA / Moving Object Detection in Video Using CUDA. Čermák, Michal. January 2011
This thesis deals with a model-based approach to 3D tracking from monocular video. The pose of the 3D model is estimated dynamically by minimizing an objective function with a particle filter. The objective function is based on the similarity between the rendered scene and the real video.
|