Global ETD Search

1	Extração de informações de desempenho em GPUs NVIDIA / Performance Information Extraction on NVIDIA GPUs Santos, Paulo Carlos Ferreira dos 15 March 2013 (has links) O recente crescimento da utilização de Unidades de Processamento Gráfico (GPUs) em aplicações científicas, que são voltadas ao desempenho, gerou a necessidade de otimizar os programas que nelas rodam. Uma ferramenta adequada para essa tarefa é o modelo de desempenho que, por sua vez, se beneficia da existência de uma ferramenta de extração de informações de desempenho para GPUs. Este trabalho cobre a criação de um gerador de microbenchmark para instruções PTX que também obtém informações sobre as características do hardware da GPU. Os resultados obtidos com o microbenchmark foram validados através de um modelo simplificado que obteve erros entre 6,11% e 16,32% em cinco kernels de teste. Também foram levantados os fatores de imprecisão nos resultados do microbenchmark. Utilizamos a ferramenta para analisar o perfil de desempenho das instruções e identificar grupos de comportamentos semelhantes. Também testamos a dependência do desempenho do pipeline da GPU em função da sequência de instruções executada e verificamos a otimização do compilador para esse caso. Ao fim deste trabalho concluímos que a utilização de microbenchmarks com instruções PTX é factível e se mostrou eficaz para a construção de modelos e análise detalhada do comportamento das instruções. / The recent growth in the use of tailored for performance Graphics Processing Units (GPUs) in scientific applications, generated the need to optimize GPU targeted programs. Performance models are the suitable tools for this task and they benefits from existing GPUs performance information extraction tools. This work covers the creation of a microbenchmark generator using PTX instructions and it also retrieves information about the GPU hardware characteristics. The microbenchmark results were validated using a simplified model with errors rates between 6.11% and 16.32% under five diferent GPU kernels. We also explain the imprecision factors present in the microbenchmark results. This tool was used to analyze the instructions performance profile, identifying groups with similar behavior. We also evaluated the corelation of the GPU pipeline performance and instructions execution sequence. Compiler optimization capabilities for this case were also verified. We concluded that the use of microbenchmarks with PTX instructions is a feasible approach and an efective way to build performance models and to generate detailed analysis of the instructions\' behavior. desempenho de GPU GPU performance linguagem PTX microbenchmark microbenchmark modelo de desempenho paralelismo. parallelism. performance model PTX language
2	Extração de informações de desempenho em GPUs NVIDIA / Performance Information Extraction on NVIDIA GPUs Paulo Carlos Ferreira dos Santos 15 March 2013 (has links) O recente crescimento da utilização de Unidades de Processamento Gráfico (GPUs) em aplicações científicas, que são voltadas ao desempenho, gerou a necessidade de otimizar os programas que nelas rodam. Uma ferramenta adequada para essa tarefa é o modelo de desempenho que, por sua vez, se beneficia da existência de uma ferramenta de extração de informações de desempenho para GPUs. Este trabalho cobre a criação de um gerador de microbenchmark para instruções PTX que também obtém informações sobre as características do hardware da GPU. Os resultados obtidos com o microbenchmark foram validados através de um modelo simplificado que obteve erros entre 6,11% e 16,32% em cinco kernels de teste. Também foram levantados os fatores de imprecisão nos resultados do microbenchmark. Utilizamos a ferramenta para analisar o perfil de desempenho das instruções e identificar grupos de comportamentos semelhantes. Também testamos a dependência do desempenho do pipeline da GPU em função da sequência de instruções executada e verificamos a otimização do compilador para esse caso. Ao fim deste trabalho concluímos que a utilização de microbenchmarks com instruções PTX é factível e se mostrou eficaz para a construção de modelos e análise detalhada do comportamento das instruções. / The recent growth in the use of tailored for performance Graphics Processing Units (GPUs) in scientific applications, generated the need to optimize GPU targeted programs. Performance models are the suitable tools for this task and they benefits from existing GPUs performance information extraction tools. This work covers the creation of a microbenchmark generator using PTX instructions and it also retrieves information about the GPU hardware characteristics. The microbenchmark results were validated using a simplified model with errors rates between 6.11% and 16.32% under five diferent GPU kernels. We also explain the imprecision factors present in the microbenchmark results. This tool was used to analyze the instructions performance profile, identifying groups with similar behavior. We also evaluated the corelation of the GPU pipeline performance and instructions execution sequence. Compiler optimization capabilities for this case were also verified. We concluded that the use of microbenchmarks with PTX instructions is a feasible approach and an efective way to build performance models and to generate detailed analysis of the instructions\' behavior. desempenho de GPU linguagem PTX microbenchmark modelo de desempenho paralelismo. GPU performance microbenchmark parallelism. performance model PTX language
3	Contributions on approximate computing techniques and how to measure them / Contributions sur les techniques de computation approximée et comment les mesurer Rodriguez Cancio, Marcelino 19 December 2017 (has links) La Computation Approximée est basée dans l'idée que des améliorations significatives de l'utilisation du processeur, de l'énergie et de la mémoire peuvent être réalisées, lorsque de faibles niveaux d'imprécision peuvent être tolérés. C'est un concept intéressant, car le manque de ressources est un problème constant dans presque tous les domaines de l'informatique. Des grands superordinateurs qui traitent les big data d'aujourd'hui sur les réseaux sociaux, aux petits systèmes embarqués à contrainte énergétique, il y a toujours le besoin d'optimiser la consommation de ressources. La Computation Approximée propose une alternative à cette rareté, introduisant la précision comme une autre ressource qui peut à son tour être échangée par la performance, la consommation d'énergie ou l'espace de stockage. La première partie de cette thèse propose deux contributions au domaine de l'informatique approximative: Aproximate Loop Unrolling : optimisation du compilateur qui exploite la nature approximative des données de séries chronologiques et de signaux pour réduire les temps d'exécution et la consommation d'énergie des boucles qui le traitent. Nos expériences ont montré que l'optimisation augmente considérablement les performances et l'efficacité énergétique des boucles optimisées (150% - 200%) tout en préservant la précision à des niveaux acceptables. Primer: le premier algorithme de compression avec perte pour les instructions de l'assembleur, qui profite des zones de pardon des programmes pour obtenir un taux de compression qui surpasse techniques utilisées actuellement jusqu'à 10%. L'objectif principal de la Computation Approximée est d'améliorer l'utilisation de ressources, telles que la performance ou l'énergie. Par conséquent, beaucoup d'efforts sont consacrés à l'observation du bénéfice réel obtenu en exploitant une technique donnée à l'étude. L'une des ressources qui a toujours été difficile à mesurer avec précision, est le temps d'exécution. Ainsi, la deuxième partie de cette thèse propose l'outil suivant : AutoJMH : un outil pour créer automatiquement des microbenchmarks de performance en Java. Microbenchmarks fournissent l'évaluation la plus précis de la performance. Cependant, nécessitant beaucoup d'expertise, il subsiste un métier de quelques ingénieurs de performance. L'outil permet (grâce à l'automatisation) l'adoption de microbenchmark par des non-experts. Nos résultats montrent que les microbencharks générés, correspondent à la qualité des manuscrites par des experts en performance. Aussi ils surpassent ceux écrits par des développeurs professionnels dans Java sans expérience en microbenchmarking. / Approximate Computing is based on the idea that significant improvements in CPU, energy and memory usage can be achieved when small levels of inaccuracy can be tolerated. This is an attractive concept, since the lack of resources is a constant problem in almost all computer science domains. From large super-computers processing today’s social media big data, to small, energy-constraint embedded systems, there is always the need to optimize the consumption of some scarce resource. Approximate Computing proposes an alternative to this scarcity, introducing accuracy as yet another resource that can be in turn traded by performance, energy consumption or storage space. The first part of this thesis proposes the following two contributions to the field of Approximate Computing :Approximate Loop Unrolling: a compiler optimization that exploits the approximative nature of signal and time series data to decrease execution times and energy consumption of loops processing it. Our experiments showed that the optimization increases considerably the performance and energy efficiency of the optimized loops (150% - 200%) while preserving accuracy to acceptable levels. Primer: the first ever lossy compression algorithm for assembler instructions, which profits from programs’ forgiving zones to obtain a compression ratio that outperforms the current state-of-the-art up to a 10%. The main goal of Approximate Computing is to improve the usage of resources such as performance or energy. Therefore, a fair deal of effort is dedicated to observe the actual benefit obtained by exploiting a given technique under study. One of the resources that have been historically challenging to accurately measure is execution time. Hence, the second part of this thesis proposes the following tool : AutoJMH: a tool to automatically create performance microbenchmarks in Java. Microbenchmarks provide the finest grain performance assessment. Yet, requiring a great deal of expertise, they remain a craft of a few performance engineers. The tool allows (thanks to automation) the adoption of microbenchmark by non-experts. Our results shows that the generated microbencharks match the quality of payloads handwritten by performance experts and outperforms those written by professional Java developers without experience in microbenchmarking. Computation Approximée Optimisation des compilateurs Diversité logicielle Compression avec perte Microbenchmark Approximate computing Compiler optimizations Software diversity Lossy compression Microbenchmark
4	Java jämfört med C#, vilken sorterar snabbast på Raspberry Pi? / Java compared to C#, which sorts fastest on Raspberry Pi? Olofsson, Christoffer January 2015 (has links) I denna studie skall Java och C# ställas mot varandra och köras på en Raspberry Pi för att se vilken av dem som kan sortera heltalsvektorer snabbast. Som Java-motor kommer Hot-Spot att användas och Mono för C# och de ska sortera vektorer med sorteringsalgoritmer från språkens stödbibliotek och en implementerad algoritm baserad på urvalssortering. Detta arbete är till för att dem som vill arbeta med ett objektorienterat språk på Raspberry Pi, men inte har bestämt sig än för vilket som skall användas. Resultatet visar att Java presterar bättre än C# i de flesta fall och att det finns undantag då C# presterar bättre. / In this study, Java and C# is set against each other and running on a Raspberry Pi to see if they have similar processing times, or if there is a clear difference between the two languages. As Java-engine HotSpot will be used and Mono for C# and they will sort vectors with sorting algorithms from the language's support library and one implemented algorithm based on selection sort. This work is for those who want to work with an object-oriented language on Raspberry Pi, but has not decided yet on which one to choose. The result shows that Java performs better than C# in most cases, but in some cases C# is performing better. Java C# C sharp Raspberry Pi Raspbian Benchmark Microbenchmark HotSpot Mono Linux sorting algorithms Java C# C sharp Raspberry Pi Raspbian Benchmark Microbenchmark HotSpot Mono Linux Sorteringsalgoritmer

1

Page generated in 0.0572 seconds