1 |
Performance Models For Distributed Memory HPC Systems And Deep Neural Networks
Cardwell, David 12 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Performance models are useful as mathematical models to reason about the behavior of different computer systems while running various applications. In this thesis, we aim to provide two distinct performance models: one for distributed-memory high performance computing systems with network communication, and one for deep neural networks. Our main goal for the first model is insight and simplicity, while for the second we aim for accuracy in prediction. The first model is generalized for networked multi-core computer systems, while the second is specific to deep neural networks on a shared-memory system.
|
2 |
Accélérations algorithmiques pour la simulation numérique d’impacts de vagues. Modèles de type "roofline" pour la caractérisation des performances, application à la CFD / Algorithmic accelerations for wave impacts numerical simulation. Roofline type models for the performance characterization, application to CFD
Mrabet, Ahmed Amine 15 May 2018
During recent years, computer processors have become increasingly complex (multiple levels of cache, vectorization, etc.), so the study of performance and optimization has also become more complex and difficult to understand. A simple, easy-to-use tool for characterizing application performance would therefore be of great value. The Roofline model [17] promises to meet these criteria, but it is insufficient for robust and detailed characterization. In the first part of this thesis, several improved versions of the Roofline model that are more robust and accurate are developed: a Roofline version as a function of time, a block-based version, and finally the new Roofline version introduced in the Intel VTune characterization suite. To validate these models, the LINPACK and STREAM benchmarks are used, as well as a mini-application developed during this thesis that solves the advection equation and serves as a prototype for the evaluation of explicit hydrodynamic simulation codes. This mini-application is also ported to the Intel Xeon Phi KNL and KNC co-processors. The second part of this thesis focuses on the simulation of wave impact using compressible and incompressible industrial codes. Several functionalities are added to the compressible FluxIC code, and a chaining of incompressible and compressible codes is carried out. Finally, a new numerical scheme called "incompressible liquid and quasi-compressible gas" is introduced, which allows the simulation of wave impact using an incompressible code with a compressible correction in areas where gas compressibility is significant.
|
3 |
Performance Models For Distributed Memory HPC Systems And Deep Neural Networks
David William Cardwell (8037125) 26 November 2019
Performance models are useful as mathematical models to reason about the behavior of different computer systems while running various applications. In this thesis, we aim to provide two distinct performance models: one for distributed-memory high performance computing systems with network communication, and one for deep neural networks. Our main goal for the first model is insight and simplicity, while for the second we aim for accuracy in prediction. The first model is generalized for networked multi-core computer systems, while the second is specific to deep neural networks on a shared-memory system.
|
4 |
Performance Analysis and Modeling of Parallel Applications in the Context of Architectural Rooflines
Shaila, Nashid 27 October 2016
Understanding the performance of applications on modern multi- and manycore platforms is a difficult task and involves complex measurement, analysis, and modeling. The Roofline model is used to assess an application's performance on a given architecture. Not much work has been done with the Roofline model using real measurements. Because it can be a very useful tool for understanding application performance on a given architecture, in this thesis we demonstrate the use of architectural roofline data together with measured data for analyzing the performance of different benchmarks. We first explain how to use different toolkits to measure the performance of a program. Next, these data are used to generate the roofline plots, from which we can decide how to make the application more efficient and remove bottlenecks. Our results show that this can be a powerful tool for analyzing the performance of applications across different architectures and different code versions.
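
For readers unfamiliar with the Roofline model referred to in this abstract, the basic bound can be sketched in a few lines. This is a minimal illustration of the standard formulation (attainable performance is the smaller of the compute peak and memory bandwidth times arithmetic intensity), not the thesis's own tooling. The function names and the machine figures in the usage note are hypothetical.

```python
def attainable_gflops(peak_gflops, peak_bandwidth_gbs, arithmetic_intensity):
    """Basic Roofline bound in GFLOP/s.

    peak_gflops          -- compute roof of the machine (GFLOP/s)
    peak_bandwidth_gbs   -- memory roof of the machine (GB/s)
    arithmetic_intensity -- FLOPs performed per byte moved (FLOP/byte)
    """
    if peak_gflops <= 0 or peak_bandwidth_gbs <= 0 or arithmetic_intensity <= 0:
        raise ValueError("all arguments must be positive")
    # A kernel is limited either by compute or by memory traffic.
    return min(peak_gflops, peak_bandwidth_gbs * arithmetic_intensity)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Arithmetic intensity (FLOP/byte) at which the two roofs meet."""
    return peak_gflops / peak_bandwidth_gbs


def is_memory_bound(peak_gflops, peak_bandwidth_gbs, arithmetic_intensity):
    """True when the kernel sits to the left of the ridge point."""
    return arithmetic_intensity < ridge_point(peak_gflops, peak_bandwidth_gbs)
```

As a usage example on a hypothetical machine with a 100 GFLOP/s compute peak and 10 GB/s of memory bandwidth, a kernel with an arithmetic intensity of 2 FLOP/byte is bounded at 20 GFLOP/s and is memory bound, while one at 20 FLOP/byte reaches the compute roof.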
|
5 |
Characterizing and Accelerating Deep Learning and Stream Processing Workloads using Roofline Trajectories
Javed, Muhammad Haseeb January 2019
No description available.
|
6 |
Optimalizace výpočtu v multigridu / Performance Engineering of Stencils Optimization in Geometric Multigrid
Janalík, Radim January 2015
In this thesis we present a blocking method for improving cache locality in stencil computations, and two tools, Pluto and PATUS, that use this method to generate optimized code. We perform various measurements and examine the speedup obtained with different optimizations. Finally, we implement a smoothing step in multigrid with various optimizations and examine how these optimizations affect multigrid performance.
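
The blocking idea named in this abstract can be sketched as a loop transformation on a simple stencil. The example below is only an illustration under stated assumptions: a 5-point Jacobi-style update on a 2D grid is assumed as the stencil, the function names and the `block` parameter are hypothetical, and the code is not output of Pluto or PATUS. In Python the tiling shows the loop structure only and should not be expected to give a speedup.

```python
def stencil_step_naive(grid):
    """One 5-point averaging sweep over the interior of a 2D grid."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary values are copied unchanged
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return out


def stencil_step_blocked(grid, block=4):
    """Same sweep, traversed tile by tile (spatial blocking).

    Each block x block tile is finished before moving on, so in a compiled
    language the rows it touches are more likely to stay in cache.
    """
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for ii in range(1, n - 1, block):          # tile origin, rows
        for jj in range(1, m - 1, block):      # tile origin, columns
            for i in range(ii, min(ii + block, n - 1)):
                for j in range(jj, min(jj + block, m - 1)):
                    out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                        + grid[i][j - 1] + grid[i][j + 1])
    return out
```

Both functions read only the input grid and write to a separate output, so the tiled traversal produces the same values as the naive one; only the order in which points are visited changes.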
|