• About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Performance Models For Distributed Memory HPC Systems And Deep Neural Networks

Cardwell, David 12 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Performance models are useful as mathematical models to reason about the behavior of different computer systems while running various applications. In this thesis, we aim to provide two distinct performance models: one for distributed-memory high performance computing systems with network communication, and one for deep neural networks. Our main goal for the first model is insight and simplicity, while for the second we aim for accuracy in prediction. The first model is generalized for networked multi-core computer systems, while the second is specific to deep neural networks on a shared-memory system.
2

Performance Models For Distributed Memory HPC Systems And Deep Neural Networks

David William Cardwell (8037125) 26 November 2019 (has links)
Performance models are useful as mathematical models to reason about the behavior of different computer systems while running various applications. In this thesis, we aim to provide two distinct performance models: one for distributed-memory high performance computing systems with network communication, and one for deep neural networks. Our main goal for the first model is insight and simplicity, while for the second we aim for accuracy in prediction. The first model is generalized for networked multi-core computer systems, while the second is specific to deep neural networks on a shared-memory system.
3

Performance Analysis and Modeling of Parallel Applications in the Context of Architectural Rooflines

Shaila, Nashid 27 October 2016 (has links)
Understanding the performance of applications on modern multi- and manycore platforms is a difficult task and involves complex measurement, analysis, and modeling. The Roofline model is used to assess an application's performance on a given architecture. Not much work has been done with the Roofline model using real measurements. Because it can be a very useful tool for understanding application performance on a given architecture, in this thesis we demonstrate the use of architectural roofline data with measured data for analyzing the performance of different benchmarks. We first explain how to use different toolkits to measure the performance of a program. Next, these data are used to generate the roofline plots, from which we can decide how to make the application more efficient and remove bottlenecks. Our results show that this can be a powerful tool for analyzing the performance of applications across different architectures and different code versions.
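For readers unfamiliar with the Roofline model named in this abstract, the following is a minimal sketch of its standard bound: attainable performance is the smaller of peak compute throughput and memory bandwidth times arithmetic intensity. The function names and the machine numbers are hypothetical illustrations, not values or tooling taken from the thesis.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound in GFLOP/s.

    peak_gflops   -- peak compute throughput (GFLOP/s), assumed known
    bandwidth_gbs -- peak memory bandwidth (GB/s), assumed known
    intensity     -- arithmetic intensity of the kernel (FLOP per byte)
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being memory-bound."""
    return peak_gflops / bandwidth_gbs


if __name__ == "__main__":
    # Hypothetical machine: 100 GFLOP/s peak, 50 GB/s memory bandwidth.
    peak, bw = 100.0, 50.0
    for ai in (0.5, 1.0, 2.0, 4.0):
        bound = attainable_gflops(peak, bw, ai)
        kind = "memory-bound" if ai < ridge_point(peak, bw) else "compute-bound"
        print(f"intensity={ai} FLOP/B -> bound={bound} GFLOP/s ({kind})")
```

A measured data point for a kernel can then be compared against this bound: a point far below the roof at its intensity suggests an inefficiency other than the bandwidth or compute limit.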
4

Characterizing and Accelerating Deep Learning and Stream Processing Workloads using Roofline Trajectories

Javed, Muhammad Haseeb January 2019 (has links)
No description available.
5

Optimalizace výpočtu v multigridu / Performance Engineering of Stencils Optimization in Geometric Multigrid

Janalík, Radim January 2015 (has links)
In this thesis we present a blocking method for improving cache locality in stencil computations, and two tools, Pluto and PATUS, that use this method to generate optimized code. We carry out various measurements and examine the speedup obtained with different optimizations. Finally, we implement the smoothing step of multigrid with various optimizations and examine how these optimizations affect multigrid performance.
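The blocking (tiling) idea mentioned in this abstract can be sketched as follows. This is only an illustration of the loop structure under assumed choices (a 5-point Jacobi-style stencil, a hypothetical `block` size, plain Python lists); it is not code produced by Pluto or PATUS, and Python itself will not show the cache benefit that a compiled version would.

```python
def stencil_naive(grid):
    """One 5-point averaging sweep over the interior of a 2D grid."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary values are copied unchanged
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return out


def stencil_blocked(grid, block=4):
    """Same sweep, but the interior is visited tile by tile.

    Visiting one block x block tile at a time keeps the working set small,
    which is what improves cache reuse in a compiled implementation.
    """
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for ii in range(1, n - 1, block):          # tile origin, rows
        for jj in range(1, m - 1, block):      # tile origin, columns
            for i in range(ii, min(ii + block, n - 1)):
                for j in range(jj, min(jj + block, m - 1)):
                    out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                        + grid[i][j - 1] + grid[i][j + 1])
    return out


if __name__ == "__main__":
    grid = [[float((i * j) % 7) for j in range(10)] for i in range(9)]
    print(stencil_blocked(grid, block=3) == stencil_naive(grid))
```

Because each output point reads only the input grid, the tiles can be processed in any order and the result matches the untiled sweep exactly.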