• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 9
  • 1
  • 1
  • Tagged with
  • 11
  • 11
  • 10
  • 10
  • 9
  • 6
  • 6
  • 6
  • 4
  • 4
  • 2
  • 2
  • 2
  • 2
  • 2
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Évaluation de la performance des collèges d'enseignement général et professionnel au Québec par la méthode Data Envelopment Analysis /

Broussau, Frédéric, January 2004 (has links)
Thèse (M. en économique)--Université du Québec à Montréal, 2004. / En tête du titre: Université du Québec à Montréal. Bibliogr.: f. [53]-54. Publié aussi en version électronique.
2

Comparison and End-to-End Performance Analysis of Parallel Filesystems

Kluge, Michael 20 September 2011 (has links) (PDF)
This thesis presents a contribution to the field of performance analysis for Input/Output (I/O) related problems, focusing on the area of High Performance Computing (HPC). Beside the compute nodes, High Performance Computing systems need a large amount of supporting components that add their individual behavior to the overall performance characteristic of the whole system. Especially file systems in such environments have their own infrastructure. File operations are typically initiated at the compute nodes and proceed through a deep software stack until the file content arrives at the physical medium. There is a handful of shortcomings that characterize the current state of the art for performance analyses in this area. This includes a system wide data collection, a comprehensive analysis approach for all collected data, an adjusted trace event analysis for I/O related problems, and methods to compare current with archived performance data. This thesis proposes to instrument all soft- and hardware layers to enhance the performance analysis for file operations. The additional information can be used to investigate performance characteristics of parallel file systems. To perform I/O analyses on HPC systems, a comprehensive approach is needed to gather related performance events, examine the collected data and, if necessary, to replay relevant parts on different systems. One larger part of this thesis is dedicated to algorithms that reduce the amount of information that are found in trace files to the level that is needed for an I/O analysis. This reduction is based on the assumption that for this type of analysis all I/O events, but only a subset of all synchronization events of a parallel program trace have to be considered. To extract an I/O pattern from an event trace, only these synchronization points are needed that describe dependencies among different I/O requests. Two algorithms are developed to remove negligible events from the event trace. Considering the related work for the analysis of a parallel file systems, the inclusion of counter data from external sources, e.g. the infrastructure of a parallel file system, has been identified as a major milestone towards a holistic analysis approach. This infrastructure contains a large amount of valuable information that are essential to describe performance effects observed in applications. This thesis presents an approach to collect and subsequently process and store the data. Certain ways how to correctly merge the collected values with application traces are discussed. Here, a revised definition of the term "performance counter" is the first step followed by a tree based approach to combine raw values into secondary values. A visualization approach for I/O patterns closes another gap in the analysis process. Replaying I/O related performance events or event patterns can be done by a flexible I/O benchmark. The constraints for the development of such a benchmark are identified as well as the overall architecture for a prototype implementation. Finally, different examples demonstrate the usage of the developed methods and show their potential. All examples are real use cases and are situated on the HRSK research complex and the 100GBit Testbed at TU Dresden. The I/O related parts of a Bioinformatics and a CFD application have been analyzed in depth and enhancements for both are proposed. An instance of a Lustre file system was deployed and tuned on the 100GBit Testbed by the extensive use of external performance counters.
3

Structural Performance Comparison of Parallel Software Applications

Weber, Matthias 15 December 2016 (has links) (PDF)
With rising complexity of high performance computing systems and their parallel software, performance analysis and optimization has become essential in the development of efficient applications. The comparison of performance data is a key operation required in performance analysis. An analyst may conduct different types of comparisons in order to understand the performance properties of an application. One use case is comparing performance data from multiple measurements. Typical examples for such comparisons are before/after comparisons when applying optimizations or changing code versions. Besides comparing performance between multiple runs, also comparing performance characteristics across the parallel execution streams of an application is essential to detect performance problems. This is typically useful to detect imbalances, outliers, or changing runtime behavior during the execution of an application. While such comparisons are straightforward for the aggregated data in performance profiles, only limited solutions exist for comparing event traces. Trace-based analysis, i.e., the collection of fine-grained information on individual application events with timestamps and application context, has proven to be a powerful technique. The detailed performance information included in event traces make them very suitable for performance analysis. However, this level of detail also presents a challenge because it implies a large and overwhelming amount of data. Currently, users need to perform manual comparison of event traces, which is extremely challenging and time consuming because of the large volume of detailed data and the need to correctly line up trace events. To fill the gap of missing solutions for automatic comparison of event traces, this work proposes a set of techniques that automatically align traces. The alignment allows their structural comparison and the highlighting of differences between them. A set of novel metrics provide the user with an objective measure of the differences between traces, both in terms of differences in the event stream and timing differences across events. An additional important aspect of trace-based analysis is the visualization of performance data in event timelines. This has proven to be a powerful approach for the detection of various types of performance problems. However, visualization of large numbers of event timelines quickly hits the limits of available display resolution. Likewise, identifying performance problems is challenging in the large amount of visualized performance data. To alleviate these problems this work proposes two new approaches for event timeline visualization. First, novel folding strategies for event timelines facilitate visual scalability and provide powerful overviews of performance data at the same time. Second, this work presents an effective approach that automatically identifies and highlights several types of performance critical sections in an application run. This approach identifies time dominant functions of an application and subsequently uses them to analyze runtime imbalances throughout the application run. Intuitive visualizations present the resulting runtime variations and guide the analyst to performance hot spots. Evaluations with benchmarks and real-world applications assess all introduced techniques. The effectiveness of the comparison approaches is demonstrated by showing automatically detected performance issues and structural differences between different versions of applications and across parallel execution streams. Case studies showcase the capabilities of the event timeline visualization techniques by demonstrating scalable performance data visualizations and detecting performance problems and code inefficiencies in real-world applications.
4

Comparison and End-to-End Performance Analysis of Parallel Filesystems

Kluge, Michael 05 September 2011 (has links)
This thesis presents a contribution to the field of performance analysis for Input/Output (I/O) related problems, focusing on the area of High Performance Computing (HPC). Beside the compute nodes, High Performance Computing systems need a large amount of supporting components that add their individual behavior to the overall performance characteristic of the whole system. Especially file systems in such environments have their own infrastructure. File operations are typically initiated at the compute nodes and proceed through a deep software stack until the file content arrives at the physical medium. There is a handful of shortcomings that characterize the current state of the art for performance analyses in this area. This includes a system wide data collection, a comprehensive analysis approach for all collected data, an adjusted trace event analysis for I/O related problems, and methods to compare current with archived performance data. This thesis proposes to instrument all soft- and hardware layers to enhance the performance analysis for file operations. The additional information can be used to investigate performance characteristics of parallel file systems. To perform I/O analyses on HPC systems, a comprehensive approach is needed to gather related performance events, examine the collected data and, if necessary, to replay relevant parts on different systems. One larger part of this thesis is dedicated to algorithms that reduce the amount of information that are found in trace files to the level that is needed for an I/O analysis. This reduction is based on the assumption that for this type of analysis all I/O events, but only a subset of all synchronization events of a parallel program trace have to be considered. To extract an I/O pattern from an event trace, only these synchronization points are needed that describe dependencies among different I/O requests. Two algorithms are developed to remove negligible events from the event trace. Considering the related work for the analysis of a parallel file systems, the inclusion of counter data from external sources, e.g. the infrastructure of a parallel file system, has been identified as a major milestone towards a holistic analysis approach. This infrastructure contains a large amount of valuable information that are essential to describe performance effects observed in applications. This thesis presents an approach to collect and subsequently process and store the data. Certain ways how to correctly merge the collected values with application traces are discussed. Here, a revised definition of the term "performance counter" is the first step followed by a tree based approach to combine raw values into secondary values. A visualization approach for I/O patterns closes another gap in the analysis process. Replaying I/O related performance events or event patterns can be done by a flexible I/O benchmark. The constraints for the development of such a benchmark are identified as well as the overall architecture for a prototype implementation. Finally, different examples demonstrate the usage of the developed methods and show their potential. All examples are real use cases and are situated on the HRSK research complex and the 100GBit Testbed at TU Dresden. The I/O related parts of a Bioinformatics and a CFD application have been analyzed in depth and enhancements for both are proposed. An instance of a Lustre file system was deployed and tuned on the 100GBit Testbed by the extensive use of external performance counters.
5

Advanced Memory Data Structures for Scalable Event Trace Analysis

Knüpfer, Andreas 17 April 2009 (has links) (PDF)
The thesis presents a contribution to the analysis and visualization of computational performance based on event traces with a particular focus on parallel programs and High Performance Computing (HPC). Event traces contain detailed information about specified incidents (events) during run-time of programs and allow minute investigation of dynamic program behavior, various performance metrics, and possible causes of performance flaws. Due to long running and highly parallel programs and very fine detail resolutions, event traces can accumulate huge amounts of data which become a challenge for interactive as well as automatic analysis and visualization tools. The thesis proposes a method of exploiting redundancy in the event traces in order to reduce the memory requirements and the computational complexity of event trace analysis. The sources of redundancy are repeated segments of the original program, either through iterative or recursive algorithms or through SPMD-style parallel programs, which produce equal or similar repeated event sequences. The data reduction technique is based on the novel Complete Call Graph (CCG) data structure which allows domain specific data compression for event traces in a combination of lossless and lossy methods. All deviations due to lossy data compression can be controlled by constant bounds. The compression of the CCG data structure is incorporated in the construction process, such that at no point substantial uncompressed parts have to be stored. Experiments with real-world example traces reveal the potential for very high data compression. The results range from factors of 3 to 15 for small scale compression with minimum deviation of the data to factors > 100 for large scale compression with moderate deviation. Based on the CCG data structure, new algorithms for the most common evaluation and analysis methods for event traces are presented, which require no explicit decompression. By avoiding repeated evaluation of formerly redundant event sequences, the computational effort of the new algorithms can be reduced in the same extent as memory consumption. The thesis includes a comprehensive discussion of the state-of-the-art and related work, a detailed presentation of the design of the CCG data structure, an elaborate description of algorithms for construction, compression, and analysis of CCGs, and an extensive experimental validation of all components. / Diese Dissertation stellt einen neuartigen Ansatz für die Analyse und Visualisierung der Berechnungs-Performance vor, der auf dem Ereignis-Tracing basiert und insbesondere auf parallele Programme und das Hochleistungsrechnen (High Performance Computing, HPC) zugeschnitten ist. Ereignis-Traces (Ereignis-Spuren) enthalten detaillierte Informationen über spezifizierte Ereignisse während der Laufzeit eines Programms und erlauben eine sehr genaue Untersuchung des dynamischen Verhaltens, verschiedener Performance-Metriken und potentieller Performance-Probleme. Aufgrund lang laufender und hoch paralleler Anwendungen und dem hohen Detailgrad kann das Ereignis-Tracing sehr große Datenmengen produzieren. Diese stellen ihrerseits eine Herausforderung für interaktive und automatische Analyse- und Visualisierungswerkzeuge dar. Die vorliegende Arbeit präsentiert eine Methode, die Redundanzen in den Ereignis-Traces ausnutzt, um sowohl die Speicheranforderungen als auch die Laufzeitkomplexität der Trace-Analyse zu reduzieren. Die Ursachen für Redundanzen sind wiederholt ausgeführte Programmabschnitte, entweder durch iterative oder rekursive Algorithmen oder durch SPMD-Parallelisierung, die gleiche oder ähnliche Ereignis-Sequenzen erzeugen. Die Datenreduktion basiert auf der neuartigen Datenstruktur der "Vollständigen Aufruf-Graphen" (Complete Call Graph, CCG) und erlaubt eine Kombination von verlustfreier und verlustbehafteter Datenkompression. Dabei können konstante Grenzen für alle Abweichungen durch verlustbehaftete Kompression vorgegeben werden. Die Datenkompression ist in den Aufbau der Datenstruktur integriert, so dass keine umfangreichen unkomprimierten Teile vor der Kompression im Hauptspeicher gehalten werden müssen. Das enorme Kompressionsvermögen des neuen Ansatzes wird anhand einer Reihe von Beispielen aus realen Anwendungsszenarien nachgewiesen. Die dabei erzielten Resultate reichen von Kompressionsfaktoren von 3 bis 5 mit nur minimalen Abweichungen aufgrund der verlustbehafteten Kompression bis zu Faktoren > 100 für hochgradige Kompression. Basierend auf der CCG_Datenstruktur werden außerdem neue Auswertungs- und Analyseverfahren für Ereignis-Traces vorgestellt, die ohne explizite Dekompression auskommen. Damit kann die Laufzeitkomplexität der Analyse im selben Maß gesenkt werden wie der Hauptspeicherbedarf, indem komprimierte Ereignis-Sequenzen nicht mehrmals analysiert werden. Die vorliegende Dissertation enthält eine ausführliche Vorstellung des Stands der Technik und verwandter Arbeiten in diesem Bereich, eine detaillierte Herleitung der neu eingeführten Daten-strukturen, der Konstruktions-, Kompressions- und Analysealgorithmen sowie eine umfangreiche experimentelle Auswertung und Validierung aller Bestandteile.
6

Structural Performance Comparison of Parallel Software Applications

Weber, Matthias 09 December 2016 (has links)
With rising complexity of high performance computing systems and their parallel software, performance analysis and optimization has become essential in the development of efficient applications. The comparison of performance data is a key operation required in performance analysis. An analyst may conduct different types of comparisons in order to understand the performance properties of an application. One use case is comparing performance data from multiple measurements. Typical examples for such comparisons are before/after comparisons when applying optimizations or changing code versions. Besides comparing performance between multiple runs, also comparing performance characteristics across the parallel execution streams of an application is essential to detect performance problems. This is typically useful to detect imbalances, outliers, or changing runtime behavior during the execution of an application. While such comparisons are straightforward for the aggregated data in performance profiles, only limited solutions exist for comparing event traces. Trace-based analysis, i.e., the collection of fine-grained information on individual application events with timestamps and application context, has proven to be a powerful technique. The detailed performance information included in event traces make them very suitable for performance analysis. However, this level of detail also presents a challenge because it implies a large and overwhelming amount of data. Currently, users need to perform manual comparison of event traces, which is extremely challenging and time consuming because of the large volume of detailed data and the need to correctly line up trace events. To fill the gap of missing solutions for automatic comparison of event traces, this work proposes a set of techniques that automatically align traces. The alignment allows their structural comparison and the highlighting of differences between them. A set of novel metrics provide the user with an objective measure of the differences between traces, both in terms of differences in the event stream and timing differences across events. An additional important aspect of trace-based analysis is the visualization of performance data in event timelines. This has proven to be a powerful approach for the detection of various types of performance problems. However, visualization of large numbers of event timelines quickly hits the limits of available display resolution. Likewise, identifying performance problems is challenging in the large amount of visualized performance data. To alleviate these problems this work proposes two new approaches for event timeline visualization. First, novel folding strategies for event timelines facilitate visual scalability and provide powerful overviews of performance data at the same time. Second, this work presents an effective approach that automatically identifies and highlights several types of performance critical sections in an application run. This approach identifies time dominant functions of an application and subsequently uses them to analyze runtime imbalances throughout the application run. Intuitive visualizations present the resulting runtime variations and guide the analyst to performance hot spots. Evaluations with benchmarks and real-world applications assess all introduced techniques. The effectiveness of the comparison approaches is demonstrated by showing automatically detected performance issues and structural differences between different versions of applications and across parallel execution streams. Case studies showcase the capabilities of the event timeline visualization techniques by demonstrating scalable performance data visualizations and detecting performance problems and code inefficiencies in real-world applications.
7

Advanced Memory Data Structures for Scalable Event Trace Analysis

Knüpfer, Andreas 16 December 2008 (has links)
The thesis presents a contribution to the analysis and visualization of computational performance based on event traces with a particular focus on parallel programs and High Performance Computing (HPC). Event traces contain detailed information about specified incidents (events) during run-time of programs and allow minute investigation of dynamic program behavior, various performance metrics, and possible causes of performance flaws. Due to long running and highly parallel programs and very fine detail resolutions, event traces can accumulate huge amounts of data which become a challenge for interactive as well as automatic analysis and visualization tools. The thesis proposes a method of exploiting redundancy in the event traces in order to reduce the memory requirements and the computational complexity of event trace analysis. The sources of redundancy are repeated segments of the original program, either through iterative or recursive algorithms or through SPMD-style parallel programs, which produce equal or similar repeated event sequences. The data reduction technique is based on the novel Complete Call Graph (CCG) data structure which allows domain specific data compression for event traces in a combination of lossless and lossy methods. All deviations due to lossy data compression can be controlled by constant bounds. The compression of the CCG data structure is incorporated in the construction process, such that at no point substantial uncompressed parts have to be stored. Experiments with real-world example traces reveal the potential for very high data compression. The results range from factors of 3 to 15 for small scale compression with minimum deviation of the data to factors > 100 for large scale compression with moderate deviation. Based on the CCG data structure, new algorithms for the most common evaluation and analysis methods for event traces are presented, which require no explicit decompression. By avoiding repeated evaluation of formerly redundant event sequences, the computational effort of the new algorithms can be reduced in the same extent as memory consumption. The thesis includes a comprehensive discussion of the state-of-the-art and related work, a detailed presentation of the design of the CCG data structure, an elaborate description of algorithms for construction, compression, and analysis of CCGs, and an extensive experimental validation of all components. / Diese Dissertation stellt einen neuartigen Ansatz für die Analyse und Visualisierung der Berechnungs-Performance vor, der auf dem Ereignis-Tracing basiert und insbesondere auf parallele Programme und das Hochleistungsrechnen (High Performance Computing, HPC) zugeschnitten ist. Ereignis-Traces (Ereignis-Spuren) enthalten detaillierte Informationen über spezifizierte Ereignisse während der Laufzeit eines Programms und erlauben eine sehr genaue Untersuchung des dynamischen Verhaltens, verschiedener Performance-Metriken und potentieller Performance-Probleme. Aufgrund lang laufender und hoch paralleler Anwendungen und dem hohen Detailgrad kann das Ereignis-Tracing sehr große Datenmengen produzieren. Diese stellen ihrerseits eine Herausforderung für interaktive und automatische Analyse- und Visualisierungswerkzeuge dar. Die vorliegende Arbeit präsentiert eine Methode, die Redundanzen in den Ereignis-Traces ausnutzt, um sowohl die Speicheranforderungen als auch die Laufzeitkomplexität der Trace-Analyse zu reduzieren. Die Ursachen für Redundanzen sind wiederholt ausgeführte Programmabschnitte, entweder durch iterative oder rekursive Algorithmen oder durch SPMD-Parallelisierung, die gleiche oder ähnliche Ereignis-Sequenzen erzeugen. Die Datenreduktion basiert auf der neuartigen Datenstruktur der "Vollständigen Aufruf-Graphen" (Complete Call Graph, CCG) und erlaubt eine Kombination von verlustfreier und verlustbehafteter Datenkompression. Dabei können konstante Grenzen für alle Abweichungen durch verlustbehaftete Kompression vorgegeben werden. Die Datenkompression ist in den Aufbau der Datenstruktur integriert, so dass keine umfangreichen unkomprimierten Teile vor der Kompression im Hauptspeicher gehalten werden müssen. Das enorme Kompressionsvermögen des neuen Ansatzes wird anhand einer Reihe von Beispielen aus realen Anwendungsszenarien nachgewiesen. Die dabei erzielten Resultate reichen von Kompressionsfaktoren von 3 bis 5 mit nur minimalen Abweichungen aufgrund der verlustbehafteten Kompression bis zu Faktoren > 100 für hochgradige Kompression. Basierend auf der CCG_Datenstruktur werden außerdem neue Auswertungs- und Analyseverfahren für Ereignis-Traces vorgestellt, die ohne explizite Dekompression auskommen. Damit kann die Laufzeitkomplexität der Analyse im selben Maß gesenkt werden wie der Hauptspeicherbedarf, indem komprimierte Ereignis-Sequenzen nicht mehrmals analysiert werden. Die vorliegende Dissertation enthält eine ausführliche Vorstellung des Stands der Technik und verwandter Arbeiten in diesem Bereich, eine detaillierte Herleitung der neu eingeführten Daten-strukturen, der Konstruktions-, Kompressions- und Analysealgorithmen sowie eine umfangreiche experimentelle Auswertung und Validierung aller Bestandteile.
8

Thema mit Variablen: Zur Phänomenologie der Jazzkomposition und musikalischer Analyse

Dreps, Krystoffer 23 October 2023 (has links)
No description available.
9

Software Controlled Clock Modulation for Energy Efficiency Optimization on Intel Processors

Schöne, Robert, Ilsche, Thomas, Bielert, Mario, Molka, Daniel, Hackenberg, Daniel 24 October 2017 (has links) (PDF)
Current Intel processors implement a variety of power saving features like frequency scaling and idle states. These mechanisms limit the power draw and thereby decrease the thermal dissipation of the processors. However, they also have an impact on the achievable performance. The various mechanisms significantly differ regarding the amount of power savings, the latency of mode changes, and the associated overhead. In this paper, we describe and closely examine the so-called software controlled clock modulation mechanism for different processor generations. We present results that imply that the available documentation is not always correct and describe when this feature can be used to improve energy efficiency. We additionally compare it against the more popular feature of dynamic voltage and frequency scaling and develop a model to decide which feature should be used to optimize inter-process synchronizations on Intel Haswell-EP processors.
10

Software Controlled Clock Modulation for Energy Efficiency Optimization on Intel Processors

Schöne, Robert, Ilsche, Thomas, Bielert, Mario, Molka, Daniel, Hackenberg, Daniel 24 October 2017 (has links)
Current Intel processors implement a variety of power saving features like frequency scaling and idle states. These mechanisms limit the power draw and thereby decrease the thermal dissipation of the processors. However, they also have an impact on the achievable performance. The various mechanisms significantly differ regarding the amount of power savings, the latency of mode changes, and the associated overhead. In this paper, we describe and closely examine the so-called software controlled clock modulation mechanism for different processor generations. We present results that imply that the available documentation is not always correct and describe when this feature can be used to improve energy efficiency. We additionally compare it against the more popular feature of dynamic voltage and frequency scaling and develop a model to decide which feature should be used to optimize inter-process synchronizations on Intel Haswell-EP processors.

Page generated in 0.0704 seconds