About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Analyzing and Evaluating the Resilience of Scheduling Scientific Applications on High Performance Computing Systems using a Simulation-based Methodology

Sukhija, Nitin 09 May 2015 (has links)
Large-scale systems provide a powerful computing platform for solving large and complex scientific applications. However, the inherent complexity, heterogeneity, wide distribution, and dynamism of the computing environments can lead to performance degradation of the scientific applications executing on these computing systems. Load imbalance arising from a variety of sources such as application, algorithmic, and systemic variations is one of the major contributors to their performance degradation. In general, load balancing is achieved via scheduling. Moreover, frequently occurring resource failures drastically affect the execution of applications running on high performance computing systems. Therefore, studying integrated scheduling and fault-tolerance mechanisms that guarantee applications remain resilient to failures is of paramount importance. Recently, several research initiatives have started to address the issue of resilience. However, these efforts have focused largely on system-level resilience, with less emphasis on resilience at the application level. Therefore, it is increasingly important to extend the concept of resilience to scheduling techniques at the application level, establishing a holistic approach that addresses the performability of these applications on high performance computing systems. This can be achieved by developing a comprehensive modeling framework that can be used to evaluate the resiliency of such techniques on heterogeneous computing systems, assessing the impact of failures as well as workloads in an integrated way. This dissertation presents an experimental methodology based on discrete event simulation for analyzing and evaluating the resilience of scheduling scientific applications on high performance computing systems. With the aid of this methodology, a wide class of dependencies between the application and the computing system is captured within a deterministic model for quantifying the performance impact expected from changes in application and system characteristics. The results obtained with the proposed simulation-based performance prediction framework enable an introspective design and investigation of scheduling heuristics, and support reasoning about how best to optimize often antagonistic objectives, such as minimizing application makespan and maximizing reliability.
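A minimal sketch of the core idea behind such a simulation-based evaluation (not the dissertation's framework): a toy discrete event simulation maps tasks to heterogeneous machines with an earliest-finish-time heuristic, injects random machine failures, and reports mean makespan together with the fraction of runs that complete, exposing the makespan-versus-reliability tradeoff. Task sizes, machine speeds, failure probabilities, and the heuristic itself are invented for illustration.

```python
import random

def simulate(tasks, speeds, fail_prob, trials=1000, seed=0):
    """Toy DES: map each task to the machine that would finish it earliest,
    then flip a per-task failure coin on the chosen machine."""
    rng = random.Random(seed)
    makespans, successes = [], 0
    for _ in range(trials):
        ready = [0.0] * len(speeds)            # next-free time per machine
        failed = False
        for work in tasks:                     # earliest-finish-time heuristic
            finish = [ready[m] + work / speeds[m] for m in range(len(speeds))]
            m = min(range(len(speeds)), key=finish.__getitem__)
            ready[m] = finish[m]
            if rng.random() < fail_prob[m]:    # machine fails during this task
                failed = True
                break
        if not failed:
            makespans.append(max(ready))
            successes += 1
    mean_makespan = sum(makespans) / len(makespans) if makespans else float("inf")
    return mean_makespan, successes / trials

# Hypothetical workload: 20 tasks on 3 machines with different speeds and failure rates.
gen = random.Random(1)
tasks = [gen.uniform(5, 15) for _ in range(20)]
print(simulate(tasks, speeds=[1.0, 1.5, 2.0], fail_prob=[0.001, 0.01, 0.05]))
```

Re-running the same sketch with a different mapping heuristic or failure profile gives a feel for how a heuristic that minimizes makespan by favoring the fastest machine can lose reliability when that machine is also the most failure-prone.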
42

Evaluating the Robustness of Resource Allocations Obtained through Performance Modeling with Stochastic Process Algebra

Srivastava, Srishti 09 May 2015 (has links)
Recent developments in the field of parallel and distributed computing have led to a proliferation of large, computationally intensive mathematical, scientific, and engineering problems that consist of several parallelizable parts and several non-parallelizable (sequential) parts. In a parallel and distributed computing environment, the performance goal is to optimize the execution of parallelizable parts of an application on concurrent processors. This requires efficient application scheduling and resource allocation for mapping applications to a set of suitable parallel processors such that the overall performance goal is achieved. However, such computational environments are often prone to unpredictable variations in application (problem and algorithm) and system characteristics. Therefore, a robustness study is required to guarantee a desired level of performance. Given an initial workload, a mapping of applications to resources is considered to be robust if that mapping optimizes execution performance and guarantees a desired level of performance in the presence of unpredictable perturbations at runtime. In this research, a stochastic process algebra, Performance Evaluation Process Algebra (PEPA), is used for obtaining resource allocations via a numerical analysis of performance modeling of the parallel execution of applications on parallel computing resources. The PEPA performance model is translated into an underlying mathematical Markov chain model for obtaining performance measures. Further, a robustness analysis of the allocation techniques is performed for finding a robust mapping from a set of initial mapping schemes. The numerical analysis of the performance models has confirmed agreement with the simulation results of earlier research in the existing literature. When compared to direct experiments and simulations, numerical models and the corresponding analyses are easier to reproduce, do not incur any setup or installation costs, do not impose any prerequisites for learning a simulation framework, and are not limited by the complexity of the underlying infrastructure or simulation libraries.
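A minimal numeric sketch of the underlying idea rather than PEPA itself: the kind of continuous-time Markov chain a PEPA model compiles down to is described by a generator matrix, and its steady-state distribution yields performance measures such as utilization. The states and rates below are invented for illustration.

```python
import numpy as np

# Hypothetical 3-state CTMC for one processor: 0 = idle, 1 = computing, 2 = communicating.
# Q[i][j] is the transition rate from state i to state j; diagonals make each row sum to zero.
arrival, compute, comm = 2.0, 5.0, 4.0       # invented rates (events per time unit)
Q = np.array([
    [-arrival,  arrival,   0.0    ],
    [ 0.0,     -compute,   compute],
    [ comm,     0.0,      -comm   ],
])

# The steady-state distribution pi solves pi @ Q = 0 subject to sum(pi) = 1.
A = np.vstack([Q.T, np.ones(len(Q))])
b = np.append(np.zeros(len(Q)), 1.0)
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

print("steady-state distribution:", pi)
print("utilization (fraction of time computing):", pi[1])
```

A real PEPA model composes many such sequential components through shared activities, so the state space and generator are derived automatically; the numerical step of solving for the stationary distribution is the same in spirit.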
43

Development and Application of an Analyst Process Model for a Search Task Scenario

Karl, Hendrickson K. 04 June 2014 (has links)
No description available.
44

Hyperspectral Target Detection Performance Modeling

Morman, Christopher Joseph January 2015 (has links)
No description available.
45

Simulation-based Cognitive Workload Modeling And Evaluation Of Adaptive Automation Invoking And Revoking Strategies

Rusnock, Christina 01 January 2013 (has links)
In human-computer systems, such as supervisory control systems, large volumes of incoming and complex information can degrade overall system performance. Strategically integrating automation to offload tasks from the operator has been shown to increase not only human performance but also operator efficiency and safety. However, increased automation allows for increased task complexity, which can lead to high cognitive workload and degradation of situational awareness. Adaptive automation is one potential solution to resolve these issues, while maintaining the benefits of traditional automation. Adaptive automation occurs dynamically, with the quantity of automated tasks changing in real time to meet performance or workload goals. While numerous studies evaluate the relative performance of manual and adaptive systems, little attention has focused on the implications of selecting particular invoking or revoking strategies for adaptive automation. Thus, evaluations of adaptive systems tend to focus on the relative performance among multiple systems rather than the relative performance within a system. This study takes an intra-system approach, specifically evaluating the relationship between cognitive workload and situational awareness that occurs when selecting a particular invoking-revoking strategy for an adaptive system. The case scenario is a human supervisory control situation that involves a system operator who receives and interprets intelligence outputs from multiple unmanned assets, and then identifies and reports potential threats and changes in the environment. In order to investigate this relationship between workload and situational awareness, discrete event simulation (DES) is used. DES is a standard technique in the analysis of systems, and the advantage of using DES to explore this relationship is that it can represent a human-computer system as the state of the system evolves over time. Furthermore, and most importantly, a well-designed DES model can represent the human operators, the tasks to be performed, and the cognitive demands placed on the operators. In addition to evaluating the cognitive workload to situational awareness tradeoff, this research demonstrates that DES can quite effectively model and predict human cognitive workload, specifically for system evaluation. This research finds that the predicted workload of the DES models highly correlates with well-established subjective measures and is more predictive of cognitive workload than numerous physiological measures. This research then uses the validated DES models to explore and predict the cognitive workload impacts of adaptive automation through various invoking and revoking strategies. The study provides insights into the workload-situational awareness tradeoffs that occur when selecting particular invoking and revoking strategies. First, in order to establish an appropriate target workload range, it is necessary to account for both performance goals and the portion of the workload-performance curve for the task in question. Second, establishing an invoking threshold may require a tradeoff between workload and situational awareness, which is influenced by the task’s location on the workload-situational awareness continuum. Finally, this study finds that revoking strategies differ in their ability to achieve workload and situational awareness goals.
For the case scenario examined, revoking strategies based on duration are best suited to improve workload, while revoking strategies based on revoking thresholds are better for maintaining situational awareness.
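A toy sketch of one invoking-revoking policy (not the study's validated models): tasks arrive over time, operator workload is approximated by the number of tasks in progress, automation is invoked whenever workload reaches a threshold, and each automated task is handed back to the operator after a fixed duration. The arrival rate, service times, threshold, and revoking duration are all invented.

```python
import heapq, random

def run(sim_time=480.0, invoke_threshold=5, revoke_after=30.0, seed=0):
    """Event-driven loop: 'arrive' adds operator load, 'done' removes it,
    'revoke' hands an automated task back to the operator."""
    rng = random.Random(seed)
    events = [(rng.expovariate(1 / 2.0), "arrive")]       # first task arrival
    load, automated, samples = 0, 0, []
    while events:
        t, kind = heapq.heappop(events)
        if t > sim_time:
            break
        if kind == "arrive":
            heapq.heappush(events, (t + rng.expovariate(1 / 2.0), "arrive"))
            if load >= invoke_threshold:                   # invoke: offload to automation
                automated += 1
                heapq.heappush(events, (t + revoke_after, "revoke"))
            else:
                load += 1
                heapq.heappush(events, (t + rng.uniform(4, 8), "done"))
        elif kind == "done":
            load = max(load - 1, 0)
        else:                                              # "revoke": task returns to operator
            automated = max(automated - 1, 0)
            load += 1
            heapq.heappush(events, (t + rng.uniform(4, 8), "done"))
        samples.append(load)
    return sum(samples) / len(samples)

print("mean operator load:", round(run(), 2))
```

Varying invoke_threshold and revoke_after in a model of this kind is one way to explore, before any human-subject experiment, how a duration-based revoking rule trades operator workload against the time the operator spends out of the loop.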
46

A performance model for wormhole-switched interconnection networks under self-similar traffic.

Min, Geyong, Ould-Khaoua, M. January 2004 (has links)
Many recent studies have convincingly demonstrated that network traffic exhibits a noticeable self-similar nature which has a considerable impact on queuing performance. However, the networks used in current multicomputers have been primarily designed and analyzed under the assumption of the traditional Poisson arrival process, which is inherently unable to capture traffic self-similarity. Consequently, it is crucial to reexamine the performance properties of multicomputer networks in the context of more realistic traffic models before practical implementations show their potential faults. In an effort toward this end, this paper proposes the first analytical model for wormhole-switched k-ary n-cubes in the presence of self-similar traffic. Simulation experiments demonstrate that the proposed model exhibits a good degree of accuracy for various system sizes and under different operating conditions. The analytical model is then used to investigate the implications of traffic self-similarity on network performance. This study reveals that the network suffers considerable performance degradation when subjected to self-similar traffic, stressing the great need for improving network performance to ensure efficient support for this type of traffic.
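A small illustrative sketch, separate from the paper's closed-form k-ary n-cube model, of why the arrival-process assumption matters: superposing heavy-tailed Pareto ON/OFF sources yields bursty, self-similar-like traffic whose variability decays far more slowly under time aggregation than that of Poisson-like arrivals. All rates and parameters are invented.

```python
import random

def poisson_like_counts(slots, rng, p=0.05, trials=10):
    """Per-slot arrival counts from independent Bernoulli trials (Poisson-like)."""
    return [sum(rng.random() < p for _ in range(trials)) for _ in range(slots)]

def pareto_onoff_counts(slots, rng, alpha=1.5, n_sources=20, mean_off=20.0):
    """Superpose ON/OFF sources whose ON periods are heavy-tailed Pareto(alpha)."""
    counts = [0] * slots
    for _ in range(n_sources):
        t = 0
        while t < slots:
            on = int(rng.paretovariate(alpha))             # heavy-tailed burst length
            for i in range(t, min(t + on, slots)):
                counts[i] += 1
            t += on + 1 + int(rng.expovariate(1 / mean_off))
    return counts

def variance_of_block_means(counts, block):
    """Slowly decaying variance of block means as the block grows signals self-similarity."""
    means = [sum(counts[i:i + block]) / block for i in range(0, len(counts) - block + 1, block)]
    mu = sum(means) / len(means)
    return sum((m - mu) ** 2 for m in means) / len(means)

rng = random.Random(0)
po = poisson_like_counts(50_000, rng)
ss = pareto_onoff_counts(50_000, rng)
for block in (1, 10, 100):
    print(f"block {block:3d}: poisson-like var {variance_of_block_means(po, block):.4f}, "
          f"on/off var {variance_of_block_means(ss, block):.4f}")
```

For the Poisson-like stream the block-mean variance falls roughly as 1/block, while the heavy-tailed ON/OFF stream retains a much larger share of its variability as the block grows, which is exactly the burstiness a Poisson-based queueing analysis would miss.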
47

Scalable and Energy Efficient Execution Methods for Multicore Systems

Li, Dong 16 February 2011 (has links)
Multicore architectures impose great pressure on resource management. The exploration spaces available for resource management increase explosively, especially for large-scale high-end computing systems. The availability of abundant parallelism causes scalability concerns at all levels. Multicore architectures also impose pressure on power management. Growth in the number of cores causes continuous growth in power. In this dissertation, we introduce methods and techniques to enable scalable and energy efficient execution of parallel applications on multicore architectures. We study strategies and methodologies that combine dynamic concurrency throttling (DCT) and dynamic voltage and frequency scaling (DVFS) for the hybrid MPI/OpenMP programming model. Our algorithms yield substantial energy savings (8.74% on average and up to 13.8%) with either negligible performance loss or performance gain (up to 7.5%). To save additional energy for high-end computing systems, we propose a power-aware MPI task aggregation framework. The framework predicts the performance effect of task aggregation in both computation and communication phases and its impact in terms of execution time and energy of MPI programs. Our framework provides accurate predictions that lead to substantial energy savings through aggregation (64.87% on average and up to 70.03%) with tolerable performance loss (under 5%). As we aggregate multiple MPI tasks within the same node, we have the scalability concern of memory registration for high performance networking. We propose a new memory registration/deregistration strategy to reduce registered memory on multicore architectures with helper threads. We investigate design policies and performance implications of the helper thread approach. Our method efficiently reduces registered memory (23.62% on average and up to 49.39%) and avoids memory registration/deregistration costs for reused communication memory. Our system enables the execution of application input sets that could not run to completion under the memory registration limitation. / Ph. D.
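A hedged sketch of the general DCT+DVFS selection idea, not the dissertation's algorithm: given a measured or predicted (time, power) pair for each (OpenMP thread count, CPU frequency) configuration of a program phase, choose the configuration that minimizes energy while keeping the slowdown relative to the fastest configuration within a bound. Every number in the profile below is invented.

```python
# Hypothetical per-phase profile: (threads, GHz) -> (time in seconds, average power in watts).
profile = {
    (8, 2.6): (10.0, 180.0),
    (8, 2.0): (10.4, 140.0),
    (4, 2.6): (10.3, 150.0),
    (4, 2.0): (13.8, 115.0),
}

def pick_config(profile, max_slowdown=0.05):
    """Return the (threads, frequency) pair with minimum energy (time * power)
    among configurations within max_slowdown of the fastest time."""
    fastest = min(t for t, _ in profile.values())
    feasible = {cfg: t * p for cfg, (t, p) in profile.items()
                if t <= fastest * (1 + max_slowdown)}
    return min(feasible, key=feasible.get)

cfg = pick_config(profile)
t, p = profile[cfg]
print(f"selected {cfg}: {t} s, {t * p:.0f} J "
      f"(vs {10.0 * 180.0:.0f} J at the fastest configuration)")
```

With these invented numbers the lower-frequency configuration wins: it costs 4% in time but saves roughly 19% in energy, which mirrors the kind of tradeoff the abstract reports.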
48

Automated Runtime Analysis and Adaptation for Scalable Heterogeneous Computing

Helal, Ahmed Elmohamadi Mohamed 29 January 2020 (has links)
In the last decade, there have been tectonic shifts in computer hardware as sequential CPU performance has approached its physical limits. As a consequence, current high-performance computing (HPC) systems integrate a wide variety of compute resources with different capabilities and execution models, ranging from multi-core CPUs to many-core accelerators. While such heterogeneous systems can enable dramatic acceleration of user applications, extracting optimal performance via manual analysis and optimization is a complicated and time-consuming process. This dissertation presents graph-structured program representations to reason about the performance bottlenecks on modern HPC systems and to guide novel automation frameworks for performance analysis and modeling and runtime adaptation. The proposed program representations exploit domain knowledge and capture the inherent computation and communication patterns in user applications, at multiple levels of computational granularity, via compiler analysis and dynamic instrumentation. The empirical results demonstrate that the introduced modeling frameworks accurately estimate the realizable parallel performance and scalability of a given sequential code when ported to heterogeneous HPC systems. As a result, these frameworks enable efficient workload distribution schemes that utilize all the available compute resources in a performance-proportional way. In addition, the proposed runtime adaptation frameworks significantly improve the end-to-end performance of important real-world applications which suffer from limited parallelism and fine-grained data dependencies. Specifically, compared to the state-of-the-art methods, such an adaptive parallel execution achieves up to an order-of-magnitude speedup on the target HPC systems while preserving the inherent data dependencies of user applications. / Doctor of Philosophy / Current supercomputers integrate a massive number of heterogeneous compute units with varying speed, computational throughput, memory bandwidth, and memory access latency. This trend represents a major challenge to end users, as their applications have been designed from the ground up to primarily exploit homogeneous CPUs. While heterogeneous systems can deliver several orders of magnitude speedup compared to traditional CPU-based systems, end users need extensive software and hardware expertise as well as significant time and effort to efficiently utilize all the available compute resources. To streamline such a daunting process, this dissertation presents automated frameworks for analyzing and modeling the performance on parallel architectures and for transforming the execution of user applications at runtime. The proposed frameworks incorporate domain knowledge and adapt to the input data and the underlying hardware using novel static and dynamic analyses. The experimental results show the efficacy of the introduced frameworks across many important application domains, such as computational fluid dynamics (CFD) and computer-aided design (CAD). In particular, the adaptive execution approach on heterogeneous systems achieves up to an order-of-magnitude speedup over the optimized parallel implementations.
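A small sketch of one ingredient mentioned above, performance-proportional workload distribution, under invented throughput numbers: work items are split across a CPU and two GPUs in proportion to each device's measured (or modeled) processing rate, so all devices finish at roughly the same time.

```python
def proportional_split(total_items, rates):
    """Split total_items across devices proportionally to their rates,
    handing out the rounding leftovers to the fastest devices first."""
    total_rate = sum(rates.values())
    shares = {dev: int(total_items * r / total_rate) for dev, r in rates.items()}
    leftover = total_items - sum(shares.values())
    for dev in sorted(rates, key=rates.get, reverse=True)[:leftover]:
        shares[dev] += 1
    return shares

# Hypothetical measured rates in work items per second.
rates = {"cpu": 120.0, "gpu0": 900.0, "gpu1": 860.0}
shares = proportional_split(1_000_000, rates)
print(shares)
print({dev: round(shares[dev] / rates[dev], 1) for dev in shares})  # per-device time, ~equal
```

In a real system the rates would come from the performance model or from online measurements, and the split would be recomputed whenever those estimates change.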
49

Performance Modeling, Optimization, and Characterization on Heterogeneous Architectures

Panwar, Lokendra Singh 21 October 2014 (has links)
Today, heterogeneous computing has truly reshaped the way scientists think and approach high-performance computing (HPC). Hardware accelerators such as general-purpose graphics processing units (GPUs) and the Intel Many Integrated Core (MIC) architecture continue to make inroads in accelerating large-scale scientific applications. These advancements, however, introduce new challenges for the scientific community, such as selecting the best processor for an application, devising effective performance optimization strategies, and maintaining performance portability across architectures. In this thesis, we present our techniques and approach to address some of these significant issues. First, we present a fully automated approach to project the relative performance of an OpenCL program over different GPUs. Performance projections can be made within a small amount of time, and the projection overhead stays relatively constant with the input data size. As a result, the technique can help runtime tools make dynamic decisions about which GPU would run faster for a given kernel. Use cases of this technique include scheduling or migrating GPU workloads over a heterogeneous cluster with different types of GPUs. We then present our approach to accelerate a seismology modeling application that is based on the finite difference method (FDM), using MPI and CUDA over a hybrid CPU+GPU cluster. We describe the generic computational complexities involved in porting such applications to the GPUs and present our strategy of efficient performance optimization and characterization. We also show how performance modeling can be used to reason about and drive the hardware-specific optimizations on the GPU. The performance evaluation of our approach delivers a maximum speedup of 23-fold with a single GPU and 33-fold with dual GPUs per node over the serial version of the application, which in turn results in a many-fold speedup when coupled with the MPI distribution of the computation across the cluster. We also study the efficacy of GPU-integrated MPI, with MPI-ACC as an example implementation, in a seismology modeling application and discuss the lessons learned. / Master of Science
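A generic sketch of the intuition behind projecting relative kernel performance across GPUs, not the thesis's automated OpenCL framework: scale a kernel's measured time on a reference device by the ratio of the roofline-style attainable rate, min(peak compute, arithmetic intensity × peak bandwidth), between devices. Device peaks and the kernel's arithmetic intensity are invented.

```python
# Hypothetical device peaks: name -> (GFLOP/s, GB/s).
devices = {
    "gpu_A": (4000.0, 300.0),   # reference device
    "gpu_B": (9000.0, 700.0),
    "gpu_C": (2500.0, 450.0),
}

def project_time(time_on_ref, ref, target, intensity):
    """Roofline-style projection: the kernel's attainable rate on each device is
    min(peak compute, intensity * peak bandwidth); scale the measured time by the ratio."""
    def attainable(dev):
        flops, bandwidth = devices[dev]
        return min(flops, intensity * bandwidth)
    return time_on_ref * attainable(ref) / attainable(target)

# Kernel measured at 2.1 s on gpu_A, arithmetic intensity 4 FLOP/byte (memory-bound here).
for dev in devices:
    print(dev, round(project_time(2.1, "gpu_A", dev, intensity=4.0), 2), "s projected")
```

Such a back-of-the-envelope projection only captures compute- and bandwidth-bound behavior; an automated framework like the one described above would account for many more device and kernel characteristics.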
50

Theories and Techniques for Efficient High-End Computing

Ge, Rong 02 November 2007 (has links)
Today, power consumption costs supercomputer centers millions of dollars annually, and the heat produced can reduce system reliability and availability. Achieving high performance while reducing power consumption is challenging since power and performance are inextricably interwoven; reducing power often results in degradation in performance. This thesis aims to address these challenges by providing theories, techniques, and tools to 1) accurately predict performance and improve it in systems with advanced hierarchical memories, 2) understand and evaluate power and its impacts on performance, and 3) control power and performance for maximum efficiency. Our theories, techniques, and tools have been applied to high-end computing systems. Our theoretical models can improve algorithm performance by up to 59% and accurately predict the impacts of power on performance. Our techniques can evaluate power consumption of high-end computing systems and their applications with fine granularity and save up to 36% energy with little performance degradation. / Ph. D.
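A toy numerical sketch of the power-performance coupling described above, under invented constants: execution time is split into a frequency-sensitive compute part and a roughly frequency-insensitive memory part, power into a static term and a dynamic term that grows with frequency, and energy is compared across frequency settings.

```python
# Invented workload and platform constants.
CPU_CYCLES = 3e11     # cycles whose duration scales with CPU frequency
MEM_TIME = 300.0      # seconds of memory-bound time, roughly frequency-insensitive
P_STATIC = 60.0       # watts drawn regardless of frequency
C_DYN = 25.0          # dynamic-power coefficient: P_dyn ~ C_DYN * f**3 with f in GHz

def time_at(f_ghz):
    return CPU_CYCLES / (f_ghz * 1e9) + MEM_TIME

def energy_at(f_ghz):
    return (P_STATIC + C_DYN * f_ghz ** 3) * time_at(f_ghz)

for f in (1.2, 1.6, 2.0, 2.4):
    print(f"{f} GHz: time {time_at(f):6.1f} s   energy {energy_at(f) / 1000:6.1f} kJ")
```

In this simplified model, halving the frequency lengthens the run by only about 30% because the memory-bound part dominates, while dynamic power falls sharply, so the lower settings use far less energy; predicting rather than assuming this kind of interplay is what the thesis's models aim to do.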
