Global ETD Search

841	An Adaptive Framework for Managing Heterogeneous Many-Core Clusters Rafique, Muhammad Mustafa 21 October 2011 The computing needs and the input and result datasets of modern scientific and enterprise applications are growing exponentially. To support such applications, High-Performance Computing (HPC) systems need to employ thousands of cores and innovative data management. At the same time, an emerging trend in designing HPC systems is to leverage specialized asymmetric multicores, such as IBM Cell and AMD Fusion APUs, and commodity computational accelerators, such as programmable GPUs, which exhibit excellent price-to-performance ratios as well as much-needed high energy efficiency. While such accelerators have been studied in detail as stand-alone computational engines, integrating the accelerators into large-scale distributed systems with heterogeneous computing resources for data-intensive computing presents unique challenges and trade-offs. Traditional programming and resource management techniques cannot be directly applied to many-core accelerators in heterogeneous distributed settings, given the complex and custom instruction set architectures, memory hierarchies, and I/O characteristics of different accelerators. In this dissertation, we explore the design space of using commodity accelerators, specifically IBM Cell and programmable GPUs, in distributed settings for data-intensive computing and propose an adaptive framework for programming and managing heterogeneous clusters. The proposed framework provides a MapReduce-based extended programming model for heterogeneous clusters, which distributes tasks between asymmetric compute nodes by considering workload characteristics and capabilities of individual compute nodes. The framework provides efficient data prefetching techniques that leverage general-purpose cores to stage the input data in the private memories of the specialized cores.
We also explore the use of an advanced layered-architecture based software engineering approach and provide mixin-layers based reusable software components to enable easy and quick deployment of heterogeneous clusters. The framework also provides multiple resource management and scheduling policies under different constraints, e.g., energy-aware and QoS-aware, to support executing concurrent applications on multi-tenant heterogeneous clusters. When applied to representative applications and benchmarks, our framework yields significantly improved performance in terms of programming efficiency and optimal resource management as compared to conventional, hand-tuned approaches to programming and managing accelerator-based heterogeneous clusters. / Ph. D. Heterogeneous Computing High-Performance Computing Resource Sharing Resource Management and Scheduling Programming Models
842	Prediction Models for Multi-dimensional Power-Performance Optimization on Many Cores Shah, Ankur Savailal 28 May 2008 Power has become a primary concern for HPC systems. Dynamic voltage and frequency scaling (DVFS) and dynamic concurrency throttling (DCT) are two software tools (or knobs) for reducing the dynamic power consumption of HPC systems. To date, few works have considered the synergistic integration of DVFS and DCT in performance-constrained systems, and, to the best of our knowledge, no prior research has developed application-aware simultaneous DVFS and DCT controllers in real systems and parallel programming frameworks. We present a multi-dimensional, online performance prediction framework, which we deploy to address the problem of simultaneous runtime optimization of DVFS, DCT, and thread placement on multi-core systems. We present results from an implementation of the prediction framework in a runtime system linked to the Intel OpenMP runtime environment and running on a real dual-processor quad-core system as well as a dual-processor dual-core system. We show that the prediction framework derives near-optimal settings of the three power-aware program adaptation knobs that we consider. Our overall runtime optimization framework achieves significant reductions in energy (12.27% mean) and ED² (29.6% mean), through simultaneous power savings (3.9% mean) and performance improvements (10.3% mean). Our prediction and adaptation framework outperforms earlier solutions that adapt only DVFS or DCT, as well as one that sequentially applies DCT then DVFS. Further, our results indicate that prediction-based schemes for runtime adaptation compare favorably and typically improve upon heuristic search-based approaches in both performance and energy savings. / Master of Science concurrency throttling power-aware computing runtime adaptation performance prediction high-performance computing Multicore processors
843	Power Saving Analysis and Experiments for Large Scale Global Optimization Cao, Zhenwei 03 August 2009 Green computing, an emerging field of research that seeks to reduce excess power consumption in high performance computing (HPC), is gaining popularity among researchers. Research in this field often relies on simulation or only uses a small cluster, typically 8 or 16 nodes, because of the lack of hardware support. In contrast, System G at Virginia Tech is a 2592-processor supercomputer equipped with power-aware components suitable for large-scale green computing research. DIRECT is a deterministic global optimization algorithm, implemented in the mathematical software package VTDIRECT95. This thesis explores the potential energy savings for the parallel implementation of DIRECT, called pVTdirect, when used with a large-scale computational biology application, parameter estimation for a budding yeast cell cycle model, on System G. Two power-aware approaches for pVTdirect are developed and compared against the CPUSPEED power saving system tool. The results show that knowledge of the parallel workload of the underlying application is beneficial for power management. / Master of Science VTDIRECT95 power aware computing high performance computing DVFS large scale global optimization budding yeast problem
844	Enabling the use of Heterogeneous Computing for Bioinformatics Bijanapalli Chakri, Ramakrishna 02 October 2013 The huge amount of information in the encoded sequence of DNA and the increasing interest in making new discoveries have spurred efforts to accelerate the DNA sequencing and alignment processes. Heterogeneous systems, which use different types of computational units, have gained prominence in high performance computing in recent years; however, the expertise in multiple domains and the skills required to program these systems are a hindrance to bioinformaticians in rapidly deploying their applications on these heterogeneous systems. This work attempts to make a heterogeneous system, the Convey HC-1, with an x86-based host processor and an FPGA-based co-processor, accessible to bioinformaticians. First, a highly efficient dynamic programming based Smith-Waterman kernel is implemented in hardware, which is able to achieve a peak throughput of 307.2 Giga Cell Updates per Second (GCUPS) on the Convey HC-1. A dynamic programming accelerator interface is provided to any application that uses Smith-Waterman. This implementation is also extended to General Purpose Graphics Processing Units (GP-GPUs), where it achieved a peak throughput of 9.89 GCUPS on an NVIDIA GTX580 GPU. Second, a well-known graphical programming tool, LabVIEW, is enabled as a programming tool for the Convey HC-1. A connection is established between the graphical interface and the Convey HC-1 to control and monitor the application running on the FPGA-based co-processor. / Master of Science Field programmable gate arrays Hardware Acceleration High Performance Computing DNA Alignment LabVIEW Heterogeneous Computing GP-GPUs
845	On the Interaction of High-Performance Network Protocol Stacks with Multicore Architectures Chunangad Narayanaswamy, Ganesh 20 May 2008 Multicore architectures have been one of the primary driving forces in the recent rapid growth in high-end computing systems, contributing to their growing scale and capabilities. With significant enhancements in the high-speed networking technologies and protocol stacks that support these high-end systems, there is a growing need to understand closely the interaction between them. Since these two components have been designed mostly independently, there tend to be serious and often surprising interactions that result in heavy asymmetry in the effective capability of the different cores, thereby degrading the performance of various applications. Similarly, depending on the communication pattern of the application and the layout of processes across nodes, these interactions could potentially introduce network scalability issues, which is also an important concern for system designers. In this thesis, we analyze these asymmetric interactions and propose and design a novel systems level management framework called SIMMer (Systems Interaction Mapping Manager) that automatically monitors these interactions and dynamically manages the mapping of processes on processor cores to transparently maximize application performance. Performance analysis of SIMMer shows that it can improve the communication performance of applications by more than twofold and the overall application performance by 18%. We further analyze the impact of contention in network and processor resources and relate it to the communication pattern of the application. Insights learnt from these analyses can lead to efficient runtime configurations for scientific applications on multicore architectures. / Master of Science Multicore Architectures High-Performance Networking Process-to-Core Mapping Network Contention Runtime Adaptation
846	Bridging the Diffusion of Innovation Chasm for Green Housing Sanderford, Andrew R. 28 August 2013 Limited transaction and unit attribute information curtails the diffusion potential of green homes and creates significant valuation and underwriting problems for the housing debt capital markets, more specifically mortgage originators (lenders) and appraisers. Put into the context of the technology adoption life cycle, this missing information prevents green homes from crossing the chasm into the mainstream market. As lenders and appraisers are the gatekeepers of the mainstream mortgage markets, they will be key stakeholders in any strategy for green homes to cross this chasm. The missing transaction and attribute data create two opportunities for scholarship. The first opportunity is to create and provide preliminary evidence of the chasm in the green housing marketplace. The second opportunity is to analyze, in the context of this chasm, what information and tools appraisers are using, at present, to estimate the value of high performance homes. / Ph. D. High Performance Housing Appraisal Green Building Valuation Innovation Diffusion Chasm Gatekeeper
847	Ultra-High Performance Concrete Shear Walls in Tall Buildings Dacanay, Thomas Christian 18 April 2016 This thesis presents the results of an effort to quantify the implications of using ultra-high performance concrete (UHPC) for shear walls in tall buildings considering structural efficiency and environmental sustainability. The Lattice Discrete Particle Model (LDPM) was used to simulate the response to failure of concrete shear walls without web steel bar reinforcement under lateral loading and constant axial compressive loading. The structural efficiency of UHPC with simulated compressive strength of f'c = 231 MPa was compared to that of a high-performance concrete (HPC) with f'c = 51.7 MPa simulated compressive strength. UHPC shear walls were found to have equal uncracked stiffness and superior post-cracking capacity at a thickness 58% of the HPC shear wall thickness, and at 59% of the HPC shear wall weight. Next, the environmental sustainability of UHPC with compressive strength f'c = 220-240 MPa was compared to that of an HPC with compressive strength f'c = 49 MPa with a life-cycle assessment (LCA) approach, using SimaPro sustainability software. At a thickness 58% of the HPC shear wall thickness, UHPC shear walls with 0% fiber by volume were found to have an environmental impact 6% to 10% worse than that of HPC shear walls, and UHPC shear walls with 2% fiber by volume were found to have an environmental impact 47% to 58% worse than that of HPC shear walls. The results detailed herein will allow for design guidelines to be developed which take advantage of UHPC response in shear. Additionally, this work may be implemented into topology optimization frameworks that incorporate the potential improvements in structural efficiency and sustainability through using UHPC. / Master of Science Ultra-high performance concrete shear wall Lattice Discrete Particle Model Sustainability
848	Polymeric Complexes and Composites for Aerospace and Biomedical Applications Zhang, Rui 01 August 2018 Polymers, along with metals and ceramics, are major solid materials that are widely used in all kinds of applications. Polymers are of particular interest because they can be tailored with desirable properties. Polymer-based complexes and composites, which contain both the polymers and other components such as metal oxides/salts, are playing an increasingly important role in the materials field. Such complexes and composites may display the benefits of both the polymer and other materials, endowing them with excellent functionalities for targeted applications. In this dissertation, research was conducted to synthesize novel polymers and build polymeric complexes and composites for biomedical and aerospace applications. In chapter 3, two methods were developed and optimized to fabricate sub-micron high-performance polymer particles, which were subsequently coated onto functional carbon fibers via electrostatic interactions for the purpose of fabricating carbon fiber reinforced polymer composites. In chapter 4, a novel Pluronic® P85-bearing penta-block copolymer was synthesized and formed complexes with magnetite. The complexes were non-toxic to cells under normal conditions but selectively killed cancer cells, without killing normal cells, when subjected to a low-frequency alternating current magnetic field. Such results demonstrated the potential of such polymeric complexes in cancer treatment. Chapter 5 described the synthesis of several ionic graft copolymers, primarily bisphosphonate-containing polymers, and the fabrication of polymer-magnetite complexes. The in-depth investigation results indicated the capability of the complexes for potential drug delivery, imaging, and other biomedical applications.
Chapter 6 described additional polymer synthesis and particle or complex fabrication for potential drug delivery and imaging, as well as radiation shielding. / PHD / Polymers, metals, and ceramics are three major classes of solid materials that are used every day and everywhere. Polymers are of particular significance because they can be tailored to possess certain desirable properties, and, hence, they are playing a more and more important role as substitutes for metals and ceramics in a wide array of applications. Engineering and high-performance polymers were synthesized with excellent properties for biomedical and aerospace applications. Polymers can be fabricated into composites and complexes which contain not only polymers but also other materials, such as metal oxides/salts, carbon fibers, glass fibers, etc. When composites and complexes are made with sufficient stability, the materials may display the advantages of each component, making them more promising for specific applications. In this dissertation, effort was focused on developing versatile polymer-based complexes and composites for aerospace and biomedical applications. Chapter 3 describes the fabrication of sub-micron high-performance polymer particles by two methods and they were subsequently coated onto functional carbon fibers for making composites. Chapter 4 describes the synthesis of a novel copolymer that formed complexes with magnetite nanoparticles. The complexes were able to selectively kill cancerous cells without killing normal cells when exposed to an external magnetic field, and thus these materials have potential for cancer treatment. Chapter 5 describes the fabrication of phosphonate-bearing ionic copolymer-magnetite complexes and their potential applications in drug delivery, imaging, and other biomedical applications. 
Chapter 6 describes the synthesis of polymers and their corresponding complexes for potential drug delivery and imaging, as well as potential radiation shielding applications. high-performance polymer carbon fiber suspending agent ionic copolymer magnetite particles imaging drug delivery
849	Scalable Data Management for Object-based Storage Systems Wadhwa, Bharti 19 August 2020 Parallel I/O performance is crucial to sustain scientific applications on large-scale High-Performance Computing (HPC) systems. Large-scale distributed storage systems, in particular object-based storage systems, face severe challenges in managing data efficiently. Inefficient data management leads to poor I/O and storage performance in HPC applications and scientific workflows. Some of the main challenges for efficient data management arise from poor resource allocation, load imbalance in object storage targets, and inflexible data sharing between applications in a workflow. In addition, parallel I/O makes it challenging to shoehorn new interfaces, such as taking advantage of multiple layers of storage and support for analysis in the data path. Solving these challenges to improve the performance and efficiency of object-based storage systems is crucial, especially for the upcoming era of exascale systems. This dissertation is focused on solving these major challenges in object-based storage systems by providing scalable data management strategies. In the first part of the dissertation (Chapter 3), we present a resource contention aware load balancing tool (iez) for large-scale distributed object-based storage systems. In Chapter 4, we extend iez to support Progressive File Layout for the object-based storage system Lustre. In the second part (Chapter 5), we present a technique to facilitate data sharing in scientific workflows using object-based storage, with our proposed tool Workflow Data Communicator.
In the last part of this dissertation, we present a solution for transparent data management in the multi-layer storage hierarchy of present and next-generation HPC systems. This dissertation shows that by intelligently employing scalable data management techniques, scientific applications' and workflows' flexibility and performance in object-based storage systems can be enhanced manyfold. Our proposed data management strategies can guide next-generation HPC storage systems' software design to efficiently support data for scientific applications and workflows. / Doctor of Philosophy / Large-scale object-based storage systems face severe challenges in managing data efficiently for HPC applications and workflows. These storage systems often manage and share data inflexibly, without considering the load imbalance and resource contention in the underlying multi-layer storage hierarchy. This dissertation first studies how resource contention and inflexible data sharing mechanisms impact HPC applications' storage and I/O performance, and then presents a series of efficient techniques, tools and algorithms to provide efficient and scalable data management for current and next-generation HPC storage systems. Lustre Ceph High Performance Computing Parallel File Systems Parallel I/O Optimization Load Imbalance Resource Contention
850	On the Use of Containers in High Performance Computing Abraham, Subil 09 July 2020 The lightweight, portable, and flexible nature of containers is driving their widespread adoption in cloud solutions. Data analysis and deep learning applications have especially benefited from containerized solutions. As such data analysis is also being utilized in the high performance computing (HPC) domain, the need for container support in HPC has become paramount. However, container adoption in HPC faces crucial performance and I/O challenges. One obstacle is that while there have been container solutions for HPC, such solutions have not been thoroughly investigated, especially from the aspect of their impact on the crucial I/O throughput needs of HPC. To this end, this paper provides a first-of-its-kind empirical analysis of state-of-the-art representative container solutions (Docker, Podman, Singularity, and Charliecloud) in HPC environments, especially how containers interact with the HPC storage systems. We present the design of an analysis framework that is deployed on all nodes in an HPC environment, and captures aspects such as CPU, memory, network, and file I/O statistics from the nodes and the storage system. We are able to garner key insights from our analysis, e.g., Charliecloud outperforms other container solutions in terms of container start-up time, while Singularity and Charliecloud are equivalent in I/O throughput. But this comes at a cost, as Charliecloud invokes the most metadata and I/O operations on the underlying Lustre file system. By identifying such optimization opportunities, we can enhance performance of containers atop HPC and help the aforementioned applications. / Master of Science / Containers are a technology that allows applications to be packaged along with their ideal environment, all the way down to their preferred operating system.
This allows an application to run anywhere that can support containers without a huge hit to the application performance. Hence containers have seen wide adoption for use in the cloud. These qualities have also made them very appealing for use in the world of scientific research in national labs. Modern research heavily relies on the power of computing in order to model, simulate, and test the behavior of real world entities, often making use of large amounts of data and utilizing machine learning and deep learning. Doing this often requires the high performance computing power found in supercomputers. In most cases, scientists just want to be able to write their code and expect it to just work. Their applications might depend on other source code that forms part of their standard toolkit and that they would expect to also be installed in the supercomputing environment. This may not always be the case, taking the scientist's focus away from their work in order to ensure their requirements are set up in the supercomputing environment, which might require extensive cooperation with the operations team responsible for the supercomputers. Containers easily solve this problem because they can package everything together. However, the use of containers in these environments has not been extensively tested, especially for applications that are very heavy on the analysis of large quantities of data. To fill this gap, this work analyzes the performance of several state-of-the-art container technologies (Docker, Podman, Singularity, Charliecloud), with a particular focus on their interaction with the Lustre data storage systems widely used in supercomputing environments. As part of this work, we design an analysis setup that captures the behavior of various aspects of the high performance computing environment like CPU, memory, network usage and data movement while using containers to run data heavy applications.
We garner important insights about their performance that can help inform the best choice of container technology given an environment and the kind of application that needs to be run. Container Performance High Performance Computing Parallel File Systems HPC Storage and I/O
