21

The Value Proposition of Campus High-Performance Computing Facilities to Institutional Productivity - A Production Function Model

Preston M Smith (13119846) 21 July 2022 (has links)
This dissertation measures the ROI of the institution's investment in HPC facilities and applies a production function to create a model that will measure the HPC facility investment's impact on the financial, academic, and reputational outputs of the institution.
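As a rough illustration of the modeling approach, a Cobb-Douglas-style production function can relate institutional outputs to HPC investment alongside other inputs. The log-linear form and variable names below are assumptions for exposition, not necessarily the dissertation's exact specification.

```latex
% Hedged sketch of a log-linear (Cobb-Douglas) institutional production function.
% Y_i : institutional output (e.g., publications, external research funding)
% H_i : investment in campus HPC facilities (assumed input)
% L_i, K_i : research labor and other capital (assumed inputs)
\ln Y_i = \beta_0 + \beta_H \ln H_i + \beta_L \ln L_i + \beta_K \ln K_i + \varepsilon_i
```

Under this form, \beta_H reads as the elasticity of institutional output with respect to HPC investment, which is one way a facility's contribution to productivity could be quantified.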
22

Towards a Resource Efficient Framework for Distributed Deep Learning Applications

Han, Jingoo 24 August 2022 (has links)
Distributed deep learning has achieved tremendous success in solving scientific problems in research and discovery over the past years. Deep learning training is quite challenging because it requires training on large-scale, massive datasets, especially with graphics processing units (GPUs) in the latest high-performance computing (HPC) supercomputing systems. HPC architectures exhibit training-throughput trends that differ from those reported in existing studies. Multiple GPUs and high-speed interconnects are used for distributed deep learning on HPC systems. Extant distributed deep learning systems are designed for non-HPC systems without considering efficiency, leading to under-utilization of expensive HPC hardware. In addition, increasing resource heterogeneity has a negative effect on resource efficiency in distributed deep learning methods, including federated learning. Thus, it is important to address the increasing demand for both high performance and high resource efficiency in distributed deep learning systems, including the latest HPC systems and federated learning systems. In this dissertation, we explore and design novel methods and frameworks to improve the resource efficiency of distributed deep learning training. We address the following five important topics: performance analysis of deep learning on supercomputers, GPU-aware deep learning job scheduling, topology-aware virtual GPU training, heterogeneity-aware adaptive scheduling, and a token-based incentive algorithm. In the first part (Chapter 3), we focus on analyzing the performance trends of distributed deep learning on the latest HPC systems, such as the Summitdev supercomputer at Oak Ridge National Laboratory. We provide insights through a comprehensive performance study of how deep learning workloads affect the performance of HPC systems with large-scale parallel processing capabilities. In the second part (Chapter 4), we design and develop MARBLE, a novel deep learning job scheduler that accounts for the non-linear scalability of GPUs within a single node and improves GPU utilization by sharing GPUs among multiple deep learning training workloads. The third part of this dissertation (Chapter 5) proposes TOPAZ, a topology-aware virtual GPU training system specifically designed for distributed deep learning on recent HPC systems. In the fourth part (Chapter 6), we explore a holistic federated learning scheduler that employs a heterogeneity-aware adaptive selection method to improve resource efficiency and accuracy, coupled with resource usage profiling and accuracy monitoring to achieve multiple goals. In the fifth part of this dissertation (Chapter 7), we focus on how to provide incentives to participants in proportion to their contribution to the performance of the final federated model, with tokens used as a means of paying for the services provided by participants and the training infrastructure. / Doctor of Philosophy / Distributed deep learning is widely used for solving critical scientific problems with massive datasets. However, to accelerate scientific discovery, resource efficiency is also important when these applications are deployed on real-world systems, such as high-performance computing (HPC) systems. Deploying existing deep learning applications on these distributed systems may lead to underutilization of HPC hardware resources. In addition, extreme resource heterogeneity has negative effects on distributed deep learning training. However, much of the prior work has not focused on the specific challenges of optimizing resource utilization in distributed deep learning on HPC systems and heterogeneous federated systems. This dissertation addresses the challenges of improving the resource efficiency of distributed deep learning applications through performance analysis of deep learning on supercomputers, GPU-aware deep learning job scheduling, topology-aware virtual GPU training, and heterogeneity-aware adaptive federated learning scheduling and incentive algorithms.
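As a hedged illustration of heterogeneity-aware adaptive selection in federated learning, the sketch below ranks clients by a weighted score of device throughput and recent accuracy gain; the scoring rule and field names are assumptions for exposition, not the dissertation's actual scheduler.

```python
import random

# Hedged sketch of heterogeneity-aware client selection for federated learning.
# The scoring rule and profile fields are illustrative assumptions; the
# dissertation couples resource usage profiling with accuracy monitoring.

def select_clients(profiles, num_selected, w_speed=0.5, w_accuracy_gain=0.5):
    """Pick clients by a weighted score of device speed and recent accuracy gain."""
    def score(p):
        return w_speed * p["throughput"] + w_accuracy_gain * p["accuracy_gain"]
    ranked = sorted(profiles, key=score, reverse=True)
    return ranked[:num_selected]

if __name__ == "__main__":
    # Synthetic client profiles: samples/sec and last-round accuracy improvement.
    clients = [{"id": i,
                "throughput": random.uniform(10, 200),        # samples/sec
                "accuracy_gain": random.uniform(0.0, 0.05)}   # last-round delta
               for i in range(20)]
    chosen = select_clients(clients, num_selected=5)
    print([c["id"] for c in chosen])
```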
23

Relational Computing Using HPC Resources: Services and Optimizations

Soundarapandian, Manikandan 15 September 2015 (has links)
Computational epidemiology involves processing, analysing and managing large volumes of data. Such massive datasets cannot be handled efficiently by traditional standalone database management systems, owing to their limited computational efficiency and bandwidth when scaling to large volumes of data. In this thesis, we address the management and processing of large volumes of data for modeling, simulation and analysis in epidemiological studies. Traditionally, compute-intensive tasks are processed using high performance computing resources and supercomputers, whereas data-intensive tasks are delegated to standalone databases and some custom programs. The DiceX framework is a one-stop solution for distributed database management and processing, and its main mission is to leverage supercomputing resources for data-intensive computing, in particular relational data processing. While standalone databases are always on and a user can submit queries at any time, supercomputing resources must be acquired and are available only for a limited time period. These resources are relinquished either upon completion of execution or at the expiration of the allocated time period. This reservation-based usage style poses critical challenges, including building and launching a distributed data engine on the supercomputer, saving the engine and resuming from the saved image, devising efficient optimization upgrades to the data engine, and enabling other applications to seamlessly access the engine. These challenges and requirements cause us to align our approach more closely with the cloud computing paradigms of Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). In this thesis, we propose cloud-computing-like workflows that use supercomputing resources to manage and process relational data-intensive tasks. We propose and implement several services, including database freeze, migrate, and resume, ad-hoc resource addition, and table redistribution. These services assist in carrying out the defined workflows. We also propose an optimization upgrade to the query planning module of postgres-XC, the core relational data processing engine of the DiceX framework. Using knowledge of domain semantics, we have devised a more robust data distribution strategy that forcefully pushes down the most time-consuming SQL operations to the postgres-XC data nodes, bypassing the query planner's default shippability criteria without compromising correctness. Forcing query push-down reduces query processing time by almost 40%-60% for certain complex spatio-temporal queries on our epidemiology datasets. As part of this work, a generic broker service has also been implemented, which acts as an interface to the DiceX framework by exposing RESTful APIs that applications can use to query and retrieve results irrespective of programming language or environment. / Master of Science
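The abstract above mentions a generic broker service that exposes RESTful APIs to the DiceX framework. The sketch below is a hedged illustration of how a language-agnostic client might use such a service; the base URL, endpoint paths, and JSON fields are hypothetical, since the thesis only states that RESTful APIs are exposed.

```python
import requests

# Hedged sketch of a client for a DiceX-style broker service.
# BROKER_URL, the endpoint paths, and the JSON fields are hypothetical.
BROKER_URL = "http://broker.example.org:8080"

def submit_query(sql):
    """Submit a SQL query to the broker and return a job identifier."""
    resp = requests.post(f"{BROKER_URL}/queries", json={"sql": sql})
    resp.raise_for_status()
    return resp.json()["job_id"]

def fetch_results(job_id):
    """Retrieve the result rows for a completed query job."""
    resp = requests.get(f"{BROKER_URL}/queries/{job_id}/results")
    resp.raise_for_status()
    return resp.json()["rows"]

if __name__ == "__main__":
    job = submit_query("SELECT county, COUNT(*) FROM infections GROUP BY county")
    print(fetch_results(job))
```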
24

Remote High Performance Visualization of Big Data for Immersive Science

Abidi, Faiz Abbas 15 June 2017 (has links)
Remote visualization has emerged as a necessary tool in the analysis of big data. High-performance computing clusters can provide several benefits in scaling to larger data sizes, from parallel file systems to larger RAM profiles to parallel computation among many CPUs and GPUs. For scalable data visualization, remote visualization tools and infrastructure are critical: only pixels and interaction events are sent over the network instead of the data. In this work, we present our pipeline using VirtualGL, TurboVNC, and ParaView to render over 40 million points using remote HPC clusters and project over 26 million pixels in a CAVE-style system. We benchmark the system by varying the video stream compression parameters supported by TurboVNC and establish some best practices for typical usage scenarios. This work will help research scientists and academicians in scaling their big data visualizations for real-time interaction. / Master of Science
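As a rough, hedged illustration of the bandwidth considerations behind streaming pixels rather than data, the sketch below estimates the compressed pixel-stream bandwidth for a display of roughly 26 million pixels; the frame rate, bytes per pixel, and compression ratios are assumed values, not measurements from this work.

```python
# Back-of-envelope estimate of pixel-stream bandwidth for remote visualization.
# PIXELS comes from the abstract; FPS, BYTES_PER_PIXEL, and the compression
# ratios are assumed values for illustration only.

PIXELS = 26_000_000      # pixels projected in the CAVE-style display
BYTES_PER_PIXEL = 3      # assumed RGB framebuffer before compression
FPS = 15                 # assumed interactive frame rate

for compression_ratio in (10, 20, 50):   # assumed lossy stream compression ratios
    mb_per_s = PIXELS * BYTES_PER_PIXEL * FPS / compression_ratio / 1e6
    print(f"compression {compression_ratio:2d}:1 -> ~{mb_per_s:6.1f} MB/s")
```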
25

Towards Using Free Memory to Improve Microarchitecture Performance

Panwar, Gagandeep 18 May 2020 (has links)
A computer system's memory is designed to accommodate the worst-case workloads with the highest memory requirement; as such, memory is underutilized when a system runs workloads with common-case memory requirements. Through a large-scale study of four production HPC systems, we find that the memory underutilization problem in HPC systems is very severe. As unused memory is wasted memory, we propose exposing a compute node's unused memory to its CPU(s) through a user-transparent CPU-OS codesign. This can enable many new microarchitecture techniques that transparently leverage unused memory locations to help improve microarchitecture performance. We refer to these techniques as Free-memory-aware Microarchitecture Techniques (FMTs). In the context of HPC systems, we present a detailed example of an FMT called Free-memory-aware Replication (FMR). FMR replicates in-use data to unused memory locations to effectively reduce average memory read latency. On average across five HPC benchmark suites, FMR provides a 13% performance and 8% system-level energy improvement. / M.S. / Random-access memory (RAM), or simply memory, stores the temporary data of applications that run on a computer system. Its size is determined by the worst-case application workload that the computer system is supposed to run. Through our memory utilization study of four large multi-node high-performance computing (HPC) systems, we find that memory is severely underutilized in these systems. Unused memory is a wasted resource that does nothing. In this work, we propose techniques that can make use of this wasted memory to boost computer system performance. We call these techniques Free-memory-aware Microarchitecture Techniques (FMTs). We then present in detail an FMT for HPC systems called Free-memory-aware Replication (FMR), which provides a performance improvement of over 13%.
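As a toy illustration (not the dissertation's evaluation methodology) of why replicating in-use data to otherwise idle memory can lower average read latency, the Monte Carlo sketch below assumes exponentially distributed per-channel queueing delays and serves each read from whichever of two copies is less delayed; all parameters are assumptions.

```python
import random

# Toy model: with one copy, a read waits on its channel's queueing delay; with a
# replica on a second, otherwise-unused channel, the read is served by the less
# delayed copy. Exponential delays and the constants below are assumptions.

random.seed(0)
BASE_NS = 50.0        # assumed fixed DRAM access component (ns)
MEAN_QUEUE_NS = 40.0  # assumed mean queueing delay per channel (ns)
TRIALS = 100_000

single = 0.0
replicated = 0.0
for _ in range(TRIALS):
    q1 = random.expovariate(1.0 / MEAN_QUEUE_NS)
    q2 = random.expovariate(1.0 / MEAN_QUEUE_NS)
    single += BASE_NS + q1               # one copy: wait on its channel
    replicated += BASE_NS + min(q1, q2)  # two copies: served by the faster one

print(f"avg latency, single copy: {single / TRIALS:.1f} ns")
print(f"avg latency, replicated:  {replicated / TRIALS:.1f} ns")
```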
26

Le choix des architectures hybrides, une stratégie réaliste pour atteindre l'échelle exaflopique. / The choice of hybrid architectures, a realistic strategy to reach the Exascale.

Loiseau, Julien 14 September 2018 (has links)
La course à l'Exascale est entamée et tous les pays du monde rivalisent pour présenter un supercalculateur exaflopique à l'horizon 2020-2021. Ces superordinateurs vont servir à des fins militaires, pour montrer la puissance d'une nation, mais aussi pour des recherches sur le climat, la santé, l'automobile, la physique, l'astrophysique et bien d'autres domaines d'application. Ces supercalculateurs de demain doivent respecter une enveloppe énergétique de 1 MW pour des raisons à la fois économiques et environnementales. Pour arriver à produire une telle machine, les architectures classiques doivent évoluer vers des machines hybrides équipées d'accélérateurs tels que les GPU, Xeon Phi, FPGA, etc. Nous montrons que les benchmarks actuels ne nous semblent pas suffisants pour cibler ces applications qui ont un comportement irrégulier. Cette étude met en place une métrique ciblant les aspects limitants des architectures de calcul : le calcul et les communications avec un comportement irrégulier. Le problème mettant en avant la complexité de calcul est le problème académique de Langford. Pour la communication, nous proposons notre implémentation du benchmark Graph500. Ces deux métriques mettent clairement en avant l'avantage de l'utilisation d'accélérateurs, comme des GPU, dans ces circonstances spécifiques et limitantes pour le HPC. Pour valider notre thèse, nous proposons l'étude d'un problème réel mettant en jeu à la fois le calcul, les communications et une irrégularité extrême. En réalisant des simulations de physique et d'astrophysique, nous montrons une nouvelle fois l'avantage de l'architecture hybride et sa scalabilité. / The countries of the world are already competing for Exascale, and the first exaflopic supercomputer should be released by 2020-2021. These supercomputers will be used for military purposes, to show the power of a nation, but also for research on climate, health, physics, astrophysics and many other areas of application. These supercomputers of tomorrow must respect an energy envelope of 1 MW for reasons both economic and environmental. In order to create such a machine, conventional architectures must evolve into hybrid machines equipped with accelerators such as GPUs, Xeon Phi, FPGAs, etc. We show that the current benchmarks do not seem sufficient to target these applications, which have irregular behavior. This study sets up metrics targeting the walls of computational architectures: the computation and communication walls with irregular behavior. The problem chosen for the computation wall is Langford's academic combinatorial problem. We propose our implementation of the Graph500 benchmark in order to target the communication wall. These two metrics clearly highlight the advantage of using accelerators, such as GPUs, in these specific and representative HPC problems. In order to validate our thesis, we propose the study of a real problem that brings computation, communication and extreme irregularity into play at the same time. By performing physics and astrophysics simulations, we show once again the advantage of the hybrid architecture and its scalability.
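To make the compute-oriented metric concrete, here is a minimal sequential backtracking counter for the Langford pairing problem L(2, n) cited above; it is an illustrative sketch of the combinatorial problem itself, not the thesis's parallel or accelerator-based implementation.

```python
# Langford pairing problem L(2, n): place two copies of each number 1..n in a
# row of length 2n so that the two copies of k are separated by exactly k other
# numbers (their positions differ by k + 1). Plain backtracking, counting all
# arrangements including mirror images.

def count_langford(n):
    seq = [0] * (2 * n)

    def place(k):
        if k == 0:
            return 1
        total = 0
        for i in range(2 * n - k - 1):
            j = i + k + 1              # second copy sits k numbers away
            if seq[i] == 0 and seq[j] == 0:
                seq[i] = seq[j] = k
                total += place(k - 1)
                seq[i] = seq[j] = 0
        return total

    return place(n)                    # place large numbers first

if __name__ == "__main__":
    for n in (3, 4, 7, 8):             # solutions exist only when n % 4 in (0, 3)
        print(n, count_langford(n))    # expected: 3->2, 4->2, 7->52, 8->300
```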
27

Métodos bacteriológicos aplicados à tuberculose bovina: comparação de três métodos de descontaminação e de três protocolos para criopreservação de isolados / Bacteriologic methods applied to bovine tuberculosis: comparison of three decontamination methods and three protocols for cryopreservation of isolates

Ambrosio, Simone Rodrigues 09 December 2005 (has links)
Dada a importância do Programa Nacional de Controle e Erradicação da Brucelose e Tuberculose (PNCEBT), a necessidade de uma eficiente caracterização bacteriológica dos focos como ponto fundamental do sistema de vigilância e as dificuldades encontradas pelos laboratórios quanto aos métodos de isolamento de Mycobacterium bovis fizeram crescer o interesse do meio científico por estudos, sobretudo moleculares, de isolados de M. bovis. Para a realização dessas técnicas moleculares, é necessária abundância de massa bacilar, obtida através da manutenção dos isolados em laboratório e repiques em meios de cultura. Entretanto, o crescimento fastidioso do M. bovis em meios de cultura traz grandes dificuldades para essas operações. Assim sendo, o presente estudo teve por objetivos: 1º) Comparar três métodos de descontaminação para homogeneizados de órgãos, etapa que precede a semeadura em meios de cultura. Para isso, 60 amostras de tecidos com lesões granulomatosas, provenientes de abatedouros bovinos do Estado de São Paulo, foram colhidas, imersas em solução saturada de borato de sódio e transportadas para o Laboratório de Zoonoses Bacterianas do VPS-FMVZ-USP, onde foram processadas até 60 dias após a colheita. Essas amostras foram submetidas a três métodos de descontaminação: Básico (NaOH 4%), Ácido (H2SO4 12%) e 1-Hexadecylpyridinium chloride a 1,5% (HPC); o quarto método foi representado pela simples diluição com solução salina (controle). Os resultados foram submetidos à comparação de proporções, pelo teste de χ², na qual verificou-se que o método HPC foi o que apresentou menor proporção de contaminação (3%) e maior proporção de sucesso para isolamento de BAAR (40%). 2º) Comparar três diferentes meios criopreservantes para M. bovis; foram utilizados 16 isolados identificados pela técnica de spoligotyping. Cada um desses isolados foi solubilizado em três meios (solução salina, 7H9 original e 7H9 modificado), armazenado em três diferentes temperaturas (-20ºC, -80ºC e -196ºC) e descongelado em três diferentes tempos (45, 90 e 120 dias de congelamento). Antes do congelamento e após o descongelamento foram feitos cultivos quantitativos em meios de Stonebrink Leslie. Os porcentuais de redução de Unidades Formadoras de Colônias (UFC) nas diferentes condições foram calculados e comparados entre si através de métodos paramétricos e não-paramétricos. Os resultados obtidos foram: na análise da variável tempo, em 90 dias de congelamento foi observada uma maior proporção de perda de M. bovis, quando comparado ao tempo de 120 dias (p=0,0002); na análise da variável temperatura, foi observada uma diferença estatística significativa entre as proporções de perda média nas temperaturas de -20ºC e -80ºC (p<0,05); na análise da variável meio, foi observada uma diferença significativa (p=0,044) entre os meios A e C, para 45 dias de congelamento e -20ºC de temperatura de criopreservação. Embora as medianas dos porcentuais de perdas de UFC tenham sido sempre inferiores a 4,2%, os resultados permitiram sugerir que o melhor protocolo de criopreservação de isolados de M. bovis é solubilizá-los em 7H9 modificado e mantê-los à temperatura de -20ºC / In the context of the National Program of Control and Eradication of Brucellosis and Tuberculosis (PNCEBT), the necessity of an efficient bacteriologic characterization of the infected herds as a cornerstone of the monitoring system and the difficulties faced by the laboratories regarding the methods for Mycobacterium bovis isolation led to a growing interest in scientific studies, especially molecular, of M. bovis isolates. To use these molecular techniques it is necessary to have an abundant bacillary mass, obtained through the maintenance of isolates in the laboratory and replication in culture media. However, the fastidious growth of M. bovis in culture media brings out great difficulties for these activities. Thus, the present study had the following objectives. First, to compare three decontamination methods for organ homogenates, the phase that precedes sowing in culture media: 60 samples of tissues with granulomatous lesions, proceeding from bovine slaughterhouses in the State of São Paulo, were obtained, immersed in saturated sodium borate solution and transported to the Laboratório de Zoonoses Bacterianas of the VPS-FMVZ-USP, where they were processed up to 60 days after sampling. These samples were submitted to three methods of decontamination: basic (NaOH 4%), acid (H2SO4 12%) and 1-Hexadecylpyridinium chloride (HPC) 1.5%, plus a simple dilution with saline solution (control method). The results were analysed by means of the χ² test to compare proportions, and it was verified that the HPC method presented the smallest proportion of contamination (3%) and the greatest proportion of success for M. bovis isolation (40%). Second, to compare three different cryopreservation media for M. bovis, 16 isolates identified by the spoligotyping technique were used. Each of these isolates was solubilized in three media (saline solution, original 7H9 and modified 7H9), stored at three different temperatures (-20ºC, -80ºC and -196ºC), and defrosted at three different time points (45, 90 and 120 days of freezing). Before freezing and after thawing, quantitative cultivations in Stonebrink Leslie media were carried out. The proportions of Colony-Forming Unit (CFU) loss under the different conditions were calculated and compared with one another through parametric and non-parametric methods. The results obtained were: in the analysis of the variable time, at 90 days of freezing a larger proportion of CFU loss was observed when compared to 120 days (p=0.0002); in the analysis of the variable temperature, a statistically significant difference was observed between the average proportions of CFU loss at the temperatures of -20ºC and -80ºC (p<0.05); in the analysis of the variable medium, a significant difference was observed (p=0.044) between media A and C, for 45 days of freezing and a -20ºC cryopreservation temperature. Although the medians of the percentages of CFU loss were always below 4.2%, the results suggest that the best protocol for cryopreservation of M. bovis isolates is to solubilize them in modified 7H9 medium and keep them at a temperature of -20ºC
28

Analyse statistique et interprétation automatique de données diagraphiques pétrolières différées à l’aide du calcul haute performance / Statistical analysis and automatic interpretation of oil logs using high performance computing

Bruned, Vianney 18 October 2018 (has links)
Dans cette thèse, on s'intéresse à l’automatisation de l’identification et de la caractérisation de strates géologiques à l’aide des diagraphies de puits. Au sein d’un puits, on détermine les strates géologiques grâce à la segmentation des diagraphies assimilables à des séries temporelles multivariées. L’identification des strates de différents puits d’un même champ pétrolier nécessite des méthodes de corrélation de séries temporelles. On propose une nouvelle méthode globale de corrélation de puits utilisant les méthodes d’alignement multiple de séquences issues de la bio-informatique. La détermination de la composition minéralogique et de la proportion des fluides au sein d’une formation géologique se traduit en un problème inverse mal posé. Les méthodes classiques actuelles sont basées sur des choix d’experts consistant à sélectionner une combinaison de minéraux pour une strate donnée. En raison d’un modèle à la vraisemblance non calculable, une approche bayésienne approximée (ABC) aidée d’un algorithme de classification basé sur la densité permet de caractériser la composition minéralogique de la couche géologique. La classification est une étape nécessaire afin de s’affranchir du problème d’identifiabilité des minéraux. Enfin, le déroulement de ces méthodes est testé sur une étude de cas. / In this thesis, we investigate the automation of the identification and characterization of geological strata using well logs. For a single well, geological strata are determined through the segmentation of the logs, which are comparable to multivariate time series. The identification of strata across different wells of the same field requires correlation methods for time series. We propose a new global well-correlation method using multiple sequence alignment algorithms from bioinformatics. The determination of the mineralogical composition and the percentage of fluids inside a geological stratum results in an ill-posed inverse problem. Current methods are based on experts' choices: the selection of a subset of minerals for a given stratum. Because the model has a non-computable likelihood, an approximate Bayesian computation (ABC) approach assisted by a density-based clustering algorithm can characterize the mineral composition of the geological layer. The clustering step is necessary to deal with the identifiability issue of the minerals. Finally, the workflow is tested on a case study.
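As a hedged illustration of the likelihood-free inference idea mentioned above, the sketch below runs approximate Bayesian computation (ABC) by plain rejection sampling on a toy two-mineral forward model; the model, the observed value, and the tolerance are assumptions for exposition and are not the thesis's petrophysical model.

```python
import random

# Hedged sketch of ABC rejection sampling. The toy forward model (a density log
# as a noisy mix of two mineral responses) and all numbers are assumptions.

random.seed(1)

def simulate_log(quartz_fraction):
    """Toy forward model: log reading as a noisy mix of two mineral densities."""
    response = quartz_fraction * 2.65 + (1 - quartz_fraction) * 2.71
    return response + random.gauss(0, 0.01)

observed = 2.67      # assumed observed log value
tolerance = 0.01
accepted = []
for _ in range(20000):
    theta = random.random()              # prior: quartz fraction ~ Uniform(0, 1)
    if abs(simulate_log(theta) - observed) < tolerance:
        accepted.append(theta)           # keep parameters whose simulation matches

print(f"accepted {len(accepted)} samples; "
      f"posterior mean quartz fraction ~ {sum(accepted) / len(accepted):.2f}")
```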
29

Modélisation, prédiction et optimisation de la consommation énergétique d'applications MPI à l'aide de SimGrid / Modeling, Prediction and Optimization of Energy Consumption of MPI Applications using SimGrid

Heinrich, Franz 21 May 2019 (has links)
Les changements technologiques dans la communauté du calcul haute performance (HPC) sont importants, en particulier dans le secteur du parallélisme massif avec plusieurs milliers de cœurs de calcul sur un GPU unique ou accélérateur, et aussi des nouveaux réseaux complexes. La consommation d’énergie de ces machines continuera de croître dans les années à venir, faisant de l’énergie l’un des principaux facteurs de coût. Cela explique pourquoi même la métrique classique "flop/s", généralement utilisée pour évaluer les applications HPC et les machines, est progressivement remplacée par une métrique centrée sur l’énergie en "flop/watt". Une approche pour prédire la consommation d'énergie se fait par simulation ; cependant, une prédiction précise de la performance est cruciale pour estimer l’énergie. Dans cette thèse, nous contribuons à la prédiction de performance et d'énergie des architectures HPC. Nous proposons un modèle énergétique qui a été implémenté dans un simulateur open source, SimGrid. Nous validons ce modèle avec soin en le comparant systématiquement avec des expériences réelles. Nous utilisons cette contribution pour évaluer les projets existants et nous proposons de nouveaux governors DVFS spécialement conçus pour le contexte HPC. / The High-Performance Computing (HPC) community is currently undergoing disruptive technology changes in almost all fields, including a switch towards massive parallelism with several thousand compute cores on a single GPU or accelerator, and new, complex networks. The energy consumption of these machines will continue to grow in the future, making energy one of the principal cost factors of machine ownership. This explains why even the classic metric "flop/s", generally used to evaluate HPC applications and machines, is widely regarded as due to be replaced by an energy-centric metric "flop/watt". One approach to predict energy consumption is through simulation; however, a precise performance prediction is crucial to estimate the energy faithfully. In this thesis, we contribute to the performance and energy prediction of HPC architectures. We propose an energy model which we have implemented in the open source SimGrid simulator. We validate this model by carefully and systematically comparing it with real experiments. We leverage this contribution to both evaluate existing and propose new DVFS governors that are particularly designed to suit the HPC context.
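As a hedged sketch of the kind of utilization-based power model commonly used for this purpose (the notation below is an assumption for exposition, not necessarily the exact model implemented in the thesis or in SimGrid):

```latex
% Hedged sketch: linear, per-pstate power model and resulting energy.
% u(t) in [0,1] is CPU utilization, f(t) the current DVFS pstate.
P(u, f) = P_{\mathrm{idle}}(f) + u \,\bigl(P_{\mathrm{full}}(f) - P_{\mathrm{idle}}(f)\bigr),
\qquad
E = \int_{0}^{T} P\bigl(u(t), f(t)\bigr)\, dt
```

Under such a model, a DVFS governor trades off the lower power of a reduced frequency against the longer execution time it induces, which is why accurate performance prediction is a prerequisite for accurate energy prediction.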
30

Efficient Big Data Processing on Large-Scale Shared Platforms: managing I/Os and Failure / Sur l'efficacité des traitements Big Data sur les plateformes partagées à grande échelle : gestion des entrées-sorties et des pannes

Yildiz, Orcun 08 December 2017 (has links)
En 2017, nous vivons dans un monde régi par les données. Les applications d’analyse de données apportent des améliorations fondamentales dans de nombreux domaines tels que les sciences, la santé et la sécurité. Cela a stimulé la croissance des volumes de données (le déluge du Big Data). Pour extraire des informations utiles à partir de cette quantité énorme d’informations, différents modèles de traitement des données ont émergé tels que MapReduce, Hadoop et Spark. Les traitements Big Data sont traditionnellement exécutés à grande échelle (les systèmes HPC et les Clouds) pour tirer parti de leur puissance de calcul et de stockage. Habituellement, ces plateformes à grande échelle sont utilisées simultanément par plusieurs utilisateurs et de multiples applications afin d’optimiser l’utilisation des ressources. Bien qu’il y ait beaucoup d’avantages à partager ces plateformes, plusieurs problèmes sont soulevés dès lors qu’un nombre important d’utilisateurs et d’applications les utilisent en même temps, parmi lesquels la gestion des E/S et des défaillances sont les principaux pouvant avoir un impact sur le traitement efficace des données. Nous nous concentrons tout d’abord sur les goulots d’étranglement liés aux performances des E/S pour les applications Big Data sur les systèmes HPC. Nous commençons par caractériser les performances des applications Big Data sur ces systèmes. Nous identifions les interférences et la latence des E/S comme les principaux facteurs limitant les performances. Ensuite, nous nous intéressons de manière plus détaillée aux interférences des E/S afin de mieux comprendre les causes principales de ce phénomène. De plus, nous proposons un système de gestion des E/S pour réduire les dégradations de performance que les applications Big Data peuvent subir sur les systèmes HPC. Par ailleurs, nous introduisons des modèles d’interférence pour les applications Big Data et HPC en fonction des résultats que nous obtenons dans notre étude expérimentale concernant les causes des interférences d’E/S. Enfin, nous exploitons ces modèles afin de minimiser l’impact des interférences sur les performances des applications Big Data et HPC. Deuxièmement, nous nous concentrons sur l’impact des défaillances sur la performance des applications Big Data en étudiant la gestion des pannes dans les clusters MapReduce partagés. Nous présentons un ordonnanceur qui permet un recouvrement rapide des pannes, améliorant ainsi les performances des applications Big Data. / As of 2017, we live in a data-driven world where data-intensive applications are bringing fundamental improvements to our lives in many different areas such as business, science, health care and security. This has boosted the growth of data volumes (i.e., the deluge of Big Data). To extract useful information from this huge amount of data, different data processing frameworks have emerged, such as MapReduce, Hadoop, and Spark. Traditionally, these frameworks run on large-scale platforms (i.e., HPC systems and clouds) to leverage their computation and storage power. Usually, these large-scale platforms are used concurrently by multiple users and multiple applications with the goal of better resource utilization. Though there are benefits to sharing these platforms, several challenges are raised when many users and applications share them, among which I/O and failure management are the major ones that can impact efficient data processing. To this end, we first focus on I/O-related performance bottlenecks for Big Data applications on HPC systems. We start by characterizing the performance of Big Data applications on these systems. We identify I/O interference and latency as the major performance bottlenecks. Next, we zoom in on the I/O interference problem to further understand the root causes of this phenomenon. Then, we propose an I/O management scheme to mitigate the high latencies that Big Data applications may encounter on HPC systems. Moreover, we introduce interference models for Big Data and HPC applications based on the findings we obtain in our experimental study regarding the root causes of I/O interference. Finally, we leverage these models to minimize the impact of interference on the performance of Big Data and HPC applications. Second, we focus on the impact of failures on the performance of Big Data applications by studying failure handling in shared MapReduce clusters. We introduce a failure-aware scheduler which enables fast failure recovery while optimizing data locality, thus improving application performance.
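As a hedged, purely illustrative sketch of an I/O interference model of the kind alluded to above, the snippet below assumes that applications sharing a parallel file system receive a fair share of bandwidth degraded by a contention penalty; the functional form and parameters (peak bandwidth, penalty coefficient) are assumptions, not the thesis's fitted model.

```python
# Toy I/O interference model: k concurrent applications see less than a fair
# 1/k share of file-system bandwidth once contention overheads kick in.
# PEAK_BW_GBPS and ALPHA are assumed values for illustration only.

PEAK_BW_GBPS = 100.0   # assumed aggregate parallel file system bandwidth
ALPHA = 0.15           # assumed per-extra-writer contention penalty

def effective_bandwidth(num_apps):
    """Per-application I/O bandwidth under fair sharing plus a contention penalty."""
    fair_share = PEAK_BW_GBPS / num_apps
    penalty = 1.0 + ALPHA * (num_apps - 1)   # grows with the number of competitors
    return fair_share / penalty

if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        slowdown = effective_bandwidth(1) / effective_bandwidth(k)
        print(f"{k} concurrent apps: {effective_bandwidth(k):5.1f} GB/s each, "
              f"slowdown x{slowdown:.1f}")
```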
