About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.

Um ambiente de programação e processamento de aplicações paralelas para grades computacionais. / A programming and processing environment for parallel applications on computational grids.

Gomes Júnior, Augusto Mendes 28 November 2011 (has links)
The execution of parallel applications on computational grids requires an environment that allows them to be executed, managed, scheduled, and monitored. The execution environment must provide a processing model, consisting of programming and execution models, whose objective is to appropriately exploit the characteristics of computational grids. This work proposes a parallel processing model for grids, based on shared variables, consisting of an execution model appropriate for grids and the programming model of the CPAR parallel language. The CPAR-Grid environment was developed to execute parallel applications on computational grids, hiding all grid-specific characteristics from the user. The results show that this environment is an efficient solution for the execution of parallel applications.
312

The management of multiple submissions in parallel systems: the fair scheduling approach / La gestion de plusieurs soumissions dans les systèmes parallèles : l'approche d'ordonnancement équitable

Gama Pinheiro, Vinicius 14 February 2014 (has links)
We study the problem of scheduling in parallel and distributed systems with multiple users. New platforms for parallel and distributed computing offer very large computing power, which makes it possible to contemplate the resolution of complex interactive applications. It is still difficult, however, to use this power efficiently, due to the lack of resource-management tools. The work done in this thesis lies in this context: to analyse and develop efficient algorithms for managing computing resources shared among multiple users. We analyze scenarios with many submissions issued by multiple users over time. These submissions contain one or more jobs, and the set of submissions is organized in successive campaigns. The jobs of a single campaign are sequential and independent, but no job from a campaign can start until all the jobs from the previous campaign are completed. Each user is interested in minimizing the sum of the flow times of their campaigns.
In the first part of this work, we define a theoretical model for Campaign Scheduling under restrictive assumptions and show that, in the general case, it is NP-hard. For the single-user case, we show that a ρ-approximation scheduling algorithm for the (classic) parallel job scheduling problem is also a ρ-approximation for the Campaign Scheduling problem. For the general case with k users, we establish a fairness criterion inspired by time sharing. Then, we propose FairCamp, a scheduling algorithm which uses campaign deadlines to achieve fairness among users between consecutive campaigns. We prove that FairCamp increases the flow time of each user by a factor of at most kρ compared with a machine dedicated to the user. We also prove that FairCamp is a ρ-approximation algorithm for the maximum stretch. We compare FairCamp to First-Come-First-Served (FCFS) by simulation and show that, compared with FCFS, FairCamp reduces the maximum stretch by up to 3.4 times. The difference is significant in systems used by many (k > 5) users. Our results show that, rather than just individual, independent jobs, campaigns of jobs can be handled by the scheduler efficiently and fairly.
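The campaign model and the stretch metric can be made concrete with a small, hypothetical sketch (illustrative only — this is not the FairCamp algorithm or the thesis's simulator): each user's next campaign is released only when their previous one completes, a single shared machine serves campaigns First-Come-First-Served, and a user's stretch compares their flow times against a machine dedicated to them alone.

```python
def fcfs_campaigns(users):
    """FCFS campaign scheduling on one shared machine (toy model).

    users: dict user -> list of campaign durations, where each campaign's
    jobs have been aggregated into one total processing time.  Campaign
    i+1 of a user is released when campaign i completes.  Returns
    dict user -> list of flow times (completion - release).
    """
    order = 0                       # tie-breaker: earlier submission first
    pending = []                    # (release_time, order, user, index)
    next_idx = {}
    for u in users:
        pending.append((0.0, order, u, 0))
        order += 1
        next_idx[u] = 1
    flows = {u: [] for u in users}
    t = 0.0
    while pending:
        pending.sort()              # FCFS: earliest released campaign first
        rel, _, u, i = pending.pop(0)
        t = max(t, rel) + users[u][i]       # run the campaign to completion
        flows[u].append(t - rel)
        if next_idx[u] < len(users[u]):     # release the user's next campaign
            pending.append((t, order, u, next_idx[u]))
            order += 1
            next_idx[u] += 1
    return flows

flows = fcfs_campaigns({"A": [3, 3], "B": [1, 1]})
# flows == {"A": [3.0, 4.0], "B": [4.0, 4.0]}
# Stretch vs. a dedicated machine (dedicated flow-time sums: A -> 6, B -> 2):
stretch = {"A": sum(flows["A"]) / 6.0, "B": sum(flows["B"]) / 2.0}
# B's campaigns are stretched 4x, illustrating the unfairness FCFS can cause.
```

The short user B suffers a stretch of 4 while the long user A barely notices the sharing, which is the kind of imbalance a fairness criterion like FairCamp's is meant to bound.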

Applications, performance analysis, and optimization of weather and air quality models

Sobhani, Negin 01 December 2017 (has links)
Atmospheric particulate matter (PM) is linked to various adverse environmental and health impacts. PM in the atmosphere reduces visibility, alters precipitation patterns by acting as cloud condensation nuclei (CCN), and changes the Earth's radiative balance by absorbing or scattering solar radiation. The long-range transport of pollutants increases PM concentrations even in remote locations such as polar regions and mountain ranges. One significant effect of PM on the Earth's climate occurs when light-absorbing PM, such as black carbon (BC), deposits on snow. In the Arctic, BC deposition on highly reflective surfaces (e.g., glaciers and sea ice) has very intense effects, causing snow to melt more quickly. Thus, characterizing PM sources, identifying long-range transport pathways, and quantifying the climate impacts of PM are crucial for informing emission abatement policies that reduce both the health and environmental impacts of PM. Chemical transport models provide mathematical tools for better understanding the atmospheric system, including chemical and particle transport, pollution diffusion, and deposition. The technological and computational advances of the past decades allow higher-resolution air quality and weather forecast simulations with more accurate representations of the physical and chemical mechanisms of the atmosphere. Because of the significant effects of air pollutants on public health and the environment, several countries and cities perform air quality forecasts to warn the population about future air pollution events and to take local preventive measures, such as traffic regulations, that minimize the impacts of the forecasted episode. However, the costs associated with complex air quality forecast models, especially at higher resolutions, make "forecasting" a challenge.
This dissertation focuses on applications, performance analysis, and optimization of meteorology and air quality forecasting models. It presents several modeling studies at various scales to better understand the transport of aerosols from different geographical sources and economic sectors (i.e., transportation, residential, industry, biomass burning, and power) and to quantify their climate impacts. The simulations are evaluated using various observations, including ground-site measurements, field campaigns, and satellite data. The sector-based modeling studies elucidate the importance of various economic sectors and geographical regions for global air quality and the climatic impacts associated with BC. The dissertation provides policy makers with implications for emission mitigation policies that target the source sectors and regions with the highest impacts. Furthermore, advances were made in understanding the impacts of light-absorbing particles on climate and surface albedo. Finally, to improve modeling speed, the performance of the models was analyzed and optimizations were proposed to improve their computational efficiency. These optimizations show a significant improvement in the performance of the Weather Research and Forecasting (WRF) and WRF-Chem models. The modified codes were validated and incorporated back into the WRF source code to benefit all WRF users. Although weather and air quality models are shown to be an excellent means for forecasting at both local and hemispheric scales, further studies are needed to optimize the models and improve the performance of the simulations.

Coupled computational fluid dynamics/multibody dynamics method with application to wind turbine simulations

Li, Yuwei 01 May 2014 (has links)
A high-fidelity approach coupling the computational fluid dynamics (CFD) method and the multi-body dynamics (MBD) method is presented for aero-servo-elastic wind turbine simulations. The approach uses the incompressible CFD dynamic overset code CFDShip-Iowa v4.5 to compute the aerodynamics, coupled with the MBD code Virtual.Lab Motion to predict the motion response to the aerodynamic loads. The Mann wind turbulence model recommended by IEC 61400-1 ed. 3 was implemented in this thesis in CFDShip-Iowa v4.5 as boundary and initial conditions and used as the explicit wind turbulence for CFD simulations. A drivetrain model with control systems was implemented in the CFD/MBD framework to investigate drivetrain dynamics. The tool and methodology developed in this thesis are unique, representing the first complete wind turbine simulations to include CFD of the rotor/tower aerodynamics, elastic blades, gearbox dynamics, and feedback control systems in turbulent winds. Dynamic overset CFD simulations were performed against the UAE phase VI benchmark experiment to demonstrate the capabilities of the code for wind turbine aerodynamics. The complete turbine geometry was modeled, including the blades and approximate geometries for the hub, nacelle, and tower. Unsteady Reynolds-Averaged Navier-Stokes (URANS) and Detached Eddy Simulation (DES) turbulence models were used in the simulations. Results for both variable wind speed at constant blade pitch angle and variable blade pitch angle at fixed wind speed show that the CFD predictions match the experimental data consistently well, including the general trends for power and thrust and the sectional normal force and pressure coefficients at different sections along the blade.
The implemented Mann wind turbulence model was validated both theoretically and statistically, by comparing the generated stationary wind turbulence field with the theoretical one-point spectrum for the three components of the velocity fluctuations, and by comparing the statistics of the CFD-simulated turbulent field with those of the explicit wind turbulence inlet boundary from the Mann model. The coupled CFD/MBD approach was applied to the conceptual NREL 5MW offshore wind turbine. Extensive simulations were performed at increasing levels of complexity to investigate the aerodynamic predictions, turbine performance, elastic blades, wind shear, and atmospheric wind turbulence. Comparisons against the publicly available OC3 simulation results show good agreement between the CFD/MBD approach and the OC3 participants in the time and frequency domains. Wind turbulence/turbine interaction was examined in the wake flow to analyze the influence of turbulent wind on wake diffusion. The Gearbox Reliability Collaborative gearbox was up-scaled in size and added to the NREL 5MW turbine to demonstrate drivetrain dynamics. Generator torque and blade pitch controllers were implemented to simulate realistic operational conditions of commercial wind turbines. Interactions between wind turbulence, rotor aerodynamics, elastic blades, drivetrain dynamics at the gear level, and servo-control dynamics were studied, showing the potential of the methodology for studying complex aerodynamic/mechanical systems.

Energy Demand Response for High-Performance Computing Systems

Ahmed, Kishwar 22 March 2018 (has links)
The growing computational demand of scientific applications has greatly motivated the development of large-scale high-performance computing (HPC) systems in the past decade. To accommodate the increasing demands of applications, HPC systems have gone through dramatic architectural changes (e.g., the introduction of many-core and multi-core systems and the rapid growth of complex interconnection networks for efficient communication between thousands of nodes) as well as a significant increase in size (e.g., modern supercomputers consist of hundreds of thousands of nodes). With such changes in architecture and size, the energy consumption of these systems has increased significantly. With the advent of exascale supercomputers in the next few years, the power consumption of HPC systems will surely increase; some systems may even consume hundreds of megawatts of electricity. Demand response programs are designed to help energy service providers stabilize the power grid by reducing the energy consumption of participating systems during periods of high power demand or temporary shortages in power supply. This dissertation focuses on developing energy-efficient demand-response models and algorithms to enable HPC systems' participation in demand response. In the first part, we present interconnection network models for performance prediction of large-scale HPC applications. They are based on interconnect topologies widely used in HPC systems: dragonfly, torus, and fat-tree. Our interconnect models are fully integrated with an implementation of the Message Passing Interface (MPI) that can mimic most of its functions with packet-level accuracy. Extensive experiments show that our integrated models provide good accuracy for predicting network behavior while allowing good parallel scaling performance.
In the second part, we present an energy-efficient demand-response model to reduce HPC systems' energy consumption during demand response periods. We propose HPC job scheduling and resource provisioning schemes to enable HPC systems' participation in emergency demand response. In the final part, we propose an economic demand-response model that allows both the HPC operator and HPC users to jointly reduce the HPC system's energy cost. Our model allows HPC systems to participate in economic demand-response programs through a contract-based rewarding scheme that incentivizes HPC users to participate.

Développement d'un système in situ à base de tâches pour un code de dynamique moléculaire classique adapté aux machines exaflopiques / Integration of High-Performance Task-Based In Situ for Molecular Dynamics on Exascale Computers

Dirand, Estelle 06 November 2018 (has links)
The exascale era will widen the gap between the rate at which simulation data are generated and the time needed to write and read them for post-processing analysis, dramatically increasing the end-to-end time to scientific discovery and calling for a shift toward new data processing methods. The in situ paradigm proposes to analyze data while they are still resident in the supercomputer memory, reducing the need for data storage. Several techniques already exist: executing simulation and analytics on the same compute nodes (in situ), using dedicated nodes (in transit), or combining the two approaches (hybrid). Most traditional in situ techniques target simulations that are not able to fully benefit from the ever-growing number of cores per processor, and they were not designed for the emerging manycore processors. Task-based programming models, on the other hand, are expected to become a standard for these architectures, but few task-based in situ techniques have been developed so far.
This thesis studies the design and integration of a novel task-based in situ framework inside a task-based molecular dynamics code designed for exascale supercomputers. We benefit from the composability properties of the task-based programming model to implement the TINS hybrid framework. Analytics workflows are expressed as graphs of tasks that can in turn generate child tasks to be executed in transit or interleaved with simulation tasks in situ. The in situ execution is enabled by an innovative dynamic helper-core strategy that uses the work-stealing concept to finely interleave simulation and analytics tasks inside a compute node with a low overhead on the simulation execution time. TINS uses the Intel® TBB work-stealing scheduler and is integrated into ExaStamp, a task-based molecular dynamics code. Various experiments have shown that TINS is up to 40% faster than state-of-the-art in situ libraries. Molecular dynamics simulations of up to 2 billion particles on up to 14,336 cores have shown that TINS is able to execute complex analytics workflows at high frequency with an overhead smaller than 10%.
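As a loose illustration of the hybrid in situ idea (a generic sketch with stand-in functions and names — not the TINS implementation, its API, or its work-stealing scheduler), simulation steps can run on a main thread while analytics tasks on each step's in-memory output are interleaved on helper threads:

```python
from concurrent.futures import ThreadPoolExecutor

def simulate_step(step):
    # Stand-in for one molecular-dynamics step: produce fake particle data.
    return [step + i for i in range(4)]

def analyze(data):
    # Stand-in in situ analysis: reduce the data while it is still in memory,
    # instead of writing it to storage for post-processing.
    return sum(data)

def run(n_steps, helpers=2):
    # The main thread advances the simulation; helper threads consume the
    # analytics tasks concurrently, so analysis overlaps with simulation.
    with ThreadPoolExecutor(max_workers=helpers) as pool:
        futures = [pool.submit(analyze, simulate_step(s))
                   for s in range(n_steps)]
        return [f.result() for f in futures]

print(run(3))  # -> [6, 10, 14]
```

The point of the sketch is only the overlap pattern; a real framework like the one described here additionally balances helper cores dynamically via task stealing rather than reserving a fixed pool.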

Accelerated many-body protein side-chain repacking using GPUs: application to proteins implicated in hearing loss

Tollefson, Mallory RaNae 15 December 2017 (has links)
With recent advances and cost reductions in next-generation sequencing (NGS), the amount of genetic sequence data is increasing rapidly. However, before patient-specific genetic information reaches its full potential to advance clinical diagnostics, the immense degree of genetic heterogeneity that contributes to human disease must be more fully understood. For example, although large numbers of genetic variations are discovered during clinical use of NGS, annotating and understanding the impact of such coding variations on protein phenotype remains a bottleneck (e.g., what is the molecular mechanism behind a deafness phenotype?). Fortunately, computational methods are emerging that can be used to efficiently study protein coding variants and thereby overcome the bottleneck brought on by the rapid adoption of clinical sequencing. To study proteins via physics-based computational algorithms, high-quality 3D structural models are essential. These protein models can be obtained using a variety of numerical optimization methods that operate on physics-based potential energy functions, and the resulting structures serve as input to downstream variation analysis algorithms. In this work, we applied a novel amino acid side-chain optimization algorithm, which operates on an advanced model of atomic interactions (the AMOEBA polarizable force field), to a set of 164 protein structural models implicated in deafness. The resulting models were evaluated with the MolProbity structure validation tool. MolProbity scores were originally calibrated to predict the quality of the X-ray diffraction data used to generate a given protein model (a score of 1.0 Å or lower indicates a model from high-quality data, while a score of 4.0 Å or higher reflects relatively poor data). In this work, the side-chain optimization algorithm improved the mean MolProbity score from 2.65 Å (42nd percentile) to nearly atomic resolution at 1.41 Å (95th percentile).
However, side-chain optimization with the AMOEBA many-body potential function is computationally expensive. A second contribution of this work is therefore a parallelization scheme that uses NVIDIA graphics processing units (GPUs) to accelerate the side-chain repacking algorithm. With one GPU, our side-chain optimization algorithm achieved a 25× speed-up over two Intel Xeon E5-2680v4 central processing units (CPUs). We expect the GPU acceleration scheme to lessen the demand on computing resources dedicated to protein structure optimization and thereby dramatically expand the number of protein structures available to aid in the interpretation of missense variations associated with deafness.

Towards Simulations of Binary Neutron Star Mergers and Core-Collapse Supernovae with GenASiS

Budiardja, Reuben Donald 01 August 2010 (has links)
This dissertation describes the current version of GenASiS and reports recent progress in its development. GenASiS is a new computational astrophysics code built for large-scale, multi-dimensional computer simulations of astrophysical phenomena, with primary emphasis on simulations of neutron star mergers and core-collapse supernovae. Neutron star mergers are of high interest to the astrophysics community because they should be a prodigious source of gravitational waves and the most promising candidates for gravitational wave detection. Neutron star mergers are also thought to be associated with the production of short-duration, hard-spectrum gamma-ray bursts, though the mechanism is not well understood. In contrast, core-collapse supernovae with massive progenitors are associated with long-duration, soft-spectrum gamma-ray bursts, with the 'collapsar' hypothesis as the favored mechanism. Of equal interest is the mechanism of core-collapse supernovae themselves, which has been at the forefront of many research efforts for the better part of half a century but remains a partially solved mystery. In addition, supernovae, and possibly neutron star mergers, are thought to be sites of the r-process nucleosynthesis responsible for producing many of the heavy elements; until we have a proper understanding of these events, we will have only a limited understanding of the origin of the elements. These questions provide some of the scientific motivations and guidelines for the development of GenASiS. In this document the equations and numerical scheme for Newtonian and relativistic magnetohydrodynamics are presented. A new FFT-based parallel solver for Poisson's equation in GenASiS is described. Adaptive mesh refinement in GenASiS, and a novel way to solve Poisson's equation on a refined mesh based on a multigrid algorithm, are also presented.
Following these descriptions, results of neutron star merger simulations with GenASiS, including their evolution and the gravitational wave signals and spectra they generate, are shown. In the context of core-collapse supernovae, we explore the capacity of the stationary shock instability to generate magnetic fields, starting from a weak, stationary, radial magnetic field in an initially spherically symmetric fluid configuration that models the stalled shock in the post-bounce supernova environment. Our results show that the magnetic energy can be amplified by almost four orders of magnitude. The amplification mechanisms for the magnetic fields are then explained.
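The FFT approach to Poisson's equation exploits the fact that the Laplacian is diagonal in Fourier space on a periodic mesh, so each mode is solved by dividing by -|k|². A minimal serial sketch in Python/NumPy (a generic spectral solver under a periodic-cube, zero-mean-source assumption — unrelated to GenASiS's actual parallel implementation):

```python
import numpy as np

def poisson_fft_3d(rho, length=2 * np.pi):
    """Solve laplacian(phi) = rho on a periodic cube, pseudo-spectrally."""
    n = rho.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=length / n)   # wavenumbers per axis
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    rho_hat = np.fft.fftn(rho)
    phi_hat = np.zeros_like(rho_hat)
    mask = k2 > 0
    phi_hat[mask] = -rho_hat[mask] / k2[mask]  # zero-mean gauge: drop k=0
    return np.real(np.fft.ifftn(phi_hat))

# Single-mode check: rho = sin(x) has the exact periodic solution -sin(x).
n = 32
x = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
X = np.meshgrid(x, x, x, indexing="ij")[0]
phi = poisson_fft_3d(np.sin(X))
assert np.allclose(phi, -np.sin(X))
```

For smooth sources the solve is spectrally accurate and costs O(n³ log n); a production gravity solver must additionally handle domain decomposition and, as in the multigrid variant described above, mesh refinement.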

Algorithms for Advection on Hybrid Parallel Computers

White, James Buford, III 01 May 2011 (has links)
Current climate models have a limited ability to increase spatial resolution because numerical stability requires the time step to decrease. I describe initial experiments with two independent but complementary strategies for attacking this "time barrier". First I describe computational experiments exploring the performance improvements from overlapping computation and communication on hybrid parallel computers. My test case is explicit time integration of linear advection with constant uniform velocity in a three-dimensional periodic domain. I present results for Fortran implementations using various combinations of MPI, OpenMP, and CUDA, with and without overlap of computation and communication. Second I describe a semi-Lagrangian method for tracer transport that is stable for arbitrary Courant numbers, along with a parallel implementation discretized on the cubed sphere. It shows optimal accuracy at Courant numbers of 10-20, more than an order of magnitude higher than explicit methods allow. Finally I describe the development and stability analyses of the time integrators and advection methods used in my experiments. I develop explicit single-step methods with stability up to Courant numbers of one in each dimension, hybrid explicit-implicit methods with stability for arbitrary Courant numbers, and interpolation operators that enable the arbitrary stability of semi-Lagrangian methods.
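To illustrate why semi-Lagrangian transport escapes the explicit CFL restriction, the one-dimensional sketch below (a generic textbook scheme with linear interpolation — not the cubed-sphere implementation from this work) traces each grid point back to its departure point x - u·dt and interpolates there; the step remains well-defined for any Courant number.

```python
import numpy as np

def semi_lagrangian_step(q, u, dt, dx):
    """One step of dq/dt + u*dq/dx = 0 on a periodic 1-D grid.

    Each grid point is traced back to its departure point x - u*dt and q is
    linearly interpolated there, so u*dt/dx may exceed 1 without instability.
    """
    n = q.size
    s = np.arange(n) - u * dt / dx          # departure points, index units
    i0 = np.floor(s).astype(int)
    w = s - i0                              # interpolation weight in [0, 1)
    return (1 - w) * q[i0 % n] + w * q[(i0 + 1) % n]

n = 64
q0 = np.exp(-0.5 * ((np.arange(n) - 20.0) / 3.0) ** 2)
q1 = semi_lagrangian_step(q0, u=1.0, dt=3.0, dx=1.0)   # Courant number 3
assert np.allclose(q1, np.roll(q0, 3))  # exact shift for integer Courant
```

For non-integer Courant numbers the linear interpolation introduces numerical diffusion, which is why the higher-order interpolation operators developed in the dissertation matter for accuracy.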
