121

A web-based high performance simulation system for transport and retention of dissolved contaminants in soils

Zeng, Honghai. January 2002 (has links)
Thesis (Ph. D.)--Mississippi State University. Department of Engineering. / Title from title screen. Includes bibliographical references.
122

Best effort MPI/RT as an alternative to MPI design and performance comparison /

Angadi, Raghavendra. January 2002 (has links)
Thesis (M.S.)--Mississippi State University. Department of Computer Science. / Title from title screen. Includes bibliographical references.
123

Petrophysical modeling and simulation study of geological CO₂ sequestration

Kong, Xianhui 24 June 2014 (has links)
Global warming and greenhouse gas (GHG) emissions have recently become a significant focus of engineering research. The geological sequestration of greenhouse gases such as carbon dioxide (CO₂) is one approach that has been proposed to reduce greenhouse gas emissions and slow down global warming. Geological sequestration involves the injection of produced CO₂ into subsurface formations and trapping the gas through many geological mechanisms, such as structural trapping, capillary trapping, dissolution, and mineralization. While some progress in our understanding of fluid flow in porous media has been made, many petrophysical phenomena that occur during geological CO₂ sequestration, such as multi-phase flow, capillarity, geochemical reactions, and geomechanical effects, remain inadequately studied and pose a challenge for continued study. Continued research on these issues is critical. Numerical simulators are essential tools to develop a better understanding of the geologic characteristics of brine reservoirs and to build support for future CO₂ storage projects. Modeling CO₂ injection requires the implementation of a multiphase flow model and an Equation of State (EOS) module to compute the dissolution of CO₂ in brine and vice versa. In this study, we used the Integrated Parallel Accurate Reservoir Simulator (IPARS) developed at the Center for Subsurface Modeling at The University of Texas at Austin to model the injection process and storage of CO₂ in saline aquifers. We developed and implemented new petrophysical models in IPARS, and applied these models to study the process of CO₂ sequestration. The research presented in this dissertation is divided into three parts. The first part of the dissertation discusses petrophysical and computational models for the mechanical, geological, and petrophysical phenomena occurring during CO₂ injection and sequestration. 
The effectiveness of CO₂ storage in saline aquifers is governed by the interplay of capillary, viscous, and buoyancy forces. Recent experimental data reveals the impact of pressure, temperature, and salinity on interfacial tension (IFT) between CO₂ and brine. The dependence of CO₂-brine relative permeability and capillary pressure on IFT is also clearly evident in published experimental results. Improved understanding of the mechanisms that control the migration and trapping of CO₂ in the subsurface is crucial to design future storage projects for long-term, safe containment. We have developed numerical models for CO₂ trapping and migration in aquifers, including a compositional flow model, a relative permeability model, a capillary model, an interfacial tension model, and others. The heterogeneities in porosity and permeability are also coupled to the petrophysical models. We have developed and implemented a general relative permeability model that combines the effects of pressure gradient, buoyancy, and capillary pressure in a compositional and parallel simulator. The significance of IFT variations on CO₂ migration and trapping is assessed. The variation of residual saturation is modeled based on interfacial tension and trapping number, and a hysteretic trapping model is also presented. The second part of this dissertation is a model validation and sensitivity study using coreflood simulation data derived from laboratory study. The motivation of this study is to gain confidence in the results of the numerical simulator by validating the models and the numerical accuracies using laboratory and field pilot scale results. Published steady state, core-scale CO₂/brine displacement results were selected as a reference basis for our numerical study. High-resolution compositional simulations of brine displacement with supercritical CO₂ are presented using IPARS. 
A three-dimensional (3D) numerical model of the Berea sandstone core was constructed using heterogeneous permeability and porosity distributions based on geostatistical data. The measured capillary pressure curve was scaled using the Leverett J-function to include local heterogeneity at the sub-core scale. Simulation results indicate that accurate representation of capillary pressure at sub-core scales is critical. Water drying and the shift in relative permeability had a significant impact on the final CO₂ distribution along the core. This study provided insights into the role of heterogeneity in the final CO₂ distribution, where a slight variation in porosity gives rise to a large variation in the CO₂ saturation distribution. The third part of this study is a simulation study using IPARS of the Cranfield pilot CO₂ sequestration field test, conducted by the Bureau of Economic Geology (BEG) at The University of Texas at Austin. In this CO₂ sequestration project, a total of approximately 2.5 million tons of supercritical CO₂ was injected into a saline aquifer about 10,000 ft deep over 2 years, beginning December 1st 2009. In this chapter, we use the simulation capabilities of IPARS to numerically model the CO₂ injection process in Cranfield. We conducted a corresponding history-matching study and obtained good agreement with field observations. Extensive sensitivity studies were also conducted for CO₂ trapping, fluid phase behavior, relative permeability, wettability, gravity and buoyancy, and capillary effects on sequestration. Simulation results are consistent with the observed CO₂ breakthrough time at the first observation well. Numerical results are also consistent with bottomhole injection flowing pressure for the first 350 days before the rate increase. The abnormal pressure response with rate increase on day 350 indicates possible geomechanical issues, which can be represented in simulation using an induced fracture near the injection well. 
The recorded injection well bottomhole pressure data were successfully matched after modeling the fracture in the simulation model. Results also illustrate the importance of using accurate trapping models to predict CO₂ immobilization behavior. The impact of CO₂/brine relative permeability curves and trapping model on bottom-hole injection pressure is also demonstrated.
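
The Leverett J-function scaling mentioned in this abstract can be illustrated with a minimal Python sketch. It is not taken from IPARS or from the dissertation. It assumes interfacial tension and contact angle are uniform across the core, so that J(Sw) = Pc · sqrt(k/φ) / (σ cos θ) is the same everywhere and a reference capillary pressure can be rescaled cell by cell. The function name and all numeric values are hypothetical.

```python
import math


def leverett_scaled_pc(pc_ref, k_ref, phi_ref, k_local, phi_local):
    """Rescale a reference capillary pressure to a cell with different k and phi.

    Assumption (not from the source): sigma and cos(theta) are uniform, so
    equal J(Sw) implies Pc_local = Pc_ref * sqrt((k_ref/phi_ref) / (k_local/phi_local)).
    """
    return pc_ref * math.sqrt((k_ref / phi_ref) / (k_local / phi_local))


# Hypothetical numbers: core-average properties and one tighter cell.
pc_cell = leverett_scaled_pc(pc_ref=10.0, k_ref=400.0, phi_ref=0.20,
                             k_local=100.0, phi_local=0.20)
print(pc_cell)  # capillary pressure in the same units as pc_ref
```

With equal porosity, a fourfold drop in permeability doubles the scaled capillary pressure at a given saturation, which is one way small property variations can translate into large saturation contrasts.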
124

Modeling Cardiovascular Hemodynamics Using the Lattice Boltzmann Method on Massively Parallel Supercomputers

Randles, Amanda Elizabeth 24 September 2013 (has links)
Accurate and reliable modeling of cardiovascular hemodynamics has the potential to improve understanding of the localization and progression of heart diseases, which are currently the most common cause of death in Western countries. However, building a detailed, realistic model of human blood flow is a formidable mathematical and computational challenge. The simulation must combine the motion of the fluid, the intricate geometry of the blood vessels, continual changes in flow and pressure driven by the heartbeat, and the behavior of suspended bodies such as red blood cells. Such simulations can provide insight into factors like endothelial shear stress that act as triggers for the complex biomechanical events that can lead to atherosclerotic pathologies. Currently, it is not possible to measure endothelial shear stress in vivo, making these simulations a crucial component to understanding and potentially predicting the progression of cardiovascular disease. In this thesis, an approach for efficiently modeling the fluid movement coupled to the cell dynamics in real-patient geometries while accounting for the additional force from the expansion and contraction of the heart will be presented and examined. First, a novel method to couple a mesoscopic lattice Boltzmann fluid model to the microscopic molecular dynamics model of cell movement is elucidated. A treatment of red blood cells as extended structures, a method to handle highly irregular geometries through topology driven graph partitioning, and an efficient molecular dynamics load balancing scheme are introduced. These result in a large-scale simulation of the cardiovascular system, with a realistic description of the complex human arterial geometry, from centimeters down to the spatial resolution of red-blood cells. The computational methods developed to enable scaling of the application to 294,912 processors are discussed, thus empowering the simulation of a full heartbeat. 
Second, further extensions to enable the modeling of fluids in vessels with smaller diameters and a method for introducing the deformational forces exerted on the arterial flows from the movement of the heart by borrowing concepts from cosmodynamics are presented. These additional forces have a great impact on the endothelial shear stress. Third, the fluid model is extended to not only recover Navier-Stokes hydrodynamics, but also a wider range of Knudsen numbers, which is especially important in micro- and nano-scale flows. The tradeoffs of many optimization methods, such as the use of deep halo level ghost cells that, alongside hybrid programming models, reduce the impact of such higher-order models and enable efficient modeling of extreme regimes of computational fluid dynamics, are discussed. Fourth, the extension of these models to other research questions like clogging in microfluidic devices and determining the severity of coarctation of the aorta is presented. Through this work, a validation of these methods by taking real patient data and the measured pressure value before the narrowing of the aorta and predicting the pressure drop across the coarctation is shown. Comparison with the measured pressure drop in vivo highlights the accuracy and potential impact of such patient-specific simulations. Finally, a method to enable the simulation of longer trajectories in time by discretizing both spatially and temporally is presented. In this method, a serial coarse iterator is used to initialize data at discrete time steps for a fine model that runs in parallel. This coarse solver is based on a larger time step and typically a coarser discretization in space. Iterative refinement enables the compute-intensive fine iterator to be modeled with temporal parallelization. The algorithm consists of a series of predictor-corrector iterations completing when the results have converged within a certain tolerance. 
Combined, these developments allow large fluid models to be simulated for longer time durations than previously possible. / Engineering and Applied Sciences
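
The time-parallel scheme described at the end of this abstract (serial coarse iterator, parallel fine iterator, predictor-corrector refinement) resembles the well-known parareal algorithm. The Python sketch below illustrates that general idea on the scalar test equation du/dt = λu. It is an assumption-laden toy, not the thesis code: the explicit Euler propagators, the function names, and all parameter values are hypothetical, and the per-slice fine solves are written as an ordinary loop even though they are the part that could run in parallel.

```python
import math


def coarse(u, t0, t1, lam):
    # Cheap propagator: a single explicit Euler step across the slice.
    return u + (t1 - t0) * lam * u


def fine(u, t0, t1, lam, substeps=100):
    # Expensive propagator: many small explicit Euler steps.
    dt = (t1 - t0) / substeps
    for _ in range(substeps):
        u = u + dt * lam * u
    return u


def parareal(u0, t_end, n_slices, lam, tol=1e-10, max_iter=50):
    ts = [t_end * i / n_slices for i in range(n_slices + 1)]
    # Prediction: serial coarse sweep to initialize every time slice.
    u = [u0]
    for n in range(n_slices):
        u.append(coarse(u[n], ts[n], ts[n + 1], lam))
    for k in range(max_iter):
        # These per-slice solves are independent of each other.
        f_old = [fine(u[n], ts[n], ts[n + 1], lam) for n in range(n_slices)]
        g_old = [coarse(u[n], ts[n], ts[n + 1], lam) for n in range(n_slices)]
        # Correction: serial coarse sweep plus the fine/coarse mismatch.
        new = [u0]
        for n in range(n_slices):
            g_new = coarse(new[n], ts[n], ts[n + 1], lam)
            new.append(g_new + f_old[n] - g_old[n])
        change = max(abs(a - b) for a, b in zip(new, u))
        u = new
        if change < tol:
            break
    return u, k + 1


solution, iterations = parareal(u0=1.0, t_end=2.0, n_slices=10, lam=-1.0)
print(solution[-1], iterations)
```

In this formulation the iterate matches the serial fine solution on at least one more slice per iteration, so any speedup depends on convergence in far fewer iterations than there are slices.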
125

The Case For Hardware Overprovisioned Supercomputers

Patki, Tapasya January 2015 (has links)
Power management is one of the most critical challenges on the path to exascale supercomputing. High Performance Computing (HPC) centers today are designed to be worst-case power provisioned, leading to two main problems: limited application performance and under-utilization of procured power. In this dissertation we introduce hardware overprovisioning: a novel, flexible design methodology for future HPC systems that addresses the aforementioned problems and leads to significant improvements in application and system performance under a power constraint. We first establish that choosing the right configuration based on application characteristics when using hardware overprovisioning can improve application performance under a power constraint by up to 62%. We conduct a detailed analysis of the infrastructure costs associated with hardware overprovisioning and show that it is an economically viable supercomputing design approach. We then develop RMAP (Resource MAnager for Power), a power-aware, low-overhead, scalable resource manager for future hardware overprovisioned HPC systems. RMAP addresses the issue of under-utilized power by using power-aware backfilling and improves job turnaround times by up to 31%. This dissertation opens up several new avenues for research in power-constrained supercomputing as we venture toward exascale, and we conclude by enumerating these.
126

Evaluation and Optimization of Turnaround Time and Cost of HPC Applications on the Cloud

Marathe, Aniruddha Prakash January 2014 (has links)
The popularity of Amazon's EC2 cloud platform has increased in the commercial and scientific high-performance computing (HPC) application domains in recent years. However, many HPC users consider dedicated high-performance clusters, typically found in large compute centers such as those in national laboratories, to be far superior to EC2 because of the significant communication overhead of the latter. We find this view to be quite narrow and argue that the proper metrics for comparing high-performance clusters to EC2 are turnaround time and cost. In this work, we first compare the HPC-grade EC2 cluster to top-of-the-line HPC clusters based on turnaround time and total cost of execution. When measuring turnaround time, we include expected queue wait time on HPC clusters. Our results show that although, as expected, standard HPC clusters are superior in raw performance, they suffer from potentially significant queue wait times. We show that EC2 clusters may produce better turnaround times due to typically lower queue wait times. To estimate cost, we developed a pricing model, relative to EC2's node-hour prices, to set node-hour prices for (currently free) HPC clusters. We observe that the cost-effectiveness of running an application on a cluster depends on raw performance and application scalability. However, despite the potentially lower queue wait and turnaround times, the primary barrier to using clouds for many HPC users is the cost. Amazon EC2 provides a fixed-cost option (called on-demand) and a variable-cost, auction-based option (called the spot market). The spot market trades lower cost for potential interruptions that necessitate checkpointing; if the market price exceeds the bid price, a node is taken away from the user without warning. We explore techniques to maximize performance per dollar given a time constraint within which an application must complete. 
Specifically, we design and implement multiple techniques to reduce expected cost by exploiting redundancy in the EC2 spot market. We then design an adaptive algorithm that selects a scheduling algorithm and determines the bid price. We show that our adaptive algorithm executes programs up to 7x cheaper than using the on-demand market and up to 44% cheaper than the best non-redundant, spot-market algorithm. Finally, we extend our adaptive algorithm to exploit several opportunities for cost-savings on the EC2 spot market. First, we incorporate application scalability characteristics into our adaptive policy. We show that the adaptive algorithm informed with scalability characteristics of applications achieves up to 56% cost-savings compared to the expected cost for the base adaptive algorithm run at a fixed, user-defined scale. Second, we demonstrate potential for obtaining considerable free computation time on the spot market enabled by its hour-boundary pricing model.
127

Autonomic Programming Paradigm for High Performance Computing

Jararweh, Yaser January 2010 (has links)
The advances in computing and communication technologies and software tools have resulted in an explosive growth in networked applications and information services that cover all aspects of our life. These services and applications are inherently complex, dynamic and heterogeneous. In a similar way, the underlying information infrastructure, e.g. the Internet, is large, complex, heterogeneous and dynamic, globally aggregating large numbers of independent computing and communication resources. The combination of the two results in application development and management complexities that break current computing paradigms, which are based on static behaviors. As a result, applications, programming environments and information infrastructures are rapidly becoming fragile, unmanageable and insecure. This has led researchers to consider alternative programming paradigms and management techniques that are based on strategies used by biological systems. The autonomic programming paradigm is inspired by the human autonomic nervous system, which handles complexity, uncertainty and abnormality. The overarching goal of the autonomic programming paradigm is to help build systems and applications capable of self-management. First, we investigated large-scale scientific computing applications, which generally go through different execution phases at run time; each phase has different computational, communication and storage requirements as well as different physical characteristics. In this dissertation, we present the Physics Aware Optimization (PAO) paradigm, which enables programmers to identify the appropriate solution methods to exploit the heterogeneity and the dynamism of the application execution states. We implement a Physics Aware Optimization Manager to exploit the PAO paradigm. 
In addition, we present a self-configuration paradigm based on the principles of autonomic computing that can efficiently handle complexity, dynamism and uncertainty in configuring server and networked systems and their applications. Our approach is based on making any resource or application operate as an Autonomic Component (that is, a self-managed component) by using our autonomic programming paradigm. Our PAO technique for a medical application yielded about a 3X performance improvement with 98.3% simulation accuracy compared to traditional performance optimization techniques. Also, our self-configuration management for power and performance in a GPU cluster demonstrated 53.7% power savings for a CUDA workload while maintaining the cluster performance within given acceptable thresholds.
128

Cooperative Resource Management for Parallel and Distributed Systems

Klein-Halmaghi, Cristian 29 November 2012 (has links) (PDF)
High-Performance Computing (HPC) resources, such as Supercomputers, Clusters, Grids and HPC Clouds, are managed by Resource Management Systems (RMSs) that multiplex resources among multiple users and decide how computing nodes are allocated to user applications. As more and more petascale computing resources are built and exascale is to be achieved by 2020, optimizing resource allocation to applications is critical to ensure their efficient execution. However, current RMSs, such as batch schedulers, only offer a limited interface. In most cases, the application has to blindly choose resources at submittal without being able to adapt its choice to the state of the target resources, either before it starts or during execution. The goal of this Thesis is to improve resource management, so as to allow applications to efficiently allocate resources. We achieve this by proposing software architectures that promote collaboration between the applications and the RMS, thus allowing applications to negotiate the resources they run on. To this end, we start by analysing the various types of applications and their unique resource requirements, categorizing them into rigid, moldable, malleable and evolving. For each case, we highlight the opportunities they open up for improving resource management. The first contribution deals with moldable applications, for which resources are only negotiated before they start. We propose CooRMv1, a centralized RMS architecture, which delegates resource selection to the application launchers. Simulations show that the solution is both scalable and fair. The results are validated through a prototype implementation deployed on Grid'5000. Second, we focus on negotiating allocations on geographically-distributed resources, managed by multiple institutions. We build upon CooRMv1 and propose distCooRM, a distributed RMS architecture, which allows moldable applications to efficiently co-allocate resources managed by multiple independent agents. 
Simulation results show that distCooRM is well-behaved and scales well for a reasonable number of applications. Next, attention is shifted to run-time negotiation of resources, so as to improve support for malleable and evolving applications. We propose CooRMv2, a centralized RMS architecture that enables efficient scheduling of evolving applications, especially non-predictable ones. It allows applications to inform the RMS about their maximum expected resource usage, through pre-allocations. Resources which are pre-allocated but unused can be filled by malleable applications. Simulation results show that considerable gains can be achieved. Last, production-ready software is used as a starting point, to illustrate the interest as well as the difficulty of improving cooperation between existing systems. GridTLSE is used as an application and DIET as an RMS to study a previously unsupported use-case. We identify the underlying problem of scheduling optional computations and propose an architecture to solve it. Real-life experiments done on the Grid'5000 platform show that several metrics are improved, such as user satisfaction, fairness and the number of completed requests. Moreover, it is shown that the solution is scalable.
129

CellPilot: An extension of the Pilot library for Cell Broadband Engine processors and heterogeneous clusters

Girard, Natalie 13 January 2012 (has links)
The CellPilot library provides a uniform communication programming model, based on Pilot's process/channel approach, for clusters of Cell Broadband Engine processors. Pilot, a thin layer on top of the Message Passing Interface (MPI) library, allows processes to read/write messages on channels defined between pairs of processes on the cluster, but Pilot alone does not help a Cell programmer cope with the considerable complexities of intra-Cell communication. With CellPilot, programmers still design software in terms of processes, but these can now be located on a Cell node's Power Processor Elements (PPEs) or Synergistic Processing Elements (SPEs), or on non-Cell nodes within a heterogeneous Cell cluster, and communication is accomplished via channels between process pairs. Programs are coded in terms of reading and writing on those channels, whereupon CellPilot transparently applies whichever communication mechanisms are required to transport the message, regardless of its endpoints. This gives the programmer a way to handle inter-process communication while avoiding low-level I/O operations and the use of multiple libraries.
130

Scalable data-management systems for Big Data

Tran, Viet-Trung 21 January 2013 (has links) (PDF)
Big Data can be characterized by three V's: Big Volume refers to the unprecedented growth in the amount of data; Big Velocity refers to the growth in the speed of moving data in and out of management systems; and Big Variety refers to the growth in the number of different data formats. Managing Big Data requires fundamental changes in the architecture of data management systems. Data storage systems must continue to evolve in order to adapt to the growth of data. They need to be scalable while maintaining high performance for data accesses. This thesis focuses on building scalable data management systems for Big Data. Our first and second contributions address the challenge of providing efficient support for Big Volume of data in data-intensive high performance computing (HPC) environments. In particular, we address the shortcomings of existing approaches in handling atomic, non-contiguous I/O operations in a scalable fashion. We propose and implement a versioning-based mechanism that can be leveraged to offer isolation for non-contiguous I/O without the need to perform expensive synchronizations. In the context of parallel array processing in HPC, we introduce Pyramid, a large-scale, array-oriented storage system. It revisits the physical organization of data in distributed storage systems for scalable performance. Pyramid favors multidimensional-aware data chunking that closely matches the access patterns generated by applications. Pyramid also favors distributed metadata management and versioning concurrency control to eliminate synchronization under concurrency. Our third contribution addresses Big Volume at the scale of geographically distributed environments. We consider BlobSeer, a distributed versioning-oriented data management service, and we propose BlobSeer-WAN, an extension of BlobSeer optimized for such geographically distributed environments. BlobSeer-WAN takes into account the latency hierarchy by favoring local metadata accesses. 
BlobSeer-WAN features asynchronous metadata replication and a vector-clock implementation for collision resolution. To cope with the Big Velocity characteristic of Big Data, our last contribution features DStore, an in-memory document-oriented store that scales vertically by leveraging the large memory capacity of multicore machines. DStore demonstrates fast and atomic complex transaction processing in data writing, while maintaining high-throughput read access. DStore follows a single-threaded execution model to execute update transactions sequentially, while relying on versioning concurrency control to enable a large number of simultaneous readers.
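
The combination described for DStore, sequential updates from a single writer plus versioned snapshots for readers, can be illustrated with a minimal Python sketch. It is not DStore's implementation or API: the `VersionedStore` class and its methods are hypothetical, and copying the whole dictionary for each version is a deliberately naive stand-in for whatever structure sharing a real store would use.

```python
class VersionedStore:
    """Toy key-value store: one writer publishes immutable versions."""

    def __init__(self):
        # Version 0 is the empty store; published versions are never mutated.
        self._versions = [{}]

    def snapshot(self):
        # A reader pins the latest published version number.
        return len(self._versions) - 1

    def read(self, version, key):
        # Reads go against the pinned version, so later writes are invisible.
        return self._versions[version].get(key)

    def apply(self, updates):
        # Called only by the single writer: build the next version from the
        # latest one and publish it in one step, so the update is atomic.
        new_version = dict(self._versions[-1])
        new_version.update(updates)
        self._versions.append(new_version)
        return len(self._versions) - 1


store = VersionedStore()
v0 = store.snapshot()
v1 = store.apply({"a": 1, "b": 2})
v2 = store.apply({"a": 3})
print(store.read(v1, "a"), store.read(v2, "a"))
```

Because a reader only ever sees a fully published version, readers need no locks and never observe a half-applied multi-key update.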
