Global ETD Search

1	Computation offloading for algorithms in absence of the Cloud Sthapit, Saurav January 2018 (has links) Mobile cloud computing is a way of delegating complex algorithms from a mobile device to the cloud so that tasks complete quickly and the mobile device saves energy. However, the cloud may not always be available or suitable. For example, in a battlefield scenario, the cloud may not be reachable. This work considers neighbouring devices as alternatives to the cloud for offloading computation and presents three key contributions: a comprehensive investigation of the trade-off between computation and communication, a Multi-Objective Optimisation based approach to offloading, and Queuing Theory based algorithms that demonstrate the benefits of offloading to neighbours. Initially, the states of neighbouring devices are assumed to be known, and the computation offloading decision is formulated as a multi-objective optimisation problem. Novel Pareto optimal solutions are proposed. The results on a simulated dataset show up to a 30% increase in performance even when cloud computing is not available. However, information about the environment is seldom known completely. In Chapter 5, a more realistic environment is considered, with delayed node state information and partially connected sensors. The network of sensors is modelled as a network of queues (an open Jackson network). The offloading problem is posed as a minimum cost problem and solved using linear solvers. In addition to the simulated dataset, the proposed solution is tested on a real computer vision dataset. The experiments on the random waypoint dataset showed up to a 33% boost in performance, whereas on the real dataset, exploiting the temporal and spatial distribution of the targets, a significantly higher increase in performance is achieved.
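To illustrate the queueing model named in this abstract, the sketch below solves the traffic equations of a small open Jackson network and uses Little's law to estimate the mean time a task spends in the network. It is not the thesis's algorithm: the three-node topology, all rates, and the function names are hypothetical, communication costs are ignored, and the abstract does not state the actual cost function.

```python
def traffic_rates(external, routing, iterations=200):
    """Solve lam_j = gamma_j + sum_i lam_i * p_ij by fixed-point iteration.

    external[j] is the outside arrival rate gamma_j at node j;
    routing[i][j] is the probability p_ij that a task leaving node i
    goes to node j (the remainder leaves the network).
    """
    n = len(external)
    lam = list(external)
    for _ in range(iterations):
        lam = [external[j] + sum(lam[i] * routing[i][j] for i in range(n))
               for j in range(n)]
    return lam


def mean_response_time(external, routing, service):
    """Mean time a task spends in the network, in the same time unit as the rates.

    Each node of an open Jackson network behaves like an M/M/1 queue with
    mean occupancy rho / (1 - rho); Little's law converts the total
    occupancy into a mean sojourn time.
    """
    lam = traffic_rates(external, routing)
    jobs_in_network = 0.0
    for rate, mu in zip(lam, service):
        rho = rate / mu
        if rho >= 1.0:
            raise ValueError("unstable node: arrival rate >= service rate")
        jobs_in_network += rho / (1.0 - rho)
    return jobs_in_network / sum(external)


# Hypothetical service rates (tasks per second) for a device and two neighbours.
service = [5.0, 3.0, 3.0]

# Case A: every task is processed on node 0, nothing is routed onward.
no_routing = [[0.0, 0.0, 0.0] for _ in range(3)]
local_only = mean_response_time([4.0, 0.0, 0.0], no_routing, service)

# Case B: half of the tasks are sent to node 1, which passes half of its
# finished tasks on to node 2 (reachable only through node 1).
relay = [[0.0, 0.0, 0.0],
         [0.0, 0.0, 0.5],
         [0.0, 0.0, 0.0]]
offloaded = mean_response_time([2.0, 2.0, 0.0], relay, service)

print("local only:", local_only, "with offloading:", offloaded)
```

In this toy setting the offloaded configuration has the lower mean response time because no single queue runs close to saturation; a realistic comparison would also have to charge for transmission delay and stale state information, as the abstract notes.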
2	Linear solvers and coupling methods for compositional reservoir simulators Li, Wenjun, doctor of engineering 17 February 2011 (has links) Three compositional reservoir simulators have been developed in the Department of Petroleum and Geosystems Engineering at The University of Texas at Austin (UT-Austin): UTCOMP (miscible gas flooding simulator), UTCHEM (chemical flooding simulator), and GPAS (General Purpose Adaptive Simulator). UTCOMP and UTCHEM simulators have been used by various oil companies for solving a variety of field problems. The efficiency and accuracy of each simulator becomes critically important when they are used to solve field problems. In this study, two well-developed solver packages, SAMG and HYPRE, along with existing solvers were compared. Our numerical results showed that SAMG can be an excellent solver for the usage in the three simulators for solving problems with a high accuracy requirement and long simulation times, and BoomerAMG in HYPRE package can also be a good solver for application in the UTCHEM simulator. In order to investigate the flexibility and the efficiency of a partitioned coupling method, the second part of this thesis presents a new implementation using a partition method for a thermal module in an equation-of-state (EOS) compositional simulator, the General Purpose Adaptive Simulator (GPAS) developed at The University of Texas at Austin. The finite difference method (FDM) was used for the solution of governing partial differential equations. Specifically, the new coupled implementation was based on the Schur complement method. For the partition method, two suitable acceleration techniques were constructed. One technique was the optimized choice of preconditioner for the Schur complement; the other was the optimized selection of tolerances for the two solution steps. To validate the implementation, we present simulation examples of hot water injection in an oil reservoir. 
The numerical comparison between the new implementation and the traditional, fully implicit method showed that the partition method is not only more flexible, but also faster than the classical, fully implicit method for the same test problems without sacrificing accuracy. In conclusion, the new implementation of the partition method is a more flexible and more efficient method for coupling a new module into an existing simulator than the classical, fully implicit method. The third part of this thesis presents another type of coupling method, iterative coupling, of which three variants have been implemented in GPAS with the thermal module: FICM (Fully, Iterative Coupling Method), GICM (General, Iterative Coupling Method), and LICM (Loose, Iterative Coupling Method). The results show that LICM diverges, whereas GICM and FICM work normally. GICM is the fastest among the compared methods, and FICM is similar in efficiency to CFIM (Classic Fully Implicit Method). Although GICM is the fastest method, it is less accurate than FICM in the test cases carried out in this study. Compositional reservoir simulators UTCOMP UTCHEM GPAS SAMG HYPRE BoomerAMG Coupling methods FICM GICM LICM Linear solvers
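For readers unfamiliar with the Schur complement method named in this abstract, the generic block elimination behind it can be written as follows. The two-block partition and all symbols ($A$, $B$, $C$, $D$, $x$, $y$, $f$, $g$) are assumptions for illustration; the abstract does not give the thesis's actual block structure.

```latex
\begin{aligned}
\begin{pmatrix} A & B \\ C & D \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
&= \begin{pmatrix} f \\ g \end{pmatrix}, \\
S &= D - C A^{-1} B, \\
S\,y &= g - C A^{-1} f, \\
x &= A^{-1}\,(f - B\,y).
\end{aligned}
```

The coupled system is thus solved in two steps, one with the Schur complement $S$ and one with the block $A$, which is consistent with the abstract's mention of a preconditioner for the Schur complement and separate tolerances for the two solution steps.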
3	Methods for solving discontinuous-Galerkin finite element equations with application to neutron transport Murphy, Steven 26 August 2015 (has links) (PDF) We consider high order discontinuous-Galerkin finite element methods for partial differential equations, with a focus on the neutron transport equation. We begin by examining a method for preprocessing block-sparse matrices, of the type that arise from discontinuous-Galerkin methods, prior to factorisation by a multifrontal solver. Numerical experiments on large two and three dimensional matrices show that this pre-processing method achieves a significant reduction in fill-in, when compared to methods that fail to exploit block structures. A discontinuous-Galerkin finite element method for the neutron transport equation is derived that employs high order finite elements in both space and angle. Parallel Krylov subspace based solvers are considered for both source problems and $k_{eff}$-eigenvalue problems. An a-posteriori error estimator is derived and implemented as part of an h-adaptive mesh refinement algorithm for neutron transport $k_{eff}$-eigenvalue problems. This algorithm employs a projection-based error splitting in order to balance the computational requirements between the spatial and angular parts of the computational domain. An hp-adaptive algorithm is presented and results are collected that demonstrate greatly improved efficiency compared to the h-adaptive algorithm, both in terms of reduced computational expense and enhanced accuracy. Computed eigenvalues and effectivities are presented for a variety of challenging industrial benchmarks. Accurate error estimation (with effectivities of 1) is demonstrated for a collection of problems with inhomogeneous, irregularly shaped spatial domains as well as multiple energy groups. 
Numerical results are presented showing that the hp-refinement algorithm can achieve exponential convergence with respect to the number of degrees of freedom in the finite element space. A-posteriori methods Hp-refinement Discontinuous-Galerkin methods Neutron Transport Sparse matrices Linear Solvers
4	Methods for solving discontinuous-Galerkin finite element equations with application to neutron transport Murphy, Steven 26 August 2015 (has links) We consider high order discontinuous-Galerkin finite element methods for partial differential equations, with a focus on the neutron transport equation. We begin by examining a method for preprocessing block-sparse matrices, of the type that arise from discontinuous-Galerkin methods, prior to factorisation by a multifrontal solver. Numerical experiments on large two and three dimensional matrices show that this pre-processing method achieves a significant reduction in fill-in, when compared to methods that fail to exploit block structures. A discontinuous-Galerkin finite element method for the neutron transport equation is derived that employs high order finite elements in both space and angle. Parallel Krylov subspace based solvers are considered for both source problems and $k_{eff}$-eigenvalue problems. An a-posteriori error estimator is derived and implemented as part of an h-adaptive mesh refinement algorithm for neutron transport $k_{eff}$-eigenvalue problems. This algorithm employs a projection-based error splitting in order to balance the computational requirements between the spatial and angular parts of the computational domain. An hp-adaptive algorithm is presented and results are collected that demonstrate greatly improved efficiency compared to the h-adaptive algorithm, both in terms of reduced computational expense and enhanced accuracy. Computed eigenvalues and effectivities are presented for a variety of challenging industrial benchmarks. Accurate error estimation (with effectivities of 1) is demonstrated for a collection of problems with inhomogeneous, irregularly shaped spatial domains as well as multiple energy groups.
Numerical results are presented showing that the hp-refinement algorithm can achieve exponential convergence with respect to the number of degrees of freedom in the finite element space. A-posteriori methods Hp-refinement Discontinuous-Galerkin methods Neutron Transport Sparse matrices Linear Solvers
5	An open source HPC-enabled model of cardiac defibrillation of the human heart Bernabeu Llinares, Miguel Oscar January 2011 (has links) Sudden cardiac death following cardiac arrest is a major killer in the industrialised world. The leading cause of sudden cardiac death are disturbances in the normal electrical activation of cardiac tissue, known as cardiac arrhythmia, which severely compromise the ability of the heart to fulfill the body's demand of oxygen. Ventricular fibrillation (VF) is the most deadly form of cardiac arrhythmia. Furthermore, electrical defibrillation through the application of strong electric shocks to the heart is the only effective therapy against VF. Over the past decades, a large body of research has dealt with the study of the mechanisms underpinning the success or failure of defibrillation shocks. The main mechanism of shock failure involves shocks terminating VF but leaving the appropriate electrical substrate for new VF episodes to rapidly follow (i.e. shock-induced arrhythmogenesis). A large number of models have been developed for the in silico study of shock-induced arrhythmogenesis, ranging from single cell models to three-dimensional ventricular models of small mammalian species. However, no extrapolation of the results obtained in the aforementioned studies has been done in human models of ventricular electrophysiology. The main reason is the large computational requirements associated with the solution of the bidomain equations of cardiac electrophysiology over large anatomically-accurate geometrical models including representation of fibre orientation and transmembrane kinetics. In this Thesis we develop simulation technology for the study of cardiac defibrillation in the human heart in the framework of the open source simulation environment Chaste. The advances include the development of novel computational and numerical techniques for the solution of the bidomain equations in large-scale high performance computing resources. 
More specifically, we have considered the implementation of effective domain decomposition, the development of new numerical techniques for the reduction of communication in Chaste's finite element method (FEM) solver, and the development of mesh-independent preconditioners for the solution of the linear system arising from the FEM discretisation of the bidomain equations. The developments presented in this Thesis have brought Chaste to the level of performance and functionality required to perform bidomain simulations with large three-dimensional cardiac geometries made of tens of millions of nodes and including accurate representation of fibre orientation and membrane kinetics. These advances have enabled the in silico study of shock-induced arrhythmogenesis for the first time in the human heart, thereby bridging an important gap in the field of cardiac defibrillation research. 616.1280645
6	Numerical methods for dynamic micromagnetics Shepherd, David January 2015 (has links) Micromagnetics is a continuum mechanics theory of magnetic materials widely used in industry and academia. In this thesis we describe a complete numerical method, with a number of novel components, for the computational solution of dynamic micromagnetic problems by solving the Landau-Lifshitz-Gilbert (LLG) equation. In particular we focus on the use of the implicit midpoint rule (IMR), a time integration scheme which conserves several important properties of the LLG equation. We use the finite element method for spatial discretisation, and use nodal quadrature schemes to retain the conservation properties of IMR despite the weak-form approach. We introduce a novel, generally-applicable adaptive time step selection algorithm for the IMR. The resulting scheme selects error-appropriate time steps for a variety of problems, including the semi-discretised LLG equation. We also show that it retains the conservation properties of the fixed step IMR for the LLG equation. We demonstrate how hybrid FEM/BEM magnetostatic calculations can be coupled to the LLG equation in a monolithic manner. This allows the coupled solver to maintain all properties of the standard time integration scheme, in particular stability properties and the energy conservation property of IMR. We also develop a preconditioned Krylov solver for the coupled system which can efficiently solve the monolithic system provided that an effective preconditioner for the LLG sub-problem is available. Finally we investigate the effect of the spatial discretisation on the comparative effectiveness of implicit and explicit time integration schemes (i.e. the stiffness). We find that explicit methods are more efficient for simple problems, but for the fine spatial discretisations required in a number of more complex cases implicit schemes become orders of magnitude more efficient. 620.1
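The conservation property of the implicit midpoint rule mentioned in this abstract can be seen in a minimal sketch for a single magnetic moment in a constant field. This is not the thesis's finite element solver: the dimensionless Landau-Lifshitz form of the LLG equation, the fixed-point solve of the implicit equation, and every name and parameter value below are assumptions chosen for illustration.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def llg_rhs(m, h, alpha):
    """dm/dt = -(m x h + alpha * m x (m x h)) / (1 + alpha^2), dimensionless."""
    mxh = cross(m, h)
    mxmxh = cross(m, mxh)
    scale = -1.0 / (1.0 + alpha ** 2)
    return tuple(scale * (mxh[i] + alpha * mxmxh[i]) for i in range(3))


def imr_step(m, h, alpha, dt, tol=1e-15, max_iter=100):
    """One implicit midpoint step: m_new = m + dt * f((m + m_new) / 2).

    The implicit equation is solved by fixed-point iteration, which is
    adequate for the small time step used here.
    """
    m_new = m
    for _ in range(max_iter):
        mid = tuple(0.5 * (m[i] + m_new[i]) for i in range(3))
        f = llg_rhs(mid, h, alpha)
        candidate = tuple(m[i] + dt * f[i] for i in range(3))
        change = max(abs(candidate[i] - m_new[i]) for i in range(3))
        m_new = candidate
        if change < tol:
            break
    return m_new


def euler_step(m, h, alpha, dt):
    """Explicit Euler step, included only for comparison."""
    f = llg_rhs(m, h, alpha)
    return tuple(m[i] + dt * f[i] for i in range(3))


h = (0.0, 0.0, 1.0)          # constant applied field (hypothetical)
alpha, dt, steps = 0.1, 0.1, 200

m_imr = (1.0, 0.0, 0.0)
m_euler = (1.0, 0.0, 0.0)
for _ in range(steps):
    m_imr = imr_step(m_imr, h, alpha, dt)
    m_euler = euler_step(m_euler, h, alpha, dt)

print("IMR   | |m| - 1 | =", abs(norm(m_imr) - 1.0))
print("Euler | |m| - 1 | =", abs(norm(m_euler) - 1.0))
```

Because the right-hand side is always perpendicular to the midpoint value, the update is perpendicular to the sum of the old and new vectors, so the length of the moment is preserved up to the accuracy of the nonlinear solve, whereas explicit Euler lets it drift.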
7	A Computational Framework for Assessing and Optimizing the Performance of Observational Networks in 4D-Var Data Assimilation Cioaca, Alexandru 04 September 2013 (has links) A deep scientific understanding of complex physical systems, such as the atmosphere, can be achieved neither by direct measurements nor by numerical simulations alone. Data assimilation is a rigorous procedure to fuse information from a priori knowledge of the system state, the physical laws governing the evolution of the system, and real measurements, all with associated error statistics. Data assimilation produces best (a posteriori) estimates of model states and parameter values, and results in considerably improved computer simulations. The acquisition and use of observations in data assimilation raises several important scientific questions related to optimal sensor network design, quantification of data impact, pruning redundant data, and identifying the most beneficial additional observations. These questions originate in operational data assimilation practice, and have started to attract considerable interest in the recent past. This dissertation advances the state of knowledge in four dimensional variational (4D-Var) - data assimilation by developing, implementing, and validating a novel computational framework for estimating observation impact and for optimizing sensor networks. The framework builds on the powerful methodologies of second-order adjoint modeling and the 4D-Var sensitivity equations. Efficient computational approaches for quantifying the observation impact include matrix free linear algebra algorithms and low-rank approximations of the sensitivities to observations. The sensor network configuration problem is formulated as a meta-optimization problem. Best values for parameters such as sensor location are obtained by optimizing a performance criterion, subject to the constraint posed by the 4D-Var optimization. 
Tractable computational solutions to this "optimization-constrained" optimization problem are provided. The results of this work can be directly applied to the deployment of intelligent sensors and adaptive observations, as well as to reducing the operating costs of measuring networks, while preserving their ability to capture the essential features of the system under consideration. / Ph. D. data assimilation dynamic data-driven problem second-order adjoints adaptive observations sensor placement intelligent sensors sensitivity analysis uncertainty quantification nonlinear optimization inverse problems parameter estimation matrix-free linear solvers truncated singular value decomposition
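For reference, the standard strong-constraint 4D-Var cost function that combines the three information sources listed in this abstract (prior state, model dynamics, observations, each with error statistics) is sketched below, followed by one way to write the "optimization-constrained" sensor problem. The notation is the conventional one and is an assumption here; in particular the design parameter $p$ and the performance criterion $\Phi$ are hypothetical symbols, and the dissertation's own formulation may differ.

```latex
\begin{aligned}
J(x_0) &= \tfrac{1}{2}\,(x_0 - x_b)^{T} B^{-1} (x_0 - x_b)
        + \tfrac{1}{2} \sum_{k=0}^{N}
          \bigl(H_k(x_k) - y_k\bigr)^{T} R_k^{-1} \bigl(H_k(x_k) - y_k\bigr),
\qquad x_k = M_{0 \to k}(x_0), \\[4pt]
\min_{p}\; &\Phi\bigl(x_a(p)\bigr)
\quad \text{subject to} \quad
x_a(p) = \arg\min_{x_0} J(x_0;\, p).
\end{aligned}
```

Here $x_b$ is the background (a priori) state with error covariance $B$, $y_k$ are the observations at time $k$ with error covariance $R_k$, $H_k$ is the observation operator, and $M_{0 \to k}$ is the model propagating the initial state to time $k$.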
8	A parallel iterative solver for large sparse linear systems enhanced with randomization and GPU accelerator, and its resilience to soft errors Jamal, Aygul 28 September 2017 (has links) In this PhD thesis, we address three challenges faced by linear algebra solvers in the perspective of future exascale systems: accelerating convergence using innovative techniques at the algorithm level, taking advantage of GPU (Graphics Processing Unit) accelerators to enhance the performance of computations on hybrid CPU/GPU systems, and evaluating the impact of errors in the context of an increasing level of parallelism in supercomputers. We study methods that accelerate the convergence and reduce the execution time of iterative solvers for large sparse linear systems. The solver specifically considered in this work is the parallel Algebraic Recursive Multilevel Solver (pARMS), a distributed-memory parallel solver based on Krylov subspace methods. First, we integrate a randomization technique referred to as Random Butterfly Transformations (RBT), which has been successfully applied to remove the cost of pivoting in the solution of dense linear systems. Our objective is to apply this method in the ARMS preconditioner to solve more efficiently the last Schur complement system in the application of the recursive multilevel process in pARMS. Experimental results on some matrices from the Davis collection show an improvement in convergence and accuracy compared with existing implementations. Due to memory concerns for some test problems, we also propose to use a sparse variant of RBT followed by a sparse direct solver (SuperLU), resulting in an improvement of the execution time. Then, we explain how a non-intrusive approach can be applied to implement GPU computing in the pARMS solver, in particular for the local preconditioning phase, which represents a significant part of the time to compute the solution. We compare the CPU-only and hybrid CPU/GPU variants of the solver on several test problems coming from physical applications. The performance results of the hybrid CPU/GPU solver using the ARMS preconditioning combined with RBT, or the ILU(0) preconditioning, show a performance gain of up to 30% on the test problems considered in our experiments. Finally, we study the effect of soft fault errors on the convergence of the commonly used flexible GMRES (FGMRES) algorithm, which is also used to solve the preconditioned system in pARMS. The test problem in our experiments is an elliptic PDE problem on a regular grid. We consider two types of preconditioners: an incomplete LU factorization with dual threshold (ILUT), and the ARMS preconditioner combined with RBT randomization. We consider two soft fault error modeling approaches where we perturb the matrix-vector multiplication and the application of the preconditioner, and we compare their potential impact on the convergence of the solver. High performance computing Parallel iterative linear solvers Randomized algorithms GPU computing Flexible GMRES Soft fault models PARMS solver Preconditioning Fault tolerance
