Global ETD Search

231	Optimisations des solveurs linéaires creux hybrides basés sur une approche par complément de Schur et décomposition de domaine / Optimizations of hybrid sparse linear solvers relying on Schur complement and domain decomposition approaches Casadei, Astrid 19 October 2015 Dans cette thèse, nous nous intéressons à la résolution parallèle de grands systèmes linéaires creux. Nous nous focalisons plus particulièrement sur les solveurs linéaires creux hybrides directs itératifs tels que HIPS, MaPHyS, PDSLIN ou ShyLU, qui sont basés sur une décomposition de domaine et une approche « complément de Schur ». Bien que ces solveurs soient moins coûteux en temps et en mémoire que leurs homologues directs, ils ne sont néanmoins pas exempts de surcoûts. Dans une première partie, nous présentons les différentes méthodes de réduction de la consommation mémoire déjà existantes et en proposons une nouvelle qui n’impacte pas la robustesse numérique du préconditionneur construit. Cette technique se base sur une atténuation du pic mémoire par un ordonnancement spécifique des tâches de calcul, d’allocation et de désallocation des blocs, notamment ceux se trouvant dans les parties « couplage » des domaines. Dans une seconde partie, nous nous intéressons à la question de l’équilibrage de la charge que pose la décomposition de domaine pour le calcul parallèle. Ce problème revient à partitionner le graphe d’adjacence de la matrice en autant de parties que de domaines désirés. Nous mettons en évidence le fait que pour avoir un équilibrage correct des temps de calcul lors des phases les plus coûteuses d’un solveur hybride tel que MaPHyS, il faut à la fois équilibrer les domaines en termes de nombre de noeuds et de taille d’interface locale. Jusqu’à aujourd’hui, les partitionneurs de graphes tels que Scotch et MeTiS ne s’intéressaient toutefois qu’au premier critère (la taille des domaines) dans le contexte de la renumérotation des matrices creuses.
Nous proposons plusieurs variantes des algorithmes existants afin de prendre également en compte l’équilibrage des interfaces locales. Toutes nos modifications sont implémentées dans le partitionneur Scotch, et nous présentons des résultats sur de grands cas de tests industriels. / In this thesis, we focus on the parallel solution of large sparse linear systems. Our main interest is in direct-iterative hybrid solvers such as HIPS, MaPHyS, PDSLIN or ShyLU, which rely on domain decomposition and Schur complement approaches. Although these solvers are less costly in time and memory than direct methods, they still suffer from serious overheads. In the first part, we present the existing techniques for reducing memory consumption and propose a new method that does not affect the numerical robustness of the preconditioner. This technique lowers the memory peak through a special scheduling of computation, allocation and deallocation tasks, in particular in the Schur coupling blocks of the matrix. In the second part, we focus on the load balancing of the domain decomposition in a parallel context. This problem consists in partitioning the adjacency graph of the matrix into as many parts as there are desired domains. We point out that good load balancing for the most expensive steps of a hybrid solver such as MaPHyS relies on balancing both the interior nodes and the interface nodes of the domains. Until now, however, graph partitioners such as MeTiS or Scotch optimized only the first criterion (i.e., the balancing of interior nodes) in the context of sparse matrix ordering. We propose several variants of the existing algorithms that balance interface nodes and interior nodes simultaneously. All our changes are implemented in the Scotch partitioner. We present results on a large collection of matrices from real industrial cases.
Calcul haute performance Algèbre linéaire creuse Solveur parallèle Méthode hybride directe-itérative Décomposition de domaine Complément de Schur Réduction du pic mémoire Équilibrage de la charge Partitionnement de graphe Bipartitionnement récursif High-performance computing Sparse linear algebra Parallel solver Direct-iterative hybrid method Domain decomposition Schur complement Memory peak reduction Load balancing Graph partitioning Recursive bipartitioning
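The two balancing criteria named in this abstract (domain size and local interface size) can be made concrete with a short sketch. It assumes a vertex-based partition in which a node counts as an interface node of its domain when it has at least one neighbour in another domain. That definition and the name `partition_stats` are illustrative assumptions, not taken from the thesis or from the Scotch API.

```python
from collections import defaultdict


def partition_stats(adjacency, part):
    """Return {domain: (interior_count, interface_count)}.

    adjacency: dict mapping each node to its neighbours (undirected graph).
    part: dict mapping each node to its domain id.
    A node is counted as "interface" when any neighbour lies in another domain
    (an assumed definition, chosen for illustration).
    """
    interior = defaultdict(int)
    interface = defaultdict(int)
    for node, neighbours in adjacency.items():
        dom = part[node]
        if any(part[nb] != dom for nb in neighbours):
            interface[dom] += 1
        else:
            interior[dom] += 1
    return {d: (interior[d], interface[d]) for d in set(part.values())}


# Path graph 0-1-2-...-8 cut into three domains of three nodes each.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 9] for i in range(9)}
part = {i: i // 3 for i in range(9)}

for dom, (n_int, n_itf) in sorted(partition_stats(path, part).items()):
    print(f"domain {dom}: interior={n_int} interface={n_itf}")
```

All three domains hold the same number of nodes, yet the middle domain has twice as many interface nodes as the outer ones, which is the kind of imbalance that a size-only criterion cannot see.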
232	Wing in Ground Effect Mondal, Partha January 2013 The thesis presents a two-pronged approach for predicting the aerodynamics of airfoils/wings in the vicinity of the ground. The first approach is effectively a model for ground effect studies, employing an inexpensive Discrete Vortex Method for the 2D predictions and the well-known Numerical Lifting Line theory for the 3D predictions. The second pertains to dynamic ground effect analysis, which employs state-of-the-art, time-accurate CFD based on a moving mesh methodology. In that sense, the thesis deals with two ends of the spectrum in ground effect analysis: one, a model to be used in the concept design phase, and the other, an advanced CFD tool for analysis. The proposed model for ground effect studies is based on the well-known Discrete Vortex Method (DVM). An important aspect of this method is that it employs what is referred to as the Generalized Kutta-Joukowski (GKJ) Theorem, meant for interaction problems with multiple vortices, for predicting the lift (and drag) within a potential flow framework. After ascertaining the correctness of using the GKJ theorem for lift prediction for airfoils in ground effect, a modified DVM is presented as a model for ground effect predictions. In this model, given the free-stream lift and drag (either from an experiment or from a RANS computation), the aerodynamics of the section in ground effect can be predicted. The model is effectively built by constraining the DVM to produce the reference lift/drag in the free stream. The accuracy of the model, particularly for the more relevant high-lift sections used during take-off and landing, is systematically established for a number of test cases. Knowing the sectional ground effect, the extension to 3D analysis is very simple and is achieved through the well-known Numerical Lifting Line theory. The efficacy of the proposed method for 3D applications is demonstrated using a high-lift wing in ground effect.
It is worth noting that the proposed model predicts the lift and drag very accurately, at practically no computational cost compared with modern RANS-based CFD tools, which require over 40 or 50 million volumes, a high computational cost, and intense human intervention to generate the grids for every ground clearance. The other aspect of the thesis pertains to what is referred to as the Dynamic Ground Effect. Normally, CFD computations mimic the ground effect experiments in simulating the ground effect. These simulations do not maintain geometric similarity with the actual landing or take-off sequence of the aircraft; this can only be achieved when the simulations are dynamic. Dynamics is also important in the case of combat aircraft (particularly their naval versions) with aggressive landing and take-off. The dynamic ground effect simulations also provide a framework for simulating varied gust conditions. This dynamic simulation of the ground effect is accomplished using a novel sinking grid methodology, which allows the grids to sink into the ground as the aircraft approaches the ground along the glide path. These simulations make use of state-of-the-art, time-accurate moving grid methods and can therefore be computationally expensive. Nevertheless, the utility of such computations in terms of their ability to produce continuous data is highlighted in the thesis. In that sense, these dynamic computations will be cheaper than static simulations producing data at the same level of resolution. Ground Effect Ground-Cushion Phenomenon Airfoils Wings Kutta-Joukowski Theorem Discrete Vortex Method 3D Ground Effect Model Computational Fluid Dynamics Aerodynamics Dynamic Ground Effect Analysis Sinking Grid Methodology Flow Solver Inverted Ground Effect Wing Dynamic Ground Approach Ground Effect Studies Aerospace Engineering
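How an image vortex and the generalized Kutta-Joukowski idea enter a ground effect calculation can be illustrated with a far cruder model than the modified DVM of the thesis. The sketch below is a textbook-style lumped-vortex model under stated assumptions: a flat plate, one bound vortex at the quarter chord, one collocation point at the three-quarter chord, and a mirror-image vortex of opposite sign below the ground plane. The function name and its parameters are hypothetical, and the free-stream calibration step described in the abstract is not reproduced.

```python
import math


def lumped_vortex_ground(chord, alpha, height, v_inf=1.0, rho=1.0):
    """Circulation and lift per unit span of a flat plate near the ground.

    alpha is in radians; height is the quarter-chord height above the ground.
    Positive circulation is clockwise, so positive lift points upward.
    """
    sa, ca = math.sin(alpha), math.cos(alpha)
    # Collocation point relative to the bound vortex (plate pitched nose-up).
    dx, dz = 0.5 * chord * ca, -0.5 * chord * sa
    # Normal velocity at the collocation point per unit bound circulation.
    a_bound = (dz * sa - dx * ca) / (2.0 * math.pi * (dx * dx + dz * dz))
    # The image vortex (strength -gamma) sits 2*height below the bound vortex.
    dzi = dz + 2.0 * height
    a_image = -(dzi * sa - dx * ca) / (2.0 * math.pi * (dx * dx + dzi * dzi))
    # No flow through the plate: v_inf*sin(alpha) + gamma*(a_bound + a_image) = 0.
    gamma = -v_inf * sa / (a_bound + a_image)
    # Force on the bound vortex from the velocity it actually sees:
    # free stream plus the horizontal velocity induced by the image vortex.
    u_local = v_inf - gamma / (4.0 * math.pi * height)
    lift = rho * u_local * gamma
    return gamma, lift


alpha = math.radians(5.0)
gamma_far, lift_far = lumped_vortex_ground(1.0, alpha, height=1.0e9)
gamma_near, lift_near = lumped_vortex_ground(1.0, alpha, height=0.5)
print(f"lift ratio (h/c = 0.5 vs. free stream): {lift_near / lift_far:.3f}")
```

Far from the ground the image contribution vanishes and the circulation reduces to the classical flat-plate value pi * chord * v_inf * sin(alpha); near the ground this crude model predicts a higher lift for the chosen case.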
233	A parallel iterative solver for large sparse linear systems enhanced with randomization and GPU accelerator, and its resilience to soft errors / Un solveur parallèle itératif pour les grands systèmes linéaires creux, amélioré par la randomisation et l'utilisation des accélérateurs GPU, et sa résilience aux fautes logicielles Jamal, Aygul 28 September 2017 Dans cette thèse de doctorat, nous abordons trois défis auxquels sont confrontés les solveurs d'algèbres linéaires dans la perspective des futurs systèmes exascale: accélérer la convergence en utilisant des techniques innovantes au niveau algorithmique, en profitant des accélérateurs GPU (Graphics Processing Units) pour améliorer le calcul sur plusieurs systèmes, en évaluant l'impact des erreurs dues à l'augmentation du parallélisme dans les superordinateurs. Nous nous intéressons à l'étude des méthodes permettant d'accélérer la convergence et le temps d'exécution des solveurs itératifs pour les grands systèmes linéaires creux. Le solveur plus spécifiquement considéré dans ce travail est le “parallel Algebraic Recursive Multilevel Solver (pARMS)” qui est un solveur parallèle sur mémoire distribuée basé sur les méthodes de sous-espace de Krylov. Tout d'abord, nous proposons d'intégrer une technique de randomisation appelée “Random Butterfly Transformations (RBT)” qui a été proposée avec succès pour éliminer le coût du pivotage dans la résolution des systèmes linéaires denses. Notre objectif est d'appliquer cette technique dans le préconditionneur ARMS de pARMS pour résoudre plus efficacement le dernier système Complément de Schur dans l'application du processus à multi-niveaux récursif. En raison de l'importance considérable du dernier Complément de Schur pour certains problèmes de test, nous proposons également d'utiliser une variante creuse de RBT suivie d'un solveur direct creux (SuperLU).
Les résultats expérimentaux sur certaines matrices de la collection de Davis montrent une amélioration de la convergence et de la précision par rapport aux implémentations existantes. Ensuite, nous illustrons comment une approche non intrusive peut être appliquée pour implémenter des calculs GPU dans le solveur pARMS, plus particulièrement pour la phase de préconditionnement locale qui représente une partie importante du temps pour la résolution. Nous comparons les solveurs purement CPU avec les solveurs hybrides CPU / GPU sur plusieurs problèmes de test issus d'applications physiques. Les résultats de performance du solveur hybride CPU / GPU utilisant le préconditionnement ARMS combiné avec RBT, ou le préconditionnement ILU(0), montrent un gain de performance jusqu'à 30% sur les problèmes de test considérés dans nos expériences. Enfin, nous étudions l'effet des défaillances logicielles variables sur la convergence de la méthode itérative flexible GMRES (FGMRES) qui est couramment utilisée pour résoudre le système préconditionné dans pARMS. Le problème ciblé dans nos expériences est un problème elliptique PDE sur une grille régulière. Nous considérons deux types de préconditionneurs: une factorisation LU incomplète à double seuil (ILUT) et le préconditionneur ARMS combiné avec randomisation RBT. Nous considérons deux modèles de fautes logicielles différents où nous perturbons la multiplication du vecteur matriciel et la phase de préconditionnement, et nous comparons leur impact potentiel sur la convergence. / In this PhD thesis, we address three challenges faced by linear algebra solvers in the perspective of future exascale systems: accelerating convergence using innovative techniques at the algorithm level, taking advantage of GPU (Graphics Processing Units) accelerators to enhance the performance of computations on hybrid CPU/GPU systems, and evaluating the impact of errors in the context of an increasing level of parallelism in supercomputers.
We are interested in studying methods that enable us to accelerate convergence and execution time of iterative solvers for large sparse linear systems. The solver specifically considered in this work is the parallel Algebraic Recursive Multilevel Solver (pARMS), which is a distributed-memory parallel solver based on Krylov subspace methods. First, we integrate a randomization technique referred to as Random Butterfly Transformations (RBT) that has been successfully applied to remove the cost of pivoting in the solution of dense linear systems. Our objective is to apply this method in the ARMS preconditioner to solve more efficiently the last Schur complement system in the application of the recursive multilevel process in pARMS. The experimental results show an improvement of the convergence and the accuracy. Due to memory concerns for some test problems, we also propose to use a sparse variant of RBT followed by a sparse direct solver (SuperLU), resulting in an improvement of the execution time. Then we explain how a non-intrusive approach can be applied to implement GPU computing in the pARMS solver, in particular for the local preconditioning phase, which represents a significant part of the time to compute the solution. We compare the CPU-only and hybrid CPU/GPU variants of the solver on several test problems coming from physical applications. The performance results of the hybrid CPU/GPU solver using the ARMS preconditioning combined with RBT, or the ILU(0) preconditioning, show a performance gain of up to 30% on the test problems considered in our experiments. Finally, we study the effect of soft fault errors on the convergence of the commonly used flexible GMRES (FGMRES) algorithm, which is also used to solve the preconditioned system in pARMS. The test problem in our experiments is an elliptic PDE problem on a regular grid.
We consider two types of preconditioners: an incomplete LU factorization with dual threshold (ILUT), and the ARMS preconditioner combined with RBT randomization. We consider two soft fault error modeling approaches where we perturb the matrix-vector multiplication and the application of the preconditioner, and we compare their potential impact on the convergence of the solver. Calcul haute performance Algorithmes randomisés Calculs sur GPU GMRES flexible Modèles de fautes logicielles Solveur pARMS Préconditionnement Tolérance aux fautes High performance computing Parallel iterative linear solvers Randomized algorithms GPU computing Flexible GMRES Soft fault models PARMS solver Preconditioning Fault tolerance
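The idea behind Random Butterfly Transformations, as summarized in this abstract, is to randomize a system so that Gaussian elimination can proceed without pivoting. The sketch below shows this on a small dense matrix in plain Python. It is only an illustration under stated assumptions: depth-1 butterflies, diagonal entries drawn as exp(r/10) with r uniform in [-0.5, 0.5], and hypothetical helper names. It is not the pARMS implementation and does not cover the sparse variant or the Schur complement setting.

```python
import math
import random


def butterfly(n, rng):
    """Depth-1 random butterfly matrix of even size n, as a list of rows."""
    h = n // 2
    r0 = [math.exp(rng.uniform(-0.5, 0.5) / 10.0) for _ in range(h)]
    r1 = [math.exp(rng.uniform(-0.5, 0.5) / 10.0) for _ in range(h)]
    s = 1.0 / math.sqrt(2.0)
    b = [[0.0] * n for _ in range(n)]
    for i in range(h):
        b[i][i] = s * r0[i]            # top-left block: R0
        b[i][h + i] = s * r1[i]        # top-right block: R1
        b[h + i][i] = s * r0[i]        # bottom-left block: R0
        b[h + i][h + i] = -s * r1[i]   # bottom-right block: -R1
    return b


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def solve_no_pivot(a, b):
    """Gaussian elimination without pivoting; raises on an exactly zero pivot."""
    n = len(a)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
            b[i] -= f * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))) / a[i][i]
    return x


def rbt_solve(a, b, seed=0):
    """Solve A x = b through A_r = U^T A V, A_r y = U^T b, x = V y."""
    rng = random.Random(seed)
    n = len(a)
    u, v = butterfly(n, rng), butterfly(n, rng)
    ut = transpose(u)
    a_r = matmul(matmul(ut, a), v)
    rhs = [sum(ut[i][j] * b[j] for j in range(n)) for i in range(n)]
    y = solve_no_pivot(a_r, rhs)
    return [sum(v[i][j] * y[j] for j in range(n)) for i in range(n)]


# A nonsingular matrix with a zero diagonal: plain no-pivot elimination fails on it.
A = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
b = [20, 12, 8, 10]  # equals A times [1, 2, 3, 4]
print([round(xi, 6) for xi in rbt_solve(A, b)])
```

The example matrix has a zero in the first pivot position, so `solve_no_pivot(A, b)` alone raises `ZeroDivisionError`, while the transformed system can be eliminated in the natural order.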
234	One-Dimensional Velocity Distributions of Fast Ions under RF Heating Including Doppler Shift in Tokamaks Bähner, Lukas January 2022 The goal of nuclear fusion research is to create a clean and virtually limitless energy source. To achieve that, a plasma must be heated to hundreds of millions of degrees Celsius. A commonly used heating mechanism is ion cyclotron resonance heating (ICRH), where antennas emit radio waves into the plasma. The wave can resonate with the ions at their cyclotron frequency, which leads to wave absorption. In order to investigate and improve the heating, one can perform computer simulations. FEMIC is a finite element model for ICRH that calculates the wave field created by the antennas. However, this code does not take into account how the wave modifies the velocity distribution of the plasma. Therefore, a time-independent Fokker-Planck solver is implemented that computes the fast ion distribution due to the incident wave field calculated with FEMIC. The novelty of this code is that it includes the Doppler shift, which influences where ions resonate and how they are heated. / Målet med fusionsforskningen är att skapa en ren energikälla som kan producera obegränsade mängder energi. För detta krävs att ett plasma värms till hundratals miljoner grader Celsius. En vanlig teknik för att värma plasmat är joncyklotronuppvärmning, där en antenn emitterar radiovågor som propagerar in i plasmat. Om vågen är i resonans med jonernas cyklotronrörelse leder detta till att vågen absorberas av jonerna. För att studera och utveckla denna uppvärmningsteknik kan man använda datorsimuleringar. FEMIC är en kod baserad på den finita elementmetoden som beräknar vågfälten som skapas av antennen. Med denna kod kan vi dock inte beräkna hur vågen påverkar jonernas fördelningsfunktioner. Därför har en Fokker-Planck-lösare implementerats som kan beräkna fördelningen av snabba joner som accelererats av vågfältet från FEMIC.
Det nya i denna modell är att koden tar hänsyn till Dopplerskiftet, vilket påverkar var jonerna är i resonans med vågen och hur de värms upp. plasma fusion ICRH Ion Cyclotron Resonance Heating RF radio frequency Fokker-Planck equation Fokker-Planck solver quasilinear operator Coulomb collision Doppler shift velocity distribution local temperature plasma fusion ICRH joncyklotronuppvärmning RF radiofrekvens Fokker-Planck-ekvationen Fokker-Planck-lösare kvasilinjär operator Coulombkollisioner Dopplerskift hastighetsfördelning lokal temperatur Elektroteknik och elektronik
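Why the Doppler shift changes where ions resonate can be seen from the standard resonance condition omega - k_par * v_par = n * Omega_c, where Omega_c = qB/m is the cyclotron frequency. The sketch below evaluates both quantities. The function names, the choice of a proton in a 1 T field, and the value of the parallel wavenumber are illustrative assumptions and do not come from the thesis.

```python
import math

E_CHARGE = 1.602176634e-19        # elementary charge in coulombs
PROTON_MASS = 1.67262192369e-27   # proton mass in kilograms


def cyclotron_frequency_hz(charge_number, mass_kg, b_tesla):
    """Cyclotron frequency f_c = Z e B / (2 pi m) in hertz."""
    return charge_number * E_CHARGE * b_tesla / (2.0 * math.pi * mass_kg)


def resonant_parallel_velocity(wave_freq_hz, cyclo_freq_hz, k_parallel, harmonic=1):
    """Parallel velocity v_par that satisfies omega - k_par * v_par = n * Omega_c."""
    omega = 2.0 * math.pi * wave_freq_hz
    omega_c = 2.0 * math.pi * cyclo_freq_hz
    return (omega - harmonic * omega_c) / k_parallel


f_c = cyclotron_frequency_hz(1, PROTON_MASS, 1.0)    # assumed: proton, B = 1 T
v_res = resonant_parallel_velocity(1.01 * f_c, f_c, k_parallel=10.0)  # assumed k_par in 1/m
print(f"f_c = {f_c / 1e6:.2f} MHz, resonant v_par = {v_res:.3e} m/s")
```

With a wave frequency slightly above the local cyclotron frequency, only ions with a nonzero parallel velocity satisfy the condition, so the resonance is spread in space and velocity rather than confined to the layer where omega equals n * Omega_c.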
235	A Physically Based Pipeline for Real-Time Simulation and Rendering of Realistic Fire and Smoke / En fysiskt baserad rörledning för realtidssimulering och rendering av realistisk eld och rök He, Yiyang January 2018 With the rapidly growing computational power of modern computers, physically based rendering has found its way into real-world applications. Real-time simulation and rendering of fire and smoke has become a major research interest in the modern video game industry and will remain an important research direction in computer graphics. Visually recreating realistic dynamic fire and smoke is a complicated problem. Furthermore, solving the problem requires knowledge from various areas, ranging from computer graphics and image processing to computational physics and chemistry. Even though most of these areas are well studied separately, new challenges emerge when they are combined. This thesis focuses on three aspects of the problem (dynamics, real-time performance and realism) to propose a solution in the form of a GPGPU pipeline, along with its implementation. Three main areas with application to the problem are discussed in detail: fluid simulation, volumetric radiance estimation and volumetric rendering. The emphasis is on the first two areas. The results are evaluated with respect to the three aspects, with graphical demonstrations and performance measurements. Uniform grids are used with a Finite Difference (FD) discretization scheme to simplify the computation. FD schemes are easy to implement in parallel, especially with ComputeShader, which is well supported in the Unity engine. The whole implementation can easily be integrated into any real-world application in Unity or other game engines that support DirectX 11 or higher.
Visualization Physically Based Pipeline Real-Time Dynamic Simulation Combustion Fire Explosion Smoke Thick Smoke Computer Graphics PDE Numerical Methods Grid Based Navier-Stokes Equation Incompressible Fluid Radiative Transfer Equation RTE Volumetric Illumination Volumetric Rendering Volumetric Global Illumination Distant Light Source Volumetric Global Shadow Local Shadow Approximation Black-body Radiation Poisson Solver Spectroscopy Fire Color Color Reproduction Tone Mapping Color Temperature Color Matching Function Tristimulus CIE Standard Observer Color Space XYZ RGB GPU GPGPU ComputeShader DirectX 12 Unity C# Computer Sciences Datavetenskap (datalogi)
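The remark that finite difference schemes on uniform grids parallelize easily can be illustrated with a Jacobi iteration for a Poisson equation, the kind of solver listed among the keywords of this record. The sketch below is a plain Python stand-in written under assumptions: a 2D grid, the 5-point stencil, fixed boundary values and a hypothetical function name. The thesis implementation targets ComputeShader in Unity and is not reproduced here.

```python
def jacobi_poisson(p, rhs, h, iterations):
    """Jacobi sweeps for the 5-point discretization of laplace(p) = rhs.

    p: square grid as a list of rows; boundary entries are held fixed.
    rhs: grid of the same shape; h: uniform grid spacing.
    """
    n = len(p)
    for _ in range(iterations):
        new = [row[:] for row in p]
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Each cell reads only the previous iterate, so all cells
                # can be updated independently of one another.
                new[i][j] = 0.25 * (p[i - 1][j] + p[i + 1][j]
                                    + p[i][j - 1] + p[i][j + 1]
                                    - h * h * rhs[i][j])
        p = new
    return p


# Laplace problem whose exact solution is x^2 - y^2 (boundary values prescribed).
n = 9
h = 1.0 / (n - 1)
exact = [[(i * h) ** 2 - (j * h) ** 2 for j in range(n)] for i in range(n)]
start = [[exact[i][j] if i in (0, n - 1) or j in (0, n - 1) else 0.0
          for j in range(n)] for i in range(n)]
zero_rhs = [[0.0] * n for _ in range(n)]
solution = jacobi_poisson(start, zero_rhs, h, iterations=500)
print(max(abs(solution[i][j] - exact[i][j]) for i in range(n) for j in range(n)))
```

Because every interior cell depends only on the previous iterate, the two inner loops map naturally onto one GPU thread per cell, which is the property that makes such schemes attractive for compute shaders.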
236	Nonlinear Dynamic Modeling, Simulation And Characterization Of The Mesoscale Neuron-electrode Interface Thakore, Vaibhav 01 January 2012 Extracellular neuroelectronic interfacing has important applications in the fields of neural prosthetics, biological computation and whole-cell biosensing for drug screening and toxin detection. While the field of neuroelectronic interfacing holds great promise, the recording of high-fidelity signals from extracellular devices has long suffered from low signal-to-noise ratios and changes in signal shapes due to the presence of a highly dispersive dielectric medium in the neuron-microelectrode cleft. This has made it difficult to correlate the extracellularly recorded signals with the intracellular signals recorded using conventional patch-clamp electrophysiology. To improve the signal-to-noise ratio of the signals recorded on the extracellular microelectrodes and to explore strategies for engineering the neuron-electrode interface, there is a need to model, simulate and characterize the cell-sensor interface to better understand the mechanism of signal transduction across the interface. Efforts to date for modeling the neuron-electrode interface have primarily focused on the use of point or area contact linear equivalent circuit models for a description of the interface, with an assumption of passive linearity for the dynamics of the interfacial medium in the cell-electrode cleft. In this dissertation, results are presented from a nonlinear dynamic characterization of the neuroelectronic junction based on Volterra-Wiener modeling, which showed that the process of signal transduction at the interface may have nonlinear contributions from the interfacial medium.
An optimization-based study of linear equivalent circuit models for representing signals recorded at the neuron-electrode interface subsequently proved conclusively that the process of signal transduction across the interface is indeed nonlinear. Following this, a theoretical framework was developed for the extraction of the complex nonlinear material parameters of the interfacial medium, such as the dielectric permittivity, conductivity and diffusivity tensors, based on dynamic nonlinear Volterra-Wiener modeling. Within this framework, the use of Gaussian bandlimited white noise for nonlinear impedance spectroscopy was shown to offer considerable advantages over the use of sinusoidal inputs for nonlinear harmonic analysis currently employed in impedance characterization of nonlinear electrochemical systems. Signal transduction at the neuron-microelectrode interface is mediated by the interfacial medium confined to a thin cleft with thickness on the scale of 20-110 nm, giving rise to Knudsen numbers (ratio of mean free path to characteristic system length) between 0.015 and 0.003 for ionic electrodiffusion. At these Knudsen numbers, the continuum assumptions made in the use of the Poisson-Nernst-Planck system of equations for modeling ionic electrodiffusion are not valid. Therefore, a lattice Boltzmann method (LBM) based multiphysics solver suitable for modeling ionic electrodiffusion at the mesoscale neuron-microelectrode interface was developed. Additionally, a molecular speed dependent relaxation time was proposed for use in the lattice Boltzmann equation. Such a relaxation time holds promise for enhancing the numerical stability of lattice Boltzmann algorithms, as it helped recover a physically correct description of microscopic phenomena related to particle collisions governed by their local density on the lattice.
Next, using this multiphysics solver, simulations were carried out for the charge relaxation dynamics of an electrolytic nanocapacitor with the intention of ultimately employing it for a simulation of the capacitive coupling between the neuron and the planar microelectrode on a microelectrode array (MEA). Simulations of the charge relaxation dynamics for a step potential applied at t = 0 to the capacitor electrodes were carried out for varying conditions of electric double layer (EDL) overlap, solvent viscosity, electrode spacing and ratio of cation to anion diffusivity. For a large EDL overlap, an anomalous plasma-like collective behavior of oscillating ions at a frequency much lower than the plasma frequency of the electrolyte was observed, and as such it appears to be purely an effect of nanoscale confinement. Results from these simulations are then discussed in the context of the dynamics of the interfacial medium in the neuron-microelectrode cleft. In conclusion, a synergistic approach to engineering the neuron-microelectrode interface is outlined through the use of the nonlinear dynamic modeling, simulation and characterization tools developed as part of this dissertation research. Whole cell biosensors cell sensor interface neuron electrode interface neuroelectronic interfacing microelectrode arrays meas field effect transistor (fet) arrays limitations of equivalent circuit models volterra wiener modeling nonlinear dynamic modeling nonlinear impedance spectroscopy lattice boltzmann method lattice poisson boltzmann method multiphysics solver entrance flow problem surface chemical reaction electroosmotic flow ionic electrodiffusion primitive model of electrolyte charge relaxation dynamics electrolytic nanocapacitor effect of nanoscale confinement overlapping electric double layers electric double layer relaxation viscous drag force on ions in a solvent Physics
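The Knudsen numbers quoted in this abstract follow from the definition given there, Kn = mean free path / characteristic length. The sketch below reproduces the quoted range for the two cleft thicknesses. The mean free path of 0.3 nm is an assumption back-computed here from the quoted numbers; the abstract does not state it.

```python
def knudsen_number(mean_free_path_nm, length_nm):
    """Knudsen number: mean free path divided by characteristic length."""
    return mean_free_path_nm / length_nm


MEAN_FREE_PATH_NM = 0.3  # assumed value, inferred from the quoted Kn range

for thickness_nm in (20.0, 110.0):
    kn = knudsen_number(MEAN_FREE_PATH_NM, thickness_nm)
    print(f"cleft {thickness_nm:5.1f} nm -> Kn = {kn:.3f}")
```

Under that assumption, the 20 nm cleft gives the upper value of the range and the 110 nm cleft the lower one, after rounding to three decimals.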
