21

A Recommendation System for Preconditioned Iterative Solvers

George, Thomas, December 2009 (has links)
Solving linear systems of equations is an integral part of most scientific simulations. In recent years, there has been considerable interest in large-scale scientific simulation of complex physical processes. Iterative solvers are usually preferred for solving linear systems of such magnitude due to their lower computational requirements. Currently, computational scientists have access to a multitude of iterative solver options available as "plug-and-play" components in various problem solving environments. Choosing the right solver configuration from the available choices is critical for ensuring convergence and achieving good performance, especially for large complex matrices. However, identifying the "best" preconditioned iterative solver and parameters is challenging even for an expert due to issues such as the lack of a unified theoretical model, the complexity of the solver configuration space, and multiple selection criteria. Therefore, it is desirable to have principled practitioner-centric strategies for identifying solver configuration(s) for solving large linear systems. This dissertation presents a general practitioner-centric framework for (a) problem-independent retrospective analysis, and (b) problem-specific predictive modeling of performance data. Our retrospective performance analysis methodology introduces new metrics, such as the area under the performance-profile curve and a conditional variance-based fine-tuning score, that facilitate robust comparative performance evaluation as well as parameter sensitivity analysis. We present results using this analysis approach on a number of popular preconditioned iterative solvers available in packages such as PETSc, Trilinos, Hypre, ILUPACK, and WSMP. The predictive modeling of performance data is an integral part of our multi-stage approach for solver recommendation. The key novelty of our approach lies in its modular learning-based formulation, which comprises three subproblems: (a) solvability modeling, (b) performance modeling, and (c) performance optimization, and which provides the flexibility to effectively target challenges such as software failure and multiobjective optimization. Our choice of a "solver trial" instance space, represented in terms of the characteristics of the corresponding "linear system", "solver configuration", and their interactions, leads to a scalable and elegant formulation. Empirical evaluation of our approach on performance datasets associated with fairly large groups of solver configurations demonstrates that one can obtain high-quality recommendations that are close to the ideal choices.
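The retrospective analysis above builds on performance-profile ideas. As a rough illustrative sketch (not the dissertation's code: the metric definition follows the standard Dolan-Moré construction, and the timing data are invented), a profile and its normalized area can be computed like this:

```python
import numpy as np

def performance_profiles(times, taus):
    """times[p, s]: solve time of solver s on problem p (np.inf = failure).
    Returns rho[i, s]: fraction of problems that solver s solves within a
    factor taus[i] of the best solver on that problem (Dolan-More profile)."""
    best = times.min(axis=1, keepdims=True)   # per-problem best time
    ratios = times / best                     # performance ratios
    return np.array([(ratios <= t).mean(axis=0) for t in taus])

def area_under_profile(times, tau_max=10.0, n=200):
    """Trapezoidal area under each solver's profile over [1, tau_max],
    normalized to [0, 1]; larger suggests more robust overall performance."""
    taus = np.linspace(1.0, tau_max, n)
    rho = performance_profiles(times, taus)
    dt = taus[1] - taus[0]
    area = dt * (0.5 * (rho[0] + rho[-1]) + rho[1:-1].sum(axis=0))
    return area / (tau_max - 1.0)

# Hypothetical timings: 4 problems x 3 solver configurations.
times = np.array([[1.0, 1.5, 3.0],
                  [2.0, 1.0, np.inf],   # third configuration failed here
                  [5.0, 6.0, 4.0],
                  [1.2, 1.1, 1.0]])
print(area_under_profile(times))
```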
22

Computation And Analysis Of Spectra Of Large Undirected Networks

Erdem, Ozge 01 June 2010 (has links) (PDF)
Many interacting complex systems in biology, physics, technology, and social science can be represented as large networks. These networks are mathematically represented by graphs, usually via the adjacency or the Laplacian matrix, and important features of their underlying structure and dynamics can be extracted from the analysis of the graph spectrum. Spectral analysis of the so-called normalized Laplacian of large networks has become popular in recent years. The Laplacian matrices of empirical networks are large, unstructured sparse matrices. The aim of this thesis is the comparison of different eigenvalue solvers for large sparse symmetric matrices which arise from the graph-theoretical representation of undirected networks. The spectrum of the normalized Laplacian lies in the interval [0, 2], and the multiplicity of the eigenvalue 1 plays a particularly important role in network analysis. Moreover, spectral analysis of protein-protein interaction networks has revealed that these networks have a different distribution type than model networks such as scale-free networks. In this respect, eigenvalue solvers implementing the well-known implicitly restarted Arnoldi, Lanczos, Krylov-Schur, and Jacobi-Davidson methods are investigated. They exist as MATLAB routines and are included in some freely available packages. The performances of the eigenvalue solvers PEIG, AHBEIGS, IRBLEIGS, EIGIFP, LANEIG, JDQR, and JDCG in MATLAB and of the library SLEPc in C++ were tested for matrices of size between 100 and 13,000 and are compared in terms of accuracy and computing time. The accuracy of the eigenvalue solvers is validated on Paley graphs with known eigenvalues and compared on large empirical networks using residual plots and spectral density plots.
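For illustration, here is a minimal sketch of the kind of computation the thesis benchmarks, written with SciPy's `eigsh` (a wrapper around ARPACK's implicitly restarted Lanczos iteration, one of the solver families compared) rather than the MATLAB routines named above; the graph is a random toy example, not one of the thesis's networks:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def normalized_laplacian(A):
    """Normalized Laplacian L = I - D^{-1/2} A D^{-1/2} of an undirected
    graph with sparse adjacency matrix A; its spectrum lies in [0, 2]."""
    deg = np.asarray(A.sum(axis=1)).ravel()
    deg[deg == 0] = 1.0                       # guard isolated vertices
    d = sp.diags(1.0 / np.sqrt(deg))
    return sp.identity(A.shape[0]) - d @ A @ d

# Random symmetric 0/1 adjacency matrix as a stand-in for a network.
A = sp.random(2000, 2000, density=2e-3, format="csr", random_state=0)
A = ((A + A.T) > 0).astype(float)
A.setdiag(0.0)                                # no self-loops
A.eliminate_zeros()
L = normalized_laplacian(A)
lo = eigsh(L, k=6, which="SA", return_eigenvectors=False)   # smallest
hi = eigsh(L, k=6, which="LA", return_eigenvectors=False)   # largest
print("smallest:", np.sort(lo))
print("largest: ", np.sort(hi))
```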
23

Multiresolution weighted norm equivalences and applications

Beuchler, Sven, Schneider, Reinhold, Schwab, Christoph 05 April 2006 (has links) (PDF)
We establish multiresolution norm equivalences in weighted spaces L^2_w((0,1)) with possibly singular weight functions w(x) ≥ 0 in (0,1). Our analysis exploits the locality of the biorthogonal wavelet basis and its dual basis functions. The discrete norms are sums of wavelet coefficients which are weighted with respect to the collocated weight function w(x) within each scale. Since norm equivalences for Sobolev norms are by now well known, our result can also be applied to weighted Sobolev norms. We apply our theory to the problem of preconditioning p-version FEM and wavelet discretizations of degenerate elliptic problems.
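A schematic LaTeX rendering of the type of equivalence described in the abstract, using generic biorthogonal-wavelet notation; the paper's precise indexing, constants, and conditions on w may differ:

```latex
% Schematic weighted norm equivalence: the weighted L^2 norm is comparable
% to a scale-wise sum of wavelet coefficients, each weighted by w evaluated
% at a point x_{l,k} collocated with the wavelet psi_{l,k}.
\[
  \|u\|_{L^2_w((0,1))}^2 \;\sim\; \sum_{\ell \ge \ell_0} \sum_{k}
  w(x_{\ell,k})\,\bigl|\langle u,\tilde{\psi}_{\ell,k}\rangle\bigr|^2 ,
  \qquad u \in L^2_w((0,1)),
\]
```

where the tilde marks the dual basis functions of the biorthogonal system.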
24

Development and application of a parallel compositional reservoir simulator

Ghasemi Doroh, Mojtaba 06 November 2012 (has links)
Simulation of large-scale and complex reservoirs requires fine and detailed gridding, which involves a significant amount of memory and is computationally expensive. Nowadays, clusters of PCs and high-performance computing (HPC) centers are widely available. These systems allow parallel processing, which helps large-scale simulations run faster and more efficiently. In this research project, we developed a parallel version of The University of Texas Compositional Simulator (UTCOMP). The parallel UTCOMP is capable of running on both shared- and distributed-memory parallel computers. This parallelization included all physical features of the original code, such as higher-order finite differences, physical dispersion, and asphaltene precipitation. The parallelization was verified for several case studies using multiple processors. The parallel simulator produces the outputs required for visualizing simulation results with the S3graph visualization software. The efficiency of the parallel simulator was assessed in terms of speedup using various numbers of processors. Subsequently, we improved the coding and implementation of the simulator to minimize communication between processors and thereby improve parallel efficiency. To improve the efficiency of the linear solver, we implemented three well-known high-performance parallel solver packages (SAMG, Hypre, and PETSc) in the parallel simulator. The performance of these solver packages was then tuned through their input parameters for solving large-scale reservoir simulation problems. The developed parallel simulator has expanded the capability of the original code for simulating large-scale reservoir case studies: with a sufficient number of processors, a field-scale simulation with a million grid cells can be performed in a few hours. Several case studies are presented to show the performance of the parallel simulator.
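The speedup and efficiency assessment mentioned above amounts to a simple computation; a sketch with invented timings (not results from the thesis):

```python
def speedup_and_efficiency(t_serial, t_parallel, n_procs):
    """Standard strong-scaling metrics for a parallel solver:
    speedup S = t1 / tp, parallel efficiency E = S / p."""
    s = t_serial / t_parallel
    return s, s / n_procs

# Hypothetical timings for one simulation case (illustrative only):
t1 = 3600.0                                  # serial run, seconds
for p, tp in [(4, 1000.0), (16, 310.0), (64, 105.0)]:
    s, e = speedup_and_efficiency(t1, tp, p)
    print(f"p={p:3d}  speedup={s:6.1f}  efficiency={e:5.2f}")
```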
25

Application of boundary element methods (BEM) to internal propulsion systems; application to water-jets and inducers

Valsaraj, Alokraj, August 2013 (has links)
A panel method derived from inviscid irrotational flow theory and utilizing hyperboloid panels is developed and applied to the simulation of steady, fully wetted flows inside water-jet pumps and rocket engine inducers. The source and dipole influence coefficients of the hyperboloid panels are computed using Gauss quadrature. The present method solves the boundary value problem subject to a uniform inflow directly by discretizing the blade, casing/shroud, and hub geometries with panels. The Green's integral equation and the influence coefficients for the water-jet/inducer problem are defined and solved by allocating constant-strength sources and dipoles on the blade, hub, and casing surfaces, and constant-strength dipoles on the wake sheets shed from the rotor/stator blades. The rotor-stator interaction is accomplished using an iterative procedure which accounts for the effects between the rotor and the stator via circumferentially averaged induced velocities. Finally, the hydrodynamic performance predictions for the water-jet pump and the inducer from the present method are validated against existing experimental data and numerical results from Reynolds-Averaged Navier-Stokes (RANS) solvers.
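As an illustration of evaluating an influence coefficient by Gauss quadrature, the sketch below integrates the Rankine source kernel 1/(4πr) over a single bilinear ("hyperboloid") panel; it is a minimal stand-in for the thesis's formulation, which also handles dipoles and the singular self-influence case:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def source_influence(panel, fp, n_gauss=4):
    """Potential induced at field point fp by a unit-strength constant
    source distribution on a bilinear panel: the integral of 1/(4*pi*r)
    over the panel, via tensor-product Gauss-Legendre quadrature.
    Valid only for fp off the panel; self-influence needs singularity
    treatment. panel: (4, 3) array of corners ordered around the quad."""
    nodes, weights = leggauss(n_gauss)
    acc = 0.0
    for u, wu in zip(nodes, weights):
        for v, wv in zip(nodes, weights):
            # bilinear shape functions on the reference square [-1, 1]^2
            N = 0.25 * np.array([(1-u)*(1-v), (1+u)*(1-v),
                                 (1+u)*(1+v), (1-u)*(1+v)])
            dNdu = 0.25 * np.array([-(1-v), (1-v), (1+v), -(1+v)])
            dNdv = 0.25 * np.array([-(1-u), -(1+u), (1+u), (1-u)])
            x = N @ panel                               # point on the panel
            # surface Jacobian from the two tangent vectors
            J = np.linalg.norm(np.cross(dNdu @ panel, dNdv @ panel))
            r = np.linalg.norm(fp - x)
            acc += wu * wv * J / (4.0 * np.pi * r)
    return acc

# Unit square panel, field point above its center.
panel = np.array([[0., 0., 0.], [1., 0., 0.], [1., 1., 0.], [0., 1., 0.]])
print(source_influence(panel, np.array([0.5, 0.5, 1.0])))
```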
26

Linear solvers and coupling methods for compositional reservoir simulators

Li, Wenjun, doctor of engineering 17 February 2011 (has links)
Three compositional reservoir simulators have been developed in the Department of Petroleum and Geosystems Engineering at The University of Texas at Austin (UT-Austin): UTCOMP (miscible gas flooding simulator), UTCHEM (chemical flooding simulator), and GPAS (General Purpose Adaptive Simulator). The UTCOMP and UTCHEM simulators have been used by various oil companies for solving a variety of field problems. The efficiency and accuracy of each simulator become critically important when they are used to solve field problems. In this study, two well-developed solver packages, SAMG and HYPRE, were compared along with existing solvers. Our numerical results showed that SAMG can be an excellent solver for use in all three simulators for problems with a high accuracy requirement and long simulation times, and that BoomerAMG in the HYPRE package can also be a good solver for application in the UTCHEM simulator. In order to investigate the flexibility and efficiency of a partitioned coupling method, the second part of this thesis presents a new implementation using a partition method for a thermal module in an equation-of-state (EOS) compositional simulator, the General Purpose Adaptive Simulator (GPAS) developed at The University of Texas at Austin. The finite difference method (FDM) was used for the solution of the governing partial differential equations. Specifically, the new coupled implementation was based on the Schur complement method. For the partition method, two suitable acceleration techniques were constructed: one was the optimized choice of preconditioner for the Schur complement; the other was the optimized selection of tolerances for the two solution steps. To validate the implementation, we present simulation examples of hot water injection in an oil reservoir. The numerical comparison between the new implementation and the traditional fully implicit method showed that the partition method is not only more flexible but also faster than the classical fully implicit method for the same test problems, without sacrificing accuracy. In conclusion, the new implementation of the partition method is a more flexible and more efficient way of coupling a new module into an existing simulator than the classical fully implicit method. The third part of this thesis presents another type of coupling method, iterative coupling, which has been implemented into GPAS with the thermal module in three variants: FICM (fully iterative coupling method), GICM (general iterative coupling method), and LICM (loose iterative coupling method). The results show that LICM is divergent, while GICM and FICM work normally. GICM is the fastest among the compared methods, and FICM has a similar efficiency to CFIM (classic fully implicit method). Although GICM is the fastest method, it is less accurate than FICM in the test cases carried out in this study.
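The Schur complement elimination underlying the partitioned coupling can be sketched on a generic 2x2 block system; this is a dense toy version (not GPAS code: a simulator would use sparse factorizations and a preconditioned iterative solve for S):

```python
import numpy as np

def schur_solve(A, B, C, D, f, g):
    """Solve [[A, B], [C, D]] [x; y] = [f; g] by eliminating x:
    S = D - C A^{-1} B (Schur complement of A), S y = g - C A^{-1} f."""
    Ainv_B = np.linalg.solve(A, B)
    Ainv_f = np.linalg.solve(A, f)
    S = D - C @ Ainv_B
    y = np.linalg.solve(S, g - C @ Ainv_f)
    x = Ainv_f - Ainv_B @ y
    return x, y

# Random diagonally dominant blocks as a stand-in for the coupled system.
rng = np.random.default_rng(1)
n, m = 6, 3
A = rng.standard_normal((n, n)) + n * np.eye(n)
B = rng.standard_normal((n, m))
C = rng.standard_normal((m, n))
D = rng.standard_normal((m, m)) + m * np.eye(m)
f, g = rng.standard_normal(n), rng.standard_normal(m)
x, y = schur_solve(A, B, C, D, f, g)
# Check against the monolithic ("fully implicit") solve of the same system.
full = np.block([[A, B], [C, D]])
assert np.allclose(full @ np.concatenate([x, y]), np.concatenate([f, g]))
```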
27

Programmbeschreibung SPC-PM3-AdH-XX - Teil 1 / Program description of SPC-PM3-AdH-XX - part 1

Meyer, Arnd 11 March 2014 (has links) (PDF)
Description of the finite element software family SPC-PM3-AdH-XX, for: (S)cientific (P)arallel (C)omputing - (P)rogram (M)odule (3)D (ad)aptive (H)exahedral elements. XX stands for the individual specialized variants, which are described in detail in part 2. Status: end of 2013.
28

Algorithm Design Using Spectral Graph Theory

Peng, Richard 01 August 2013 (has links)
Spectral graph theory is the interplay between linear algebra and combinatorial graph theory. Laplace's equation and its discrete form, the Laplacian matrix, appear ubiquitously in mathematical physics. Due to the recent discovery of very fast solvers for these equations, they are also becoming increasingly useful in combinatorial optimization, computer vision, computer graphics, and machine learning. In this thesis, we develop highly efficient and parallelizable algorithms for solving linear systems involving graph Laplacian matrices. These solvers can also be extended to symmetric diagonally dominant matrices and M-matrices, both of which are closely related to graph Laplacians. Our algorithms build upon two decades of progress on combinatorial preconditioning, which connects numerical and combinatorial algorithms through spectral graph theory. They in turn rely on tools from numerical analysis, metric embeddings, and random matrix theory. We give two solver algorithms that take diametrically opposite approaches. The first is motivated by combinatorial algorithms and aims to gradually break the problem into several smaller ones. It represents a major simplification over previous solver constructions, and has theoretical running time comparable to sorting. The second is motivated by numerical analysis and aims to rapidly improve the algebraic connectivity of the graph. It is the first highly efficient solver for Laplacian linear systems that parallelizes almost completely. Our results improve the performance of applications of fast linear system solvers, ranging from scientific computing to algorithmic graph theory. We also show that these solvers can be used to address broad classes of image processing tasks, and give some preliminary experimental results.
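As a baseline illustration of the problem these solvers target (the thesis's near-linear-time algorithms are far more sophisticated), one can solve a grid-Laplacian system with plain conjugate gradient after grounding one vertex:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

def grid_laplacian(k):
    """Graph Laplacian L = D - A of the k x k grid graph, a standard
    test matrix for Laplacian solvers (e.g. in image processing)."""
    ones = np.ones(k - 1)
    P = sp.diags([ones, ones], [-1, 1])       # path-graph adjacency
    I = sp.identity(k)
    A = sp.kron(P, I) + sp.kron(I, P)         # grid adjacency
    deg = np.asarray(A.sum(axis=1)).ravel()
    return (sp.diags(deg) - A).tocsr()

k = 64
L = grid_laplacian(k)
b = np.zeros(k * k)
b[0], b[-1] = 1.0, -1.0    # net-zero injection, as in electrical-flow problems
# L is singular (constant vectors span its nullspace); grounding one vertex
# leaves a symmetric positive definite system, so plain CG applies.
x, info = cg(L[1:, 1:], b[1:])
assert info == 0           # 0 means CG reported convergence
```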
29

Development And Validation Of Two-dimensional Depth-averaged Free Surface Flow Solver

Yilmaz, Burak 01 January 2003 (has links) (PDF)
A numerical solution algorithm based on the finite volume method is developed for the unsteady, two-dimensional, depth-averaged shallow water flow equations. The model is verified using test cases from the literature and free-surface data obtained from measurements in a laboratory flume. Experiments are carried out in a horizontal, rectangular channel with vertical solid boxes attached to the sidewalls to obtain a free-surface data set for flows where three-dimensionality is significant. The experimental data contain both subcritical and supercritical states. The shallow water equations are solved on a structured, rectangular grid system. A Godunov-type solution procedure evaluates the interface fluxes using an upwind method with an exact Riemann solver. The numerical solution reproduces analytical solutions for the test cases successfully. Comparison of the numerical results with the experimental two-dimensional free-surface data is used to illustrate the limitations of the shallow water equations and the improvements necessary for better simulation of such cases.
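A minimal sketch of a Godunov-type finite volume update for the 1-D shallow water equations; it substitutes an HLL approximate flux for the exact Riemann solver used in the thesis, to keep the example short:

```python
import numpy as np

g = 9.81  # gravitational acceleration

def physical_flux(h, hu):
    """Shallow water flux F(U) for U = [h, hu]."""
    u = hu / h
    return np.array([hu, hu * u + 0.5 * g * h * h])

def hll_flux(UL, UR):
    """HLL approximate Riemann flux between left/right states."""
    hL, huL = UL; hR, huR = UR
    uL, uR = huL / hL, huR / hR
    cL, cR = np.sqrt(g * hL), np.sqrt(g * hR)
    sL, sR = min(uL - cL, uR - cR), max(uL + cL, uR + cR)
    if sL >= 0.0:
        return physical_flux(hL, huL)
    if sR <= 0.0:
        return physical_flux(hR, huR)
    FL, FR = physical_flux(hL, huL), physical_flux(hR, huR)
    return (sR * FL - sL * FR + sL * sR * (UR - UL)) / (sR - sL)

# Dam-break test on [0, 1]: a classical verification case for such solvers.
n, t_end = 200, 0.05
dx = 1.0 / n
U = np.zeros((n, 2))
U[:, 0] = np.where(np.arange(n) * dx < 0.5, 2.0, 1.0)   # depth jump, u = 0
t = 0.0
while t < t_end:
    c = np.abs(U[:, 1] / U[:, 0]) + np.sqrt(g * U[:, 0])
    dt = min(0.9 * dx / c.max(), t_end - t)              # CFL condition
    Ue = np.vstack([U[:1], U, U[-1:]])                   # transmissive ghosts
    F = np.array([hll_flux(Ue[i], Ue[i + 1]) for i in range(n + 1)])
    U -= dt / dx * (F[1:] - F[:-1])                      # conservative update
    t += dt
```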
30

Mapping and scheduling on multi-core processors using SMT solvers / Allocation et ordonnancement sur des processeurs multi-coeur avec des solveurs SMT

Tendulkar, Pranav 13 October 2014 (has links)
In order to achieve performance gains, computers have evolved to multi-core and many-core platforms abounding with multiple processor cores. However, the problem of finding efficient ways to execute parallel software on them is hard. With a large number of processor cores available, the software must orchestrate the communication and synchronization along with the code execution.
Communication corresponds to the transport of data between different processors, handled transparently by the hardware or explicitly by the software. Models which represent algorithms in a structured and formal way expose the available parallelism. Deployment of the software algorithms represented by such models needs a specification of which processor each task executes on (mapping) and when it executes (scheduling). Mapping and scheduling is a hard combinatorial problem with an exponential number of solutions. In addition, the solutions have multiple costs that need to be optimized, such as memory consumption, execution time, and resources used. Such a problem with multiple costs is called a multi-criteria optimization problem. The solution to this problem is a set of incomparable solutions, called Pareto solutions, which need special algorithms to approximate them. We target a class of applications called streaming applications, which process a continuous stream of data. These applications, which apply similar computations to successive data items, can be conveniently expressed by a class of models called dataflow models. We encode the mapping and scheduling problem in the form of logical constraints and present it to satisfiability modulo theories (SMT) solvers. SMT solvers solve the encoded problem by using a combination of search techniques and constraint propagation to find an assignment to the problem variables satisfying the given cost constraints. In dataflow applications, the design space explodes with an increasing number of tasks and processors. In this thesis, we tackle this problem by introducing symmetry reduction techniques and demonstrate that symmetry breaking accelerates search in an SMT solver, increasing the size of the problem that can be solved. Our design-space exploration algorithm approximates the Pareto front of the problem and produces solutions with different cost trade-offs. Further, we extend the scheduling problem to many-core platforms, which are groups of multi-core platforms connected by a network-on-chip. We provide a design flow which performs mapping of applications onto such platforms and automatic insertion of additional elements to model the communication using bounded memory. We provide experimental results obtained on the 256-processor Kalray and the Tilera TILE-64 platforms. Multi-core processors typically have a small amount of memory close to the processor, generally insufficient for all application data to fit. We study a class of parallel applications having a regular data access pattern and a large amount of data to be processed by a uniform computation. The data must be brought from main memory to local memory, processed, and then the results written back to main memory, all in batches. Selecting the proper granularity of the data that is brought into local memory is an optimization problem. We formalize this problem and provide a way to determine the optimal transfer granularity depending on the characteristics of the application and the hardware platform. In addition to the scheduling problems and local memory management, we study part of the problem of runtime management of applications. Applications in modern embedded systems can start and stop dynamically. In order to execute all applications efficiently and to optimize global costs such as power consumption and execution time, the applications must be reconfigured dynamically at runtime.
We present a predictable and composable (executing independently without affecting others) way of migrating tasks according to the reconfiguration decision.
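A toy version of an SMT encoding of mapping and scheduling, written with the z3 Python bindings; the task set, dependencies, and the particular symmetry-breaking constraint are illustrative, not taken from the thesis:

```python
from z3 import Int, Optimize, Or, sat

# Toy instance: 5 tasks with durations, 2 identical processors.
durations = [4, 3, 2, 5, 1]
n_tasks, n_procs = len(durations), 2
deps = [(0, 2), (1, 2), (2, 4)]   # (a, b): task a must finish before b starts

proc = [Int(f"proc_{i}") for i in range(n_tasks)]     # mapping variables
start = [Int(f"start_{i}") for i in range(n_tasks)]   # scheduling variables
makespan = Int("makespan")

opt = Optimize()
for i in range(n_tasks):
    opt.add(0 <= proc[i], proc[i] < n_procs, start[i] >= 0)
    opt.add(start[i] + durations[i] <= makespan)
for a, b in deps:                                     # precedence constraints
    opt.add(start[a] + durations[a] <= start[b])
for i in range(n_tasks):                              # no overlap on a processor
    for j in range(i + 1, n_tasks):
        opt.add(Or(proc[i] != proc[j],
                   start[i] + durations[i] <= start[j],
                   start[j] + durations[j] <= start[i]))
# Symmetry breaking: identical processors, so pin one task's mapping.
opt.add(proc[0] == 0)

opt.minimize(makespan)
if opt.check() == sat:
    m = opt.model()
    print("makespan =", m[makespan])
    for i in range(n_tasks):
        print(f"task {i}: proc {m[proc[i]]}, start {m[start[i]]}")
```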
