Global ETD Search

281	Dense and sparse parallel linear algebra algorithms on graphics processing units Lamas Daviña, Alejandro 13 November 2018 (has links) Una línea de desarrollo seguida en el campo de la supercomputación es el uso de procesadores de propósito específico para acelerar determinados tipos de cálculo. En esta tesis estudiamos el uso de tarjetas gráficas como aceleradores de la computación y lo aplicamos al ámbito del álgebra lineal. En particular trabajamos con la biblioteca SLEPc para resolver problemas de cálculo de autovalores en matrices de gran dimensión, y para aplicar funciones de matrices en los cálculos de aplicaciones científicas. SLEPc es una biblioteca paralela que se basa en el estándar MPI y está desarrollada con la premisa de ser escalable, esto es, de permitir resolver problemas más grandes al aumentar las unidades de procesado. El problema lineal de autovalores, Ax = lambda x en su forma estándar, lo abordamos con el uso de técnicas iterativas, en concreto con métodos de Krylov, con los que calculamos una pequeña porción del espectro de autovalores. Este tipo de algoritmos se basa en generar un subespacio de tamaño reducido (m) en el que proyectar el problema de gran dimensión (n), siendo m << n. Una vez se ha proyectado el problema, se resuelve este mediante métodos directos, que nos proporcionan aproximaciones a los autovalores del problema inicial que queríamos resolver. Las operaciones que se utilizan en la expansión del subespacio varían en función de si los autovalores deseados están en el exterior o en el interior del espectro. En caso de buscar autovalores en el exterior del espectro, la expansión se hace mediante multiplicaciones matriz-vector. Esta operación la realizamos en la GPU, bien mediante el uso de bibliotecas o mediante la creación de funciones que aprovechan la estructura de la matriz. En caso de autovalores en el interior del espectro, la expansión requiere resolver sistemas de ecuaciones lineales. En esta tesis implementamos varios algoritmos para la resolución de sistemas de ecuaciones lineales para el caso específico de matrices con estructura tridiagonal a bloques, que se ejecutan en GPU. En el cálculo de las funciones de matrices hemos de diferenciar entre la aplicación directa de una función sobre una matriz, f(A), y la aplicación de la acción de una función de matriz sobre un vector, f(A)b. El primer caso implica un cálculo denso que limita el tamaño del problema. El segundo permite trabajar con matrices dispersas grandes, y para resolverlo también hacemos uso de métodos de Krylov. La expansión del subespacio se hace mediante multiplicaciones matriz-vector, y hacemos uso de GPUs de la misma forma que al resolver autovalores. En este caso el problema proyectado comienza siendo de tamaño m, pero se incrementa en m en cada reinicio del método. La resolución del problema proyectado se hace aplicando una función de matriz de forma directa. Nosotros hemos implementado varios algoritmos para calcular las funciones de matrices raíz cuadrada y exponencial, en las que el uso de GPUs permite acelerar el cálculo. / One line of development followed in the field of supercomputing is the use of specific purpose processors to speed up certain types of computations. In this thesis we study the use of graphics processing units as computer accelerators and apply it to the field of linear algebra. In particular, we work with the SLEPc library to solve large scale eigenvalue problems, and to apply matrix functions in scientific applications. SLEPc is a parallel library based on the MPI standard and is developed with the premise of being scalable, i.e. to allow solving larger problems by increasing the processing units. We address the linear eigenvalue problem, Ax = lambda x in its standard form, using iterative techniques, in particular with Krylov's methods, with which we calculate a small portion of the eigenvalue spectrum. This type of algorithms is based on generating a subspace of reduced size (m) in which to project the large dimension problem (n), being m << n. Once the problem has been projected, it is solved by direct methods, which provide us with approximations of the eigenvalues of the initial problem we wanted to solve. The operations used in the expansion of the subspace vary depending on whether the desired eigenvalues are from the exterior or from the interior of the spectrum. In the case of searching for exterior eigenvalues, the expansion is done by matrix-vector multiplications. We do this on the GPU, either by using libraries or by creating functions that take advantage of the structure of the matrix. In the case of eigenvalues from the interior of the spectrum, the expansion requires solving linear systems of equations. In this thesis we implemented several algorithms to solve linear systems of equations for the specific case of matrices with a block-tridiagonal structure, that are run on GPU. In the computation of matrix functions we have to distinguish between the direct application of a matrix function, f(A), and the action of a matrix function on a vector, f(A)b. The first case involves a dense computation that limits the size of the problem. The second allows us to work with large sparse matrices, and to solve it we also make use of Krylov's methods. The expansion of subspace is done by matrix-vector multiplication, and we use GPUs in the same way as when solving eigenvalues. In this case the projected problem starts being of size m, but it is increased by m on each restart of the method. The solution of the projected problem is done by directly applying a matrix function. We have implemented several algorithms to compute the square root and the exponential matrix functions, in which the use of GPUs allows us to speed up the computation. / Una línia de desenvolupament seguida en el camp de la supercomputació és l'ús de processadors de propòsit específic per a accelerar determinats tipus de càlcul. En aquesta tesi estudiem l'ús de targetes gràfiques com a acceleradors de la computació i ho apliquem a l'àmbit de l'àlgebra lineal. En particular treballem amb la biblioteca SLEPc per a resoldre problemes de càlcul d'autovalors en matrius de gran dimensió, i per a aplicar funcions de matrius en els càlculs d'aplicacions científiques. SLEPc és una biblioteca paral·lela que es basa en l'estàndard MPI i està desenvolupada amb la premissa de ser escalable, açò és, de permetre resoldre problemes més grans en augmentar les unitats de processament. El problema lineal d'autovalors, Ax = lambda x en la seua forma estàndard, ho abordem amb l'ús de tècniques iteratives, en concret amb mètodes de Krylov, amb els quals calculem una xicoteta porció de l'espectre d'autovalors. Aquest tipus d'algorismes es basa a generar un subespai de grandària reduïda (m) en el qual projectar el problema de gran dimensió (n), sent m << n. Una vegada s'ha projectat el problema, es resol aquest mitjançant mètodes directes, que ens proporcionen aproximacions als autovalors del problema inicial que volíem resoldre. Les operacions que s'utilitzen en l'expansió del subespai varien en funció de si els autovalors desitjats estan en l'exterior o a l'interior de l'espectre. En cas de cercar autovalors en l'exterior de l'espectre, l'expansió es fa mitjançant multiplicacions matriu-vector. Aquesta operació la realitzem en la GPU, bé mitjançant l'ús de biblioteques o mitjançant la creació de funcions que aprofiten l'estructura de la matriu. En cas d'autovalors a l'interior de l'espectre, l'expansió requereix resoldre sistemes d'equacions lineals. En aquesta tesi implementem diversos algorismes per a la resolució de sistemes d'equacions lineals per al cas específic de matrius amb estructura tridiagonal a blocs, que s'executen en GPU. En el càlcul de les funcions de matrius hem de diferenciar entre l'aplicació directa d'una funció sobre una matriu, f(A), i l'aplicació de l'acció d'una funció de matriu sobre un vector, f(A)b. El primer cas implica un càlcul dens que limita la grandària del problema. El segon permet treballar amb matrius disperses grans, i per a resoldre-ho també fem ús de mètodes de Krylov. L'expansió del subespai es fa mitjançant multiplicacions matriu-vector, i fem ús de GPUs de la mateixa forma que en resoldre autovalors. En aquest cas el problema projectat comença sent de grandària m, però s'incrementa en m en cada reinici del mètode. La resolució del problema projectat es fa aplicant una funció de matriu de forma directa. Nosaltres hem implementat diversos algorismes per a calcular les funcions de matrius arrel quadrada i exponencial, en les quals l'ús de GPUs permet accelerar el càlcul. / Lamas Daviña, A. (2018). Dense and sparse parallel linear algebra algorithms on graphics processing units [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/112425 Parallel computing, GPU MPI Eigenproblems Linear systems Matrix functions SLEPc
282	HPC algorithms for nonnegative decompositions San Juan Sebastián, Pablo 26 November 2018 (has links) Muchos problemas procedentes de aplicaciones del mundo real pueden ser modelados como problemas matemáticos con magnitudes no negativas, y por tanto, las soluciones de estos problemas matemáticos solo tienen sentido si son no negativas. Estas magnitudes no negativas pueden ser, por ejemplo, las frecuencias en una señal sonora, las intensidades de los pixeles de una imagen, etc. Algunos de estos problemas pueden ser modelados utilizando un sistema de ecuaciones lineales sobredeterminado. Cuando la solución de dicho problema debe ser restringida a valores no negativos, aparece un problema llamado problema de mínimos cuadrados no negativos (NNLS por sus siglas en inglés). La solución de dicho problema tiene múltiples aplicaciones en ciencia e ingeniería. Otra descomposición no negativa importante es la Factorización de Matrices No negativas (NMF por sus siglas en inglés). La NMF es una herramienta muy popular utilizada en varios campos, como por ejemplo: clasificación de documentos, aprendizaje automático, análisis de imagen o separación de señales sonoras. Esta factorización intenta aproximar una matriz no negativa con el producto de dos matrices no negativas de menor tamaño, creando habitualmente representaciones por partes de los datos originales. Los algoritmos diseñados para calcular la solución de estos dos problemas no negativos tienen un elevado coste computacional, y debido a ese elevado coste, estas descomposiciones pueden beneficiarse mucho del uso de técnicas de Computación de Altas Prestaciones (HPC por sus siglas en inglés). Estos sistemas computacionales de altas prestaciones incluyen desde los modernos computadores multinucleo a lo último en aceleradores de calculo (Unidades de Procesamiento Gráfico (GPU), Intel Many Integrated Core (MIC), etc.). Para obtener el máximo rendimiento de estos sistemas, los desarrolladores deben utilizar tecnologías software tales como la programación paralela, la vectoración o el uso de librerías de computación altas prestaciones. A pesar de que existen diversos algoritmos para calcular la NMF y resolver el problema NNLS, no todos ellos disponen de una implementación paralela y eficiente. Además, es muy interesante reunir diversos algoritmos con propiedades diferentes en una sola librería computacional. Esta tesis presenta una librería computacional de altas prestaciones que contiene implementaciones paralelas y eficientes de los mejores algoritmos existentes actualmente para calcular la NMF. Además la tesis también incluye una comparación experimental entre las diferentes implementaciones presentadas. Esta librería centrada en el cálculo de la NMF soporta múltiples arquitecturas tales como CPUs multinucleo, GPUs e Intel MIC. El objetivo de esta librería es ofrecer un abanico de algoritmos eficientes para ayudar a científicos, ingenieros o cualquier tipo de profesionales que necesitan hacer uso de la NMF. Otro problema abordado en esta tesis es la actualización de las factorizaciones no negativas. El problema de la actualización se ha estudiado tanto para la solución del problema NNLS como para el calculo de la NMF. Existen problemas no negativos cuya solución es próxima a otros problemas que ya han sido resueltos, el problema de la actualización consiste en aprovechar la solución de un problema A que ya ha sido resuelto, para obtener la solución de un problema B cercano al problema A. Utilizando esta aproximación, el problema B puede ser resuelto más rápido que si se tuviera que resolver sin aprovechar la solución conocida del problema A. En esta tesis se presenta una metodología algorítmica para resolver ambos problemas de actualización: la actualización de la solución del problema NNLS y la actualización de la NMF. Además se presentan evaluaciones empíricas de las soluciones presentadas para ambos problemas. Los resultados de estas evaluaciones muestran que los algoritmos propuestos son más rápidos que reso / Molts problemes procedents de aplicacions del mon real poden ser modelats com problemes matemàtics en magnituts no negatives, i per tant, les solucions de estos problemes matemàtics només tenen sentit si son no negatives. Estes magnituts no negatives poden ser, per eixemple, la concentració dels elements en un compost químic, les freqüències en una senyal sonora, les intensitats dels pixels de una image, etc. Alguns d'estos problemes poden ser modelats utilisant un sistema d'equacions llineals sobredeterminat. Quant la solució de este problema deu ser restringida a valors no negatius, apareix un problema nomenat problema de mínims quadrats no negatius (NNLS per les seues sigles en anglés). La solució de este problema te múltiples aplicacions en ciències i ingenieria. Un atra descomposició no negativa important es la Factorisació de Matrius No negatives(NMF per les seues sigles en anglés). La NMF es una ferramenta molt popular utilisada en diversos camps, com per eixemple: classificacio de documents, aprenentage automàtic, anàlisis de image o separació de senyals sonores. Esta factorisació intenta aproximar una matriu no negativa en el producte de dos matrius no negatives de menor tamany, creant habitualment representacions a parts de les dades originals. Els algoritmes dissenyats per a calcular la solució de estos dos problemes no negatius tenen un elevat cost computacional, i degut a este elevat cost, estes descomposicions poden beneficiar-se molt del us de tècniques de Computació de Altes Prestacions (HPC per les seues sigles en anglés). Estos sistemes de computació de altes prestacions inclouen des dels moderns computadors multinucli a lo últim en acceleradors de càlcul (Unitats de Processament Gràfic (GPU), Intel Many Core (MIC), etc.). Per a obtindre el màxim rendiment de estos sistemes, els desenrolladors deuen utilisar tecnologies software tals com la programació paralela, la vectorisació o el us de llibreries de computació de altes prestacions. A pesar de que existixen diversos algoritmes per a calcular la NMF i resoldre el problema NNLS, no tots ells disponen de una implementació paralela i eficient. Ademés, es molt interessant reunir diversos algoritmes en propietats diferents en una sola llibreria computacional. Esta tesis presenta una llibreria computacional de altes prestacions que conté implementacions paraleles i eficients dels millors algoritmes existents per a calcular la NMF. Ademés, la tesis també inclou una comparació experimental entre les diferents implementacions presentades. Esta llibreria centrada en el càlcul de la NMF soporta diverses arquitectures tals com CPUs multinucli, GPUs i Intel MIC. El objectiu de esta llibreria es oferir una varietat de algoritmes eficients per a ajudar a científics, ingeniers o qualsevol tipo de professionals que necessiten utilisar la NMF. Un atre problema abordat en esta tesis es la actualisació de les factorisacions no negatives. El problema de la actualisació se ha estudiat tant per a la solució del problema NNLS com per a el càlcul de la NMF. Existixen problemes no negatius la solució dels quals es pròxima a atres problemes no negatius que ya han sigut resolts, el problema de la actualisació consistix en aprofitar la solució de un problema A que ya ha sigut resolt, per a obtindre la solució de un problema B pròxim al problema A. Utilisant esta aproximació, el problema B pot ser resolt molt mes ràpidament que si tinguera que ser resolt des de 0 sense aprofitar la solució coneguda del problema A. En esta tesis es presenta una metodologia algorítmica per a resoldre els dos problemes de actualisació: la actualisació de la solució del problema NNLS i la actualisació de la NMF. Ademés es presenten evaluacions empíriques de les solucions presentades per als dos problemes. Els resultats de estes evaluacions mostren que els algoritmes proposts son més ràpits que resoldre el problema des de 0 en tots els / Many real world-problems can be modelled as mathematical problems with nonnegative magnitudes, and, therefore, the solutions of these problems are meaningful only if their values are nonnegative. Examples of these nonnegative magnitudes are the concentration of components in a chemical compound, frequencies in an audio signal, pixel intensities on an image, etc. Some of these problems can be modelled to an overdetermined system of linear equations. When the solution of this system of equations should be constrained to nonnegative values, a new problem arises. This problem is called the Nonnegative Least Squares (NNLS) problem, and its solution has multiple applications in science and engineering, especially for solving optimization problems with nonnegative restrictions. Another important nonnegativity constrained decomposition is the Nonnegative Matrix Factorization (NMF). The NMF is a very popular tool in many fields such as document clustering, data mining, machine learning, image analysis, chemical analysis, and audio source separation. This factorization tries to approximate a nonnegative data matrix with the product of two smaller nonnegative matrices, usually creating parts based representations of the original data. The algorithms that are designed to compute the solution of these two nonnegative problems have a high computational cost. Due to this high cost, these decompositions can benefit from the extra performance obtained using High Performance Computing (HPC) techniques. Nowadays, there are very powerful computational systems that offer high performance and can be used to solve extremely complex problems in science and engineering. From modern multicore CPUs to the newest computational accelerators (Graphics Processing Units(GPU), Intel Many Integrated Core(MIC), etc.), the performance of these systems keeps increasing continuously. To make the most of the hardware capabilities of these HPC systems, developers should use software technologies such as parallel programming, vectorization, or high performance computing libraries. While there are several algorithms for computing the NMF and for solving the NNLS problem, not all of them have an efficient parallel implementation available. Furthermore, it is very interesting to group several algorithms with different properties into a single computational library. This thesis presents a high-performance computational library with efficient parallel implementations of the best algorithms to compute the NMF in the current state of the art. In addition, an experimental comparison between the different implementations is presented. This library is focused on the computation of the NMF supporting multiple architectures like multicore CPUs, GPUs and Intel MIC. The goal of the library is to offer a full suit of algorithms to help researchers, engineers or professionals that need to use the NMF. Another problem that is dealt with in this thesis is the updating of nonnegative decompositions. The updating problem has been studied for both the solution of the NNLS problem and the NMF. Sometimes there are nonnegative problems that are close to other nonnegative problems that have already been solved. The updating problem tries to take advantage of the solution of a problem A, that has already been solved in order to obtain a solution of a new problem B, which is closely related to problem A. With this approach, problem B can be solved faster than solving it from scratch and not taking advantage of the already known solution of problem A. In this thesis, an algorithmic scheme is proposed for both the updating of the solution of NNLS problems and the updating of the NMF. Empirical evaluations for both updating problems are also presented. The results show that the proposed algorithms are faster than solving the problems from scratch in all of the tested cases. / San Juan Sebastián, P. (2018). HPC algorithms for nonnegative decompositions [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/113069 NMF NNLS HPC parallel computing nonnegative decompositions computational library GPU
283	Multi-level Parallelism with MPI and OpenACC for CFD Applications McCall, Andrew James 14 June 2017 (has links) High-level parallel programming approaches, such as OpenACC, have recently become popular in complex fluid dynamics research since they are cross-platform and easy to implement. OpenACC is a directive-based programming model that, unlike low-level programming models, abstracts the details of implementation on the GPU. Although OpenACC generally limits the performance of the GPU, this model significantly reduces the work required to port an existing code to any accelerator platform, including GPUs. The purpose of this research is twofold: to investigate the effectiveness of OpenACC in developing a portable and maintainable GPU-accelerated code, and to determine the capability of OpenACC to accelerate large, complex programs on the GPU. In both of these studies, the OpenACC implementation is optimized and extended to a multi-GPU implementation while maintaining a unified code base. OpenACC is shown as a viable option for GPU computing with CFD problems. In the first study, a CFD code that solves incompressible cavity flows is accelerated using OpenACC. Overlapping communication with computation improves performance for the multi-GPU implementation by up to 21%, achieving up to 400 times faster performance than a single CPU and 99% weak scalability efficiency with 32 GPUs. The second study ports the execution of a more complex CFD research code to the GPU using OpenACC. Challenges using OpenACC with modern Fortran are discussed. Three test cases are used to evaluate performance and scalability. The multi-GPU performance using 27 GPUs is up to 100 times faster than a single CPU and maintains a weak scalability efficiency of 95%. / Master of Science / The research and analysis performed in scientific computing today produces an ever-increasing demand for faster and more energy efficient performance. Parallel computing with supercomputers that use many central processing units (CPUs) is the current standard for satisfying these demands. The use of graphics processing units (GPUs) for scientific computing applications is an emerging technology that has gained a lot of popularity in the past decade. A single GPU can distribute the computations required by a program over thousands of processing units. This research investigates the effectiveness of a relatively new standard, called OpenACC, for offloading execution of a program to the GPU. The most widely used standards today are highly complex and require low-level, detailed knowledge of the GPU’s architecture. These issues significantly reduce the maintainability and portability of a program. OpenACC does not require rewriting a program for the GPU. Instead, the developer annotates regions of code to run on the GPU and only has to denote high-level information about how to parallelize the code. The results of this research found that even for a complex program that models air flows, using OpenACC to run the program on 27 GPUs increases performance by a factor of 100 over a single CPU and by a factor of 4 over 27 CPUs. Although higher performance is expected with other GPU programming standards, these results were accomplished with minimal change to the original program. Therefore, these results demonstrate the ability of OpenACC to improve performance while keeping the program maintainable and portable. Graphics processing unit Directive-based programming OpenACC Lid-driven cavity Multi-GPU Parallel computing
284	Stabilized Explicit Time Integration for Parallel Air Quality Models Srivastava, Anurag 09 November 2006 (has links) Air Quality Models are defined for prediction and simulation of air pollutant concentrations over a certain period of time. The predictions can be used in setting limits for the emission levels of industrial facilities. The input data for the air quality models are very large and encompass various environmental conditions like wind speed, turbulence, temperature and cloud density. Most air quality models are based on advection-diffusion equations. These differential equations are moderately stiff and require appropriate techniques for fast integration over large intervals of time. Implicit time stepping techniques for solving differential equations being unconditionally stable are considered suitable for the solution. However, implicit time stepping techniques impose certain data dependencies that can cause the parallelization of air quality models to be inefficient. The current approach uses Runge Kutta Chebyshev explicit method for solution of advection diffusion equations. It is found that even if the explicit method used is computationally more expensive in the serial execution, it takes lesser execution time when parallelized because of less complicated data dependencies presented by the explicit time-stepping. The implicit time-stepping on the other hand cannot be parallelized efficiently because of the inherent complicated data dependencies. / Master of Science Explicit time-stepping Runge Kutta Chebyshev STEM Air Quality Models Supercomputing Parallel Computing
285	An Implementation-Based Exploration of HAPOD: Hierarchical Approximate Proper Orthogonal Decomposition Beach, Benjamin Josiah 25 January 2018 (has links) Proper Orthogonal Decomposition (POD), combined with the Method of Snapshots and Galerkin projection, is a popular method for the model order reduction of nonlinear PDEs. The POD requires the left singular vectors from the singular value decomposition (SVD) of an n-by-m "snapshot matrix" S, each column of which represents the computed state of the system at a given time. However, the direct computation of this decomposition can be computationally expensive, particularly for snapshot matrices that are too large to fit in memory. Hierarchical Approximate POD (HAPOD) (Himpe 2016) is a recent method for the approximate truncated SVD that requires only a single pass over S, is easily parallelizable, and can be computationally cheaper than direct SVD, all while guaranteeing the requested accuracy for the resulting basis. This method processes the columns of S in blocks based on a predefined rooted tree of processors, concatenating the outputs from each stage to form the inputs for the next. However, depending on the selected parameter values and the properties of S, the performance of HAPOD may be no better than that of direct SVD. In this work, we numerically explore the parameter values and snapshot matrix properties for which HAPOD is computationally advantageous over the full SVD and compare its performance to that of a parallelized incremental SVD method (Brand 2002, Brand 2003, and Arrighi2015). In particular, in addition to the two major processor tree structures detailed in the initial publication of HAPOD (Himpe2016), we explore the viability of a new structure designed with an MPI implementation in mind. / Master of Science / Singular Value Decomposition (SVD) provides a way to represent numeric data that breaks the data up into its most important components, as well as measuring how significant each part is. This decomposition is widely used to assist in finding patterns in data and making decisions accordingly, or to obtain simple, yet accurate, representations of complex physical processes. Examples of useful data to decompose include the velocity of water flowing past an obstacle in a river, a large collection of images, or user ratings for a large number of movies. However, computing the SVD directly can be computationally expensive, and usually requires repeated access to the entire dataset. As these data sets can be very large, up to hundreds of gigabytes or even several terabytes, storing all of the data in memory at once may be infeasible. Thus, repeated access to the entire dataset requires that the files be read repeatedly from the hard disk, which can make the required computations exceptionally slow. Fortunately, for many applications, only the most important parts of the data are needed, and the rest can be discarded. As a result, several methods have surfaced that can pick out the most important parts of the data while accessing the original data only once, piece by piece, and can be much faster than computing the SVD directly. In addition, the recent bottleneck in individual computer processor speeds has motivated a need for methods that can efficiently run on a large number of processors in parallel. Hierarchical Approximate POD (HAPOD) [1] is a recently-developed method that can efficiently pick out the most important parts of the data while only accessing the original data once, and which is very easy to run in parallel. However, depending on a user-defined algorithm parameter (weight), HAPOD may return more information than is needed to satisfy the requested accuracy, which determines how much data can be discarded. It turns out that the input weights that result in less extra data also result in slower computations and the eventual need for more data to be stored in memory at once. This thesis explores how to choose this input weight to best balance the amount of extra information used with the speed of the method, and also explores how the properties of the data, such as the size of the data or the distribution of levels of significance of each part, impact the effectiveness of HAPOD. Reduced Order Modeling Proper Orthogonal Decomposition Truncated Singular Value Decomposition Parallel Computing Mine Fire Dynamics
286	Numerical Methods for the Chemical Master Equation Zhang, Jingwei 20 January 2010 (has links) The chemical master equation, formulated on the Markov assumption of underlying chemical kinetics, offers an accurate stochastic description of general chemical reaction systems on the mesoscopic scale. The chemical master equation is especially useful when formulating mathematical models of gene regulatory networks and protein-protein interaction networks, where the numbers of molecules of most species are around tens or hundreds. However, solving the master equation directly suffers from the so called "curse of dimensionality" issue. This thesis first tries to study the numerical properties of the master equation using existing numerical methods and parallel machines. Next, approximation algorithms, namely the adaptive aggregation method and the radial basis function collocation method, are proposed as new paths to resolve the "curse of dimensionality". Several numerical results are presented to illustrate the promises and potential problems of these new algorithms. Comparisons with other numerical methods like Monte Carlo methods are also included. Development and analysis of the linear Shepard algorithm and its variants, all of which could be used for high dimensional scattered data interpolation problems, are also included here, as a candidate to help solve the master equation by building surrogate models in high dimensions. / Ph. D. Collocation Method Radial Basis Function Shepard Algorithm M-estimation Uniformization/Randomization Method Aggregation/Disaggregation Uniformization/Randomization Method Stochastic Simulation Algorithm Parallel Computing Chemical Master Equation Radial Basis Function Stochastic Simulation Algorithm Chemical Master Equation Aggregation/Disaggregation Parallel Computing Collocation Method
287	Construction and analysis of numerical methods for solution of laser physics and nonlinear optics problems / Lazerių fizikos ir netiesinės optikos ir uždavinių sprendimo metodų sudarymas ir analizė Laukaitytė, Inga 18 June 2010 (has links) Mathematical models describing the Q-switched laser generation, which is a widely used laser technique for producing short intense pulses of light, belong to the class of semi-nonlinear models where only source terms nonlinearly depend on the solution. Numerical methods for solution of systems of semi-nonlinear partial differential equations have been extensively studied in many papers. Schrödinger-type equations, parabolic-type equations or general diffusion-reaction models arise in nonlinear optics. Such differential problems are solved mainly by finite-difference and Galerkin methods. The convergence analysis is based on the stability analysis of the linearized problems. The construction and theoretical analysis of discrete schemes for one-dimensional problem give a basis for a numerical solution of more general two-dimensional and three-dimensional problems where a diffraction process is taken into account. The two-dimensional problem simulates the dynamics of high-power semiconductor lasers. To solve the problems simulating propagation of photon fluxes in the nonlinear disperse medium, the finite-difference time-domain method is used. However, the major drawback of this method is that the computational domain must be sufficiently large. In order to restrict the computational domain and to solve the problem only in the region of interest, special artificial boundary conditions are investigated. The three-dimensional problem simulates an interaction of counter propagating... [to full text] / Disertacijoje nagrinėjami kai kurių lazerių fizikos ir netiesinės optikos uždavinių skaitinės analizės metodai. Tiriami trys pagrindiniai atvejai: bėgančias plokščias bangas aprašantis vienmatis, bėgančias difraguojančias bangas nagrinėjantis dvimatis ir lazerio pluoštų sąveiką netiesinėje Kero terpėje modeliuojantis trimatis modeliai. Šiuos uždavinius sieja pernešimo diferencialinės lygtys dalinėmis išvestinėmis, aprašančios į priešingas puses sklindančias lazerio bangas. Dvimačiame ir trimačiame uždaviniuose sprendžiamos dalinių išvestinių Šrėdingerio (ang. Schrödinger) tipo diferencialinės lygtys. Šiems matematiniams modeliams sudarytos baigtinių skirtumų schemos, atlikta jų analizė ir pagrindimas. Skaitinių eksperimentų realizacijai sukurti lygiagretieji algoritmai, jie yra būtini atliekant didelių resursų reikalaujančius skaičiavimus. Disertaciją sudaro įvadas, keturi skyriai, rezultatų apibendrinimas, naudotos literatūros ir autoriaus publikacijų disertacijos tema sarašai. Įvadiniame skyriuje aptariama tiriamoji problema, darbo aktualumas, aprašomas tyrimų objektas, formuluojamas darbo tikslas bei uždaviniai, aprašoma tyrimų metodika, darbo mokslinis naujumas, darbo rezultatų praktinė reikšmė, ginamieji teiginiai. Įvado pabaigoje pristatomos disertacijos tema autoriaus paskelbtos publikacijos ir pranešimai konferencijose bei disertacijos struktūra. Pirmasis skyrius skirtas mokslinės literatūros apžvalgai ir supažindinimui su netiesinės optikos sąvokomis bei... [toliau žr. visą tekstą] Mathematics Numerical methods Nonlinear optics Transparent boundary conditions Parallel computing Skaitiniai metodai Netiesinė optika Pralaidžios kraštinės sąlygos Lygiagretieji algoritmai
288	Lazerių fizikos ir netiesinės optikos ir uždavinių sprendimo metodų sudarymas ir analizė / Construction and analysis of numerical methods for solution of laser physics and nonlinear optics problems Laukaitytė, Inga 18 June 2010 (has links) Disertacijoje nagrinėjami kai kurių lazerių fizikos ir netiesinės optikos uždavinių skaitinės analizės metodai. Tiriami trys pagrindiniai atvejai: bėgančias plokščias bangas aprašantis vienmatis, bėgančias difraguojančias bangas nagrinėjantis dvimatis ir lazerio pluoštų sąveiką netiesinėje Kero terpėje modeliuojantis trimatis modeliai. Šiuos uždavinius sieja pernešimo diferencialinės lygtys dalinėmis išvestinėmis, aprašančios į priešingas puses sklindančias lazerio bangas. Dvimačiame ir trimačiame uždaviniuose sprendžiamos dalinių išvestinių Šrėdingerio (ang. Schrödinger) tipo diferencialinės lygtys. Šiems matematiniams modeliams sudarytos baigtinių skirtumų schemos, atlikta jų analizė ir pagrindimas. Skaitinių eksperimentų realizacijai sukurti lygiagretieji algoritmai, jie yra būtini atliekant didelių resursų reikalaujančius skaičiavimus. Disertaciją sudaro įvadas, keturi skyriai, rezultatų apibendrinimas, naudotos literatūros ir autoriaus publikacijų disertacijos tema sarašai. Įvadiniame skyriuje aptariama tiriamoji problema, darbo aktualumas, aprašomas tyrimų objektas, formuluojamas darbo tikslas bei uždaviniai, aprašoma tyrimų metodika, darbo mokslinis naujumas, darbo rezultatų praktinė reikšmė, ginamieji teiginiai. Įvado pabaigoje pristatomos disertacijos tema autoriaus paskelbtos publikacijos ir pranešimai konferencijose bei disertacijos struktūra. Pirmasis skyrius skirtas mokslinės literatūros apžvalgai ir supažindinimui su netiesinės optikos sąvokomis bei... [toliau žr. visą tekstą] / Mathematical models describing the Q-switched laser generation, which is a widely used laser technique for producing short intense pulses of light, belong to the class of semi-nonlinear models where only source terms nonlinearly depend on the solution. Numerical methods for solution of systems of semi-nonlinear partial differential equations have been extensively studied in many papers. Schrödinger-type equations, parabolic-type equations or general diffusion-reaction models arise in nonlinear optics. Such differential problems are solved mainly by finite-difference and Galerkin methods. The convergence analysis is based on the stability analysis of the linearized problems. The construction and theoretical analysis of discrete schemes for one-dimensional problem give a basis for a numerical solution of more general two-dimensional and three-dimensional problems where a diffraction process is taken into account. The two-dimensional problem simulates the dynamics of high-power semiconductor lasers. To solve the problems simulating propagation of photon fluxes in the nonlinear disperse medium, the finite-difference time-domain method is used. However, the major drawback of this method is that the computational domain must be sufficiently large. In order to restrict the computational domain and to solve the problem only in the region of interest, special artificial boundary conditions are investigated. The three-dimensional problem simulates an interaction of counter propagating... [to full text] Mathematics Skaitiniai metodai Netiesinė optika Pralaidžios kraštinės sąlygos Lygiagretieji algoritmai Numerical methods Nonlinear optics Transparent boundary conditions Parallel computing
289	Numerická simulace proudění stlačitelných tekutin pomocí paralelních výpočtů / Numerical simulation of compressible flows using the parallel computing Šíp, Viktor January 2011 (has links) In the present work we implemented parallel version of a computational fluid dynamics code. This code is based on Discontinuous Galerkin Method (DGM), which is due to its favourable properties suitable for parallelization. In the work we describe the Navier-Stokes equations and their discretization using DGM. We explain the advantages of usage of the DGM and formulate the serial algorithm. Next we focus on the parallel implementation of the algorithm and several particular issues connected to the parallelization. We present the numerical experiments showing the efficiency of the parallel code in the last chapter.
290	Couplage de modèles population et individu-centrés pour la simulation parallélisée des systèmes biologiques : application à la coagulation du sang / Population and individual-based model coupling for the parallel simulation of biological systems : application to blood coagulation Crépin, Laurent 28 October 2013 (has links) Plusieurs types d’expérimentation existent pour étudier et comprendre les systèmes biologiques. Dans ces travaux, nous nous intéressons à la simulation in silico, c’est-à-dire à la simulation numérique de modèles sur un ordinateur. Les systèmes biologiques sont composés d’entités, à la fois nombreuses et variées, en interaction les unes avec les autres. Ainsi, ils peuvent être modélisés par l’intermédiaire de deux approches complémentaires : l’approche population-centrée et l’approche individu-centrée. Face à la multitude et à la variété des phénomènes composant les systèmes biologiques, il nous semble pertinent de coupler ces deux approches pour obtenir une modélisation mixte. En outre, en raison de la quantité conséquente d’informations que représente l’ensemble des entités et des interactions à modéliser, la simulation numérique des systèmes biologiques est particulièrement coûteuse en temps de calcul informatique. Ainsi, dans ce mémoire, nous proposons des solutions techniques de parallélisation permettant d’exploiter au mieux les performances offertes par les architectures multicoeur et multiprocesseur et les architectures graphiques pour la simulation de systèmes biologiques à base de modélisations mixtes. Nous appliquons nos travaux au domaine de la coagulation du sang et plus particulièrement à l’étude de la cinétique biochimique à l’échelle microscopique ainsi qu’à la simulation d’un vaisseau sanguin virtuel. Ces deux applications nous permettent d’évaluer les performances offertes par les solutions techniques de parallélisation que nous proposons, ainsi que leur pertinence dans le cadre de la simulation des systèmes biologiques. / Several types of experimentation exist to study and understand biological systems. Inthis document, we take an interest in in silico simulation, i.e. numerical simulation ofmodels on a computer. Biological systems are made of many various entities, interactingwith each other. Therefore, they can be modeled by two complementary approaches: thepopulation-based approach and the individual-based one. Because of the multitude anddiversity of the phenomena constituting biological systems, we find the coupling of thesetwo approaches relevant to provide a hybrid modelisation. Moreover, because of the hugequantity of data that the entities and interactions represent, numerical simulation of biologicalsystems is especially computationaly intensive. This is why, in this document, we proposeparallel computing methods to take advantage of the performances offered by multicore andmultiprocessor architectures and by graphical ones for the simulation of biological systemsusing hybrid modelisations. We apply our work to blood coagulation and especially to thestudy of biochemical kinetics at the microscopic scale and the simulation of a virtual bloodvessel. These two applications enable us to assess both the performances obtained by theparallel computing methods we proposed and their relevance for biological systems simulation. Simulation des systèmes biologiques Parallélisation Couplage multi-modèles Coagulation du sang Biological systems simulation Parallel computing Multi-model coupling Blood coagulation

Search results