1 |
Analysis and Implementation Considerations of Krylov Subspace Methods on Modern Heterogeneous Computing Architectures - Higgins, Andrew (ORCID 0009-0007-5527-9263), 05 1900
Krylov subspace methods are the state-of-the-art iterative algorithms for solving large, sparse systems of equations, which are ubiquitous throughout scientific computing. Even with Krylov methods, these problems are often infeasible to solve on standard workstation computers and must be solved instead on supercomputers. Most modern supercomputers fall into the category of “heterogeneous architectures”, typically meaning a combination of CPU and GPU processors. Thus, development and analysis of Krylov subspace methods on these heterogeneous architectures is of fundamental importance to modern scientific computing.
This dissertation focuses on several specific problems in this context. The first analyzes the performance of block GMRES (BGMRES) compared to GMRES for linear systems with multiple right-hand sides (RHS) on both CPUs and GPUs, and models when BGMRES is most advantageous over GMRES on the GPU. On CPUs, the current paradigm is that if one wishes to solve a system of equations with multiple RHS, BGMRES can indeed outperform GMRES, but not always. Our original goal was to determine whether there are cases for which BGMRES is slower in execution time than GMRES on the CPU, while on the GPU the reverse holds. Such cases do exist, and on the GPU we generally observe much faster execution times and larger improvements from BGMRES. We also observe that, for any fixed matrix, as the number of RHS increases there is a point at which the improvements start to decrease and eventually any advantage of the (unrestarted) block method is lost. We present a new computational model that helps explain why this is so. The significance of this analysis is that it demonstrates greater potential for block Krylov methods on heterogeneous architectures than on the previously studied CPU-only machines. Moreover, the theoretical runtime model can be used to identify an optimal partitioning strategy for the RHS when solving systems with many RHS.
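A minimal sketch of how a runtime model might drive such a partitioning choice is shown below. The cost function, its coefficients, and the assumed dependence of iteration count on block size are hypothetical placeholders for illustration; this is not the dissertation's actual model.

```python
import numpy as np

# Toy sketch: use a hypothetical per-block runtime model to pick a uniform block
# size b for partitioning p right-hand sides before running block GMRES on each
# block. The coefficients and the assumed decay of iteration count with block
# size are placeholders, not measured or derived values.
def toy_block_cost(b, iters=lambda b: 200.0 / np.sqrt(b),
                   spmm=lambda b: 1.0 + 0.1 * b,        # sparse matrix-block product
                   orth=lambda b: 0.02 * b * b):        # block orthogonalization
    return iters(b) * (spmm(b) + orth(b))

def best_uniform_partition(num_rhs, max_block):
    # Evaluate the toy cost of splitting the RHS into uniform blocks of size b.
    costs = {}
    for b in range(1, min(num_rhs, max_block) + 1):
        n_blocks = -(-num_rhs // b)                     # ceil(num_rhs / b)
        costs[b] = n_blocks * toy_block_cost(b)
    return min(costs, key=costs.get), costs

if __name__ == "__main__":
    b_opt, costs = best_uniform_partition(num_rhs=32, max_block=32)
    print("toy-optimal block size:", b_opt)
```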
The second problem studies the s-step GMRES method, which is an implementation of GMRES that attains high performance on modern heterogeneous machines by generating s Krylov basis vectors per iteration, and then orthogonalizing the vectors in a block-wise fashion. The use of s-step GMRES is currently limited because the algorithm is prone to numerical instabilities, partially due to breakdowns in a tall-and-skinny QR subroutine. Further, a conservatively small step size must be used in practice, limiting the algorithm’s performance. To address these issues, first a novel randomized tall-and-skinny QR factorization is presented that is significantly more stable than the current practical algorithms without sacrificing performance on GPUs. Then, a novel two-stage block orthogonalization scheme is introduced that significantly improves the performance of the s-step GMRES algorithm when small step sizes are used. These contributions help make s-step GMRES a more practical method in heterogeneous, and therefore exascale, environments. / Mathematics
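To make the block-wise structure concrete, below is a minimal NumPy sketch of the s-step idea under simplifying assumptions (a monomial basis, dense arithmetic, and np.linalg.qr standing in for a tall-and-skinny QR kernel); it is not the randomized TSQR or two-stage orthogonalization scheme developed in the dissertation.

```python
import numpy as np

# Minimal sketch of an s-step Krylov basis with block-wise orthogonalization:
# generate s new basis vectors with matrix-vector products only, then
# orthogonalize the whole block against the existing basis (block classical
# Gram-Schmidt with one reorthogonalization pass) and factor it with a QR.
def s_step_basis(A, q0, s, n_outer):
    n = A.shape[0]
    Q = (q0 / np.linalg.norm(q0)).reshape(n, 1)
    for _ in range(n_outer):
        W = np.empty((n, s))
        v = Q[:, -1]
        for j in range(s):               # "matrix powers" phase: no orthogonalization
            v = A @ v
            W[:, j] = v
        W = W - Q @ (Q.T @ W)            # block projection against existing basis
        W = W - Q @ (Q.T @ W)            # second pass for stability
        Qnew, _ = np.linalg.qr(W)        # intra-block QR (tall-and-skinny QR stand-in)
        Q = np.hstack([Q, Qnew])
    return Q

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 200)) / np.sqrt(200)
    Q = s_step_basis(A, rng.standard_normal(200), s=4, n_outer=5)
    print("max orthogonality error:", np.abs(Q.T @ Q - np.eye(Q.shape[1])).max())
```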
|
2 |
Delayed Transfer Entropy applied to Big Data - Dourado, Jonas Rossi, 30 November 2018
The recent popularization of technologies such as Smartphones, Wearables, the Internet of Things, Social Networks and Video streaming has increased data creation. Dealing with extensive data sets led to the creation of the term Big Data, often defined as the situation in which data volume, acquisition rate, or representation demands nontraditional approaches to data analysis or requires horizontal scaling for data processing. Analysis is the most important Big Data phase, with the objective of extracting meaningful and often hidden information. One example of hidden information in Big Data is causality, which can be inferred with Delayed Transfer Entropy (DTE). Despite its wide applicability, DTE demands high processing power, a demand that is aggravated by large datasets such as those found in Big Data. This research optimized DTE performance and modified existing code to enable DTE execution on a computer cluster. With the Big Data trend in sight, these results may enable the analysis of bigger datasets or stronger statistical evidence.
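As an illustration of the quantity being computed, below is a minimal plug-in (histogram) estimator of delayed transfer entropy from a source series x to a target series y in NumPy. It is a textbook-style formulation assumed for illustration, not the optimized cluster implementation developed in this research.

```python
import numpy as np

# Plug-in estimate of delayed transfer entropy TE_{X->Y}(d):
#   sum over (y_{t+1}, y_t, x_{t-d}) of
#   p(y_{t+1}, y_t, x_{t-d}) * log2[ p(y_{t+1} | y_t, x_{t-d}) / p(y_{t+1} | y_t) ]
# using equal-width binning of both series (a simple estimator for illustration).
def delayed_transfer_entropy(x, y, delay, bins=8):
    xd = np.digitize(x, np.histogram_bin_edges(x, bins)[1:-1])   # symbols 0..bins-1
    yd = np.digitize(y, np.histogram_bin_edges(y, bins)[1:-1])
    t = np.arange(delay, len(y) - 1)
    triples = np.stack([yd[t + 1], yd[t], xd[t - delay]], axis=1)
    joint, _ = np.histogramdd(triples, bins=bins, range=[(-0.5, bins - 0.5)] * 3)
    p_xyz = joint / joint.sum()                    # p(y_{t+1}, y_t, x_{t-d})
    p_yy = p_xyz.sum(axis=2, keepdims=True)        # p(y_{t+1}, y_t)
    p_yx = p_xyz.sum(axis=0, keepdims=True)        # p(y_t, x_{t-d})
    p_y = p_xyz.sum(axis=(0, 2), keepdims=True)    # p(y_t)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = (p_xyz * p_y) / (p_yy * p_yx)
        terms = np.where(p_xyz > 0, p_xyz * np.log2(ratio), 0.0)
    return float(terms.sum())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(20000)
    y = np.roll(x, 4) + 0.5 * rng.standard_normal(20000)  # y_{t+1} = x_{t-3} + noise
    for d in range(1, 6):
        print(d, delayed_transfer_entropy(x, y, d))        # should peak at delay 3
```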
|