41 |
Advanced Computational Modeling for Marine Tidal Turbine Farm
Li, Zhisong 05 October 2012
No description available.
|
42 |
Développement d'un modèle 3D Automate Cellulaire-Éléments Finis (CAFE) parallèle pour la prédiction de structures de grains lors de la solidification d'alliages métalliques / Development of a 3D parallel Cellular Automaton-Finite Element (CAFE) model for grain structure prediction during solidification of metallic alloys
Carozzani, Tommy 04 December 2012
La formation de la structure de grains dans les métaux pendant la solidification est déterminante pour les propriétés mécaniques et électroniques des pièces coulées. En plus de la texture donnée au matériau, la germination et la croissance des grains sont liées en particulier avec la formation des phases thermodynamiques et les inhomogénéités en composition d'éléments d'alliage. La structure de grains est rarement modélisée à l'échelle macroscopique, d'autant plus que l'approximation 2D est très souvent injustifiée. Dans ces travaux, la germination et la croissance de chaque grain individuel sont suivies avec un modèle macroscopique 3D CAFE. La microstructure interne des grains n'est pas explicitement résolue. Pour valider les approximations faites sur cette microstructure, une comparaison directe avec un modèle microscopique "champ de phase" a été réalisée. Celle-ci a permis de valider les hypothèses de construction du modèle CAFE, de mettre en avant le lien entre données calculées par les modèles microscopiques et paramètres d'entrée des modèles à plus grande échelle, et les domaines de validité de chaque modèle. Dans un deuxième temps, un couplage avec la ségrégation chimique et les bases de données thermodynamiques a été mise en place et appliquée sur un alliage binaire étain-plomb. Une expérience de macroségrégation par convection naturelle a été simulée. L'accord entre les courbes de température expérimentales et simulées atteint une précision de l'ordre de 1K, et la recalescence est correctement prédite. Les cartes de compositions sont comparables qualitativement, ainsi que la structure de grains. Les avantages du suivi de la structure ont été mis en évidence par rapport à une simulation par éléments finis classique. De plus, il a été montré que le calcul 3D était ici indispensable. 
Enfin, une implémentation parallèle optimisée du code a permis d'appliquer le modèle CAFE à un lingot de silicium polycristallin industriel de dimensions 0,192 x 0,192 x 2,08m, avec une taille de cellules de 250µm. Au total, 4,9 milliards de cellules sont représentées sur le domaine, et la germination et la croissance de 1,6 million de grains sont suivies. / Grain structure formation during the solidification of metal parts strongly affects their final mechanical and electronic properties. Besides determining the crystallographic texture, the nucleation and growth of grains interact with the formation of thermodynamic phases and with inhomogeneities in the distribution of alloying elements. Grain structure is rarely modeled at the macroscopic scale, all the more so because the 2D approximation is often unjustified. In this work, the nucleation and growth of each individual grain are tracked with a 3D macroscopic CAFE model. The internal microstructure of the grains is not explicitly resolved. To validate the approximations made about this microstructure, a direct comparison was carried out with a microscopic "phase field" model. This comparison validated some of the hypotheses on which the CAFE model is built, identified which data computed by microscopic models can serve as input parameters for macroscopic models, and showed the limits of each model. Next, coupling with macrosegregation and thermodynamic databases was implemented and applied to a binary tin-lead alloy. An experiment featuring macrosegregation induced by natural convection was simulated. The experimental and predicted cooling curves agree to within about 1 K, and the recalescence is correctly predicted. The composition maps and the grain structure agree qualitatively with the experiment. The benefit of tracking the structure was demonstrated by comparison with a standard finite element simulation. 
It was also shown that 3D simulation is required for a good description. Finally, the model was implemented with an optimized parallel algorithm. This made it possible to apply the CAFE model to an industrial-scale polycrystalline silicon ingot with dimensions of 0.192 x 0.192 x 2.08 m and a cell size of 250 µm. In total, 4.9 billion cells were represented, and the nucleation and growth of 1.6 million grains were tracked.
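The cell count quoted in the abstract can be checked with a few lines of arithmetic. The sketch below assumes a regular cubic grid that fills the whole ingot box, which the abstract does not state explicitly; the variable names are illustrative only.

```python
# Ingot dimensions and cell size as quoted in the abstract, in metres.
ingot_m = (0.192, 0.192, 2.08)
cell_m = 250e-6

# Assumption: a regular cubic grid covering the full box.
# Number of cells along each axis, rounded to the nearest integer.
cells_per_axis = [round(length / cell_m) for length in ingot_m]
total_cells = cells_per_axis[0] * cells_per_axis[1] * cells_per_axis[2]

print(cells_per_axis)  # [768, 768, 8320]
print(total_cells)     # 4907335680
```

Under that assumption the grid holds about 4.9 billion cells, consistent with the figure in the abstract.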
|
43 |
Modélisation de la solidification dendritique d’un alliage Al-4.5%pdsCu atomisé avec une méthode de champs de phase anisotrope adaptative / Phase-field modeling of dendritic solidification for an Al-4.5wt%Cu atomized droplet using an anisotropic adaptive mesh
Sarkis, Carole 01 December 2016
La croissance dendritique est calculée en utilisant un modèle champ de phase avec adaptation automatique anisotrope et non structurées d’un maillage éléments finis. Les inconnues sont la fonction champ de phase, une température adimensionnelle et une composition adimensionnelle, tel que proposé par [KAR1998] et [RAM2004]. Une interpolation linéaire d’éléments finis est utilisée pour les trois variables, après des techniques de stabilisation de discrétisation qui assurent la convergence vers une solution correcte non-oscillante. Afin d'effectuer des calculs quantitatifs de la croissance dendritique sur un grand domaine, deux ingrédients numériques supplémentaires sont nécessaires: un maillage adaptatif anisotrope et non structuré [COU2011], [COU2014] et un calcul parallèle [DIG2001], mis à disposition de la plateforme numérique utilisée (CimLib) basée sur des développements C++. L'adaptation du maillage se trouve à réduire considérablement le nombre de degrés de liberté. Les résultats des simulations en champ de phase pour les dendrites pour une solidification d'un matériau pur et d’un alliage binaire en deux et trois dimensions sont présentés et comparés à des travaux de référence. Une discussion sur les détails de l'algorithme et le temps CPU sont présentés et une comparaison avec un modèle macroscopique sont faite. / Dendritic growth is computed using a phase-field model with automatic adaptation of an anisotropic and unstructured finite element mesh. Unknowns are the phase-field function, a dimensionless temperature and a dimensionless composition, as proposed by [KAR1998] and [RAM2004]. Linear finite element interpolation is used for all variables, after discretization stabilization techniques that ensure convergence towards a correct non-oscillating solution. 
In order to perform quantitative computations of dendritic growth on a large domain, two additional numerical ingredients are necessary: automatic anisotropic unstructured adaptive meshing [COU2011], [COU2014] and parallel computing [DIG2001], both made available by the numerical platform used (CimLib), which is based on C++ developments. Mesh adaptation is found to greatly reduce the number of degrees of freedom. Results of phase-field simulations for dendritic solidification of a pure material and a binary alloy in two and three dimensions are shown and compared with reference work. Algorithm details and CPU times are discussed, and a comparison with a macroscopic model is made.
|
44 |
Metodologia e ferramentas para paralelização de laços perfeitamente aninhados com processamento heterogêneo. / Methodology and tools for parallelization of nested perfectly loops with heterogeneous processing.
Luz, Cleber Silva Ferreira da 01 February 2018
Aplicações podem apresentar laços perfeitamente aninhados que demandam um alto poder de processamento. Diversas aplicações científicas contêm laços aninhados em suas estruturas. Tais laços podem processar computações heterogêneas. Uma solução para reduzir o tempo de execução desta classe de aplicações é a paralelização destes laços. A heterogeneidade dos tempos de execução de computações presentes nas iterações de laços perfeitamente aninhados demanda uma paralelização adequada visando uma distribuição de carga homogênea entre os recursos computacionais para reduzir a ociosidade de tais recursos. Esta heterogeneidade implica em um número ideal de recursos computacionais a partir do qual, o seu aumento não impactaria no ganho de desempenho, uma vez que, o tempo mínimo possível é o tempo de execução da tarefa que consome o maior tempo de processamento. Neste trabalho é proposta uma metodologia e ferramentas para paralelização de laços perfeitamente aninhados sem dependência de dados e com processamento heterogêneo em sistemas paralelos e distribuídos. A implementação da metodologia proposta em aplicações melhora o desempenho da execução e reduz a ociosidade dos recursos de processamento. Na metodologia proposta, alguns procedimentos são apoiados por ferramentas desenvolvidas para auxiliá-los. O sistema de processamento poderá ser: um computador Multicore, um Cluster real ou virtual alocado na nuvem. Resultados experimentais são apresentados neste trabalho. Tais resultados mostram a viabilidade e eficiência da metodologia proposta. / Applications may contain perfectly nested loops that demand high processing power. Many scientific applications contain nested loops in their structure. Such loops may process heterogeneous computations. One way to reduce the execution time of this class of applications is to parallelize these loops. 
The heterogeneity of the execution times of the computations in the iterations of perfectly nested loops demands a suitable parallelization that aims at a homogeneous load distribution among the computational resources, in order to reduce their idleness. This heterogeneity implies an ideal number of computational resources beyond which adding more would not improve performance, since the minimum possible time is the execution time of the task that takes the longest to process. This work proposes a methodology and tools for the parallelization of perfectly nested loops without data dependencies and with heterogeneous processing in parallel and distributed systems. Implementing the proposed methodology in applications improves execution performance and reduces the idleness of the processing resources. In the proposed methodology, some procedures are supported by tools developed to assist them. The processing system may be a multicore computer or a real or virtual cluster allocated in the cloud. Experimental results are presented in this work. These results show the feasibility and efficiency of the proposed methodology.
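The claim that there is an ideal number of resources, beyond which the longest task bounds the run time, can be illustrated with a toy scheduler. The sketch below is not the methodology of the thesis: it assumes a greedy longest-processing-time-first assignment, and the cost function `hypothetical_cost` and the loop sizes are invented for illustration.

```python
import heapq


def flatten(n_outer, n_inner):
    """Collapse a perfectly nested two-level loop into a list of (i, j) tasks."""
    return [(i, j) for i in range(n_outer) for j in range(n_inner)]


def lpt_makespan(costs, n_workers):
    """Greedy longest-processing-time-first schedule; returns the finish time."""
    loads = [0.0] * n_workers
    heapq.heapify(loads)
    for cost in sorted(costs, reverse=True):
        lightest = heapq.heappop(loads)      # least-loaded worker so far
        heapq.heappush(loads, lightest + cost)
    return max(loads)


def hypothetical_cost(i, j):
    """Made-up heterogeneous cost: one iteration is far heavier than the rest."""
    return 20.0 if (i, j) == (0, 0) else 1.0 + (i + j) % 3


tasks = flatten(4, 4)
costs = [hypothetical_cost(i, j) for i, j in tasks]
for workers in (1, 2, 4, 8):
    print(workers, lpt_makespan(costs, workers))
```

In this toy case the finish time drops from 50 to 25 when going from one to two workers, then stalls at 20, the cost of the heaviest iteration, so adding workers beyond that point only adds idle resources.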
|
45 |
VirD-GM: Uma Contribuição Para o Modelo de Distribuição e Paralelismo do Projeto D-GM / VIRD-GM: A CONTRIBUTION TO THE MODEL OF DISTRIBUTION AND PARALLELISM OF DE PROJECT D-GM
Fonseca, Vanessa Souza da 07 August 2008
This work describes the main contributions of VirD-GM (Virtual Distributed Geometric Machine Model) to the distribution and parallelism model of the D-GM Project (Distributed Geometric Machine Project). To make the abstractions of the GM (Geometric Machine) model available on a platform that supports distributed and/or concurrent execution, the EXEHDA middleware (Execution Environment for High Distributed Applications) is adopted as the execution support environment. The work made it possible to create and manage a parallel and distributed programming environment and to run, in this environment, applications developed in the visual environment VPE-GM (Visual Programming Environment for the Geometric Machine Model). These applications are parallel by nature and aimed at the study of parallel algorithms for Scientific Computing. The work focuses on the design and construction of the VirD-GM software architecture, which is responsible for managing the parallel computations obtained by applying the process constructors defined in the GM model. In this context, this dissertation not only made possible the construction of the structural view of the D-GM project but also consolidated its integration with the functional view, which is characterized by the extension of the VPE-GM environment, responsible for the development environment and code generation for the D-GM Project. The main contributions are: (i) formalization of the notions of intermittent concurrency and conflict, together with the notions of communication and synchronization of processes, directly related to the space-time structure of the GM model; (ii) modeling and implementation of the loading, management and control modules of VirD-GM; (iii) study, application and customization of the services provided by the EXEHDA middleware; (iv) implementation of the application, execution-environment support and basic-system layers; (v) data flow control and handling of the dependencies between concurrent computations through adjacency matrices, including the implementation of synchronization barriers to ensure correct execution. The prototyping of VirD-GM and the evaluation obtained through the development of test applications demonstrated the viability of the theoretical-practical approach proposed in the D-GM Project / Este trabalho descreve as principais contribuições da VirD-GM (Virtual Distributed Geometric Machine Model) para o modelo de distribuição e paralelismo do Projeto D-GM (Distributed Geometric Machine Project). Para disponibilizar as abstrações do modelo GM (Geometric Machine) em uma plataforma com suporte à execução distribuída e/ou concorrente, considera-se o middleware EXEHDA (Execution Environment for High Distributed Applications) como ambiente de suporte à execução. O trabalho possibilitou criar e gerenciar um ambiente de programação paralela e distribuída, bem como promover a execução, sob este ambiente, das aplicações desenvolvidas no ambiente visual VPE-GM (Visual Programming Environment for the Geometric Machine Model). Estas aplicações são, por natureza, paralelas e direcionadas ao estudo de algoritmos paralelos para a Computação Científica. O trabalho está centrado na concepção e construção da arquitetura de software da VirD-GM, responsável pelo gerenciamento das computações paralelas obtidas pela aplicação de construtores de processos definidos no modelo GM. Neste contexto, esta dissertação não só viabilizou a construção da visão estrutural do projeto D-GM como também consolidou sua integração com a visão funcional, caracterizada pela extensão do ambiente VPE-GM, responsável pelo ambiente de desenvolvimento e geração de código para o Projeto D-GM. Dentre as principais contribuições, destacam-se: (i) formalização das noções de concorrência e conflito intermitentes com as noções de comunicação e sincronização de processos, diretamente relacionadas com a estrutura espaço-temporal do modelo GM; (ii) definição compreendendo a modelagem e implementação dos módulos de carregamento, gerenciamento e controle da VirD-GM; (iii) estudo, aplicação e customização dos serviços disponibilizados pelo middleware EXEHDA; (iv) implementação das camadas de aplicação, de suporte ao ambiente de execução e de sistemas básicos; (v) controle do fluxo de dados e manipulação das dependências entre as computações concorrentes pelo uso de matrizes de adjacências, incluindo a implementação de barreiras de sincronização, garantindo a correta execução. A prototipação da VirD-GM e a avaliação obtida com o desenvolvimento de aplicações de teste demonstraram a viabilidade da abordagem teórica-prática proposta no Projeto D-GM
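Contribution (v) of the abstract above mentions handling dependencies between concurrent computations with adjacency matrices and synchronization barriers. A minimal sketch of that general idea follows; it is not the VirD-GM implementation. The convention `adj[i][j] == 1` meaning "process j depends on process i", the function name `barrier_stages` and the example matrix are all assumptions made for illustration.

```python
def barrier_stages(adj):
    """Group processes into stages separated by barriers.

    Assumed convention: adj[i][j] == 1 means process j depends on process i.
    Processes in the same stage have no unmet dependencies and may run
    concurrently; a barrier is implied between consecutive stages.
    """
    n = len(adj)
    remaining = set(range(n))
    done = set()
    stages = []
    while remaining:
        ready = sorted(
            j for j in remaining
            if all(i in done for i in range(n) if adj[i][j])
        )
        if not ready:
            raise ValueError("cyclic dependency: no process can start")
        stages.append(ready)
        done.update(ready)
        remaining.difference_update(ready)
    return stages


# Hypothetical dependency matrix: 0 -> 2, 1 -> 2, 1 -> 4, 2 -> 3.
adj = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(barrier_stages(adj))  # [[0, 1], [2, 4], [3]]
```

Each inner list is a set of processes that could be dispatched together, with a barrier before the next list starts.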
|
46 |
Técnicas de decomposição de domínio em computação paralela para simulação de campos eletromagnéticos pelo método dos elementos finitos / Domain decomposition and parallel processing techniques applied to the solution of systems of algebraic equations issued from the finite element analysis of eletromagnetic phenomena.
Palin, Marcelo Facio 18 June 2007
Este trabalho apresenta a aplicação de técnicas de Decomposição de Domínio e Processamento Paralelo na solução de grandes sistemas de equações algébricas lineares provenientes da modelagem de fenômenos eletromagnéticos pelo Método de Elementos Finitos. Foram implementadas as técnicas dos tipos Complemento de Schur e o Método Aditivo de Schwarz, adaptadas para a resolução desses sistemas em cluster de computadores do tipo Beowulf e com troca de mensagens através da Biblioteca MPI. A divisão e balanceamento de carga entre os processadores são feitos pelo pacote METIS. Essa metodologia foi testada acoplada a métodos, seja iterativo (ICCG), seja direto (LU) na etapa de resolução dos sistemas referentes aos nós internos de cada partição. Para a resolução do sistema envolvendo os nós de fronteira, no caso do Complemento de Schur, utilizou-se uma implementação paralelizada do Método de Gradientes Conjugados (PCG). São discutidos aspectos relacionados ao desempenho dessas técnicas quando aplicadas em sistemas de grande porte. As técnicas foram testadas na solução de problemas de aplicação do Método de Elementos Finitos na Engenharia Elétrica (Magnetostática, Eletrocinética e Magnetodinâmica), sejam eles de natureza bidimensional com malhas não estruturadas, seja tridimensional, com malhas estruturadas. / This work presents the application of Domain Decomposition and Parallel Processing techniques to the solution of large systems of linear algebraic equations arising from the Finite Element Analysis of electromagnetic phenomena. Both the Schur Complement and the Additive Schwarz techniques were implemented. They were adapted to solve the linear systems on Beowulf clusters, using the MPI library for message exchange. Partitioning and load balancing among the processors are done with the METIS package. The methodology was tested in combination with either an iterative (ICCG) or a direct (LU) method for solving the systems associated with the inner nodes of each partition. 
In the case of the Schur Complement, the system associated with the boundary nodes was solved with a parallelized Conjugate Gradient Method (PCG). Aspects of the performance of these techniques when applied to large-scale problems are also discussed. The techniques were tested on a collection of Electrical Engineering problems modeled by the Finite Element Method, both in two dimensions with unstructured meshes (Magnetostatics) and in three dimensions with structured meshes (Electrokinetics).
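The Schur Complement idea described above (solve each partition's inner nodes independently, then a smaller system on the boundary nodes) can be sketched on a toy problem. This is not the thesis code: it assumes a one-dimensional chain with matrix tridiag(-1, 2, -1) split at a single interface node, so the interface system is a scalar and is solved by division rather than by a parallel Conjugate Gradient; MPI and METIS are not used, and all function names are illustrative.

```python
def solve_local(rhs):
    """Direct solve of tridiag(-1, 2, -1) x = rhs by the Thomas algorithm."""
    n = len(rhs)
    diag = [2.0] * n
    r = list(rhs)
    for k in range(1, n):              # forward elimination
        m = -1.0 / diag[k - 1]
        diag[k] += m                   # diag[k] - m * upper, with upper = -1
        r[k] -= m * r[k - 1]
    x = [0.0] * n
    x[-1] = r[-1] / diag[-1]
    for k in range(n - 2, -1, -1):     # back substitution
        x[k] = (r[k] + x[k + 1]) / diag[k]
    return x


def schur_solve(b_left, b_iface, b_right):
    """Solve the chain system split into two subdomains and one interface node."""
    # Coupling of each subdomain to the interface: only the adjacent node, entry -1.
    c_left = [0.0] * (len(b_left) - 1) + [-1.0]
    c_right = [-1.0] + [0.0] * (len(b_right) - 1)
    # Independent interior solves; these are the parts that could run in parallel.
    w_left, g_left = solve_local(c_left), solve_local(b_left)
    w_right, g_right = solve_local(c_right), solve_local(b_right)
    # Schur complement S = A_BB - sum(A_BI * A_II^-1 * A_IB), a scalar here.
    s = 2.0 + w_left[-1] + w_right[0]
    # Condensed right-hand side g = b_B - sum(A_BI * A_II^-1 * b_I).
    g = b_iface + g_left[-1] + g_right[0]
    x_iface = g / s
    # Back-substitute the interface value into each subdomain.
    x_left = [gl - wl * x_iface for gl, wl in zip(g_left, w_left)]
    x_right = [gr - wr * x_iface for gr, wr in zip(g_right, w_right)]
    return x_left, x_iface, x_right


x_left, x_iface, x_right = schur_solve([1.0] * 3, 1.0, [1.0] * 3)
print(x_left, x_iface, x_right)
```

With more than one interface node, `s` becomes a matrix and the division becomes a linear solve, which is where an iterative method such as the conjugate gradient would enter.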
|
47 |
Metodologia e ferramentas para paralelização de laços perfeitamente aninhados com processamento heterogêneo. / Methodology and tools for parallelization of nested perfectly loops with heterogeneous processing.
Cleber Silva Ferreira da Luz 01 February 2018
Aplicações podem apresentar laços perfeitamente aninhados que demandam um alto poder de processamento. Diversas aplicações científicas contêm laços aninhados em suas estruturas. Tais laços podem processar computações heterogêneas. Uma solução para reduzir o tempo de execução desta classe de aplicações é a paralelização destes laços. A heterogeneidade dos tempos de execução de computações presentes nas iterações de laços perfeitamente aninhados demanda uma paralelização adequada visando uma distribuição de carga homogênea entre os recursos computacionais para reduzir a ociosidade de tais recursos. Esta heterogeneidade implica em um número ideal de recursos computacionais a partir do qual, o seu aumento não impactaria no ganho de desempenho, uma vez que, o tempo mínimo possível é o tempo de execução da tarefa que consome o maior tempo de processamento. Neste trabalho é proposta uma metodologia e ferramentas para paralelização de laços perfeitamente aninhados sem dependência de dados e com processamento heterogêneo em sistemas paralelos e distribuídos. A implementação da metodologia proposta em aplicações melhora o desempenho da execução e reduz a ociosidade dos recursos de processamento. Na metodologia proposta, alguns procedimentos são apoiados por ferramentas desenvolvidas para auxiliá-los. O sistema de processamento poderá ser: um computador Multicore, um Cluster real ou virtual alocado na nuvem. Resultados experimentais são apresentados neste trabalho. Tais resultados mostram a viabilidade e eficiência da metodologia proposta. / Applications may contain perfectly nested loops that demand high processing power. Many scientific applications contain nested loops in their structure. Such loops may process heterogeneous computations. One way to reduce the execution time of this class of applications is to parallelize these loops. 
The heterogeneity of the execution times of the computations in the iterations of perfectly nested loops demands a suitable parallelization that aims at a homogeneous load distribution among the computational resources, in order to reduce their idleness. This heterogeneity implies an ideal number of computational resources beyond which adding more would not improve performance, since the minimum possible time is the execution time of the task that takes the longest to process. This work proposes a methodology and tools for the parallelization of perfectly nested loops without data dependencies and with heterogeneous processing in parallel and distributed systems. Implementing the proposed methodology in applications improves execution performance and reduces the idleness of the processing resources. In the proposed methodology, some procedures are supported by tools developed to assist them. The processing system may be a multicore computer or a real or virtual cluster allocated in the cloud. Experimental results are presented in this work. These results show the feasibility and efficiency of the proposed methodology.
|
48 |
Técnicas de decomposição de domínio em computação paralela para simulação de campos eletromagnéticos pelo método dos elementos finitos / Domain decomposition and parallel processing techniques applied to the solution of systems of algebraic equations issued from the finite element analysis of eletromagnetic phenomena.
Marcelo Facio Palin 18 June 2007
Este trabalho apresenta a aplicação de técnicas de Decomposição de Domínio e Processamento Paralelo na solução de grandes sistemas de equações algébricas lineares provenientes da modelagem de fenômenos eletromagnéticos pelo Método de Elementos Finitos. Foram implementadas as técnicas dos tipos Complemento de Schur e o Método Aditivo de Schwarz, adaptadas para a resolução desses sistemas em cluster de computadores do tipo Beowulf e com troca de mensagens através da Biblioteca MPI. A divisão e balanceamento de carga entre os processadores são feitos pelo pacote METIS. Essa metodologia foi testada acoplada a métodos, seja iterativo (ICCG), seja direto (LU) na etapa de resolução dos sistemas referentes aos nós internos de cada partição. Para a resolução do sistema envolvendo os nós de fronteira, no caso do Complemento de Schur, utilizou-se uma implementação paralelizada do Método de Gradientes Conjugados (PCG). São discutidos aspectos relacionados ao desempenho dessas técnicas quando aplicadas em sistemas de grande porte. As técnicas foram testadas na solução de problemas de aplicação do Método de Elementos Finitos na Engenharia Elétrica (Magnetostática, Eletrocinética e Magnetodinâmica), sejam eles de natureza bidimensional com malhas não estruturadas, seja tridimensional, com malhas estruturadas. / This work presents the application of Domain Decomposition and Parallel Processing techniques to the solution of large systems of linear algebraic equations arising from the Finite Element Analysis of electromagnetic phenomena. Both the Schur Complement and the Additive Schwarz techniques were implemented. They were adapted to solve the linear systems on Beowulf clusters, using the MPI library for message exchange. Partitioning and load balancing among the processors are done with the METIS package. The methodology was tested in combination with either an iterative (ICCG) or a direct (LU) method for solving the systems associated with the inner nodes of each partition. 
In the case of the Schur Complement, the system associated with the boundary nodes was solved with a parallelized Conjugate Gradient Method (PCG). Aspects of the performance of these techniques when applied to large-scale problems are also discussed. The techniques were tested on a collection of Electrical Engineering problems modeled by the Finite Element Method, both in two dimensions with unstructured meshes (Magnetostatics) and in three dimensions with structured meshes (Electrokinetics).
|
49 |
Parallel Solution of Soil-Structure Interaction Problems on PC Clusters
Bahcecioglu, Tunc 01 February 2011
Numerical assessment of soil-structure interaction problems requires heavy computational effort because of the dynamic and iterative (nonlinear) nature of the problems. Furthermore, modeling soil-structure interaction may require
|
50 |
Algèbre linéaire exacte, parallèle, adaptative et générique / Adaptive and generic parallel exact linear algebra
Sultan, Ziad 17 June 2016
Les décompositions en matrices triangulaires sont une brique de base fondamentale en calcul algébrique. Elles sont utilisées pour résoudre des systèmes linéaires et calculer le rang, le déterminant, l'espace nul ou les profils de rang en ligne et en colonne d'une matrice. Le projet de cette thèse est de développer des implantations hautes performances parallèles de l'élimination de Gauss exacte sur des machines à mémoire partagée. Dans le but d'abstraire le code de l'environnement de calcul parallèle utilisé, un langage dédié PALADIn (Parallel Algebraic Linear Algebra Dedicated Interface) a été implanté et est basé essentiellement sur des macros C/C++. Ce langage permet à l'utilisateur d'écrire un code C++ et tirer parti d’exécutions séquentielles et parallèles sur des architectures à mémoires partagées en utilisant le standard OpenMP et les environnements parallèles KAAPI et TBB, ce qui lui permet de bénéficier d'un parallélisme de données et de tâches. Plusieurs aspects de l'algèbre linéaire exacte parallèle ont été étudiés. Nous avons construit de façon incrémentale des noyaux parallèles efficaces pour la multiplication de matrices et la résolution de systèmes triangulaires, au-dessus desquels plusieurs variantes de l'algorithme de décomposition PLUQ sont construites. Nous étudions la parallélisation de ces noyaux en utilisant plusieurs variantes algorithmiques itératives ou récursives et en utilisant des stratégies de découpe variées. Nous proposons un nouvel algorithme récursif de l'élimination de Gauss qui peut calculer simultanément les profils de rang en ligne et en colonne d'une matrice et de toutes ses sous-matrices principales, tout en étant un algorithme état de l'art de l'élimination de Gauss. Nous étudions aussi les conditions pour qu'un algorithme de l'élimination de Gauss révèle cette information en définissant un nouvel invariant matriciel, la matrice de profil de rang. 
/ Triangular matrix decompositions are fundamental building blocks in computational linear algebra. They are used to solve linear systems and to compute the rank, the determinant, the null space or the row and column rank profiles of a matrix. The goal of this PhD thesis is to develop high-performance shared-memory parallel implementations of exact Gaussian elimination. In order to abstract the computational code from the parallel programming environment, we developed a domain-specific language, PALADIn (Parallel Algebraic Linear Algebra Dedicated Interface), that is based on C/C++ macros. This domain-specific language allows the user to write C++ code and benefit from sequential and parallel executions on shared-memory architectures using the standard OpenMP, TBB and Kaapi parallel runtime systems, thus providing data and task parallelism. Several aspects of parallel exact linear algebra were studied. We incrementally build efficient parallel kernels for matrix multiplication and triangular system solving, on top of which several variants of the PLUQ decomposition algorithm are built. We study the parallelization of these kernels using several algorithmic variants, either iterative or recursive, and using different splitting strategies. We propose a recursive Gaussian elimination algorithm that can simultaneously compute the row and column rank profiles of a matrix as well as those of all of its leading submatrices, in the same time as state-of-the-art Gaussian elimination algorithms. We also study the conditions under which a Gaussian elimination algorithm reveals this information by defining a new matrix invariant, the rank profile matrix.
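To make the notion of a rank profile concrete, here is a small sequential sketch of exact Gaussian elimination over the integers modulo a prime. It is neither the recursive PLUQ algorithm proposed in the thesis nor the PALADIn interface; the function name `row_rank_profile`, the prime 7 and the example matrix are assumptions chosen for illustration. The modular inverse uses `pow(x, -1, p)`, which requires Python 3.8 or later.

```python
def row_rank_profile(matrix, p):
    """Indices of the first linearly independent rows of `matrix` over Z/pZ.

    Rows are scanned in order; a row joins the profile only if it is not a
    combination of the rows already kept (exact arithmetic modulo prime p).
    """
    basis = []    # (pivot_column, row scaled so that its pivot entry is 1)
    profile = []
    for index, row in enumerate(matrix):
        v = [x % p for x in row]
        for col, b in basis:           # eliminate against earlier pivots, in order
            if v[col]:
                f = v[col]
                v = [(x - f * y) % p for x, y in zip(v, b)]
        pivot = next((c for c, x in enumerate(v) if x), None)
        if pivot is not None:          # nonzero remainder: the row is independent
            inv = pow(v[pivot], -1, p)
            basis.append((pivot, [(x * inv) % p for x in v]))
            profile.append(index)
    return profile


m = [
    [0, 0, 0],
    [1, 2, 3],
    [2, 4, 6],   # twice the previous row, so it is skipped
    [0, 1, 1],
]
transpose = [list(col) for col in zip(*m)]
print(row_rank_profile(m, 7))          # [1, 3]
print(row_rank_profile(transpose, 7))  # [0, 1]
```

The second call computes the column rank profile as the row rank profile of the transpose; both profiles have length 2, the rank of the matrix.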
|