291

Approche parcimonieuse et calcul haute performance pour la tomographie itérative régularisée. / Computationally Efficient Sparse Prior in Regularized Iterative Tomographic Reconstruction

Notargiacomo, Thibault 14 February 2017 (has links)
X-ray computed tomography (CT) is a technique for reconstructing a map of the physical properties of the interior of an object from a set of exterior projection measurements. Although CT is a mature technology, most of the algorithms used for image reconstruction in commercial products are based on analytical methods such as filtered back-projection. The main idea of this thesis is to exploit the latest advances in applied mathematics and computer science in order to study, design, and implement algorithms dedicated to 3D cone-beam reconstruction from X-ray flat-panel detectors, targeting clinically relevant use cases such as low-dose and few-view acquisitions. In this work, we studied various strategies for modelling the tomographic operators and their implementation on a multi-GPU server, and we proposed the use of a 3D complex wavelet transform to regularize the inverse problem.
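As a rough illustration of how a sparse prior enters iterative reconstruction, the sketch below runs ISTA-style iterations on a toy least-squares problem: a gradient step on the data-fidelity term followed by soft-thresholding, the proximal operator of the l1 norm. The tiny dense matrix stands in for the tomographic projector, and the threshold is applied directly to the unknowns rather than to wavelet coefficients; both are simplifying assumptions, not the thesis's actual pipeline.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Soft-thresholding: prox of the l1 norm, sign(v) * max(|v| - t, 0).
static double softThreshold(double v, double t) {
    if (v >  t) return v - t;
    if (v < -t) return v + t;
    return 0.0;
}

int main() {
    // Toy problem: min_x 0.5*||A x - b||^2 + lambda*||x||_1,
    // with a tiny dense A standing in for the tomographic projector.
    const std::vector<std::vector<double>> A = {{1.0, 0.2}, {0.1, 0.9}, {0.3, 0.4}};
    const std::vector<double> b = {1.1, 0.95, 0.5};
    std::vector<double> x = {0.0, 0.0};
    const double lambda = 0.05, step = 0.5;  // step < 1/||A^T A|| for convergence

    for (int it = 0; it < 200; ++it) {
        // Residual r = A x - b.
        std::vector<double> r(b.size(), 0.0);
        for (size_t i = 0; i < A.size(); ++i) {
            for (size_t j = 0; j < x.size(); ++j) r[i] += A[i][j] * x[j];
            r[i] -= b[i];
        }
        // Gradient step on the data term, then the sparsity-inducing prox.
        std::vector<double> g(x.size(), 0.0);
        for (size_t i = 0; i < A.size(); ++i)
            for (size_t j = 0; j < x.size(); ++j) g[j] += A[i][j] * r[i];
        for (size_t j = 0; j < x.size(); ++j)
            x[j] = softThreshold(x[j] - step * g[j], step * lambda);
    }
    std::printf("x = (%f, %f)\n", x[0], x[1]);
    return 0;
}
```

In a wavelet-regularized reconstruction, the thresholding step would act on transform coefficients (here it acts on the image itself purely for brevity).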
292

Vers une simulation par éléments finis en temps réel pour le génie électrique / Towards a real-time simulation by finite elements for electrical engineering

Dinh, Van Quang 15 December 2016 (has links)
The physical phenomena of electrical engineering are governed by Maxwell's equations, partial differential equations whose solutions are functions that depend on the material properties and satisfy the boundary conditions of the study domain. The finite element method (FEM) is the most commonly used method for computing the solutions of these equations and deducing the magnetic and electric fields. Nowadays, parallel computing on graphics processors offers very high computing performance compared with traditional CPU computation. GPU-accelerated computing uses a graphics processing unit (GPU) together with a CPU to accelerate applications in science and engineering: it massively parallelizes tasks and boosts performance by offloading the compute-intensive portions of the application to the GPU while the remainder still runs on the CPU. This thesis deals with modelling in electrical engineering using the finite element method. Its aim is to improve the performance of the FEM, and even to change how it is used, by taking advantage of high-performance parallel computing on the GPU. Indeed, if the calculation could be performed in near real time, simulation tools would become intuitive design tools, allowing one, for example, to "feel" the sensitivity of a design to modifications of geometric or physical parameters. A new field of use for simulation codes would then open. This is the thread of this work, which tries to accelerate the different phases of an FEM simulation to make the whole nearly instantaneous. Thus, meshing, numerical integration, assembly, resolution, and post-processing are addressed in turn. For each phase, the methods in the literature are examined and new approaches are proposed. The performance achieved is analyzed and compared with a traditional CPU implementation, and the implementation details are described quite precisely, since the overall performance of the GPU approaches is closely tied to these choices.
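Of the phases listed above, assembly illustrates the GPU mapping most directly: every element stiffness matrix can be computed independently. The sketch below, a toy 1D Poisson assembly in plain C++, marks the loop that would become a one-thread-per-element GPU kernel; the remark about atomic scatter-adds reflects a standard concern with this mapping, not this thesis's specific scheme.

```cpp
#include <cstdio>
#include <vector>

// Toy 1D Poisson assembly on [0,1] with n linear elements.
// Each element stiffness is computed independently -- exactly the kind of
// per-element work that maps one-thread-per-element onto a GPU.
int main() {
    const int n = 8;                 // number of elements
    const double h = 1.0 / n;        // uniform element size
    std::vector<double> K(static_cast<size_t>(n + 1) * (n + 1), 0.0); // dense for clarity

    // "Kernel" body: element e contributes ke = (1/h) * [[1,-1],[-1,1]]
    // to global rows/columns e and e+1. On a GPU this loop becomes the
    // kernel, and the scatter-add below needs atomic adds (or mesh
    // coloring) to avoid write conflicts between neighboring elements.
    for (int e = 0; e < n; ++e) {
        const double ke[2][2] = {{1.0 / h, -1.0 / h}, {-1.0 / h, 1.0 / h}};
        for (int a = 0; a < 2; ++a)
            for (int b = 0; b < 2; ++b)
                K[static_cast<size_t>(e + a) * (n + 1) + (e + b)] += ke[a][b];
    }
    std::printf("K[1][1] = %g (expect 2/h = %g)\n", K[1 * (n + 1) + 1], 2.0 / h);
    return 0;
}
```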
293

Non-oscillatory forward-in-time method for incompressible flows

Cao, Zhixin January 2018 (has links)
This research extends the capabilities of Non-oscillatory Forward-in-Time (NFT) solvers operating on unstructured meshes to allow for accurate simulation of incompressible turbulent flows. This is achieved by developing Large Eddy Simulation (LES) and Detached Eddy Simulation (DES) methodologies and a parallel option for the flow solver. The effective use of LES and DES requires the development of a subgrid-scale model; several subgrid-scale models are implemented and studied, and their efficacy is assessed. The NFT solvers employed in this work are based on the Multidimensional Positive Definite Advection Transport Algorithm (MPDATA), which facilitates a novel implicit Large Eddy Simulation (ILES) approach to treating turbulence. The flexibility and robustness of the new NFT MPDATA solver are studied and successfully validated using well-established benchmarks, concentrating on the flow past a sphere. The flow statistics from the solutions are compared against existing experimental and numerical data and fully confirm the validity of the approach. The parallel implementation of the flow solver is also documented and verified, showing a substantial speedup of computations. The proposed method lays the foundations for further studies and developments, especially for exploring the potential of MPDATA in the context of ILES and the associated treatment of boundary conditions at solid boundaries.
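For readers unfamiliar with MPDATA, the following is a minimal 1D, constant-velocity sketch of its basic two-pass structure: a first-order donor-cell (upwind) pass, then a corrective pass that advects with an antidiffusive pseudo-velocity built from the first-pass field. The periodic toy setup and the constants are assumptions for illustration; the thesis's solver operates on unstructured 3D meshes.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Donor-cell (upwind) flux through a face with local Courant number c.
static double flux(double psiL, double psiR, double c) {
    return std::max(c, 0.0) * psiL + std::min(c, 0.0) * psiR;
}

int main() {
    const int n = 64;
    const double c = 0.4;      // Courant number of the constant advecting flow
    const double eps = 1e-15;  // guards the pseudo-velocity against 0/0
    std::vector<double> psi(n, 0.0), tmp(n), cor(n);
    for (int i = 16; i < 32; ++i) psi[i] = 1.0;   // positive-definite box signal

    for (int step = 0; step < 80; ++step) {
        // Pass 1: first-order upwind on a periodic domain.
        for (int i = 0; i < n; ++i) {
            int im = (i - 1 + n) % n, ip = (i + 1) % n;
            tmp[i] = psi[i] - (flux(psi[i], psi[ip], c) - flux(psi[im], psi[i], c));
        }
        // Pass 2: antidiffusive pseudo-velocity at face i+1/2 reverses the
        // leading truncation error of pass 1 while preserving sign.
        for (int i = 0; i < n; ++i) {
            int ip = (i + 1) % n;
            cor[i] = (std::fabs(c) - c * c) * (tmp[ip] - tmp[i]) / (tmp[ip] + tmp[i] + eps);
        }
        for (int i = 0; i < n; ++i) {
            int im = (i - 1 + n) % n, ip = (i + 1) % n;
            psi[i] = tmp[i] - (flux(tmp[i], tmp[ip], cor[i]) - flux(tmp[im], tmp[i], cor[im]));
        }
    }
    double sum = 0.0; for (double v : psi) sum += v;
    std::printf("mass after 80 steps = %f (expect 16)\n", sum);
    return 0;
}
```

Because both passes are written in flux form, mass is conserved exactly, which is one reason the scheme is attractive as a building block for ILES.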
294

[en] APPLICATION OF THE OBJECT-ORIENTED PROGRAMMING AND DISTRIBUTED COMPUTING TO THE STRUCTURAL ANALYSIS BY THE FINITE ELEMENT METHOD / [pt] APLICAÇÃO DA PROGRAMAÇÃO ORIENTADA A OBJETOS E DA COMPUTAÇÃO DISTRIBUÍDA AO MEF PARA ANÁLISE DE ESTRUTURAS

MARCELO RODRIGUES LEAO SILVA 08 March 2006 (has links)
This work presents a methodology for the analysis of structures based on the Finite Element Method (FEM), implemented using object-oriented programming techniques together with distributed computing. The use of object-oriented programming allows the implementation of compact, portable, and easily adaptable code. The implementation was carried out in C++, which offers the key features of object-oriented programming, notably inheritance, polymorphism, and operator overloading, together with the MPI library for parallel computing. The procedures required for an object-oriented implementation of structural analysis by the finite element method are presented first, followed by the modifications needed to include parallel processing, using two parallelization strategies. The large number of matrix operations involved in structural analysis by the finite element method also motivated the development of a class library to represent these operations. The examples presented verify the accuracy of the results obtained with the implemented code and demonstrate the advantages of employing object-oriented programming and distributed computing.
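As a hedged sketch of the design the abstract describes, the C++ fragment below shows the three language features named there working together: an abstract element class (inheritance and polymorphism) and a small matrix type with operator overloading, consumed by a type-agnostic assembly loop. The class and member names are invented for illustration and are not the thesis's actual library.

```cpp
#include <cstdio>
#include <memory>
#include <vector>

// Minimal 2x2 matrix with operator overloading, in the spirit of the
// matrix class library described in the abstract (illustrative only).
struct Mat2 {
    double a[2][2]{};
    Mat2 operator+(const Mat2& o) const {
        Mat2 r;
        for (int i = 0; i < 2; ++i)
            for (int j = 0; j < 2; ++j) r.a[i][j] = a[i][j] + o.a[i][j];
        return r;
    }
};

// Abstract element: polymorphism lets the assembly loop ignore element type.
class Element {
public:
    virtual ~Element() = default;
    virtual Mat2 stiffness() const = 0;   // each element type provides its own
};

class Spring : public Element {
    double k_;
public:
    explicit Spring(double k) : k_(k) {}
    Mat2 stiffness() const override { return {{{k_, -k_}, {-k_, k_}}}; }
};

int main() {
    std::vector<std::unique_ptr<Element>> mesh;
    mesh.push_back(std::make_unique<Spring>(10.0));
    mesh.push_back(std::make_unique<Spring>(20.0));

    Mat2 sum{};                            // stand-in for global assembly
    for (const auto& e : mesh) sum = sum + e->stiffness();
    std::printf("sum.a[0][0] = %g\n", sum.a[0][0]);  // 30
    return 0;
}
```

In the distributed version described by the abstract, a loop like this would run on each MPI rank over its local partition of elements.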
295

Parallel Optimization of Polynomials for Large-scale Problems in Stability and Control

January 2016 (has links)
In this thesis, we focus on some of the NP-hard problems in control theory. Thanks to converse Lyapunov theory, these problems can often be modeled as optimization over polynomials. To avoid intractability, we establish a trade-off between accuracy and complexity. In particular, we develop a sequence of tractable optimization problems, in the form of Linear Programs (LPs) and/or Semi-Definite Programs (SDPs), whose solutions converge to the exact solution of the NP-hard problem. However, the computational and memory complexity of these LPs and SDPs grows exponentially as the sequence progresses, meaning that improving the accuracy of the solutions requires solving SDPs with tens of thousands of decision variables and constraints. Setting up and solving such problems is a significant challenge. Existing optimization algorithms and software are designed for desktop computers or small clusters, machines which do not have sufficient memory for solving such large SDPs; moreover, the speed-up of these algorithms does not scale beyond dozens of processors. This is why we seek parallel algorithms for setting up and solving large SDPs on large clusters and/or supercomputers. We propose parallel algorithms for stability analysis of two classes of systems: 1) linear systems with a large number of uncertain parameters; 2) nonlinear systems defined by polynomial vector fields. First, we develop a distributed parallel algorithm which applies Polya's and/or Handelman's theorems to variants of parameter-dependent Lyapunov inequalities with parameters defined over the standard simplex. The result is a sequence of SDPs that possess a block-diagonal structure. We then develop a parallel SDP solver which exploits this structure to map the computation, memory, and communication to a distributed parallel environment. Numerical tests on a supercomputer demonstrate the ability of the algorithm to efficiently utilize hundreds, and potentially thousands, of processors and to analyze systems with 100+-dimensional state spaces. Furthermore, we extend our algorithms to analyze robust stability over more complicated geometries such as hypercubes and arbitrary convex polytopes. Our algorithms can readily be extended to address a wide variety of problems in control, such as H-infinity synthesis for systems with parametric uncertainty and computing control Lyapunov functions. / Dissertation/Thesis / Doctoral Dissertation Mechanical Engineering 2016
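The block-diagonal structure mentioned above is what makes the distributed mapping natural: each diagonal block can be factorized independently. The sketch below illustrates the principle with per-block positive-definiteness tests (Cholesky) on separate threads; a real SDP solver does far more per block, and the matrices here are toy data.

```cpp
#include <cmath>
#include <cstdio>
#include <thread>
#include <vector>

// Cholesky-based test: a symmetric block is positive definite iff the
// factorization succeeds with strictly positive pivots.
static bool isPosDef(std::vector<double> a, int n) {
    for (int j = 0; j < n; ++j) {
        double d = a[j * n + j];
        for (int k = 0; k < j; ++k) d -= a[j * n + k] * a[j * n + k];
        if (d <= 0.0) return false;
        d = std::sqrt(d);
        a[j * n + j] = d;
        for (int i = j + 1; i < n; ++i) {
            double s = a[i * n + j];
            for (int k = 0; k < j; ++k) s -= a[i * n + k] * a[j * n + k];
            a[i * n + j] = s / d;
        }
    }
    return true;
}

int main() {
    // Two diagonal blocks of a block-diagonal LMI constraint: feasibility of
    // the whole constraint is the AND of per-block feasibility, so the blocks
    // can be tested (and, in a real solver, factorized) in parallel.
    std::vector<std::vector<double>> blocks = {
        {4, 1, 1, 3},                    // 2x2, positive definite
        {2, 0, 0, 0, 2, 0, 0, 0, -1},    // 3x3, indefinite
    };
    std::vector<int> sizes = {2, 3};
    std::vector<char> ok(blocks.size(), 0);  // char, not bool: element-wise writes stay race-free

    std::vector<std::thread> pool;
    for (size_t b = 0; b < blocks.size(); ++b)
        pool.emplace_back([&, b] { ok[b] = isPosDef(blocks[b], sizes[b]); });
    for (auto& t : pool) t.join();

    for (size_t b = 0; b < blocks.size(); ++b)
        std::printf("block %zu: %s\n", b, ok[b] ? "PD" : "not PD");
    return 0;
}
```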
296

On Scalable Reconfigurable Component Models for High-Performance Computing / Modèles à composants reconfigurables et passant à l'échelle pour le calcul haute performance

Lanore, Vincent 10 December 2015 (has links)
Component-based programming is a programming paradigm that eases code reuse and separation of concerns. Component models that are said to be "reconfigurable" allow the structure of an application to be modified at runtime. However, these models are not suited to high-performance computing (HPC), as they rely on non-scalable mechanisms. The goal of this thesis is to provide models, algorithms, and tools to ease the development of component-based reconfigurable HPC applications. The main contribution of the thesis is the formal component model DirectMOD, which eases the development and reuse of distributed transformation code. To build on this core model, we have also proposed: the formal model SpecMOD, which allows automatic specialization of hierarchical component assemblies in order to provide high-level software engineering features; and efficient fine-grain reconfiguration mechanisms for AMR applications, an important application class in HPC. An implementation of DirectMOD, called DirectL2C, has been developed and used to implement a series of AMR-based benchmarks to evaluate our approach. Experiments on compute clusters and a supercomputer show that our approach scales. Moreover, a quantitative analysis of the code produced shows that our approach is compact and eases reuse.
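To make the notion of runtime reconfiguration concrete, here is a deliberately minimal C++ sketch of the general idea behind reconfigurable component models: a client's "uses" port is rebound to a different provider while the application runs. All names are illustrative; this is not DirectMOD's actual interface, which additionally handles distribution and transformation protocols.

```cpp
#include <cstdio>
#include <memory>

// A component exposes a service interface (a "provides" port type)...
struct Compute {
    virtual ~Compute() = default;
    virtual double run(double x) = 0;
};

struct FastSolver : Compute { double run(double x) override { return 2 * x; } };
struct AccurateSolver : Compute { double run(double x) override { return 2 * x + 0.001; } };

// ...and a client holds a rebindable "uses" port to some provider.
struct Client {
    std::shared_ptr<Compute> port;
    double step(double x) { return port->run(x); }
};

int main() {
    Client c;
    c.port = std::make_shared<FastSolver>();      // initial assembly
    std::printf("%f\n", c.step(1.0));
    c.port = std::make_shared<AccurateSolver>();  // runtime reconfiguration
    std::printf("%f\n", c.step(1.0));
    return 0;
}
```

The HPC difficulty the thesis targets is precisely that such rewiring, trivial in one process, must be coordinated scalably across many distributed component instances.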
297

Paralelismo em visão natural e artificial / Parallelism in natural and artificial vision

Odemir Martinez Bruno 16 June 2000 (has links)
This thesis addresses, in an integrated way, the concept and usage of parallelism in natural and artificial vision, with critical discussions of the related areas. It starts by reviewing parallelism in the primate visual system and discussing how its principles and solutions can motivate and be extended to artificial vision systems. One of the main objectives is to supply the parallelism backbone for the development of the Cyvis-1 project, a proposal of the Cybernetic Vision Research Group (IFSC-USP) for versatile vision with a strong biological motivation, especially regarding the primate visual cortex. To this end, the CVMP (Cybernetic Vision Message Passage) was introduced and implemented: a set of simple and friendly tools for developing parallel computer vision applications on both distributed systems and multiprocessor machines, based on object-oriented programming, human-machine interaction, software engineering, and visual programming. The CVMP is tested, evaluated, and validated with respect to functionality and usability through the parallel implementation of several computer vision and image processing algorithms (local operators, the Hough transform, and the Fourier transform, among others) which, in addition to illustrating the tools, are discussed in terms of architecture and load balancing. Three applications of parallel computer vision systems to real situations are presented and implemented using the CVMP, demonstrating the effectiveness of the tools for parallel implementation, usability, and cooperative work. Two of these applications (the integration of visual attributes in Cyvis-1 and a complexity model based on human perception) were developed in collaboration with other researchers of the Cybernetic Vision Research Group. The third application presents the author's proposal for an automatic recognition system for arboreal plants (botany).
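Local operators, mentioned among the parallelized algorithms, decompose naturally by image region. The sketch below shows the standard row-band split for a 3x3 mean filter using C++ threads on shared memory; it is a generic illustration of the data decomposition and static load balancing involved, not CVMP code, which also targets distributed systems.

```cpp
#include <algorithm>
#include <cstdio>
#include <thread>
#include <vector>

// Row-band decomposition of a 3x3 mean filter: each thread processes a
// contiguous band of rows -- the classic data-parallel split for local
// image operators.
int main() {
    const int W = 256, H = 256, nThreads = 4;
    std::vector<float> in(static_cast<size_t>(W) * H, 1.0f), out(in.size(), 0.0f);

    auto worker = [&](int r0, int r1) {
        for (int y = r0; y < r1; ++y)
            for (int x = 0; x < W; ++x) {
                float s = 0.0f; int cnt = 0;
                for (int dy = -1; dy <= 1; ++dy)
                    for (int dx = -1; dx <= 1; ++dx) {
                        int yy = y + dy, xx = x + dx;
                        if (yy < 0 || yy >= H || xx < 0 || xx >= W) continue;
                        s += in[static_cast<size_t>(yy) * W + xx]; ++cnt;
                    }
                // Threads only read 'in' and write disjoint rows of 'out',
                // so no locking is needed.
                out[static_cast<size_t>(y) * W + x] = s / cnt;
            }
    };

    std::vector<std::thread> pool;
    const int band = (H + nThreads - 1) / nThreads;   // static load balancing
    for (int t = 0; t < nThreads; ++t) {
        int r0 = t * band, r1 = std::min(H, r0 + band);
        pool.emplace_back(worker, r0, r1);
    }
    for (auto& t : pool) t.join();
    std::printf("out center = %f\n", out[static_cast<size_t>(H / 2) * W + W / 2]);  // 1.0
    return 0;
}
```

Global operators such as the Hough or Fourier transforms need more elaborate decompositions, which is where load-balancing discussions like the thesis's become essential.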
298

Paralelização do cálculo de estruturas de bandas de semicondutores usando o High Performance Fortran / Parallelization of semiconductor band structure calculations using High Performance Fortran

Rodrigo Daniel Malara 14 January 2005 (has links)
The use of multiprocessor systems to solve problems that demand great computational power has become more and more common, but the conversion of sequential programs into concurrent ones is still not a trivial task. Among the factors that make this task difficult, we highlight the absence of a single, consolidated paradigm for building parallel computer systems and the existence of several programming platforms for developing concurrent programs. Nowadays it is still impossible to exempt the programmer from specifying how the problem will be partitioned among the processors. For a parallel program to be efficient, the programmer must know in depth the principles behind parallel hardware, the architecture on which the software will run, and the chosen concurrent programming platform. This cannot yet be changed; the gain to be had lies in the implementation of the parallel software. That task can be laborious and demand a great deal of debugging time, because the programming platforms do not let the programmer abstract away the hardware. There has been a great effort to create tools that optimize this task, allowing the programmer to express the parallelization of a program more easily and succinctly. The present work evaluates the aspects involved in implementing concurrent software using a portability platform called High Performance Fortran, applied to a specific physics problem: the calculation of the band structure of semiconductor heterostructures. The outcome of using this platform was positive. We obtained a performance gain higher than expected and found that the compiler can be even more efficient than the programmer at parallelizing a program. The initial development cost was not very high and can be amortized over future projects that draw on this knowledge, since after the learning phase the parallelization of programs becomes quick and practical. The chosen platform does not allow the parallelization of every kind of problem, only those that follow the data-parallelism paradigm, which represent a considerable share of typical physics problems.
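The data-parallelism paradigm the abstract ends on is simply the same operation applied independently across the elements of a distributed array. The C++ sketch below mimics, with threads, a BLOCK distribution of k-points for a toy single-band tight-binding dispersion E(k) = -2t cos(ka); the per-k independence is what makes band-structure sweeps fit the paradigm so well. The physics here is a textbook stand-in, not the thesis's heterostructure model.

```cpp
#include <cmath>
#include <cstdio>
#include <thread>
#include <vector>

// Data parallelism in the HPF sense: one computation applied independently
// to every element of a distributed array. Here, E(k) = -2t*cos(k*a) per
// k-point stands in for the independent per-k diagonalizations of a
// band-structure code.
int main() {
    const int nk = 1000;
    const double t = 1.0, a = 1.0, pi = 3.14159265358979323846;
    std::vector<double> E(nk);

    const int nThreads = 4;
    std::vector<std::thread> pool;
    for (int w = 0; w < nThreads; ++w)
        pool.emplace_back([&, w] {
            // BLOCK-style split of the k index, as an HPF
            // !HPF$ DISTRIBUTE (BLOCK) directive would arrange.
            for (int i = w * nk / nThreads; i < (w + 1) * nk / nThreads; ++i) {
                double k = -pi + 2.0 * pi * i / (nk - 1);
                E[i] = -2.0 * t * std::cos(k * a);
            }
        });
    for (auto& th : pool) th.join();

    std::printf("E(k~0) = %f (band bottom, expect ~ -2)\n", E[(nk - 1) / 2]);
    return 0;
}
```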
299

Implementação e análise de ferramentas de química computacional aplicadas ao desenvolvimento de processos / Implementation and analysis of computational chemistry tools applied to process development

Pinto, Jefferson Ferreira, 1972- 22 February 2006 (has links)
Advisor: Rubens Maciel Filho / Doctoral thesis - Universidade Estadual de Campinas, Faculdade de Engenharia Química / Industries have been changing profoundly in recent years, mainly to reduce energy consumption, improve product quality, and comply with environmental laws. These changes can be assisted by modeling and simulation techniques, including the detailing of the model at the atomic level, in which case it is called computational chemistry. Several tools covering all areas of computational chemistry, mostly free or in the public domain, were installed on a microcomputer and analyzed for application in process development. Computational performance as a function of the operating system was also analyzed, showing differences of up to 353% for floating-point calculations, 18% for RAM access, and 67% for disk access. To improve computational performance, a high-performance parallel computing environment was designed, with cost limited to the acquisition of hardware readily available on the market, since the software used is free or in the public domain. / Doctorate / Chemical Process Development / Doctor of Chemical Engineering
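Performance differences like those cited (353% in floating point, 18% in RAM access, 67% in disk access) come from micro-benchmarks. The sketch below shows the skeleton of the floating-point case in C++: time a fixed amount of arithmetic with a steady clock. The constants are arbitrary, and a meaningful OS-to-OS comparison would add repetitions, pinned CPU frequency, and separate memory and disk tests.

```cpp
#include <chrono>
#include <cstdio>

// Minimal floating-point micro-benchmark: time a fixed amount of
// arithmetic and report elapsed wall-clock time.
int main() {
    const long N = 100000000L;
    volatile double acc = 0.0;   // volatile keeps the loop from being optimized away
    auto t0 = std::chrono::steady_clock::now();
    for (long i = 0; i < N; ++i) acc = acc + 1e-9 * i;
    auto t1 = std::chrono::steady_clock::now();
    double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    std::printf("%.1f ms, acc = %f\n", ms, (double)acc);
    return 0;
}
```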
300

The hierarchical preconditioning having unstructured three-dimensional grids

Globisch, Gerhard 09 September 2005 (has links) (PDF)
Continuing the previous work done for the 2D approach in preprint 97-11, in this paper we describe the Yserentant-preconditioned conjugate gradient method as well as the BPX-preconditioned CG iteration for rapidly solving 3D elliptic boundary value problems on unstructured quasi-uniform grids. These artificially constructed hierarchical methods have optimal computational costs. In the sequential case, several numerical examples demonstrate their efficiency, independent of the finite element types used for the discretization of the original potential problem. Moreover, first results for a parallel implementation of the methods are given.
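Both methods named are preconditioners plugged into the conjugate gradient iteration. The sketch below gives a self-contained preconditioned CG in C++ where the preconditioner is passed as a callable: a Yserentant or BPX operator would go where the simple Jacobi (diagonal) stand-in is supplied, and the 3x3 SPD system is toy data.

```cpp
#include <cmath>
#include <cstdio>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Preconditioned CG: the preconditioner is a pluggable operator z = C^{-1} r.
static Vec pcg(const std::vector<Vec>& A, const Vec& b,
               const std::function<Vec(const Vec&)>& precond, int maxIt) {
    const size_t n = b.size();
    Vec x(n, 0.0), r = b, z = precond(r), p = z;
    double rz = dot(r, z);
    for (int it = 0; it < maxIt && std::sqrt(dot(r, r)) > 1e-12; ++it) {
        Vec Ap(n, 0.0);                          // Ap = A * p
        for (size_t i = 0; i < n; ++i)
            for (size_t j = 0; j < n; ++j) Ap[i] += A[i][j] * p[j];
        double alpha = rz / dot(p, Ap);
        for (size_t i = 0; i < n; ++i) { x[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
        z = precond(r);                          // apply C^{-1}
        double rzNew = dot(r, z);
        for (size_t i = 0; i < n; ++i) p[i] = z[i] + (rzNew / rz) * p[i];
        rz = rzNew;
    }
    return x;
}

int main() {
    std::vector<Vec> A = {{4, 1, 0}, {1, 3, 1}, {0, 1, 2}};  // SPD test matrix
    Vec b = {1, 2, 3};
    auto jacobi = [&](const Vec& r) {            // diagonal preconditioner stand-in
        Vec z(r.size());
        for (size_t i = 0; i < r.size(); ++i) z[i] = r[i] / A[i][i];
        return z;
    };
    Vec x = pcg(A, b, jacobi, 100);
    std::printf("x = (%f, %f, %f)\n", x[0], x[1], x[2]);
    return 0;
}
```

The optimality claim in the abstract is exactly about this interface: with hierarchical preconditioners such as BPX, the iteration count stays (nearly) bounded as the grid is refined, so total cost grows only linearly with problem size.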
