81

Adaptive Solvers for High-Dimensional PDE Problems on Clusters of Multicore Processors

Grandin, Magnus January 2014 (has links)
Accurate numerical solution of time-dependent, high-dimensional partial differential equations (PDEs) usually requires efficient numerical techniques and massive-scale parallel computing. In this thesis, we implement and evaluate discretization schemes suited for PDEs of higher dimensionality, focusing on high order of accuracy and low computational cost. Spatial discretization is particularly challenging in higher dimensions. The memory requirements for uniform grids quickly grow out of reach even on large-scale parallel computers. We utilize high-order discretization schemes and implement adaptive mesh refinement on structured hyperrectangular domains in order to reduce the required number of grid points and computational work. We allow for anisotropic (non-uniform) refinement by recursive bisection and show how to construct, manage and load balance such grids efficiently. In our numerical examples, we use finite difference schemes to discretize the PDEs. In the adaptive case we show how a stable discretization can be constructed using SBP-SAT operators. However, our adaptive mesh framework is general and other methods of discretization are viable. For integration in time, we implement exponential integrators based on the Lanczos/Arnoldi iterative schemes for eigenvalue approximations. Using adaptive time stepping and a truncated Magnus expansion, we attain high levels of accuracy in the solution at low computational cost. We further investigate alternative implementations of the Lanczos algorithm with reduced communication costs. As an example application problem, we have considered the time-dependent Schrödinger equation (TDSE). We present solvers and results for the solution of the TDSE on equidistant as well as adaptively refined Cartesian grids.
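To illustrate the time integration idea sketched above, the following minimal NumPy example approximates the action of the matrix exponential on a state vector in a small Krylov subspace built by the Lanczos iteration. The function name, the fixed subspace dimension and the breakdown tolerance are illustrative placeholders; the thesis combines this building block with adaptive time stepping and a truncated Magnus expansion, which this sketch omits.

```python
import numpy as np

def lanczos_expm_apply(H, v, dt, m=20):
    """Approximate exp(-1j*dt*H) @ v in an m-dimensional Krylov subspace.

    H is Hermitian (a dense array or anything supporting @); v is the state
    vector. This is a sketch of the idea, not the thesis implementation.
    """
    n = v.shape[0]
    V = np.zeros((n, m), dtype=complex)   # orthonormal Krylov basis
    alpha = np.zeros(m)                   # diagonal of the tridiagonal T
    beta = np.zeros(max(m - 1, 1))        # off-diagonal of T
    nrm = np.linalg.norm(v)
    V[:, 0] = v / nrm
    k = m
    for j in range(m):
        w = H @ V[:, j]
        alpha[j] = np.real(np.vdot(V[:, j], w))
        w = w - alpha[j] * V[:, j]
        if j > 0:
            w = w - beta[j - 1] * V[:, j - 1]
        if j == m - 1:
            break
        beta[j] = np.linalg.norm(w)
        if beta[j] < 1e-12:               # "happy breakdown": subspace is invariant
            k = j + 1
            break
        V[:, j + 1] = w / beta[j]
    T = np.diag(alpha[:k]) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
    evals, evecs = np.linalg.eigh(T)      # T is real symmetric tridiagonal
    phi = evecs @ (np.exp(-1j * dt * evals) * evecs[0])  # exp(-i*dt*T) @ e1
    return nrm * (V[:, :k] @ phi)
```

In practice the subspace dimension and the step size dt would be chosen adaptively from an error estimate rather than fixed as here.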
82

Web-based front-end design and scientific computing for material stress simulation software

Lin, Tien-Ju 12 January 2015 (has links)
A precise simulation requires a large amount of input data, such as geometrical descriptions of the crystal structure, the external forces and loads, and quantitative properties of the material. Although some powerful applications already exist for research purposes, they are not widely used in education due to their complex structure and unintuitive operation. To cater to a more general user base, a front-end application for material simulation software is introduced. With a graphical interface, it provides a more efficient way to conduct simulations and to educate students who want to broaden their knowledge in relevant fields. We first discuss how we explored solutions for the front-end application and how we developed it on top of the material simulation software written by a mechanical engineering lab at Georgia Tech Lorraine. The user interface design, the functionality and the overall user experience are the primary factors determining the product's success or failure. This material simulation software helps researchers resolve the motion and interactions of a large ensemble of dislocations in single- or multi-layered 3D materials. However, the algorithm it uses is not well optimized or parallelized, so its speedup does not scale when more CPUs are used in the cluster. This problem leads to the second topic, scientific computing: in this thesis we offer different approaches that attempt to improve the parallelization and optimize the scalability.
83

Methods for Network Optimization and Parallel Derivative-free Optimization

Olsson, Per-Magnus January 2014 (has links)
This thesis is divided into two parts, each concerned with a specific problem. The problem considered in the first part is to find suitable graph representations, abstractions, cost measures and algorithms for calculating placements of unmanned aerial vehicles (UAVs) such that they can keep one or several static targets under constant surveillance. Each target is kept under surveillance by a surveillance UAV, which transmits information, typically real-time video, to a relay UAV. The role of the relay UAV is to retransmit the information to another relay UAV, which retransmits it again to yet another UAV. This chain of retransmission continues until the information eventually reaches an operator at a base station. When there is a single target, all Pareto-optimal solutions, i.e. all relevant compromises between quality and the number of UAVs required, can be found using an efficient new algorithm. If there are several targets, the problem becomes a variant of the Steiner tree problem, and to solve it we adapt an existing algorithm to find an initial tree. Once it is found, we can further improve it using a new algorithm presented in this thesis. The second problem is the optimization of time-consuming problems where the objective function is treated as a black box to which input parameters are sent and from which a function value is returned. This has the important implication that no gradient or Hessian information is available. Such problems are common when simulators are used to perform advanced calculations such as crash-test simulations of cars, dynamic multibody simulations, etc. It is common that a single function evaluation takes several hours. Algorithms for solving such problems can be broadly divided into direct search algorithms and model-building algorithms. The first kind evaluates the objective function directly, whereas the second kind builds a model of the objective function, which is then optimized in order to find a new point where the objective function is believed to have a good value; the objective function is then evaluated at that point. Since the objective function is very time-consuming, it is common to focus on minimizing the number of function evaluations. However, this completely disregards the possibility of performing calculations in parallel, and to exploit this we investigate different ways parallelization can be used in model-building algorithms. Some of the ways to do this are to use several starting points, to generate several new points in each iteration, to predict a point's value in new ways, and more. We have implemented the parallel extensions in one of the state-of-the-art algorithms for derivative-free optimization and report results from testing on synthetic benchmarks as well as from solving real industrial problems.
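To make the parallel model-building idea concrete, the toy sketch below fits a cheap surrogate to all points evaluated so far and spends each iteration's expensive evaluations on a whole batch of candidates concurrently. The quadratic-in-norm surrogate, the batch size and the stand-in objective are hypothetical choices for illustration, not the state-of-the-art algorithm the thesis extends.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def expensive_objective(x):
    """Stand-in for an hours-long simulation; returns a scalar."""
    return float(np.sum((x - 0.3) ** 2))  # toy function for illustration

def parallel_model_search(dim, n_iters=10, batch=4, seed=0):
    """Toy parallel model-building DFO loop (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1, 1, (2 * dim, dim))             # initial design
    with ProcessPoolExecutor(max_workers=batch) as pool:
        y = np.array(list(pool.map(expensive_objective, X)))
        for _ in range(n_iters):
            # surrogate: least-squares fit of f(x) ~ c + g.x + h*||x||^2
            A = np.hstack([np.ones((len(X), 1)), X,
                           np.sum(X ** 2, axis=1, keepdims=True)])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            # score many random candidates on the cheap surrogate, then
            # spend real (parallel) evaluations only on the best `batch`
            C = rng.uniform(-1, 1, (256, dim))
            s = np.hstack([np.ones((256, 1)), C,
                           np.sum(C ** 2, axis=1, keepdims=True)]) @ coef
            picks = C[np.argsort(s)[:batch]]
            y_new = np.array(list(pool.map(expensive_objective, picks)))
            X, y = np.vstack([X, picks]), np.concatenate([y, y_new])
    return X[np.argmin(y)], y.min()
```

Run under an `if __name__ == '__main__':` guard so the worker processes can import the objective function.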
84

Correspondence-based pairwise depth estimation with parallel acceleration

Bartosch, Nadine January 2018 (has links)
This report covers the implementation and evaluation of a stereo-vision, correspondence-based depth estimation algorithm on a GPU. The results and feedback are used for a multi-view camera system combined with Jetson TK1 devices for parallelized image processing; the aim of this system is to estimate the depth of the scenery in front of it, and the performance of the algorithm plays the key role. Alongside the implementation, the objective of this study is to investigate the advantages of parallel acceleration, inter alia the differences from execution on a CPU, which are significant for all the functions; the overheads particular to a GPU application, such as memory transfers between the CPU and the GPU; and the challenges of real-time and concurrent execution. The study has been conducted with the aid of CUDA on three NVIDIA GPUs with different characteristics, and with the aid of knowledge gained through an extensive literature study of different depth estimation algorithms, stereo vision and correspondence, and CUDA in general. Using the full set of components of the algorithm and expecting (near) real-time execution is unrealistic in this setup and implementation; the slowing factors include, inter alia, the semi-global matching. Investigating alternatives shows that disparity maps of a certain accuracy are also achieved by local methods such as the Hamming distance alone, combined with a filter that refines the results. Furthermore, it is demonstrated that the kernel launch configuration and the usage of GPU memory types such as shared memory are crucial for GPU implementations and have an impact on the performance of the algorithm. Concurrency, however, proves to be a more complicated task, especially in the desired form of realization. For future work and refinement of the algorithm it is therefore recommended to invest more time in further optimization possibilities with regard to shared memory, and in integrating the algorithm into the actual pipeline.
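The local alternative mentioned in the abstract, matching census codes by Hamming distance with a winner-takes-all disparity choice instead of full semi-global matching, can be sketched in a few lines of NumPy. The 3x3 census window and the disparity range are arbitrary illustrative choices; an actual GPU version would implement the same cost function as CUDA kernels.

```python
import numpy as np

# lookup table: number of set bits for every possible 8-bit census code
_POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)

def census3x3(img):
    """8-bit census code: bit k is set where neighbour k is darker than the centre."""
    h, w = img.shape
    p = np.pad(img, 1, mode='edge')
    code = np.zeros((h, w), dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    for k, (dy, dx) in enumerate(offsets):
        code |= (p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w] < img).astype(np.uint8) << k
    return code

def disparity_map(left, right, max_disp=64):
    """Winner-takes-all disparity from Hamming distances of census codes.

    A local-method sketch (no semi-global matching): for each disparity d,
    XOR the census codes of shifted images and take the popcount as cost.
    """
    cl, cr = census3x3(left), census3x3(right)
    h, w = left.shape
    cost = np.full((h, w, max_disp), 9, dtype=np.uint8)  # 9 > max 8-bit Hamming
    for d in range(max_disp):
        cost[:, d:, d] = _POPCOUNT[cl[:, d:] ^ cr[:, :w - d]]
    return np.argmin(cost, axis=2).astype(np.uint8)
```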
85

Leveraging the entity matching performance through adaptive indexing and efficient parallelization

MESTRE, Demetrio Gomes. 11 September 2018 (has links)
Entity Matching (EM), i.e., the task of identifying all entities referring to the same real-world object, is an important and difficult task for data source integration and cleansing. A major difficulty for this task's performance, in the Big Data era, is the quadratic nature of its execution. To minimize the workload while maintaining high matching quality, for both single and multiple data sources, indexing (blocking) methods were proposed. Such methods work by partitioning the input data into blocks of similar entities, according to an entity attribute, or a combination of them, commonly called the "blocking key", and restricting the EM process to entities that share the same blocking key (i.e., belong to the same block). Despite promoting a considerable decrease in the number of comparisons executed, indexing methods can still generate large numbers of comparisons, depending on the size of the data sources involved and/or the number of entities per index (or block). Thus, to further reduce the execution time, the EM task can be performed in parallel using programming models such as MapReduce and Spark. However, the effectiveness and scalability of MapReduce- and Spark-based implementations for data-intensive tasks depend on the data assignment made from map to reduce tasks, in the case of MapReduce, and on the data assignment between the transformation operations, in the case of Spark. The robustness of this assignment strategy is crucial for handling skewed data (large sets of data that can cause memory bottlenecks) and for balancing the workload distribution among all nodes of the distributed infrastructure. Thus, considering that the efficient execution of adaptive indexing EM methods, in batch or real-time modes, in the context of parallel computing is an open gap in the literature, this work proposes a set of parallel approaches capable of performing efficient adaptive indexing EM using MapReduce and Spark in batch or real-time modes. The proposed approaches are compared to state-of-the-art ones in terms of performance using real cluster infrastructures and data sources. The results obtained so far show evidence that the performance of the proposed approaches is significantly increased, enabling a decrease in the overall runtime while preserving the quality of similar-entity detection.
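As a minimal illustration of the blocking idea described above, the sketch below restricts pairwise comparison to entities that share a blocking key. The key function and the `similar` predicate are hypothetical placeholders; in the MapReduce/Spark approaches the grouping step becomes the shuffle, and the thesis's contribution concerns assigning skewed blocks to workers efficiently, which this sequential sketch does not attempt.

```python
from collections import defaultdict
from itertools import combinations

def blocking_key(record):
    """Toy key: first three letters of the name, lowercased (hypothetical)."""
    return record["name"][:3].lower()

def match_with_blocking(records, similar):
    """Compare only pairs that share a blocking key, instead of all O(n^2) pairs.

    `similar(a, b) -> bool` is the (expensive) pairwise matcher. In a
    MapReduce/Spark setting, the grouping below is the shuffle that assigns
    each block to a reducer or partition; skewed block sizes are what the
    thesis's load-balancing strategies address.
    """
    blocks = defaultdict(list)
    for r in records:                      # "map": emit (key, record)
        blocks[blocking_key(r)].append(r)
    matches = []
    for block in blocks.values():          # "reduce": compare within each block
        for a, b in combinations(block, 2):
            if similar(a, b):
                matches.append((a, b))
    return matches
```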
86

Parallel and sequential programming applied to the optimization of metal structures with the PSO algorithm

Esposito, Adelano January 2012 (has links)
Amongst heuristic algorithms, PSO (Particle Swarm Optimization) is one of the most explored. PSO is a metaheuristic based on a population of individuals, in which solution candidates evolve by simulating a simplified model of social adaptation. This method has become popular; however, the large number of evaluations of the objective function limits its application to large-scale engineering problems. On the other hand, the algorithm can easily be parallelized, which makes parallel computing an attractive alternative for its use. In this work, two serial versions of the particle swarm algorithm and their respective parallel extensions are developed. The parallel algorithms, implemented by means of functions available in the MATLAB® library, use the master-slave and multiple-population paradigms, differing from each other in the way the particle swarm is updated (flocking or pseudo-flocking) as well as in the communication between processors (synchronous or asynchronous). The proposed models were applied to the optimization of classical structural engineering problems traditionally found in the literature (benchmarks), and their results are compared in terms of the metrics usually used in the literature for algorithm evaluation. The results show that parallel computing enabled an improvement in the performance of the asynchronous algorithm. Good savings in processing time are also recorded for the two parallel extensions, with the caveat that the synchronous parallel algorithm, unlike the asynchronous parallel version, showed increasing computational performance as more processors were used.
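A minimal master-slave sketch of the scheme described above, written in Python rather than MATLAB®: the master performs a synchronous swarm update while a pool of worker processes evaluates the objective in parallel. The PSO coefficients and the toy objective are placeholders, not the thesis's implementation.

```python
import numpy as np
from multiprocessing import Pool

def fitness(x):
    """Stand-in for a costly structural analysis (e.g., weight plus penalties)."""
    return float(np.sum(x ** 2))

def pso_master_slave(dim=10, n_particles=32, iters=50, workers=4, seed=1):
    """Synchronous master-slave PSO: the master updates the swarm while the
    slave processes evaluate the objective in parallel. Illustrative only."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))
    v = np.zeros_like(x)
    pbest, pbest_f = x.copy(), np.full(n_particles, np.inf)
    gbest, gbest_f = None, np.inf
    w, c1, c2 = 0.7, 1.5, 1.5                           # typical PSO coefficients
    with Pool(workers) as pool:
        for _ in range(iters):
            f = np.array(pool.map(fitness, list(x)))    # parallel evaluations
            improved = f < pbest_f
            pbest[improved], pbest_f[improved] = x[improved], f[improved]
            if f.min() < gbest_f:
                gbest_f, gbest = f.min(), x[f.argmin()].copy()
            r1, r2 = rng.random(x.shape), rng.random(x.shape)
            v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
            x = x + v
    return gbest, gbest_f
```

An asynchronous variant would instead update each particle as soon as its evaluation returns, at the cost of the deterministic iteration structure; as with any `multiprocessing` code, this should be launched under an `if __name__ == '__main__':` guard.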
87

Parallelization and optimization of an organ morphogenesis simulator. Application to the elements of the kidney

Caux, Jonathan 30 November 2012 (has links)
For several decades, the modeling of living matter has been a major challenge requiring ever more work in the field of simulation. Indeed, models of living matter open the door to a whole range of applications: decision support in environmental science and ecology, teaching aids, decision support for physicians, support in the search for new pharmaceutical treatments, so-called "predictive" biology, etc. Before any of these issues can be tackled, the biological system concerned must be modeled precisely, making explicit the questions the model is meant to answer. Working with complex systems, of which biological systems are the archetype, raises general problems of modeling and simulation. It is in this context that the company Integrative BioComputing (IBC) has, since the early 2000s, been developing the prototype of a Generic Modeling and Simulation Platform (PGMS), whose goal is to provide an environment for modeling and simulating, more simply, the biological processes and functions of a complete organism together with its constituent organs. Since the PGMS was a generic platform still under development, it did not have the performance needed to model and simulate large components in sufficiently short times. It was therefore decided, in order to drastically improve the performance of the PGMS, to parallelize and optimize its implementation, the goal being to enable the modeling and simulation of complete organs in acceptable times. The work carried out in this thesis therefore addressed various aspects of the modeling and simulation of biological systems in order to speed up their processing. The most expensive computation during PGMS execution, the calculation of the physicochemical fields, was the subject of a feasibility study of its parallelization. Among the architectures available for parallelizing such an application, we chose graphics processing units (GPUs) used for general-purpose computation, commonly called GPGPU (General-Purpose computation on Graphics Processing Units). This choice was motivated, among other reasons, by the low cost of the hardware and its very large raw computing power, which makes it one of the most accessible parallel architectures on the market. Since the results of the feasibility study were particularly conclusive, the parallelization of the field computation was then integrated into the PGMS. In parallel, we also carried out optimization work to improve the sequential performance of the PGMS. The result of this work is an 18.12x increase in execution speed on the longest simulations (from 16 minutes for the non-optimized version using a single CPU core to 53 seconds for the optimized version, still using a single CPU core but also a GTX500 GPU). The other major aspect of this work was to improve the algorithmic performance of simulating three-dimensional cellular automata, since such automata can both simulate biological behaviors and implement modeling mechanisms such as multi-scale interactions. The research consisted mainly of original algorithmic proposals to improve the simulations carried out by IBC on the PGMS. The sequential speedup, through a three-dimensional implementation of the Hash-Life algorithm, and the parallelization using GPGPU were studied together and led to very significant gains in computation time.
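For a concrete picture of the dense sweep that a three-dimensional Hash-Life implementation is designed to avoid, here is a single synchronous update of a 3-D outer-totalistic cellular automaton using SciPy. The birth and survival thresholds are arbitrary placeholders, not the rules used in the PGMS.

```python
import numpy as np
from scipy.ndimage import convolve

def step_3d_automaton(grid, birth=(5,), survive=(4, 5)):
    """One synchronous update of a 3-D outer-totalistic cellular automaton.

    `grid` is a 0/1 uint8 array; the rule numbers are placeholders. Hash-Life
    replaces this dense full-grid sweep with memoized recursion over an
    octree of previously seen blocks, reusing results for repeated patterns.
    """
    kernel = np.ones((3, 3, 3), dtype=np.uint8)
    kernel[1, 1, 1] = 0                       # count the 26 neighbours only
    n = convolve(grid, kernel, mode='constant', cval=0)
    born = (grid == 0) & np.isin(n, birth)
    stays = (grid == 1) & np.isin(n, survive)
    return (born | stays).astype(np.uint8)
```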
88

A Taxonomy of Parallel Vector Spatial Analysis Algorithms

January 2015 (has links)
Nearly 25 years ago, parallel computing techniques were first applied to vector spatial analysis methods. This initial research was driven by the desire to reduce computing times in order to support scaling to larger problem sets. Since this initial work, rapid technological advancement has driven the availability of High Performance Computing (HPC) resources, in the form of multi-core desktop computers, distributed geographic information processing systems, e.g. computational grids, and single-site HPC clusters. In step with increases in computational resources, significant advancements in the capability to capture and store large quantities of spatially enabled data have been realized. A key component to utilizing vast data quantities in HPC environments, scalable algorithms, has failed to keep pace. The National Science Foundation has identified the lack of scalable algorithms in codified frameworks as an essential research product. Fulfillment of this goal is challenging given the lack of a codified theoretical framework mapping atomic numeric operations from the spatial analysis stack to parallel programming paradigms, the diversity in vernacular utilized by research groups, the propensity for implementations to couple tightly to underlying hardware, and the general difficulty of realizing scalable parallel algorithms. This dissertation develops a taxonomy of parallel vector spatial analysis algorithms, with classification defined by root mathematical operation and communication pattern: a computational dwarf. Six computational dwarfs are identified, three drawn directly from an existing parallel computing taxonomy and three created to capture characteristics unique to spatial analysis algorithms. The taxonomy provides a high-level classification decoupled from low-level implementation details such as hardware, communication protocols, implementation language, decomposition method, or file input and output. By taking a high-level approach, implementation specifics are broadly proposed, breadth of coverage is achieved, and extensibility is ensured. The taxonomy both informs and is informed by five case studies implemented across multiple, divergent hardware environments. A major contribution of this dissertation is a theoretical framework to support the future development of concrete parallel vector spatial analysis frameworks through the identification of computational dwarfs and, by extension, successful implementation strategies.
90

Numerical methods for acoustic simulation of large-scale problems

Venet, Cédric 30 March 2011 (has links)
This thesis studies numerical methods for large-scale acoustic problems, with the parallelization of numerical acoustic methods as its main focus. The manuscript is composed of three parts: ray tracing, optimized interface conditions for domain decomposition methods, and asynchronous iterative algorithms.
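As a toy illustration of the asynchronous iterative algorithms named in the last part, the sketch below runs a chaotic Jacobi iteration in which each worker updates its own unknowns from whatever values the others have most recently written, with no barrier between sweeps. It shows the scheme only: under CPython's GIL the threads interleave rather than truly run in parallel, and convergence is guaranteed only under conditions such as strict diagonal dominance.

```python
import numpy as np
import threading

def async_jacobi(A, b, n_workers=4, sweeps=200):
    """Asynchronous (chaotic) Jacobi iteration for A x = b.

    Each worker repeatedly updates its own block of x using the most
    recently written values of the other blocks, with no synchronization.
    A sketch of the idea, not the thesis's solver.
    """
    n = len(b)
    x = np.zeros(n)                        # shared iterate, updated in place
    D = np.diag(A)
    blocks = np.array_split(np.arange(n), n_workers)

    def worker(idx):
        for _ in range(sweeps):
            for i in idx:                  # reads of x may be stale: allowed here
                x[i] = (b[i] - A[i] @ x + A[i, i] * x[i]) / D[i]

    threads = [threading.Thread(target=worker, args=(blk,)) for blk in blocks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return x
```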
