  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
91

Solução paralela para sistemas de balanço não-lineares / Parallel solution of nonlinear balance systems

Gustavo Hime 27 September 2007
Models for many phenomena are based on balance or conservation equations. Depending on the phenomenon and on what the model assumes, these equations are simplified and solved in different ways. The problem of injecting into a porous medium a two-phase fluid whose equilibrium depends on temperature, for example, can be modeled by a mass conservation equation that includes a diffusive term; this equation, in turn, can be discretized by finite differences in both time and space and solved numerically. Strictly analytical study of these models is very limited. A more detailed understanding of a model's behavior can only be obtained through numerical simulations and the qualitative study of their results. The results of a simulation can only be visualized once it has finished, but high-quality simulations require finer meshes, which need more computing time. Even for one-dimensional flows, the interactive cycle of specifying parameters for a new simulation based on conclusions drawn from previous ones necessarily includes an undesirable waiting time. Systems capable of solving this class of numerical problems quickly and efficiently are therefore the main goal of this work. To obtain high performance in computing these solutions, many factors must be taken into account: the computational cost inherent in the constitutive equations used in the model, the specific type of linear system resulting from the discretization of the problem, the different alternatives for the system solution algorithm and their implementations, and the strengths and limitations imposed by each computing environment to be exploited.
As a result of testing several approaches on different machines, we obtain not only an efficient numerical engine for the case studies presented in this work, but also a guide for applying these techniques to similar problems.
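As a rough illustration of the kind of discretization this abstract refers to, the sketch below advances a one-dimensional conservation equation with a diffusive term, u_t + v u_x = eps u_xx, using explicit finite differences in time and space. The linear flux, the periodic boundary, and every name and parameter value are assumptions chosen for illustration; the thesis's actual model and solver are not described in the abstract and are not reproduced here.

```python
def step(u, dt, dx, eps, velocity=1.0):
    """One explicit step of u_t + velocity*u_x = eps*u_xx on a periodic grid."""
    n = len(u)
    new = [0.0] * n
    for i in range(n):
        left = u[i - 1]              # index -1 wraps to the last cell (periodic)
        right = u[(i + 1) % n]
        # First-order upwind difference, valid for velocity > 0.
        advection = velocity * (u[i] - left) / dx
        # Second-order central difference for the diffusive term.
        diffusion = eps * (right - 2.0 * u[i] + left) / dx ** 2
        new[i] = u[i] + dt * (diffusion - advection)
    return new


def simulate(u0, steps, dt, dx, eps):
    """Apply `step` repeatedly, starting from the initial profile u0."""
    u = list(u0)
    for _ in range(steps):
        u = step(u, dt, dx, eps)
    return u


if __name__ == "__main__":
    initial = [1.0 if 20 <= i < 30 else 0.0 for i in range(50)]
    final = simulate(initial, steps=100, dt=0.005, dx=0.02, eps=0.01)
    print(sum(initial), sum(final))  # total mass before and after
```

For an explicit scheme of this kind, the stable time step shrinks roughly with the square of the mesh spacing when diffusion dominates, which is one concrete reason why finer meshes cost more computing time.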
92

Um metodo numerico com paralelismo no tempo para aproximar solucoes de EDPs / A numerical method with parallelism in time to approximate solutions to PDEs

Washington Santos da Silva 10 June 2014
This research presents and investigates the feasibility of a numerical method that exploits parallelism in time. The method addresses initial and boundary value problems for (evolutionary) partial differential equations. Unlike the method proposed here, most numerical methods traditionally used for evolutionary partial differential equations exploit parallelism only in space. This motivates the present work, which seeks not only a method with parallelism in time but, above all, one that is computationally viable. The proposed numerical scheme is implemented as a parallel algorithm written in C using the MPI library. Performance tests indicate that the method is scalable and requires little communication among processors.
93

High Performance Parallel Algorithms for Tensor Decompositions / Algorithmes Parallèles pour les Décompositions des Tenseurs

Kaya, Oguz 15 September 2017
Tensor factorization has been increasingly used to analyze high-dimensional low-rank data of massive scale in numerous application domains, including recommender systems, graph analytics, health-care data analysis, signal processing, chemometrics, and many others. In these applications, efficient computation of tensor decompositions is crucial for handling such high-volume datasets. The main focus of this thesis is the efficient decomposition of high-dimensional sparse tensors, with hundreds of millions to billions of nonzero entries, which arise in many emerging big data applications. We achieve this through three major approaches. First, we provide distributed-memory parallel algorithms with an efficient point-to-point communication scheme for reducing the communication cost. These algorithms are agnostic to the partitioning of tensor elements and low-rank decomposition matrices, which allows us to investigate effective partitioning strategies that minimize communication cost while maintaining computational load balance. We use hypergraph-based techniques to analyze the computational and communication requirements of these algorithms, and employ hypergraph partitioning tools to find suitable partitions that provide much better scalability. Second, we investigate effective shared-memory parallelizations of these algorithms. Here, we carefully determine unit computational tasks and their dependencies, and express them using a suitable data structure that exposes the underlying parallelism. Third, we introduce a tree-based computational scheme that carries out expensive operations (the multiplication of the tensor with a set of vectors or matrices, found at the core of these algorithms) faster by factoring out and storing common partial results and effectively re-using them. With this computational scheme, we asymptotically reduce the number of tensor-vector and tensor-matrix multiplications for high-dimensional tensors, and thereby render computing tensor decompositions significantly cheaper for both sequential and parallel algorithms. Finally, we diversify this main course of research with two extensions on similar themes. The first extension applies the tree-based computational framework to computing dense tensor decompositions, with an in-depth analysis of computational complexity and methods to find optimal tree structures minimizing the computational cost. The second adapts the communication and partitioning schemes of our parallel sparse tensor decomposition algorithms to the widely used non-negative matrix factorization problem, through which we obtain significantly better parallel scalability than state-of-the-art implementations. All theoretical results in the thesis are corroborated by parallel experiments on both shared-memory and distributed-memory platforms. With these fast algorithms and their tuned implementations for modern HPC architectures, we render tensor and matrix decomposition algorithms amenable to analyzing massive-scale datasets.
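The idea of factoring out and re-using partial results can be sketched on a small case. The code below is an illustrative toy, not the thesis's implementation: it assumes a 4-mode sparse tensor stored as a dictionary from index tuples to values, uses single vectors instead of factor matrices, and all function names are hypothetical. A two-level tree needs 8 tensor-times-vector contractions where the naive approach needs 12.

```python
from collections import defaultdict


def ttv(tensor, position, vector):
    """Contract a sparse tensor with a vector along one index position.

    `tensor` maps index tuples to values; the contracted index is dropped.
    """
    out = defaultdict(float)
    for index, value in tensor.items():
        reduced = index[:position] + index[position + 1:]
        out[reduced] += value * vector[index[position]]
    return dict(out)


def all_modes_tree(tensor, u):
    """For each mode n of a 4-mode tensor, contract all other modes with u[m].

    Shared partial results are computed once and reused (8 contractions).
    """
    # Left subtree: contract modes 3 and 2 once, reuse for modes 0 and 1.
    left = ttv(ttv(tensor, 3, u[3]), 2, u[2])    # remaining indices: (i0, i1)
    r0 = ttv(left, 1, u[1])                      # remaining index: (i0,)
    r1 = ttv(left, 0, u[0])                      # remaining index: (i1,)
    # Right subtree: contract modes 1 and 0 once, reuse for modes 2 and 3.
    right = ttv(ttv(tensor, 1, u[1]), 0, u[0])   # remaining indices: (i2, i3)
    r2 = ttv(right, 1, u[3])                     # remaining index: (i2,)
    r3 = ttv(right, 0, u[2])                     # remaining index: (i3,)
    return [r0, r1, r2, r3]


def all_modes_naive(tensor, u):
    """Same results with 12 contractions: each mode restarts from the full tensor."""
    results = []
    for keep in range(4):
        partial = tensor
        # Contract the highest position first so lower positions stay valid.
        for mode in reversed(range(4)):
            if mode != keep:
                partial = ttv(partial, mode, u[mode])
        results.append(partial)
    return results
```

The saving grows with the number of modes, since each level of the tree halves the set of modes still to be contracted.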
94

Nouveaux algorithmes pour la détection de communautés disjointes et chevauchantes basés sur la propagation de labels et adaptés aux grands graphes / New algorithms for disjoint and overlapping community detection based on label propagation and adapted to large graphs

Attal, Jean-Philippe 19 January 2017
Graphs are mathematical structures consisting of a set of nodes (objects or persons) in which some pairs are linked by edges. Graphs can be used to model complex systems. One of the main problems in graph theory is community detection, which aims to find a partition of the nodes of a graph in order to understand its structure. For instance, if insurance contracts are represented by nodes and their similarity by edges, detecting groups of highly connected nodes reveals similar profiles and helps to evaluate risk profiles. Several algorithms have been proposed for this still-open research problem. One of the fastest methods is label propagation, a local method in which each node updates its label according to its neighbourhood, by a majority vote of its neighbours. Unfortunately, this method has two major drawbacks. The first is instability: separate runs rarely give the same result. The second is bad propagation, which can produce huge, meaningless communities (the giant community problem). The first contribution of the thesis is (i) a stabilisation method for label propagation that places artificial dams on the edges of some networks in order to limit bad label propagation. Complex networks are also characterized by nodes that may belong to several communities; the resulting structure is called a cover. For example, in protein-protein interaction networks, some proteins may have several functions. Detecting these functions according to their communities could help to cure cancers. The second contribution deals with (ii) the implementation of an algorithm with membership functions to detect potentially overlapping nodes. The size of the graphs must also be considered, because some networks contain several million nodes and edges, such as the Amazon product co-purchasing network. We propose (iii) a parallel and distributed version of community detection using core label propagation. A comparative study of the proposed algorithms is carried out based on the quality of the resulting partitions and covers.
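For readers unfamiliar with the baseline method, the following is a minimal sketch of plain asynchronous label propagation with majority voting. It is the generic textbook procedure, not the stabilised, overlapping, or core-based variants proposed in the thesis; the function name, the adjacency-dictionary input, and the tie-breaking rule are assumptions made for illustration.

```python
import random
from collections import Counter


def label_propagation(adjacency, max_rounds=100, seed=0):
    """Assign community labels by repeated majority vote among neighbours.

    `adjacency` maps each node to a list of its neighbours.
    """
    rng = random.Random(seed)
    labels = {node: node for node in adjacency}   # every node starts alone
    nodes = list(adjacency)
    for _ in range(max_rounds):
        rng.shuffle(nodes)                        # random visiting order
        changed = False
        for node in nodes:
            neighbours = adjacency[node]
            if not neighbours:
                continue
            counts = Counter(labels[n] for n in neighbours)
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            if labels[node] in candidates:
                continue                          # already holds a majority label
            labels[node] = rng.choice(candidates) # ties are broken at random
            changed = True
        if not changed:
            break                                 # a full round without updates
    return labels


if __name__ == "__main__":
    two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                     3: [4, 5], 4: [3, 5], 5: [3, 4]}
    print(label_propagation(two_triangles))
```

The random visiting order and random tie-breaking are the source of the run-to-run instability mentioned in the abstract: on graphs with less clear-cut structure, changing `seed` can change the resulting partition.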
95

Akcelerace Burrows-Wheelerovy transformace s využitím GPU / Acceleration of Burrows-Wheeler Transform Using GPU

Zahradníček, Tomáš January 2019
This thesis deals with the Burrows-Wheeler transform (BWT) and the possibilities for accelerating it on a graphics processing unit (GPU). Compression methods based on the BWT are introduced, as well as the CUDA and OpenCL software libraries for writing GPU programs. Parallel variants of the BWT, together with the subsequent steps needed for compression, are implemented using CUDA. The compression achieved by the different approaches is tested, and the parallel versions are compared with their sequential counterparts.
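As background, the transform itself can be stated in a few lines. The sketch below is a naive sequential reference version that sorts all rotations of the input; it is only meant to show what the BWT computes. The sentinel character and function names are assumptions, and the thesis's CUDA implementation is not described in the abstract and is not reproduced here.

```python
def bwt(text, sentinel="$"):
    """Return the Burrows-Wheeler transform of `text` (naive rotation sort)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation table.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Rebuild the original text by repeatedly prepending and sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The row that ends with the sentinel is the original string.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed, inverse_bwt(transformed))
```

The output groups equal characters into runs, which is what makes BWT-based compressors effective; the sorting step dominates the cost and is the natural target for acceleration.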
96

Bibliotheken zur Entwicklung paralleler Algorithmen / Libraries for the development of parallel algorithms

Haase, G., Hommel, T., Meyer, A., Pester, M. 30 October 1998
This paper summarizes library subroutines and functions for parallel MIMD computers. The subroutines were developed at the University of Chemnitz over the last five years. They cover vector operations, inter-processor communication, and simple graphics output to workstations. One of the most valuable features is the machine independence of the communication subroutines proposed in this paper for a hypercube topology of the parallel processors (except for a kernel of only two primitive system-dependent operations). They were implemented and tested on different hardware and operating systems, including transputer, nCube, KSR, and PVM. The vector subroutines are optimized through the use of the C language and unrolled loops (BLAS1-like). The paper includes hints for using the libraries with both Fortran and C programs.
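To give a feel for communication on a hypercube topology, the sketch below simulates a global sum by dimension exchange inside a single Python process: in each of the log2(p) steps, every processor combines its value with that of the neighbour whose rank differs in one bit. This is a generic textbook pattern with hypothetical names, offered as an assumption about the kind of operation such a library provides; the paper's actual subroutines are not listed in the abstract.

```python
def hypercube_allreduce(values):
    """Simulate a global sum over p = 2**d processors arranged as a hypercube.

    `values[rank]` is the local contribution of each processor. After d
    exchange steps every processor holds the sum of all contributions.
    """
    p = len(values)
    if p == 0 or p & (p - 1):
        raise ValueError("number of processors must be a power of two")
    current = list(values)
    dimension_bit = 1
    while dimension_bit < p:
        # Each rank exchanges with the neighbour across one hypercube
        # dimension (rank XOR bit) and adds the received value to its own.
        current = [current[rank] + current[rank ^ dimension_bit]
                   for rank in range(p)]
        dimension_bit <<= 1
    return current


if __name__ == "__main__":
    print(hypercube_allreduce([1, 2, 3, 4, 5, 6, 7, 8]))
```

Because the only primitive needed is a pairwise exchange with a neighbour, a pattern like this can be layered on a very small system-dependent kernel, which is consistent with the portability claim in the abstract.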
97

On-line visualization in parallel computations

Pester, M. 30 October 1998
The investigation of new parallel algorithms for MIMD computers requires some postprocessing facilities for quickly evaluating the behavior of those algorithms. We present two kinds of visualization tool implementations for 2D and 3D finite element applications, to be used on a parallel computer and a host workstation.
98

Multicore Optimized Real-Time Protocol for Power Control Networks

Naveed, Muhammad January 2012
Technology today is changing at a fast pace. The growth of computers and telecommunications over the past three decades has been extraordinary. We are now at the point where all technologies related to communication and data transfer are converging on a common platform. A number of different methods are available for data communication or data transfer. The important factor in all communication setups is to satisfy user demands reliably and at low cost. The area of interest for this thesis is future energy substations and windmills. To make things more straightforward and explore the available options and capabilities, the focus is on designing and implementing a new energy protocol called Energy Real Time Protocol (eRTP), based on Iyad Real Time Protocol (iRTP) [2]. The protocol is designed to meet the requirements of power and energy networks for sending energy parameters, optionally together with VoIP data, among power stations at different locations. Given the importance of transferring energy parameters in real time, the protocol is built from small individual algorithms/modules designed for multi-core architectures. Each module is intended to be processed in parallel by an individual core/processor.
99

Razvoj serijskog i paralelnog algoritma za računanje elektronske strukture materijala metodom sklapanja naelektrisanja / Development of Serial and Parallel Algorithms for Computing the Electronic Structure of Materials Using the Charge Patching Method

Bodroški Žarko 04 November 2020
We present an implementation of the density functional theory (DFT) based charge patching method (CPM) using a basis of Gaussian functions. The method is based on the assumption that the electronic charge density of a large system is the sum of contributions of individual atoms, so-called charge density motifs, which are obtained from calculations on small prototype systems. In our implementation, wave functions and the electronic charge density are represented in a basis of Gaussian functions, while charge density motifs are represented on a real-space grid. A constrained minimization procedure is used to obtain the Gaussian basis representation of the charge density from the real-space representation of the motifs. The code based on this implementation exhibits superior performance compared with the previous implementation of the charge patching method using a plane-wave basis. It enables calculations of the electronic structure of systems with around 1000 atoms on a single CPU core in just several hours. The parallel implementation, which uses advanced parallelization and data distribution methods, enables calculations for systems with more than ten thousand atoms. The largest system tested has around 20000 atoms and was computed on 256 parallel processes.
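The central assumption of the method can be written compactly. The notation below is chosen here for illustration, since the abstract itself gives no symbols: $\mathbf{R}_A$ denotes the position of atom $A$ and $m_A$ its charge density motif, taken from a small prototype system.

```latex
\rho(\mathbf{r}) \;\approx\; \sum_{A} m_{A}\!\left(\mathbf{r} - \mathbf{R}_{A}\right)
```

Building the density this way avoids a self-consistent calculation on the full system, which is what makes systems with thousands of atoms tractable.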
100

Parallel optimization based operational planning to enhance the resilience of large-scale power systems

Gong, Lin 01 May 2020
The resilience of power systems has attracted extensive attention in recent years and needs to be further enhanced, as potential threats from severe events such as extreme weather, geomagnetic storms, and extended fuel disruptions, which are not easy to quantify, predict, or anticipate, continue to challenge the modern power industry. To increase resilience, proper operational planning that considers the potential impacts of severe events can enable power systems to prepare for, operate through, and recover from those events and mitigate their negative economic, social, and humanitarian consequences by fully deploying existing system resources and operational measures. This dissertation addresses operational planning problems in the bulk power system under potential threats from severe events, including the co-optimization of security-constrained unit commitment and transmission switching with consideration of transmission line outages that may be caused by severe weather events, security-constrained optimal power flow under the potential impacts of geomagnetic storms, and optimal operational planning to protect electricity-natural gas systems against the risk of natural gas supply disruptions. Systematic, comprehensive, and consistent operational strategies must be applied across the entire system to achieve a superior resilience enhancement solution. Together with the increased size and complexity of modern energy systems, this makes the proposed operational planning problems mathematically large, computationally complex, and difficult to solve in practice, especially when comprehensive operational measures and resourceful components are incorporated.
To tackle this challenge, parallel optimization based approaches are developed in the proposed research. They fully decompose an originally large and complex problem into multiple independent small subproblems, solve them simultaneously in a fully parallel manner on scalable multi-core computing platforms, and iteratively coordinate their results using mathematical programming methods to reach optimal solutions that satisfy the engineering requirements of practical power system operations. As a result, by efficiently solving optimal operational planning problems for large-scale power systems, their secure and economic operation in the presence of severe events such as hurricanes, geomagnetic storms, and natural gas supply disruptions can be ensured, which indicates that the resilience of power systems is effectively enhanced.
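The decompose, solve in parallel, and coordinate pattern described above can be illustrated with a toy example. The sketch below uses dual decomposition (Lagrangian relaxation with a price update) on a small economic dispatch problem with quadratic generator costs. The abstract does not say which coordination method the dissertation uses, so the technique, the cost model, the function names, and all numbers are assumptions made purely for illustration.

```python
def dispatch_subproblem(price, a, b, p_min, p_max):
    """Minimise a*p**2 + b*p - price*p over [p_min, p_max] for one generator."""
    unconstrained = (price - b) / (2.0 * a)
    return min(max(unconstrained, p_min), p_max)


def coordinate(generators, demand, step=0.005, iterations=200):
    """Iteratively adjust a common price until total output meets demand.

    `generators` is a list of (a, b, p_min, p_max) tuples.
    """
    price = 0.0
    outputs = []
    for _ in range(iterations):
        # The subproblems are independent given the price, so each one
        # could be solved on its own core; here they run one after another.
        outputs = [dispatch_subproblem(price, *g) for g in generators]
        # Coordination step: raise the price if supply falls short of
        # demand, lower it if supply exceeds demand.
        mismatch = demand - sum(outputs)
        price += step * mismatch
    return price, outputs


if __name__ == "__main__":
    fleet = [(0.01, 2.0, 0.0, 100.0),
             (0.02, 1.5, 0.0, 100.0),
             (0.015, 1.8, 0.0, 100.0)]
    price, outputs = coordinate(fleet, demand=150.0)
    print(round(price, 4), [round(p, 2) for p in outputs])
```

Only the scalar price and the subproblem outputs pass between the coordinator and the subproblems, which is why schemes of this general shape map well onto multi-core platforms; realistic unit commitment and optimal power flow problems add integer variables and network constraints that this toy omits.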
