51

Comparação de métodos de otimização para o problema de ajuste de histórico em ambientes paralelos / Comparison of optimization methods for the history matching problem in parallel environments

Xavier, Carolina Ribeiro 18 August 2009 (has links)
The history matching process aims to determine the parameters of petroleum reservoir models. Once adjusted, the models can be used to predict the reservoir's behavior. This work presents a comparison of different optimization methods for the solution of this problem. Derivative-based methods are compared with a genetic algorithm. In particular, the following methods are compared: Levenberg-Marquardt, Quasi-Newton, nonlinear conjugate gradient, steepest descent, and a genetic algorithm. Due to the problem's great computational demand, parallel computing was used extensively. The comparisons among the optimization algorithms were performed in a heterogeneous parallel computing environment, and preliminary results are presented and discussed.
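As an illustration of the derivative-based family compared in this work, the sketch below shows one damped Gauss-Newton (Levenberg-Marquardt) update for a least-squares production-data misfit. It is a generic sketch, not code from the thesis; `residual` and `jacobian` are hypothetical callables that would wrap the reservoir simulator.

```python
import numpy as np

def levenberg_marquardt_step(residual, jacobian, params, lam):
    """One Levenberg-Marquardt update for a least-squares misfit.

    residual(params) -> r, mismatch between simulated and observed data;
    jacobian(params) -> J = dr/dparams, shape (n_data, n_params).
    """
    r = residual(params)
    J = jacobian(params)
    # Damped normal equations: (J^T J + lam * I) delta = -J^T r
    A = J.T @ J + lam * np.eye(len(params))
    delta = np.linalg.solve(A, -J.T @ r)
    return params + delta
```

For large `lam` the step degenerates toward steepest descent (a scaled -J^T r), which is one reason those two methods bracket the derivative-based candidates compared here.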
52

A parallel preconditioned iterative realization of the panel method in 3D

Pester, M., Rjasanow, S. 30 October 1998 (has links) (PDF)
A parallel version of preconditioned iterative techniques is developed for matrices arising from the panel boundary element method for three-dimensional simply connected domains with Dirichlet boundary conditions. Results obtained on an nCUBE-2 parallel computer show that iterative solution methods are very well suited, also in the three-dimensional case, to implementation on a MIMD computer, and that they are much more efficient than the usual direct solution techniques.
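Panel methods produce dense linear systems, and the paper's point is that preconditioned iterative solvers beat direct factorization on them, especially on MIMD machines. The sketch below is a minimal serial illustration using SciPy's GMRES with a Jacobi (diagonal) preconditioner; the paper's actual preconditioner and the nCUBE-2 data distribution are not reproduced here.

```python
import numpy as np
from scipy.sparse.linalg import gmres, LinearOperator

def solve_panel_system(A, b):
    """Solve the dense panel-method system A x = b iteratively,
    with a diagonal (Jacobi) preconditioner as a placeholder."""
    n = A.shape[0]
    d = np.diag(A)
    M = LinearOperator((n, n), matvec=lambda v: v / d)  # M ~ A^{-1}
    x, info = gmres(A, b, M=M)
    if info != 0:
        raise RuntimeError(f"GMRES did not converge (info={info})")
    return x
```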
53

Visualisation Studio for the analysis of massive datasets

Tucker, Roy Colin January 2016 (has links)
This thesis describes the research underpinning, and the development of, a cross-platform application for the analysis of simultaneously recorded multi-dimensional spike trains. These spike trains are believed to carry the neural code that encodes information in a biological brain. A number of statistical methods already exist to analyse the temporal relationships between spike trains. Historically, hundreds of spike trains were recorded simultaneously; as technological advances have increased recording capability, the analysis of thousands of simultaneously recorded spike trains is now a requirement. Effective analysis of large data sets requires software tools that fully exploit the capabilities of modern research computers and effectively manage and present large quantities of data. To be effective, such software tools must: be targeted at the field under study; be engineered to exploit the full compute power of research computers; and prevent information overload of the researcher despite presenting a large and complex data set. The Visualisation Studio application produced in this thesis brings together the fields of neuroscience, software engineering and information visualisation to produce a software tool that meets these criteria. A visual programming language for neuroscience is produced that allows extensive pre-processing of spike train data prior to visualisation. The computational challenges of analysing thousands of spike trains are addressed using parallel processing to fully exploit the modern researcher's computer hardware. For the computationally intensive pairwise cross-correlation analysis, the option to use a high-performance compute (HPC) cluster is seamlessly provided. Finally, the principles of information visualisation are applied to key visualisations in neuroscience so that the researcher can effectively manage and visually explore the resulting data sets. The final visualisations can typically represent data sets ten times larger than before while remaining highly interactive.
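To give a flavour of the computationally intensive step that the application parallelises, the sketch below distributes pairwise cross-correlograms of spike trains across worker processes. It is a minimal Python illustration, not the application's actual implementation; the bin width and window are illustrative parameters.

```python
import numpy as np
from itertools import combinations
from multiprocessing import Pool

def cross_correlogram(pair, bin_width=0.001, window=0.05):
    """Histogram of spike-time differences for one pair of trains."""
    a, b = pair
    diffs = (b[None, :] - a[:, None]).ravel()  # O(len(a)*len(b)) memory
    diffs = diffs[np.abs(diffs) <= window]
    bins = np.arange(-window, window + bin_width, bin_width)
    counts, _ = np.histogram(diffs, bins=bins)
    return counts

def all_pairs_correlograms(trains, workers=8):
    """Compute every pairwise cross-correlogram in parallel.
    Call from under `if __name__ == "__main__":` on spawn platforms."""
    pairs = list(combinations(trains, 2))
    with Pool(workers) as pool:
        return pool.map(cross_correlogram, pairs)
```

With n trains there are n(n-1)/2 pairs, so the work grows quadratically with recording size; this is exactly where the thesis offers the HPC-cluster option.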
54

基於大數據資料的非監督分散式分群演算法 / An Effective Distributed GHSOM Algorithm for Unsupervised Clustering on Big Data

邱垂暉, Chiu, Chui Hui Unknown Date (has links)
Clustering techniques that group samples based on their attribute similarity have been widely used in many fields, such as pattern recognition, feature extraction, and malicious-behavior characterization. Due to their importance, various clustering techniques have been implemented on distributed frameworks, such as K-means with Hadoop in Apache Mahout, for scalable computation. While K-means requires the number of clusters and self-organizing maps (SOM) require the map size to be given in advance, GHSOM (growing hierarchical self-organizing maps), which clusters samples dynamically to satisfy a tolerance requirement on the variation between samples, offers an attractive unsupervised learning solution for data that carry too little information to decide the number of clusters beforehand. However, GHSOM is a sequential algorithm and does not scale, which limits its applications on big data. In this paper, we present a novel distributed GHSOM algorithm. We exploit parallel computation with the Scala actor model for GHSOM construction, distributing vertical and horizontal expansion tasks to actors, and show a significant performance improvement. To evaluate the presented approach, we collect and analyze the real-life execution behaviors of thousands of malware samples and derive detection rules by applying the presented unsupervised clustering to millions of samples, showing its performance improvement, rule effectiveness, and potential usage in practice.
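The property that distinguishes GHSOM from K-means and plain SOM is its data-driven growth test: a map keeps growing while its mean quantization error exceeds a fraction tau1 of its parent unit's error. The sketch below shows that test under the standard GHSOM formulation (the parameter name tau1 is conventional, not taken from this paper); the paper's contribution is to distribute exactly these horizontal and vertical expansion decisions across Scala actors.

```python
import numpy as np

def mean_quantization_error(samples, weights):
    """Mean distance from each sample to its best-matching unit.
    samples: (n_samples, dim); weights: (n_units, dim)."""
    d = np.linalg.norm(samples[:, None, :] - weights[None, :, :], axis=2)
    return d.min(axis=1).mean()

def should_grow_horizontally(samples, weights, parent_qe, tau1=0.6):
    """Keep inserting rows/columns of units while the map's mean
    quantization error exceeds tau1 times the parent unit's error."""
    return mean_quantization_error(samples, weights) > tau1 * parent_qe
```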
55

Décomposition en temps réel de signaux iEMG : filtrage bayésien implémenté sur GPU / On-line decomposition of iEMG signals using GPU-implemented Bayesian filtering

Yu, Tianyi 28 January 2019 (has links)
A sequential algorithm for decomposing an intramuscular electromyographic (iEMG) signal into its constituent motor-unit discharge trains was previously proposed at the LS2N laboratory. It is a Bayesian filter that estimates the state of a hidden Markov model, and it is computationally expensive even for a signal containing only 4 motor units. In this work, we first validated the algorithm in a serial structure. We proposed some modifications to the motor-unit recruitment model and implemented two signal pre-processing techniques to improve the algorithm's performance. The bank of Kalman filters was replaced by a bank of LMS filters. Because the global filter examines a tree of possible motor-unit activation scenarios, we introduced two heuristic techniques to prune that tree. We then realized a GPU implementation of this intrinsically parallel algorithm, together with the modifications applied to the original model, in order to achieve real-time performance. We decomposed 10 experimental iEMG signals acquired from two muscles, with fine-wire and needle electrodes respectively. The number of motor units ranges from 2 to 8. The superposition percentage of motor-unit action potentials, which reflects signal complexity, ranges from 6.56 % to 28.84 %. The decomposition accuracy exceeds 90 % for all signals except two recorded at 30 % MVC, for which it exceeds 85 %. Moreover, the parallel implementation decomposes all these experimental signals in real time. We are the first to achieve real-time full decomposition of a single-channel iEMG signal with up to 10 motor units, where full decomposition means resolving the superposition problem. Signals with more than 10 motor units can also be decomposed quickly, though not yet in real time.
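One of the modifications mentioned above replaces the bank of Kalman filters with a bank of LMS filters. The sketch below shows the generic LMS recursion for one filter of such a bank; the step size `mu` and the mapping onto motor-unit action-potential estimation are illustrative assumptions, not details from the thesis.

```python
import numpy as np

def lms_update(w, x, d, mu=0.05):
    """One least-mean-squares (LMS) adaptive-filter step.

    w: current filter weights; x: input window of samples;
    d: desired (observed) output sample.
    Returns updated weights and the prediction error.
    """
    y = w @ x              # filter prediction
    e = d - y              # prediction error
    w = w + mu * e * x     # stochastic-gradient weight update
    return w, e
```

Unlike a Kalman filter, LMS propagates no covariance matrix, which makes a large bank of such filters much cheaper per step.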
57

Nonlinear Bayesian Control Framework for Parallel Real-Time Hybrid Simulation

Johnny Wilfredo Condori Uribe (16661055) 01 August 2023 (has links)
The development of an increasingly interconnected infrastructure, and its rapid evolution, demands engineering testing solutions capable of investigating realistically and with high accuracy the interactions among the different components of the problem under study. Examining any one of these components without losing its interaction with the surrounding components is not only realistic but also desirable: the more interconnected the whole system is, the greater the dependencies. Real-Time Hybrid Simulation (RTHS) is a disruptive technology with the potential to address this type of complex interaction or internal coupling by partitioning the system into numerical (better understood) substructures and experimental (unknown) substructures, which are built physically in the laboratory. These two types of substructures are connected through a transfer system (e.g., hydraulic actuators) that enforces boundary conditions at their common interfaces, creating a synchronized cyber-physical system. Although the RTHS community has been improving these hybrid techniques, important barriers remain in their core methodologies. Current control approaches developed for RTHS were validated mainly for linear applications, with limited capabilities to deal with high uncertainties, hard nonlinearities, or extensive damage of structural elements due to plasticity. Furthermore, capturing the realistic dynamics of a structural system requires describing the motion with more than one degree of freedom, which increases the number of hydraulic actuators needed to enforce the additional degrees of freedom at the boundary-condition interface. As these requirements escalate for larger or more complex problems, the computational cost can become a prohibitive constraint.

In this dissertation, the main research goal is to develop and validate a nonlinear controller capable of controlling highly uncertain nonlinear physical substructures with complex boundary conditions, together with its parallel computational implementation, for accurate and realistic RTHS. The proposed control system is validated through a set of real-time tracking-control and RTHS experiments that explore robustness, accuracy, and their trade-off.
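For readers unfamiliar with RTHS, the sketch below shows the skeleton of one step of the hybrid loop: integrate the numerical substructure under the interface force measured on the physical substructure, then send the new interface displacement to the actuator as its next command. This is a deliberately simplified linear, explicit sketch with assumed matrices M, C, K; the dissertation's nonlinear Bayesian controller and actuator compensation go far beyond it.

```python
import numpy as np

def rths_step(disp, vel, f_measured, dt, M_inv, C, K):
    """One simplified RTHS step (semi-implicit Euler) for a linear
    numerical substructure M a + C v + K u = f_measured."""
    acc = M_inv @ (f_measured - C @ vel - K @ disp)
    vel = vel + dt * acc
    disp = disp + dt * vel
    return disp, vel   # disp is the actuator's next interface target
```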
58

On Space-Time Trade-Off for Montgomery Multipliers over Finite Fields

Chen, Yiyang 04 1900 (has links)
The multiplication in a Galois field with 2^m elements (i.e. GF(2^m)) is an important arithmetic operation in coding theory and cryptography. In this thesis, we focus on bit-parallel multipliers over the Galois fields generated by trinomials. We start by introducing the GF(2^m) Montgomery multiplication, which calculates A(x)B(x)x^(-u) in GF(2^m) for two polynomials A(x), B(x) in GF(2^m) and a properly chosen u. Then, we investigate the rule for multiplicand partition used by PCHS, a divide-and-conquer algorithm originally proposed for multiplication over GF(2^m) with odd m. By adopting similar rules for splitting A(x) and B(x) in A(x)B(x)x^(-u), we develop new Montgomery multiplication formulae for GF(2^m) with m either odd or even. Based on this new approach, we develop the corresponding bit-parallel Montgomery multipliers for the Galois fields generated by trinomials. A new bit-reusing trick is applied to eliminate redundant XOR gates from the new multiplier. The time complexity (i.e. the delay) and the space complexity (i.e. the number of logic gates) of the new multiplier are explicitly analysed:

1. The new multiplier uses about 25% fewer logic gates than the previous trinomial-based Montgomery or Mastrovito multipliers on GF(2^m) for sufficiently large m. Its gate count is very close to that of the Karatsuba multiplier proposed by Elia.

2. While having a significantly smaller number of logic gates, the new multiplier is at most two T_X larger in total delay than the fastest bit-parallel multiplier on GF(2^m), where T_X is the delay of one XOR gate.

3. We determine the space and time complexities of our multiplier on the two fields recommended by the National Institute of Standards and Technology (NIST). With at most one more T_X in total delay, our multiplier reduces the logic-gate count by more than 15% compared with the other Montgomery or Mastrovito multipliers. Moreover, our multiplier's delay is one T_X smaller than that of Elia's multiplier, at the cost of a less-than-1% increase in the total number of logic gates.
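To make the operation concrete, the sketch below shows the classic bit-serial form of Montgomery multiplication over GF(2^m) with the common choice u = m, packing field elements into Python integers (bit i holds the coefficient of x^i). The thesis builds bit-parallel circuits from partitioned formulae, which this serial sketch does not capture; it only pins down the arithmetic being implemented.

```python
def mont_mul_gf2m(a, b, f, m):
    """Return a * b * x^(-m) mod f over GF(2^m), where f is the full
    irreducible polynomial, e.g. a trinomial x^m + x^k + 1."""
    c = 0
    for i in range(m):
        if (a >> i) & 1:
            c ^= b        # accumulate b * x^i (XOR is GF(2) addition)
        if c & 1:
            c ^= f        # f has constant term 1, so this clears bit 0
        c >>= 1           # exact division by x
    return c

# Example in GF(2^4) with f = x^4 + x + 1 (0b10011):
# mont_mul_gf2m(a, b, 0b10011, 4) == a * b * x^(-4) mod f
```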
59

Estimation d'erreur de discrétisation dans les calculs par décomposition de domaine / Estimation of discretization error in domain decomposition computations

Parret-Fréaud, Augustin 28 June 2011 (has links)
The control of the quality of mechanical computations arouses growing interest in both design and certification processes. It relies on error estimators, whose use often entails a prohibitive additional numerical cost in large computations. The present work puts forward a new procedure for obtaining a guaranteed estimation of the discretization error in the setting of linear elastic problems solved by domain decomposition approaches. The method relies on extending the constitutive relation error concept to the framework of non-overlapping domain decomposition through the recovery of admissible interface fields. Its development within the FETI and BDD approaches yields a relevant estimation of the discretization error well before the convergence of the solver linked to the domain decomposition. An extension of the estimation procedure to heterogeneous problems is also proposed. The behaviour of the method is illustrated and assessed on several numerical examples in two dimensions.
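For reference, the constitutive relation error that the method extends has, in its standard form, the following guaranteed-bound property (notation assumed here, not copied from the thesis): for a kinematically admissible displacement field and a statically admissible stress field,

```latex
% \hat{u}: kinematically admissible displacement,
% \hat{\sigma}: statically admissible stress,
% C: Hooke's tensor, \varepsilon: strain operator, over the domain \Omega.
\[
  e_{\mathrm{CRE}}(\hat{u},\hat{\sigma})
    = \big\| \hat{\sigma} - C\,\varepsilon(\hat{u}) \big\|_{C^{-1},\Omega},
  \qquad
  \| u - \hat{u} \|_{C,\Omega} \le e_{\mathrm{CRE}}(\hat{u},\hat{\sigma}).
\]
```

The contribution here is recovering admissible interface fields so that this bound remains computable when the domain is split among FETI or BDD subdomains.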
60

Algorithmes parallèles pour le suivi de particules / Parallel algorithms for tracking of particles

Bonnier, Florent 12 December 2018 (has links)
Particle tracking methods are commonly used in fluid mechanics for their unique ability to reconstruct long trajectories with high spatial and temporal resolution. Many industrial applications involving gas-particle flows, such as aeronautical turbines, therefore use an Euler-Lagrange formalism. The rapid growth in the computing power of massively parallel machines, and the arrival of petaflops-class systems, opens the way to simulations that were prohibitive only a decade ago. Implementing an efficient parallel code that maintains good performance on a large number of processors must be studied carefully, in particular to preserve a good load balance across processors. Particular attention must also be paid to data structures, in order to keep the code simple, portable, and adaptable to different architectures and to different problems using a Lagrangian approach; some algorithms must be redesigned to account for these constraints. The computing power needed to solve these problems is offered by new distributed architectures with a large number of cores. However, exploiting these architectures efficiently is a delicate task that requires mastery of the target architectures, the associated programming models, and the intended applications. The complexity of these new generations of distributed architectures is essentially due to a very large number of multi-core nodes, some of which may be heterogeneous and sometimes remote.

The approach of most parallel libraries (PBLAS, ScaLAPACK, P_ARPACK) consists in implementing distributed versions of their basic operations, which means that the subroutines of these libraries cannot adapt their behavior to the data types: they must be defined once for the sequential case and again for the parallel case. The component-based approach allows the modularity and extensibility of some numerical libraries (such as PETSc) while offering the reuse of sequential and parallel code. This recent approach to modelling sequential/parallel numerical libraries is very promising thanks to its reusability and low maintenance cost. In industrial applications, the need for software engineering techniques in scientific computing, of which reusability is one of the most important elements, is increasingly evident; however, these techniques are not yet mastered and the models are not yet well defined. The search for methodologies for designing and producing reusable libraries is motivated, among other things, by the needs of industry in this field. The main objective of this thesis is to define strategies for designing a parallel numerical library for Lagrangian particle tracking using a component-based approach. These strategies should allow the reuse of sequential code in the parallel versions while allowing the optimization of performance. The study is based on a separation between control flow and data-flow management, and extends to parallelism models that exploit a large number of cores in shared and distributed memory.
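As a minimal illustration of the Lagrangian kernel such a library must expose, the sketch below advances particles one explicit Euler step through an interpolated velocity field. `vel_field` is a hypothetical callable standing in for the fluid solver's interpolation; the component decomposition and load balancing discussed above are out of scope for this sketch.

```python
import numpy as np

def advect_particles(pos, vel_field, dt):
    """One explicit Euler step of Lagrangian particle tracking.
    pos: (n_particles, 3) positions; vel_field(pos) -> velocities."""
    return pos + dt * vel_field(pos)

# Hypothetical usage with a uniform unit velocity field:
# pos = np.random.rand(1000, 3)
# pos = advect_particles(pos, lambda p: np.ones_like(p), dt=1e-3)
```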
