About
The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Algorithmes Parallèles Efficaces Appliqués aux Calculs sur Maillages Non Structurés / Scalable and Efficient Algorithms for Unstructured Mesh Computations

Thebault, Loïc 14 October 2016 (has links)
Le besoin croissant en simulation a conduit à l’élaboration de supercalculateurs complexes et d’un nombre croissant de logiciels hautement parallèles. Ces supercalculateurs requièrent un rendement énergétique et une puissance de calcul de plus en plus importants. Les récentes évolutions matérielles consistent à augmenter le nombre de noeuds de calcul et de coeurs par noeud. Certaines ressources n’évoluent cependant pas à la même vitesse. La multiplication des coeurs de calcul implique une diminution de la mémoire par coeur, plus de trafic de données, un protocole de cohérence plus coûteux et requiert davantage de parallélisme. De nombreuses applications et modèles actuels peinent ainsi à s’adapter à ces nouvelles tendances. En particulier, générer du parallélisme massif dans des méthodes d’éléments finis utilisant des maillages non structurés, et ce avec un nombre minimal de synchronisations et des charges de travail équilibrées, s’avère particulièrement difficile. Afin d’exploiter efficacement les multiples niveaux de parallélisme des architectures actuelles, différentes approches parallèles doivent être combinées. Cette thèse propose plusieurs contributions destinées à paralléliser les codes et les structures irrégulières de manière efficace. Nous avons développé une approche parallèle hybride par tâches à grain fin combinant les formes de parallélisme distribuée, partagée et vectorielle sur des structures irrégulières. Notre approche a été portée sur plusieurs applications industrielles développées par Dassault Aviation et a permis d’importants gains de performance à la fois sur les multicoeurs classiques ainsi que sur le récent Intel Xeon Phi. / The growing need for numerical simulations results in larger and more complex computing centers and more HPC software. Current HPC system architectures have an increasing requirement for energy efficiency and performance. Recent advances in hardware design result in an increasing number of nodes and an increasing number of cores per node. However, some resources do not scale at the same rate. The increasing number of cores and parallel units implies a lower memory per core, a higher requirement for concurrency, higher coherency traffic, and a higher cost for the coherency protocol. Most of the applications and runtimes currently in use struggle to scale with the present trend. In the context of finite element methods, exposing massive parallelism on unstructured mesh computations with efficient load balancing and minimal synchronization is challenging. To make efficient use of these architectures, several parallelization strategies have to be combined to exploit the multiple levels of parallelism. This Ph.D. thesis proposes several contributions aimed at overcoming these limitations by addressing irregular codes and data structures in an efficient way. We developed a hybrid parallelization approach combining the distributed, shared, and vectorial forms of parallelism in a fine-grain task-based approach applied to irregular structures. Our approach has been ported to several industrial applications developed by Dassault Aviation and has led to significant speedups on both standard multicores and the Intel Xeon Phi manycore.
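The hybrid fine-grain task approach itself cannot be reproduced here, but its key constraint on unstructured meshes, avoiding synchronization between work items that touch shared mesh nodes, can be illustrated. The following is a minimal Python sketch under assumed simplifications (a toy triangle mesh, a greedy element colouring, threads standing in for the MPI ranks, OpenMP tasks and vector units of the thesis); it is not the thesis's or Dassault Aviation's code.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Toy unstructured mesh: each element is a tuple of node ids.
elements = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (0, 2, 4)]

def greedy_coloring(elements):
    """Assign a colour to each element so that elements sharing a node never
    share a colour; all elements of one colour can then run without locks."""
    node_colors = defaultdict(set)   # node id -> colours already used around it
    colors = []
    for elem in elements:
        used = set().union(*(node_colors[n] for n in elem))
        c = next(c for c in range(len(elements)) if c not in used)
        for n in elem:
            node_colors[n].add(c)
        colors.append(c)
    return colors

def assemble(elem):
    # Placeholder for the per-element kernel (e.g. local matrix assembly).
    return sum(elem)

colors = greedy_coloring(elements)
result = {}
with ThreadPoolExecutor() as pool:
    for c in sorted(set(colors)):        # one synchronization point per colour
        batch = [e for e, col in zip(elements, colors) if col == c]
        for elem, value in zip(batch, pool.map(assemble, batch)):
            result[elem] = value
print(result)
```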
32

Quelques propositions pour la mise en oeuvre d'algorithmes combinatoires

Tallot, Didier 28 June 1985 (has links) (PDF)
The work presented in this report is divided into two parts. The first part was the subject of a research report published in April 1984 [Tall]. The second part results from work carried out from June 1984 to June 1985. Producing high-quality images requires handling large amounts of data, hence the interest in studying efficient algorithms and data structures for solving geometric problems; these yield efficient methods for manipulating graphical data. The pioneering work of SHAMOS in the field of geometric complexity [Sham] is worth citing in this respect. The second part describes an interactive software tool for manipulating graphs and orders. To our knowledge, the earliest realization of this kind of software is WOLFBERG's "Graph theory interactive system" [Wolf]. Following GUIDO's work on CABRI (CAhier de BRouillon Informatisé), we wish to offer graph-theory researchers a workstation for working with graphs. CABRI is a good approach to the problem, but remains awkward to use. We therefore revisited the problem, focusing on the description of a better user interface, drawing inspiration from existing interactive software such as that developed at the XEROX PALO ALTO RESEARCH CENTER [Smit] and by NIEVERGELT [Sere].
33

Automated Reasoning Support for Invasive Interactive Parallelization

Moshir Moghaddam, Kianosh January 2012 (has links)
To parallelize a sequential source code, a parallelization strategy must be defined that transforms the sequential source code into an equivalent parallel version. Since parallelizing compilers can sometimes transform sequential loops and other well-structured codes into parallel ones automatically, we are interested in finding a solution to semi-automatically parallelize codes that compilers are not able to parallelize automatically, mostly because of the weakness of classical data and control dependence analysis, in order to simplify the process of transforming the codes for programmers. Invasive Interactive Parallelization (IIP) hypothesizes that by using an intelligent system that guides the user through an interactive process one can boost parallelization in the above direction. The intelligent system's guidance relies on classical code analysis and pre-defined parallelizing transformation sequences. To support its main hypothesis, IIP suggests encoding parallelizing transformation sequences in terms of IIP parallelization strategies that dictate default ways to parallelize various code patterns, using facts obtained both from classical source code analysis and directly from the user. In this project, we investigate how automated reasoning can support the IIP method in order to parallelize a sequential code with acceptable performance but faster than manual parallelization. We have looked at two special problem areas: divide-and-conquer algorithms and loops in the source codes. Our focus is on parallelizing four sequential legacy C programs (quicksort, merge sort, the Jacobi method, and matrix multiplication and summation) for both OpenMP and MPI environments, by developing an interactive parallelization assistance tool that provides users with the assistance needed for parallelizing a sequential source code.
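To illustrate the kind of transformation the tool assists with, turning a sequential divide-and-conquer routine into a task-parallel one, here is a small Python sketch of a task-based merge sort. It mirrors only the structure an OpenMP task version would have (the thesis targets C programs with OpenMP and MPI); the cutoff, recursion depth and worker count are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

CUTOFF = 10_000  # assumed size below which spawning a task is not worthwhile

def merge(left, right):
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:]); out.extend(right[j:])
    return out

def merge_sort(data):
    if len(data) <= 1:
        return data
    mid = len(data) // 2
    return merge(merge_sort(data[:mid]), merge_sort(data[mid:]))

def parallel_merge_sort(data, pool, depth=2):
    # Spawn one recursive call as a task for a few levels (the shape a
    # '#pragma omp task' version would have), then fall back to the sequential
    # routine.  Python threads only illustrate the task structure; the GIL
    # prevents real speedup, unlike the OpenMP/MPI code the tool targets.
    if depth == 0 or len(data) <= CUTOFF:
        return merge_sort(data)
    mid = len(data) // 2
    left = pool.submit(parallel_merge_sort, data[:mid], pool, depth - 1)
    right = parallel_merge_sort(data[mid:], pool, depth - 1)
    return merge(left.result(), right)

if __name__ == "__main__":
    import random
    data = [random.random() for _ in range(50_000)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        assert parallel_merge_sort(data, pool) == sorted(data)
```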
34

Sub-graph Approach In Iterative Sum-product Algorithm

Bayramoglu, Muhammet Fatih 01 September 2005 (has links) (PDF)
The sum-product algorithm can be employed for obtaining the marginal probability density functions from a given joint probability density function (p.d.f.). The sum-product algorithm operates on a factor graph which represents the dependencies of the random variables whose joint p.d.f. is given. The sum-product algorithm cannot be operated on factor graphs that contain loops. For these factor graphs the iterative sum-product algorithm is used. A factor graph which contains loops can be divided into loop-free sub-graphs. The sum-product algorithm can be operated on these loop-free sub-graphs, and the results of these sub-graphs can be combined to obtain the result for the whole factor graph in an iterative manner. This method may increase the convergence rate of the algorithm significantly while keeping the complexity of an iteration and the accuracy of the output constant. A useful by-product of this research that is introduced in this thesis is a good approximation to message calculation in factor nodes of inter-symbol interference (ISI) factor graphs. This approximation has a complexity that is linearly proportional to the number of neighbors instead of being exponentially proportional. Using this approximation and the sub-graph idea we have designed and simulated a joint decoding-equalization (turbo equalization) algorithm and obtained good results in addition to the low complexity.
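To make the message-passing idea concrete, here is a minimal sketch of the exact (non-iterative) sum-product algorithm on a tiny loop-free chain: two binary variables x1, x2 with factors f1(x1) and f2(x1, x2). The factor values are made up for illustration; the thesis's contribution (splitting a loopy graph into loop-free sub-graphs and iterating) builds on exactly this kind of exact computation within each sub-graph.

```python
import numpy as np

# Joint p(x1, x2) proportional to f1(x1) * f2(x1, x2), both variables binary.
f1 = np.array([0.3, 0.7])                    # f1[x1]
f2 = np.array([[0.9, 0.1],                   # f2[x1, x2]
               [0.2, 0.8]])

# f1 is a leaf factor, so its message to x1 is just f1 itself.
msg_f1_to_x1 = f1

# Variable x1 forwards the product of its other incoming messages to f2.
msg_x1_to_f2 = msg_f1_to_x1

# Factor f2 sums x1 out, weighting by the incoming message.
msg_f2_to_x2 = (f2 * msg_x1_to_f2[:, None]).sum(axis=0)

# The marginal of x2 is the normalized product of its incoming messages.
marginal_x2 = msg_f2_to_x2 / msg_f2_to_x2.sum()

# Check against brute-force marginalization of the joint.
joint = f1[:, None] * f2
brute = joint.sum(axis=0) / joint.sum()
print(marginal_x2, brute)
assert np.allclose(marginal_x2, brute)
```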
35

以規則為基礎的分類演算法:應用粗糙集 / A Rule-Based classification algorithm: a rough set approach

廖家奇, Liao, Chia Chi Unknown Date (has links)
在本論文中,我們提出了一個以規則為基礎的分類演算法,名為ROUSER(ROUgh SEt Rule),它利用粗糙集理論作為搜尋啟發的基礎,進而建立規則。我們使用一個已經被廣泛利用的工具實作ROUSER,也使用數個公開資料集對它進行實驗,並將它應用於真實世界的案例。 本論文的初衷可被追溯到一個真實世界的案例,而此案例的目標是從感應器所蒐集的資料中找出與機械故障之間的關聯。為了能支援機械故障的根本原因分析,我們設計並實作了一個以規則為基礎的分類演算法,它所產生的模型是由人類可理解的決策規則所組成,而故障的徵兆與原因則被決策規則所連結。此外,資料中存在著矛盾。舉例而言,不同時間點所蒐集的兩筆紀錄極為相似、甚至相同(除了時間戳記),但其中一筆紀錄與機械故障相關,另一筆則否。本案例的挑戰在於分析矛盾的資料。我們使用粗糙集理論克服這個難題,因為它可以處理不完美知識。 研究者們已經提出了各種不同的分類演算法,而實踐者們則已經將它們應用於各種領域,然而多數分類演算法的設計並不強調演算法所產生模型的可解釋性與可理解性。ROUSER的設計是專門從名目資料中萃取人類可理解的決策規則。而ROUSER與其它多數規則分類演算法不同的地方是利用粗糙集方法選取特徵。ROUSER也提供了數種方式來選擇合宜的屬性與值配對,作為規則的前項。此外,ROUSER的規則產生方法是基於separate-and-conquer策略,因此比其它基於粗糙集的分類演算法所廣泛採用的不可分辨矩陣方法還有效率。 我們進行延伸實驗來驗證ROUSER的能力。對於名目資料的實驗裡,ROUSER在半數的結果中的準確率可匹敵、甚至勝過其他以規則為基礎的分類演算法以及決策樹分類演算法。ROUSER也可以在一些離散化的資料集之中達到可匹敵甚至超越的準確率。我們也提供了內建的特徵萃取方法與其它方法的比較的實驗結果,以及數種用來決定規則前項的方法的實驗結果。 / In this thesis, we propose a rule-based classification algorithm named ROUSER (ROUgh SEt Rule), which uses the rough set theory as the basis of the search heuristics in the process of rule generation. We implement ROUSER using a well developed and widely used toolkit, evaluate it using several public data sets, and examine its applicability using a real-world case study. The origin of the problem addressed in this thesis can be traced back to a real-world problem where the goal is to determine whether a data record collected from a sensor corresponds to a machine fault. In order to assist in the root cause analysis of the machine faults, we design and implement a rule-based classification algorithm that can generate models consisting of human understandable decision rules to connect symptoms to the cause. Moreover, there are contradictions in data. For example, two data records collected at different time points are similar, or the same (except their timestamps), while one is corresponding to a machine fault but not the other. The challenge is to analyze data with contradictions. We use the rough set theory to overcome the challenge, since it is able to process imperfect knowledge. Researchers have proposed various classification algorithms and practitioners have applied them to various application domains, while most of the classification algorithms are designed without a focus on interpretability or understandability of the models built using the algorithms. ROUSER is specifically designed to extract human understandable decision rules from nominal data. What distinguishes ROUSER from most, if not all, other rule-based classification algorithms is that it utilizes a rough set approach to select features. ROUSER also provides several ways to decide an appropriate attribute-value pair for the antecedents of a rule. Moreover, the rule generation method of ROUSER is based on the separate-and-conquer strategy, and hence it is more efficient than the indiscernibility matrix method that is widely adopted in the classification algorithms based on the rough set theory. We conduct extensive experiments to evaluate the capability of ROUSER. On about half of the nominal data sets considered in experiments, ROUSER can achieve comparable or better accuracy than do classification algorithms that are able to generate decision rules or trees. On some of the discretized data sets, ROUSER can achieve comparable or better accuracy. We also present the results of the experiments on the embedded feature selection method and several ways to decide an appropriate attribute-value pair for the antecedents of a rule.
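The separate-and-conquer strategy mentioned in the abstract can be sketched compactly: learn one rule covering some examples of the target class, remove (separate) the covered examples, and repeat on the rest. The sketch below uses a plain purity heuristic to pick each attribute-value condition; the rough-set-based feature selection that distinguishes ROUSER is not reproduced, and the toy data set and names are invented for illustration.

```python
def learn_one_rule(rows, target):
    """Greedily add attribute=value conditions until the rule covers only rows
    of the target class (or no condition makes progress)."""
    rule, covered = [], rows
    while any(r["class"] != target for r in covered):
        best = None
        for attr in covered[0]:
            if attr == "class" or attr in (a for a, _ in rule):
                continue
            for value in {r[attr] for r in covered}:
                subset = [r for r in covered if r[attr] == value]
                purity = sum(r["class"] == target for r in subset) / len(subset)
                if best is None or purity > best[0]:
                    best = (purity, attr, value, subset)
        if best is None or len(best[3]) == len(covered):
            break                                    # no progress possible
        rule.append((best[1], best[2]))
        covered = best[3]
    return rule, covered

def separate_and_conquer(rows, target):
    """Learn rules one at a time, removing ('separating') covered examples."""
    remaining, rules = list(rows), []
    while any(r["class"] == target for r in remaining):
        rule, covered = learn_one_rule(remaining, target)
        if not rule:
            break
        rules.append(rule)
        remaining = [r for r in remaining if r not in covered]
    return rules

data = [
    {"outlook": "sunny",    "windy": "no",  "class": "play"},
    {"outlook": "sunny",    "windy": "yes", "class": "stay"},
    {"outlook": "rainy",    "windy": "no",  "class": "stay"},
    {"outlook": "overcast", "windy": "no",  "class": "play"},
]
print(separate_and_conquer(data, "play"))
```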
36

Designing a Question Answering System in the Domain of Swedish Technical Consulting Using Deep Learning / Design av ett frågebesvarande system inom svensk konsultverksamhet med användning av djupinlärning

Abrahamsson, Felix January 2018 (has links)
Question Answering systems are greatly sought after in many areas of industry. Unfortunately, as most research in Natural Language Processing is conducted in English, the applicability of such systems to other languages is limited. Moreover, these systems often struggle in dealing with long text sequences. This thesis explores the possibility of applying existing models to the Swedish language, in a domain where the syntax and semantics differ greatly from typical Swedish texts. Additionally, the text length may vary arbitrarily. To solve these problems, transfer learning techniques and state-of-the-art Question Answering models are investigated. Furthermore, a novel, divide-and-conquer based technique for processing long texts is developed. Results show that the transfer learning is partly unsuccessful, but the system is capable of performing reasonably well in the new domain regardless. Furthermore, the system shows great performance improvement on longer text sequences with the use of the new technique. / System som givet en text besvarar frågor är högt eftertraktade inom många arbetsområden. Eftersom majoriteten av all forskning inom naturlig språkbehandling behandlar engelsk text är de flesta system inte direkt applicerbara på andra språk. Utöver detta har systemen ofta svårt att hantera långa textsekvenser. Denna rapport utforskar möjligheten att applicera existerande modeller på det svenska språket, i en domän där syntaxen och semantiken i språket skiljer sig starkt från typiska svenska texter. Dessutom kan längden på texterna variera godtyckligt. För att lösa dessa problem undersöks flera tekniker inom transferinlärning och frågebesvarande modeller i forskningsfronten. En ny metod för att behandla långa texter utvecklas, baserad på en dekompositionsalgoritm. Resultaten visar på att transferinlärningen delvis misslyckas givet domänen och modellerna, men att systemet ändå presterar relativt väl i den nya domänen. Utöver detta visas att systemet presterar väl på långa texter med hjälp av den nya metoden.
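The divide-and-conquer treatment of long texts can be sketched as follows: split the document into overlapping chunks short enough for the model, run the question against each chunk, and keep the highest-confidence answer. In the sketch below, answer_chunk is a keyword-matching stand-in for a real reading-comprehension model, and the chunk size, overlap and example text are assumptions rather than values taken from the thesis.

```python
def chunk_text(text, size=400, overlap=100):
    """Split a long document into overlapping chunks so that an answer lying
    on a chunk boundary is still fully contained in some chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def keyword(question):
    # Crude stand-in for real question analysis: the last longish word.
    return [w for w in question.lower().rstrip("?").split() if len(w) > 3][-1]

def answer_chunk(question, chunk):
    """Placeholder for the QA model: returns (answer span, confidence)."""
    pos = chunk.lower().find(keyword(question))
    return (chunk[pos:pos + 40], 1.0) if pos >= 0 else ("", 0.0)

def answer(question, document):
    # Divide: ask every chunk.  Conquer: keep the highest-confidence answer.
    candidates = [answer_chunk(question, c) for c in chunk_text(document)]
    return max(candidates, key=lambda pair: pair[1])[0]

document = "Projektet använde ramverket TensorFlow för modellering. " * 30
print(answer("Vilket ramverk användes i projektet?", document))
```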
37

Triangulation de Delaunay et arbres multidimensionnels

Lemaire, Christophe 19 December 1997 (has links) (PDF)
The work carried out in this thesis mainly concerns the Delaunay triangulation. We show that the average complexity, in terms of unfinished sites, of the multidimensional merge process under a quasi-uniform distribution hypothesis in a hypercube is linear. This general result is applied to the planar case and allows the analysis of new Delaunay triangulation algorithms that outperform those known to date. The underlying principle is to divide the domain according to two-dimensional trees (quadtree, 2d-tree, bucket-tree, etc.) and then to merge the resulting cells along two directions. Taking constraints into account directly during the triangulation phase with algorithms of this type is currently under study. New practical algorithms for point location in a triangulation are proposed, based on randomization from a dynamic AVL-type binary search tree; one of them is faster than Kirkpatrick's optimal algorithm for up to at least 12 million sites. We are currently working on a rigorous analysis of their average complexity. This new algorithm is used to build an "on-line" Delaunay triangulation which is among the fastest "on-line" methods known to date.
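The divide step described in the abstract, splitting the site set with a two-dimensional tree until each cell holds only a few sites, can be sketched in a few lines. The merge along two directions, which is the thesis's actual contribution, is not shown; the cell capacity below is an arbitrary illustrative value.

```python
import random

def kd_split(points, depth=0, capacity=8):
    """Recursively split points by the median x- then y-coordinate (a 2d-tree),
    returning the leaf cells, each holding at most `capacity` sites."""
    if len(points) <= capacity:
        return [points]
    axis = depth % 2
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (kd_split(points[:mid], depth + 1, capacity) +
            kd_split(points[mid:], depth + 1, capacity))

sites = [(random.random(), random.random()) for _ in range(200)]
cells = kd_split(sites)
# Each cell would now be triangulated independently and the cells merged
# pairwise along the two split directions to obtain the global triangulation.
print(len(cells), max(len(c) for c in cells))
```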
38

TOP-K AND SKYLINE QUERY PROCESSING OVER RELATIONAL DATABASE

Samara, Rafat January 2012 (has links)
Top-k and skyline queries are a long-studied topic in the database and information retrieval communities, and they are two popular operations for preference retrieval. A top-k query returns a subset of the most relevant answers instead of all answers. Efficient top-k processing retrieves the k objects that have the highest overall score. In this paper, some algorithms that are used as techniques for efficient top-k processing in different scenarios are presented. A framework based on existing algorithms, taking cost-based optimization into account, that works for these scenarios is presented. This framework is used when the user can determine the ranking function. A real-life scenario is applied to this framework step by step. A skyline query returns the set of points that are not dominated by other points in the given data set (a record x dominates another record y if x is as good as y in all attributes and strictly better in at least one attribute). In this paper, some algorithms that are used for evaluating the skyline query are introduced. One of the problems in skyline queries, the curse of dimensionality, is presented. A new strategy, based on existing skyline algorithms, skyline frequency, and the binary tree strategy, which gives a good solution to this problem, is presented. This new strategy is used when the user cannot determine the ranking function. A real-life scenario that applies this strategy step by step is presented. Finally, the advantages of the top-k query are applied to the skyline query in order to retrieve results quickly and efficiently.
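The two operations can be illustrated in a few lines: top-k as a heap selection under a user-supplied ranking function, and the skyline as a block-nested-loop dominance test (x dominates y if it is at least as good on every attribute and strictly better on at least one; larger is taken to mean better here). The records and the scoring function are invented for illustration.

```python
import heapq

records = [
    {"name": "hotel A", "rating": 4.5, "comfort": 3.0, "price_score": 2.0},
    {"name": "hotel B", "rating": 4.0, "comfort": 4.0, "price_score": 4.0},
    {"name": "hotel C", "rating": 3.0, "comfort": 2.0, "price_score": 5.0},
    {"name": "hotel D", "rating": 4.5, "comfort": 3.0, "price_score": 1.0},
]
ATTRS = ("rating", "comfort", "price_score")

def top_k(rows, k, score):
    """Top-k requires a ranking function supplied by the user."""
    return heapq.nlargest(k, rows, key=score)

def dominates(x, y):
    """x dominates y: at least as good everywhere, strictly better somewhere."""
    return (all(x[a] >= y[a] for a in ATTRS)
            and any(x[a] > y[a] for a in ATTRS))

def skyline(rows):
    """Block-nested-loop skyline: keep the rows dominated by no other row."""
    return [y for y in rows if not any(dominates(x, y) for x in rows if x is not y)]

print(top_k(records, 2, score=lambda r: 0.5 * r["rating"] + 0.5 * r["price_score"]))
print([r["name"] for r in skyline(records)])
```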
39

Design, Implementation and Cryptanalysis of Modern Symmetric Ciphers

Henricksen, Matthew January 2005 (has links)
The main objective of this thesis is to examine the trade-offs between security and efficiency within symmetric ciphers. This includes the influence that block ciphers have on the new generation of word-based stream ciphers. By incorporating block-cipher-like components into their designs, word-based stream ciphers have experienced hundreds-fold improvement in speed over bit-based stream ciphers, without any observable security degradation. The thesis also emphasizes the importance of keying issues in block and stream ciphers, showing that by reusing components of the principal cipher algorithm in the keying algorithm, security can be enhanced without loss of key-agility or expanding footprint in software memory. Firstly, modern block ciphers from four recent cipher competitions are surveyed and categorized according to criteria that include the high-level structure of the block cipher, the method in which non-linearity is instilled into each round, and the strength of the key schedule. In assessing the last criterion, a classification by Carter [45] is adopted and modified to improve its consistency. The classification is used to demonstrate that the key schedule of the Advanced Encryption Standard (AES) [62] is surprisingly flimsy for a national standard. The claim is supported with statistical evidence that shows the key schedule suffers from bit leakage and lacks sufficient diffusion. The thesis contains a replacement key schedule that reuses components from the cipher algorithm, leveraging existing analysis to improve security, and reducing the cipher's implementation footprint while maintaining key agility. The key schedule is analyzed from the perspective of an efficiency-security tradeoff, showing that the new schedule rectifies an imbalance towards efficiency present in the original. The thesis contains a discussion of the evolution of stream ciphers, focusing on the migration from bit-based to word-based stream ciphers, from which follows a commensurate improvement in design flexibility and software performance. It examines the influence that block ciphers, and in particular the AES, have had upon the development of word-based stream ciphers. The thesis includes a concise literature review of recent styles of cryptanalytic attack upon stream ciphers. Also, claims are refuted that one prominent word-based stream cipher, RC4, suffers from a bias in the first byte of each keystream. The thesis presents a divide and conquer attack against Alpha1, an irregularly clocked bit-based stream cipher with a 128-bit state. The dominating aspect of the divide and conquer attack is a correlation attack on the longest register. The internal state of the remaining registers is determined by utilizing biases in the clocking taps and launching a guess and determine attack. The overall complexity of the attack is 2^61 operations with text requirements of 35,000 bits and memory requirements of 2^29.8 bits. MUGI is a 64-bit word-based cipher with a large Non-linear Feedback Shift Register (NLFSR) and an additional non-linear state. In standard benchmarks, MUGI appears to suffer from poor key agility because it is implemented on an architecture for which it is not designed, and because its NLFSR is too large relative to the size of its master key. An unusual feature of its key initialization algorithm is described. A variant of MUGI, entitled MUGI-M, is proposed to enhance key agility, ostensibly without any loss of security. The thesis presents a new word-based stream cipher called Dragon.
This cipher uses a large internal NLFSR in conjunction with a non-linear filter to produce 64 bits of keystream in one round. The non-linear filter looks very much like the round function of a typical modern block cipher. Dragon has a native word size of 32 bits, and uses very simple operations, including addition, exclusive-or and s-boxes. Together these ensure high performance on modern day processors such as the Intel Pentium family. Finally, a set of guidelines is provided for designing and implementing symmetric ciphers on modern processors, using the Intel Pentium 4 as a case study. Particular attention is given to understanding the architecture of the processor, including features such as its register set and size, the throughput and latencies of its instruction set, and the memory layouts and speeds. General optimization rules are given, including how to choose fast primitives for use within the cipher. The thesis describes design decisions that were made for the Dragon cipher with respect to implementation on the Intel Pentium 4. Keywords: Block Ciphers, Word-based Stream Ciphers, Cipher Design, Cipher Implementation.
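The divide-and-conquer correlation attack mentioned above can be demonstrated on a toy generator (this is not Alpha1 and not the thesis's actual attack): two small LFSRs whose outputs are ANDed, so the keystream agrees with register A about 75% of the time. The attacker recovers A's initial state by exhaustive search correlated against the keystream, without ever guessing register B; that independent recovery of one register is the "divide" step. Register sizes, tap positions and keystream length are arbitrary demo values.

```python
import random

def lfsr(state, taps, nbits, length):
    """Output `length` bits of a Fibonacci LFSR (bit 0 is the output bit)."""
    out = []
    for _ in range(length):
        out.append(state & 1)
        fb = 0
        for t in taps:
            fb ^= (state >> t) & 1
        state = (state >> 1) | (fb << (nbits - 1))
    return out

# Toy generator: z_t = a_t AND b_t, so P(z_t == a_t) = 3/4 (a strong correlation).
NA, TAPS_A = 11, (0, 2)        # register A (taps chosen for the demo only)
NB, TAPS_B = 13, (0, 1, 3, 4)  # register B
N = 512                        # keystream bits available to the attacker

secret_a = random.randrange(1, 1 << NA)
secret_b = random.randrange(1, 1 << NB)
keystream = [a & b for a, b in zip(lfsr(secret_a, TAPS_A, NA, N),
                                   lfsr(secret_b, TAPS_B, NB, N))]

# Divide and conquer: attack register A on its own.  Exhaustively try its
# 2^11 - 1 nonzero states and keep the one whose output agrees most with the
# keystream; register B never has to be guessed at this stage.
best_state, best_agree = None, -1
for guess in range(1, 1 << NA):
    agree = sum(x == z for x, z in zip(lfsr(guess, TAPS_A, NA, N), keystream))
    if agree > best_agree:
        best_state, best_agree = guess, agree

print("register A recovered:", best_state == secret_a,
      "agreement:", best_agree / N)
```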
40

Acceleration Strategies of Markov Chain Monte Carlo for Bayesian Computation / Stratégies d'accélération des algorithmes de Monte Carlo par chaîne de Markov pour le calcul Bayésien

Wu, Chang-Ye 04 October 2018 (has links)
Les algorithmes MCMC sont difficiles à mettre à l'échelle, car ils doivent balayer l'ensemble des données à chaque itération, ce qui interdit leurs applications dans de grands paramètres de données. En gros, tous les algorithmes MCMC évolutifs peuvent être divisés en deux catégories: les méthodes de partage et de conquête et les méthodes de sous-échantillonnage. Le but de ce projet est de réduire le temps de calcul induit par des fonctions de vraisemblance complexes ou de grande taille. / MCMC algorithms are difficult to scale, since they need to sweep over the whole data set at each iteration, which prohibits their applications in big data settings. Roughly speaking, all scalable MCMC algorithms can be divided into two categories: divide-and-conquer methods and subsampling methods. The aim of this project is to reduce the computing time induced by complex or large likelihood functions.
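A minimal sketch of the divide-and-conquer family: split the data into shards, run a Metropolis chain on each sub-posterior (the shard's likelihood with a correspondingly flattened prior), and merge the draws with a precision-weighted, consensus-style average. The toy Gaussian-mean model, the chain settings and the combination rule are illustrative assumptions, not the thesis's specific methods.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=10_000)     # toy data, known unit variance
S = 4                                                   # number of shards
shards = np.array_split(data, S)
PRIOR_VAR = 100.0                                       # N(0, PRIOR_VAR) prior on the mean

def sub_posterior_logpdf(theta, shard):
    """log p_s(theta) proportional to (1/S) * log prior + log likelihood of this shard."""
    log_prior = -0.5 * theta**2 / PRIOR_VAR
    log_lik = -0.5 * np.sum((shard - theta) ** 2)
    return log_prior / S + log_lik

def metropolis(shard, n_iter=5000, step=0.05):
    theta, draws = 0.0, []
    lp = sub_posterior_logpdf(theta, shard)
    for _ in range(n_iter):
        prop = theta + rng.normal(scale=step)
        lp_prop = sub_posterior_logpdf(prop, shard)
        if np.log(rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        draws.append(theta)
    return np.array(draws[n_iter // 2:])                # discard burn-in

# Run the S chains (embarrassingly parallel in principle; sequential here).
sub_draws = [metropolis(shard) for shard in shards]

# Consensus-style combination: precision-weighted average of the sub-chains.
weights = np.array([1.0 / d.var() for d in sub_draws])
combined = sum(w * d for w, d in zip(weights, sub_draws)) / weights.sum()

print("posterior mean approx.", combined.mean(), "(sample mean:", data.mean(), ")")
```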
