• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 60
  • 7
  • 6
  • 4
  • 3
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 1
  • Tagged with
  • 96
  • 42
  • 29
  • 28
  • 19
  • 18
  • 17
  • 13
  • 11
  • 11
  • 10
  • 10
  • 9
  • 9
  • 9
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
91

Athapascan-0 : exploitation de la multiprogrammation légère sur grappes de multiprocesseurs

Carissimi, Alexandre da Silva January 1999 (has links)
L'accroissement d'efficacite des réseaux d'interconnexion et la vulgarisation des machines multiprocesseurs permettent la réalisation de machines parallèles a mémoire distribuée de faible coût: les grappes de multiprocesseurs. Elles nécessitent l'exploitation à la fois du parallélismeà grain fin, interne à un multiprocesseur offert par la multiprogrammation légère, et du parallélisme à gros grain entre les différents multiprocesseurs. L'exploitation simultanée de ces deux types de parallélisme exige une méthode de communication entre les processus légers qui ne partagent pas le mêmme espace d'adressage. Le travail de cette thèse porte sur le problème de l'Intégration de la multiprogrammation légère et des communications sur grappes de multiprocesseurs symétriques (SMP). II porte plus précisément sur evaluation et le reglage du noyau exécutif ATHAPASCAN-0 sur ce type d'architecture. ATHAPASCAN-0 est un noyau exécutif, portable, développé au sein du projet APACHE (CNRS-INPG-INRIA-UJF), qui combine la multiprogrammation légère et la communication par échange de messages. La portabilité est assurée par une organisation en couches basée sur les standards POSIX threads et MPI largement répandus. ATHAPASCAN-0 étend le modèle de réseau statique de processus «lourds» communicants tel que MPI, PVM, etc,à celui d'un réseau dynamique de processus légers communicants. La technique de base est la multiprogrammation lègere des communications et des calculs. La progression des communications exige la scrutation de état du reseau et l'enchainement des opérations de transferts. L'efficacité repose sur la minimisation de ces opérations. De plus, l'emploi de multiprocesseurs ajoute des problèmes spécifiques dus à l'apparition d'un parallélisme réel entre calcul et communication. Ces problèmes sont présentés et des solutions sont proposées pour l'environnement ATHAPASCAN-0. Ces solutions sont évaluées sur des grappes de multiprocesseurs. / The continuous price reduction for commodity PC multiprocessors and the availability of fast network interfaces have made cluster of multiprocessors an attractive low-price alternative to build parallel systems. Multiprocessor clusters offer two levels of parallelism: a fine grain parallelism inside a single multiprocessor and a coarse grain among them. A mechanism must be provided to exploit both levels of parallelism simultaneously. This requires to provide communications between threads belonging to different addresses spaces. This dissertation addresses the problem of integrating threads and communications on ATHAPASCAN-0 run time system. ATHAPASCAN-0 is a portable run time for cluster of multiprocessors developed as part of the APACHE project (CNRS-INPG-INRIA-UJF). Portability is achieved by a layered organization based on standards like POSIX threads and MPI. The ATHAPASCAN-0 run time system extends the heavy-weight process communication model of message passing libraries such as MPI, PVM, etc, into a lighter dynamic network of communicating threads. Multiprogramming is the key concept used. Communication progress is based on a network polling basis to handle incoming messages and to deliver outgoing communications requests. Performance is strongly dependent on the way these operations are implemented. Additionally, multiprocessors introduce some programming problems like overhead of cache coherency mechanisms, method of managing concurrent accesses and efficient mutex locking to avoid unnecessary context switching. These problems are analyzed and solutions are implemented in the ATHAPASCAN-0 run time system. An evaluation of these solutions is performed on a cluster of multiprocessors.
92

Performance of Physics-Driven Procedural Animation of Character Locomotion : For Bipedal and Quadrupedal Gait

Larsson, Jarl January 2015 (has links)
Context. Animation of character locomotion is an important part of computer animation and games. It is a vital aspect in achieving believable behaviour and articulation for virtual characters. For games there is also often a need for supporting real-time reactive behaviour in an animation as a response to direct or indirect user interaction, which have given rise to procedural solutions to generate animation of locomotion. Objectives. In this thesis the performance aspects for procedurally generating animation of locomotion within real-time constraints is evaluated, for bipeds and quadrupeds, and for simulations of several characters. A general pose-driven feedback algorithm for physics-driven character locomotion is implemented for this purpose. Methods. The execution time of the locomotion algorithm is evaluated using an automated experiment process, in which real-time gait simulations of incrementing character population count are instantiated and measured, for the bipedal and quadrupedal gaits. The simulations are measured for both serial and parallel executions of the locomotion algorithm. Results. Simulations of up to and including 100 characters are performance measured providing an overview of the slowdown rate when increasing the character count in the simulations, as well as the performance relations between bipeds and quadrupeds. Conclusions. The experiment concludes that the evaluated algorithm on its own exhibits a relatively small performance impact that scales almost linearly for the evaluated population sizes. Due to the relatively low performance impacts it is thus also concluded that for future experiments a broader measurement of the locomotion algorithm that includes and compares different physics solvers is of interest.
93

Frequent itemset mining on multiprocessor systems

Schlegel, Benjamin 08 May 2014 (has links) (PDF)
Frequent itemset mining is an important building block in many data mining applications like market basket analysis, recommendation, web-mining, fraud detection, and gene expression analysis. In many of them, the datasets being mined can easily grow up to hundreds of gigabytes or even terabytes of data. Hence, efficient algorithms are required to process such large amounts of data. In recent years, there have been many frequent-itemset mining algorithms proposed, which however (1) often have high memory requirements and (2) do not exploit the large degrees of parallelism provided by modern multiprocessor systems. The high memory requirements arise mainly from inefficient data structures that have only been shown to be sufficient for small datasets. For large datasets, however, the use of these data structures force the algorithms to go out-of-core, i.e., they have to access secondary memory, which leads to serious performance degradations. Exploiting available parallelism is further required to mine large datasets because the serial performance of processors almost stopped increasing. Algorithms should therefore exploit the large number of available threads and also the other kinds of parallelism (e.g., vector instruction sets) besides thread-level parallelism. In this work, we tackle the high memory requirements of frequent itemset mining twofold: we (1) compress the datasets being mined because they must be kept in main memory during several mining invocations and (2) improve existing mining algorithms with memory-efficient data structures. For compressing the datasets, we employ efficient encodings that show a good compression performance on a wide variety of realistic datasets, i.e., the size of the datasets is reduced by up to 6.4x. The encodings can further be applied directly while loading the dataset from disk or network. Since encoding and decoding is repeatedly required for loading and mining the datasets, we reduce its costs by providing parallel encodings that achieve high throughputs for both tasks. For a memory-efficient representation of the mining algorithms’ intermediate data, we propose compact data structures and even employ explicit compression. Both methods together reduce the intermediate data’s size by up to 25x. The smaller memory requirements avoid or delay expensive out-of-core computation when large datasets are mined. For coping with the high parallelism provided by current multiprocessor systems, we identify the performance hot spots and scalability issues of existing frequent-itemset mining algorithms. The hot spots, which form basic building blocks of these algorithms, cover (1) counting the frequency of fixed-length strings, (2) building prefix trees, (3) compressing integer values, and (4) intersecting lists of sorted integer values or bitmaps. For all of them, we discuss how to exploit available parallelism and provide scalable solutions. Furthermore, almost all components of the mining algorithms must be parallelized to keep the sequential fraction of the algorithms as small as possible. We integrate the parallelized building blocks and components into three well-known mining algorithms and further analyze the impact of certain existing optimizations. Our algorithms are already single-threaded often up an order of magnitude faster than existing highly optimized algorithms and further scale almost linear on a large 32-core multiprocessor system. Although our optimizations are intended for frequent-itemset mining algorithms, they can be applied with only minor changes to algorithms that are used for mining of other types of itemsets.
94

Rozvoj instrumentace programu při překladu / Development of Instrumentation during Compilation

Ševčík, Václav January 2020 (has links)
The focus of this master's thesis is on the topic of instrumentation during the compilation process in the LLVM compiler. The tool enables to instrument memory accesses and functions. The instrumentation is realized through adding a novel pass to the LLVM's optimalization phase. Information about variables are managed by the created framework. The framework is linked with the program. The overhead of the instrumentation increases duration of the program by about 14 % in the case of switched off indirect addressing and 23 % in the case of switched on indirect addressing. The main benefit of the work is the possibility of easy instrumentation of the program which can even monitor operation of local variables through indirect addressing) and support multithread programs. The framework is part of Testos's tools where it provides automatic instrumentation in the Spectra tool.
95

Grafická reprezentace navigačních zpráv GNSS prototypu / GNSS navigation prototype messages visualization

Homolka, Martin January 2016 (has links)
The aim of this thesis is to graphically interpret navigation messages from a prototype of global navigation satellite system. The resulting application is implemented in Python programming language for Windows operating system and follows requests from researchers developing the prototype. Necessary terminology together with graphical user interface programming possibilities of object-oriented language Python is a base for theoretical background of this text. Practical part of this research describes a solution for receiving generated messages from the prototype as well as their storing and filtering into useful information. Further, the design of graphical user interface of the application for prototype interactions and other tools used for its configuration are included.
96

Frequent itemset mining on multiprocessor systems

Schlegel, Benjamin 30 May 2013 (has links)
Frequent itemset mining is an important building block in many data mining applications like market basket analysis, recommendation, web-mining, fraud detection, and gene expression analysis. In many of them, the datasets being mined can easily grow up to hundreds of gigabytes or even terabytes of data. Hence, efficient algorithms are required to process such large amounts of data. In recent years, there have been many frequent-itemset mining algorithms proposed, which however (1) often have high memory requirements and (2) do not exploit the large degrees of parallelism provided by modern multiprocessor systems. The high memory requirements arise mainly from inefficient data structures that have only been shown to be sufficient for small datasets. For large datasets, however, the use of these data structures force the algorithms to go out-of-core, i.e., they have to access secondary memory, which leads to serious performance degradations. Exploiting available parallelism is further required to mine large datasets because the serial performance of processors almost stopped increasing. Algorithms should therefore exploit the large number of available threads and also the other kinds of parallelism (e.g., vector instruction sets) besides thread-level parallelism. In this work, we tackle the high memory requirements of frequent itemset mining twofold: we (1) compress the datasets being mined because they must be kept in main memory during several mining invocations and (2) improve existing mining algorithms with memory-efficient data structures. For compressing the datasets, we employ efficient encodings that show a good compression performance on a wide variety of realistic datasets, i.e., the size of the datasets is reduced by up to 6.4x. The encodings can further be applied directly while loading the dataset from disk or network. Since encoding and decoding is repeatedly required for loading and mining the datasets, we reduce its costs by providing parallel encodings that achieve high throughputs for both tasks. For a memory-efficient representation of the mining algorithms’ intermediate data, we propose compact data structures and even employ explicit compression. Both methods together reduce the intermediate data’s size by up to 25x. The smaller memory requirements avoid or delay expensive out-of-core computation when large datasets are mined. For coping with the high parallelism provided by current multiprocessor systems, we identify the performance hot spots and scalability issues of existing frequent-itemset mining algorithms. The hot spots, which form basic building blocks of these algorithms, cover (1) counting the frequency of fixed-length strings, (2) building prefix trees, (3) compressing integer values, and (4) intersecting lists of sorted integer values or bitmaps. For all of them, we discuss how to exploit available parallelism and provide scalable solutions. Furthermore, almost all components of the mining algorithms must be parallelized to keep the sequential fraction of the algorithms as small as possible. We integrate the parallelized building blocks and components into three well-known mining algorithms and further analyze the impact of certain existing optimizations. Our algorithms are already single-threaded often up an order of magnitude faster than existing highly optimized algorithms and further scale almost linear on a large 32-core multiprocessor system. Although our optimizations are intended for frequent-itemset mining algorithms, they can be applied with only minor changes to algorithms that are used for mining of other types of itemsets.

Page generated in 0.0538 seconds