1

Compiler-driven data layout transformations for network applications

Fenacci, Damon January 2012 (has links)
This work approaches the little-studied topic of compiler optimisations directed at network applications. It starts by investigating whether there are fundamental differences between application domains that justify the development and tuning of domain-specific compiler optimisations. It presents an automated approach capable of identifying domain-specific workload characterisations and presenting them in a readily interpretable format based on decision trees. The generated workload profiles summarise key resource utilisation issues and enable compiler engineers to address the highlighted bottlenecks. Applying this methodology to data-intensive network infrastructure applications shows that data organisation is the key obstacle to overcome in order to achieve high performance. The thesis therefore proposes and evaluates three specialised data transformations (structure splitting, array regrouping, and software caching) against the industrial EEMBC networking benchmarks and real-world data sets. It demonstrates, on the one hand, that speedups of up to 2.62 can be achieved but, on the other, that no single solution performs equally well across different network traffic scenarios. To address this issue, an adaptive software caching scheme for high-frequency route lookup operations is introduced and evaluated, again against the EEMBC networking benchmarks and real-world data sets, achieving speedups of up to 3.30 and 2.27. The results clearly demonstrate that adaptive data organisation schemes are necessary to ensure optimal performance under varying network loads. Finally, this research addresses a further issue introduced by data transformations such as array regrouping and software caching: the need for static analysis to allow efficient resource allocation. The thesis proposes a static code analyser that enables automatic resource analysis of source code containing list and tree structures. The tool applies a combination of amortised analysis and separation logic to real code and is able to evaluate the type and resource usage of existing data structures, which can be used to compute global resource consumption values for full data-intensive network applications.
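Structure splitting, the first of the three transformations above, separates the fields accessed on the hot path from those accessed rarely, so that each cache line fetched during a lookup holds more useful data. A minimal sketch in C, with illustrative field names and a routing-table context that are assumptions rather than details taken from the thesis:

    #include <stdint.h>

    /* Before: every route lookup drags the rarely used fields into cache. */
    struct route_entry {
        uint32_t prefix;      /* hot: compared on every lookup           */
        uint32_t next_hop;    /* hot: returned on a match                */
        uint64_t byte_count;  /* cold: only touched by statistics code   */
        char     ifname[16];  /* cold: only touched when printing routes */
    };

    /* After: hot fields are packed densely, so each cache line holds
     * more usable entries during the lookup loop. */
    struct route_hot  { uint32_t prefix; uint32_t next_hop; };
    struct route_cold { uint64_t byte_count; char ifname[16]; };

    struct route_table {
        struct route_hot  *hot;   /* indexed in the lookup loop */
        struct route_cold *cold;  /* same index, touched rarely */
    };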
2

Optimised Ray Tracing for the SuperNEC Implementation of the Uniform Theory of Diffraction

Hartleb, Robert 26 February 2007 (has links)
Student Number : 0006329K - MSc(Eng) Dissertation - School of Electrical and Information Engineering - Faculty of Engineering and the Built Environment / Geometric optimisations are presented for the UTD in SuperNEC, a commercial electromagnetic software package. Path-finding optimisations rapidly find the propagation paths of electromagnetic waves by using back-face culling to determine the visible plates of polyhedral structures, and by using reflection and diffraction zones, which apply image theory and the law of diffraction to determine the illuminated spatial regions. An octree reduces the number of intersection tests performed during the shadow tests. Numerical results show that, overall, the optimisations halve the run time of the software for models consisting of plates and cylinders. The path-finding optimisations do not scale with model size, are limited to plates, and introduce errors: the mean absolute error due to them is on average 0.02 dB for first-order rays and 0.17 dB for second-order rays. The octree optimisation scales with model size, can be used with any geometry and any type of ray, and does not cause errors.
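The back-face culling step can be pictured with a simple geometric test: a flat plate can only be illuminated by the source if the source lies on the side its outward normal points towards. A minimal sketch in C (the vector type and plate representation are assumptions, not SuperNEC's actual data structures):

    typedef struct { double x, y, z; } vec3;

    static double dot(vec3 a, vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

    static vec3 sub(vec3 a, vec3 b) { return (vec3){a.x-b.x, a.y-b.y, a.z-b.z}; }

    /* Returns 1 if the plate faces the source and must be kept; plates
     * failing this test are culled before any ray-path search. */
    static int plate_faces_source(vec3 plate_centre, vec3 plate_normal, vec3 source)
    {
        vec3 to_source = sub(source, plate_centre);
        return dot(plate_normal, to_source) > 0.0;  /* front-facing */
    }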
3

Compiler optimisations and relaxed memory consistency models

Morisset, Robin 05 April 2017 (has links)
Modern multiprocessor architectures and programming languages exhibit weakly consistent memories. Their behaviour is formalised by the memory model of the architecture or programming language; it precisely defines which write operation can be returned by each shared-memory read. This is not always the latest store to the same variable, because of optimisations in the processors, such as speculative execution of instructions, the complex effects of caches, and optimisations in the compilers. In this thesis we focus on the C11 memory model, defined by the 2011 edition of the C standard. Our contributions are threefold. First, we focused on the theory surrounding the C11 model, formally studying which compiler optimisations it enables. We show that many common compiler optimisations are allowed but, surprisingly, some important ones are forbidden. Secondly, building on these results, we developed a random-testing methodology for detecting when mainstream compilers such as GCC or Clang perform an optimisation that is incorrect with respect to the memory model. We found several bugs in GCC, all promptly fixed. We also implemented a novel optimisation pass in LLVM that looks for the special instructions that restrict processor optimisations - called fence instructions - and eliminates the redundant ones. Finally, we developed a user-level scheduler for lightweight threads communicating through first-in first-out single-producer single-consumer queues. This programming model is known as Kahn process networks, and we show how to implement it efficiently using C11 synchronisation primitives. This demonstrates that, despite its flaws, C11 is usable in practice.
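The first-in first-out single-producer single-consumer queues mentioned above can be built directly on C11 atomics. A minimal sketch, assuming a power-of-two capacity and an int payload; the names and layout are illustrative, not taken from the thesis:

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stddef.h>

    #define QCAP 1024              /* capacity, must be a power of two */

    typedef struct {
        _Atomic size_t head;       /* next slot to read  (consumer-owned) */
        _Atomic size_t tail;       /* next slot to write (producer-owned) */
        int buf[QCAP];
    } spsc_queue;

    /* Producer side: write the element, then publish it. */
    static bool spsc_push(spsc_queue *q, int v) {
        size_t t = atomic_load_explicit(&q->tail, memory_order_relaxed);
        size_t h = atomic_load_explicit(&q->head, memory_order_acquire);
        if (t - h == QCAP) return false;          /* queue full */
        q->buf[t % QCAP] = v;
        /* release: the write to buf happens-before the consumer's acquire */
        atomic_store_explicit(&q->tail, t + 1, memory_order_release);
        return true;
    }

    /* Consumer side: the acquire load pairs with the producer's release. */
    static bool spsc_pop(spsc_queue *q, int *out) {
        size_t h = atomic_load_explicit(&q->head, memory_order_relaxed);
        size_t t = atomic_load_explicit(&q->tail, memory_order_acquire);
        if (h == t) return false;                 /* queue empty */
        *out = q->buf[h % QCAP];
        atomic_store_explicit(&q->head, h + 1, memory_order_release);
        return true;
    }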
4

Erbium : Reconciling languages, runtimes, compilation and optimizations for streaming applications

Miranda, Cupertino 11 February 2013 (has links)
As transistor scaling and power limitations hit the computer industry, hardware parallelism emerged as the solution, bringing old, long-forgotten problems back into the equation. Compilers regain focus as the most relevant piece of the puzzle in the quest for the performance improvements predicted by Moore's law, which are no longer attainable without parallelism. Research on parallel systems has mainly focused on language and architectural aspects, without giving the necessary attention to compiler problems; this is why many parallel languages and architectures have weak compiler support and cannot exploit the hardware to the fullest. This thesis addresses these problems by presenting: Erbium, a low-level streaming data-flow language supporting multiple-producer multiple-consumer task communication; a very efficient runtime implementation for x86 architectures, with variants for other types of architectures; an integration of the language into a compiler, illustrated as an intermediate representation in GCC; and a study of the dependences of the language primitives, allowing compilers to further optimise Erbium code, not only through parallel-specific transformations but also through generalised forms of traditional compiler optimisations, such as partial redundancy elimination and dead code elimination.
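As an illustration of the classical optimisations the thesis generalises to parallel code, here is a before/after view of partial redundancy elimination in plain C (a hand-written example, not Erbium code):

    /* Before: a+b is computed on one path and again unconditionally,
     * so the expression is partially redundant. */
    int before(int a, int b, int cond) {
        int r = 0;
        if (cond)
            r = a + b;        /* a+b computed here ...                  */
        return r + (a + b);   /* ... and here on every path             */
    }

    /* After: PRE hoists the expression to a single evaluation. */
    int after(int a, int b, int cond) {
        int t = a + b;
        int r = 0;
        if (cond)
            r = t;
        return r + t;
    }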
5

Low-cost memory analyses for efficient compilers

Maalej Kammoun, Maroua 26 September 2017 (has links)
This thesis was motivated by the emergence of massively parallel processing and supercomputing. Speed, power consumption, and the efficiency of both software and hardware are nowadays the main concerns of the information systems community. Handling memory in a correct and efficient way is a step towards less complex and better-performing programs and architectures. This thesis falls into this context and contributes to the fields of memory analysis and compilation, in both their theoretical and experimental aspects. Besides a deep study of the current state of the art of memory analyses and their limitations, our theoretical results consist in designing new algorithms that recover part of the imprecision that published techniques still show. Among the present limitations, we focus our research on pointer arithmetic, to disambiguate pointers within the same data structure. We develop our analyses in the abstract interpretation framework. The key idea behind this choice is correctness and scalability: two requisite criteria for analyses to be embedded in a compiler. The first alias analysis we design is based on the range lattice of integer variables: given a pair of pointers defined from a common base pointer, they are disjoint if their offsets cannot have values that intersect at runtime. The second pointer analysis we develop is inspired by the Pentagon abstract domain: we conclude that two pointers do not alias whenever we are able to build a strict relation between them, valid at program points where the two variables are simultaneously alive. In a third algorithm, we combine the first and second analyses and enhance them with a coarse-grained but efficient analysis to deal with unrelated pointers. We implement these analyses on top of the LLVM compiler. We experiment with and evaluate their performance based on two metrics: the number of disambiguated pairs of pointers compared to common analyses of the compiler, and the optimisations further enabled thanks to the extra precision they introduce.
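The first analysis reduces alias disambiguation to an interval-intersection test. A minimal sketch, assuming element offsets from a common base pointer whose possible values have been bounded by a prior range analysis (the interval type is an invention for this sketch):

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct { int64_t lo, hi; } range;   /* inclusive interval */

    /* p = base + i, q = base + j, with i in ri and j in rj.
     * If the two intervals are disjoint, p and q can never point to the
     * same element, so accesses through them may be reordered or
     * vectorised safely. */
    static bool offsets_may_alias(range ri, range rj)
    {
        return ri.lo <= rj.hi && rj.lo <= ri.hi;  /* intervals intersect */
    }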
6

Optimization of the synthesis of a key intermediate in glucocorticoids preparation

Jouve, Romain 03 January 2017 (has links)
This thesis aims to reduce the industrial production cost of decortidiene, a key intermediate in the preparation of several active pharmaceutical ingredients manufactured at the SANOFI Vertolaye plant. After a review of the state of the art of decortidiene synthesis, and in light of the defined specifications, two routes were explored. The first synthetic route studied produced decortidiene in two steps, whereas the industrial synthesis requires four; however, the encouraging yield obtained fell short of the target. A molecular modelling study allowed a reaction mechanism to be proposed for this route, during which we also identified the formation of a new corticosteroid with anti-inflammatory activity comparable to that of prednisolone. The second synthetic route reduces the number of steps in the decortidiene synthesis by 25% and achieves a yield acceptable with respect to the specifications. The process was refined over several series using a design-of-experiments approach to reach these objectives.
7

Use of directional modes in resolution

Oudot, Olivier 30 November 1987 (has links) (PDF)
This study demonstrates the usefulness of directional modes for optimising resolution in the Prolog language. Their key characteristic is that they make it possible to distinguish between the different possible uses of a single predicate. An algorithm for producing these modes automatically is described. The study culminates in the Starlog compiler, which operates on a particular case of directional modes.
8

Semantically-enabled stream processing and complex event processing over RDF graph streams

Gillani, Syed 04 November 2016 (has links)
There is a paradigm shift in the nature and processing of today's data: data used to be mostly static, stored in large databases to be queried. Today, with the advent of new applications and means of collecting data, most applications on the Web and in enterprises produce data in a continuous manner, in the form of streams, and the users of these applications expect to process large volumes of data with fresh, low-latency results. This has led to the introduction of Data Stream Management Systems (DSMSs) and the Complex Event Processing (CEP) paradigm - each with a distinctive aim: DSMSs are mostly employed to process traditional query operators (mostly stateless), while CEP systems focus on temporal pattern matching (stateful operators) to detect changes in the data that can be thought of as events. In the past decade or so, a number of scalable, performance-intensive DSMSs and CEP systems have been proposed. Most of them, however, are based on relational data models, which raises the question of support for heterogeneous data sources, i.e., the variety of the data. Work on RDF stream processing (RSP) systems partly addresses the challenge of variety by promoting the RDF data model. Nonetheless, challenges like volume and velocity are overlooked by existing approaches. These challenges require customised optimisations that treat RDF as a first-class citizen and scale the process of continuous graph pattern matching. To gain insight into these problems, this thesis focuses on developing scalable RDF graph stream processing and semantically-enabled CEP systems (i.e., Semantic Complex Event Processing, SCEP). In addition to our optimised algorithmic and data-structure methodologies, we also contribute the design of a new query language for SCEP. Our contributions in these two fields are as follows:
• RDF Graph Stream Processing. We first propose an RDF graph stream model, where each data item/event within a stream comprises an RDF graph (a set of RDF triples). Second, we implement customised indexing techniques and data structures to continuously process RDF graph streams in an incremental manner.
• Semantic Complex Event Processing. We extend the idea of RDF graph stream processing to enable SCEP over such RDF graph streams, i.e., temporal pattern matching. Our first contribution in this context is a new query language that encompasses the RDF graph stream model and employs a set of expressive temporal operators such as sequencing, Kleene-+, negation, optional, conjunction, disjunction, and event-selection strategies. Based on this, we implement a scalable system that employs a non-deterministic finite automaton model to evaluate these operators in an optimised manner.
We leverage techniques from diverse fields, such as relational query optimisation, incremental query processing, and sensor and social networks, in order to solve real-world problems. We have applied our proposed techniques to a wide range of real-world and synthetic datasets to extract knowledge from RDF-structured data in motion. Our experimental evaluations confirm our theoretical insights and demonstrate the viability of our proposed methods.
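The automaton-based evaluation of a sequencing operator can be pictured as a set of partial matches advanced in parallel. A minimal C illustration of SEQ matching with a skip-till-next-match policy; the event type, the character-coded pattern, and all names are inventions for this sketch, not the thesis's query language:

    #include <stdio.h>

    typedef struct { char kind; int payload; } event;

    #define MAXRUNS 64
    static int runs[MAXRUNS];   /* each run holds the index of the next symbol it expects */
    static int nruns = 0;

    /* Feed one event; assumes the pattern has at least two symbols. */
    static void feed(const char *pattern, int plen, event e)
    {
        int kept = 0;
        /* Advance partial matches; under skip-till-next-match a
         * non-matching event leaves a run waiting rather than killing it. */
        for (int i = 0; i < nruns; i++) {
            int next = runs[i] + (pattern[runs[i]] == e.kind);
            if (next == plen)
                printf("match completed by event (%c,%d)\n", e.kind, e.payload);
            else
                runs[kept++] = next;    /* keep the (possibly advanced) run */
        }
        nruns = kept;
        /* Non-determinism: every event may also start a fresh run. */
        if (e.kind == pattern[0] && nruns < MAXRUNS)
            runs[nruns++] = 1;
    }

    int main(void)
    {
        event stream[] = {{'a',1},{'x',2},{'b',3},{'a',4},{'c',5},{'b',6},{'c',7}};
        for (size_t i = 0; i < sizeof stream / sizeof stream[0]; i++)
            feed("abc", 3, stream[i]);   /* reports two completed matches */
        return 0;
    }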
9

Continuous steepest descent path for traversing non-convex regions

Beddiaf, Salah January 2016 (has links)
In this thesis, we investigate methods of finding a local minimum for unconstrained problems of non-convex functions with n variables, by following the solution curve of a system of ordinary differential equations. The motivation for this was the fact that existing methods (e.g. those based on Newton methods with line search) sometimes terminate at a non-stationary point when applied to functions f(x) whose Hessian is not positive definite (i.e. ∇²f(x) ≻ 0 does not hold for all x). Even when such methods do terminate at a stationary point, it could be a saddle point or a maximum rather than a minimum. The only method which makes intuitive sense in a non-convex region is the trust region approach, in which we seek a step that minimises a quadratic model subject to a restriction on the two-norm of the step size. This gives a well-defined search direction, but at the expense of a costly evaluation. The algorithms derived in this thesis are gradient-based methods which require systems of equations to be solved at each step but which do not use a line search in the usual sense. Progress along the Continuous Steepest Descent Path (CSDP) is governed both by the decrease in the function value and by measures of accuracy of a local quadratic model. Numerical results on specially constructed test problems and a number of standard test problems from CUTEr [38] show that the approaches we have considered are more promising than routines in the optimization toolbox of MATLAB [46], namely the trust region method and the quasi-Newton method. In particular, they perform well in comparison with the superficially similar gradient-flow method proposed by Behrman [7].
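Following the continuous steepest descent path amounts to integrating the gradient flow x'(t) = -∇f(x(t)) until a stationary point is reached. A minimal sketch using explicit Euler steps on Himmelblau's non-convex test function; the adaptive step-size control and quadratic-model accuracy tests described above are omitted:

    #include <math.h>
    #include <stdio.h>

    /* f(x,y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2  (Himmelblau's function) */
    static void grad(double x, double y, double g[2])
    {
        g[0] = 4*x*(x*x + y - 11) + 2*(x + y*y - 7);
        g[1] = 2*(x*x + y - 11) + 4*y*(x + y*y - 7);
    }

    int main(void)
    {
        double x = 0.0, y = 0.0, g[2];
        const double h = 0.01;                     /* fixed Euler step along the path */
        for (int k = 0; k < 10000; k++) {
            grad(x, y, g);
            if (hypot(g[0], g[1]) < 1e-8) break;   /* stationary point reached */
            x -= h * g[0];                         /* follow -grad f */
            y -= h * g[1];
        }
        printf("converged near (%.4f, %.4f)\n", x, y);
        return 0;
    }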
