1 |
Using Decision Tree Voting to Select a Polyhedral Model Loop TransformationRuvinskiy, Ray January 2013 (has links)
Algorithms in fields like image manipulation, sound and signal processing, and statistics frequently employ tight loops. These loops are computationally intensive and CPU-bound, making their performance highly dependent on efficient utilization of the CPU pipeline and memory bus. Recent years have seen CPU pipelines becoming more and more complicated, with features such as branch prediction and speculative execution. At the same time, clock speeds have stopped their prior exponential growth rate due to heat dissipation issues, and multiple cores have become prevalent. These developments have made it more difficult for developers to reason about how their code executes on the CPU, which in turn makes it difficult to write performant code. An automated method to take code and optimize it for most efficient execution would, therefore, be desirable. The Polyhedral Model allows the generation of alternative transformations for a loop nest that are semantically equivalent to the original. The transformations vary the degree of loop tiling, loop fusion, loop unrolling, parallelism, and vectorization. However, selecting the transformation that would most efficiently utilize the architecture remains challenging. Previous work utilizes regression models to select a transformation, using as features hardware performance counter values collected during a sample run of the program being optimized. Due to inaccuracies in the resulting regression model, the transformation selected by the model as the best transformation often yields unsatisfactory performance. As a result, previous work resorts to using a five-shot technique, which entails running the top five transformations suggested by the model and selecting the best one based on their actual runtime. However, for long-running benchmarks, five runs may be take an excessive amount of time. I present a variation on the previous approach which does not need to resort to the five-shot selection process to achieve performance comparable to the best five-shot results reported in previous work. With the transformations in the search space ranked in reverse runtime order, the transformation selected by my classifier is, on average, in the 86th percentile. There are several key contributing factors to the performance improvements attained by my method: formulating the problem as a classification problem rather than a regression problem, using static features in addition to dynamic performance counter features, performing feature selection, and using ensemble methods to boost the performance of the classifier. Decision trees are constructed from pairs of features (performance counters and structural features than can be determined statically from the source code). The trees are then evaluated according to the number of benchmarks for which they select a transformation that performs better than two baseline variants, the original program and the expected runtime if a randomly selected transformation were applied. The top 20 trees vote to select a final transformation.
|
2 |
Characterizing the Effectiveness of Compilers in Vectorizing Polyhedrally Transformed CodeChidambarnathan, Yogesh 22 May 2013 (has links)
No description available.
|
3 |
Effective Automatic Parallelization and Locality Optimization Using The Polyhedral ModelBondhugula, Uday Kumar 11 September 2008 (has links)
No description available.
|
4 |
Applying Polyhedral Transformation to Fortran ProgramsGururaghavendran, Ashwin 31 March 2011 (has links)
No description available.
|
5 |
A programming language based on recurrence equations and polyhedral compilation for stream processingLeben, Jakob 31 July 2019 (has links)
The work presented in this dissertation contributes to the field of programming lan-
guage design and implementation for stream processing applications. There is a
fast-expanding domain of stream processing applications which demand processing
high-volume streams quickly and often in real time. Examples include analysis and
synthesis of audio, video and other digital media, sensor array signals, real-time phys-
ical simulation etc. High performance is crucial in this domain. When choosing
between available programming methods, the programmer often chooses one that
maximizes performance while sacrificing ease of programming, code comprehension,
maintainability and reusability. This work contributes towards improving the state
of the art by jointly maximizing these aspects.
High-volume streams are often most naturally represented as multi-dimensional
arrays with one infinite dimension representing time. Algorithms working with such
streams are typically defined mathematically using recurrence equations. A pro-
gramming language is presented in this dissertation which enables an almost literal
translation of such mathematical definitions to computer programs. The language
also supports powerful facilities for abstraction and code reuse such as polymorphic
and higher-order functions. Together, these features enable a more natural expression
of algorithms and improve code modularity and reusability.
A major contribution of this dissertation is the compilation of the proposed lan-
guage in the polyhedral framework, specifically targeting general-purpose multi-core
processors. This framework provides powerful means of analysis and transformations
of computations on multi-dimensional arrays, which enables data-locality optimiza-
tions essential for high performance on general-purpose processors with deep memory
hierarchies. The benefit of this framework for computations on finite arrays has been
extensively explored. However, this dissertation presents essential extensions that
enable the application of state-of-the-art optimizations in this framework on infinite
arrays representing streams. / Graduate
|
6 |
Detection of linear algebra operations in polyhedral programs / Reconnaissance d'opérations d'algèbre linéaire dans un programme polyédriqueIooss, Guillaume 01 July 2016 (has links)
Durant ces dernières années, Il est de plus en plus compliqué d'écrire du code qui utilise une architecture au mieux de ses capacités. Certaines opérations clefs ont soit un accélérateur dédié, ou admettent une implémentation finement optimisée qui délivre les meilleurs performances. Ainsi, il est intéressant d'identifier ces opérations pendant la compilation d'un programme, et de faire appel à une implémentation optimisée.Nous nous intéressons dans cette thèse au problème de détection de ces opérations. Nous proposons un procédé qui détecte des sous-calculs correspondant à des opérations d'algèbre linéaire à l'intérieur de programmes polyédriques. L'idée principale de ce procédé est de découper le programme en sous-calculs isolés, et essayer de reconnaître chaque sous-calculs comme une combinaison d'opérateurs d'algèbre linéaire.Le découpage du calcul est effectué en utilisant une transformation de programme appelée tuilage monoparamétrique. Cette transformation partitionne le calcul en tuiles dont la forme est un agrandissement paramétrique d'une tuile de taille constante. Nous montrons que le programme tuilé reste polyédrique tout en permettant une paramétrisation limitée des tailles de tuile. Les travaux précédents sur le tuilage nous forçaient à choisir l'une de ces deux propriétés.Ensuite, afin d'identifier les opérateurs, nous introduisons un algorithme de reconnaissance de template, qui est une extension d'un algorithme d'équivalence de programme. Nous proposons plusieurs extensions afin de tenir compte des propriétés sémantiques communément rencontrées en algèbre linéaire.Enfin, nous combinons les deux contributions précédentes en un procédé qui détecte les sous-calculs correspondant à des opérateurs d'algèbre linéaire. Une de ses composantes est une librairie de template, inspirée de la spécification BLAS. Nous démontrons l'efficacité de notre procédé sur plusieurs applications. / Writing a code which uses an architecture at its full capability has become an increasingly difficult problem over the last years. For some key operations, a dedicated accelerator or a finely tuned implementation exists and delivers the best performance. Thus, when compiling a code, identifying these operations and issuing calls to their high-performance implementation is attractive. In this dissertation, we focus on the problem of detection of these operations. We propose a framework which detects linear algebra subcomputations within a polyhedral program. The main idea of this framework is to partition the computation in order to isolate different subcomputations in a regular manner, then we consider each portion of the computation and try to recognize it as a combination of linear algebra operations.We perform the partitioning of the computation by using a program transformation called monoparametric tiling. This transformation partitions the computation into blocks, whose shape is some homothetic scaling of a fixed-size partitioning. We show that the tiled program remains polyhedral while allowing a limited amount of parametrization: a single size parameter. This is an improvement compared to the previous work on tiling, that forced us to choose between these two properties.Then, in order to recognize computations, we introduce a template recognition algorithm. This template recognition algorithm is built on a state-of-the-art program equivalence algorithm. We also propose several extensions in order to manage some semantic properties.Finally, we combine these two previous contributions into a framework which detects linear algebra subcomputations. A part of this framework is a library of template, based on the BLAS specification. We demonstrate our framework on several applications.
|
7 |
Robust routing optimization in resilient networks : Polyhedral model and complexity issues / Optimisation robuste du routage dans les réseaux résilients : Modèle polyédrique et problèmes de complexitéZotkiewicz, Mateusz 04 January 2011 (has links)
Dans les grands réseaux de transport, certains éléments du réseau peuvent être responsables du traitement d’importants volumes de trafic. Cela rend ces réseaux vulnérables aux pannes telles que les coupures de câbles. Des mécanismes appropriés pour le recouvrement du trafic doivent être mis en oeuvre pour éviter les ruptures de service. Une des meilleures techniques pour protéger les réseaux de transport consiste à prévoir des mécanismes de restauration au niveau de la couche transport elle-même afin que chaque opérateur de transport puisse sécuriser son propre réseau et offrir un service de transport fiable aux autres acteurs tels que les opérateurs IP. D’autres mécanismes de protection pourront alors être déployés aux niveaux supérieurs sans interférences avec la restauration au niveau transport. Outre les pannes pouvant touchers ses composantes, un réseau doit aussi faire face à l’incertitude de la matrice de trafic qu’on chercher à acheminer dans le réseau. Cette incertitude est une conséquence de la multiplication des applications et services faisant appel au réseau. La mobilité des usagers ainsi que les pannes touchant le réseau contribuent également à cette incertitude. La thèse se découpe donc en deux parties. Dans la première partie, nous nous intéressons à la complexité des différents mécanismes de sécurisation des réseaux. Dans la seconde partie, nous nous intéressons à l’incertitude de la matrice de trafic et notamment au modèle polyédrique / In the thesis robust routing design problems in resilient networks are considered. In the first part computational complexity of such problems are discussed. The following cases are considered: - path protection and path restoration - failure-dependent and failure-independent restoration - cases with and without stub-release - single-link failures and multiple-link failures (shared risk link group) - non-bifurcated (unsplittable) flows and bifurcated flows For each of the related optimization cases a mixed-integer (in the non-bifurcated cases) or linear programming formulation (in all bifurcated cases) is presented, and their computational complexity is investigated. For the NP-hard cases original NP-hardness proofs are provided, while for the polynomial cases compact linear programming formulations (which prove the polynomiality in the question) are discussed. Moreover, pricing problems related to each of the considered NP-hard problems are discussed. The second part of the thesis deals with various routing strategies in networks where the uncertainty issues are modeled using the polyhedral model. In such networks two extrema are possible. The simplest in terms of implementation, and simultaneously the least effective strategy, is the robust stable routing. On the other hand, the most effective strategy, i.e., the dynamic routing, is virtually impossible to implement in real world networks. Therefore, the major aim of this part of the thesis is to present novel routing strategies that merge the simplicity of the robust stable routing with the efficiency of the dynamic routing
|
8 |
Quadrature-Based Gravity Models for the Homogeneous PolyhedronPearl, Jason 01 January 2019 (has links)
A number of missions to comets and asteroids have been undertaken by major space organizations driving a need to accurately characterize their gravitational fields. This is complicated however by their irregular shapes. To accurately and safely navigate spacecraft in these environments, a simple point-mass gravity model is insufficient and instead higher-fidelity models are required. Several such models exist for this purpose but all posess drawbacks. Moreover, there are some applications for which the currently available models are not particular well suited.
In this dissertation, numerical quadrature and curvilinear meshing techniques are applied to the small body gravity problem. The goal of this work is to to create a gravitational model suitable for integating large numbers of low altitude trajectories and rapidly characterizing the near-surface potential field. In total three new models are developed. The first applies two-dimensional quadrature formulas to calculate the gravitational field of an arbitrary triangular surface mesh. The second extends this result to curvilinear surface meshes that more accurately approximate the surface topology. The third applies three-dimensional quadrature to curvilinear tetrahedral meshes to generate accurate distributions of point-masses. The accuracy of the new models is fully characterized and simple relations are presented for predicting the error of integrated trajectories. The efficiency of the models is then compared to other high-fidelity models currently in use. The new models perform well between the body's circumsphere and a thin layer that surrounds the surface.
|
9 |
Restructuration interactive des programmes / Interactive Program RestructuringZinenko, Oleksandr 25 November 2016 (has links)
Le développement des logiciels et leur restructuration deviennent de plus en plus complexes à cause de l'adoption massive des architectures parallèles, ce qui nécessite une expertise considérable de la part des développeurs. Bien que des nombreux modèles et langages de programmation permettent de créer des programmes efficaces, ils n'offrent pas de support spécifique à la restructuration des programmes existants afin d'en augmenter l'efficacité. En même temps, les approches automatiques sont trop conservatives et insuffisamment précises pour atteindre une partie substantielle de la performance du système sans que le développeur aie à fournir des informations sémantiques supplémentaires. Pour répondre à ces défis, nous adoptons l'approche de la restructuration interactive des programmes qui lie la manipulation semi-automatique des programmes avec la visualisation des logiciels. Dans cette thèse, l'approche de restructuration interactive est illustrée par l'extension du modèle polyédrique - une représentation des programmes moderne et puissante - pour permettre la manipulation de haut niveau ainsi que par la conception et l'évaluation d'une interface visuelle à manipulation directe pour la restructuration des programmes. Cette interface visualise l'information qui n'était pas immédiatement accessible dans la représentation textuelle et permet de manipuler des programmes sans en réécrire le code. Nous proposons également une représentation de l'optimisation de programme, calculée automatiquement, telle que le développeur puisse la comprendre et réutiliser facilement ainsi que la modifier d'une manière textuelle ou visuelle dans le cadre du partenariat homme-machine. Afin de représenter plusieurs aspects de la restructuration des programmes, nous concevons et évaluons une nouvelle interaction qui permet de communiquer l'information supplémentaire et non-cruciale pour la tâche à accomplir. Après une étude empirique de la distribution d'attention des développeurs face aux représentations textuelles et visuelles des programmes, nous discutons des implications pour la conception des outils d'aide à la programmation dans le cadre du modèle d'interaction instrumentale. La restructuration interactive des programmes est supposée faciliter la manipulation des programmes dans le but d'optimisation, la rendre plus efficace et plus largement adopté. / Software development and program manipulation become increasingly complex with the massive adoption of parallel architectures, requiring significant expertise from developers. While numerous programming models and languages allow for creating efficient programs, they fall short at helping developers to restructure existing programs for more effective execution. At the same time, automatic approaches are overly conservative and imprecise to achieve a decent portion of the systems' performance without supplementary semantic information from the developer. To answer these challenges, we propose the interactive program restructuring approach, a bridge between semi-automatic program manipulation and software visualization. It is illustrated in this thesis by, first, extending a state-of-the-art polyhedral model for program representation so that it supports high-level program manipulation and, second, by designing and evaluating a direct manipulation visual interface for program restructuring. This interface provides information about the program that was not immediately accessible in the code and allows to manipulate programs without rewriting. We also propose a representation of an automatically computed program optimization in an understandable form, easily modifiable and reusable by the developer both visually and textually in a sort of human-machine partnership. To support different aspects of program restructuring, we design and evaluate a new interaction to communicate supplementary information, not critical for the task at hand. After an empirical study of developers' attention distribution when faced with visual and textual program representation, we discuss the implications for design of program manipulation tools in the instrumental interaction paradigm. We expect interactive program restructuring to make program manipulation for optimization more efficient and widely adopted.
|
10 |
Improving tiling, reducing compilation time, and extending the scope of polyhedral compilation / Amélioration du tuilage, réduction du temps de compilation, et extension de l'utilisabilité de la compilation polyédriqueBaghdadi, Mohamed Riyadh 25 September 2015 (has links)
Les processeurs multi-coeurs sont maintenant largement utilisés presque partout en informatique: ordinateurs de bureau, ordinateurs portables et accélérateurs tels que les GPGPU (General Purpose Graphics Processing Units). La difficulté de la programmation des systèmes parallèles est considérée comme un problème majeur qui va empêcher l'exploitation de leurs capacités dans le futur. Pour exploiter la puissance des processeurs multi-coeurs et les hiérarchies complexes de mémoire, il y a une grande nécessité pour utiliser des outils de parallélisation et d'optimisation automatique de code. L'optimisation polyédrique est un axe de recherche qui a comme but de résoudre ces problèmes. C'est est une représentation algébrique du programme et un ensemble d'analyses, de transformations et d'algorithmes de génération de code qui permettent à un compilateur de raisonner sur des transformations avancées de nids de boucle. Dans cette thèse, nous abordons certaines des limites du modèle polyédrique. Nous nous intéréssons particulièrement à trois problèmes et nous proposons des solutions pratiques à ces trois problèmes. Le premier problème est lié à la capacité d'appliquer l'optimisation de tuilage sur un code qui contient des fausses dépendances. Nous proposons une téchnique qui permet d'ignorer certaines fausses dépendences et donc qui permet d'appliquer l'optimisation de tuilage qui n'est pas possible sinon. Le second problème est lié au temps de compilation qui peut être trés long pour certains programmes. Nous proposons une téchnique qui transforme la représentation originale du programme à une nouvelle representation dans laquelle il y a moins d'instructions. L'optimisation de cette nouvelle représentation du programme est moins couteuse en terme de temps de compilation en comparaison avec l'optimisation de la représentation originale du programme. Le troisième problème est lié à deux limites: la première limite concerne la possibilité d'utiliser la compilation polyédrique sur des programmes qui ne resepectent pas les restrictions classiques du modèle polyédrique (un programme peut être représenté de façon précise dans le modèle polyédrique s'il ne contient pas des conditionnelles non-affines, des bornes de boucles non-affines et des accés non-affines). La seconde limite est liée à l'aptitude des outils à générer un code performant dans les performances se rapprochent des performances du code écrit à la main. Pour éviter ces deux limites, nous proposons un language de programmation que l'on appelle PENCIL, c'est un sous-ensemble de GNU C99 avec des règles de programmation spécifiques et quelques extensions. L'utilisation de ce sous-ensemble et l'utilisation de ces extensions permettent aux compilateurs de mieux exploiter le parallélisme et de mieux optimiser le code. / Multi-core processors are now in widespread use in almost all areas of computing: desktops, laptops and accelerators such as GPGPUs (General Purpose Graphics Processing Units). To harness the power of multi-core processors and complex memory hierarchies, the need for powerful compiler optimizations and especially loop nest transformations is now in high demand. The polyhedral optimization framework is showing promising results in addressing such a problem. It's an algebraic program representation and a set of analyses, transformations and code generation algorithms that enable a compiler to reason about advanced loop nest transformations addressing most of the parallelism and locality-enhancing challenges.In this thesis we address some of the limitations of the polyhedral framework. We address three problems and propose practical solutions to these three problems.The first problem is related to the ability to apply tiling on code that has false dependences (loop nest tiling is an optimization that changes the order of execution of statements in a loop nest in order to enhance data locality; false dependences are induced by the reuse of a single memory location to store multiple values during the life of the program). To preserve the validity of loop nest transformations and parallelization, data-dependences need to be analyzed. Memory dependences come in two varieties: true dependences (a.k.a. flow dependences) and false dependences (a.k.a. output and anti dependences). While true dependences must be satisfied in order to preserve the correct order of computations. False dependences reduce the degrees of freedom for loop transformations. In particular, loop tiling is severely limited in the presence of these dependences. While array expansion, a transformation that transforms scalars into arrays and arrays into higher dimensional arrays, removes all false dependences, the overhead of this transformation on memory and the detrimental impact on register-level reuse can be catastrophic. We propose and evaluate a compilation technique to safely ignore a large number of false dependences in order to enable loop nest tiling in the polyhedral model. It is based on the precise characterization of interferences between live range intervals, and it does not incur any scalar or array expansion.The second problem is related to the long compilation time that one may experience when using polyhedral tools to optimize a program. Particularly, the long execution time of the Pluto affine scheduling algorithm. The Pluto affine scheduling algorithm is the algorithm that is responsible for changing the schedule (order of execution) of statements in order to optimize the code (maximize parallelism and data locality). Reducing the execution time of this affine scheduling algorithm enhances the overall compilation time. We introduce and evaluate a technique called offline statement clustering. It is a practical technique designed to reduce the execution time of the Pluto affine scheduling algorithm without much loss in optimization opportunities. Using this technique, the statements of the program are clustered into macro-statements, the Pluto affine scheduling algorithm is then used to schedule the macro-statements instead of scheduling the original statements of the program. Since the number of macro-statements is less than the number of statements in the original program, scheduling the macro-statements is in general faster than scheduling the original statements of the program. We present the statement clustering algorithm, we show how offline statement clustering integrates transparently with the work-flow of a state-of-the-art polyhedral compiler and present two heuristics for choosing how statements should be clustered together. We show experimentally that statement clustering can reduce the scheduling time by a factor of 8x (in median) without a significant loss in optimization opportunities...
|
Page generated in 0.0615 seconds