Global ETD Search

1	Design and Implementation of Thread-Level Speculation in JavaScript Engines Martinsen, Jan Kasper January 2014 (has links) Two important trends in computer systems are that applications are moved to the Internet as web applications, and that computer systems are getting an increasing number of cores to increase the performance. It has been shown that JavaScript in web applications has a large potential for parallel execution despite the fact that JavaScript is a sequential language. In this thesis, we show that JavaScript execution in web applications and in benchmarks are fundamentally different and that an effect of this is that Just-in-time compilation does often not improve the execution time, but rather increases the execution time for JavaScript in web applications. Since there is a significant potential for parallel computation in JavaScript for web applications, we show that Thread-Level Speculation can be used to take advantage of this in a manner completely transparent to the programmer. The Thread-Level Speculation technique is very suitable for improving the performance of JavaScript execution in web applications; however we observe that the memory overhead can be substantial. Therefore, we propose several techniques for adaptive speculation as well as for memory reduction. In the last part of this thesis we show that Just-in-time compilation and Thread-Level Speculation are complementary techniques. The execution characteristics of JavaScript in web applications are very suitable for combining Just-in-time compilation and Thread-Level Speculation. Finally, we show that Thread-Level Speculation and Just-in-time compilation can be combined to reduce power usage on embedded devices. parallel computation web applications thread-level speculation
2	A data dependency recovery system for a heterogeneous multicore processor Kainth, Haresh S. January 2014 (has links) Multicore processors often increase the performance of applications. However, with their deeper pipelining, they have proven increasingly difficult to improve. In an attempt to deliver enhanced performance at lower power requirements, semiconductor microprocessor manufacturers have progressively utilised chip-multicore processors. Existing research has utilised a very common technique known as thread-level speculation. This technique attempts to compute results before the actual result is known. However, thread-level speculation impacts operation latency, circuit timing, confounds data cache behaviour and code generation in the compiler. We describe an software framework codenamed Lyuba that handles low-level data hazards and automatically recovers the application from data hazards without programmer and speculation intervention for an asymmetric chip-multicore processor. The problem of determining correct execution of multiple threads when data hazards occur on conventional symmetrical chip-multicore processors is a significant and on-going challenge. However, there has been very little focus on the use of asymmetrical (heterogeneous) processors with applications that have complex data dependencies. The purpose of this thesis is to: (i) define the development of a software framework for an asymmetric (heterogeneous) chip-multicore processor; (ii) present an optimal software control of hardware for distributed processing and recovery from violations;(iii) provides performance results of five applications using three datasets. Applications with a small dataset showed an improvement of 17% and a larger dataset showed an improvement of 16% giving overall 11% improvement in performance. 004.35
3	Complementing user-level coarse-grain parallelism with implicit speculative parallelism Ioannou, Nikolas January 2012 (has links) Multi-core and many-core systems are the norm in contemporary processor technology and are expected to remain so for the foreseeable future. Parallel programming is, thus, here to stay and programmers have to endorse it if they are to exploit such systems for their applications. Programs using parallel programming primitives like PThreads or OpenMP often exploit coarse-grain parallelism, because it offers a good trade-off between programming effort versus performance gain. Some parallel applications show limited or no scaling beyond a number of cores. Given the abundant number of cores expected in future many-cores, several cores would remain idle in such cases while execution performance stagnates. This thesis proposes using cores that do not contribute to performance improvement for running implicit fine-grain speculative threads. In particular, we present a many-core architecture and protocols that allow applications with coarse-grain explicit parallelism to further exploit implicit speculative parallelism within each thread. We show that complementing parallel programs with implicit speculative mechanisms offers significant performance improvements for a large and diverse set of parallel benchmarks. Implicit speculative parallelism frees the programmer from the additional effort to explicitly partition the work into finer and properly synchronized tasks. Our results show that, for a many-core comprising 128 cores supporting implicit speculative parallelism in clusters of 2 or 4 cores, performance improves on top of the highest scalability point by 44% on average for the 4-core cluster and by 31% on average for the 2-core cluster. We also show that this approach often leads to better performance and energy efficiency compared to existing alternatives such as Core Fusion and Turbo Boost. Moreover, we present a dynamic mechanism to choose the number of explicit and implicit threads, which performs within 6% of the static oracle selection of threads. To improve energy efficiency processors allow for Dynamic Voltage and Frequency Scaling (DVFS), which enables changing their performance and power consumption on-the-fly. We evaluate the amenability of the proposed explicit plus implicit threads scheme to traditional power management techniques for multithreaded applications and identify room for improvement. We thus augment prior schemes and introduce a novel multithreaded power management scheme that accounts for implicit threads and aims to minimize the Energy Delay2 product (ED2). Our scheme comprises two components: a “local” component that tries to adapt to the different program phases on a per explicit thread basis, taking into account implicit thread behavior, and a “global” component that augments the local components with information regarding inter-thread synchronization. Experimental results show a reduction of ED2 of 8% compared to having no power management, with an average reduction in power of 15% that comes at a minimal loss of performance of less than 3% on average.
4	High Performance Architecture using Speculative Threads and Dynamic Memory Management Hardware Li, Wentong 12 1900 (has links) With the advances in very large scale integration (VLSI) technology, hundreds of billions of transistors can be packed into a single chip. With the increased hardware budget, how to take advantage of available hardware resources becomes an important research area. Some researchers have shifted from control flow Von-Neumann architecture back to dataflow architecture again in order to explore scalable architectures leading to multi-core systems with several hundreds of processing elements. In this dissertation, I address how the performance of modern processing systems can be improved, while attempting to reduce hardware complexity and energy consumptions. My research described here tackles both central processing unit (CPU) performance and memory subsystem performance. More specifically I will describe my research related to the design of an innovative decoupled multithreaded architecture that can be used in multi-core processor implementations. I also address how memory management functions can be off-loaded from processing pipelines to further improve system performance and eliminate cache pollution caused by runtime management functions. memory management Thread level speculation data flow architecture decoupled architecture Computer architecture. Memory management (Computer science)
5	Beyond the realm of the polyhedral model : combining speculative program parallelization with polyhedral compilation / Au-delà des limites du modèle polyédrique : l'alliage de la parallélisation spéculative de programmes avec la compilation polyédrique Sukumaran Rajam, Aravind 05 November 2015 (has links) Dans cette thèse, nous présentons nos contributions à Apollo (Automatic speculative POLyhedral Loop Optimizer), qui est un compilateur automatique combinant la parallélisation spéculative et le modèle polyédrique, afin d’optimiser les codes à la volée. En effectuant une instrumentation partielle au cours de l’exécution, et en la soumettant à une interpolation, Apollo est capable de construire un modèle polyédrique spéculatif dynamiquement. Ce modèle spéculatif est ensuite transmis à Pluto, qui est un ordonnanceur polyédrique statique. Apollo sélectionne ensuite un des squelettes d’optimisation de code générés statiquement, et l’instancie. La partie dynamique d’Apollo surveille continuellement l’exécution du code afin de détecter de manière dé- centralisée toute violation de dépendance. Une autre contribution importante de cette thèse est notre extension du modèle polyédrique aux codes exhibant un comportement non-linéaire. Grâce au contexte dynamique et spéculatif d’Apollo, les comportements non-linéaires sont soit modélisés par des hyperplans de régression linéaire formant des tubes, soit par des intervalles de valeurs atteintes. Notre approche permet l’application de transformations polyédriques à des codes non-linéaires grâce à un système de vérification de la spéculation hybride, combinant vérifications centralisées et décentralisées. / In this thesis, we present our contributions to APOLLO (Automatic speculative POLyhedral Loop Optimizer), which is an automated compiler combining Thread Level Speculation (TLS) and the polyhedral model to optimize codes on the fly. By doing partial instrumentation at runtime, and subjecting it to interpolation, Apollo is able to construct a speculative polyhedral model dynamically. The speculative model is then passed to Pluto -a static polyhedral scheduler-. Apollo then selects one of the statically generated code optimization skeletons and instantiates it. The runtime continuously monitors the code for any dependence violation in a decentralized manner. Another important contribution of this thesis is our extension of the polyhedral model to codes exhibiting a non linear behavior. Thanks to the dynamic and speculative context offered by Apollo, non-linear behaviors are either modeled using linear regression hyperplanes forming tubes, or using ranges of reached values. Our approach enables the application of polyhedral transformations to non-linear codes thanks to an hybrid centralized-decentralized speculation verification system Compilation dynamique Modèle polyédrique Parallélisation spéculative APOLLO Régression linéaire Dynamic compilation Polyedral model Loop optimization Thread level speculation Speculative optimization Regression Compilers 005.4

1

Page generated in 0.1095 seconds