Return to search

High performance optimizations in runtime speculative parallelization for multicore architectures

Thread-Level Speculation (TLS) overcomes limitations intrinsic with conservative compile-time auto-parallelizing tools by extracting parallel threads optimistically and only ensuring absence of data dependence violations at runtime. A significant barrier for adopting TLS (implemented in software) is the overheads associated with maintaining speculative state. Previous TLS limit studies observe that on future multi-core systems it is likely to have more cores idle than those which traditional TLS would be able to harness. This thesis describes a novel compact version management data structure optimized for space overhead when using a small number of TLS threads. Furthermore, two novel software runtime parallelization systems were developed that utilize this compact data structure. The first one, MiniTLS, is optimized for fast recovery in the case of misspeculations by parallelizing the recovery procedure. The second one, Lector, is optimizedfor performance by using lightweight helper threads, along with TLS threads, to establish whether speculation can be withdrawn avoiding that way any speculative overheads. Facilitated by the novel compact representation, MiniTLS reduces the space overhead over state-of-the-art software TLS systems between 96% on 2 threads and 40% on 32 threads. MiniTLS and Lector were applied to seven Java benchmarks performing on average 7x and 8.2x faster, respectively, against the sequential versions and on average 1.7x faster than the current state-of-the-art in software TLS for 32 threads.

Identiferoai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:607033
Date January 2013
CreatorsYiapanis, Paraskevas
ContributorsLujan Moreno, Mikel Lujan; Brown, Gavin
PublisherUniversity of Manchester
Source SetsEthos UK
Detected LanguageEnglish
TypeElectronic Thesis or Dissertation
Sourcehttps://www.research.manchester.ac.uk/portal/en/theses/high-performance-optimizations-in-runtime-speculative-parallelization-for-multicore-architectures(1d980957-a08d-49f0-8f7f-d5c35bae38b8).html

Page generated in 0.0022 seconds