
Automatic Source Code Transformation To Pass Compiler Optimization

Loop vectorization is a powerful optimization technique that can significantly reduce the runtime of loops. This optimization depends on functional equivalence between the original and optimized versions of the code, a requirement typically established through the compiler's static analysis. When this condition cannot be established, the compiler misses the optimization. Manually rewriting the source code so that an already missed compiler optimization is applied is time-consuming, given the multitude of potential code variations, and demands a high level of expertise, making it impractical in many scenarios. In this work, we propose a novel framework that takes the code blocks the compiler failed to optimize and transforms them into equivalent code blocks that pass the compiler optimization. We develop an algorithm to efficiently search for a code structure that passes the compiler optimization automatically (with equivalence weakly verified through a correctness test). We focus on the loop-vectorize optimization inside OpenMP directives, where the introduction of parallelism adds complexity to the compiler's vectorization task and is shown to hinder optimizations. Furthermore, we introduce a modified version of TSVC, a loop-vectorization benchmark, in which all of the original loops are executed within OpenMP directives.
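As a hedged illustration (not drawn from the thesis itself), the sketch below shows the kind of rewrite such a framework targets: a loop inside an OpenMP directive that Clang's loop-vectorize pass may skip because it cannot prove the two arrays never alias, alongside a variant using C's restrict qualifier that supplies the missing guarantee. The function names and the specific rewrite are illustrative assumptions; whether vectorization actually succeeds depends on the compiler and flags (for example, clang -O3 -fopenmp -Rpass-missed=loop-vectorize reports loops the pass rejected).

#include <stddef.h>

/* Illustrative example, not code from the thesis.
 * With plain pointers the compiler may be unable to prove that 'a' and 'b'
 * do not overlap, so the loop-vectorize pass can be missed even though the
 * OpenMP directive already parallelizes the loop across threads. */
void scale_add(float *a, const float *b, float s, long n) {
    #pragma omp parallel for
    for (long i = 0; i < n; i++)
        a[i] = a[i] + s * b[i];
}

/* One possible transformed form: 'restrict' asserts that the arrays do not
 * alias, giving the compiler the equivalence guarantee it needs to vectorize
 * each thread's chunk of iterations. */
void scale_add_vectorizable(float *restrict a, const float *restrict b, float s, long n) {
    #pragma omp parallel for
    for (long i = 0; i < n; i++)
        a[i] = a[i] + s * b[i];
}

The framework described here differs from such hand annotation in that it searches over source-level rewrites automatically and weakly verifies each candidate with a correctness test rather than relying on programmer-supplied guarantees.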
Our evaluation shows that our framework enables loop-vectorize optimizations that the compiler had previously failed to apply, yielding speedups of up to 340× in the optimized blocks. Furthermore, applying our tool to HPC benchmark applications, which are already built with optimization and performance in mind, demonstrates that our technique successfully extends compiler optimization, speeding up the optimized blocks in 15 loops and the overall execution of the three applications by up to 1.58 times.

Master of Science

Loop vectorization is a powerful technique for improving the performance of specific sections of computer programs known as loops. It executes instructions from different iterations of a loop simultaneously, and this parallelism provides a considerable speedup. To apply this optimization, the code needs to meet certain conditions, which are usually checked by the compiler. However, sometimes the compiler cannot verify these conditions, and the optimization fails. Our research introduces a new approach to fix these issues automatically.
Normally, fixing the code manually to meet these conditions is time-consuming and requires a high level of expertise. To overcome this, we have developed a tool that can efficiently find ways to make the code satisfy the conditions needed for optimization.
Our focus is on a specific type of code that uses OpenMP directives to split a loop across multiple processor cores and run its iterations simultaneously; adding this parallelism makes the code more complex for the compiler to optimize.
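For readers unfamiliar with OpenMP, the short snippet below (an illustrative assumption, not code from the thesis) shows the kind of directive meant here: #pragma omp parallel for splits the loop's iterations across the available processor cores so they run simultaneously.

#include <omp.h>
#include <stdio.h>

int main(void) {
    double sum = 0.0;
    /* Illustrative example, not from the thesis: the directive divides the
     * iterations among the cores, and the reduction clause safely combines
     * each thread's partial sum. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < 1000000; i++)
        sum += (double)i;
    printf("sum = %.0f using up to %d threads\n", sum, omp_get_max_threads());
    return 0;
}

Compiled with an OpenMP-capable compiler (e.g., clang -fopenmp or gcc -fopenmp), each core works on a different slice of the loop; this is the parallelism that complicates the compiler's vectorization analysis.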
Our tests show that our approach successfully speeds up computer programs by enabling optimizations that the compiler initially missed, making specific parts of the code up to 340 times faster. We have also applied our method to already well-optimized programs, and it still made them run up to 1.58 times faster.

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/117295
Date: 03 January 2024
Creators: Kahla, Moustafa Mohamed
Contributors: Electrical and Computer Engineering, Nikolopoulos, Dimitrios S., Xiong, Wenjie, Ravindran, Binoy
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Language: English
Detected Language: English
Type: Thesis
Format: ETD, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
