Global ETD Search

1	Baseband Processing Using the Julia Language Mellberg, Linus January 2015 Baseband processing is an important and computationally heavy part of modern mobile cellular systems. These systems use specialized hardware that has many digital signal processing cores and hardware accelerators. The algorithms that run on these systems are complex and need to take advantage of this hardware. Developing software for these systems requires domain knowledge about baseband processing and low-level programming on parallel real-time systems. This thesis investigates whether the programming language Julia can be used to implement algorithms for baseband processing in mobile telephony base stations. If a scientific language like Julia can be used to directly implement programs for the special hardware in the base stations, it can reduce lead times and costs. In this thesis an uplink receiver is implemented in Julia. This implementation is written using a domain-specific language. This makes it possible to specify a number of transformations that use the metaprogramming capabilities in Julia to transform the uplink receiver such that it is better suited to execute on the hardware described above. This is achieved by transforming the program such that it consists of functions that can each be executed either on a single digital signal processing core or on a hardware accelerator. It is concluded that Julia seems suited for prototyping baseband processing algorithms. Using metaprogramming to transform a baseband processing algorithm to be better suited for baseband processing hardware is also a feasible approach.
2	Automatic Source Code Transformation To Pass Compiler Optimization Kahla, Moustafa Mohamed 03 January 2024 Loop vectorization is a powerful optimization technique that can significantly reduce the runtime of loops. This optimization depends on functional equivalence between the original and optimized code versions, a requirement typically established through the compiler's static analysis. When this condition is not met, the compiler will miss the optimization. Manually rewriting the source code to pass an already missed compiler optimization is time-consuming, given the multitude of potential code variations, and demands a high level of expertise, making it impractical in many scenarios. In this work, we propose a novel framework that takes the code blocks that the compiler failed to optimize and transforms them into other code blocks that pass the compiler optimization. We develop an algorithm to efficiently search for a code structure that automatically passes the compiler optimization (weakly verified through a correctness test). We focus on the loop-vectorize optimization inside OpenMP directives, where the introduction of parallelism adds complexity to the compiler's vectorization task and is shown to hinder optimizations. Furthermore, we introduce a modified version of TSVC, a loop vectorization benchmark, in which all original loops are executed within OpenMP directives. Our evaluation shows that our framework enables "loop-vectorize" optimizations that the compiler failed to pass, resulting in a speedup of up to 340× in the blocks optimized. Furthermore, applying our tool to HPC benchmark applications, which are already built with optimization and performance in mind, demonstrates that our technique successfully enables extended compiler optimization, thereby accelerating the optimized blocks in 15 loops and speeding up the overall execution of the three applications by up to 1.58 times.
/ Master of Science / Loop vectorization is a powerful technique for improving the performance of specific sections in computer programs known as loops. Particularly, it simultaneously executes instructions of different iterations in a loop, providing a considerable speedup on its runtime due to this parallelism. To apply this optimization, the code needs to meet certain conditions, which are usually checked by the compiler. However, sometimes the compiler cannot verify these conditions, and the optimization fails. Our research introduces a new approach to fix these issues automatically. Normally, fixing the code manually to meet these conditions is time-consuming and requires high expertise. To overcome this, we've developed a tool that can efficiently find ways to make the code satisfy the conditions needed for optimization. Our focus is on a specific type of code that uses OpenMP directives to split the loop on multiple processor cores and runs them simultaneously, where adding this parallelism makes the code more complex for the compiler to optimize. Our tests show that our approach successfully improves the speed of computer programs by enabling optimizations initially missed by the compiler. This results in significant speed improvements for specific parts of the code, sometimes up to 340 times faster. We've also applied our method to well-optimized computer programs, and it still managed to make them run up to 1.58 times faster. Source Code Transformation Loop-Vectorization Machine Learning
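The search described in this abstract can be pictured as a generate-and-test loop: try candidate rewrites of a loop, keep the first one that the compiler accepts for the optimization, and weakly verify it with a correctness test. The Python sketch below only illustrates that idea and is not the framework from the thesis. The candidate variants are invented, and `passes_optimization` is a hypothetical placeholder for asking a real compiler whether it vectorized the loop.

```python
import random

def original(xs):
    # Reference loop whose behaviour every rewrite must preserve.
    out = []
    for x in xs:
        out.append(x * 2 + 1)
    return out

def variant_a(xs):
    # Invented candidate rewrite that is NOT equivalent (drops the +1).
    return [x * 2 for x in xs]

def variant_b(xs):
    # Invented candidate rewrite that is equivalent to the original.
    return [x * 2 + 1 for x in xs]

# Hypothetical data: pretend the compiler reported these variants as vectorized.
REPORTED_VECTORIZED = {variant_a, variant_b}

def passes_optimization(fn):
    # Placeholder for inspecting a real compiler's optimization report.
    return fn in REPORTED_VECTORIZED

def agrees_on_random_inputs(ref, cand, trials=100, seed=0):
    # Weak verification: compare outputs on random inputs (not a proof).
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-1000, 1000) for _ in range(rng.randint(1, 20))]
        if ref(xs) != cand(xs):
            return False
    return True

def search(ref, candidates):
    # Return the first candidate that is both optimizable and passes the test.
    for cand in candidates:
        if passes_optimization(cand) and agrees_on_random_inputs(ref, cand):
            return cand
    return None

chosen = search(original, [variant_a, variant_b])
print(chosen.__name__ if chosen else "no variant found")
```

The correctness test matters because a variant that the compiler can optimize is useless if it changes the result; here `variant_a` is rejected for exactly that reason.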
3	An Environment for Automatic Generation of Code Optimizers Paleri, Vineeth Kumar 07 1900 Code optimization or code transformation is a complex function of a compiler involving analyses and modifications with the entire program as its scope. In spite of its complexity, hardly any tools exist to support this function of the compiler. This thesis presents the development of a code transformation system, specifically for scalar transformations, which can be used either as a tool to assist the generation of code transformers or as an environment for experimentation with code transformations. The development of the code transformation system involves the formal specification of code transformations using dependence relations. We have written formal specifications for the whole class of traditional scalar transformations, including induction variable elimination, a complex transformation for which no formal specifications are available in the literature. All transformations considered in this thesis are global. Most of the specifications given here, for which specifications are already available in the literature, are improved versions in terms of conservativeness. The study of algorithms for code transformations, in the context of their formal specification, led us to the development of a new algorithm for partial redundancy elimination. The algorithm is based on the new concepts of safe partial availability and safe partial anticipability. Our algorithm is computationally and lifetime optimal. It works on flow graphs whose nodes are basic blocks, which makes it practical. Like existing algorithms, the new algorithm requires four unidirectional analyses, but it saves some preprocessing time. The main advantage of the algorithm is its conceptual simplicity.
The code transformation system provides an environment in which one can specify a transformation using dependence relations (in the specification language we have designed), generate code for a transformer from its specification, and experiment with the generated transformers on real-world programs. The system takes a program to be transformed, in C or FORTRAN, as input, translates it into intermediate code, interacts with the user to decide the transformation to be performed, computes the necessary dependence relations using the dependence analyzer, applies the specified transformer to the intermediate code, and converts the transformed intermediate code back to high-level code. The system is unique in providing a complete environment for the generation of code transformers and in allowing experimentation with them using real-world programs. Computer and Information Science Compiler optimization Automatic Generation Code transformation
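To make the partial redundancy elimination mentioned in this abstract concrete, here is a minimal before/after sketch in Python. It only illustrates the effect of the transformation on one small example. It is not the algorithm from the thesis, which works on flow graphs of basic blocks using safe partial availability and safe partial anticipability. The function names `before` and `after` and the temporary `t` are made up for this illustration.

```python
def before(a, b, cond):
    if cond:
        x = a + b   # a + b is computed on this path only
    else:
        x = 0
    y = a + b       # partially redundant: recomputed when cond is true
    return x, y

def after(a, b, cond):
    if cond:
        t = a + b   # the original computation, saved in a temporary
        x = t
    else:
        x = 0
        t = a + b   # inserted so that t is available on every path
    y = t           # the recomputation is replaced by a reuse of t
    return x, y

# Both versions return the same values; after() evaluates a + b once per call.
print(before(2, 3, True), after(2, 3, True))
print(before(2, 3, False), after(2, 3, False))
```

In `before`, the path through the true branch evaluates `a + b` twice. In `after`, every path evaluates it exactly once, and no path evaluates it more often than it did originally.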
4	Downgrading Java 5.0 Projects : An approach based on source-code transformations Steijger, Tamara January 2008 The introduction of Java 5.0 came along with an extension of the language syntax. Several new language features, such as generic types and enumeration types, were added to the language specification. These features cause downward incompatibilities: code written in Java 5.0 will not work on older versions of the Java runtime environment. For some active projects, however, it is not possible to upgrade to higher Java versions, since some code might not be supported on Java 5.0. If one still wants to use components written in Java 5.0, these must be downgraded. Up to now this has been accomplished mostly by transforming the byte code of these programs. In this thesis, we present a set of transformations which transform Java 5.0 source code to Java 1.4 compatible code. We successfully apply these transformations to two larger projects and compare our approach to the byte-code based tools that have been common up to now. Source-Code Transformation Java Metaprogram RECODER Computer science Datavetenskap
5	Downgrading Java 5.0 Projects : An approach based on source-code transformations Steijger, Tamara January 2008 The introduction of Java 5.0 came along with an extension of the language syntax. Several new language features, such as generic types and enumeration types, were added to the language specification. These features cause downward incompatibilities: code written in Java 5.0 will not work on older versions of the Java runtime environment. For some active projects, however, it is not possible to upgrade to higher Java versions, since some code might not be supported on Java 5.0. If one still wants to use components written in Java 5.0, these must be downgraded. Up to now this has been accomplished mostly by transforming the byte code of these programs. In this thesis, we present a set of transformations which transform Java 5.0 source code to Java 1.4 compatible code. We successfully apply these transformations to two larger projects and compare our approach to the byte-code based tools that have been common up to now. Source-Code Transformation Java Metaprogram RECODER Computer Sciences Datavetenskap (datalogi)
6	A Compiler Framework to Support and Exploit Heterogeneous Overlapping-ISA Multiprocessor Platforms Jelesnianski, Christopher Stanisław 15 December 2015 As the demand for ever more powerful machines continues, new architectures are sought as the next route past the brick wall that currently stagnates the performance growth of modern multi-core CPUs. Due to physical limitations, scaling single-core performance any further is no longer possible, which gave rise to modern multi-cores. However, the brick wall is now limiting the scaling of general-purpose multi-cores. Heterogeneous-core CPUs have the potential to continue scaling by reducing power consumption through exploitation of specialized and simple cores within the same chip. Heterogeneous-core CPUs join fundamentally different processors, each with its own peculiar features, e.g., fast execution time or improved power efficiency, enabling the building of versatile computing systems. To make heterogeneous platforms permeate the computer market, the next hurdle to overcome is the ability to provide a familiar programming model and environment such that developers do not have to focus on platform details. Nevertheless, heterogeneous platforms integrate processors with diverse characteristics and potentially a different Instruction Set Architecture (ISA), which exacerbates the complexity of the software. A brave few have begun to tread down the heterogeneous-ISA path, hoping to prove that this avenue will yield the next generation of supercomputers. However, many unforeseen obstacles have yet to be discovered. With this new challenge comes the clear need for efficient, developer-friendly, adaptable system software to support the efforts of making heterogeneous-ISA the gold standard for future high-performance and general-purpose computing.
To foster rapid development of this technology, it is imperative to put the proper tools into the hands of developers, such as application and architecture profiling engines, in order to realize the best heterogeneous-ISA platform possible with available technology. In addition, it would be best to make these tools as "timeless" as possible, exposing fundamental concepts that industry could benefit from and adopt in future designs. We demonstrate the feasibility of a compiler framework and runtime for an existing heterogeneous-ISA operating system (Popcorn Linux) for automatically scheduling compute blocks within an application on a given heterogeneous-ISA high-performance platform (in our case a platform built with Intel Xeon - Xeon Phi). With the introduced Profiler, Partitioner, and Runtime support, we show that we can automatically exploit the heterogeneity in an overlapping-ISA platform, being faster than native execution and other parallel programming models. Empirically evaluating our compiler framework, we show that application execution on Popcorn Linux can be up to 52% faster than the most performant native execution for Xeon or Xeon Phi. Using our compiler framework relieves the developer from manual scheduling and porting of applications, requiring only a single profiling run per application. / Master of Science Compilers Heterogeneous Architecture Performance Profiling Runtime Code Transformation System Software
7	Semi-automatic code-to-code transformer for Java : Transformation of library calls / Halvautomatisk kodöversättare för Java : Transformation av biblioteksanrop Boije, Niklas, Borg, Kristoffer January 2016 Having the ability to perform large automatic software changes in a code base gives new possibilities for software restructuring and cost savings. The possibility of replacing software libraries in a semi-automatic way has been studied. String metrics are used to find equivalents between two libraries by looking at class and method names. Rules based on the equivalents are then used to describe how to apply the transformation to the code base. Using the abstract syntax tree, locations for replacements are found and transformations are performed. After the transformations have been performed, the effort saved by doing the replacement automatically rather than manually is evaluated. The evaluation shows that a large part of the cost can be saved. An additional evaluation, calculating the maintenance cost saved annually by changing libraries, is also performed to support the claim that an exchange can reduce the annual cost for the project. code-to-code transformer semi-automatic transformation code transformation transformation of library calls rule transformation
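The pipeline outlined in this abstract (a string metric to pair up names, rules derived from the pairs, then rewriting through the abstract syntax tree) can be sketched as follows. The thesis targets Java. This sketch uses Python's standard `difflib` and `ast` modules as a stand-in and requires Python 3.9 or later for `ast.unparse`. The two API name lists, the 0.6 threshold, and all helper names are assumptions made up for the illustration, and the similarity ratio from `difflib.SequenceMatcher` stands in for whichever string metrics the authors used.

```python
import ast
import difflib

# Hypothetical method names of the old and the new library.
OLD_API = ["read_file", "write_file", "list_dir"]
NEW_API = ["readFile", "writeFile", "listDirectory", "removeFile"]

def normalize(name):
    # Ignore naming-convention differences before comparing.
    return name.replace("_", "").lower()

def best_match(name, candidates, threshold=0.6):
    # Score every candidate with a similarity ratio in [0, 1].
    scored = [
        (difflib.SequenceMatcher(None, normalize(name), normalize(c)).ratio(), c)
        for c in candidates
    ]
    score, match = max(scored)
    return match if score >= threshold else None

def build_rules(old_names, new_names):
    # A rule maps an old name to its most similar new name, if any.
    rules = {}
    for old in old_names:
        new = best_match(old, new_names)
        if new is not None:
            rules[old] = new
    return rules

class CallRenamer(ast.NodeTransformer):
    """Rewrites calls to plain function names that appear in the rules."""

    def __init__(self, rules):
        self.rules = rules

    def visit_Call(self, node):
        self.generic_visit(node)  # handle nested calls first
        if isinstance(node.func, ast.Name) and node.func.id in self.rules:
            new_name = ast.Name(id=self.rules[node.func.id], ctx=ast.Load())
            node.func = ast.copy_location(new_name, node.func)
        return node

def transform(source, rules):
    tree = CallRenamer(rules).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)

rules = build_rules(OLD_API, NEW_API)
print(rules)
print(transform("data = read_file(path)\nprint(list_dir(d))", rules))
```

Working on the syntax tree rather than on raw text means that only genuine call sites are rewritten; an occurrence of `read_file` inside a string literal or a comment would be left alone. A semi-automatic tool would presumably let a developer review the generated rules before they are applied.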
8	Preprocesor Java bytecode pro verifikační nástroje / Java Bytecode Preprocessor for Program Verification Tools Šafařík, Tomáš January 2016 Both the J2BP and PANDA tools verify compiled Java programs. Currently, these tools cannot correctly process some programs that contain specific JVM bytecode instruction sequences. We described these instruction sequences and proposed transformations for them. Based on these proposals, we developed a new application called BytecodeTransformer. This application transforms compiled Java programs and replaces the problematic instruction sequences with others. Using BytecodeTransformer enlarges the set of programs that can be verified by both J2BP and PANDA. We also evaluated BytecodeTransformer on several Java programs, including our own tests and well-known open-source programs. These tests demonstrated the correct functionality of BytecodeTransformer.
9	Un environnement parallèle de développement haut niveau pour les accélérateurs graphiques : mise en œuvre à l’aide d’OPENMP / A high-level parallel development framework for graphic accelerators : an implementation based on OPENMP Noaje, Gabriel 07 March 2013 Graphic cards (GPUs), initially used for graphic processing, have a highly parallel architecture. Innovations in both architecture and programming languages opened the new domain of GPGPU, where GPUs are used as accelerators for general-purpose HPC applications. Our main objective is to facilitate the use of these new architectures for high-performance computing needs; our research follows two main directions. The first direction concerns automatic code transformation from high-level code into equivalent low-level code capable of running on accelerators. To this end we implemented a code transformer that can handle the parallel “for” loops (single or nested) of an OpenMP code and convert them into equivalent CUDA code, which is in a human-readable form that allows for further optimizations. Moreover, the future of HPC lies in distributed architectures based on hybrid nodes. Specific programming schemes have to be used in order to allow users to benefit from such multiGPU nodes. We conducted a comparative study which revealed that using OpenMP threads is the most adequate way to control multiple graphic cards as well as manage communications efficiently within a multiGPU node. OpenMP CUDA Compiler Code transformation Manycores MultiGPU
10	SUPPORTING SOFTWARE EXPLORATION WITH A SYNTACTIC AWARE SOURCE CODE QUERY LANGUAGE Bartman, Brian M. 26 July 2017 No description available. Computer Science
