151

Normalisation by evaluation in the compilation of typed functional programming languages

Lindley, Sam January 2005 (has links)
This thesis presents a critical analysis of normalisation by evaluation as a technique for speeding up compilation of typed functional programming languages. Our investigation focuses on the SML.NET compiler and its typed intermediate language MIL. We implement and measure the performance of normalisation by evaluation for MIL across a range of benchmarks. Taking a different approach, we also implement and measure the performance of a graph-based shrinking reductions algorithm for SML.NET. MIL is based on Moggi’s computational metalanguage. As a stepping stone to normalisation by evaluation, we investigate strong normalisation of the computational metalanguage by introducing an extension of Girard-Tait reducibility. Inspired by previous work on local state and parametric polymorphism, we define reducibility for continuations and more generally reducibility for frame stacks. First we prove strong normalisation for the computational metalanguage. Then we extend that proof to include features of MIL such as sums and exceptions. Taking an incremental approach, we construct a collection of increasingly sophisticated normalisation by evaluation algorithms, culminating in a range of normalisation algorithms for MIL. Congruence rules and alpha-rules are captured by a compositional parameterised semantics. Defunctionalisation is used to eliminate eta-rules. Normalisation by evaluation for the computational metalanguage is introduced using a monadic semantics. Variants in which the monadic effects are made explicit, using either state or control operators, are also considered. Previous implementations of normalisation by evaluation with sums have relied on continuation-passing style or control operators. We present a new algorithm which instead uses a single reference cell and a zipper structure. This suggests a possible alternative way of implementing Filinski’s monadic reflection operations. In order to obtain benchmark results without having to take into account all of the features of MIL, we implement two different techniques for eliding language constructs. The first is not semantics-preserving, but is effective for assessing the efficiency of normalisation by evaluation algorithms. The second is semantics-preserving, but less flexible. In common with many intermediate languages, but unlike the computational metalanguage, MIL requires all non-atomic values to be named. We use either control operators or state to ensure each non-atomic value is named. We assess our normalisation by evaluation algorithms by comparing them with a spectrum of progressively more optimised, rewriting-based normalisation algorithms. The SML.NET front-end is used to generate MIL code from ML programs, including the SML.NET compiler itself. Each algorithm is then applied to the generated MIL code. Normalisation by evaluation always performs faster than the most naïve algorithms, often by orders of magnitude. Some of the algorithms are slightly faster than normalisation by evaluation. Closer inspection reveals that these algorithms are in fact defunctionalised versions of normalisation by evaluation algorithms. Our normalisation by evaluation algorithms perform unrestricted inlining of functions. Unrestricted inlining can lead to a super-exponential blow-up in the size of target code with respect to the source. Furthermore, the worst-case complexity of compilation with unrestricted inlining is non-elementary in the size of the source code.
SML.NET alleviates both problems by using a restricted form of normalisation based on Appel and Jim’s shrinking reductions. The original algorithm is quadratic in the worst case. Using a graph-based representation for terms we implement a compositional linear algorithm. This speeds up the time taken to perform shrinking reductions by up to a factor of fourteen, which leads to an improvement of up to forty percent in total compile time.
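As background for readers unfamiliar with the technique, here is a minimal sketch of normalisation by evaluation in OCaml, for the plain untyped lambda calculus rather than MIL; the monadic, sum and exception structure the thesis handles is omitted, and all names here are ours, not the thesis's.

```ocaml
(* Minimal normalisation by evaluation, assuming de Bruijn indices for
   terms and de Bruijn levels for fresh variables during reification. *)

type term =
  | Var of int                 (* de Bruijn index *)
  | Lam of term
  | App of term * term

(* Semantic domain: functions become host-language closures; stuck terms
   (free variables, blocked applications) stay syntactic as "neutrals". *)
type value =
  | VLam of (value -> value)
  | VNeutral of neutral
and neutral =
  | NVar of int                (* de Bruijn level *)
  | NApp of neutral * value

let rec eval (env : value list) (t : term) : value =
  match t with
  | Var i -> List.nth env i
  | Lam body -> VLam (fun v -> eval (v :: env) body)
  | App (f, a) ->
      (match eval env f with
       | VLam g -> g (eval env a)                  (* beta-reduce *)
       | VNeutral n -> VNeutral (NApp (n, eval env a)))

(* Reify a semantic value back into a beta-normal term. *)
let rec reify (depth : int) (v : value) : term =
  match v with
  | VLam f -> Lam (reify (depth + 1) (f (VNeutral (NVar depth))))
  | VNeutral n -> reify_neutral depth n
and reify_neutral depth = function
  | NVar lvl -> Var (depth - lvl - 1)              (* level -> index *)
  | NApp (n, v) -> App (reify_neutral depth n, reify depth v)

let normalise (t : term) : term = reify 0 (eval [] t)

let () =
  (* (\x. x) (\x. \y. x) normalises to \x. \y. x *)
  let id = Lam (Var 0) in
  let k = Lam (Lam (Var 1)) in
  assert (normalise (App (id, k)) = k)
```

The point of the technique, visible even at this scale, is that no rewrite rules are applied to syntax: normalisation happens by evaluating into the host language's own functions and reading the result back.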
152

Suitability of Java for Solving Large Sparse Positive Definite Systems of Equations Using Direct Methods

Armstrong, Shea January 2004 (has links)
The purpose of the thesis is to determine whether Java, a programming language that evolved out of a research project by Sun Microsystems in 1990, is suitable for solving large sparse linear systems using direct methods. That is, can a Java implementation achieve performance comparable to Fortran, the language traditionally used for sparse matrix computation? Performance evaluation criteria include execution speed and memory requirements. A secondary criterion is ease of development. Many attractive features, unique to the Java programming language, make it desirable for use in sparse matrix computation and provide the motivation for the thesis. The 'write once, run anywhere' proposition, coupled with nearly-ubiquitous Java support, alleviates the need to re-write programs in the event of hardware change. Features such as garbage collection (automatic recycling of memory) and array-index bounds checking make Java programs more robust than those written in Fortran. Java has garnered a poor reputation as a high-performance computing platform, largely attributable to poor performance relative to Fortran in its early years. There is now a consensus among researchers that the Java language itself is not the problem, but rather its implementation. As such, improving compiler technology for numerical codes is critical to achieving high performance in numerical Java applications. Preliminary work involved converting SPARSPAK, a collection of Fortran 90 subroutines for solving large sparse systems of linear equations and least squares problems developed by Dr. Alan George, into Java (J-SPARSPAK). It is well known that the majority of the solution process is spent in the numeric factorization phase. Initial benchmarks showed Java performing, on average, 3.6 times slower than Fortran for this critical phase. We detail how we improved Java performance to within a factor of two of Fortran.
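For positive definite systems, the numeric factorization phase the abstract identifies as the bottleneck is a Cholesky factorization. As a rough sketch of that phase, assuming a toy dense matrix rather than SPARSPAK's sparse data structures and symbolic factorization, here is a minimal OCaml version (the thesis's own code is Fortran and Java):

```ocaml
(* Toy dense Cholesky factorisation A = L * transpose(L), column by column. *)
let cholesky (a : float array array) : float array array =
  let n = Array.length a in
  let l = Array.make_matrix n n 0.0 in
  for j = 0 to n - 1 do
    (* diagonal entry: l_jj = sqrt (a_jj - sum_k l_jk^2) *)
    let s = ref a.(j).(j) in
    for k = 0 to j - 1 do s := !s -. (l.(j).(k) *. l.(j).(k)) done;
    l.(j).(j) <- sqrt !s;
    (* entries below the diagonal in column j *)
    for i = j + 1 to n - 1 do
      let s = ref a.(i).(j) in
      for k = 0 to j - 1 do s := !s -. (l.(i).(k) *. l.(j).(k)) done;
      l.(i).(j) <- !s /. l.(j).(j)
    done
  done;
  l

let () =
  let a = [| [| 4.0; 2.0 |]; [| 2.0; 3.0 |] |] in
  let l = cholesky a in
  (* expect L = [[2; 0]; [1; sqrt 2]] *)
  assert (abs_float (l.(0).(0) -. 2.0) < 1e-9);
  assert (abs_float (l.(1).(0) -. 1.0) < 1e-9)
```

The inner dot-product loops are exactly where bounds checking and memory layout, the Java-versus-Fortran issues the thesis measures, dominate the cost.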
153

A New Look at Retargetable Compilers

Burke, Patrick William 12 1900 (has links)
Consumers demand new and innovative personal computing devices every two years, when their cellular phone service contracts are renewed. Yet a two-year development cycle for the concurrent development of both hardware and software is nearly impossible. As more components and features are added to the devices, maintaining this two-year cycle with current tools will become commensurately harder. This dissertation delves into the feasibility of simplifying the development of such systems by employing heterogeneous systems on a chip in conjunction with a retargetable compiler such as the hybrid computer retargetable compiler (Hy-C). An example of a simple architecture description of sufficient detail for use with a retargetable compiler like Hy-C is provided. As a software engineer with 30 years of experience, I have witnessed numerous system failures. A plethora of software development paradigms and tools have been employed to prevent software errors, but none have been completely successful. Much of the discussion centers on software development in the military contracting market, as that is my background. The dissertation reviews those tools, as well as some existing retargetable compilers, in an attempt to determine how those errors occurred and how a system like Hy-C could assist in reducing future software errors. In the end, a simple retargetable solution like Hy-C is shown to be powerful enough to provide a very capable product in a very fast-growing market.
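The dissertation provides an example architecture description for Hy-C; as a rough illustration of what such a description might contain, here is a hypothetical OCaml sketch. All field names, the `toy-dsp` core and its operation table are invented for illustration and are not taken from Hy-C.

```ocaml
(* Hypothetical shape of a machine description a retargetable compiler
   could consume; every name below is illustrative, not Hy-C's format. *)

type reg_class = { class_name : string; count : int; width_bits : int }

type operation = {
  mnemonic : string;           (* e.g. "add", "mul" *)
  operands : string list;      (* register classes read/written *)
  latency : int;               (* cycles, usable by the scheduler *)
}

type arch_desc = {
  arch_name : string;
  endian : [ `Big | `Little ];
  reg_classes : reg_class list;
  ops : operation list;
}

(* One core of a heterogeneous SoC described in this form. *)
let toy_core : arch_desc = {
  arch_name = "toy-dsp";
  endian = `Little;
  reg_classes = [ { class_name = "gpr"; count = 32; width_bits = 32 } ];
  ops = [
    { mnemonic = "add"; operands = [ "gpr"; "gpr"; "gpr" ]; latency = 1 };
    { mnemonic = "mul"; operands = [ "gpr"; "gpr"; "gpr" ]; latency = 3 };
  ];
}
```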
154

A programming language based on recurrence equations and polyhedral compilation for stream processing

Leben, Jakob 31 July 2019 (has links)
The work presented in this dissertation contributes to the field of programming language design and implementation for stream processing applications. There is a fast-expanding domain of stream processing applications which demand processing high-volume streams quickly and often in real time. Examples include analysis and synthesis of audio, video and other digital media, sensor array signals, real-time physical simulation etc. High performance is crucial in this domain. When choosing between available programming methods, the programmer often chooses one that maximizes performance while sacrificing ease of programming, code comprehension, maintainability and reusability. This work contributes towards improving the state of the art by jointly maximizing these aspects. High-volume streams are often most naturally represented as multi-dimensional arrays with one infinite dimension representing time. Algorithms working with such streams are typically defined mathematically using recurrence equations. A programming language is presented in this dissertation which enables an almost literal translation of such mathematical definitions to computer programs. The language also supports powerful facilities for abstraction and code reuse such as polymorphic and higher-order functions. Together, these features enable a more natural expression of algorithms and improve code modularity and reusability. A major contribution of this dissertation is the compilation of the proposed language in the polyhedral framework, specifically targeting general-purpose multi-core processors. This framework provides powerful means of analysis and transformations of computations on multi-dimensional arrays, which enables data-locality optimizations essential for high performance on general-purpose processors with deep memory hierarchies. The benefit of this framework for computations on finite arrays has been extensively explored. However, this dissertation presents essential extensions that enable the application of state-of-the-art optimizations in this framework on infinite arrays representing streams. / Graduate
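To make "recurrence equations over an infinite time dimension" concrete, here is a small OCaml sketch of one such definition, the exponential moving average y(t) = a * x(t) + (1 - a) * y(t - 1), written with the standard Seq module (OCaml 4.14 or later for `Seq.repeat` and `Seq.take`). The dissertation's language expresses such equations declaratively and compiles them polyhedrally; this sketch only demonstrates the semantics.

```ocaml
(* y(t) = a * x(t) + (1 - a) * y(t - 1), with y(-1) = 0, over an
   infinite input stream represented as a lazy sequence. *)
let ema (a : float) (x : float Seq.t) : float Seq.t =
  let rec go prev xs () =
    match xs () with
    | Seq.Nil -> Seq.Nil
    | Seq.Cons (xt, rest) ->
        let yt = (a *. xt) +. ((1.0 -. a) *. prev) in
        Seq.Cons (yt, go yt rest)
  in
  go 0.0 x

let () =
  (* infinite constant input x(t) = 1.0; y converges towards 1.0 *)
  let ones = Seq.repeat 1.0 in
  match List.of_seq (Seq.take 3 (ema 0.5 ones)) with
  | [ y0; y1; y2 ] -> assert (y0 = 0.5 && y1 = 0.75 && y2 = 0.875)
  | _ -> assert false
```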
155

ChipCflow: a tool for converting C code into a static dataflow architecture on reconfigurable hardware

Silva, Antonio Carlos Fernandes da 19 February 2015 (has links)
A growing search for alternative architectures and software has been noted in recent years. This search is driven by advances in hardware technology, and such advances must be complemented by innovations in design methodologies and in test and verification techniques in order to use the technology effectively. Alternative architectures and software generally explore the parallelism of applications, in contrast to the Von Neumann model. Among high-performance alternative architectures is the dataflow architecture. In this kind of architecture, the process of program execution is determined by data availability, so parallelism is intrinsic in these systems. The dataflow model has the advantage of expressing parallelism intrinsically, eliminating the need for the programmer to mark explicitly in the code the sections where parallelism should occur. Dataflow architectures have again become a highlighted research area due to hardware advances, in particular the advances in Reconfigurable Computing and Field Programmable Gate Arrays (FPGAs). The ChipCflow project is a tool for the execution of algorithms as dynamic dataflow graphs on FPGAs. This thesis describes the development of a code conversion tool that generates applications for a static dataflow architecture, and presents the ChipCflow project of which the conversion tool is part. The algorithm to be converted is specified in the C language and converted into a hardware description language, following the model proposed by the ChipCflow project. The results are a proof of concept of converting code in a high-level language into a dataflow architecture to be configured on an FPGA.
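As an illustration of the execution model described above, where firing is determined purely by data availability, here is a toy OCaml interpreter for a two-node static dataflow graph. The representation is invented for illustration; the actual ChipCflow tool emits a hardware description rather than interpreting the graph in software.

```ocaml
(* A node fires as soon as tokens are present on both of its inputs. *)
type node = {
  op : int -> int -> int;            (* two-input operator *)
  mutable in0 : int option;          (* token slots *)
  mutable in1 : int option;
  out : (int -> unit) list;          (* consumers of the result token *)
}

let try_fire (n : node) =
  match n.in0, n.in1 with
  | Some a, Some b ->
      n.in0 <- None;
      n.in1 <- None;                           (* consume input tokens *)
      let r = n.op a b in
      List.iter (fun send -> send r) n.out     (* emit the result token *)
  | _ -> ()                                    (* data not yet available *)

let () =
  (* graph for (x + y) * z *)
  let result = ref 0 in
  let mul = { op = ( * ); in0 = None; in1 = None;
              out = [ (fun r -> result := r) ] } in
  let add = { op = ( + ); in0 = None; in1 = None;
              out = [ (fun r -> mul.in0 <- Some r; try_fire mul) ] } in
  (* tokens arrive in data-driven order *)
  mul.in1 <- Some 5; try_fire mul;   (* z = 5: mul still waits for in0 *)
  add.in0 <- Some 2; try_fire add;   (* x = 2: add waits for y *)
  add.in1 <- Some 3; try_fire add;   (* y = 3: add fires, then mul fires *)
  assert (!result = 25)
```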
156

Design and implementation of a Pascal compiler for the RISC-LIE architecture

Traina, Antônio Fernando 13 September 1993 (has links)
This work presents a proposal for the implementation of a subset of the instructions and commands of ISO Standard Pascal applied to RISC architectures. The work was developed using the RISC-LIE architecture as our target [Vale91]; the RISC-LIE has been proposed and developed at IFQSC. Part of the code generation was defined and constructed using YACC, a compiler development tool, which defined the grammatical structure of the language; the remaining routines were developed using the C language. The code produced by the compiler uses the thirty machine instructions implemented in the RISC-LIE architecture simulator, and therefore generates code that can be interpreted by this simulator.
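As a sketch of the kind of lowering such a compiler performs after parsing, here is a hypothetical OCaml fragment that flattens a tiny expression AST into three-address instructions for a small RISC-style target. The mnemonics and the naive fresh-register allocation are invented for illustration and do not reflect the actual thirty-instruction RISC-LIE set.

```ocaml
type expr =
  | Const of int
  | Add of expr * expr
  | Mul of expr * expr

type instr =
  | Li of int * int            (* LI  rd, imm : load immediate *)
  | Add3 of int * int * int    (* ADD rd, rs, rt *)
  | Mul3 of int * int * int    (* MUL rd, rs, rt *)

(* Naive allocation: each subexpression result gets a fresh register. *)
let compile (e : expr) : instr list * int =
  let fresh = ref (-1) in
  let next () = incr fresh; !fresh in
  let rec go = function
    | Const n -> let r = next () in ([ Li (r, n) ], r)
    | Add (a, b) ->
        let ca, ra = go a in let cb, rb = go b in
        let r = next () in (ca @ cb @ [ Add3 (r, ra, rb) ], r)
    | Mul (a, b) ->
        let ca, ra = go a in let cb, rb = go b in
        let r = next () in (ca @ cb @ [ Mul3 (r, ra, rb) ], r)
  in
  go e

let () =
  (* (1 + 2) * 3 compiles to five instructions, result in r4 *)
  let code, r = compile (Mul (Add (Const 1, Const 2), Const 3)) in
  assert (r = 4 && List.length code = 5)
```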
157

On the semantics of exceptions for high level and low level languages

Tejiščák, Matúš January 2012 (has links)
The thesis deals with the correctness of a compiler for a simple language featuring exceptions. We present formal semantics: both a denotational semantics of a high-level language and an operational semantics of a low-level language for a simple stack machine. We study the method of stack unwinding and then, iteratively improving upon a naive solution, we present a different method that is structurally recursive and thus suitable for implementation in total dependently typed languages. Finally, we provide an implementation of the compiler in the dependently typed functional programming language Agda, along with a mechanically verifiable proof that the implementation adheres to the semantics.
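To make the stack-unwinding method concrete, here is a small OCaml model (the thesis itself works in Agda) of a compiler in the style the abstract describes: expressions with throw/catch are compiled to a stack machine whose MARK/UNMARK instructions delimit handler scopes, and the machine agrees with a reference semantics on the examples below. The instruction names are conventional in this literature, not taken from the thesis; this sketch uses general recursion, whereas the thesis's point is a structurally recursive formulation suitable for a total language.

```ocaml
type expr =
  | Val of int
  | Add of expr * expr
  | Throw
  | Catch of expr * expr       (* Catch (body, handler) *)

(* Reference semantics: None models an uncaught exception. *)
let rec eval (e : expr) : int option =
  match e with
  | Val n -> Some n
  | Add (a, b) ->
      (match eval a, eval b with
       | Some x, Some y -> Some (x + y)
       | _ -> None)
  | Throw -> None
  | Catch (body, h) ->
      (match eval body with Some v -> Some v | None -> eval h)

type instr = PUSH of int | ADD | THROW | MARK of instr list | UNMARK

(* Compile with an explicit code continuation c, so the code stored by
   MARK already includes what to run after the handler finishes. *)
let rec compile (e : expr) (c : instr list) : instr list =
  match e with
  | Val n -> PUSH n :: c
  | Add (a, b) -> compile a (compile b (ADD :: c))
  | Throw -> [ THROW ]                         (* rest is unreachable *)
  | Catch (body, h) -> MARK (compile h c) :: compile body (UNMARK :: c)

type item = VAL of int | HAN of instr list    (* values and handler marks *)

let rec exec (code : instr list) (st : item list) : int option =
  match code, st with
  | [], [ VAL v ] -> Some v
  | PUSH n :: c, st -> exec c (VAL n :: st)
  | ADD :: c, VAL y :: VAL x :: st -> exec c (VAL (x + y) :: st)
  | MARK h :: c, st -> exec c (HAN h :: st)
  | UNMARK :: c, VAL v :: HAN _ :: st -> exec c (VAL v :: st)
  | THROW :: _, st -> unwind st
  | _ -> None
and unwind (st : item list) : int option =
  match st with
  | [] -> None                                 (* no handler: fail *)
  | VAL _ :: st -> unwind st                   (* discard partial results *)
  | HAN h :: st -> exec h st                   (* run handler + its rest *)

let () =
  let e1 = Catch (Add (Val 1, Throw), Val 42) in
  let e2 = Add (Catch (Throw, Val 1), Val 2) in
  assert (eval e1 = Some 42 && exec (compile e1 []) [] = Some 42);
  assert (eval e2 = Some 3 && exec (compile e2 []) [] = Some 3)
```

Note that `unwind` walks the stack at run time; the naive-versus-structural distinction the thesis studies is precisely about how this unwinding is expressed and verified.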
158

Software Techniques For Dependable Execution

January 2018 (has links)
Advances in semiconductor technology have brought computer-based systems into virtually all aspects of human life. This unprecedented integration of semiconductor-based systems in our lives has significantly increased the domain and the number of safety-critical applications, i.e. applications with unacceptable consequences of failure. Software-level error resilience schemes are attractive because they can provide commercial off-the-shelf microprocessors with adaptive and scalable reliability. Among all software-level error resilience solutions, in-application instruction-replication-based approaches have been widely used and are deemed to be the most effective. However, existing instruction-based replication schemes protect only part of the computation, i.e. arithmetic and logical instructions, and leave the rest unprotected. To improve the efficacy of instruction-level redundancy-based approaches, we developed several error detection and error correction schemes. nZDC (near Zero silent Data Corruption) is an instruction duplication scheme which protects the execution of the whole application. Rather than detecting errors on the register operands of memory and control flow operations, nZDC checks the results of such operations. nZDC ensures the correct execution of memory write instructions by reloading the stored value and checking it against the redundantly computed value. nZDC also introduces a novel control flow checking mechanism which replicates compare and branch instructions and detects both wrong-direction branches and unwanted jumps. Fault injection experiments show that nZDC can improve the error coverage of state-of-the-art schemes by more than 10x, without incurring any additional performance penalty. Furthermore, we introduced two error recovery solutions. InCheck is our backward recovery solution, which makes lightweight error-free checkpoints at basic block granularity. In the case of an error, InCheck reverts the program execution to the beginning of the last executed basic block and resumes execution with the aid of the preserved information. NEMESIS is our forward recovery scheme, which runs three versions of the computation and detects errors by checking the results of all memory write and branch operations. In the case of a mismatch, the NEMESIS diagnosis routine decides whether the error is recoverable; if so, the NEMESIS recovery routine reverts the effect of the error from the program state and resumes normal execution from the error detection point. / Doctoral Dissertation, Computer Engineering, 2018
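As a loose illustration of the nZDC check on memory writes described above, here is an OCaml sketch of the duplicate-and-compare idea: the value is computed twice, stored once, then reloaded and compared against the redundant copy. This transliterates the idea into source code; nZDC itself inserts the duplicate instruction stream and checks during compiler code generation.

```ocaml
exception Soft_error_detected

(* Sum a list into mem.(0), with a shadow recomputation and an
   nZDC-style check: reload the stored value and compare it against the
   redundantly computed one, so an error in the write itself is caught. *)
let sum_into (mem : int array) (xs : int list) : unit =
  let master = List.fold_left ( + ) 0 xs in   (* original computation *)
  let shadow = List.fold_left ( + ) 0 xs in   (* redundant duplicate *)
  mem.(0) <- master;                          (* the memory write *)
  if mem.(0) <> shadow then raise Soft_error_detected

let () =
  let mem = Array.make 4 0 in
  sum_into mem [ 1; 2; 3 ];
  assert (mem.(0) = 6)
```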
159

A Compiler Target Model for Line Associative Registers

Eberhart, Paul S. 01 January 2019 (has links)
LARs (Line Associative Registers) are very wide tagged registers, used both for register-wide SWAR (SIMD Within A Register) operations and for scalar operations on arbitrary fields. LARs include a large data field, type tags, source addresses, and a dirty bit, which allow them not only to replace both caches and registers in the conventional memory hierarchy, but to improve on both of their functions. This thesis details a LAR-based architecture, and describes the design of a compiler which can generate code for a LAR-based design. In particular, type conversion, alignment, and register allocation are discussed in detail.
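Based only on the fields the abstract lists (wide data field, type tags, source address, dirty bit), the state of one LAR might be modelled as the OCaml record below; the concrete line width and the type-tag vocabulary are invented for illustration and are not taken from the thesis.

```ocaml
type lar_type = I8 | I16 | I32 | F32 | F64   (* illustrative tag set *)

type lar = {
  data : Bytes.t;              (* the wide data line, e.g. 256 bytes *)
  elem_type : lar_type;        (* tag saying how fields are interpreted *)
  src_addr : int option;       (* memory line this LAR was loaded from *)
  mutable dirty : bool;        (* must be written back before reuse *)
}

let make_lar ?(line_bytes = 256) ty addr = {
  data = Bytes.make line_bytes '\000';
  elem_type = ty;
  src_addr = addr;
  dirty = false;
}

(* Writing any field marks the line dirty, as in a write-back cache. *)
let write_i8 (l : lar) (off : int) (v : int) =
  Bytes.set l.data off (Char.chr (v land 0xff));
  l.dirty <- true

let () =
  let l = make_lar I8 (Some 0x1000) in
  write_i8 l 3 42;
  assert (l.dirty && Char.code (Bytes.get l.data 3) = 42)
```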
160

Scratch-pad memory management for static data aggregates

Li, Lian, Computer Science & Engineering, Faculty of Engineering, UNSW January 2007 (has links)
Scratch-pad memory (SPM), a fast on-chip SRAM managed by software, is widely used in embedded systems. Compared to hardware-managed cache, SPM can be more efficient in performance, power and area cost, and has the added advantage of better time predictability. In this thesis, SPMs should be seen in a general context. For example, in stream processors, a software-managed stream register file is usually used to stage data to and from off-chip memory. In IBM's Cell architecture, each co-processor has a software-managed local store for keeping data and instructions. SPM management is critical for SPM-based embedded systems. In this thesis, we propose two novel methodologies, the memory colouring methodology and the perfect colouring methodology, to place the static data aggregates such as arrays and structs of a program in SPM. Our methodologies are dynamic in the sense that some data aggregates can be swapped into and out of SPM during program execution. To this end, a live range splitting heuristic is introduced in order to create potential data transfer statements between SPM and off-chip memory. The memory colouring methodology is a general-purpose compiler approach. The novelty of this approach lies in partitioning an SPM into a pseudo register file then generalising existing graph colouring algorithms for register allocation to colour data aggregates. In this thesis, a scheme for partitioning an SPM into a pseudo register file is introduced. This methodology is inter-procedural and therefore operates on the interference graph for the data aggregates in the whole program. Different graph colouring algorithms may give rise to different results due to live range splitting and spilling heuristics used. As a result, two representative graph colouring algorithms, George and Appel's iterative-coalescing and Park and Moon's optimistic-coalescing, are generalised and evaluated for SPM allocation. Like memory colouring, perfect colouring is also inter-procedural. The novelty of this second methodology lies in formulating the SPM allocation problem as an interval colouring problem. The interval colouring problem is an NP problem and no widely-accepted approximation algorithms exist. The key observation is that the interference graphs for data aggregates in many embedded applications form a special class of superperfect graphs. This has led to the development of two additional SPM allocation algorithms. While differing in whether live range splits and spills are done sequentially or together, both algorithms place data aggregates in SPM based on the cliques in an interference graph. In both cases, we guarantee optimally that all data aggregates in an interference graph can be placed in SPM if the given SPM size is no smaller than the chromatic number of the graph. We have developed two memory colouring algorithms and two perfect colouring algorithms for SPM allocation. We have evaluated them using a set of embedded applications. Our results show that both methodologies are efficient and effective in handling large-scale embedded applications. While neither methodology outperforms the other consistently, perfect colouring has yielded better overall results in the set of benchmarks used in our experiments. All these algorithms are expected to be valuable. For example, they can be made available as part of the same compiler framework to assist the embedded designer with exploring a large number of optimisation opportunities for a particular embedded application.
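As a simplified illustration of colouring an interference graph of data aggregates with a fixed set of SPM "registers", here is a greedy OCaml sketch; the thesis's actual algorithms (generalised iterative and optimistic coalescing, and interval colouring over superperfect graphs) are substantially more sophisticated, and the greedy heuristic below is ours, not the thesis's.

```ocaml
module IntSet = Set.Make (Int)

(* Nodes 0..n-1 are data aggregates; edges are interference pairs; the
   colours 0..num_colours-1 stand for slots of the pseudo register file.
   None in the result means the aggregate "spills" to off-chip memory. *)
let colour (n : int) (edges : (int * int) list) (num_colours : int)
    : int option array =
  let adj = Array.make n IntSet.empty in
  List.iter
    (fun (a, b) ->
      adj.(a) <- IntSet.add b adj.(a);
      adj.(b) <- IntSet.add a adj.(b))
    edges;
  let colouring = Array.make n None in
  for v = 0 to n - 1 do
    (* colours already taken by coloured neighbours of v *)
    let used =
      IntSet.fold
        (fun u acc ->
          match colouring.(u) with Some c -> IntSet.add c acc | None -> acc)
        adj.(v) IntSet.empty
    in
    let rec pick c =
      if c >= num_colours then None
      else if IntSet.mem c used then pick (c + 1)
      else Some c
    in
    colouring.(v) <- pick 0
  done;
  colouring

let () =
  (* a triangle needs three colours, so with two one aggregate spills *)
  let c = colour 3 [ (0, 1); (1, 2); (0, 2) ] 2 in
  assert (c.(0) = Some 0 && c.(1) = Some 1 && c.(2) = None)
```

The thesis's guarantee quoted above, that everything fits when the SPM size is at least the chromatic number, is exactly what a greedy heuristic like this cannot promise; that gap is what the superperfect-graph observation and the interval-colouring formulation buy.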
