• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 194
  • 33
  • 31
  • 16
  • 11
  • 10
  • 6
  • 6
  • 5
  • 3
  • 3
  • 2
  • 2
  • 1
  • 1
  • Tagged with
  • 366
  • 132
  • 80
  • 72
  • 50
  • 45
  • 42
  • 40
  • 39
  • 36
  • 34
  • 34
  • 33
  • 31
  • 30
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

Applied logic : its use and implementation as a programming tool

Warren, David H. D. January 1978 (has links)
The first Part of the thesis explains from first principles the concept of "logic programming" and its practical application in the programming language Prolog. Prolog is a simple but powerful language which encourages rapid, error-free programming and clear, readable, concise programs. The basic computational mechanism is a pattern matching process ("unification") operating on general record structures ("terms" of logic). IThe ideas are illustrated by describing in detail one sizable Prolog program which implements a simple compiler. The advantages and practicability of using Prolog for "real" compiler implementation are discussed. The second Part of the thesis describes techniques for implementing Prolog efficiently. In particular it is shown how to compile the patterns involved in the matching process into instructions of a low-level language. This idea has actually been implemented in a compiler (written in Prolog) from Prolog to DECsystem-10 assembly language. However the principles involved are explained more abstractly in terms of a "Prolog Machine". The code generated is comparable in speed with that produced by existing DEC10 Lisp compilers. Comparison is possible since pure Lisp can be viewed as a (rather restricted) subset of Prolog. It is argued that structured data objects, such as lists and trees, can be manipulated by pattern matching using a "structure 'sharing" representation as efficiently as by conventional selector and constructor functions operating on linked records in "heap" storage. Moreover the pattern matching formulation actually helps the implementor to produce a better implementation.
72

Modelica-Compiler Methoden für zeitkritische Simulationen

Waurich, Volker 01 December 2021 (has links)
Die Modellierung und Simulation dynamischer Vorgänge ist ein etabliertes Werkzeug bei der Entwicklung und Untersuchung technisch-physikalischer Systeme. Gleichungsbasierte und objektorientierte Modellbeschreibungssprachen wie Modelica ermöglichen eine effiziente Modellierung bei gleichzeitig guter Wiederverwendbarkeit und Übersichtlichkeit. Der Modelica-Compiler übersetzt das Modell in eine mathematisch lösbare Form und erzeugt ein ausführbares Simulationsprogramm. Mithilfe symbolischer Verfahren kann der Berechnungsaufwand und die Zeit zur Auswertung der Modellgleichungen positiv beeinflusst werden. Die vorliegende Arbeit identifiziert Potenziale zur automatisierten Laufzeitverbesserung von Modelica-Modellen und untersucht geeignete Verfahren, welche in den open source Compiler OpenModelica implementiert werden. Die vorgeschlagenen Methoden basieren auf einer graphentheoretischen Repräsentation des Gleichungssystems. Graphenalgorithmen werden zur symbolischen Manipulation des Gleichungssystem benutzt, um die Ausführungszeit zu reduzieren und ein nicht-deterministisches Laufzeitverhalten zu vermeiden. Es werden Methoden aufgezeigt, welche automatisiert und generalisiert ohne manuelles Eingreifen in den Übersetzungsprozess zu einer beschleunigten Modellauswertung führen. Die vorliegende Arbeit leistet somit einen Beitrag für die Echtzeitfähigkeit von Modellen mit steigender Komplexität, effizienter Modellerstellung und leistungsfähigen, automatisierten Simulationsläufen, wie bspw. Parameterstudien oder virtuellen Testläufen. / Modelling and simulation of dynamic processes is a well-established tool for the development of technical and physical systems. Equation-based and object-oriented modelling languages such as Modelica enable an efficient way of modelling as well as good reusability and structuring of the model. The Modelica-Compiler translates the model to a solvable representation and compiles an executable simulation program. With the help of symbolic compiler methods, the calculation costs and executation time can be reduced. The present paper identifies potentials for the automated improvement of execution times of Modelica-based simulation models and investigates suitable methods which have been implemented in the OpenModelica Compiler. The proposed compiler methods are based on a graph representation of the underlying equation system. Graph algorithms are used for the symbolic manipulation of the system of equations to reduce execuation time and non-deterministic runtime behaviour. All presented methods can be applied automatically without addtional manual modifications during the compilation process. Hence, the present paper is a contribution to achieve faster simulation times of models with increasing complexity. This is an eminent requirement for a productive and highly automated virtual development process.
73

Methodology of Construction Compiler Front-End and Its Integration into the GNU Compiler Collection / Methodology of Construction Compiler Front-End and Its Integration into the GNU Compiler Collection

Machata, Petr Unknown Date (has links)
Tato diplomová práce vznikla za podpory ANF DATA s.r.o., Brno. Diplomová práce je vypracována v angličtině. Vstupní bariéra pro vývoj uvnitř GCC se během posledních let znatelně snížila. Na konferencích, v časopisech a na webu se objevují články s architektonickými přehledy a návody. Věci se nadále zjednodušují použitím oficiálního vnitřního jazyka GENERIC: Pro komunikaci mezi přední částí a zbytkem překladače již není nutné zabývat se obtížným a nepřehledným RTL. Přesto je práce se souborem zdrojových kódů velikosti GCC nutně složitá. Je třeba napsat určité soubory a provést určitá nastavení, oboje jen s poměrně malým množstvím dokumentace. Cílem této práce je pomoci s posledním zmíněným bodem. Práce popisuje ukázkovou přední část: Vše od vytvoření zdrojových souborů, přes různé konstrukce jazyka GENERIC, až k problémům s kompilací běhové podpory nebo používání nativního preprocesoru.
74

Automatic Discovery and Exposition of Parallelism in Serial Applications for Compiler-Inserted Runtime Adaptation

Greenland, David A. 25 May 2012 (has links) (PDF)
Compiler-Inserted Runtime Adaptation (CIRA) is a compilation and runtime adaptation strategy which has great potential for increasing performance in multicore systems. In this strategy, the compiler inserts directives into the application which will adapt the application at runtime. Its ability to overcome the obstacles of architectural and environmental diversity coupled with its flexibility to work with many programming languages and styles of applications make it a very powerful tool. However, it is not complete. In fact, there are many pieces still needed to accomplish these lofty goals. This work describes the automatic discovery of parallelism inherent in an application and the generation of an intermediate representation to expose that parallelism. This work shows on six benchmark applications that a significant amount of parallelism which was not initially apparent can be automatically discovered. This work also shows that the parallelism can then be exposed in a representation which is also automatically generated. This is accomplished by a series of analysis and transformation passes with only minimal programmer-inserted directives. This series of passes forms a necessary part of the CIRA toolchain called the concurrency compiler. This concurrency compiler proves that a representation with exposed parallelism and locality can be generated by a compiler. It also lays the groundwork for future, more powerful concurrency compilers. This work also describes the extension of the intermediate representation to support hierarchy, a prerequisite characteristic to the creation of the concurrency compiler. This extension makes it capable of representing many more applications in a much more effective way. This extension to support hierarchy allows much more of the parallelism discovered by the concurrency compiler to be stored in the representation.
75

Compiler Testing of C11 Atomics for Arm and RISC-V

Adolfsson, Hampus January 2022 (has links)
The C11 standard introduced atomic types and operations, with an accompanying memory model, to enable the use of shared variables in concurrent programs. In this thesis, I demonstrate how compilers can be tested, in a way that is deterministic and covers the entire set of atomic operations, to ensure they correctly implement C11 atomics and the C11 memory model.  I use a large set of short concurrent programs (”litmus tests”), generated from a model written in a specification language and based on a formalized C11 memory model. Each test program is compiled and run with a model checker, to determine the possible outcomes; any program with an outcome that is possible after compilation but not allowed by C11 is a failed test case. As an alternative to model checking, I also test a nondeterministic, hardware-based method for running tests, but I find that this method is too inaccurate to be useful.  I test IAR and gcc compilers for Arm and RISC-V; all of these compilers pass all tests. Out of three compilers with purposefully inserted bugs, all are correctly identified as faulty. This testing process thus shows some promise, but further evaluation is needed.
76

A development environment and static analyses for GUARDOL - a language for the specification of high assurance guards

Dodds, Josiah January 1900 (has links)
Master of Science / Department of Computing and Information Sciences / John M. Hatcliff / There are a number of network situations where different networks have different security policies and still need to share information. While it is important to allow some data to flow between the two networks, it is just as important that they don't share any data that violates the respective security policies of the networks. Constraints on data sharing are often phrased in terms of classification levels of data (e.g. top secret, secret, public). They might also be stated in terms of the contents of the data (e.g. are there military base names, is the location correct). The software and hardware that works to solve these problems is called Cross Domain Solutions (CDS). There are a variety of hardware platforms capable of implementing CDS. These platforms are all configured in different ways and they are often proprietary. Not only are there a number of platforms on the market, many are difficult to understand, verify, or even specify. The Guardol project provides an open, non-proprietary, and domain-specific language for specifying CDS security policies and implementing CDS. Guardol is designed to be easy to understand and verify. This thesis describes the design and implementation of primary Guardol components. It includes a description of the Eclipse GUI plug-ins that have been developed for the project as well as a description of new formal analyses and translations that have been developed for the language. The translation is used to plug into external tools for model checking and the analyses help to make the translation clean and efficient. The analyses are also useful tools to help make the use of Guardol easier for developers.
77

Efficient Execution Paradigms for Parallel Heterogeneous Architectures

Koukos, Konstantinos January 2016 (has links)
This thesis proposes novel, efficient execution-paradigms for parallel heterogeneous architectures. The end of Dennard scaling is threatening the effectiveness of DVFS in future nodes; therefore, new execution paradigms are required to exploit the non-linear relationship between performance and energy efficiency of memory-bound application-regions. To attack this problem, we propose the decoupled access-execute (DAE) paradigm. DAE transforms regions of interest (at program-level) in two coarse-grain phases: the access-phase and the execute-phase, which we can independently DVFS. The access-phase is intended to prefetch the data in the cache, and is therefore expected to be predominantly memory-bound, while the execute-phase runs immediately after the access-phase (that has warmed-up the cache) and is therefore expected to be compute-bound. DAE, achieves good energy savings (on average 25% lower EDP) without performance degradation, as opposed to other DVFS techniques. Furthermore, DAE increases the memory level parallelism (MLP) of memory-bound regions, which results in performance improvements of memory-bound applications. To automatically transform application-regions to DAE, we propose compiler techniques to automatically generate and incorporate the access-phase(s) in the application. Our work targets affine, non-affine, and even complex, general-purpose codes. Furthermore, we explore the benefits of software multi-versioning to optimize DAE in dynamic environments, and handle codes with statically unknown access-phase overheads. In general, applications automatically-transformed to DAE by our compiler, maintain (or even exceed in some cases) the good performance and energy efficiency of manually-optimized DAE codes. Finally, to ease the programming environment of heterogeneous systems (with integrated GPUs), we propose a novel system-architecture that provides unified virtual memory with low overhead. The underlying insight behind our work is that existing data-parallel programming models are a good fit for relaxed memory consistency models (e.g., the heterogeneous race-free model). This allows us to simplify the coherency protocol between the CPU – GPU, as well as the GPU memory management unit. On average, we achieve 45% speedup and 45% lower EDP over the corresponding SC implementation.
78

Optimalizace rozsáhlých aplikací / Optimizing large applications

Liška, Martin January 2013 (has links)
Both uppermost open source compilers, GCC and LLVM, are mature enough to link-time optimize large applications. In case of large applications, we must take into account, except standard speed efficiency and memory consumption, different aspects. We focus on size of the code, cold start-up time, etc. Developers of applications often come up with ad-hoc solutions such as Elfhack utility, start-up of an application via a pre-loading utility and dlopen; prelinking and variety of different tools that reorder functions to fit the order of execution. The goal of the thesis is to analyse all existing techniques of optimization, evaluate their efficiency and design new solutions based on the link-time optimization platform. Powered by TCPDF (www.tcpdf.org)
79

RISC-V Compiler Performance:A Comparison between GCC and LLVM/clang

Bjäreholt, Johan January 2017 (has links)
RISC-V is a new open-source instruction set architecture (ISA) that in De-cember 2016 manufactured its rst mass-produced processors. It focuses onboth eciency and performance and diers from other open-source architec-tures by not having a copyleft license permitting vendors to freely design,manufacture and sell RISC-V chips without any fees nor having to sharetheir modications on the reference implementations of the architecture.The goal of this thesis is to evaluate the performance of the GCC andLLVM/clang compilers support for the RISC-V target and their ability tooptimize for the architecture. The performance will be evaluated from ex-ecuting the CoreMark and Dhrystone benchmarks are both popular indus-try standard programs for evaluating performance on embedded processors.They will be run on both the GCC and LLVM/clang compilers on dierentoptimization levels and compared in performance per clock to the ARM archi-tecture which is mature yet rather similar to RISC-V. The compiler supportfor the RISC-V target is still in development and the focus of this thesis willbe the current performance dierences between the GCC and LLVM com-pilers on this architecture. The platform we will execute the benchmarks onwil be the Freedom E310 processor on the SiFive HiFive1 board for RISC-Vand a ARM Cortex-M4 processor by Freescale on the Teensy 3.6 board. TheFreedom E310 is almost identical to the reference Berkeley Rocket RISC-Vdesign and the ARM Coretex-M4 processor has a similar clock speed and isaimed at a similar target audience.The results presented that the -O2 and -O3 optimization levels on GCCfor RISC-V performed very well in comparison to our ARM reference. Onthe lower -O1 optimization level and -O0 which is no optimizations and -Oswhich is -O0 with optimizations for generating a smaller executable code sizeGCC performs much worse than ARM at 46% of the performance at -O1,8.2% at -Os and 9.3% at -O0 on the CoreMark benchmark with similar resultsin Dhrystone except on -O1 where it performed as well as ARM. When turn-ing o optimizations (-O0) GCC for RISC-V was 9.2% of the performanceon ARM in CoreMark and 11% in Dhrystone which was unexpected andneeds further investigation. LLVM/clang on the other hand crashed whentrying to compile our CoreMark benchmark and on Dhrystone the optimiza-tion options made a very minor impact on performance making it 6.0% theperformance of GCC on -O3 and 5.6% of the performance of ARM on -O3, soeven with optimizations it was still slower than GCC without optimizations.In conclusion the performance of RISC-V with the GCC compiler onthe higher optimization levels performs very well considering how young theRISC-V architecture is. It does seems like there could be room for improvement on the lower optimization levels however which in turn could also pos-sibly increase the performance of the higher optimization levels. With theLLVM/clang compiler on the other hand a lot of work needs to be done tomake it competetive in both performance and stability with the GCC com-piler and other architectures. Why the -O0 optimization is so considerablyslower on RISC-V than on ARM was also very unexpected and needs furtherinvestigation.
80

An automated OpenCL FPGA compilation framework targeting a configurable, VLIW chip multiprocessor

Parker, Samuel J. January 2015 (has links)
Modern system-on-chips augment their baseline CPU with coprocessors and accelerators to increase overall computational capacity and power efficiency, and thus have evolved into heterogeneous systems. Several languages have been developed to enable this paradigm shift, including CUDA and OpenCL. This thesis discusses a unified compilation environment to enable heterogeneous system design through the use of OpenCL and a customised VLIW chip multiprocessor (CMP) architecture, known as the LE1. An LLVM compilation framework was researched and a prototype developed to enable the execution of OpenCL applications on the LE1 CPU. The framework fully automates the compilation flow and supports work-item coalescing to better utilise the CPU cores and alleviate the effects of thread divergence. This thesis discusses in detail both the software stack and target hardware architecture and evaluates the scalability of the proposed framework on a highly precise cycle-accurate simulator. This is achieved through the execution of 12 benchmarks across 240 different machine configurations, as well as further results utilising an incomplete development branch of the compiler. It is shown that the problems generally scale well with the LE1 architecture, up to eight cores, when the memory system becomes a serious bottleneck. Results demonstrate superlinear performance on certain benchmarks (x9 for the bitonic sort benchmark with 8 dual-issue cores) with further improvements from compiler optimisations (x14 for bitonic with the same configuration).

Page generated in 0.0454 seconds