311 |
Polymage : Automatic Optimization for Image Processing Pipelines
Mullapudi, Ravi Teja January 2015
Image processing pipelines are ubiquitous. Every image captured by a camera and every image uploaded to social networks like Google+ or Facebook is processed by a pipeline. Applications in a wide range of domains like computational photography, computer vision and medical imaging use image processing pipelines. Many of these applications demand high performance, which requires effective utilization of modern architectures. Given the proliferation of camera-enabled devices and social networks, optimizing these emerging workloads has become important at both the data center and the embedded device scales.
An image processing pipeline can be viewed as a graph of interconnected stages which process images successively. Each stage typically performs one of point-wise, stencil, sampling, reduction or data-dependent operations on image pixels. Individual stages in a pipeline typically exhibit abundant data parallelism that can be exploited with relative ease. However, the stages also require high memory bandwidth, preventing effective utilization of the parallelism available on modern architectures. The traditional options are to use optimized libraries like OpenCV or to optimize manually. While using libraries precludes optimization across library routines, manual optimization accounting for both parallelism and locality is very tedious.
In this thesis, we present the design and implementation of PolyMage, a domain-specific language and compiler for image processing pipelines. The focus of the system is on automatically generating high-performance implementations of image processing pipelines expressed in a high-level declarative language. We achieve such automation with:
• tiling techniques to improve parallelism and locality by introducing redundant computation,
• a model-driven fusion heuristic which enables a trade-off between locality and redundant computations, and
• an autotuner which leverages the fusion heuristic to explore a small subset of pipeline implementations and find the best performing one.
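As a rough illustration of the first technique, tiling with redundant computation, the following Python sketch fuses a hypothetical two-stage 1-D blur pipeline. It is not PolyMage output: the stencil, the function names and the tile size are all assumptions made for this example. Each output tile recomputes the few intermediate values it needs at its borders (the halo), so tiles are independent and the intermediate stage is never stored in full.

```python
def blur3(src, i):
    """3-point box stencil at index i, with borders clamped (assumed stencil)."""
    n = len(src)
    return (src[max(i - 1, 0)] + src[i] + src[min(i + 1, n - 1)]) / 3.0


def pipeline_unfused(img):
    """Reference version: stage 1 is materialised as a full intermediate image."""
    stage1 = [blur3(img, i) for i in range(len(img))]
    return [blur3(stage1, i) for i in range(len(stage1))]


def pipeline_overlapped(img, tile=4):
    """Fused version: every output tile recomputes the stage-1 values it reads.

    Returns the output and the number of redundantly computed stage-1 values.
    """
    n = len(img)
    out = [0.0] * n
    recomputed = 0
    for start in range(0, n, tile):
        end = min(start + tile, n)
        # Stage 2 reads one extra stage-1 value on each side of the tile.
        lo, hi = max(start - 1, 0), min(end + 1, n)
        scratch = {i: blur3(img, i) for i in range(lo, hi)}  # tile-local buffer
        recomputed += (hi - lo) - (end - start)
        for i in range(start, end):
            left = scratch[max(i - 1, 0)]
            right = scratch[min(i + 1, n - 1)]
            out[i] = (left + scratch[i] + right) / 3.0
    return out, recomputed
```

Because tiles share no intermediate data, the outer loop could run in parallel, and the scratch buffer stays small enough to remain in cache. The price is the redundant work counted in `recomputed`, which is the trade-off a fusion heuristic has to weigh.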
Our optimization approach primarily relies on the transformation and code generation capabilities of the polyhedral compiler framework. To the best of our knowledge, this is the first model-driven compiler for image processing pipelines that performs complex fusion, tiling, and storage optimization fully automatically. We evaluate our framework on a modern multicore system using a set of seven benchmarks which vary widely in structure and complexity. Experimental results show that the performance of pipeline implementations generated by our approach is:
• up to 1.81× better than pipeline implementations manually tuned using Halide, a state-of-the-art language and compiler for image processing pipelines,
• on average 5.39× better than pipeline implementations automatically tuned using Halide and OpenTuner, and
• on average 3.3× better than naive pipeline implementations which only exploit parallelism without optimizing for locality.
We also demonstrate that the performance of PolyMage-generated code is better than or comparable to implementations using OpenCV, a state-of-the-art image processing and computer vision library.
|
312 |
Heuristisk profilbaserad optimering av instruktionscache i en online Just-In-Time kompilator / Heuristic Online Profile Based Instruction Cache Optimisation in a Just-In-Time Compiler
Eng, Stefan January 2004
This master’s thesis examines the possibility of heuristically optimising instruction cache performance in a Just-In-Time (JIT) compiler. Programs that do not fit inside the cache all at once may suffer from cache misses as a result of frequently executed code segments competing for the same cache lines. A new heuristic algorithm, LHCPA, was created to place frequently executed code segments so as to avoid cache conflicts between them, reducing the overall number of cache misses and the resulting performance bottlenecks. Set-associative caches are taken into consideration, not only direct-mapped caches. In Ahead-Of-Time (AOT) compilers, the problem of frequent cache misses is often avoided by using call graphs derived from profiling and more or less complex algorithms to estimate the performance of different placement approaches. This often results in heavy computation during compilation, which is not acceptable in a JIT compiler. A case study is presented on an Alpha processor and a JIT compiler developed at Ericsson. The results of the case study show that cache performance can be improved using this technique, but also that many other factors influence cache performance, such as whether the cache is set-associative and, in particular, the size of the cache.
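The conflict effect described in this abstract can be illustrated with a small cache model. The Python sketch below is not LHCPA; it only shows, under assumed cache parameters and made-up addresses, how two hot code segments that map to the same cache set evict each other, and how moving one segment or adding associativity removes the conflict.

```python
from collections import OrderedDict


def cache_set(addr, line_size, num_sets):
    """Index of the cache set that the line containing addr maps to."""
    return (addr // line_size) % num_sets


def count_misses(trace, line_size, num_sets, ways):
    """Count misses for a trace of fetch addresses in an LRU set-associative cache.

    ways=1 models a direct-mapped cache.
    """
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        line = addr // line_size
        lines = sets[line % num_sets]
        if line in lines:
            lines.move_to_end(line)        # mark as most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used line
            lines[line] = True
    return misses


# Hypothetical placement: two hot segments called alternately, 10 times each.
conflicting = [0x000, 0x080] * 10   # both map to set 0 of a 4-set, 32-byte-line cache
relocated = [0x000, 0x0A0] * 10     # second segment moved one line further on
```

With a direct-mapped cache of 4 sets and 32-byte lines, the `conflicting` trace misses on every fetch, while the `relocated` trace misses only on the two cold fetches. A 2-way cache of the same capacity (2 sets) also absorbs the conflicting placement, which is consistent with the abstract's remark that associativity influences the result.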
|
313 |
Towards a Unified Framework for Design of MEMS based VLSI Systems
Sukumar, Jairam January 2016
Current-day VLSI systems integrate an increasing percentage of components from multiple energy domains into the mainstream. Energy domains such as mechanical, optical and fluidic have become pervasive in VLSI systems, and such systems are being manufactured routinely. The framework required to design such an integrated system with diverse energy domains needs to evolve as a part of conventional VLSI design methodology. This is because the manufacturing and design of these integrated energy domains, although based on semiconductor processing, is still very ad hoc, with each device requiring its own dedicated design tools and process integration.
In this thesis, three different approaches in different energy domains have been proposed. They cover modelling & simulation, synthesis & compilation, and formal verification. Three different scenarios have been considered, and it is shown that these tasks can be co-performed along with conventional VLSI circuits and systems.
In the first approach, a micro-mechanical beam bending case is presented. A thermal heat flow causing the beam to bend through thermal stress is analyzed for change in capacitance under a single analysis and modelling framework. This involves a seamless analysis through the thermal, mechanical and electrical energy domains. The second part of the thesis explores synthesis and compilation paradigms. The concept of a Gyro-compiler, analogous to a memory compiler, is proposed, which primarily generates soft IP models for various gyro topologies.
The final part of this thesis showcases a working prototype of a formal verification framework for MEMS based hybrid systems. The MEMS verification domain today is largely limited to simulation-based verification. Many techniques have been proposed for formal verification of hybrid systems. Some of these methods have been extended to demonstrate how MEMS based hybrid systems can be formally verified through extensions of conventional formal verification methods. An adaptive cruise control (ACC) system with a gyro-based speed sensor has been analyzed and formally verified against various specifications of this system.
|
314 |
OMCCp : A MetaModelica Based Parser Generator Applied to Modelica
Lopez-Rojas, Edgar Alonso January 2011
The OpenModelica Compiler-Compiler parser generator (OMCCp) is an LALR(1) parser generator implemented in the MetaModelica language, with parsing tables generated by the tools Flex and GNU Bison. The code generated for the parser is in the MetaModelica 2.0 language, which is the OpenModelica compiler implementation language and an extension of the Modelica 3.2 language. OMCCp takes as input an LALR(1) grammar that specifies the Modelica language. The generated parser can be used inside the OpenModelica Compiler (OMC) as a replacement for the current parser, which is generated by the tool ANTLR from an LL(k) Modelica grammar. This report explains the design and implementation of this novel lexer and parser generator, called OMCCp. Modelica and its extension MetaModelica are both languages used in the OpenModelica environment. Modelica is an object-oriented, equation-based language for modeling and simulation.
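To make the idea of a table-driven LR parser concrete, here is a minimal Python sketch of the driver loop that consumes ACTION and GOTO tables. It does not reproduce OMCCp or Bison output: the tables are written by hand for a hypothetical toy grammar (`S -> ( S ) | x`), and all names are assumptions made for this example.

```python
# Hypothetical toy grammar (not Modelica):
#   rule 1: S -> ( S )
#   rule 2: S -> x
# Hand-written LR tables; a generator such as Bison would normally emit these.
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
GOTO = {(0, "S"): 1, (2, "S"): 4}
RULES = {1: ("S", 3), 2: ("S", 1)}  # rule number -> (left-hand side, right-hand side length)


def parse(tokens):
    """Return True if the token sequence is accepted by the toy grammar."""
    stack = [0]                       # stack of parser states
    stream = list(tokens) + ["$"]     # "$" marks the end of input
    pos = 0
    while True:
        entry = ACTION.get((stack[-1], stream[pos]))
        if entry is None:
            return False              # no table entry: syntax error
        kind, arg = entry
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, length = RULES[arg]
            del stack[-length:]       # pop one state per right-hand-side symbol
            stack.append(GOTO[(stack[-1], lhs)])
        else:
            return True               # accept
```

For example, `parse("((x))")` is accepted, while `parse("(x")` is rejected when the end marker arrives in a state with no matching table entry. The driver itself is grammar-independent; only the tables change from one language to another, which is what makes generating them with an external tool practical.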
|
315 |
A closer look and comparison of cross-platform development environment for smartphones
Andersson, Tobias, Johansson, Erik January 2014
A problem with the fast and wide production of different platforms for mobile devices is that you can’t code for one and deploy on all devices at the same time. This thesis focuses on cross-platform development environments for smartphones, mainly to see what options there are on the market. The report investigates how well a cross-compiler solution compares to hybrid cross-platform development. To do this, we took a closer look at their architectures and compared these with the results of different tests. All the tests were run on the same smartphone to ensure fairness between them, and all strive to be as equal as possible even though the languages differ from each other. The tested frameworks were PhoneGap, Qt, Unity3D and GameMaker. The tests covered performance, power consumption, the difficulty of accessing web browsers to perform HTML parsing and, lastly, whether the platforms can access native APIs such as the camera and accelerometer. These topics were compared across all the frameworks. We also compared the documentation found on each framework's website to determine which is the easiest to get started with.
|
316 |
Processus et outils qualifiables pour le développement de systèmes critiques certifiés en avionique basés sur la génération automatique de code / Processes and qualifiable tools for the development of safety-critical certified systems in avionics based on automated code generation
Bedin França, Ricardo 10 April 2012
Le développement des logiciels avioniques les plus critiques, comme les commandes de vol électriques, présentent plusieurs contraintes qui peuvent être quasiment contradictoires – par exemple, performance et sûreté – et toutes ces contraintes doivent être respectées simultanément. L'objective de cette thèse est d'étudier et de proposer des évolutions dans le cycle de développement des logiciels de commande de vol chez Airbus afin d'améliorer leur performance, tout en respectant les contraintes industrielles existantes et en conservant des processus de vérification au moins aussi sûrs que ceux utilisés actuellement. Le critère principal d'évaluation de performance est le temps d'exécution au pire cas (WCET), vu qu'il est utilisé lors des analyses temporelles des logiciels de vol réels. Dans un premier temps, le DO-178, qui contient des considérations pour l'approbation des logiciels avioniques, est présenté. Le DO-178B et le DO-178C sont étudiés. Le DO-178B est la référence pour plusieurs logiciels de commande de vol développés chez Airbus et le DO-178C est la référence pour le développement des nouveaux logiciels à partir de 2012. Ensuite, l'étude de cas est présentée. Afin d'améliorer sa compréhension, le contexte historique est fourni à travers l'étude des autres logiciels de commande de vol, car plusieurs activités de son cycle de vie réutilisent des techniques qui ont été utilisées avec succès dans des projets précédents. Quelques activités qui présentent des causes potentielles de pertes de performance logicielle sont exposées et l'axe principal d'étude choisi pour le reste de la thèse est la phase de compilation. Ce choix se justifie dans le contexte des logiciels de commande de vol car la compilation est réalisée avec peu ou pas d'optimisations, son impact sur la performance des logiciels est donc important et des travaux de recherche récents permettent d'envisager un changement dans les paradigmes actuels de compilation sûre. 
/ The development of safety-critical avionics software, such as aircraft flight control programs, presents many different constraints that are nearly contradictory, such as performance and safety requirements, and all must be met simultaneously. The objective of this thesis is to propose modifications to the development cycle of Airbus flight control programs in order to improve their performance without weakening their verification processes or violating other industrial constraints. The main criterion for performance evaluation is the Worst-Case Execution Time (WCET), as it is used in the timing analysis performed in actual avionics software verification processes. First, DO-178, which contains guidance for avionics software development approval, is presented. Both DO-178B and DO-178C are discussed, since the former was the reference for the development of many Airbus flight control programs and the latter is the reference for the development of new programs from 2012 onwards. Then, the case study is presented. To make it easier to understand, some historical context is provided through the study of other flight control programs, since many of its life cycle activities reuse techniques that were successful in previous software projects. Each activity is evaluated in order to identify the performance bottlenecks in flight control software development. Some potentially underperforming activities are described, and the main axis of study developed subsequently is the compilation phase: not only is it well known to be performed with little or no optimization, which has an important impact on software performance, but it is also an activity that might undergo a paradigm change thanks to innovative compilers being developed by researchers.
The CompCert compiler is presented and its use in the scope of this thesis is justified: at the time of this thesis, it was the compiler best prepared for meaningful experiments, such as compiling a large subset of the chosen case study. Its architecture is studied, together with its semantic preservation theorem, which is the backbone of its formally verified part. Additional features that were developed in CompCert during this thesis in order to meet Airbus's requirements, such as its annotation mechanism and its reference interpreter, are discussed in order to underline their usefulness in the development of flight control software. The evaluation of CompCert consists of a performance comparison with the current compilation strategy and an assessment of the impact that its use might have on the verification strategy commonly employed for flight control software. The results of the performance comparison are promising, since CompCert-generated code has a WCET more than 10% lower than code compiled with a good-quality non-optimizing compiler. As expected, the use of CompCert has an impact on some important verification activities, but its formal development and increased verifiability help in the development of new compiler verification activities that can keep the whole development process at least as safe as the current one. Some development strategy proposals are then presented, according to the certification credit that might be required by using CompCert.
|
317 |
Regulovaný syntaxí řízený překlad / Regulated Syntax-Directed Translation
Dvořák, Tomáš January 2019
This thesis deals with formal and syntax-directed translation. Its theoretical part defines regular, context-free, context-sensitive and recursively enumerable languages and grammars. Examples are given of grammars that can generate languages that are not context-free. The thesis covers matrix grammars, random context grammars and programmed grammars. It also examines finite, pushdown, deep and regular automata, transducers, and their role in formal syntax-directed translation. It further defines regular transducers based on regulated automata, and regulated methods of syntax analysis based on predictive parsers. These methods cover the analysis of the studied regulated grammars. The final part describes a new language capable of describing these grammars effectively, a compiler that produces parser code for grammars written in this language, and their graphical analyzer.
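A standard textbook illustration of a regulated grammar that goes beyond context-free power is a matrix grammar for the language a^n b^n c^n (n ≥ 1). The Python sketch below is an assumption-laden example, not taken from the thesis: the rule set and all names are chosen for illustration. Each matrix is a sequence of context-free rules that must be applied together, which keeps the three counts synchronised.

```python
def apply_matrix(form, matrix):
    """Apply every rule of the matrix in order to the sentential form.

    Each rule rewrites the leftmost occurrence of its left-hand side.
    Returns None if any rule of the matrix is not applicable.
    """
    form = list(form)
    for lhs, rhs in matrix:
        if lhs not in form:
            return None
        i = form.index(lhs)
        form[i:i + 1] = rhs
    return form


# Hypothetical matrix grammar for { a^n b^n c^n : n >= 1 }.
M_START = [("S", ["A", "B", "C"])]
M_GROW = [("A", ["a", "A"]), ("B", ["b", "B"]), ("C", ["c", "C"])]
M_STOP = [("A", ["a"]), ("B", ["b"]), ("C", ["c"])]


def derive(n):
    """Derive the word a^n b^n c^n (n >= 1) using n + 1 matrix applications."""
    form = apply_matrix(["S"], M_START)
    for _ in range(n - 1):
        form = apply_matrix(form, M_GROW)
    return "".join(apply_matrix(form, M_STOP))
```

Every individual rule is context-free; the extra power comes only from the requirement that the rules of a matrix fire as one unit. A plain context-free grammar with the same rules could grow the a, b and c blocks independently.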
|
318 |
Just-in-time kompilace závisle typovaného lambda kalkulu / Just-in-Time Compilation of Dependently-Typed Lambda Calculus
Zárybnický, Jakub January 2021
A number of programming languages have been able to increase their speed by replacing custom-built runtime systems with general-purpose platforms that use just-in-time compilation for optimization, such as GraalVM or RPython. In this thesis, I evaluate whether such platforms are also suitable for languages with dependent types or proof systems. The thesis introduces the concepts of the $\lambda$-calculus and type theory needed for an introduction to dependent types, together with the relevant algorithms, specifies a small dependently typed language based on the $\lambda\Pi$ calculus, and presents two interpreters of this language. Both interpreters are written in Kotlin: the first is simple and written in a functional style, while the second uses the GraalVM platform and Truffle. GraalVM is a platform based on the Java Virtual Machine (JVM) that adds a just-in-time compiler based on partial evaluation, and Truffle is a library for building programming languages that uses this compiler. The final part of the thesis evaluates the runtime characteristics of these interpreters on various benchmarks. The conclusions of the thesis are, however, strongly negative. The effect of JIT compilation is not noticeable despite efforts to optimize common type-theory algorithms, which are evidently ill-suited to the JVM platform. The thesis ends with proposals for several follow-up projects that would make better use of Truffle's capabilities and would be better suited to implementing dependently typed languages.
|
319 |
Intel Integrated Performance Primitives a jejich využití při vývoji aplikací / Intel Integrated Performance Primitives and their use in application development
Machač, Jiří January 2008
The aim of this work is to demonstrate and evaluate the contribution of SIMD computing units, especially Intel's MMX, SSE, SSE2, SSE3, SSSE3 and SSE4, by creating demonstration applications that use the Intel Integrated Performance Primitives library. First, the options for SIMD programming using intrinsic functions, vectorization and the Intel Integrated Performance Primitives library are presented; next, options for evaluating particular algorithms are described. Finally, the procedure for programming with the Intel Integrated Performance Primitives library is illustrated.
|
320 |
Online LaTeX editor / Online LaTeX Editor
Sokol, Miroslav January 2012
The aim of this diploma thesis is to create a LaTeX editor that can compete with existing solutions: to offer users the basic functionality typical of this type of editor and to add functions that make it a unique project. Development aimed at a transparent environment and trivially simple operation. Most functions are available with a single click. Predefined templates are displayed immediately, including previews, and can be downloaded with all source code. Some actions redraw the whole content of the page; in other cases, we used update panels or client-side JavaScript to redraw the content partially. Thanks to support for .zip archives, multiple files can be processed at the same time. The program is designed so that it can be developed further and its features extended.
|