51 |
Genetic Algorithm for Integrated Software Pipelining. Cai, Zesi January 2012 (has links)
The purpose of this thesis was to study the feasibility of using a genetic algorithm (GA) for integrated software pipelining (ISP). Unlike phased code generation, ISP integrates instruction selection, instruction scheduling, and register allocation when generating code. ISP provides a larger solution space than the phased approach, which means it has the potential to generate more highly optimized code; however, integrated compilation costs more than phased compilation. A GA is a stochastic beam-search algorithm that can accelerate the search and find an optimized result. An experiment was designed to verify the feasibility of implementing a GA for ISP (GASP). The implemented algorithm analyzed data dependency graphs of loop bodies, created genes for the graphs and evolved them, generated schedules, calculated and evaluated fitness, and produced optimized code. The fitness was calculated as the maximum of the smallest possible resource initiation interval and the smallest possible recurrence initiation interval. The experiment generated code from data dependency graphs provided in FFMPEG and compared the performance of GASP against integer linear programming (ILP). The results showed that, of the eleven cases for which ILP generated code, GASP performed close to ILP in seven. In all twelve cases for which ILP produced no result, GASP did generate optimized code. To conclude, the study indicated that a GA is feasible for ISP: the code generated by GASP performed similarly to the code from ILP, and for the dependency graphs that ILP could not solve within a limited time, GASP could still generate optimized results.
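As a concrete illustration of the fitness calculation described above, the following sketch computes the lower bound on the initiation interval as the maximum of a resource-constrained and a recurrence-constrained bound. The resource classes, unit counts and recurrence values are invented for illustration and are not taken from the thesis.

```python
import math

def res_mii(op_counts, unit_counts):
    """Resource-constrained lower bound on the initiation interval: for each
    resource class, the number of operations needing it divided by the number
    of available units."""
    return max(math.ceil(op_counts[r] / unit_counts[r]) for r in op_counts)

def rec_mii(cycles):
    """Recurrence-constrained lower bound: for every dependence cycle in the
    loop body, total latency divided by total loop-carried distance."""
    return max(math.ceil(latency / distance) for latency, distance in cycles)

def fitness(op_counts, unit_counts, cycles):
    # The fitness is the larger of the two bounds.
    return max(res_mii(op_counts, unit_counts), rec_mii(cycles))

# Hypothetical loop body: six ALU operations on two ALUs, two memory
# operations on one memory unit, one recurrence of latency 4 over distance 2.
print(fitness({"alu": 6, "mem": 2}, {"alu": 2, "mem": 1}, [(4, 2)]))  # -> 3
```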
|
52 |
Vývoj SW a HIL testování pro modul monitorování stavu motoru / SW development and HIL testing for engine monitoring module. Sumtsov, Artem January 2014 (has links)
This master's thesis describes the model-based design development technique and its use for designing and testing algorithms. The technique is illustrated with a practical industrial example: the development of an engine condition monitoring module in cooperation with the company Unis. Development in contemporary aviation technology places great emphasis on monitoring equipment lifetime. Based on the algorithm's outputs, preventive maintenance can be planned with respect to actual wear and operating conditions. The algorithms are implemented in the Matlab/Simulink environment and subsequently tested on the dSpace platform.
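A minimal sketch of the kind of output-driven maintenance planning described above, assuming a hypothetical wear metric accumulated from monitored load cycles; the signal, rate and threshold are invented and unrelated to the actual Unis module.

```python
def plan_maintenance(cycle_loads, wear_per_unit_load=0.001, limit=1.0):
    """Accumulate a wear metric from monitored load cycles and report the
    cycle at which preventive maintenance should be scheduled."""
    wear = 0.0
    for i, load in enumerate(cycle_loads):
        wear += wear_per_unit_load * load
        if wear >= limit:
            return {"maintenance_due_at_cycle": i, "wear": wear}
    return {"maintenance_due_at_cycle": None, "wear": wear}

# Hypothetical per-cycle load values as a monitoring algorithm might emit.
print(plan_maintenance([5.0] * 100 + [20.0] * 30))
```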
|
53 |
A Systematic Approach for Obtaining Performance on Matrix-Like Operations. Veras, Richard Michael 01 August 2017 (has links)
Scientific Computation plays a critical role in the scientific process because it allows us to ask complex queries and test predictions that would otherwise be infeasible to perform experimentally. Because of its power, Scientific Computing has helped drive advances in many fields ranging from Engineering and Physics to Biology and Sociology to Economics and Drug Development and even to Machine Learning and Artificial Intelligence. Common among these domains is the desire for timely computational results, so a considerable amount of human expert effort is spent on obtaining performance for these scientific codes. However, this is no easy task because each of these domains presents its own unique set of challenges to software developers, such as domain-specific operations, structurally complex data and ever-growing datasets. Compounding these problems are the myriad of constantly changing, complex and unique hardware platforms that an expert must target. Unfortunately, an expert is typically forced to reproduce their effort across multiple problem domains and hardware platforms. In this thesis, we demonstrate the automatic generation of expert-level high-performance scientific codes for Dense Linear Algebra (DLA), Structured Mesh (Stencil), Sparse Linear Algebra and Graph Analytics. In particular, this thesis seeks to address the issue of obtaining performance on many complex platforms for a certain class of matrix-like operations that span many scientific, engineering and social fields. We do this by automating a method used for obtaining high performance in DLA and extending it to structured, sparse and scale-free domains. We argue that it is the use of the underlying structure found in the data from these domains that enables this process. Thus, obtaining performance for most operations does not occur in isolation from the data being operated on, but instead depends significantly on the structure of the data.
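To make the idea of structure-driven specialization concrete, here is a small sketch (not code from the thesis) in which the same matrix-like operation dispatches to a kernel that traverses only the data the structure actually stores:

```python
import numpy as np
from scipy.sparse import csr_matrix

def matvec(A, x):
    """Dispatch on the structure of A: the dense and sparse kernels compute
    the same matrix-like operation, but the sparse kernel visits only the
    stored nonzeros."""
    if isinstance(A, csr_matrix):
        y = np.zeros(A.shape[0])
        for i in range(A.shape[0]):
            for k in range(A.indptr[i], A.indptr[i + 1]):
                y[i] += A.data[k] * x[A.indices[k]]
        return y
    return A @ x  # dense path, handled by BLAS

x = np.ones(4)
dense = np.eye(4)
sparse = csr_matrix(dense)
assert np.allclose(matvec(dense, x), matvec(sparse, x))
```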
|
54 |
A Manifestation of Model-Code Duality: Facilitating the Representation of State Machines in the Umple Model-Oriented Programming Language. Badreldin, Omar January 2012 (has links)
This thesis presents research to build and evaluate the embedding of a textual form of state machines into high-level programming languages. The work entailed adding state machine syntax and code generation to the Umple model-oriented programming technology. The added concepts include states, transitions, actions, and composite states as found in the Unified Modeling Language (UML). This approach allows software developers to take advantage of the modeling abstractions in their textual environments, without sacrificing the added value of visual modeling.
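As a loose illustration of the idea of embedding a state machine textually in a program (this is plain Python, not Umple syntax or its generated code), a machine with states, transitions and entry actions can be declared as data and driven directly:

```python
# Toy state-machine specification: states, transitions and entry actions.
machine = {
    "initial": "Idle",
    "states": {
        "Idle":    {"on": {"start": "Running"}, "entry": lambda: print("idle")},
        "Running": {"on": {"stop": "Idle", "fail": "Error"},
                    "entry": lambda: print("running")},
        "Error":   {"on": {"reset": "Idle"}, "entry": lambda: print("error")},
    },
}

class StateMachine:
    def __init__(self, spec):
        self.spec = spec
        self.state = spec["initial"]

    def fire(self, event):
        target = self.spec["states"][self.state]["on"].get(event)
        if target is not None:            # unhandled events are ignored
            self.state = target
            self.spec["states"][target]["entry"]()
        return self.state

sm = StateMachine(machine)
sm.fire("start")   # -> Running
sm.fire("fail")    # -> Error
```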
Our efforts in developing state machines in Umple followed a test-driven approach to ensure high quality and usability of the technology. We have also developed a syntax-directed editor for Umple, similar to those available for other high-level programming languages. We conducted a grounded theory study of Umple users and used the findings iteratively to guide our experimental development. Finally, we conducted a controlled experiment to evaluate the effectiveness of our approach.
By enhancing the code to be almost as expressive as the model, we further support model-code duality: the notion that model and code are two faces of the same coin. Systems can and should be equally well specified textually and diagrammatically. Such duality will benefit modelers and coders alike. Our work suggests that code enhanced with state machine modeling abstractions is semantically equivalent to visual state machine models.
The flow of the thesis is as follows: the research hypothesis and questions are presented in “Chapter 1: Introduction”. The background is explored in “Chapter 2: Background”. “Chapter 3: Syntax and semantics of simple state machines” and “Chapter 4: Syntax and semantics of composite state machines” investigate simple and composite state machines in Umple, respectively. “Chapter 5: Implementation of composite state machines” presents the approach we adopt for the implementation of composite state machines, which avoids explosion of the amount of generated code. From this point on, the thesis presents empirical work. A grounded theory study is presented in “Chapter 6: A Grounded theory study of Umple”, followed by a controlled experiment in “Chapter 7: Experimentation”. These two chapters constitute our validation and evaluation of the Umple research. Related and future work is presented in “Chapter 8: Related work”.
|
55 |
Amoss: Improving Simulation Speed and Numerical Stability of Large-Scale Mixed Continuous/Conditional Stochastic Differential Simulations. Pretorius, Deon January 2021 (has links)
Amoss is an equation-orientated stochastic simulation platform, developed on open-source software. It is designed to facilitate the development and simulation of Sasol value chain models using the Moss methodology. The main difficulties with the original Moss methodology were that plant recycles were difficult to incorporate and that plant or model changes meant rebuilding the entire Moss model. The first version of automatic-Moss was developed by Edgar Whyte in an effort to address these problems. It was successful as a proof of concept, but the generated simulations were numerically unstable and very slow. A second version of the tool was to be developed to address numerical stability and simulation speed.
The stochastic simulations stemming from Amoss models are large-scale and contain mixed continuous/conditional algebraic equation sets together with first-order stochastic differential equations. Additionally, optimal flow allocation as a disjunctive optimisation is often encountered. The complexity of these factors makes finite difference approximation the main solution approach. The equation ordering, simulation approach and code generation features of the Amoss tool were investigated and re-implemented. A custom equation ordering method was implemented, which uses interval arithmetic and weighted maximal matching for numerically stable matching, followed by Dulmage-Mendelsohn decomposition and Cellier's tearing. For implicitly ordered systems, a fixed-point iterative Newton method was implemented in which conditional variables are separated from continuous variables for solution stability. The optimal allocation problem with heuristic allocation was generalised to plants with recycles. Fast simulation code was implemented, utilising parallel processing, efficient solving and function evaluation, efficient intermediate data storage and fast file writing. Amoss simulations are now substantially faster than the industry equivalent and can reliably model Moss methodology problems. / Dissertation (MEng (Control Engineering))--University of Pretoria, 2021. / Sasol / Chemical Engineering / MEng (Control Engineering) / Unrestricted
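The separation of conditional from continuous variables can be sketched as a fixed-point iteration around a Newton solve. Everything below (the residual, the switching rule, the tolerances) is a toy stand-in rather than Amoss code:

```python
import numpy as np

def numerical_jacobian(residual, x, cond, h=1e-6):
    """Forward-difference Jacobian of the continuous residual at fixed cond."""
    n = len(x)
    J = np.zeros((n, n))
    r0 = residual(x, cond)
    for j in range(n):
        xp = x.copy(); xp[j] += h
        J[:, j] = (residual(xp, cond) - r0) / h
    return J

def solve_mixed(x0, cond0, residual, update_cond, tol=1e-8, max_outer=20):
    """Freeze the conditional variables, Newton-solve the continuous
    equations, re-evaluate the conditionals, and repeat until a fixed point."""
    x, cond = np.asarray(x0, dtype=float), cond0
    for _ in range(max_outer):
        for _ in range(50):                      # Newton on the continuous part
            r = residual(x, cond)
            step = np.linalg.solve(numerical_jacobian(residual, x, cond), -r)
            x = x + step
            if np.linalg.norm(step) < tol:
                break
        new_cond = update_cond(x)                # re-evaluate conditionals
        if new_cond == cond:
            return x, cond                       # fixed point reached
        cond = new_cond
    raise RuntimeError("no consistent fixed point found")

# Toy system: demand of 5 units, but supply is capped at 4 once flow exceeds 3.
residual = lambda x, limited: np.array([x[0] - (4.0 if limited else 5.0)])
update_cond = lambda x: x[0] > 3.0
print(solve_mixed([0.0], False, residual, update_cond))  # -> (array([4.]), True)
```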
|
56 |
Generované peephole optimalizace v překladači LLVM / Generated Peephole Optimizations in LLVM Compiler. Melo, Stanislav January 2016 (has links)
One of the important features of application-specific processors is performance. To maximize it, the compiler must adapt to the needs of the processor it compiles for and must generate the most efficient code. One way to do that is to search for groups of instructions that can be implemented as a single instruction with multiple outputs. Afterwards, the generated code can be passed through peephole optimizations that search for instruction patterns and replace them with other instructions to make the code more efficient. This paper describes the problem of finding and selecting suitable candidates for multiple-output instructions. It also provides a brief overview of the best-known algorithms that solve this problem. Finally, it examines possibilities for incorporating these optimizations into the LLVM compiler.
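To make the pattern-replacement idea concrete, here is a toy peephole pass over an invented three-address IR. The instruction names and patterns are illustrative only and do not reflect LLVM's actual peephole (instcombine) machinery:

```python
def peephole(insns):
    """Slide a small window over a toy instruction list (op, dst, srcs...) and
    rewrite matched patterns into cheaper or combined instructions."""
    out, i = [], 0
    while i < len(insns):
        cur = insns[i]
        nxt = insns[i + 1] if i + 1 < len(insns) else None
        # mul t, a, b ; add d, t, c  ->  madd d, a, b, c
        # (a multiply-accumulate: a typical multiple-output/compound candidate;
        #  simplification: assumes t is not used anywhere else)
        if cur[0] == "mul" and nxt and nxt[0] == "add" and nxt[2] == cur[1]:
            out.append(("madd", nxt[1], cur[2], cur[3], nxt[3]))
            i += 2
        # add d, a, 0  ->  mov d, a
        elif cur[0] == "add" and cur[3] == 0:
            out.append(("mov", cur[1], cur[2]))
            i += 1
        else:
            out.append(cur)
            i += 1
    return out

print(peephole([("mul", "t1", "a", "b"), ("add", "y", "t1", "c")]))
# -> [('madd', 'y', 'a', 'b', 'c')]
```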
|
57 |
A Generalized Framework for Automatic Code Partitioning and Generation in Distributed Systems. Sairaman, Viswanath 05 February 2010 (has links)
In distributed heterogeneous systems, partitioning application software for distributed execution is a challenge in itself. The task of code partitioning for distributed processing involves partitioning the code into clusters and mapping those code clusters to individual processing elements interconnected through a high-speed network. Code generation is the process of converting the code partitions into individually executable code clusters and satisfying the code dependencies by adding communication primitives to send and receive data between dependent code clusters. In this work, we describe a generalized framework for automatic code partitioning and code generation for distributed heterogeneous systems. A model for system-level design and synthesis using transaction-level models has also been developed and is presented. The application programs, along with the partition primitives, are converted into independently executable concrete implementations. The process consists of two steps: first, translating the primitives of the application program into equivalent code clusters, and then scheduling the implementations of these code clusters according to the inherent data dependencies. Further, the original source code needs to be reverse engineered in order to create a meta-data table describing the program elements and dependency trees. The data gathered is used along with Parallel Virtual Machine (PVM) primitives for enabling the communication between the partitioned programs in the distributed environment. The framework consists of profiling tools, a partitioning methodology, architectural exploration and cost analysis tools. The partitioning algorithm is based on clustering, in which the code clusters are created to minimize communication overhead, represented as data transfers in the task graph for the code. The proposed approach has been implemented and tested for different applications and compared with simulated annealing and tabu search based partitioning algorithms. The objective of partitioning is to minimize the communication overhead. While the proposed approach performs comparably with simulated annealing and better than tabu search based approaches in most cases in terms of communication overhead reduction, it is conclusively faster than simulated annealing and tabu search by an order of magnitude, as indicated by simulation results. The proposed framework for system-level design/synthesis provides an end-to-end rapid prototyping approach for aiding in architectural exploration and design optimization. The level of abstraction in the design phase can be fine-tuned using transaction-level models.
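A minimal sketch of clustering-based partitioning in the spirit described above; the task graph, edge weights and size limit are made up, and the real framework additionally handles mapping, scheduling and PVM communication insertion:

```python
def cluster_tasks(edges, num_tasks, max_cluster_size):
    """Greedy sketch: repeatedly merge the two clusters joined by the heaviest
    remaining edge while the merged cluster stays within the size limit.
    Edges whose endpoints share a cluster incur no communication cost."""
    cluster = {t: t for t in range(num_tasks)}            # task -> cluster id
    size = {t: 1 for t in range(num_tasks)}
    for (u, v), w in sorted(edges.items(), key=lambda e: -e[1]):
        cu, cv = cluster[u], cluster[v]
        if cu != cv and size[cu] + size[cv] <= max_cluster_size:
            for t, c in cluster.items():                  # merge cv into cu
                if c == cv:
                    cluster[t] = cu
            size[cu] += size.pop(cv)
    cut = sum(w for (u, v), w in edges.items() if cluster[u] != cluster[v])
    return cluster, cut

# Hypothetical task graph: edge weights are data-transfer volumes.
edges = {(0, 1): 10, (1, 2): 3, (2, 3): 8, (0, 3): 1}
print(cluster_tasks(edges, num_tasks=4, max_cluster_size=2))
```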
|
58 |
DeepMACSS : Deep Modular Analyzer for Creating Semantics and generate code from Sketch. Eriksson, David January 2022 (has links)
Scientific areas such as artificial intelligence have exploded in popularity, and advanced techniques such as deep learning have been applied in various areas to automate tasks. As many software developers know, creating prototypes can be both daunting and very time-consuming. In this thesis we explore the possibility of utilizing deep learning techniques to automate the task of creating prototypes from hand-drawn sketches. We cover a method for automating the tedious tasks that come with using deep learning, such as labelling data. This is automated with image-editing techniques, which makes the automatic labelling of new data efficient and also allows new data to be created artificially to extend the data obtained. The thesis compares three different deep learning architectures, which are trained and evaluated to obtain the best resulting model. It also investigates how performance changes depending on data set size, pre-processing steps, architecture, and extendibility. The architectures utilize transfer learning so that they can be extended with new components without great loss in overall performance. To read text written on the provided image, optical character recognition is utilized with different pre-processing techniques to obtain the best possible result. A code generator built using template-based design is also proposed: one main generator delegates to a language-specific generator that produces the code. The reasoning behind splitting the generator in two is to provide a more extendable solution, since the language-specific generator can be swapped out for any language independently of the results of the deep learning architecture. The provided solution aims to be modular so that the solution as a whole is more future-proof. The results show that the proposed deep learning model might not have enough prediction accuracy to be used in a production environment. Given the low prediction accuracy, conclusions are drawn on how the accuracy can be increased, which would lead to better results for the solution.
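The two-stage generator idea can be sketched as follows; the component kinds, templates and class names are invented for illustration and are not DeepMACSS's actual implementation:

```python
class PythonGenerator:
    """Hypothetical language-specific back end; another back end (e.g. for
    Java) could be swapped in without touching the main generator."""
    def component(self, kind, name):
        templates = {
            "button": 'def on_{name}_click():\n    pass\n',
            "textfield": '{name} = input("{name}: ")\n',
        }
        return templates.get(kind, "# unknown component: {name}\n").format(name=name)

class MainGenerator:
    """Walks the components predicted from the sketch and delegates each one
    to the pluggable language-specific generator."""
    def __init__(self, backend):
        self.backend = backend

    def generate(self, components):
        return "".join(self.backend.component(kind, name) for kind, name in components)

# Components as (kind, label) pairs, e.g. as predicted from a hand-drawn sketch.
predicted = [("button", "submit"), ("textfield", "username")]
print(MainGenerator(PythonGenerator()).generate(predicted))
```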
|
59 |
Automatically Generating Tests from Natural Language Descriptions of Software Behavior. Sunil Kamalakar, FNU 18 October 2013 (has links)
Behavior-Driven Development (BDD) is an emerging agile development approach where all stakeholders (including developers and customers) work together to write user stories in structured natural language to capture a software application's functionality in terms of required "behaviors". Developers then manually write "glue" code so that these scenarios can be executed as software tests. This glue code represents individual steps within unit and acceptance test cases, and tools exist that automate the mapping from scenario descriptions to manually written code steps (typically using regular expressions). Instead of requiring programmers to write manual glue code, this thesis investigates a practical approach to convert natural language scenario descriptions into executable software tests fully automatically. To show feasibility, we developed a tool called Kirby that uses natural language processing techniques, code information extraction and probabilistic matching to automatically generate executable software tests from structured English scenario descriptions. Kirby relieves the developer from the laborious work of writing code for the individual steps described in scenarios, so that developers and customers alike can focus on the scenarios as pure behavior descriptions (understandable to all, not just programmers). Results from assessing the performance and accuracy of this technique are presented. / Master of Science
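For context, the regular-expression glue that BDD tools conventionally require (and that Kirby aims to make unnecessary) looks roughly like the following sketch; the step patterns and functions are invented, not Kirby's or any specific tool's API:

```python
import re

STEPS = []  # registry of (compiled pattern, step function) pairs

def step(pattern):
    """Decorator that registers a step function under a regex pattern."""
    def register(fn):
        STEPS.append((re.compile(pattern), fn))
        return fn
    return register

@step(r'the user enters "(.+)" as the username')
def enter_username(name):
    print(f"typing username: {name}")

@step(r"the user clicks the (\w+) button")
def click_button(label):
    print(f"clicking: {label}")

def run_scenario(lines):
    """Match each natural-language step against the registry and execute it."""
    for line in lines:
        for pattern, fn in STEPS:
            m = pattern.fullmatch(line)
            if m:
                fn(*m.groups())
                break
        else:
            raise LookupError(f"no step definition matches: {line!r}")

run_scenario(['the user enters "alice" as the username',
              "the user clicks the login button"])
```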
|
60 |
Code Generation and Global Optimization Techniques for a Reconfigurable PRAM-NUMA Multicore Architecture. Hansson, Erik January 2014 (has links)
In this thesis we describe techniques for code generation and global optimization for a PRAM-NUMA multicore architecture. We focus specifically on the REPLICA architecture, a family of massively multithreaded very long instruction word (VLIW) chip multiprocessors with chained functional units and a reconfigurable emulated shared on-chip memory. The on-chip memory system supports two execution modes, PRAM and NUMA, which can be switched between at run-time. PRAM mode is considered the standard execution mode and targets mainly applications with very high thread-level parallelism (TLP). In contrast, NUMA mode is optimized for sequential legacy applications and applications with a low amount of TLP. Different versions of the REPLICA architecture have different numbers of cores, hardware threads and functional units. In order to utilize the REPLICA architecture efficiently, we have made several contributions to the development of a compiler for REPLICA target code generation. It supports code generation for both PRAM mode and NUMA mode and can generate code for different versions of the processor pipeline (i.e. for different numbers of functional units). It includes optimization phases to increase the utilization of the available functional units. We have also contributed to the quantitative evaluation of PRAM and NUMA mode. The results show that PRAM mode often suits programs with irregular memory access patterns and irregular control flow best, while NUMA mode suits regular programs better. However, for a particular program it is not always obvious which mode, PRAM or NUMA, will show the best performance. To tackle this, we contributed a case study on generic stencil computations that uses machine-learning-derived cost models to automatically select at runtime which mode to execute in. We extended this to also include sequences of kernels.
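The runtime mode selection can be sketched as picking the mode whose learned cost model predicts the lower run time; the feature names and linear coefficients below are placeholders, not REPLICA's actual cost models:

```python
def choose_mode(features, model):
    """Evaluate a per-mode cost model on the kernel's features and return the
    mode with the lowest predicted cost, together with all predicted costs."""
    costs = {mode: sum(coeff * features.get(name, 0.0)
                       for name, coeff in coeffs.items())
             for mode, coeffs in model.items()}
    return min(costs, key=costs.get), costs

# Toy learned cost models: PRAM pays less for irregular accesses,
# NUMA pays less for regular, cache-friendly work.
model = {
    "PRAM": {"ops": 1.0, "irregular_accesses": 0.2},
    "NUMA": {"ops": 0.8, "irregular_accesses": 1.5},
}
print(choose_mode({"ops": 1000, "irregular_accesses": 400}, model))
```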
|