61

JAVA VIRTUAL MACHINE DESIGN FOR EMBEDDED SYSTEMS: ENERGY, TIME PREDICTABILITY AND PERFORMANCE

Sun, Yu 01 December 2010 (has links)
Embedded systems can be found everywhere in our daily lives. Due to the great variety of embedded devices, the platform-independent Java language provides a good solution for embedded system development. The Java virtual machine (JVM) is the most critical component of any Java platform, so it is extremely important to study JVM designs tailored to embedded systems. The key challenges in designing a successful JVM for embedded systems are energy efficiency, time predictability and performance, each of which is investigated in this dissertation. We first study the energy behavior of the JVM on embedded systems. With a cycle-accurate simulator, we study each stage of Java execution separately to test the effects of different software and hardware configurations. We then introduce an alternative Adaptive Optimization System (AOS) model that estimates cost/benefit using energy data instead of running time, and we tune its parameters to study how dynamic compilation and optimization in Jikes RVM can be improved in terms of energy consumption. To further reduce the energy dissipation of the JVM on embedded systems, we study adaptive drowsy cache control for Java applications, where the JVM can be used to make better decisions about drowsy cache control. We explore the impact of different phases of Java applications on the timing behavior of cache usage, and then propose several techniques to adaptively control the drowsy cache to reduce energy consumption with minimal impact on performance. Observing that the traditional Java code generation and instruction fetch paths are not efficient, we study three hardware-based code caching strategies that attempt to write and read dynamically generated Java code faster and more energy-efficiently. Time predictability is another key challenge for the JVM on embedded systems, so we exploit multicore computing to reduce the timing unpredictability caused by dynamic compilation and adaptive optimization. Our goal is to retain performance comparable to that of traditional dynamic compilation while obtaining better time predictability for the JVM. We study pre-compilation techniques to utilize another core more efficiently, and we develop a Pre-optimization on Another Core (PoAC) scheme to replace the AOS in Jikes RVM, which is very sensitive to execution time variation and greatly impacts time predictability. Finally, to meet the performance challenge of the JVM on embedded systems, we propose two new approaches that automatically parallelize Java programs at run time. These approaches rely on trace information collected during program execution and dynamically recompile Java bytecode so that it can be executed in parallel: one approach uses trace information to improve traditional loop parallelization, and the other parallelizes traces instead of loop iterations.
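As a rough illustration of the energy-driven cost/benefit estimate mentioned above, the following Python sketch compares the predicted future energy of a method at each optimization level against the one-off recompilation energy. The level table, the linear energy model and the function names are hypothetical assumptions for illustration; this is not the dissertation's Jikes RVM implementation.

```python
# Illustrative sketch of an energy-based cost/benefit model in the spirit of
# an Adaptive Optimization System (AOS).  All numbers and names are
# hypothetical; the real Jikes RVM model differs in detail.

# Assumed per-level data: relative energy per invocation and one-off
# recompilation energy (arbitrary units).
LEVELS = {
    0: {"energy_per_call": 1.00, "compile_energy": 0.0},   # baseline
    1: {"energy_per_call": 0.55, "compile_energy": 40.0},
    2: {"energy_per_call": 0.35, "compile_energy": 160.0},
}

def choose_level(current_level: int, predicted_future_calls: float) -> int:
    """Pick the optimization level with the largest net energy saving."""
    current = LEVELS[current_level]["energy_per_call"] * predicted_future_calls
    best_level, best_net = current_level, 0.0
    for level, cfg in LEVELS.items():
        if level <= current_level:
            continue
        future = cfg["energy_per_call"] * predicted_future_calls
        net = (current - future) - cfg["compile_energy"]   # benefit - cost
        if net > best_net:
            best_level, best_net = level, net
    return best_level

if __name__ == "__main__":
    # A hot method predicted to run 1000 more times: recompiling pays off.
    print(choose_level(0, 1000))   # -> 2
    # A cold method: stay at the baseline level.
    print(choose_level(0, 50))     # -> 0
```

The same structure applies when the cost/benefit is expressed in running time; only the units in the table change.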
62

Scratchpad Management in Software Managed Manycore Architectures

January 2017 (has links)
abstract: Caches have long been used to reduce memory access latency. However, the increasing complexity of cache coherence brings significant challenges to processor design as the number of cores grows. While making caches scalable remains an important research problem, some researchers are exploring a more power-efficient kind of SRAM called scratchpad memory (SPM). SPMs consume significantly less area and less energy per access than caches, and therefore make the design of on-chip memories much simpler. Unlike caches, which fetch data from memory automatically, an SPM requires explicit instructions for data transfers. Architectures built only from SPMs are therefore called software-managed manycore (SMM) architectures, since their data movement relies on software. SMM processors have been widely used in different areas, such as embedded computing, network processing, and even high-performance computing. While SMM processors provide a low-power platform, the hardware alone does not guarantee power efficiency if the applications running on such processors deliver low performance; efficient software techniques are therefore required. A large body of management techniques for SMM architectures is compiler-directed, since inserting data movement operations by hand forces programmers to trace the flow of data, which is error-prone and sometimes difficult, if not impossible. This thesis develops compiler-directed techniques to manage data transfers efficiently for embedded applications on SMMs. The techniques analyze programs to find the proper program points and insert data movement instructions accordingly. They manage the code, stack and heap data of applications, and reduce execution time by 14%, 52% and 80%, respectively, compared to their predecessors on typical embedded applications. On top of managing local data, a technique is also developed for shared data in SMM architectures; experimental results show it achieves more than 2X speedup on average over the previous technique. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2017
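To make the idea of compiler-inserted data transfers concrete, here is a minimal Python sketch of scratchpad stack management: when a new frame would overflow the SPM stack budget, the compiler schedules an eviction of an older frame to main memory and a restore on return. The budget, the dma_put/dma_get primitives and the eviction policy are assumptions for illustration, not the thesis's actual code, stack and heap managers.

```python
# Minimal sketch of compiler-directed stack management for a software-managed
# scratchpad (SPM).  Function names, sizes and the dma_put/dma_get primitives
# are hypothetical; the thesis techniques are more sophisticated.

SPM_STACK_BYTES = 4096  # assumed scratchpad stack budget

def insert_stack_transfers(call_chain, frame_sizes):
    """Given a call chain and per-function frame sizes, return the program
    points where frames must be evicted to (and restored from) main memory."""
    actions, resident, used = [], [], 0
    for fn in call_chain:
        need = frame_sizes[fn]
        # Evict the oldest resident frames until the new frame fits in the SPM.
        while resident and used + need > SPM_STACK_BYTES:
            victim = resident.pop(0)
            used -= frame_sizes[victim]
            actions.append(f"before call to {fn}: dma_put(frame of {victim})")
        resident.append(fn)
        used += need
    # Frames are restored lazily when control returns to an evicted function.
    for victim in [f for f in call_chain if f not in resident]:
        actions.append(f"on return to {victim}: dma_get(frame of {victim})")
    return actions

if __name__ == "__main__":
    sizes = {"main": 2048, "solve": 1536, "kernel": 1024}
    for a in insert_stack_transfers(["main", "solve", "kernel"], sizes):
        print(a)
```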
63

Development of the NoGAP CL Hardware Description Language and its Compiler

Blumenthal, Carl January 2007 (has links)
The need for a more general hardware description language aimed specifically at processors, together with vague notions and visions of how such a language would be realized, led to this thesis. The aim was to use those visions and initial ideas to evolve and formalize a language and to begin implementing the tools to use it. The language, called NoGAP Common Language, is designed to give the programmer the freedom to implement almost any processor design without being encumbered by many of the tedious tasks normally present in the creation process. While evolving the language, it was decided to borrow syntax from C++ and Verilog to make the code and concepts easy to understand. The main advantages of NoGAP Common Language compared to RTL languages are:
- the ability to define the data paths of instructions separately from each other and have them merged automatically, along with assigned timings, to form the pipeline;
- having control paths routed automatically by activating named clauses of code coupled to control signals;
- being able to specify a decoder, where the instructions and control structures are defined, to which control signals are routed.
The compiler was implemented in C++ with Bison and Flex and uses an AST, a symbol table, and a connection graph. The AST is traversed by several functions to generate the connection graph, where the instructions of the processor can be merged into a pipeline. The compiler is in the early stages of development and much remains to be done and solved. It has become clear, though, that the concepts of NoGAP Common Language can be implemented and are not just visions. / The need for a more general hardware description language specialized for processors, and visions of such a language, gave rise to this thesis. The goal was to develop those visions, formalize them into a working language, and begin implementing its tools. The language, called NoGAP Common Language, is designed to give the programmer the freedom to implement almost any processor design without being weighed down by many of the monotonous tasks that otherwise have to be performed. During the development process it was decided to borrow much of the syntax from C++ and Verilog to make the language easy to understand and recognize. The main advantages of developing in NoGAP Common Language compared to ordinary RTL languages such as Verilog are:
- being able to specify the data paths of instructions separately from each other and have them automatically merged, with the help of timing specifications, into a pipeline;
- having control paths routed automatically by activating named clauses of code coupled to control signals;
- being able to specify a decoder, to which the control paths can be connected, where the encoding of instructions is given.
The compiler, implemented with C++, Bison and Flex, uses an AST structure, a symbol table and a signal-path graph. The AST structure is traversed by several functions that build up the signal-path graph, where the processor's instructions are merged into a pipeline. Development of the compiler is still only in its first stages and much remains to be done and solved. It has become clear, however, that the concepts of NoGAP Common Language can be implemented and are not just loose visions.
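A minimal sketch of the pipeline-merging idea follows, assuming a toy representation in which each instruction lists its operations with explicit stage timings; the merge rule and the data layout are illustrative only and are not the actual NoGAP CL compiler.

```python
# Illustrative sketch of how separately defined instruction data paths might
# be merged into a pipeline by their assigned timings, in the spirit of the
# NoGAP Common Language approach described above.

from collections import defaultdict

# Each instruction lists (stage, operation) pairs with explicit timings.
INSTRUCTIONS = {
    "add": [(0, "fetch"), (1, "decode"), (2, "alu_add"), (3, "writeback")],
    "ld":  [(0, "fetch"), (1, "decode"), (2, "addr_calc"), (3, "mem_read"),
            (4, "writeback")],
}

def merge_pipeline(instructions):
    """Union the per-instruction operations stage by stage."""
    stages = defaultdict(set)
    for ops in instructions.values():
        for stage, op in ops:
            stages[stage].add(op)
    return [sorted(stages[s]) for s in sorted(stages)]

if __name__ == "__main__":
    for i, ops in enumerate(merge_pipeline(INSTRUCTIONS)):
        print(f"stage {i}: {ops}")
    # Shared operations such as 'fetch' and 'decode' appear once per stage;
    # instruction-specific operations are multiplexed within their stage.
```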
64

COMPILER FOR A TRACE-BASED DEEP NEURAL NETWORK ACCELERATOR

Andre Xian Ming Chang (6789503) 12 October 2021 (has links)
Deep Neural Networks (DNNs) are the algorithm of choice for various applications that require modeling large datasets, such as image classification, object detection and natural language processing. DNNs present highly parallel workloads that lead to the need for custom hardware accelerators. Deep Learning (DL) models specialized for different tasks require programmable custom hardware, and a compiler to efficiently translate various DNNs into an efficient dataflow to be executed on the accelerator. Given a DNN-oriented custom instruction set, various compilation phases are needed to generate efficient code while maintaining the generality to support many models. Different compilation phases need different levels of hardware awareness so that the hardware's full potential is exploited while its constraints are respected. The goal of this work is to present a compiler workflow and its hardware-aware optimization passes for a custom DNN hardware accelerator. The compiler uses model definition files created with popular frameworks to generate custom instructions. Different levels of hardware-aware code optimization are applied to improve performance and data reuse. The software also exposes an interface to run the accelerator implemented on various FPGA platforms, providing an end-to-end solution.
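The following Python sketch illustrates what such a hardware-aware flow can look like: a fusion pass, a tiling pass bounded by an assumed on-chip buffer size, and a lowering pass that emits LOAD/COMPUTE/STORE pseudo-instructions. The pass names, the 64 KiB buffer and the instruction format are assumptions for illustration, not the accelerator's actual instruction set.

```python
# Hedged sketch of a DNN-accelerator compilation pipeline: fuse, tile, emit.
# The buffer size and pseudo-instruction format are assumptions.

ON_CHIP_BUFFER = 64 * 1024  # bytes, assumed accelerator constraint

def fuse_layers(layers):
    """Fuse an elementwise activation into the preceding conv/matmul layer."""
    fused, i = [], 0
    while i < len(layers):
        if (i + 1 < len(layers) and layers[i]["op"] in ("conv", "matmul")
                and layers[i + 1]["op"] == "relu"):
            fused.append({**layers[i], "op": layers[i]["op"] + "+relu"})
            i += 2
        else:
            fused.append(layers[i])
            i += 1
    return fused

def tile(layer):
    """Split a layer's output bytes into chunks that fit the on-chip buffer."""
    tiles, remaining = [], layer["out_bytes"]
    while remaining > 0:
        chunk = min(remaining, ON_CHIP_BUFFER)
        tiles.append(chunk)
        remaining -= chunk
    return tiles

def emit(layers):
    """Lower each tiled layer to LOAD / COMPUTE / STORE pseudo-instructions."""
    program = []
    for layer in layers:
        for n, chunk in enumerate(tile(layer)):
            program += [f"LOAD  {layer['name']}.t{n} ({chunk} B)",
                        f"COMP  {layer['op']}",
                        f"STORE {layer['name']}.t{n}"]
    return program

if __name__ == "__main__":
    net = [{"name": "c1", "op": "conv", "out_bytes": 150_000},
           {"name": "a1", "op": "relu", "out_bytes": 150_000},
           {"name": "f1", "op": "matmul", "out_bytes": 40_000}]
    for insn in emit(fuse_layers(net)):
        print(insn)
```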
65

Kompilátor zdrojového kódu pro PLC SIMATIC / Compiler of source code for SIMATIC PLC

Kubát, Zdeněk January 2014 (has links)
The thesis discusses a standalone compiler for SIEMENS STEP 7 and WinCC V7.0 applications. The compiler processes source files generated in STEP 7 and saves the processed data in .xls intermediate files. The data in the intermediate files serve as source data for functions that generate tag files to be imported into WinCC V7.0. The compiler was created in the C# programming language using Visual Studio 2010.
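A simplified Python sketch of the data flow described in the abstract: parse symbol definitions exported from STEP 7 into an intermediate table, then emit a tag list for import into WinCC. The line format, the tag fields and the CSV output are assumptions (the thesis itself works in C# and uses .xls intermediate files).

```python
# Sketch of the compiler's data flow: STEP 7 export -> intermediate table ->
# WinCC tag import.  Formats are simplified assumptions.
import csv
import io

def parse_step7_symbols(text):
    """Parse lines of the assumed form: <symbol> <address> <datatype>."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbol, address, datatype = parts
            rows.append({"symbol": symbol, "address": address,
                         "datatype": datatype})
    return rows

def write_wincc_tags(rows, out):
    """Write an intermediate, spreadsheet-like table usable as a tag import."""
    writer = csv.DictWriter(out, fieldnames=["symbol", "address", "datatype"])
    writer.writeheader()
    writer.writerows(rows)

if __name__ == "__main__":
    source = "Motor_On Q0.0 BOOL\nTank_Level IW64 INT\n"
    buf = io.StringIO()
    write_wincc_tags(parse_step7_symbols(source), buf)
    print(buf.getvalue())
```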
67

Zadní část překladače podmnožiny jazyka C pro 8-bitový procesor / Compiler Back-End of Subset of Language C for 8-Bit Processor

Horník, Jakub January 2011 (has links)
A compiler allows us to describe an algorithm in a high-level programming language with a higher level of abstraction and readability than low-level machine code. This work describes the design of a compiler back-end for a subset of the C language targeting the 8-bit soft-core microcontroller Xilinx PicoBlaze-3. The design is described from the initial selection of a suitable framework to the implementation itself. One of the main motivations for this work is that no suitable compiler exists for this processor.
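As a small, hedged illustration of what such a back-end does, the sketch below lowers a C-like expression to PicoBlaze-3-style assembly. The register allocation and the handling of constants are simplified assumptions; the thesis's back-end covers a much larger C subset.

```python
# Hedged sketch of a tiny code generator lowering the C-like expression
# "a + b - 4" to PicoBlaze-3-style assembly.  The allocation scheme is an
# assumption for illustration.

def lower(var_regs, expr):
    """expr is a list of (op, operand) pairs applied left to right."""
    code = []
    op0, first = expr[0]
    assert op0 == "load"
    code.append(f"LOAD s0, {var_regs.get(first, first)}")
    for op, operand in expr[1:]:
        mnemonic = {"+": "ADD", "-": "SUB"}[op]
        if operand in var_regs:                    # register operand
            code.append(f"{mnemonic} s0, {var_regs[operand]}")
        else:                                      # 8-bit constant operand
            code.append(f"{mnemonic} s0, {int(operand):02X}")
    return code

if __name__ == "__main__":
    regs = {"a": "s1", "b": "s2"}                  # assumed allocation
    # a + b - 4, with the result accumulated in s0
    for line in lower(regs, [("load", "a"), ("+", "b"), ("-", "4")]):
        print(line)
```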
68

Software Framework to Support Operations of Nanosatellite Formations / Software Framework für die Unterstützung des Betriebs von Nanosatelliten-Formationen

Dombrovski, Veaceslav January 2022 (has links) (PDF)
Since the first CubeSat launch in 2003, the hardware and software complexity of nanosatellites has continuously increased. To keep up with the continuously increasing mission complexity and to retain the primary advantages of a CubeSat mission, a new approach to the overall space and ground software architecture and protocol configuration is elaborated in this work. The aim of this thesis is to propose a uniform software and protocol architecture as a basis for the software development, test, simulation and operation of multiple pico-/nanosatellites based on ultra-low-power components. In contrast to single-CubeSat missions, current and upcoming nanosatellite formation missions require faster and more straightforward development, pre-flight testing and calibration procedures, as well as the simultaneous operation of multiple satellites. A dynamic and decentralized Compass mission network, consisting of uniformly accessible nodes, was established in multiple active CubeSat missions. The Compass middleware was developed to unify the communication and functional interfaces between all involved mission-related software and hardware components. All systems can access each other via dynamic routes to perform service-based M2M communication. With the proposed model-based communication approach, all states, abilities and functionalities of a system are accessed in a uniform way. The Tiny scripting language was designed to allow dynamic code execution on ultra-low-power components as a basis for a constraint-based in-orbit scheduler and experiment execution. The implemented Compass Operations front-end enables far-reaching monitoring and control capabilities for all ground and space systems. Its integrated constraint-based operations task scheduler allows the recording of complex satellite operations, which are conducted automatically during overpasses. The outcome of this thesis became an enabling technology for the UWE-3, UWE-4 and NetSat CubeSat missions. / Since the launch of the first CubeSat in 2003, the complexity of nanosatellites has steadily increased. To keep pace with the growing requirements while retaining the main advantages of a CubeSat mission, a uniform protocol and software architecture for the entire space and ground segment is proposed. This work proposes a uniform software and protocol architecture as a basis for the software development, testing and operation of multiple pico-/nanosatellites. In contrast to missions with only a single CubeSat, future nanosatellite formations require faster and simpler development, pre-flight tests and calibration procedures, as well as the ability to operate several satellites simultaneously. A dynamic and decentralized Compass mission network consisting of uniformly accessible nodes was realized in several CubeSat missions. The Compass middleware was developed to unify both the communication and the functional interfaces between all software and hardware systems involved in a mission: operator workstations, ground stations, mission servers, test facilities, simulations and the subsystems of all satellites. With the model-based communication approach, all states and functions of a system are accessed in a uniform way. The Tiny scripting language developed here enables the execution of dynamic code on low-power systems in order to realize in-orbit schedulers. The Compass Operations front-end offers numerous graphical components with which all space and ground segment systems are uniformly monitored, controlled and operated. The integrated operations scheduler allows the recording of complex satellite operation tasks that are executed automatically during overpasses. The results of this work became an enabling technology for the UWE-3, UWE-4 and NetSat missions.
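The "uniformly accessible node" idea can be sketched as follows: every node exposes named states and services, and any system reaches them through a route string. The class and method names are hypothetical stand-ins, not the actual Compass middleware API.

```python
# Toy sketch of uniformly accessible nodes reached via route strings.
# Names are hypothetical illustrations only.

class Node:
    def __init__(self, name):
        self.name, self.states, self.services = name, {}, {}

    def expose(self, key, value=None, service=None):
        """Register either a readable state or a callable service."""
        if service is not None:
            self.services[key] = service
        else:
            self.states[key] = value

class Network:
    """Toy router that resolves 'node/key' paths to states or service calls."""
    def __init__(self):
        self.nodes = {}

    def add(self, node):
        self.nodes[node.name] = node

    def get(self, path):
        node, key = path.split("/", 1)
        return self.nodes[node].states[key]

    def call(self, path, *args):
        node, key = path.split("/", 1)
        return self.nodes[node].services[key](*args)

if __name__ == "__main__":
    net = Network()
    uwe = Node("UWE-4")
    uwe.expose("eps.battery_voltage", value=3.9)
    uwe.expose("adcs.set_mode", service=lambda mode: f"ADCS mode -> {mode}")
    net.add(uwe)

    print(net.get("UWE-4/eps.battery_voltage"))         # uniform state access
    print(net.call("UWE-4/adcs.set_mode", "detumble"))  # uniform service call
```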
69

Checkpointing without operating system intervention: Implementing Griewank's algorithm

Heller, Richard January 1998 (has links)
No description available.
70

Machine learning enhanced code optimization for high-level synthesis (ML-ECOHS)

Munafo, Robert P. 24 May 2024 (has links)
While Field-Programmable Gate Arrays (FPGAs) exist in many design configurations throughout the data center, cloud, and edge, the promise of performance and flexibility offered by the FPGA often remains unrealized for lack of hardware design expertise, with most computation remaining in fixed hardware such as CPUs, GPUs, and ASICs (e.g., tensor processors). Identifying programmability as a barrier to FPGA usage, we seek to augment High-Level Synthesis (HLS) design flows with machine learning. The overall goal of this dissertation is to advance the art of using unmodified high-level language (HLL) programs to create FPGA configurations that are performant, programmable, and portable. The problems in using HLL code to program FPGAs arise from the serial execution model of the target application code, in particular the mismatch between that model and the arbitrary dataflow model of the target hardware. However, a variety of code transformation techniques, tedious to perform by hand but readily and effortlessly done by CPU compilers, allow many compute-intensive operations to be transformed into a form that is highly or massively parallel. A challenge then exists in selecting the best set of optimizations and an order in which to perform them, a choice among staggeringly many options. Brute-force and automated orthogonal search techniques have failed to produce solutions that lie within the realm of practicality. We evaluate the suitability of machine learning (ML) models to address this challenge. We develop and assess designs for systems that use ML and present feedback to these models based on their recommendations. In support of using ML models, we begin by developing and assessing methods for preparing the original program as input for processing by the model. We next develop and assess methods to apply the model's output to a compilation system and evaluate the results for use as feedback. Machine learning experiments are performed to demonstrate their potential, and feasibility is studied through simulations and specialized evaluation techniques as appropriate. Specific new artifacts are created, including extensions to the open-source GCC compiler. The significance of this work lies in several areas: the many options and variants of the overall design that are considered, with their distinct advantages and disadvantages; the application of proper units and baselines to experimental measurements; and the numerous contributions in the form of published extensions and improvements to open-source projects.
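The model-in-the-loop flow described above can be sketched as a simple feedback loop: a policy proposes a pass ordering, the toolchain compiles and measures it, and the measurement is fed back to guide the next proposal. The epsilon-greedy policy, the toy latency model and the pass names below are stand-ins, not the dissertation's GCC-based system.

```python
# Hedged sketch of ML-guided pass-ordering search with measurement feedback.
# The policy and latency model are toy stand-ins.
import random

PASSES = ["loop-unroll", "pipeline", "array-partition", "inline", "dataflow"]

def compile_and_measure(ordering):
    """Stand-in for HLS compilation + timing; returns estimated latency."""
    # Toy model: 'pipeline' helps more when it runs after 'loop-unroll'.
    latency = 100.0 - 2.0 * len(set(ordering))
    if "pipeline" in ordering and "loop-unroll" in ordering and \
            ordering.index("pipeline") > ordering.index("loop-unroll"):
        latency -= 15.0
    return latency

def propose_ordering(best, epsilon=0.3):
    """Epsilon-greedy stand-in for a learned policy over pass orderings."""
    if best is None or random.random() < epsilon:
        return random.sample(PASSES, k=len(PASSES))
    mutated = best[:]
    i, j = random.sample(range(len(mutated)), 2)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

if __name__ == "__main__":
    random.seed(0)
    best_order, best_latency = None, float("inf")
    for _ in range(50):                       # feedback loop
        candidate = propose_ordering(best_order)
        latency = compile_and_measure(candidate)
        if latency < best_latency:            # reward: lower latency is better
            best_order, best_latency = candidate, latency
    print(best_latency, best_order)
```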
