
Compiler Support for Long-life, Low-overhead Intermittent Computation on Energy Harvesting Flash-based Devices

Ahmad, Saim 19 May 2021
With the advent of energy harvesters, supporting fast and efficient computation on energy harvesting devices has become a key challenge in the field of energy harvesting on ubiquitous devices. Computation on energy harvesting devices amounts to spreading the execution time of a long-running application over short, frequent cycles of power. However, we must ensure that the results of executing an application intermittently are congruent to those produced by executing the application on a device with a continuous source of power. The current state-of-the-art systems that enable intermittent computation on energy harvesters make use of novel compiler analysis techniques as well as on-board hardware to measure the energy remaining for useful computation. However, currently available programming models, which mostly target devices with FRAM as the non-volatile memory (NVM), would cause failure on devices that employ Flash as the primary NVM, resulting in a non-universal solution that is restricted by the choice of NVM. This is primarily the result of Flash's limited read/write endurance. This research aims to contribute to the world of energy harvesting devices by providing solutions that enable intermittent computation regardless of the choice of NVM on a device, by utilizing only the SRAM to save state and perform computation. Utilizing the SRAM further reduces run-time overhead, as SRAM reads/writes are less costly than NVM reads/writes. Our proposed solutions rely on programmer guidance and compiler analysis to achieve correct and efficient intermittent computation. We then extend our system to provide a complete compiler-based solution without programmer intervention. Our system is able to run applications that would otherwise render any device with Flash as NVM useless in a matter of hours.
/ Master of Science / As batteries continue to take up space and make small-scale sensors hefty, battery-less devices have grown increasingly popular for non-resource-intensive computations. From tracking air pressure in vehicle tires to monitoring room temperature, battery-less devices have countless applications in various walks of life. These devices function by periodically harvesting energy from their surroundings to power short bursts of computation. When device energy levels reach a lower-bound threshold, these devices must power off to scavenge enough energy from the environment to perform further short bursts of computation. Usually, energy harvesting devices draw power from solar, thermal or RF energy. The energy source depends largely on the build of the device, also known as a microprocessor (a processing unit built to perform small-scale computations). Because these devices constantly power on and off, performing continuous computation on them is more difficult than on systems with a continuous source of power. Since applications can require more time to complete than one power cycle of such devices, applications running on these devices will, by default, restart execution from the beginning at the start of every power cycle. Therefore, it is necessary for such devices to have mechanisms to remember where they were before the device lost power. The past decade has seen many solutions proposed to help an application resume execution rather than recompute everything from the beginning. Solutions utilize different categories of devices with different storage technologies as well as different software and hardware utilities available to programmers in this domain. In this research, we propose two different low-overhead, long-life computation models to support intermittent computation on a subset of energy harvesting devices which use Flash-based memory to store persistent data.
Our approaches are heavily dependent on programmer guidance and different program analysis techniques to sustain computation across power cycles.
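
The idea of resuming from saved state rather than restarting can be illustrated with a small simulation. This is only a generic checkpointing sketch, not the thesis's compiler or its SRAM-based scheme: the function name `run_intermittent`, the step budget per power cycle, and the checkpoint interval are all hypothetical, and where the checkpoint physically lives is abstracted away.

```python
def run_intermittent(n_items, budget_per_cycle, checkpoint_every):
    """Sum 0..n_items-1 while 'power' fails after budget_per_cycle steps.

    Hypothetical model: the checkpoint survives a power failure,
    while the working variables (i, total) are lost.
    """
    if budget_per_cycle < checkpoint_every:
        # Without this, no cycle would ever reach a new checkpoint.
        raise ValueError("each power cycle must be able to reach a checkpoint")

    checkpoint = (0, 0)  # (next index, partial sum) at the last checkpoint
    power_cycles = 0
    while True:
        power_cycles += 1
        i, total = checkpoint  # restore state at the start of a power cycle
        steps = 0
        while i < n_items and steps < budget_per_cycle:
            total += i
            i += 1
            steps += 1
            if i % checkpoint_every == 0:
                checkpoint = (i, total)  # save progress
        if i == n_items:
            return total, power_cycles
        # Power failure: i and total are discarded; the loop restores them.


total, cycles = run_intermittent(1000, 70, 25)
print(total == sum(range(1000)), cycles)
```

The check at the end mirrors the correctness requirement stated in the abstract: the intermittent run must produce the same result as a continuously powered run, even though work done after the last checkpoint is repeated in the next power cycle.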

Popcorn Linux: A Compiler and Runtime for Execution Migration Between Heterogeneous-ISA Architectures

Lyerly, Robert Frantz 25 April 2019
In recent years there has been a proliferation of parallel and heterogeneous architectures. As chip designers have hit fundamental limits in traditional processor scaling, they have begun rethinking processor architecture from the ground up. In addition to creating new classes of processors, chip designers have revisited CPU microarchitecture in order to target different computing contexts. CPUs have been optimized for low-power smartphones and extended for high-performance computing in order to achieve better performance and energy efficiency for heavy computational tasks. Although heterogeneity adds significant complexity to both hardware and software, recent works have shown tremendous power and performance benefits obtainable through specialization. It is clear that emerging systems will be increasingly heterogeneous. Many of these emerging systems couple together cores of different instruction set architectures (ISAs), due to both market forces and the potential performance and power benefits in optimizing application execution. However, unlike on symmetric multiprocessors or even asymmetric single-ISA multiprocessors, natively compiled applications cannot freely migrate between heterogeneous-ISA processors. This is because applications are compiled to an ISA-specific format which is incompatible with other instruction set architectures. This creates serious limitations, as execution migration is a fundamental mechanism used by schedulers to reach performance or fairness goals; it would also allow applications to migrate between heterogeneous-ISA CPUs in order to accelerate parallel applications or even leverage ISA heterogeneity for security benefits. This dissertation describes system software for automatically migrating natively compiled applications across heterogeneous-ISA processors.
This dissertation describes the implementation and evaluation of a complete software stack on commodity-scale heterogeneous-ISA CPUs, emulating datacenters with heterogeneous-ISA systems or future systems that tightly integrate heterogeneous-ISA CPUs via point-to-point interconnect. This dissertation describes a compiler which builds applications for heterogeneous-ISA execution migration. The compiler generates machine code for every architecture in the system and lays out the application's code and data in a common format. In addition, the compiler generates metadata used by a state transformation runtime to dynamically transform thread execution state between ISA-specific formats, allowing application threads to migrate between different ISAs. The compiler and runtime are evaluated in conjunction with a replicated-kernel operating system, which provides thread migration and distributed shared virtual memory across heterogeneous-ISA processors. This redesigned software stack is evaluated on a setup containing an ARM and an x86 processor interconnected via point-to-point interconnect over PCIe. This dissertation shows that sub-millisecond state transformation is achievable. Additionally, it shows that for a datacenter-like workload using benchmarks from the NAS Parallel Benchmark suite, the system can trade some performance for up to a 66% reduction in energy and up to an 11% reduction in energy-delay product. This dissertation then describes an exploration into using hardware transactional memory (HTM) to maximize scheduling flexibility. Because applications can only migrate between ISAs at program locations with specific properties, there may be a significant delay between when the scheduler wishes to migrate an application and when the application can respond to the migration request.
In order to reduce this migration response time, this dissertation describes compiler instrumentation which uses HTM to allow the scheduler to force applications to roll back to the most recently encountered program location suitable for migration. This is evaluated both in terms of overhead and responsiveness to migration requests. In addition to showing the viability of the infrastructure for optimizing workload placement in a heterogeneous-ISA datacenter, this dissertation also demonstrates utilizing the infrastructure to accelerate multithreaded applications. This dissertation describes a new OpenMP runtime named libopenpop that is optimized for executing applications in heterogeneous-ISA systems with distributed shared virtual memory. The runtime utilizes synchronization primitives that enable scale-out execution across rack-scale systems and new work distribution mechanisms that predict the best partitioning of parallel work across CPUs with diverse architectural characteristics. libopenpop demonstrates sizable improvements over a naïve OpenMP implementation: a 38x improvement in multi-server barrier latency, a 5.4x improvement in multi-server data reductions and a geometric mean speedup of 4.04x for scalable applications in an 8-node x86-64 cluster. For a heterogeneous system composed of a highly-clocked x86 server and a highly-parallel ARM server, libopenpop delivers up to a 4.7x speedup and a geometric mean speedup of 41% across benchmarks from several benchmark suites versus the best single-node homogeneous execution. Finally, this dissertation describes leveraging the compiler and state transformation runtime to provide enhanced security for applications. Because the compiler provides detailed information about the stack layout of applications, it can be leveraged to defend against exploits such as stack smashing attacks and return-oriented programming attacks.
This dissertation describes Chameleon, a runtime which uses the compiler and state transformation infrastructure to continuously re-randomize the stack layout and code of vulnerable applications to thwart attackers. Chameleon attaches to applications using existing operating system interfaces and periodically switches the application to new randomized stack layouts and code by rewriting the stack. Chameleon enhances security with little overhead – it disrupts a geometric mean 76.32% of code gadgets in benchmark binaries, randomizes stack element locations with geometric mean 3 potential randomized locations, and has 1.1% overhead when re-randomizing every 50 milliseconds, making it extremely difficult for attackers to exploit target applications. / Doctor of Philosophy / Computer processors have experienced unprecedented performance improvements over the past 50 years. However, due to physical limitations of how processors execute, in recent years this performance growth has started to slow. In order to continue scaling performance, chip designers have begun diversifying processor designs to meet different performance and power consumption targets. Processors specialized for different contexts use various instruction set architectures (ISAs), the operations made available for use by the hardware. Programs built for one instruction set architecture are not compatible with others, requiring developers to build complex applications to manually bridge the gap. This leads to brittle applications and prevents the system software managing the processors from adapting workloads to match processor characteristics. This dissertation presents the Popcorn Linux system software which provides transparent support for running applications across computers composed of processors of multiple ISAs. 
Popcorn Linux provides the ability to migrate applications between these processors without requiring developers to add any application instrumentation – the system software manages all the details of building and migrating applications. An evaluation of Popcorn Linux shows that transparently migrating applications between diverse processors provides power and performance benefits in a variety of scenarios. Additionally, this dissertation describes leveraging the Popcorn Linux software infrastructure to harden applications against attackers seeking to hijack applications for malicious purposes.
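
The abstract above reports results as reductions in energy-delay product and as geometric mean speedups. As a reading aid, here is a minimal sketch of how those two metrics are conventionally computed. The helper names and the sample numbers are hypothetical and are not taken from the dissertation's measurements.

```python
import math


def energy_delay_product(energy_joules, delay_seconds):
    """Energy-delay product: energy consumed multiplied by execution time."""
    return energy_joules * delay_seconds


def percent_reduction(baseline, new):
    """Percentage by which `new` is lower than `baseline`."""
    return 100.0 * (baseline - new) / baseline


def geometric_mean(values):
    """Geometric mean of positive values, e.g. per-benchmark speedups."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical run: half the energy, but 60% longer execution time.
baseline = energy_delay_product(100.0, 10.0)
migrated = energy_delay_product(50.0, 16.0)
print(percent_reduction(baseline, migrated))  # EDP falls despite the slowdown

# Hypothetical per-benchmark speedups summarized by their geometric mean.
print(geometric_mean([2.0, 8.0]))
```

Because the energy-delay product multiplies energy by time, a configuration can run slower and still improve the metric as long as the energy saving outweighs the slowdown, which is the kind of trade-off the abstract describes.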

Braids: out-of-order performance with almost in-order complexity

Tseng, Francis, January 1900
Thesis (Ph. D.)--University of Texas at Austin, 2007. / Vita. Includes bibliographical references.

Performance improvement through predicated execution in VLIW machines

Biglari-Abhari, Morteza. January 2000
Thesis (Ph.D.) -- University of Adelaide, Dept. of Electrical and Electronic Engineering, 2000. / Bibliography: leaves 136-153.

Performance improvement through predicated execution in VLIW machines

Biglari-Abhari, Morteza. January 2000
Bibliography: leaves 136-153. Investigates techniques to achieve performance improvement in Very Long Instruction Word machines through predicated execution.

Modular compiler verification: a refinement algebraic approach advocating stepwise abstraction

Müller-Olm, Markus. January 1997
Dissertation, University of Kiel, 1996. / Bibliography: pp. 239-243.

Compiler architecture using a portable intermediate language

Reig Galilea, Fermín Javier. January 2002
Thesis (Ph.D.) - University of Glasgow, 2002. / Ph.D. thesis submitted to the Department of Computing Science, University of Glasgow, 2002. Includes bibliographical references. Print version also available.

Range analysis of object level code

Femister, James A., January 1997
Thesis (Ph. D.)--Lehigh University, 1997. / Includes vita. Bibliography: leaves 82-85.

Automatic staged compilation

Philipose, Matthai. January 2005
Thesis (Ph. D.)--University of Washington, 2005. / Vita. Includes bibliographical references (p. 240-245).

Algorithms for compiler-assisted design space exploration of clustered VLIW ASIP datapaths

Lapinskii, Viktor, January 2001
Thesis (Ph. D.)--University of Texas at Austin, 2001. / Vita. Includes bibliographical references (leaves 72-77). Available also in a digital version from Dissertation Abstracts.
