Global ETD Search

1	Methods for Reducing Monitoring Overhead in Runtime Verification Wu, Chun Wah Wallace January 2013 Runtime verification is a lightweight technique that serves to complement existing approaches, such as formal methods and testing, to ensure system correctness. In runtime verification, monitors are synthesized to check a system at run time against a set of properties the system is expected to satisfy. Runtime verification may be used to determine software faults before and after system deployment. The monitor(s) can be synthesized to notify, steer and/or perform system recovery from detected software faults at run time. The research and proposed methods presented in this thesis aim to reduce the monitoring overhead of runtime verification in terms of memory and execution time by leveraging time-triggered techniques for monitoring system events. Traditionally, runtime verification frameworks employ event-triggered monitors, where the invocation of the monitor occurs after every system event. Because system events can be sporadic or bursty in nature, event-triggered monitoring behaviour is difficult to predict. Time-triggered monitors, on the other hand, periodically preempt and process system events, making monitoring behaviour predictable. However, software system state reconstruction is not guaranteed (i.e., state changes or events between samples may be missed). The first part of this thesis analyzes three heuristics that efficiently solve the NP-complete problem of minimizing the amount of memory required to store system state changes to guarantee accurate state reconstruction. The experimental results demonstrate that adopting near-optimal algorithms does not greatly change the memory consumption and execution time of monitored programs; hence, NP-completeness is likely not an obstacle for time-triggered runtime verification. The second part of this thesis introduces a novel runtime verification technique called hybrid runtime verification. 
Hybrid runtime verification enables the monitor to toggle between event- and time-triggered modes of operation. The aim of this approach is to reduce the overall runtime monitoring overhead with respect to execution time. Minimizing the execution time overhead by employing hybrid runtime verification is not in NP. An integer linear programming heuristic is formulated to determine near-optimal hybrid monitoring schemes. Experimental results show that the heuristic typically selects monitoring schemes that are equal to or better than naively selecting exclusively one operation mode for monitoring. software runtime verification runtime monitoring Electrical and Computer Engineering
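
The contrast between event-triggered and time-triggered monitoring described in this abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the event format, the function names `event_triggered` and `time_triggered`, and the fixed sampling period are all assumptions made for illustration. It counts monitor invocations in each mode and the largest number of state changes that must be buffered between two samples so that none is lost.

```python
from typing import List, Tuple

# Hypothetical event format: (timestamp, variable name, new value).
Event = Tuple[int, str, int]


def event_triggered(events: List[Event]) -> int:
    """Event-triggered mode: the monitor is invoked once per event."""
    return len(events)


def time_triggered(events: List[Event], period: int, end_time: int) -> Tuple[int, int]:
    """Time-triggered mode: the monitor is invoked every `period` time units.

    Events that occur between two samples are buffered so the monitor can
    reconstruct the state. Returns (invocations, largest buffer needed).
    Assumes `events` is sorted by timestamp.
    """
    invocations = 0
    max_buffered = 0
    i = 0
    t = period
    while t <= end_time:
        buffered = 0
        # Collect every event that happened since the previous sample.
        while i < len(events) and events[i][0] <= t:
            buffered += 1
            i += 1
        max_buffered = max(max_buffered, buffered)
        invocations += 1
        t += period
    return invocations, max_buffered


if __name__ == "__main__":
    trace = [(1, "x", 1), (2, "x", 2), (3, "y", 5), (10, "x", 3)]
    print(event_triggered(trace))
    print(time_triggered(trace, period=5, end_time=10))
```

In this toy trace the bursty events near the start make the event-triggered monitor run four times, while the time-triggered monitor runs at two predictable instants at the cost of buffering up to three changes.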
3	Automata based monitoring and mining of execution traces Reger, Giles Matthew January 2014 This thesis contributes work to the fields of runtime monitoring and specification mining. It develops a formalism for specifying patterns of behaviour in execution traces and defines techniques for checking these patterns in, and extracting patterns from, traces. These techniques represent an extension in the expressiveness of properties that can be efficiently and effectively monitored and mined. The behaviour of a computer system is considered in terms of the actions it performs, captured in execution traces. Patterns of behaviour, formally defined in trace specifications, denote the traces that the system should (or should not) exhibit. The main task this work considers is that of checking that the system conforms to the specification, i.e., is correct. Additionally, trace specifications can be used to document behaviour to aid maintenance and development. However, formal specifications are often missing or incomplete, hence the mining activity. Previous work in the field of runtime monitoring (checking execution traces) has tended to focus on either efficiency or expressiveness, with different approaches making different trade-offs. This work considers both, achieving the expressiveness of the most expressive existing tools whilst remaining competitive with the most efficient. These elements of expressiveness and efficiency depend on the specification formalism used. Therefore, we introduce quantified event automata for describing patterns of behaviour in execution traces and then develop a range of efficient monitoring algorithms. To monitor execution traces we need a formal description of expected behaviour. However, such descriptions are often difficult to write, especially as there is often a lack of understanding of actual behaviour. 
The field of specification mining aims to explain the behaviour present in execution traces by extracting specifications that conform to those traces. Previous work in this area has primarily been limited to simple specifications that do not consider data. By leveraging the quantified event automata formalism, and its efficient trace checking procedures, we introduce a generate-and-check style mining framework capable of accurately extracting complex specifications. This thesis, therefore, makes separate significant contributions to the fields of runtime monitoring and specification mining. This work generalises and extends existing techniques in runtime monitoring, enabling future research to better understand the interaction between expressiveness and efficiency. This work combines and extends previous approaches to specification mining, increasing the expressiveness of specifications that can be mined. 004
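
As a rough illustration of checking a data-carrying pattern against an execution trace, the sketch below runs one small automaton per parameter value. It is a simplified stand-in, not the quantified event automata formalism or the monitoring algorithms of the thesis; the property (a file must be opened before it is read or closed), the `TRANSITIONS` table, and the `check_trace` function are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical property, checked separately for every file name f:
# open(f) must come first, read(f) is only allowed while f is open.
TRANSITIONS: Dict[str, Dict[str, str]] = {
    "closed": {"open": "opened"},
    "opened": {"read": "opened", "close": "closed"},
}


def check_trace(trace: Iterable[Tuple[str, str]]) -> Dict[str, bool]:
    """Map each parameter value to True if all of its events were allowed."""
    state: Dict[str, str] = {}
    verdict: Dict[str, bool] = {}
    for name, f in trace:
        if verdict.get(f) is False:
            continue  # this binding has already failed; the verdict is final
        current = state.get(f, "closed")
        nxt = TRANSITIONS[current].get(name)
        if nxt is None:
            verdict[f] = False  # no transition: the event violates the pattern
        else:
            state[f] = nxt
            verdict[f] = True
    return verdict


if __name__ == "__main__":
    trace = [("open", "a"), ("read", "a"), ("open", "b"),
             ("close", "a"), ("read", "a"), ("read", "b")]
    print(check_trace(trace))
```

Slicing the trace by parameter value keeps each automaton tiny; the cost of a real monitor lies largely in how bindings are indexed and looked up, which this sketch does with a plain dictionary.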
4	A MULTITHREADED RUNTIME SUPPORT ENVIRONMENT FOR DYNAMIC RECONFIGURABLE COMPUTING PANDEY, ANKUR 27 September 2002 No description available. reconfigurable computing runtime reconfiguration runtime support environment OS for FPGA
5	Popcorn Linux: A Compiler and Runtime for Execution Migration Between Heterogeneous-ISA Architectures Lyerly, Robert Frantz 25 April 2019 In recent years there has been a proliferation of parallel and heterogeneous architectures. As chip designers have hit fundamental limits in traditional processor scaling, they have begun rethinking processor architecture from the ground up. In addition to creating new classes of processors, chip designers have revisited CPU microarchitecture in order to target different computing contexts. CPUs have been optimized for low-power smartphones and extended for high-performance computing in order to achieve better performance and energy efficiency for heavy computational tasks. Although heterogeneity adds significant complexity to both hardware and software, recent works have shown tremendous power and performance benefits obtainable through specialization. It is clear that emerging systems will be increasingly heterogeneous. Many of these emerging systems couple together cores of different instruction set architectures (ISA), due to both market forces and the potential performance and power benefits in optimizing application execution. However, unlike on symmetric multiprocessors or even asymmetric single-ISA multiprocessors, natively compiled applications cannot freely migrate between heterogeneous-ISA processors. This is because applications are compiled to an instruction set architecture-specific format that is incompatible with other instruction set architectures. This creates serious limitations, as execution migration is a fundamental mechanism used by schedulers to reach performance or fairness goals, and it allows applications to migrate between heterogeneous-ISA CPUs in order to accelerate parallel applications or even leverage ISA heterogeneity for security benefits. 
This dissertation describes system software for automatically migrating natively compiled applications across heterogeneous-ISA processors. This dissertation describes the implementation and evaluation of a complete software stack on commodity scale heterogeneous-ISA CPUs, emulating datacenters with heterogeneous-ISA systems or future systems that tightly integrate heterogeneous-ISA CPUs via point-to-point interconnect. This dissertation describes a compiler which builds applications for heterogeneous-ISA execution migration. The compiler generates machine code for every architecture in the system and lays out the application's code and data in a common format. In addition, the compiler generates metadata used by a state transformation runtime to dynamically transform thread execution state between ISA-specific formats, allowing application threads to migrate between different ISAs. The compiler and runtime are evaluated in conjunction with a replicated-kernel operating system, which provides thread migration and distributed shared virtual memory across heterogeneous-ISA processors. This redesigned software stack is evaluated on a setup containing an ARM and an x86 processor interconnected via point-to-point interconnect over PCIe. This dissertation shows that sub-millisecond state transformation is achievable. Additionally, it shows that for a datacenter-like workload using benchmarks from the NAS Parallel Benchmark suite, the system can trade some performance for up to a 66% reduction in energy and up to an 11% reduction in energy-delay product. This dissertation then describes an exploration into using hardware transactional memory (HTM) to maximize scheduling flexibility. Because applications can only migrate between ISAs at program locations with specific properties, there may be a significant delay between when the scheduler wishes to migrate an application and when the application can respond to the migration request. 
In order to reduce this migration response time, this dissertation describes compiler instrumentation which uses HTM to allow the scheduler to force applications to roll back to the most recently encountered program location suitable for migration. This is evaluated both in terms of overhead and responsiveness to migration requests. In addition to showing the viability of the infrastructure for optimizing workload placement in a heterogeneous-ISA datacenter, this dissertation also demonstrates utilizing the infrastructure to accelerate multithreaded applications. This dissertation describes a new OpenMP runtime named libopenpop that is optimized for executing applications in heterogeneous-ISA systems with distributed shared virtual memory. The runtime utilizes synchronization primitives that enable scale-out execution across rack-scale systems and new work distribution mechanisms that predict the best partitioning of parallel work across CPUs with diverse architectural characteristics. libopenpop demonstrates sizable improvements over a naïve OpenMP implementation – a 38x improvement in multi-server barrier latency, a 5.4x improvement in multi-server data reductions and a geometric mean speedup of 4.04x for scalable applications in an 8-node x86-64 cluster. For a heterogeneous system composed of a highly-clocked x86 server and a highly-parallel ARM server, libopenpop delivers up to a 4.7x speedup and a geometric mean speedup of 41% across benchmarks from several benchmark suites versus the best single-node homogeneous execution. Finally, this dissertation describes leveraging the compiler and state transformation runtime to provide enhanced security for applications. Because the compiler provides detailed information about the stack layout of applications, it can be leveraged to defend against exploits such as stack smashing attacks and return-oriented programming attacks. 
This dissertation describes Chameleon, a runtime which uses the compiler and state transformation infrastructure to continuously re-randomize the stack layout and code of vulnerable applications to thwart attackers. Chameleon attaches to applications using existing operating system interfaces and periodically switches the application to new randomized stack layouts and code by rewriting the stack. Chameleon enhances security with little overhead – it disrupts a geometric mean 76.32% of code gadgets in benchmark binaries, randomizes stack element locations with geometric mean 3 potential randomized locations, and has 1.1% overhead when re-randomizing every 50 milliseconds, making it extremely difficult for attackers to exploit target applications. / Doctor of Philosophy / Computer processors have experienced unprecedented performance improvements over the past 50 years. However, due to physical limitations of how processors execute, in recent years this performance growth has started to slow. In order to continue scaling performance, chip designers have begun diversifying processor designs to meet different performance and power consumption targets. Processors specialized for different contexts use various instruction set architectures (ISAs), the operations made available for use by the hardware. Programs built for one instruction set architecture are not compatible with others, requiring developers to build complex applications to manually bridge the gap. This leads to brittle applications and prevents the system software managing the processors from adapting workloads to match processor characteristics. This dissertation presents the Popcorn Linux system software which provides transparent support for running applications across computers composed of processors of multiple ISAs. 
Popcorn Linux provides the ability to migrate applications between these processors without requiring developers to add any application instrumentation – the system software manages all the details of building and migrating applications. An evaluation of Popcorn Linux shows that transparently migrating applications between diverse processors provides power and performance benefits in a variety of scenarios. Additionally, this dissertation describes leveraging the Popcorn Linux software infrastructure to harden applications against attackers seeking to hijack applications for malicious purposes. heterogeneous architectures compilers runtime systems
6	Scaling managed runtime systems for future multicore hardware Ha, Jung Woo 27 August 2010 The exponential improvement in single processor performance has recently come to an end, mainly because clock frequency has reached its limit due to power constraints. Thus, processor manufacturers are choosing to enhance computing capabilities by placing multiple cores into a single chip, which can improve performance given parallel software. This paradigm shift to chip multiprocessors (also called multicore) requires scalable parallel applications that execute tasks on each core; otherwise the additional cores are worthless. Making an application scalable requires more than simply parallelizing the application code itself. Modern applications are written in managed languages, which require automatic memory management, type and memory abstractions, dynamic analysis and just-in-time (JIT) compilation. These managed runtime systems monitor and interact frequently with the executing application. Hence, the managed runtime itself must be scalable, and the instrumentation that monitors the application should not perturb its scalability. While multicore hardware forces a redesign of managed runtimes for scalability, it also provides opportunities when applications do not fully utilize all of the cores. Using available cores for concurrent helper threads that enhance the software with debugging, security, and software support will make the runtime itself more capable and more scalable. This dissertation presents two novel techniques that improve the scalability of managed runtimes by utilizing unused cores. The first technique is a concurrent dynamic analysis framework that provides a low-overhead buffering mechanism called Cache-friendly Asymmetric Buffering (CAB) that quickly offloads data from the application to helper threads that perform specific dynamic analyses. 
Our framework minimizes application instrumentation overhead, prevents microarchitectural side-effects, and supports a variety of dynamic analysis clients, ranging from call graph and path profiling to cache simulation. The use of this framework ensures that helper threads perturb the performance of the application as little as possible. Our second technique is concurrent trace-based just-in-time compilation, which exploits available cores for the JavaScript runtime. The JavaScript language limits applications to a single thread, so extra cores are worthless unless they are used by the runtime components. We redesigned a production trace-based JIT compiler to run concurrently with the interpreter, and our technique is the first to improve both responsiveness and throughput in a trace-based JIT compiler. This thesis presents the design and implementation of both techniques and shows that they improve scalability and core utilization when running applications in managed runtimes. Industry is already adopting our approaches, which demonstrates the urgency of the scalable runtime problem and the utility of these techniques. Scalability Multicore Managed Language Runtime System Parallelism
7	Towards an improved memory model for Java Kotrajaras, Vishnu January 2002 No description available. 005
8	Composability of parallel codes on heterogeneous architectures / La composition des codes parallèles sur plates-formes hétérogènes Hugo, Andra-Ecaterina 12 December 2014 To meet the accuracy and efficiency needs of scientific simulations, the High Performance Computing community is steadily increasing its demands in terms of parallelism, which adds a growing need to reuse parallel libraries optimized for complex architectures. Using several parallel computing libraries simultaneously within one application very often raises efficiency problems. Competing for resources, the parallel routines, although optimized, hinder one another, and oversubscription, contention, or cache misses then appear. In this thesis, we present a technique for partitioning computation flows that limits the effects of such interference. The partitioning is achieved using execution contexts that partition the computing units or even share some of them. The distribution of resources among the contexts can be modified dynamically in order to optimize machine throughput. To this end, we propose that a hypervisor use certain metrics to automatically redistribute resources to the contexts. We describe the integration of scheduling contexts into StarPU, a runtime system for heterogeneous machines, and present experimental results demonstrating the relevance of our approach. For this purpose, we implemented an extension of the sparse direct solver qr_mumps in which we used these resource allocation mechanisms. Through scheduling contexts, we describe a new problem decomposition method based on a proportional mapping algorithm. 
The hypervisor dynamically and automatically readapts the allocation of resources to the irregular parallelism of the application. The use of scheduling contexts and the hypervisor improved the locality and overall performance of the solver. / To face the ever more demanding requirements in terms of accuracy and speed of scientific simulations, the High Performance Computing community is constantly increasing the demands in terms of parallelism, thus adding tremendous value to parallel libraries strongly optimized for highly complex architectures. Enabling HPC applications to perform efficiently when invoking multiple parallel libraries simultaneously is a great challenge. Even if a uniform runtime system is used underneath, scheduling tasks or threads coming from different libraries over the same set of hardware resources introduces many issues, such as resource oversubscription, undesirable cache flushes or memory bus contention. In this thesis, we present an extension of StarPU, a runtime system specifically designed for heterogeneous architectures, that allows multiple parallel codes to run concurrently with minimal interference. Such parallel codes run within scheduling contexts that provide confined execution environments which can be used to partition computing resources. Scheduling contexts can be dynamically resized to optimize the allocation of computing resources among concurrently running libraries. We introduced a hypervisor that automatically expands or shrinks contexts using feedback from the runtime system (e.g. resource utilization). We demonstrated the relevance of this approach by extending an existing generic sparse direct solver (qr_mumps) to use these mechanisms and introduced a new decomposition method based on proportional mapping that is used to build the scheduling contexts. In order to cope with the very irregular behavior of the application, the hypervisor dynamically manages the allocation of resources. 
By means of the scheduling contexts and the hypervisor we improved the locality and thus the overall performance of the solver. Composition Support d'exécution Hypervisor Composability Runtime Hypervisor
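
The idea of sizing scheduling contexts by proportional mapping can be sketched as follows. This is a flat, simplified version written for illustration only: the thesis applies proportional mapping to the structure of a sparse solver, whereas here the workloads are just a dictionary, and the function name `proportional_mapping`, the workload figures, and the largest-remainder rounding rule are assumptions.

```python
from typing import Dict


def proportional_mapping(workloads: Dict[str, float], workers: int) -> Dict[str, int]:
    """Split `workers` among contexts in proportion to their workload.

    Uses largest-remainder rounding so the shares always sum to `workers`.
    Assumes at least one workload is positive.
    """
    total = sum(workloads.values())
    exact = {k: workers * w / total for k, w in workloads.items()}
    alloc = {k: int(x) for k, x in exact.items()}  # round every share down
    leftover = workers - sum(alloc.values())
    # Hand the remaining workers to the contexts with the largest remainders.
    by_remainder = sorted(workloads, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical workloads (e.g. estimated operation counts per subproblem).
    print(proportional_mapping({"left": 60, "right": 30, "leaf": 10}, workers=8))
```

A hypervisor in the sense of the abstract would rerun such an allocation as utilization feedback changes; that feedback loop is not modelled here.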
9	A Coupled Multi-ALU Processing Node for a Highly Parallel Computer Keckler, Stephen W. 01 September 1992 This report describes Processor Coupling, a mechanism for controlling multiple ALUs on a single integrated circuit to exploit both instruction-level and inter-thread parallelism. A compiler statically schedules individual threads to discover available intra-thread instruction-level parallelism. The runtime scheduling mechanism interleaves threads, exploiting inter-thread parallelism to maintain high ALU utilization. ALUs are assigned to threads on a cycle-by-cycle basis, and several threads can be active concurrently. Simulation results show that Processor Coupling performs well on both single-threaded and multi-threaded applications. The experiments address the effects of memory latencies, function unit latencies, and communication bandwidth between function units. runtime scheduling compile time scheduling parallel computers multithreading
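
A toy simulation can make the cycle-by-cycle assignment of ALUs to threads more concrete. It is only a sketch under strong assumptions and not the mechanism evaluated in the report: each thread is modelled as a list of statically scheduled instruction words, each word is the set of ALU indices it needs, every operation takes one cycle, and a lower thread index means higher priority. The name `run_coupled` is hypothetical.

```python
from typing import List, Set, Tuple


def run_coupled(threads: List[List[Set[int]]], num_alus: int) -> Tuple[int, float]:
    """Interleave threads over shared ALUs; return (cycles, ALU utilization).

    Assumes every ALU index used in a word is smaller than `num_alus`.
    """
    pc = [0] * len(threads)  # index of the current word of each thread
    pending = [set(t[0]) if t else set() for t in threads]
    cycles = 0
    issued = 0
    while any(pc[i] < len(threads[i]) for i in range(len(threads))):
        cycles += 1
        for alu in range(num_alus):
            # Each ALU serves the highest-priority thread that needs it now.
            for i in range(len(threads)):
                if pc[i] < len(threads[i]) and alu in pending[i]:
                    pending[i].discard(alu)
                    issued += 1
                    break
        for i in range(len(threads)):
            # A thread moves to its next word once all operations have issued.
            if pc[i] < len(threads[i]) and not pending[i]:
                pc[i] += 1
                if pc[i] < len(threads[i]):
                    pending[i] = set(threads[i][pc[i]])
    if cycles == 0:
        return 0, 0.0
    return cycles, issued / (cycles * num_alus)


if __name__ == "__main__":
    t0 = [{0, 1}, {0}]
    t1 = [{1}, {0, 1}]
    print(run_coupled([t0, t1], num_alus=2))
```

Run one after the other, the two example threads would need four cycles; interleaved they finish in three because the second thread fills an ALU slot the first leaves idle.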
10	Runtime Verification with Controllable Time Predictability and Memory Utilization Kumar, Deepak 20 September 2013 The goal of runtime verification is to inspect the well-being of a system by employing a monitor during its execution. Such monitoring imposes cost in terms of resource utilization. Memory usage and predictability of monitor invocations are the key indicators of the quality of a monitoring solution, especially in the context of embedded systems. In this work, we propose a novel control-theoretic approach for coordinating time predictability and memory utilization in runtime monitoring of real-time embedded systems. In particular, we design a PID controller and four fuzzy controllers with different optimization control objectives. Our approach controls the frequency of monitor invocations by incorporating a bounded memory buffer that stores events which need to be monitored. The controllers attempt to improve time predictability, and maximize memory utilization, while ensuring the soundness of the monitor. Unlike existing approaches based on static analysis, our approach is scalable and well-suited for reactive systems that are required to react to stimuli from the environment in a timely fashion. Our experiments using two case studies (a laser beam stabilizer for aircraft tracking, and a Bluetooth mobile payment system) demonstrate the advantages of using controllers to achieve low variation in the frequency of monitor invocations, while maintaining maximum memory utilization in highly non-linear environments. In addition to this problem, the thesis presents a brief overview of our preceding work on runtime verification. Runtime Verification
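
The control loop described in this abstract can be sketched with a textbook PID controller that adjusts the monitor's invocation period so that a bounded event buffer stays near a target fill level. This is an assumption-laden illustration, not the controller designed in the thesis: the class `PID`, the function `tune_period`, the gains, and the simple model in which the buffer fills at `rate * period` are all hypothetical.

```python
from typing import List, Tuple


class PID:
    """Textbook discrete PID controller."""

    def __init__(self, kp: float, ki: float, kd: float) -> None:
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error: float, dt: float = 1.0) -> float:
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def tune_period(rates: List[float], capacity: int, target_fill: float, pid: PID,
                period: float = 10.0, min_period: float = 1.0,
                max_period: float = 50.0) -> List[Tuple[float, float]]:
    """Adjust the monitor period after each invocation.

    `rates` holds the (assumed) event arrival rate seen in each interval.
    Returns a list of (new period, buffer fill observed at the invocation).
    """
    history = []
    for rate in rates:
        fill = min(capacity, rate * period)  # events buffered since last invocation
        error = target_fill * capacity - fill  # positive: buffer is under-used
        period = max(min_period, min(max_period, period + pid.update(error)))
        history.append((period, fill))
    return history


if __name__ == "__main__":
    controller = PID(kp=0.1, ki=0.0, kd=0.0)
    trace = tune_period([2.0] * 50, capacity=100, target_fill=0.8, pid=controller)
    print(trace[-1])
```

With a steady rate of 2 events per time unit and a target of 80 buffered events, the period in this toy model settles near 40 time units; a bursty arrival rate would need the integral and derivative terms, and a real monitor must also guarantee that the buffer never overflows.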
