Global ETD Search

1	Cross-core Microarchitectural Attacks and Countermeasures Irazoki, Gorka 24 April 2017 (has links) In the last decade, multi-threaded systems and resource sharing have brought a number of technologies that facilitate our daily tasks in a way we never imagined. Among others, cloud computing has emerged to offer us powerful computational resources without having to physically acquire and install them, while smartphones have almost acquired the same importance desktop computers had a decade ago. This has only been possible thanks to the ever evolving performance optimization improvements made to modern microarchitectures that efficiently manage concurrent usage of hardware resources. One of the aforementioned optimizations is the usage of shared Last Level Caches (LLCs) to balance different CPU core loads and to maintain coherency between shared memory blocks utilized by different cores. The latter for instance has enabled concurrent execution of several processes in low RAM devices such as smartphones. Although efficient hardware resource sharing has become the de-facto model for several modern technologies, it also poses a major concern with respect to security. Some of the concurrently executed co-resident processes might in fact be malicious and try to take advantage of hardware proximity. New technologies usually claim to be secure by implementing sandboxing techniques and executing processes in isolated software environments, called Virtual Machines (VMs). However, the design of these isolated environments aims at preventing pure software- based attacks and usually does not consider hardware leakages. In fact, the malicious utilization of hardware resources as covert channels might have severe consequences to the privacy of the customers. Our work demonstrates that malicious customers of such technologies can utilize the LLC as the covert channel to obtain sensitive information from a co-resident victim. We show that the LLC is an attractive resource to be targeted by attackers, as it offers high resolution and, unlike previous microarchitectural attacks, does not require core-colocation. Particularly concerning are the cases in which cryptography is compromised, as it is the main component of every security solution. In this sense, the presented work does not only introduce three attack variants that can be applicable in different scenarios, but also demonstrates the ability to recover cryptographic keys (e.g. AES and RSA) and TLS session messages across VMs, bypassing sandboxing techniques. Finally, two countermeasures to prevent microarchitectural attacks in general and LLC attacks in particular from retrieving fine- grain information are presented. Unlike previously proposed countermeasures, ours do not add permanent overheads in the system but can be utilized as preemptive defenses. The first identifies leakages in cryptographic software that can potentially lead to key extraction, and thus, can be utilized by cryptographic code designers to ensure the sanity of their libraries before deployment. The second detects microarchitectural attacks embedded into innocent-looking binaries, preventing them from being posted in official application repositories that usually have the full trust of the customer. Microarchitectural attacks cache attacks side channel attacks
2	Physical design for performance and thermal and power-supply reliability in modern 2D and 3D microarchitectures Healy, Michael Benjamin 27 August 2010 (has links) The main objective of this research is to examine the performance, power noise, and thermal trade-offs in modern traditional (2D) and three-dimensionally-integrated (3D) architectures and to present design automation tools and physical design methodologies that enable higher reliability while maintaining microarchitectural performance for these systems. Five main research topics that support this goal are included. The first topic focuses on thermal reliability. The second, third, and fourth, topics examine power-supply noise. The final topic presents a set of physical design and analysis methodologies used to produce a 3D design that was sent for fabrication in March of 2010. The first section of this dissertation details a microarchitectural floorplanning algorithm that enables the user to choose and adjust the trade-off between microarchitectural performance and general operating temperature in both 2D and 3D systems, which is a major determinant of overall reliability and chip lifetime. Simulation results demonstrate that the algorithm performs as expected and successfully provides the user with the desired trade-off. The first section also presents a thermal-aware microarchitectural floorplanning algorithm designed to help reduce the operating temperature of the cores in the unique environment present within multi-core processors. Heat-coupling between neighboring cores is considered during the optimization process to provide floorplans that result in lower maximum temperature. The second section explores power-supply noise in processors caused by fine-grained clock-gating and describes a floorplanning algorithm created to work with an active noise-canceling clock-gating controller. Simulation results show that combining these two techniques results in lower power-supply noise with minimal processor performance impact. The third section turns to future 3D systems with a large number of stacked active layers (many-tier systems) and examines power-supply delivery challenges in these systems. Parasitic resistance, capacitance, and inductance are calculated for the 3D vias, and the results of scaling various parameters in the power-supply-network design are presented. Several techniques for reducing power-supply-network noise in these many-tier systems are explored. The fourth section describes a layout-level analysis of a novel power distribution through-silicon-via topology and it's effect on IR-drop and dynamic noise. Simulations show that both types of power-supply noise can be reduced by more than 20\% in systems with non-uniform per-tier power dissipation when using the proposed topology. The final section explains the physical design and analysis techniques used to produce the layouts for 3D-MAPS, a 64-core 3D-stacked memory-on-processor system targeted at demonstration of large memory bandwidth using 3D connections. The 3D-aware physical design flow utilizing non-3D-aware commercial tools is detailed, along with the techniques and add-ons that were developed to enable this process. Design automation Microarchitectural floorplanning 3D IC Microprocessors Integrated circuits Microelectronics
3	Examining the Impact of Microarchitectural Attacks on Microkernels : a study of Meltdown and Spectre Grimsdal, Gunnar, Lundgren, Patrik January 2019 (has links) Most of today's widely used operating systems are based on a monolithic design and have a very large code size which complicates verification of security-critical applications. One approach to solving this problem is to use a microkernel, i.e., a small kernel which only implements the bare necessities. A system usinga microkernel can be constructed using the operating-system framework Genode, which provides security features and a strict process hierarchy. However, these systems may still be vulnerable to microarchitectural attacks, which can bypassan operating system's security features, exploiting vulnerable hardware. This thesis aims to investigate whether microkernels are vulnerable to the microarchitectural attacks Meltdown and Spectre version 1 in the context of Genode. Furthermore, the thesis analyzes the execution cost of mitigating Spectre version 1 in a Genode's remote procedure call. The result shows how Genode does not mitigate the Meltdown attack, which will be confirmed by demonstrating a working Meltdown attack on Genode+Linux. We also determine that microkernels are vulnerable to Spectre by demonstrating a working attack against two microkernels. However, we show that the cost of mitigating this Spectre attack is small, with a cost of < 3 slowdown for remote procedure calls in Genode. Genode Meltdown Spectre Nova Okl4 microarchitectural attacks microkernel Computer Engineering Datorteknik
4	Dynamic Eviction Set Algorithms and Their Applicability to Cache Characterisation Lindqvist, Maria January 2020 (has links) Eviction sets are groups of memory addresses that map to the same cache set. They can be used to perform efficient information-leaking attacks against the cache memory, so-called cache side channel attacks. In this project, two different algorithms that find such sets are implemented and compared. The second of the algorithms improves on the first by using a concept called group testing. It is also evaluated if these algorithms can be used to analyse or reverse engineer the cache characteristics, which is a new area of application for this type of algorithms. The results show that the optimised algorithm performs significantly better than the previous state-of-the-art algorithm. This means that countermeasures developed against this type of attacks need to be designed with the possibility of faster attacks in mind. The results also shows, as a proof-of-concept, that it is possible to use these algorithms to create a tool for cache analysis. microarchitectural attacks cache attacks side channel attacks eviction set cache memory Computer Engineering Datorteknik
5	Techniques for LI-BDN Synthesis for Hybrid Microarchitectural Simulation Harris, Tyler S. 11 May 2013 (has links) (PDF) Computer designers rely upon near-cycle-accurate microarchitectural simulation to explore the design space of new systems. Unfortunately, such simulators are becoming increasingly slow as systems become more complex. Hybrid simulators which offload some of the simulation work onto FPGAs can increase the speed; however, such simulators must be automatically synthesized or the time to design them becomes prohibitive. Furthermore, FPGA implementations of simulators may require multiple FPGA clock cycles to implement behavior that takes place within one simulated clock cycle, making correct arbitrary composition of simulator components impossible and limiting the amount of hardware concurrency which can be achieved. Latency-Insensitive Bounded Dataflow Networks (LI-BDNs) have been suggested as a means to permit composition of simulator components in FPGAs. However, previous work has required that LI-BDNs be created manually. This paper introduces techniques for automated synthesis of LI-BDNs from the processes of a System-C microarchitectural model. We demonstrate that LI-BDNs can be successfully synthesized. We also introduce a technique for reducing the overhead of LI-BDNs when the latency-insensitive property is unnecessary, resulting in up to a 60% reduction in FPGA resource requirements. hybrid simulation latency insensitivity microarchitectural simulation SPRI synthesis Electrical and Computer Engineering
6	Interface Design and Synthesis for Structural Hybrid Microarchitectural Simulators Ruan, Zhuo 01 December 2013 (has links) (PDF) Computer architects have discovered the potential of using FPGAs to accelerate software microarchitectural simulators. One type of FPGA-accelerated microarchitectural simulator, namedthe hybrid structural microarchitectural simulator, is very promising. This is because a hybrid structural microarchitectural simulator combines structural software and hardware, and this particular organization provides both modeling flexibility and fast simulation speed. The performance of a hybrid simulator is significantly affected by how the interface between software and hardware is constructed. The work of this thesis creates an infrastructure, named Simulator Partitioning Research Infrastructure (SPRI), to implement the synthesis of hybrid structural microarchitectural simulators which includes simulator partitioning, simulator-to-hardware synthesis, interface synthesis. With the support of SPRI, this thesis characterizes the design space of interfaces for synthesized hybrid structural microarchitectural simulators and provides the implementations for several such interfaces. The evaluation of this thesis thoroughly studies the important design tradeoffs and performance factors (e.g. hardware capacity, design scalability, and interface latency) involved in choosing an efficient interface. The work of this thesis is essential to the research community of computer architecture. It not only contributes a complete synthesis infrastructure, but also provides guidelines to architects on how to organize software microarchitectural models and choose a proper software/hardware interface so the hybrid microarchitectural simulators synthesized from these software models can achieve desirable speedup hybrid microarchitectural simulator software codesign hardware codesign SystemC FPGA Electrical and Computer Engineering
7	Maintaining Security in the Era of Microarchitectural Attacks Oleksenko, Oleksii 16 November 2021 (has links) Shared microarchitectural state is a target for side-channel attacks that leverage timing measurements to leak information across security domains. These attacks are further enhanced by speculative execution, which transiently distorts the control and data flow of applications, and by untrusted environments, where the attacker may have complete control over the victim program. Under these conditions, microarchitectural attacks can bypass software isolation mechanisms, and hence they threaten the security of virtually any application running in a shared environment. Numerous approaches have been proposed to defend against microarchitectural attacks, but we lack the means to test them and ensure their effectiveness. The users cannot test them manually because the effects of the defences are not visible to software. Testing the defences by attempting attacks is also suboptimal because the attacks are inherently unstable, and a failed attack is not always an indicator of a successful defence. Moreover, some classes of defences can be disabled at runtime. Hence, we need automated tools that would check the effectiveness of defences, both at design time and at runtime. Yet, as it is common in security, the existing solutions lag behind the developments in attacks. In this thesis, we propose three techniques that check the effectiveness of defences against modern microarchitectural attacks. Revizor is an approach to automatically detect microarchitectural information leakage in commercial black-box CPUs. SpecFuzz is a technique for dynamic testing of applications to find instances of speculative vulnerabilities. Varys is an approach to runtime monitoring of system defences against microarchitectural attacks. We show that with these techniques, we can successfully detect microarchitectural vulnerabilities in hardware and flaws in defences against them; find unpatched instances of speculative vulnerabilities in software; and detect attempts to invalidate system defences. info:eu-repo/classification/ddc/004 ddc:004
8	A Hardware Framework for Yield and Reliability Enhancement in Chip Multiprocessors Pan, Abhisek 01 January 2009 (has links) (PDF) Device reliability and manufacturability have emerged as dominant concerns in end-of-road CMOS devices. Today an increasing number of hardware failures are attributed to device reliability problems that cause partial system failure or shutdown. Also maintaining an acceptable manufacturing yield is seen as challenge because of smaller feature sizes, process variation, and reduced headroom for burn-in tests. In this project we investigate a hardware-based scheme for improving yield and reliability of a homogeneous chip multiprocessor (CMP). The proposed solution involves a hardware framework that enables us to utilize the redundancies inherent in a multi-core system to keep the system operational in face of partial failures due to hard faults (faults due to manufacturing defects or permanent faults developed during system lifetime). A micro-architectural modification allows a faulty core in a multiprocessor system to use another core as a coprocessor to service any instruction that the former cannot execute correctly by itself. This service is accessed to improve yield and reliability, but at the cost of some loss of performance. In order to quantify this loss we have used a cycle-accurate architectural simulator to simulate the performance of dual-core and quad-core systems with one or more cores sustaining partial failure. Simulation studies indicate that when a large and sparingly-used unit such as a floating point unit fails in a core, even for a floating point intensive benchmark, we can continue to run the faulty core with as little as 10% performance impact and minimal area overhead. Incorporating this recovery mechanism entails some modifications in the microprocessor micro-architecture. The modifications are also described here through a simplified model of a superscalar processor. Computer architecture Yield and reliability Chip multiprocessors Microarchitectural modification Simulation Redundance across cores Electrical engineering Computer and Systems Architecture Hardware Systems
9	Architecture and Compiler Support for Leakage Reduction Using Power Gating in Microprocessors Roy, Soumyaroop 31 August 2010 (has links) Power gating is a technique commonly used for runtime leakage reduction in digital CMOS circuits. In microprocessors, power gating can be implemented by using sleep transistors to selectively deactivate circuit modules when they are idle during program execution. In this dissertation, a framework for power gating arithmetic functional units in embedded microprocessors with architecture and compiler support is proposed. During compile time, program regions are identified where one or more functional units are idle and sleep instructions are inserted into the code so that those units can be put to sleep during program execution. Subsequently, when their need is detected during the instruction decode stage, they are woken up with the help of hardware control signals. For a set of benchmarks from the MiBench suite, leakage energy savings of 27% and 31% are achieved (based on a 70 nm PTM model) in the functional units of a processor, modeled on the ARM architecture, with and without floating point units, respectively. Further, the impact of traditional performance-enhancing compiler optimizations on the amount of leakage savings obtained with this framework is studied through analysis and simulations. Based on the observations, a leakage-aware compilation flow is derived that improves the effectiveness of this framework. It is observed that, through the use of various compiler optimizations, an additional savings of around 15% and even up to 9X leakage energy savings in individual functional units is possible. Finally,in the context of multi-core processors supporting multithreading, three different microarchitectural techniques, for different multithreading schemes, are investigated for state-retentive power gating of register files. In an in-order core, when a thread gets blocked due to a memory stall, the corresponding register file can be placed in a low leakage state. When the memory stall gets resolved, the register file is activated so that it may be accessed again. The overhead due to wake-up latency is completely hidden in two of the schemes, while it is hidden for the most part in the third. Experimental results on multiprogrammed workloads comprised of SPEC 2000 integer benchmarks show that, in an 8-core processor executing 64 threads, the average leakage savings in the register files, modeled in FreePDK 45 nm MTCMOS technology, are 42% in coarse-grained multithreading, while they are between 7% and 8% in fine-grained and simultaneous multithreading. The contributions of this dissertation represent a significant advancement in the quest for reducing leakage energy consumption in microprocessors with minimal degradation in performance. Compiler Directed Power Gating Microarchitectural Techniques Embedded Microprocessors Multithreading Multiprocessing Multicore Niagara CGMT FGMT SMT GCC SUIF MachineSUIF M5 American Studies Arts and Humanities Computer Engineering Computer Sciences
10	Accelerating microarchitectural simulation via statistical sampling principles Bryan, Paul David 05 December 2012 (has links) The design and evaluation of computer systems rely heavily upon simulation. Simulation is also a major bottleneck in the iterative design process. Applications that may be executed natively on physical systems in a matter of minutes may take weeks or months to simulate. As designs incorporate increasingly higher numbers of processor cores, it is expected the times required to simulate future systems will become an even greater issue. Simulation exhibits a tradeoff between speed and accuracy. By basing experimental procedures upon known statistical methods, the simulation of systems may be dramatically accelerated while retaining reliable methods to estimate error. This thesis focuses on the acceleration of simulation through statistical processes. The first two techniques discussed in this thesis focus on accelerating single-threaded simulation via cluster sampling. Cluster sampling extracts multiple groups of contiguous population elements to form a sample. This thesis introduces techniques to reduce sampling and non-sampling bias components, which must be reduced for sample measurements to be reliable. Non-sampling bias is reduced through the Reverse State Reconstruction algorithm, which removes ineffectual instructions from the skipped instruction stream between simulated clusters. Sampling bias is reduced via the Single Pass Sampling Regimen Design Process, which guides the user towards selected representative sampling regimens. Unfortunately, the extension of cluster sampling to include multi-threaded architectures is non-trivial and raises many interesting challenges. Overcoming these challenges will be discussed. This thesis also introduces thread skew, a useful metric that quantitatively measures the non-sampling bias associated with divergent thread progressions at the beginning of a sampling unit. Finally, the Barrier Interval Simulation method is discussed as a technique to dramatically decrease the simulation times of certain classes of multi-threaded programs. It segments a program into discrete intervals, separated by barriers, which are leveraged to avoid many of the challenges that prevent multi-threaded sampling. Microarchitectural simulation Barrier interval simulation Single pass sampling regimen design Reverse state reconstruction Cluster sampling Sampled simulation Multi-threaded sampling obstacles Computer architecture Sampling (Statistics) Computer simulation Simulation methods

Search results