• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 2
  • Tagged with
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

SNIC-DSM: SmartNIC based DSM Infrastructure for Heterogeneous-ISA Machines

Ramesh, Hemanth 14 August 2023 (has links)
Heterogeneous computing is increasingly used in today's datacenters to meet the increasing computational demands of applications. Heterogeneous hardware typically includes CPUs, GPUs, ASICs, and FPGAs, among others. An important emerging trend is instructionset- architecture (ISA)-heterogeneity: high-end x86 servers with attached SmartNICs and SmartSSDs that incorporate general-purpose CPUs, typically of the RISC ISA family (e.g., ARM, RISC-V). To alleviate resource congestion on server computing nodes, application workloads can be scaled-out across server x86 CPUs and SmartNIC ARM CPUs using the distributed shared memory (DSM) abstraction. We present SNIC-DSM, a SmartNIC-based DSM infrastructure for heterogeneous ISA machines. SNIC-DSM implements a low-latency messaging layer, which enables inter-node communication across multi-ISA CPUs, and a DSM protocol processor that provides memory coherency among these nodes, both implemented in SmartNIC's FPGA logic. SNIC-DSM is reconfigurable and allows the implementation of different memory consistency protocols. Our experimental studies using compute-intensive benchmarks reveal that SNIC-DSM outperforms the state-of-the-art DSM - Popcorn Linux's software DSM - when server resource congestion is high. / Master of Science / The availability of heterogeneous computing architectures has led to the development of distributed shared memory systems, which allows compute-intensive applications to run in a distributed manner on different types of computing devices such as graphics processors, reconfigurable logic devices, and custom integrated circuits. Adopting such a heterogeneous computing strategy yields better performance and improves power consumption. Generally, these DSM systems use a software-based approach, which offers great flexibility but suffers from software overheads. Hardware-based approaches are used to overcome these limitations but they generally do not offer flexibility. This thesis presents, SNIC-DSM, which is a reconfigurable implementation of the DSM framework. SNIC-DSM provides a platform for the host and smart networking devices such as SmartNICs to communicate with each other and enables application execution in a distributed manner by providing memory coherency. Our experimental evaluation using High-Performance Computing benchmarks reveals that SNIC-DSM improves performance when compared with software-based DSM.
2

f-DSM: An FPGA-Accelerated Distributed Shared Memory for Heterogeneous Instruction-Set-Architecture Hardware

VSathish, Naarayanan Rao 03 March 2022 (has links)
Due to the diminishing relevance of Moore's Law, traditional multi-core systems are increasingly struggling to meet the computational demands of many emerging workloads. Heterogeneous computing, which involves exploiting higher degrees of parallelism (e.g., GPUs) and application-specific specialization (e.g., FPGAs), is increasingly used to meet this demand. An important architectural trend in this space involves using instruction-set-architecture (ISA) heterogeneity. An exemplar case is emerging I/O devices that include CPU cores with ISAs (e.g., ARM, RISC-V) that differ from that of host CPUs (e.g., x86) and have physically discrete memory. Shared-memory programming of such systems requires the Dis- tributed Shared Memory (DSM) abstraction. Software DSM incurs significant OS overhead for maintaining memory coherency. Despite outperforming software predecessors, hardware DSM and cache-coherent interfaces require custom chips and lack the flexibility to experiment with different DSM consistency protocols. This thesis presents fDSM, an FPGA-accelerated DSM framework for ISA-heterogeneous hardware. fDSM implements a high-speed messaging layer to enable inter-node communication across ISA-different CPU cores and a DSM protocol processor that maintains virtual memory coherency using a multiple-reader single- writer DSM algorithm. Experimental studies reveal that fDSM outperforms prior art, including Popcorn Linux's software DSM abstraction, which uses TCP-IP and state-of-the-art Infiniband RDMA messaging layers by 2.8X and 7%, respectively. fDSM also provides reconfigurability and thereby allows implementation and experimentation of different memory consistency models. / Master of Science / Moore's Law predicts that the number of transistors in a chip will double approximately every two years. Chip vendors are increasingly observing that this law is nearing its limit when transistor sizes are shrunk to 5nm and 3nm due to power consumption and heat dissipation issues. As a result, innovation in new computing architectures has increasingly focused on heterogeneity, i.e., the use of hardware performance accelerators like graphic processors and reconfigurable logic used in confluence with a computer's CPU (host). To improve the programmability of these architectures, which usually have physically separate memory, the shared-memory programming model is usually used to provide coherent virtual memory. The shared memory model, when applied to such distributed systems, called distributed shared memory (or DSM), has been previously developed in software as well as in hardware. The former usually suffer from high latency overheads, while the latter often requires custom chips and lack programmability for implementing new memory consistency protocols. This thesis presents fDSM, a reconfigurable distributed shared memory framework that provides coherent shared memory between a host and a smart I/O device such as a SmartNIC. fDSM is implemented in FPGAs, which are increasingly available in hosts and Smart I/O devices at the commodity scale. Our prototype implementation uses ISA-heterogeneous hosts to emulate such an environment. Our experimental evaluation using applications from High- Performance Computing benchmark suites reveal that fDSM yields performance benefits over a state-of-the-art software DSM.

Page generated in 0.0334 seconds