1 |
Scaling RDMA RPCs with FLOCK
Monga, Sumit Kumar
30 November 2021
RDMA-capable networks are gaining traction in datacenter deployments due to their high throughput, low latency, CPU efficiency, and advanced features such as remote memory operations. However, efficiently utilizing RDMA capability in the common setting of a high fan-in, fan-out asymmetric network topology is challenging. For instance, using RDMA programming features comes at the cost of connection scalability as the cluster grows. To address this, several works forgo some RDMA features and focus only on conventional RPC APIs. In this work, we strive to exploit the full capability of RDMA while scaling the number of connections regardless of cluster size. We present FLOCK, a communication framework for RDMA networks that uses hardware-provided reliable connections. Using a partially shared model, FLOCK departs from the conventional RDMA design by enabling connection sharing among threads, which provides significant performance improvements, contrary to the widely held belief that connection sharing deteriorates performance. At its core, FLOCK uses a connection handle abstraction for connection multiplexing; a new coalescing-based synchronization approach for efficient network utilization; and a load-control mechanism for connections with symbiotic send-recv scheduling, which reduces the synchronization overheads associated with connection sharing while ensuring fair utilization of network connections.
/ M.S. /
The Internet is one of the great discoveries of our time. It provides access to enormous knowledge sources and makes it easier to communicate seamlessly across the globe, among countless other advantages. Over the years, the latency of services like web searches and file downloads has gone down sharply. A download that took minutes during the 2000s can now complete within seconds. Network speeds have been improving, facilitating a faster and smoother user experience. Another factor contributing to the improved internet experience is that service providers like Google, Amazon, and others can now process user requests in a fraction of the time it used to take. Web services such as search and e-commerce are implemented using a multi-layer architecture, with each layer containing hundreds to thousands of servers. Each server runs one or more components of the web service application. In this architecture, user requests are received in the upper layer and processed by the lower layers. Servers in different layers communicate over an ultrafast network using technologies such as Remote Direct Memory Access (RDMA). The implication of the multi-layer architecture is that a server has to communicate with multiple other servers in the upper and lower layers. Unfortunately, due to its inherent limitations, RDMA does not perform well when network communication takes place with a large number of servers. In this thesis, a new communication framework for RDMA networks, FLOCK, is proposed to overcome the scalability limitations of RDMA hardware. FLOCK maintains scalability when communicating with many servers and consistently provides better performance than the state of the art. Additionally, FLOCK utilizes network bandwidth efficiently and reduces the CPU overheads incurred due to network communication.
|
2 |
Software-defined Buffer Management and Robust Congestion Control for Modern Datacenter Networks
Danushka N Menikkumbura (12208121)
20 April 2022
<p>Modern datacenter network applications continue to demand ultra-low latencies and very high throughputs. At the same time, network infrastructure keeps achieving higher speeds and larger bandwidths. We still need better network management solutions to keep demand and supply going hand in hand. Key metrics that define network performance, such as flow completion time (the lower the better), throughput (the higher the better), and end-to-end latency (the lower the better), are mainly governed by how effectively network applications get their fair share of network resources. We observe that buffer utilization on network switches gives a very accurate indication of network performance. Therefore, network buffer management is important in modern datacenter networks, and other network management solutions can be efficiently built around buffer utilization. This dissertation presents three solutions based on buffer use on network switches.</p>
<p>This dissertation consists of three main sections. The first section is on a specification language for buffer management in modern programmable switches. The second section is on a congestion control solution for Remote Direct Memory Access (RDMA) networks. The third section is on a solution to head-of-line blocking in modern datacenter networks.</p>
|