
Scaling RDMA RPCs with FLOCK

RDMA-capable networks are gaining traction in datacenter deployments due to their high throughput, low latency, CPU efficiency, and advanced features such as remote memory operations. However, efficiently utilizing RDMA capability in the common setting of an asymmetric network topology with high fan-in and fan-out is challenging. For instance, using RDMA programming features comes at the cost of connection scalability, which degrades as cluster size increases. To address that, several works forgo some RDMA features and focus only on conventional RPC APIs. In this work, we strive to exploit the full capability of RDMA while scaling the number of connections regardless of the cluster size. We present FLOCK, a communication framework for RDMA networks that uses hardware-provided reliable connections. Using a partially shared model, FLOCK departs from the conventional RDMA design by enabling connection sharing among threads, which provides significant performance improvements, contrary to the widely held belief that connection sharing deteriorates performance. At its core, FLOCK uses a connection handle abstraction for connection multiplexing; a new coalescing-based synchronization approach for efficient network utilization; and a load-control mechanism for connections with symbiotic send-recv scheduling, which reduces the synchronization overheads associated with connection sharing while ensuring fair utilization of network connections.

M.S.

The Internet is one of the great discoveries of our time. It provides access to enormous sources of knowledge, makes it easy to communicate seamlessly across the globe, and offers countless other advantages. Over the years, the latency of services such as web searches and file downloads has dropped sharply. A download that took minutes during the 2000s can now complete within seconds. Network speeds have been improving, facilitating a faster and smoother user experience. Another factor contributing to the improved Internet experience is that service providers such as Google, Amazon, and others can process user requests in a fraction of the time it used to take. Web services such as search and e-commerce are implemented using a multi-layer architecture, with each layer containing hundreds to thousands of servers. Each server runs one or more components of the web service application. In this architecture, user requests are received in the upper layer and processed by the lower layers. Servers in different layers communicate over ultrafast networks, such as those supporting Remote Direct Memory Access (RDMA). The implication of the multi-layer architecture is that a server has to communicate with multiple other servers in the upper and lower layers.

Unfortunately, due to its inherent limitations, RDMA does not perform well when network communication takes place with a large number of servers. In this thesis, a new communication framework for RDMA networks, FLOCK, is proposed to overcome the scalability limitations of RDMA hardware. FLOCK maintains scalability when communicating with many servers, and it consistently provides better performance than the state of the art. Additionally, FLOCK utilizes the network bandwidth efficiently and reduces the CPU overheads incurred due to network communication.

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/107438
Date: 30 November 2021
Creators: Monga, Sumit Kumar
Contributors: Electrical and Computer Engineering, Min, Changwoo, Butt, Ali R., Nikolopoulos, Dimitrios
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Language: English
Detected Language: English
Type: Thesis
Format: ETD, application/pdf, application/pdf
Rights: Creative Commons Attribution-ShareAlike 4.0 International, http://creativecommons.org/licenses/by-sa/4.0/
