Peer-to-peer (P2P) paradigm, for its scalability and low cost management, is widely used in today’s network. Based on the typical designs for request/response services, a lot of efforts have been made to support publish-subscribe services in P2P networks. Gossip-based publish-subscribe system, which is commonly used in unstructured P2P networks, can provide great flexibility in query language and does not require special efforts on maintaining topology. The purpose of our work is to investigate effective and efficient mechanisms to build gossip-based publish-subscribe systems in unstructured P2P networks. Specifically, the probabilistic bi-quorum system (PBQS), for its assurance in effectiveness, becomes the object of our study.
Uniform sampling is a fundamental tool to construct PBQS. By adopting uniform sampling, PBQS provides a bound on the likelihood that data messages will find a copy of the subscription. A random walk of length O(log n) is commonly used to gain a uniform sample on an expander graph of size n. To obtain a multitude of uniform samples thus requires an equivalent number of random walks of length O(log n) each. A number of works have relied on the Chernoff bound to analytically reduce the overhead needed to obtain a multitude of uniform samples. Besides, researchers have also shown that it is not necessary to replicate both data and query on uniformly chosen nodes. Alternatively, BubbleStorm performs controlled flooding on a constructed overlay to build PBQS. BubbleStorm does not require nodes forming a bubble to be uniformly chosen at random, and the probabilistic bound computed by BubbleStorm is different from uniform sampling based PBQS.
In this thesis, we first show that the Chernoff bound on the statistical properties of samples collected from a random walk does not help in selecting uniformly random nodes. We then re-examine the role of uniform sampling in PBQS, and found that when multiple data answer a single subscription, it is sufficient and necessary for each data to be distributed uniformly at random. Looking into BubbleStorm, we examine more closely the probabilistic bound provided by this system. We found that, unlike uniform sampling based PBQS, the bubble intersection in BubbleStorm is distance dependent. Given a specific pair of publisher-subscriber, the data may never find the subscription. We further investigate the topology construction and found that re-creating topology prior to each controlled flooding or keeping topology with high degree of churn can help alleviate the distance dependency problem. We arrive at the conclusion that BubbleStorm construction is equivalent to caching of random walks. We show that re-using this cache to obtain samples over time leads to degradation of uniformity of the samples. We evaluate topology re-wiring as a simple method to keep the cache fresh, thereby benefiting from the low latency of controlled flooding without degrading the uniformity of samples over time. / published_or_final_version / Electrical and Electronic Engineering / Master / Master of Philosophy
Identifer | oai:union.ndltd.org:HKU/oai:hub.hku.hk:10722/208013 |
Date | January 2014 |
Creators | Zhang, Xin, 张昕 |
Contributors | Yeung, LK |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Source Sets | Hong Kong University Theses |
Language | English |
Detected Language | English |
Type | PG_Thesis |
Rights | Creative Commons: Attribution 3.0 Hong Kong License, The author retains all proprietary rights, (such as patent rights) and the right to use in future works. |
Relation | HKU Theses Online (HKUTO) |
Page generated in 0.0032 seconds