Return to search

Algorithms with theoretical guarantees for several database problems. / 一些數據庫問題的具有理論保證的算法 / CUHK electronic theses & dissertations collection / Yi xie shu ju ku wen ti de ju you li lun bao zheng de suan fa

在此論文中,我們為一系列應用與數據庫系統的問題設計有理論保證的數據結構與/或算法。這些問題可以分為兩類。第一類是一些計算機科學中的經典問題:近似最近鄰居(approximatenearest neighbor)問題、近似最近點對(approximate closest pair )問題、skyline問題(亦稱maxima問題)和二維正交區域聚合(2d orthogonalrange aggregation)問題。第二類則是在此論文中提出的新問題:歷史分位數(historical quantile)問題、k-跳步最短路徑(k-skipshortest path)問題、XML文檔中的最近關鍵字(nearest keyword)問題、最連通節點(most connected vertex )問題和先入先出索引(FIFOindexing)問題。對於每一個問題,或者我們給出最壞情況亦高效的(worst-case efficient)解決方案;或者當最壞情況性能的意義不大時,我們證明方法的實例最優性(instance optimality)。 / In this thesis, we propose data structures and/or algorithms with theoretical guarantees for solving a series of problems that find applications in database systems. These problems can be classified into two categories. The first one contains several classic problems in computer science, including the approximate nearest neighbor problem, the approximate closest pair problem, the skyline problem (a.k.a. the maxima problem), and 2d orthogonal range search. The second category, on the other hand, consists of problems that are newly introduced by this thesis: the historical quantile problem, the k-skip shortest path problem, the nearest keyword problem on XML documents, the most connected vertex problem, and the FIFO indexing problem. For each problem, we establish either the worstcase efficiency of our solutions, or their instance optimality when worst-case performance is not interesting. / Detailed summary in vernacular field only. / Sheng, Cheng. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2012. / Includes bibliographical references (leaves 277-298). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Computation models --- p.1 / Chapter 1.2 --- Thesis contributions and organization --- p.2 / Chapter 2 --- Nearest Neighbors and Closest Pairs --- p.7 / Chapter 2.1 --- Introduction --- p.7 / Chapter 2.2 --- Problem settings --- p.10 / Chapter 2.3 --- The preliminaries --- p.11 / Chapter 2.3.1 --- Rigorous-LSH and ball cover --- p.12 / Chapter 2.3.2 --- Adhoc-LSH --- p.13 / Chapter 2.3.3 --- Details of hash functions --- p.15 / Chapter 2.4 --- LSB-tree --- p.16 / Chapter 2.4.1 --- Building a LSB-tree --- p.16 / Chapter 2.4.2 --- Nearest neighbor algorithm --- p.18 / Chapter 2.5 --- Theoretical analysis --- p.21 / Chapter 2.5.1 --- Quality guarantee --- p.21 / Chapter 2.5.2 --- Space and query time --- p.25 / Chapter 2.5.3 --- Comparison with rigorous-LSH --- p.25 / Chapter 2.6 --- Extensions --- p.26 / Chapter 2.7 --- Closest pair search --- p.30 / Chapter 2.7.1 --- Ball pair search --- p.30 / Chapter 2.7.2 --- Solving the closest pair problem --- p.34 / Chapter 2.8 --- Related work --- p.38 / Chapter 2.9 --- Experiments --- p.41 / Chapter 2.9.1 --- Data and queries --- p.41 / Chapter 2.9.2 --- Competitors for nearest neighbor search --- p.42 / Chapter 2.9.3 --- Competitors for closest pair search --- p.43 / Chapter 2.9.4 --- Computing environments and assessment metrics --- p.44 / Chapter 2.9.5 --- Behavior of LSH implementations --- p.45 / Chapter 2.9.6 --- Comparison of NN solutions --- p.48 / Chapter 2.9.7 --- Comparison of CP solutions --- p.51 / Chapter 2.10 --- Chapter summary --- p.54 / Chapter 3 --- The Skyline Problem and Its Variants --- p.57 / Chapter 3.1 --- Introduction --- p.57 / Chapter 3.1.1 --- Previous results --- p.58 / Chapter 3.1.2 --- Our results --- p.60 / Chapter 3.2 --- Preliminaries --- p.63 / Chapter 3.3 --- Our skyline algorithm --- p.66 / Chapter 3.3.1 --- 3d --- p.67 / Chapter 3.3.2 --- 4d --- p.69 / Chapter 3.3.3 --- Higher dimensionalities --- p.70 / Chapter 3.3.4 --- Eliminating the general-position assumption --- p.74 / Chapter 3.4 --- Variants of the skyline problem --- p.74 / Chapter 3.5 --- Low-cardinality domains --- p.77 / Chapter 3.6 --- Non-fixed dimensionality --- p.80 / Chapter 3.6.1 --- An improved algorithm in internal memory --- p.80 / Chapter 3.6.2 --- Externalizing the algorithm --- p.82 / Chapter 3.7 --- Chapter summary --- p.83 / Chapter 4 --- Orthogonal Range Aggregation --- p.85 / Chapter 4.1 --- Introduction --- p.85 / Chapter 4.1.1 --- Applications --- p.86 / Chapter 4.1.2 --- Computation model --- p.87 / Chapter 4.1.3 --- Previous results --- p.87 / Chapter 4.1.4 --- Our results --- p.88 / Chapter 4.2 --- Preliminaries --- p.91 / Chapter 4.3 --- Bundled compressed B-tree --- p.93 / Chapter 4.4 --- Three-sided window-max --- p.95 / Chapter 4.4.1 --- The first structure --- p.96 / Chapter 4.4.2 --- The improved structure --- p.97 / Chapter 4.5 --- Segment-intersection-max --- p.101 / Chapter 4.6 --- Stabbing-max --- p.103 / Chapter 4.6.1 --- Ray-segment-max --- p.103 / Chapter 4.6.2 --- Stabbing-max --- p.106 / Chapter 4.7 --- Rectangle-intersection-sum --- p.106 / Chapter 5 --- Persistent Quantiles --- p.109 / Chapter 5.1 --- Introduction --- p.109 / Chapter 5.1.1 --- Problem definition --- p.111 / Chapter 5.1.2 --- Previous work --- p.112 / Chapter 5.1.3 --- Our main results --- p.113 / Chapter 5.2 --- Space lower bounds for historical quantile search --- p.114 / Chapter 5.3 --- A structure for historical quantile search --- p.119 / Chapter 5.3.1 --- Persistence technique --- p.119 / Chapter 5.3.2 --- High-level rationales and challenges --- p.121 / Chapter 5.3.3 --- The structure and its query algorithm --- p.122 / Chapter 5.3.4 --- Construction algorithm --- p.127 / Chapter 5.3.5 --- Complexity analysis --- p.130 / Chapter 5.3.6 --- An alternative simple solution --- p.132 / Chapter 5.4 --- Experiments --- p.133 / Chapter 5.4.1 --- Competitors and metrics --- p.133 / Chapter 5.4.2 --- Performance characteristics --- p.133 / Chapter 5.4.3 --- Performance on real data --- p.136 / Chapter 5.5 --- Chapter summary --- p.138 / Chapter 6 --- k-Skip Shortest Paths --- p.141 / Chapter 6.1 --- Introduction --- p.141 / Chapter 6.2 --- Related work --- p.144 / Chapter 6.2.1 --- Dijkstra and reach --- p.144 / Chapter 6.2.2 --- More results on SP computation --- p.148 / Chapter 6.3 --- k-skip Shortest Paths --- p.150 / Chapter 6.4 --- k-skip graph --- p.152 / Chapter 6.4.1 --- Size of k-skip cover --- p.152 / Chapter 6.4.2 --- Computing a k-skip cover --- p.154 / Chapter 6.4.3 --- Computing a k-skip graph --- p.156 / Chapter 6.5 --- Query algorithm --- p.158 / Chapter 6.5.1 --- High-level description --- p.158 / Chapter 6.5.2 --- Reach* --- p.160 / Chapter 6.5.3 --- Zoom-in --- p.163 / Chapter 6.6 --- Experiments --- p.164 / Chapter 6.7 --- Chapter summary --- p.168 / Chapter 7 --- Nearest Keyword Queries on XML Documents --- p.171 / Chapter 7.1 --- Introduction --- p.171 / Chapter 7.1.1 --- Motivation --- p.171 / Chapter 7.1.2 --- Contributions --- p.174 / Chapter 7.2 --- Preliminaries --- p.175 / Chapter 7.3 --- Nearest keyword search --- p.178 / Chapter 7.3.1 --- Overview --- p.178 / Chapter 7.3.2 --- TVP characteristics --- p.180 / Chapter 7.3.3 --- Finding the minimum TVP --- p.183 / Chapter 7.4 --- Nearest keyword search as an operator --- p.186 / Chapter 7.4.1 --- XPath evaluation --- p.186 / Chapter 7.4.2 --- Finding approximate group steiner trees --- p.192 / Chapter 7.5 --- Related work --- p.193 / Chapter 7.6 --- Experiments --- p.196 / Chapter 7.7 --- Chapter summary --- p.202 / Chapter 8 --- FIFO Indexes for Decomposable Problems --- p.203 / Chapter 8.1 --- Introduction --- p.203 / Chapter 8.1.1 --- FIFO update scheme and its applications --- p.203 / Chapter 8.1.2 --- Technical motivations --- p.204 / Chapter 8.1.3 --- Problems, computation models, and basic notations --- p.205 / Chapter 8.1.4 --- Previous results --- p.207 / Chapter 8.1.5 --- Our results --- p.210 / Chapter 8.2 --- Making a static structure FIFO --- p.213 / Chapter 8.2.1 --- The RAM model --- p.214 / Chapter 8.2.2 --- The EM model --- p.220 / Chapter 8.3 --- Solving concrete problems --- p.221 / Chapter 8.4 --- Chapter summary --- p.225 / Chapter 9 --- The Most Connected Vertex Problem --- p.227 / Chapter 9.1 --- Introduction --- p.227 / Chapter 9.1.1 --- Motivation --- p.227 / Chapter 9.1.2 --- Our main results --- p.230 / Chapter 9.2 --- Problem and Related Work --- p.232 / Chapter 9.3 --- Preliminaries --- p.235 / Chapter 9.4 --- Exact algorithms --- p.239 / Chapter 9.5 --- Theoretical analysis of the exact algorithms --- p.244 / Chapter 9.5.1 --- The randomized algorithm class --- p.244 / Chapter 9.5.2 --- The deterministic algorithm class --- p.251 / Chapter 9.6 --- Approximate algorithms and their analysis --- p.254 / Chapter 9.6.1 --- 1-MCV --- p.254 / Chapter 9.6.2 --- k-MCV --- p.259 / Chapter 9.7 --- Experiments --- p.262 / Chapter 9.7.1 --- Datasets --- p.262 / Chapter 9.7.2 --- Methods --- p.264 / Chapter 9.7.3 --- How pessimistic is the worst case? --- p.265 / Chapter 9.7.4 --- Performance of random-probe algorithms --- p.266 / Chapter 9.7.5 --- Performance of deterministic-probe algorithms --- p.268 / Chapter 9.7.6 --- Performance of AMCV --- p.269 / Chapter 9.8 --- Chapter summary --- p.274 / Bibliography --- p.277

Identiferoai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_328160
Date January 2012
ContributorsSheng, Cheng., Chinese University of Hong Kong Graduate School. Division of Computer Science and Engineering.
Source SetsThe Chinese University of Hong Kong
LanguageEnglish, Chinese
Detected LanguageEnglish
TypeText, bibliography
Formatelectronic resource, electronic resource, remote, 1 online resource (xvi, 298 leaves) : ill. (some col.)
RightsUse of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Page generated in 0.0033 seconds