In this thesis, we propose optimization techniques for distributed graph processing. First, we describe a data processing pipeline that leverages an iterative graph algorithm for automatic classification of web trackers. Using this application as a motivating example, we examine how asymmetrical convergence of iterative graph algorithms can be used to reduce the amount of computation and communication in large-scale graph analysis. We propose an optimization framework for fixpoint algorithms and a declarative API for writing fixpoint applications. Our framework uses a cost model to automatically exploit asymmetrical convergence and evaluate execution strategies at runtime. We show that our cost model achieves speedups of up to 1.7x and communication savings of up to 54%. Next, we propose to use the concepts of semi-metricity and the metric backbone to reduce the amount of data that needs to be processed in large-scale graph analysis. We provide a distributed algorithm for computing the metric backbone using the vertex-centric programming model. Using the backbone, we can reduce graph sizes by up to 88% and achieve speedups of up to 6.7x.
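The abstract mentions semi-metricity and the metric backbone without defining them. As a rough illustration only: an edge is commonly called semi-metric when some indirect path between its endpoints is shorter than the edge itself, and the metric backbone is what remains after such edges are dropped. The sketch below assumes that definition, an undirected graph, and positive edge weights. It is a single-machine toy, not the distributed vertex-centric algorithm of the thesis, and the names `shortest_distances` and `metric_backbone` are hypothetical.

```python
import heapq
from collections import defaultdict


def shortest_distances(adj, source):
    """Dijkstra's algorithm: shortest distance from source to every reachable vertex."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def metric_backbone(edges):
    """Keep only the edges whose weight equals the shortest-path distance
    between their endpoints (assumed definition of a metric edge)."""
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    dist_cache = {}  # one Dijkstra run per distinct source vertex
    backbone = []
    for u, v, w in edges:
        if u not in dist_cache:
            dist_cache[u] = shortest_distances(adj, u)
        # The shortest distance can never exceed w, so >= means "equal".
        if dist_cache[u][v] >= w:
            backbone.append((u, v, w))
    return backbone


# Toy graph: the direct edge a-c (weight 3) is longer than the path a-b-c (weight 2).
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 3.0)]
backbone = metric_backbone(edges)
```

In this toy graph the edge a-c is semi-metric and is dropped, while all shortest-path distances are preserved by the two remaining edges. That property is what makes the backbone a candidate for shrinking the input to distance-based analyses.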
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-192471 |
Date | January 2016 |
Creators | Kalavri, Vasiliki |
Publisher | KTH, Programvaruteknik och Datorsystem, SCS |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Doctoral thesis, monograph, info:eu-repo/semantics/doctoralThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | TRITA-ICT ; 2016:25 |