Global ETD Search

Distributed Statistical Learning under Communication Constraints

"In this thesis, we study distributed statistical learning, in which multiple terminals, connected by links with limited capacity, cooperate to perform a learning task. Because the links have limited capacity, the messages exchanged between the terminals have to be compressed. The goal of this thesis is to investigate how to compress the data observations at multiple terminals and how to use the compressed data for inference. We first focus on the distributed parameter estimation problem, in which terminals send messages related to their local observations at limited rates to a fusion center, which then forms an estimate of a parameter related to the observations of all terminals. It is well known that if the transmission rates are in the Slepian-Wolf region, the fusion center can fully recover all observations and hence can construct an estimator having the same performance as that of the centralized case. One natural question is whether Slepian-Wolf rates are necessary to achieve the same estimation performance as that of the centralized case. In this thesis, we show that the answer to this question is negative. We then examine the optimality of data dimensionality reduction via sufficient statistics compression in distributed parameter estimation problems. The dimensionality reduction step is often needed, especially if the data has a very high dimension and the communication rate is not as high as the one characterized above. We show that reducing the dimensionality by extracting sufficient statistics of the parameter to be estimated does not degrade the overall estimation performance in the presence of communication constraints. We further analyze the optimal estimation performance in the presence of communication constraints, and we verify the derived bound using simulations. Finally, we study distributed optimization problems, for which we examine the randomized distributed coordinate descent algorithm with quantized updates. In the literature, the iteration complexity of the randomized distributed coordinate descent algorithm has been characterized under the assumption that machines can exchange updates with infinite precision. We consider a practical scenario in which the message exchange occurs over channels with finite capacity, and hence the updates have to be quantized. We derive sufficient conditions on the quantization error such that the algorithm with quantized updates still converges."
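The closing part of the abstract, coordinate descent with quantized updates, can be illustrated with a small single-machine sketch. This is not the algorithm or the analysis from the thesis: the quadratic objective, the uniform quantizer, the step size, and every function name below are assumptions chosen only to show what "quantizing the update" means. Each iteration picks a random coordinate, computes the exact coordinate-wise minimizing update, and applies only its quantized value.

```python
import random


def quantize(value, step):
    """Uniform quantizer (assumed here): round to the nearest multiple of step."""
    return step * round(value / step)


def residual(A, b, x):
    """Return b - A x, the negative gradient of f(x) = 0.5 x'Ax - b'x."""
    n = len(x)
    return [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]


def objective(A, b, x):
    """Evaluate f(x) = 0.5 x'Ax - b'x."""
    n = len(x)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(b[i] * x[i] for i in range(n))


def quantized_coordinate_descent(A, b, step, iterations, seed=0):
    """Randomized coordinate descent in which only quantized updates are applied."""
    rng = random.Random(seed)
    x = [0.0] * len(b)
    for _ in range(iterations):
        i = rng.randrange(len(b))                         # random coordinate
        exact_update = residual(A, b, x)[i] / A[i][i]     # exact 1-D minimizer
        x[i] += quantize(exact_update, step)              # apply quantized update
    return x


# Hypothetical toy problem: A is symmetric positive definite.
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x_hat = quantized_coordinate_descent(A, b, step=1e-3, iterations=2000)
print("estimate:", x_hat)
print("largest residual:", max(abs(r) for r in residual(A, b, x_hat)))
```

With a fixed quantization step, this toy iteration can only settle on a grid point near the minimizer rather than the minimizer itself, which gives some intuition for why conditions on the quantization error matter for convergence.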

convex optimization

MVUE

sufficient statistics

Distributed learning

parameter estimation

coordinate descent

Identifier	oai:union.ndltd.org:wpi.edu/oai:digitalcommons.wpi.edu:etd-dissertations-1313
Date	21 June 2017
Creators	El Gamal, Mostafa
Contributors	Lifeng Lai, Advisor, Alexander M. Wyglinski, Committee Member, Randy C. Paffenroth, Committee Member
Publisher	Digital WPI
Source Sets	Worcester Polytechnic Institute
Detected Language	English
Type	text
Format	application/pdf
Source	Doctoral Dissertations (All Dissertations, All Years)
