1 |
Electronic structure and optical properties of ZnO : bulk and surface. Yan, Caihua, 23 February 1994
Graduation date: 1994
|
2 |
Distributed Statistical Learning under Communication Constraints. El Gamal, Mostafa, 21 June 2017
"In this thesis, we study distributed statistical learning, in which multiple terminals, connected by links with limited capacity, cooperate to perform a learning task. As the links connecting the terminals have limited capacity, the messages exchanged between the terminals have to be compressed. The goal of this thesis is to investigate how to compress the data observations at multiple terminals and how to use the compressed data for inference. We first focus on the distributed parameter estimation problem, in which terminals send messages related to their local observations using limited rates to a fusion center that will obtain an estimate of a parameter related to the observations of all terminals. It is well known that if the transmission rates are in the Slepian-Wolf region, the fusion center can fully recover all observations and hence can construct an estimator having the same performance as that of the centralized case. One natural question is whether Slepian-Wolf rates are necessary to achieve the same estimation performance as that of the centralized case. In this thesis, we show that the answer to this question is negative. We then examine the optimality of data dimensionality reduction via sufficient statistics compression in distributed parameter estimation problems. The data dimensionality reduction step is often needed especially if the data has a very high dimension and the communication rate is not as high as the one characterized above. We show that reducing the dimensionality by extracting sufficient statistics of the parameter to be estimated does not degrade the overall estimation performance in the presence of communication constraints. We further analyze the optimal estimation performance in the presence of communication constraints and we verify the derived bound using simulations. Finally, we study distributed optimization problems, for which we examine the randomized distributed coordinate descent algorithm with quantized updates. In the literature, the iteration complexity of the randomized distributed coordinate descent algorithm has been characterized under the assumption that machines can exchange updates with an infinite precision. We consider a practical scenario in which the messages exchange occurs over channels with finite capacity, and hence the updates have to be quantized. We derive sufficient conditions on the quantization error such that the algorithm with quantized update still converge."
|
3 |
Multiple Learning for Generalized Linear Models in Big Data. Xiang Liu (11819735), 19 December 2021
Big data is an enabling technology in digital transformation. It perfectly complements ordinary linear models and generalized linear models, since training well-performing models of either kind requires huge amounts of data. With the help of big data, ordinary and generalized linear models can be well trained and thus offer better services to human beings. However, there are still many challenges to address when training ordinary and generalized linear models on big data. One of the most prominent is the computational challenge: the memory inflation and training inefficiency issues that occur when processing data and training models. Hundreds of algorithms have been proposed to alleviate or overcome the memory inflation issues, but the solutions obtained are only locally optimal. Additionally, most of the proposed algorithms require loading the dataset into RAM many times when updating the model parameters. If multiple model hyper-parameters need to be computed and compared, e.g. for ridge regression, parallel computing techniques are applied in practice. Thus, multiple learning with sufficient statistics arrays is proposed to tackle the memory inflation and training inefficiency issues.
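To make the sufficient-statistics idea concrete, here is a minimal Python sketch, not the dissertation's implementation: it accumulates X^T X and X^T y in a single pass over data chunks, after which ridge fits for many penalty values can be computed without reloading the raw data. The function names, the chunked input format, and the candidate penalties are assumptions for illustration.

```python
# Hedged sketch of a "sufficient statistics array" for ordinary/ridge regression:
# stream the data once, keep only X^T X and X^T y, then fit many models cheaply.
import numpy as np

def accumulate_sufficient_statistics(chunks):
    """chunks: iterable of (X, y) blocks that together form the full dataset."""
    xtx, xty, n = None, None, 0
    for X, y in chunks:
        if xtx is None:
            d = X.shape[1]
            xtx, xty = np.zeros((d, d)), np.zeros(d)
        xtx += X.T @ X        # d x d accumulator, size independent of row count
        xty += X.T @ y
        n += X.shape[0]
    return xtx, xty, n

def ridge_from_statistics(xtx, xty, lambdas):
    """Solve (X^T X + lambda I) beta = X^T y for every candidate penalty."""
    d = xtx.shape[0]
    return {lam: np.linalg.solve(xtx + lam * np.eye(d), xty) for lam in lambdas}

# Usage: the data are read once; comparing hyper-parameters needs no reloading.
rng = np.random.default_rng(0)
chunks = [(rng.normal(size=(500, 5)), rng.normal(size=500)) for _ in range(4)]
xtx, xty, n = accumulate_sufficient_statistics(chunks)
fits = ridge_from_statistics(xtx, xty, lambdas=[0.1, 1.0, 10.0])
```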
|
4 |
Minimal Sufficient Statistics for Incomplete Block Designs With Interaction Under an Eisenhart Model III. Kapadia, C. H., Kvanli, Alan H., Lee, Kwan R., 01 January 1988
The purpose of this paper is to derive minimal sufficient statistics for the balanced incomplete block design and the group divisible partially balanced incomplete block design when the Eisenhart Model III (mixed model) is assumed. The results are identical to Hultquist and Graybill's (1965) and Hirotsu's (1965) for the same model without interaction, except for the addition of a statistic, $\sum_{ij} Y_{ij\cdot}^2$.
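For illustration only (not code from the paper), a short Python sketch of how the added statistic $\sum_{ij} Y_{ij\cdot}^2$, read here as the sum of squared block-by-treatment cell totals, could be computed from long-format data; the record layout and function name are assumed.

```python
# Hedged illustration: compute sum_{ij} Y_{ij.}^2, the sum of squared
# cell (block x treatment) totals, from (block, treatment, y) records.
from collections import defaultdict

def cell_total_statistic(records):
    """records: iterable of (block, treatment, y) observations."""
    cell_totals = defaultdict(float)
    for block, treatment, y in records:
        cell_totals[(block, treatment)] += y          # accumulate Y_{ij.}
    return sum(t**2 for t in cell_totals.values())    # sum_{ij} Y_{ij.}^2

# Usage with a toy incomplete block layout (blocks 1-2, treatments A-C).
data = [(1, "A", 2.0), (1, "B", 3.5), (2, "A", 1.0), (2, "C", 4.2), (1, "A", 2.2)]
print(cell_total_statistic(data))
```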
|