Multiple Learning for Generalized Linear Models in Big Data

Big data is an enabling technology in digital transformation. It complements ordinary linear models and generalized linear models well, because training well-performing models of either kind requires huge amounts of data. With the help of big data, ordinary and generalized linear models can be well trained and thus offer better services to people. However, many challenges remain in training ordinary and generalized linear models on big data. Among the most prominent are the computational challenges: the memory inflation and training inefficiency that occur when processing data and training models. Hundreds of algorithms have been proposed to alleviate or overcome the memory inflation issue. However, the solutions they obtain are only locally optimal. Additionally, most of the proposed algorithms require loading the dataset into RAM many times when updating the model parameters. When models must be fitted and compared under multiple hyper-parameter values, as in ridge regression, parallel computing techniques are applied in practice. Thus, multiple learning with sufficient statistics arrays is proposed to tackle the memory inflation and training inefficiency issues.
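The abstract does not spell out how the sufficient statistics arrays are built, so the following is only a minimal sketch of the general idea for the ridge regression case it mentions, not the thesis's own algorithm. It assumes the data arrive in row chunks and that the arrays X^T X and X^T y are the statistics being accumulated; the function names `accumulate_sufficient_statistics` and `ridge_path` and the synthetic data are hypothetical, introduced here for illustration.

```python
import numpy as np


def accumulate_sufficient_statistics(chunks):
    """Make one pass over (X, y) row chunks and return X^T X and X^T y.

    Only the p-by-p and length-p accumulators stay in memory,
    regardless of how many rows the chunks contain in total.
    """
    xtx = None
    xty = None
    for X, y in chunks:
        if xtx is None:
            p = X.shape[1]
            xtx = np.zeros((p, p))
            xty = np.zeros(p)
        xtx += X.T @ X
        xty += X.T @ y
    return xtx, xty


def ridge_path(xtx, xty, lambdas):
    """Solve (X^T X + lambda * I) beta = X^T y for each penalty lambda.

    The raw data are not touched again; every fit reuses the same arrays.
    """
    identity = np.eye(xtx.shape[0])
    return {lam: np.linalg.solve(xtx + lam * identity, xty) for lam in lambdas}


# Synthetic data standing in for a dataset too large to hold in RAM (assumption).
rng = np.random.default_rng(0)
X_full = rng.normal(size=(1000, 5))
beta_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y_full = X_full @ beta_true + 0.1 * rng.normal(size=1000)

# Feed the data in chunks of 100 rows, as if read from disk piece by piece.
chunks = ((X_full[i:i + 100], y_full[i:i + 100]) for i in range(0, 1000, 100))
xtx, xty = accumulate_sufficient_statistics(chunks)

# Compare several ridge penalties without reloading the dataset.
coefs = ridge_path(xtx, xty, [0.0, 1.0, 10.0])
```

In this sketch the dataset is read once, and each additional penalty value costs only one small linear solve. Generalized linear models other than the ordinary linear case do not reduce to these two arrays so directly, and the sketch makes no claim about how the thesis handles them.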

DOI	10.25394/pgs.17153546.v1

Distributed Computing

big data

Linear regression analyses

Distributed computing

Sufficient statistics

Generalized Linear Model

Identifier	oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/17153546
Date	19 December 2021
Creators	Xiang Liu (11819735)
Source Sets	Purdue University
Detected Language	English
Type	Text, Thesis
Rights	CC BY 4.0
Relation	https://figshare.com/articles/thesis/Multiple_Learning_for_Generalized_Linear_Models_in_Big_Data/17153546

