Global ETD Search

371	Automatic model selection on local Gaussian structures with priors: comparative investigations and applications. / 基於帶先驗的局部高斯結构的自動模型選擇: 比較性分析及應用研究 / CUHK electronic theses & dissertations collection / Ji yu dai xian yan de ju bu Gaosi jie gou de zi dong mo xing xuan ze: bi jiao xing fen xi ji ying yong yan jiu January 2012 Model selection, an important topic in machine learning, aims to determine an appropriate model complexity given a finite set of samples. Automatic model selection is a class of efficient approaches that start from a sufficiently large model scale and rely on an intrinsic mechanism that drives redundant structures to become ineffective during learning, so that they can be discarded automatically. Priors are usually imposed on the parameters to facilitate automatic model selection. Systematic comparisons of automatic model selection approaches with priors are still lacking, and this thesis provides such a study for models with local Gaussian structures. / In particular, we compare the relative strengths and weaknesses of three typical automatic model selection approaches, namely Variational Bayesian (VB), Minimum Message Length (MML) and Bayesian Ying-Yang (BYY) harmony learning. First, we consider the Gaussian Mixture Model (GMM), for which the number of Gaussian components is to be determined. Further assuming that each Gaussian component has a subspace structure, we then consider two models, namely Mixture of Factor Analyzers (MFA) and Local Factor Analysis (LFA), for both of which the number of components and the local subspace dimensionalities are to be determined. / Two types of priors are imposed on the parameters, namely a conjugate form prior and a Jeffreys prior. The conjugate form prior is chosen as a Dirichlet-Normal-Wishart (DNW) prior for GMM, and as a Dirichlet-Normal-Gamma (DNG) prior for both MFA and LFA. 
The Jeffreys prior and the MML approach are not considered on MFA/LFA because of the difficulty of deriving the corresponding Fisher information matrix. Through extensive simulations and applications comparing the automatic model selection algorithms (six for GMM and four for MFA/LFA), we obtain the following main findings: 1. Imposing priors on all parameters makes each approach perform better than imposing priors merely on the mixing weights. 2. For all three approaches on GMM, the performance with the DNW prior is better than with the Jeffreys prior. Moreover, the Jeffreys prior makes MML slightly better than VB, while the DNW prior makes VB better than MML. 3. When the DNW prior hyper-parameters on GMM are changed from fixed to freely optimized under each approach's own learning principle, the performance of BYY improves, while the performances of VB and MML deteriorate. This observation remains the same when we compare BYY and VB on either MFA or LFA with the DNG prior. In fact, VB and MML lack a good guide for optimizing prior hyper-parameters. 4. For both GMM and MFA/LFA, BYY considerably outperforms both VB and MML, for any type of prior and whether or not hyper-parameters are optimized. Unlike VB and MML, which rely on appropriate priors, BYY does not depend heavily on the type of prior. It already performs well without priors and improves further when a Jeffreys or a conjugate form prior is imposed. 5. Despite their equivalence in maximum likelihood parameter learning, MFA and LFA lead to different performances of VB and BYY in automatic model selection. In particular, both BYY and VB perform better on LFA than on MFA, and the superiority of LFA is reliable and robust. / In addition to adopting the existing algorithms either directly or with some modifications, this thesis develops five new algorithms to fill the gaps. 
In particular, on GMM, the VB algorithm with the Jeffreys prior and the BYY algorithm with the DNW prior are developed; in the latter, a multivariate Student’s T-distribution is obtained as the posterior via marginalization. On MFA and LFA, BYY algorithms with DNG priors are developed, where products of multiple Student’s T-distributions are obtained in the posteriors via approximate marginalization. Moreover, a VB algorithm on LFA is developed as an alternative to the existing VB algorithm on MFA. / Detailed summary in vernacular field only. / Shi, Lei. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2012. / Includes bibliographical references (leaves 153-166). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese. 
/ Abstract / Acknowledgement / Chapter 1 --- Introduction / Chapter 1.1 --- Background / Chapter 1.2 --- Main Contributions of the Thesis / Chapter 1.3 --- Outline of the Thesis / Chapter 2 --- Automatic Model Selection on GMM / Chapter 2.1 --- Introduction / Chapter 2.2 --- Gaussian Mixture, Model Selection, and Priors / Chapter 2.2.1 --- Gaussian Mixture Model and EM algorithm / Chapter 2.2.2 --- Three automatic model selection approaches / Chapter 2.2.3 --- Jeffreys prior and Dirichlet-Normal-Wishart prior / Chapter 2.3 --- Algorithms with Jeffreys Priors / Chapter 2.3.1 --- Bayesian Ying-Yang learning and BYY-Jef algorithms / Chapter 2.3.2 --- Variational Bayesian and VB-Jef algorithms / Chapter 2.3.3 --- Minimum Message Length and MML-Jef algorithms / Chapter 2.4 --- Algorithms with Dirichlet and DNW Priors / Chapter 2.4.1 --- Algorithms BYY-Dir(α), VB-Dir(α) and MML-Dir(α) / Chapter 2.4.2 --- Algorithms with DNW priors / Chapter 2.5 --- Empirical Analysis on Simulated Data / Chapter 2.5.1 --- With priors on mixing weights: a quick look / Chapter 2.5.2 --- With full priors: extensive comparisons / Chapter 2.6 --- Concluding Remarks / Chapter 3 --- Applications of GMM Algorithms / Chapter 3.1 --- Face and Handwritten Digit Images Clustering / Chapter 3.2 --- Unsupervised Image Segmentation / Chapter 3.3 --- Image Foreground Extraction / Chapter 3.4 --- Texture Classification / Chapter 3.5 --- Concluding Remarks / Chapter 4 --- Automatic Model Selection on MFA/LFA / Chapter 4.1 --- Introduction / Chapter 4.2 --- MFA/LFA Models and the Priors / Chapter 4.2.1 --- MFA and LFA models / Chapter 4.2.2 --- The Dirichlet-Normal-Gamma priors / Chapter 4.3 --- Algorithms on MFA/LFA with DNG Priors / Chapter 4.3.1 --- BYY algorithm on MFA with DNG prior / Chapter 4.3.2 --- BYY algorithm on LFA with DNG prior / Chapter 4.3.3 --- VB algorithm on MFA with DNG prior / Chapter 4.3.4 --- VB algorithm on LFA with DNG prior / Chapter 4.4 --- Empirical Analysis on Simulated Data / Chapter 4.4.1 --- On the “chair” data: a quick look / Chapter 4.4.2 --- Extensive comparisons on four series of simulations / Chapter 4.5 --- Concluding Remarks / Chapter 5 --- Applications of MFA/LFA Algorithms / Chapter 5.1 --- Face and Handwritten Digit Images Clustering / Chapter 5.2 --- Unsupervised Image Segmentation / Chapter 5.3 --- Radar HRRP based Airplane Recognition / Chapter 5.3.1 --- Background of HRRP radar target recognition / Chapter 5.3.2 --- Data description / Chapter 5.3.3 --- Experimental results / Chapter 5.4 --- Concluding Remarks / Chapter 6 --- Conclusions and Future Works / Chapter A --- Referred Parametric Distributions / Chapter B --- Derivations of GMM Algorithms / Chapter B.1 --- The BYY-DNW Algorithm / Chapter B.2 --- The MML-DNW Algorithm / Chapter B.3 --- The VB-DNW Algorithm / Chapter C --- Derivations of MFA/LFA Algorithms / Chapter C.1 --- The BYY Algorithms with DNG Priors / Chapter C.1.1 --- The BYY-DNG-MFA algorithm / Chapter C.1.2 --- The BYY-DNG-LFA algorithm / Chapter C.2 --- The VB Algorithms with DNG Priors / Chapter C.2.1 --- The VB-DNG-MFA algorithm / Chapter C.2.2 --- The VB-DNG-LFA algorithm / Bibliography
Mathematical statistics--Data processing Gaussian measures Machine learning
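The discard step that automatic model selection relies on in record 371 can be sketched as follows. This is only an illustration of removing components whose mixing weights have been driven close to zero; it is not the thesis's VB, MML or BYY algorithm, and the function name `prune_components` and the threshold value are hypothetical choices made for the example.

```python
def prune_components(weights, threshold=1e-3):
    """Drop mixture components whose mixing weight is at or below `threshold`.

    Returns the indices of the surviving components and their weights,
    renormalized so that they sum to one again.
    """
    kept = [i for i, w in enumerate(weights) if w > threshold]
    if not kept:
        raise ValueError("all components fell below the threshold")
    total = sum(weights[i] for i in kept)
    return kept, [weights[i] / total for i in kept]


if __name__ == "__main__":
    # Hypothetical weights after learning: two of five components are redundant.
    learned = [0.40, 0.0002, 0.35, 0.0, 0.2498]
    indices, renormalized = prune_components(learned)
    print(indices, renormalized)
```

In an automatic model selection run, a step like this would be applied during or after learning, so the number of surviving components becomes the selected model size.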
372	Implementation of multiple comparison procedures in a generalized least squares program Marasinghe, Mervyn G January 2010 Typescript, etc. / Digitized by Kansas Correctional Industries Least squares--Computer programs
373	Spectral Filtering for Spatio-temporal Dynamics and Multivariate Forecasts Meng, Lu January 2016 Due to the increasing availability of massive spatio-temporal data sets, modeling high dimensional data becomes quite challenging. A large number of research questions are rooted in identifying underlying dynamics in such spatio-temporal data. For many applications, the science suggests that the intrinsic dynamics be smooth and of low dimension. To reduce the variance of estimates and increase the computational tractability, dimension reduction is also quite necessary in the modeling procedure. In this dissertation, we propose a spectral filtering approach for dimension reduction and forecast amelioration, and apply it to multiple applications. We show the effectiveness of dimension reduction via our method and also illustrate its power for prediction in both simulation and real data examples. The resultant lower dimensional principal component series has a diagonal spectral density at each frequency whose diagonal elements are in descending order, which is not well motivated and can be hard to interpret. Therefore we propose a phase-based filtering method to create principal component series with interpretable dynamics in the time domain. Our method is based on an approach of structural decomposition and phase-aligned construction in the frequency domain, identifying lower-rank dynamics and their components embedded in a high dimensional spatio-temporal system. In both our simulated examples and real data applications, we illustrate that the proposed method is able to separate and identify meaningful lower-rank movements. Benefiting from the zero-coherence property of the principal component series, we subsequently develop a predictive model for high-dimensional forecasting via lower-rank dynamics. 
Our modeling approach reduces the multivariate modeling task to multiple univariate modeling tasks and can be flexibly combined with regularization techniques to obtain more stable estimates and improve interpretability. The simulation results and real data analysis show that our model achieves superior forecast performance compared to the class of autoregressive models. Mathematical statistics--Data processing Dynamics Dimension reduction (Statistics) Statistics
374	Flexible Sparse Learning of Feature Subspaces Ma, Yuting January 2017 It is widely observed that the performance of many traditional statistical learning methods degenerates when confronted with high-dimensional data. One promising approach to prevent this downfall is to identify the intrinsic low-dimensional spaces where the true signals embed and to pursue the learning process on these informative feature subspaces. This thesis focuses on the development of flexible sparse learning methods of feature subspaces for classification. Motivated by the success of some existing methods, we aim at learning informative feature subspaces for high-dimensional data of complex nature with better flexibility, sparsity and scalability. The first part of this thesis is inspired by the success of distance metric learning in casting flexible feature transformations by utilizing local information. We propose a nonlinear sparse metric learning algorithm, named the sDist algorithm, which uses a boosting-based nonparametric solution to address the metric learning problem for high-dimensional data. Leveraging a rank-one decomposition of the symmetric positive semi-definite weight matrix of the Mahalanobis distance metric, we restructure a hard global optimization problem into a forward stage-wise learning of weak learners through a gradient boosting algorithm. In each step, the algorithm progressively learns a sparse rank-one update of the weight matrix by imposing an L-1 regularization. Nonlinear feature mappings are adaptively learned by a hierarchical expansion of interactions integrated within the boosting framework. Meanwhile, an early stopping rule is imposed to control the overall complexity of the learned metric. As a result, without relying on computationally intensive tools, our approach automatically guarantees three desirable properties of the final metric: positive semi-definiteness, low rank and element-wise sparsity. 
Numerical experiments show that our learning model compares favorably with the state-of-the-art methods in the current literature of metric learning. The second problem arises from the observation of high instability and feature selection bias when applying online methods to highly sparse data of large dimensionality for sparse learning problems. Due to the heterogeneity in feature sparsity, existing truncation-based methods incur slow convergence and high variance. To mitigate this problem, we introduce a stabilized truncated stochastic gradient descent algorithm. We employ a soft-thresholding scheme on the weight vector where the imposed shrinkage is adaptive to the amount of information available in each feature. The variability in the resulting sparse weight vector is further controlled by stability selection integrated with the informative truncation. To facilitate better convergence, we adopt an annealing strategy on the truncation rate. We show that, when the true parameter space is of low dimension, the stabilization with annealing strategy helps to achieve a lower regret bound in expectation. Mathematical statistics Machine learning--Statistical methods Machine learning Statistics
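The soft-thresholding step mentioned in record 374 can be sketched as follows. This is a generic per-feature soft-thresholding operator, not the thesis's stabilized algorithm: the abstract does not say how the adaptive shrinkage amounts are computed, so here they are simply passed in as a hypothetical `shrink` vector, and the function name is an assumption.

```python
def soft_threshold(weights, shrink):
    """Apply per-feature soft-thresholding to a weight vector.

    Each weight is moved toward zero by its own shrinkage amount and set
    exactly to zero if the shrinkage is at least as large as its magnitude.
    """
    if len(weights) != len(shrink):
        raise ValueError("weights and shrink must have the same length")
    result = []
    for w, g in zip(weights, shrink):
        if w > g:
            result.append(w - g)
        elif w < -g:
            result.append(w + g)
        else:
            result.append(0.0)
    return result


if __name__ == "__main__":
    # Hypothetical weights and per-feature shrinkage amounts.
    print(soft_threshold([1.0, -0.75, 0.125], [0.25, 0.25, 0.25]))
```

Passing a separate shrinkage amount per feature is what allows the truncation to differ between frequently and rarely observed features; a single shared amount recovers ordinary soft-thresholding.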
375	Essays in Cluster Sampling and Causal Inference Makela, Susanna January 2018 This thesis consists of three papers in applied statistics, specifically in cluster sampling, causal inference, and measurement error. The first paper studies the problem of estimating the finite population mean from a two-stage sample with unequal selection probabilities in a Bayesian framework. Cluster sampling is common in survey practice, and the corresponding inference has been predominantly design-based. We develop a Bayesian framework for cluster sampling and account for the design effect in the outcome modeling. In a two-stage cluster sampling design, clusters are first selected with probability proportional to cluster size, and units are then randomly sampled within selected clusters. Methodological challenges arise when the sizes of nonsampled clusters are unknown. We propose both nonparametric and parametric Bayesian approaches for predicting the cluster size, and we implement inference for the unknown cluster sizes simultaneously with inference for the survey outcome. We implement this method in Stan and use simulation studies to compare the performance of an integrated Bayesian approach to classical methods on their frequentist properties. We then apply our proposed method to the Fragile Families and Child Wellbeing study as an illustration of complex survey inference. The second paper focuses on the problem of weak instrumental variables, motivated by estimating the causal effect of incarceration on recidivism. An instrument is weak when it is only weakly predictive of the treatment of interest. Given the well-known pitfalls of weak instrumental variables, we propose a method for strengthening a weak instrument. We use a matching strategy that pairs observations to be close on observed covariates but far on the instrument. This strategy strengthens the instrument, but with the tradeoff of reduced sample size. 
To help guide the applied researcher in selecting a match, we propose simulating the power of a sensitivity analysis and design sensitivity and using graphical methods to examine the results. We also demonstrate the use of recently developed methods for identifying effect modification, which is an interaction between a pretreatment covariate and the treatment. Larger and less variable treatment effects are less sensitive to unobserved bias, so identifying when effect modification is present and which covariates may be the source is important. We undertake our study in the context of studying the causal effect of incarceration on recidivism via a natural experiment in the state of Pennsylvania, a motivating example that illustrates each component of our analysis. The third paper considers the issue of measurement error in the context of survey sampling and hierarchical models. Researchers are often interested in studying the relationship between community-level variables and individual outcomes. This approach often requires estimating the neighborhood-level variable of interest from the sampled households, which induces measurement error in the neighborhood-level covariate since not all households are sampled. Other times, neighborhood-level variables are not observed directly, and only a noisy proxy is available. In both cases, the observed variables may contain measurement error. Measurement error is known to attenuate the coefficient of the mismeasured variable, but it can also affect other coefficients in the model, and ignoring measurement error can lead to misleading inference. We propose a Bayesian hierarchical model that integrates an explicit model for the measurement error process along with a model for the outcome of interest for both sampling-induced measurement error and classical measurement error. 
Advances in Bayesian computation, specifically the development of the Stan probabilistic programming language, make the implementation of such models easy and straightforward. Statistics Mathematical statistics Cluster analysis Errors-in-variables models
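The attenuation effect described in record 375 can be illustrated with the standard textbook result for classical measurement error in a simple linear regression with one covariate. This sketch is not the thesis's Bayesian hierarchical model; the function names are hypothetical, and the sampling-error helper assumes simple random sampling of households with no finite population correction.

```python
def attenuation_factor(var_true, var_error):
    """Reliability ratio: share of the observed covariate's variance that is signal."""
    return var_true / (var_true + var_error)


def attenuated_slope(beta, var_true, var_error):
    """Large-sample value of the naive slope when the covariate is mismeasured."""
    return beta * attenuation_factor(var_true, var_error)


def sampling_error_variance(within_var, households_sampled):
    """Error variance of a neighborhood mean estimated from sampled households.

    Assumption: simple random sampling, no finite population correction.
    """
    return within_var / households_sampled


if __name__ == "__main__":
    # Hypothetical numbers: true slope 2.0, between-neighborhood variance 1.0,
    # within-neighborhood variance 4.0, four households sampled per neighborhood.
    error_var = sampling_error_variance(4.0, 4)
    print(attenuated_slope(2.0, 1.0, error_var))
```

Under these assumptions the naive slope shrinks toward zero as fewer households are sampled per neighborhood, which is the bias an explicit measurement error model is meant to correct.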
376	Bayesian variable selection for high dimensional data analysis. / CUHK electronic theses & dissertations collection January 2010 In the practice of statistical modeling, it is often desirable to have an accurate predictive model. Modern data sets usually have a large number of predictors. For example, DNA microarray gene expression data usually have few observations and a large number of variables. Hence parsimony is an especially important issue. Best-subset selection is a conventional method of variable selection. Due to the large number of variables with relatively small sample size and severe collinearity among the variables, standard statistical methods for selecting relevant variables often face difficulties. / In the third part of the thesis, we propose a Bayesian stochastic search variable selection approach for multi-class classification, which can identify relevant genes by assessing sets of genes jointly. We consider a multinomial probit model with a generalized g-prior for the regression coefficients. An efficient algorithm using simulation-based MCMC methods is developed for simulating parameters from the posterior distribution. This algorithm is robust to the choice of initial value, and produces posterior probabilities of relevant genes for biological interpretation. We demonstrate the performance of the approach with two well-known gene expression profiling data sets: leukemia data and lymphoma data. Compared with other classification approaches, our approach selects smaller numbers of relevant genes and obtains competitive classification accuracy. / The last part of the thesis concerns further research, and presents a stochastic variable selection approach with different two-level hierarchical prior distributions. These priors can be used as a sparsity-enforcing mechanism to perform gene selection for classification. 
Using simulation-based MCMC methods for simulating parameters from the posterior distribution, an efficient algorithm can be developed and implemented. / The second part of the thesis proposes a Bayesian stochastic variable selection approach for gene selection based on a probit regression model with a generalized singular g-prior distribution for regression coefficients. Using simulation-based MCMC methods for simulating parameters from the posterior distribution, an efficient and dependable algorithm is implemented. It is also shown that this algorithm is robust to the choice of initial values, and produces posterior probabilities of related genes for biological interpretation. The performance of the proposed approach is compared with other popular methods in gene selection and classification via the well known colon cancer and leukemia data sets in microarray literature. / Yang, Aijun. / Adviser: Xin-Yuan Song. / Source: Dissertation Abstracts International, Volume: 72-04, Section: B, page: . / Thesis (Ph.D.)--Chinese University of Hong Kong, 2010. / Includes bibliographical references (leaves 89-98). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. Ann Arbor, MI : ProQuest Information and Learning Company, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese. Bayesian statistical decision theory Gene expression--Statistical methods Mathematical statistics
377	Customer-centric data analysis. / 以顧客為本的數據分析 / CUHK electronic theses & dissertations collection / Yi gu ke wei ben de shu ju fen xi January 2008 With the advancement of information technology and declining hardware prices, organizations and companies are able to collect large amounts of personal data. Individual health records, product preferences and membership information are all converted into digital format. The ability to store and retrieve large amounts of electronic records benefits many parties. Useful knowledge often hides in a large pool of raw data. In many customer-centric applications, customers want to find the "best" services according to their needs. However, since different customers may have different preferences about what makes a service "best", different services are suggested to different customers accordingly. In this thesis, we study models for different customer needs. In addition, customers also want to protect their individual privacy in many applications, so we also study how individual privacy can be protected. / Wong, Chi Wing. / "June 2008." / Adviser: Ada Wai-Chee Fu. / Source: Dissertation Abstracts International, Volume: 70-03, Section: B, page: 1770. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (p. 133-137). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts in English and Chinese. / School code: 1307. Consumers--Information services Mathematical statistics--Data processing Privacy, Right of
378	Investigation on Bayesian Ying-Yang learning for model selection in unsupervised learning. / CUHK electronic theses & dissertations collection / Digital dissertation consortium January 2005 For factor analysis models, we develop an improved BYY harmony data smoothing learning criterion, BYY-HDS, by considering the dependence between the factors and observations. We make empirical comparisons of the BYY harmony empirical learning criterion BYY-HEC, BYY-HDS, the BYY automatic model selection method BYY-AUTO, AIC, CAIC, BIC, and CV for selecting the number of factors, not only on simulated data sets of different sample sizes, noise variances, data dimensions and factor numbers, but also on two real data sets from air pollution data and sport track records, respectively. / Model selection is a critical issue in unsupervised learning. Conventionally, model selection is implemented in two phases by some statistical model selection criterion such as Akaike's information criterion (AIC), Bozdogan's consistent Akaike's information criterion (CAIC), Schwarz's Bayesian inference criterion (BIC), which formally coincides with the minimum description length (MDL) criterion, and the cross-validation (CV) criterion. These methods are very time intensive and may become problematic when the sample size is small. Recently, the Bayesian Ying-Yang (BYY) harmony learning has been developed as a unified framework with new mechanisms for model selection and regularization. In this thesis we make a systematic investigation on BYY learning as well as several typical model selection criteria for model selection on factor analysis models, Gaussian mixture models, and factor analysis mixture models. / The most remarkable finding of our study is that BYY-HDS is superior to its counterparts, especially when the sample size is small. AIC, BYY-HEC, BYY-AUTO and CV have a risk of overestimating, while BIC and CAIC have a risk of underestimating in most cases. 
BYY-AUTO is superior to the other methods in terms of computational cost. The cross-validation method requires the highest computing cost. (Abstract shortened by UMI.) / Hu Xuelei. / "November 2005." / Adviser: Lei Xu. / Source: Dissertation Abstracts International, Volume: 67-07, Section: B, page: 3899. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2005. / Includes bibliographical references (p. 131-142). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. Ann Arbor, MI : ProQuest Information and Learning Company, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract in English and Chinese. / School code: 1307. Bayesian statistical decision theory Machine learning Mathematical statistics--Data processing
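The conventional two-phase procedure that record 378 contrasts with BYY learning can be sketched with the standard definitions of AIC, BIC and CAIC. The first phase (fitting every candidate model) is not shown: the log-likelihoods and parameter counts below are hypothetical placeholders, and the helper name `select_model` is an assumption for this example.

```python
import math


def aic(log_lik, k, n=None):
    """Akaike's information criterion; n is accepted but unused."""
    return -2.0 * log_lik + 2.0 * k


def bic(log_lik, k, n):
    """Schwarz's Bayesian criterion with sample size n."""
    return -2.0 * log_lik + k * math.log(n)


def caic(log_lik, k, n):
    """Bozdogan's consistent AIC."""
    return -2.0 * log_lik + k * (math.log(n) + 1.0)


def select_model(candidates, n, criterion):
    """Second phase: score every fitted candidate and return the best size.

    `candidates` maps a model size (for example a number of factors) to a
    pair (maximized log-likelihood, number of free parameters).
    """
    scores = {size: criterion(ll, k, n) for size, (ll, k) in candidates.items()}
    return min(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical phase-one results for three candidate model sizes.
    fitted = {1: (-120.0, 3), 2: (-100.0, 6), 3: (-99.0, 9)}
    for criterion in (aic, bic, caic):
        print(criterion.__name__, select_model(fitted, n=50, criterion=criterion))
```

Because every candidate size has to be fitted before any of them can be scored, the cost grows with the number of candidates, which is the time burden the abstract attributes to these criteria.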
379	Combinatorial properties of uniform designs and their applications in the constructions of low-discrepancy designs Tang, Yu 01 January 2005 No description available. Combinatorial designs and configurations Factorial experiment designs Mathematical statistics
380	Uncertainty quantification in palaeoclimate reconstruction Carson, J. January 2015 Studying the dynamics of the palaeoclimate is a challenging problem. Part of the challenge lies in the fact that our understanding must be based on only a single realisation of the climate system. With only one climate history, it is essential that palaeoclimate data are used to their full extent, and that uncertainties arising from both data and modelling are well characterised. This is the motivation behind this thesis, which explores approaches for uncertainty quantification in problems related to palaeoclimate reconstruction. We focus on uncertainty quantification problems for the glacial-interglacial cycle, namely parameter estimation, model comparison, and age estimation of palaeoclimate observations. We develop principled data assimilation schemes that allow us to assimilate palaeoclimate data into phenomenological models of the glacial-interglacial cycle. The statistical and modelling approaches we take in this thesis mean that this amounts to the task of performing Bayesian inference for multivariate stochastic differential equations that are only partially observed. One contribution of this thesis is the synthesis of recent methodological advances in approximate Bayesian computation and particle filter methods. We provide an up-to-date overview that relates the different approaches and provides new insights into their performance. Through simulation studies we compare these approaches using a common benchmark, and in doing so we highlight the relative strengths and weaknesses of each method. There are two main scientific contributions in this thesis. 
The first is that by using inference methods to jointly perform parameter estimation and model comparison, we demonstrate that the current two-stage practice of first estimating observation times, and then treating them as fixed for subsequent analysis, leads to conclusions that are not robust to the methods used for estimating the observation times. The second main contribution is the development of a novel age model based on a linear sediment accumulation model. By extending the target of the particle filter we are able to jointly perform parameter estimation, model comparison, and observation age estimation. In doing so, we are able to perform palaeoclimate reconstruction using sediment core data that takes age uncertainty in the data into account, thus solving the problem of dating uncertainty highlighted above. 519.5
