281 |
On course evaluation: a study of the course evaluation data for science faculty. January 2000
Yiu Tat-choi. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. / Includes bibliographical references (leaves 68-69). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Student Ratings of Instructors --- p.2 / Chapter 1.2 --- Research Plan and Difficulties Encountered in the Study --- p.4 / Chapter 2 --- Data and An Overall Picture of Study --- p.7 / Chapter 2.1 --- The Questionnaire and Data Collection Method --- p.7 / Chapter 2.2 --- Pilot Study --- p.8 / Chapter 2.3 --- Data Editing --- p.12 / Chapter 2.3.1 --- Clerical Error --- p.12 / Chapter 2.3.2 --- Strange Patterns --- p.13 / Chapter 2.4 --- Missing Items - Item Nonresponse --- p.14 / Chapter 2.5 --- Missing Items - Unit Nonresponse --- p.16 / Chapter 2.6 --- Effective Class Size --- p.21 / Chapter 2.7 --- Imputation of Item Nonresponse Data --- p.23 / Chapter 2.8 --- Overall Picture of Study --- p.25 / Chapter 3 --- Data Analysis I: Logistic Regression --- p.28 / Chapter 3.1 --- Conditional Independence --- p.29 / Chapter 3.2 --- Partial Correlation --- p.30 / Chapter 3.3 --- Simultaneous p-value --- p.31 / Chapter 3.4 --- Logit Model --- p.32 / Chapter 3.5 --- Logit Model for Ordinal Variables --- p.35 / Chapter 3.6 --- Iteratively Reweighted Least Squares (IRLS) Algorithm --- p.36 / Chapter 3.7 --- Criteria for Assessing Model Fit --- p.38 / Chapter 3.7.1 --- Assessing the Fit of the Model --- p.39 / Chapter 3.7.2 --- Pearson Chi-Square and Deviance --- p.40 / Chapter 3.8 --- Interpretation of the Coefficients of The Weighted Logistic Regression Model --- p.42 / Chapter 3.8.1 --- Nominal Independent Variable --- p.42 / Chapter 3.8.2 --- Continuous Independent Variable --- p.45 / Chapter 4 --- Data Analysis II: Adjusted Instructor Score --- p.49 / Chapter 4.1 --- Removing Effects of Class Characteristics Factor and Adjusting the Score --- p.50 / Chapter 4.2 --- Adjusted Instructor Score (AIS) --- p.54 / Chapter 4.3 --- Estimate Standard Error of AIS by Bootstrap Method --- p.55 / Chapter 5 --- Conclusion --- p.58 / Chapter 5.1 --- Comparison Between the AIS and Average Score --- p.58 / Chapter 5.2 --- Discussion --- p.60 / Appendix A1: Course Evaluation Survey Form --- p.63 / Appendix A2: Course Evaluation Supplementary Form --- p.64 / Appendix B: Descriptive Statistics for Response Rate --- p.65 / Appendix C: The Descriptions of Class Characteristics Dummy Variables --- p.67 / Bibliography --- p.68
|
282 |
Experimental studies of the statistical properties of coherent thermal structures in turbulent Rayleigh-Bénard convection = 湍動對流中相干熱結构統計性質的實驗硏究 (Tuan dong dui liu zhong xiang gan re jie gou tong ji xing zhi de shi yan yan jiu). January 2000
Zhou Sheng-qi. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. / Includes bibliographical references (leaves 66-70). / Text in English; abstracts in English and Chinese. / Abstract (in Chinese) --- p.i / Abstract (in English) --- p.ii / Acknowledgement --- p.iii / Table of Contents --- p.iv / List of Figures --- p.vi / List of Tables --- p.viii / Chapter 1. --- Introduction --- p.1 / Chapter 1.1 --- Turbulence: a Universal Problem --- p.1 / Chapter 1.2 --- Rayleigh-Benard Convection --- p.2 / Chapter 1.2.1 --- The History of Rayleigh-Benard Convection --- p.2 / Chapter 1.2.2 --- The Dimensionless Parameters --- p.4 / Chapter 1.2.3 --- The Physical Picture of Turbulent Convection --- p.5 / Chapter 1.3 --- Motivation of This Study --- p.8 / Chapter 2. --- Theoretical Base and Experimental Setup --- p.11 / Chapter 2.1 --- The Rayleigh-Benard problem --- p.11 / Chapter 2.1.1 --- The Boussinesq approximation --- p.11 / Chapter 2.1.2 --- The Convection Equation --- p.13 / Chapter 2.2 --- Experimental Setup and Measurement --- p.14 / Chapter 2.2.1 --- The Convection Cell --- p.14 / Chapter 2.2.2 --- The Power Supply and the Refrigerated Recirculator --- p.19 / Chapter 2.2.3 --- The Temperature Probes --- p.19 / Chapter 2.2.4 --- The Temperature Measurement System --- p.20 / Chapter 2.2.5 --- Building up the Convection State --- p.25 / Chapter 3. --- Temperature Power Spectra and the Viscous Boundary Layer in the Thermal Turbulence --- p.27 / Chapter 3.1 --- The Power Spectra Method --- p.27 / Chapter 3.2 --- The Suspicions of the Power Spectra Method --- p.30 / Chapter 3.3 --- Discussion of the Experimental Results --- p.32 / Chapter 3.4 --- Summary --- p.39 / Chapter 4. --- The Correlation Function of Temperature --- p.40 / Chapter 4.1 --- Preparation of Experiment --- p.41 / Chapter 4.1.1 --- Apparatus --- p.41 / Chapter 4.1.2 --- Definition of correlation function --- p.41 / Chapter 4.2 --- Results and Discussion --- p.44 / Chapter 4.2.1 --- The Delay Time (τ0) --- p.47 / Chapter 4.2.2 --- The Maximum Correlation Coefficient (R) --- p.52 / Chapter 4.2.3 --- The Half Width (τh) --- p.58 / Chapter 4.3 --- Summary --- p.61 / Chapter 5. --- Conclusions --- p.63 / References --- p.66
|
283 |
Comparing relative predictive power through squared multiple correlations in within-sample regression analysis. January 2008
Cheung, Yu Hin Ray. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (leaves 47-49). / Abstracts in English and Chinese. / Chapter CHAPTER ONE: --- INTRODUCTION --- p.1 / Chapter CHAPTER TWO: --- A UNIFIED BOOTSTRAP PROCEDURE --- p.7 / Chapter CHAPTER THREE: --- A SIMULATION STUDY --- p.10 / Chapter CHAPTER FOUR: --- RESULTS --- p.18 / Chapter CHAPTER FIVE: --- DISCUSSION --- p.33 / Chapter CHAPTER SIX: --- CONCLUSION --- p.37 / APPENDICES --- p.38 / REFERENCES --- p.46
|
284 |
Bayesian approach for two model-selection-related bioinformatics problems. / CUHK electronic theses & dissertations collection. January 2013
The Bayesian approach is a powerful framework for inferring the parameters and structures of complicated probabilistic models from data. It is widely applied in many areas and is also well suited to bioinformatics problems, which are usually highly complex. In this thesis, new Bayesian models and computing methods are introduced to solve two bioinformatics problems, both related to model selection. / The first problem concerns repeat pattern recognition. Tandem repeats occur frequently in DNA sequences and are important for studying genome evolution and human disease. This thesis focuses on the case in which an unknown number of tandem repeat segments of the same pattern are dispersed throughout a sequence. A probabilistic generative model is introduced for the tandem repeats. Markov chain Monte Carlo algorithms are used to explore the posterior distribution in order to infer both the pattern matrix of the tandem repeats and the locations of the repeat segments. Furthermore, reversible jump Markov chain Monte Carlo (RJMCMC) algorithms are used to address the transdimensional model selection problem raised by the variable number of repeat segments. / The second part of this thesis concerns the conformational transitions of biomolecules. Because the function of a biomolecule is inherently related to its variable conformations, which can be grouped into a set of metastable or long-lived states, conformational transitions are important in biological processes. The 3D structural changes are generally obtained from molecular dynamics simulation. Based on the microstate-level conformational transitions from molecular dynamics simulation, a Bayesian approach is developed to cluster the microstates into an unknown number of metastable states, which gives rise to a model selection problem. / With these two problems, this thesis shows that the Bayesian approach to bioinformatics problems has advantages in accounting for the inherent uncertainty in biological data, handling noisy or missing data, and dealing with the model selection problem. / Detailed summary in vernacular field only. / Liang, Tong. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2013. / Includes bibliographical references (leaves 120-130). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts also in Chinese. 
/ Abstract --- p.i / Acknowledgement --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Motivation --- p.1 / Chapter 1.2 --- Statistical Background --- p.2 / Chapter 1.3 --- Tandem Repeats --- p.4 / Chapter 1.4 --- Conformational Space --- p.5 / Chapter 1.5 --- Outlines --- p.7 / Chapter 2 --- Preliminaries --- p.9 / Chapter 2.1 --- Bayesian Inference --- p.9 / Chapter 2.2 --- Markov chain Monte Carlo --- p.10 / Chapter 2.2.1 --- Gibbs sampling --- p.11 / Chapter 2.2.2 --- Metropolis - Hastings algorithm --- p.12 / Chapter 2.2.3 --- Reversible Jump MCMC --- p.12 / Chapter 3 --- Detection of Dispersed Short Tandem Repeats Using Reversible Jump MCMC --- p.14 / Chapter 3.1 --- Background --- p.14 / Chapter 3.2 --- Generative Model --- p.17 / Chapter 3.3 --- Statistical inference --- p.18 / Chapter 3.3.1 --- Likelihood --- p.19 / Chapter 3.3.2 --- Prior Distributions --- p.19 / Chapter 3.3.3 --- Sampling from Posterior Distribution via RJMCMC --- p.20 / Chapter 3.3.4 --- Extra MCMC moves for better mixing --- p.26 / Chapter 3.3.5 --- The complete algorithm --- p.29 / Chapter 3.4 --- Experiments --- p.29 / Chapter 3.4.1 --- Evaluation and comparison of the two RJMCMC versions using synthetic data --- p.30 / Chapter 3.4.2 --- Comparison with existing methods using synthetic data --- p.33 / Chapter 3.4.3 --- Sensitivity to Priors --- p.43 / Chapter 3.4.4 --- Real data experiment --- p.45 / Chapter 3.5 --- Discussion --- p.50 / Chapter 4 --- A Probabilistic Clustering Algorithm for Conformational Changes of Biomolecules --- p.53 / Chapter 4.1 --- Introduction --- p.53 / Chapter 4.1.1 --- Molecular dynamic simulation --- p.54 / Chapter 4.1.2 --- Hierarchical Conformational Space --- p.55 / Chapter 4.1.3 --- Clustering Algorithms --- p.56 / Chapter 4.2 --- Generative Model --- p.58 / Chapter 4.2.1 --- Model 1: Vanilla Model --- p.59 / Chapter 4.2.2 --- Model 2: Zero-Inflated Model --- p.60 / Chapter 4.2.3 --- Model 3: Constrained Model --- p.61 / Chapter 4.2.4 
--- Model 4: Constrained and Zero-Inflated Model --- p.61 / Chapter 4.3 --- Statistical Inference for Vanilla Model --- p.62 / Chapter 4.3.1 --- Priors --- p.62 / Chapter 4.3.2 --- Posterior distribution --- p.63 / Chapter 4.3.3 --- Collapsed Gibbs for Vanilla Model with a Fixed Number of Clusters --- p.63 / Chapter 4.3.4 --- Inference on the Number of Clusters --- p.65 / Chapter 4.3.5 --- Synthetic Data Study --- p.68 / Chapter 4.4 --- Statistical Inference for Zero-Inflated Model --- p.76 / Chapter 4.4.1 --- Method 1 --- p.78 / Chapter 4.4.2 --- Method 2 --- p.81 / Chapter 4.4.3 --- Synthetic Data Study --- p.84 / Chapter 4.5 --- Statistical Inference for Constrained Model --- p.85 / Chapter 4.5.1 --- Priors --- p.85 / Chapter 4.5.2 --- Posterior Distribution --- p.86 / Chapter 4.5.3 --- Collapsed Posterior Distribution --- p.86 / Chapter 4.5.4 --- Updating for Cluster Labels K --- p.89 / Chapter 4.5.5 --- Updating for Constrained Λ from Truncated Distribution --- p.89 / Chapter 4.5.6 --- Updating the Number of Clusters --- p.91 / Chapter 4.5.7 --- Uniform Background Parameters on Λ --- p.92 / Chapter 4.6 --- Real Data Experiments --- p.93 / Chapter 4.7 --- Discussion --- p.104 / Chapter 5 --- Conclusion and Future Work --- p.107 / Chapter A --- Appendix --- p.109 / Chapter A.1 --- Post-processing for indel treatment --- p.109 / Chapter A.2 --- Consistency Score --- p.111 / Chapter A.3 --- A Proof for Collapsed Posterior distribution in Constrained Model in Chapter 4 --- p.111 / Chapter A.4 --- Estimated Transition Matrices for Alanine Dipeptide by Chodera et al. (2006) --- p.117 / Bibliography --- p.120
|
285 |
Exact simulation and importance sampling of diffusion process. / CUHK electronic theses & dissertations collection. January 2012
With increased innovation and competition in today's financial markets, financial products have become more and more complicated, which calls for advanced techniques in pricing, hedging and risk management. Monte Carlo simulation is among the most popular of these techniques because of its great flexibility. This dissertation addresses two problems that have recently arisen and received much attention from both the financial engineering and simulation communities: (1) localization and exact simulation of stochastic differential equations driven by Brownian motion; and (2) Brownian meanders, importance sampling and unbiased simulation of diffusion extremes. / The first essay considers generating sample paths of stochastic differential equations (SDEs) using the Monte Carlo method. Discretization is a popular approximate approach to generating those paths: it is easy to implement but prone to simulation bias. This essay presents a new simulation scheme that generates exact samples for SDEs. The key observation is that the law of a general SDE can be decomposed into a product of the law of standard Brownian motion and the law of a doubly stochastic Poisson process. An acceptance-rejection algorithm is devised based on the combination of this decomposition and a localization technique. The numerical results corroborate that the mean-square error of the proposed method converges at the rate O(t^{-1/2}) in the computation time t, which is superior to the conventional discretization schemes. Furthermore, the proposed method can also generate exact samples for SDEs with boundaries, which discretization schemes usually have difficulty handling. / The second essay considers computing expected values of functions involving extreme values of diffusion processes. The conventional discretization-based Monte Carlo simulation schemes often converge very slowly. In this paper, we propose an approach based on a Wiener measure decomposition to construct unbiased Monte Carlo estimators. Combined with the importance sampling technique and the celebrated Williams path decomposition of Brownian motion, this approach reduces the task of simulating extreme values of a general diffusion process to the simulation of two Brownian meanders. The numerical experiments show the accuracy and efficiency of our Poisson-kernel unbiased estimators. / Detailed summary in vernacular field only. / Huang, Zhengyu. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2012. / Includes bibliographical references (leaves 107-115). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese. 
/ Abstract --- p.i / Acknowledgement --- p.iv / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Background --- p.1 / Chapter 1.2 --- SDEs and Discretization Methods --- p.4 / Chapter 1.3 --- The Beskos-Roberts Exact Simulation --- p.15 / Chapter 1.4 --- Major Contributions --- p.19 / Chapter 1.5 --- Organization --- p.26 / Chapter 2 --- Localization and Exact Simulation of SDEs --- p.27 / Chapter 2.1 --- Main Result: A Localization Technique --- p.27 / Chapter 2.1.1 --- Sampling of ζ --- p.33 / Chapter 2.1.2 --- Sampling of Wζ^(T-t) --- p.35 / Chapter 2.1.3 --- Sampling of the Bernoulli I --- p.38 / Chapter 2.1.4 --- Comparison Involving Infinite Sums --- p.40 / Chapter 2.2 --- Discussions --- p.43 / Chapter 2.2.1 --- One Extension: SDEs with Boundaries --- p.43 / Chapter 2.2.2 --- Simulation Efficiency --- p.45 / Chapter 2.2.3 --- Extension to Multi-dimensional SDE --- p.48 / Chapter 2.3 --- Numerical Examples --- p.52 / Chapter 2.3.1 --- Ornstein-Uhlenbeck Mean-Reverting Process --- p.52 / Chapter 2.3.2 --- A Double-Well Potential Model --- p.56 / Chapter 2.3.3 --- Cox-Ingersoll-Ross Square-Root Process --- p.56 / Chapter 2.3.4 --- Linear-Drift CEV-Type-Diffusion Model --- p.62 / Chapter 2.4 --- Appendix --- p.62 / Chapter 2.4.1 --- Simulation of Brownian Bridges --- p.62 / Chapter 2.4.2 --- Proofs of Main Results --- p.64 / Chapter 2.4.3 --- The Oscillating Property of the Series --- p.71 / Chapter 3 --- Unbiased Simulation of Diffusion Extremes --- p.79 / Chapter 3.1 --- A Wiener Measure Decomposition --- p.79 / Chapter 3.2 --- Brownian Meanders and Importance Sampler of Diffusion Extremes --- p.81 / Chapter 3.2.1 --- Exact Simulation of (θT, KT, WT) --- p.83 / Chapter 3.2.2 --- Simulating Importance Sampling Weight --- p.84 / Chapter 3.3 --- Some Extensions --- p.88 / Chapter 3.3.1 --- Variance Reduction --- p.88 / Chapter 3.3.2 --- Double Barrier Options --- p.90 / Chapter 3.4 --- Numerical Examples --- p.94 / Chapter 3.5 --- Appendix --- p.98 / Chapter 
3.5.1 --- Brownian Bridges and Meanders --- p.98 / Chapter 3.5.2 --- Proofs of Main Results --- p.101 / Bibliography --- p.107
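The discretization bias mentioned in the abstract of this record can be illustrated on the Ornstein-Uhlenbeck process, one of the thesis's own numerical examples. The following is only a minimal sketch, not the localization or acceptance-rejection scheme of the thesis: it compares the Euler-Maruyama scheme with the closed-form Gaussian transition law that happens to be available for this particular SDE. The function names and parameter values are illustrative assumptions.

```python
import math
import random


def euler_ou(x0, kappa, theta, sigma, T, n_steps, rng):
    """Euler-Maruyama approximation of X_T for dX = kappa*(theta - X) dt + sigma dW."""
    dt = T / n_steps
    x = x0
    for _ in range(n_steps):
        x += kappa * (theta - x) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x


def exact_ou(x0, kappa, theta, sigma, T, rng):
    """Draw X_T from the known Gaussian transition law of the OU process."""
    mean = theta + (x0 - theta) * math.exp(-kappa * T)
    var = sigma ** 2 * (1.0 - math.exp(-2.0 * kappa * T)) / (2.0 * kappa)
    return mean + math.sqrt(var) * rng.gauss(0.0, 1.0)


def euler_mean(x0, kappa, theta, T, n_steps):
    """Mean of the Euler scheme; each noise increment has mean zero."""
    dt = T / n_steps
    return theta + (x0 - theta) * (1.0 - kappa * dt) ** n_steps


def exact_mean(x0, kappa, theta, T):
    """True mean of X_T."""
    return theta + (x0 - theta) * math.exp(-kappa * T)


if __name__ == "__main__":
    # Hypothetical parameters: x0 = 1, kappa = 1, theta = 0, T = 1.
    for n in (1, 10, 100):
        bias = abs(euler_mean(1.0, 1.0, 0.0, 1.0, n) - exact_mean(1.0, 1.0, 0.0, 1.0))
        print(n, round(bias, 4))
```

The bias of the Euler mean shrinks as the number of steps grows but never vanishes, whereas the exact sampler has no such bias at any cost level. For general SDEs no closed-form transition law exists, which is the gap an exact simulation scheme is meant to fill.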
|
286 |
Statistical pattern recognition based structural health monitoring strategies. Balsamo, Luciana. January 2015
Structural Health Monitoring (SHM) is concerned with the analysis of aerospace, mechanical and civil systems with the objective of identifying damage at its onset. In civil engineering applications, damage may be defined as any change in the structural properties that hinders the current or future performance of the system. This is the premise on which vibration-based techniques rest: they exploit the response measured directly on the system to solve the SHM task. However, fluctuations in the external conditions may also induce changes in the structural properties. For these reasons, the SHM problem is ideally suited to be solved within the context of statistical pattern recognition, the discipline concerned with the automatic classification of objects into categories. Within the statistical pattern recognition based SHM framework, the structural response is portrayed by means of a compact representation of its main traits, called damage sensitive features (dsf). In this dissertation, two types of dsf are studied: the first is extracted from the response of the system by means of digital signal processing alone, while the second is obtained by making use of a physical model of the system. In both approaches, the effects of external conditions are accounted for by modeling the damage sensitive features as random variables. The first method uses outlier analysis tools and is best suited to damage detection within the short-term horizon; the second approach, being model-based, allows for a deeper characterization of damage and is therefore better suited to long-term monitoring. The dissertation also proposes an approach that allows the statistical pattern recognition framework to be used when limited data are available to model the damage sensitive features. All proposed methodologies are validated both numerically and experimentally.
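One common form of outlier analysis on damage sensitive features is a Mahalanobis squared distance test against a baseline of healthy-state features. The sketch below is a generic two-feature illustration under a Gaussian assumption, not the dissertation's method; the baseline values, function names and the 1% false-alarm level are hypothetical.

```python
import math


def mean_and_cov_2d(samples):
    """Sample mean and unbiased covariance (sxx, sxy, syy) of 2-D feature vectors."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance using the explicit inverse of a 2x2 covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def chi2_threshold_2dof(alpha):
    """Upper-alpha quantile of a chi-square with 2 degrees of freedom: -2 ln(alpha)."""
    return -2.0 * math.log(alpha)


def is_outlier(x, mean, cov, alpha=0.01):
    """Flag a new feature vector whose distance exceeds the threshold."""
    return mahalanobis_sq(x, mean, cov) > chi2_threshold_2dof(alpha)


if __name__ == "__main__":
    # Hypothetical baseline features measured on the undamaged structure.
    baseline = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
    mean, cov = mean_and_cov_2d(baseline)
    for candidate in [(1.0, 1.0), (3.0, 1.0), (6.0, 1.0)]:
        print(candidate, mahalanobis_sq(candidate, mean, cov), is_outlier(candidate, mean, cov))
```

Because the mean and covariance are estimated from the baseline, the chi-square threshold is only approximate, especially for small baselines; treating the features as random variables is what lets ordinary environmental variability be absorbed into the covariance rather than flagged as damage.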
|
287 |
Flexible Sparse Learning of Feature Subspaces. Ma, Yuting. January 2017
It is widely observed that the performance of many traditional statistical learning methods degrades when they are confronted with high-dimensional data. One promising way to prevent this degradation is to identify the intrinsic low-dimensional spaces in which the true signals are embedded and to carry out the learning process on these informative feature subspaces. This thesis focuses on the development of flexible sparse learning methods of feature subspaces for classification. Motivated by the success of some existing methods, we aim at learning informative feature subspaces for high-dimensional data of complex nature with better flexibility, sparsity and scalability.
The first part of this thesis is inspired by the success of distance metric learning in casting flexible feature transformations by utilizing local information. We propose a nonlinear sparse metric learning algorithm, named sDist, that uses a boosting-based nonparametric solution to address the metric learning problem for high-dimensional data. Leveraging a rank-one decomposition of the symmetric positive semi-definite weight matrix of the Mahalanobis distance metric, we restructure a hard global optimization problem into a forward stage-wise learning of weak learners through a gradient boosting algorithm. In each step, the algorithm progressively learns a sparse rank-one update of the weight matrix by imposing an L1 regularization. Nonlinear feature mappings are adaptively learned by a hierarchical expansion of interactions integrated within the boosting framework. Meanwhile, an early stopping rule is imposed to control the overall complexity of the learned metric. As a result, without relying on computationally intensive tools, our approach automatically guarantees three desirable properties of the final metric: positive semi-definiteness, low rank and element-wise sparsity. Numerical experiments show that our learning model compares favorably with the state-of-the-art methods in the current literature of metric learning.
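The structural point in this paragraph, that a sum of nonnegatively weighted sparse rank-one terms is automatically positive semi-definite and low rank, can be sketched in a few lines. This is not the sDist algorithm itself: the weights and sparse directions below are hypothetical inputs standing in for whatever a boosting procedure would learn.

```python
def rank_one_distance(x, y, updates):
    """Distance (x-y)^T M (x-y) for M = sum_k w_k * xi_k xi_k^T.

    `updates` is a list of (w_k, xi_k) pairs with w_k >= 0 and xi_k stored
    sparsely as a dict {feature_index: value}. M is never formed explicitly.
    """
    total = 0.0
    for weight, direction in updates:
        proj = sum(v * (x[i] - y[i]) for i, v in direction.items())
        total += weight * proj * proj  # each term is nonnegative, so M is PSD
    return total


def dense_weight_matrix(dim, updates):
    """Explicit M, built only to check the sparse computation above."""
    m = [[0.0] * dim for _ in range(dim)]
    for weight, direction in updates:
        for i, vi in direction.items():
            for j, vj in direction.items():
                m[i][j] += weight * vi * vj
    return m


def quadratic_form(diff, m):
    """diff^T M diff for a dense matrix M."""
    return sum(diff[i] * m[i][j] * diff[j] for i in range(len(diff)) for j in range(len(diff)))


if __name__ == "__main__":
    # Two hypothetical sparse rank-one updates in a 4-dimensional feature space.
    updates = [(0.5, {0: 1.0, 2: -1.0}), (2.0, {1: 1.0})]
    x, y = [1.0, 0.0, 2.0, 0.0], [0.0, 0.0, 0.0, 0.0]
    print(rank_one_distance(x, y, updates))
    print(quadratic_form([a - b for a, b in zip(x, y)], dense_weight_matrix(4, updates)))
```

The rank of M is at most the number of updates, and its nonzero entries are confined to the index sets of the sparse directions, which is how the three properties listed above can hold by construction rather than through a projection step.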
The second problem arises from the observation of high instability and feature selection bias when online methods are applied to highly sparse, high-dimensional data in sparse learning problems. Because of the heterogeneity in feature sparsity, existing truncation-based methods suffer from slow convergence and high variance. To mitigate this problem, we introduce a stabilized truncated stochastic gradient descent algorithm. We employ a soft-thresholding scheme on the weight vector in which the imposed shrinkage is adaptive to the amount of information available in each feature. The variability in the resulting sparse weight vector is further controlled by stability selection integrated with the informative truncation. To facilitate better convergence, we adopt an annealing strategy on the truncation rate. We show that, when the true parameter space is of low dimension, stabilization with the annealing strategy helps to achieve a lower regret bound in expectation.
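The soft-thresholding step at the core of truncation-based methods can be sketched as follows. This is a generic truncated gradient step, not the stabilized algorithm of the thesis: the per-feature thresholds are passed in as plain inputs, and how they should adapt to the information in each feature is deliberately left open. All names and values are illustrative assumptions.

```python
def soft_threshold(w, t):
    """Shrink w toward zero by t, setting it to exactly zero if |w| <= t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def truncated_sgd_step(weights, grad, lr, thresholds):
    """One gradient step followed by per-feature soft-thresholding."""
    return [
        soft_threshold(w - lr * g, t)
        for w, g, t in zip(weights, grad, thresholds)
    ]


if __name__ == "__main__":
    # Hypothetical values; the third feature receives a larger shrinkage.
    weights = [0.5, -0.2, 0.05]
    grad = [0.1, 0.0, -0.1]
    thresholds = [0.1, 0.1, 0.2]
    print(truncated_sgd_step(weights, grad, lr=0.5, thresholds=thresholds))
```

With a single global threshold, a feature that appears in few examples receives few gradient updates yet is shrunk as often as every other feature, so its weight tends to be truncated to zero regardless of its relevance. Allowing the threshold to differ per feature is the hook through which an adaptive scheme can counter that bias.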
|
288 |
Selected Legal Applications for Bayesian Methods. Cheng, Edward K. January 2018
This dissertation offers three contexts in which Bayesian methods can address tricky problems in the legal system. Chapter 1 offers a method for attacking case publication bias, the possibility that certain legal outcomes may be more likely to be published or observed than others. It builds on ideas from multiple systems estimation (MSE), a technique traditionally used for estimating hidden populations, to detect and correct case publication bias. Chapter 2 proposes new methods for dividing attorneys' fees in complex litigation involving multiple firms. It investigates optimization and statistical approaches that use peer reports of each firm's relative contribution to estimate a "fair" or consensus division of the fees. The methods proposed have lower informational requirements than previous work and appear to be robust to collusive behavior by the firms. Chapter 3 introduces a statistical method for classifying legal cases by doctrinal area or subject matter. It proposes using a latent space approach based on case citations as an alternative to the traditional manual coding of cases, reducing subjectivity, arbitrariness, and confirmation bias in the classification process.
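The idea behind multiple systems estimation in Chapter 1 can be seen in its simplest two-list form, the classical capture-recapture estimator. The sketch below is not the Bayesian method of the dissertation; it assumes two independent lists (for example, two hypothetical case databases) and uses the Lincoln-Petersen estimator together with Chapman's small-sample variant. The counts are invented for illustration.

```python
def lincoln_petersen(n1, n2, m):
    """Estimate total population size from two lists of sizes n1 and n2 with m items in both."""
    if m == 0:
        raise ValueError("no overlap between lists; the estimator is undefined")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's variant, which stays finite when the overlap is zero."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


if __name__ == "__main__":
    # Hypothetical counts: 200 cases in source A, 150 in source B, 50 in both.
    n1, n2, m = 200, 150, 50
    total = lincoln_petersen(n1, n2, m)
    observed = n1 + n2 - m
    print(total, observed, total - observed)
```

The difference between the estimated total and the number of distinct observed cases is the estimated size of the hidden population. With more than two lists, or when the lists are dependent, richer models are needed, which is where multiple systems estimation proper comes in.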
|
289 |
A Three-Paper Dissertation on Longitudinal Data Analysis in Education and Psychology. Ahmadi, Hedyeh. January 2019
In longitudinal settings, modeling the covariance structure of repeated measure data is essential for proper analysis. The first paper in this three-paper dissertation presents a survey of four journals in the fields of Education and Psychology to identify the most commonly used methods for analyzing longitudinal data. It provides literature reviews and statistical details for each identified method. This paper also offers a summary table giving the benefits and drawbacks of all the surveyed methods in order to help researchers choose the optimal model according to the structure of their data. Finally, this paper highlights that even when scholars do use more advanced methods for analyzing repeated measure data, they very rarely report (or explore in their discussions) the covariance structure implemented in their choice of modeling. This suggests that, at least in some cases, researchers may not be taking advantage of the optimal covariance patterns. This paper identifies a gap in the standard statistical practices of the fields of Education and Psychology, namely that researchers are not modeling the covariance structure as an extension of fixed/random effects modeling. The second paper introduces the General Serial Covariance (GSC) approach, an extension of the Linear Mixed Modeling (LMM) or Hierarchical Linear Model (HLM) techniques that models the covariance structure using spatial correlation functions such as Gaussian, Exponential, and other patterns. These spatial correlations model the covariance structure in a continuous manner and therefore can deal with missingness and imbalanced data in a straightforward way. A simulation study in the second paper reveals that when data are consistent with the GSC model, using basic HLMs is not optimal for the estimation and testing of the fixed effects. 
The third paper is a tutorial that uses a real-world data set from a drug abuse prevention intervention to demonstrate the use of the GSC and basic HLM models in the R programming language. This paper utilizes variograms (a visualization tool borrowed from geostatistics), among other exploratory tools, to determine the covariance structure of the repeated measure data. This paper aims to introduce the GSC model and variogram plots to Education and Psychology, where, according to the survey in the first paper, they are not in use. This paper can also help scholars seeking guidance for interpreting the fixed-effect parameters.
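The continuous-time covariance idea behind spatial correlation functions can be sketched directly. The example below is written in Python rather than the R used in the dissertation, and it is only an illustration under assumed parameterizations: exponential correlation exp(-d/phi), Gaussian correlation exp(-(d/phi)^2), plus a measurement-error term on the diagonal. The parameter names and values are hypothetical, and conventions for the range parameter differ between software packages.

```python
import math


def exponential_corr(d, phi):
    """Exponential correlation at time separation d with range parameter phi."""
    return math.exp(-d / phi)


def gaussian_corr(d, phi):
    """Gaussian correlation at time separation d with range parameter phi."""
    return math.exp(-((d / phi) ** 2))


def serial_covariance(times, sigma2, phi, nugget, corr=exponential_corr):
    """Within-subject covariance matrix for measurements taken at arbitrary times."""
    n = len(times)
    return [
        [
            sigma2 * corr(abs(times[i] - times[j]), phi) + (nugget if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    # A hypothetical subject measured at unequally spaced occasions (one visit missed).
    times = [0.0, 1.0, 3.0]
    for row in serial_covariance(times, sigma2=2.0, phi=1.0, nugget=0.5):
        print([round(v, 4) for v in row])
```

Because the covariance depends only on the time separation, each subject's matrix is built from that subject's own measurement times. This is why missing occasions and unbalanced designs need no special treatment under this kind of structure.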
|
290 |
Testing mediating effects with structural equation modeling: problems and solutions. January 2004
Lau Suk Yin Rebecca. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. / Includes bibliographical references (leaves 111-117). / Abstracts in English and Chinese. / ABSTRACT (ENGLISH) --- p.i / ABSTRACT (CHINESE) --- p.iii / ACKNOWLEDGEMENT --- p.iv / TABLE OF CONTENTS --- p.v / LIST OF TABLES --- p.viii / LIST OF FIGURES --- p.ix / Chapter CHAPTER I --- INTRODUCTION --- p.1 / Chapter CHAPTER II --- LITERATURE REVIEW --- p.6 / Chapter 2.1 --- Definition of Mediating Effects --- p.6 / Chapter 2.2 --- Approaches to Mediational Analyses --- p.12 / Chapter 2.2.1 --- Correlation Approach --- p.13 / Chapter 2.2.2 --- Hierarchical Regression Approach --- p.17 / Chapter 2.2.3 --- SEM Approach --- p.39 / Chapter 2.3 --- Summary --- p.44 / Chapter CHAPTER III --- A TEST FOR THE SIGNIFICANCE OF MEDIATING EFFECTS IN SEM --- p.47 / Chapter 3.1 --- A Significance Test for the Mediating Effects with SEM --- p.48 / Chapter 3.1.1 --- Model without Mediating Effects --- p.48 / Chapter 3.1.2 --- Model with Full Mediation --- p.49 / Chapter 3.1.3 --- Model with Partial Mediation --- p.49 / Chapter 3.1.4 --- Model with Suppression --- p.50 / Chapter 3.2 --- Procedure for Testing the Significance of Mediating Effects in SEM --- p.50 / Chapter 3.3 --- Summary --- p.56 / Chapter CHAPTER IV --- MODEL COMPARISON IN SEM --- p.59 / Chapter 4.2 --- Testing the Significance of Mediating Effects with ΔFIs --- p.61 / Chapter CHAPTER V --- METHODOLOGY OF SIMULATION --- p.65 / Chapter 5.1 --- Resampling Space Generation --- p.65 / Chapter 5.2 --- Sample Generation and Method of Analysis --- p.67 / Chapter CHAPTER VI --- SIMULATION RESULTS AND DISCUSSION --- p.73 / Chapter 6.1 --- Simulation Results --- p.73 / Chapter 6.1.1 --- Variance Explained by Model Characteristics --- p.73 / Chapter 6.1.1.1 --- Variance Explained Under the Condition of No Mediation --- p.80 / Chapter 6.1.1.2 --- Variance Explained Under the Condition of Mediating Effects at 0.1 --- p.81 / Chapter 6.1.1.2.1 --- 
Variance Explained by Factor Loadings --- p.81 / Chapter 6.1.1.2.2 --- Variance Explained by Sample Size --- p.82 / Chapter 6.1.1.2.3 --- Variance Explained by Number of Items --- p.83 / Chapter 6.1.1.2.4 --- "Variance Explained by 2-Way Interactions of Factor Loadings, Sample Size and Number of Items" --- p.83 / Chapter 6.1.2 --- Correlation between FIs and ΔFIs --- p.84 / Chapter 6.2 --- Simulation Result Discussion --- p.88 / Chapter CHAPTER VII --- NUMERICAL EXAMPLE --- p.91 / Chapter 7.1 --- Testing Mediating Effects in a Model in Past Literature --- p.91 / Chapter 7.2 --- Summary --- p.94 / Chapter CHAPTER VIII --- DISCUSSION --- p.96 / Chapter 8.1 --- Limitations and Directions for Future Research --- p.101 / APPENDIX / Chapter APPENDIX I --- Syntax for Testing the Significance of Mediating Effects (Unconstrained Model) / Chapter APPENDIX II --- Syntax for Testing the Significance of Mediating Effects (Constrained Model) / Chapter APPENDIX III --- Syntax for Testing Full Mediation --- p.106 / Chapter APPENDIX IV --- "Syntax for Testing Mediating Effects in Model by Foley, Kidder & Powell (2002) (DV: Intentions to Leave) (Unconstrained Model)" --- p.107 / Chapter APPENDIX V --- "Syntax for Testing Mediating Effects in Model by Foley, Kidder & Powell (2002) (DV: Intentions to Leave) (Constrained Model)" --- p.108 / Chapter APPENDIX VI --- "Syntax for Testing Mediating Effects in Model by Foley, Kidder & Powell (2002) (DV: Perceived Career Prospects) (Unconstrained Model)" --- p.109 / Chapter APPENDIX VII --- "Syntax for Testing Mediating Effects in Model by Foley, Kidder & Powell (2002) (DV: Perceived Career Prospects) (Constrained Model)" --- p.110 / REFERENCES --- p.111
|