• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 2
  • 1
  • Tagged with
  • 3
  • 3
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Coefficient of intrinsic dependence: a new measure of association

Liu, Li-yu Daisy 29 August 2005 (has links)
To detect dependence among variables is an essential task in many scientific investigations. In this study we propose a new measure of association, the coefficient of intrinsic dependence (CID), which takes value in [0,1] and faithfully reflects the full range of dependence for two random variables. The CID is free of distributional and functional assumptions. It can be easily implemented and extended to multivariate situations. Traditionally, the correlation coefficient is the preferred measure of association. However, it's effectiveness is considerably compromised when the random variables are not normally distributed. Besides, the interpretation of the correlation coefficient is difficult when the data are categorical. By contrast, the CID is free of these problems. In our simulation studies, we find that the ability of the CID in differentiating different levels of dependence remains robust across different data types (categorical or continuous) and model features (linear or curvilinear). Also, the CID is particularly effective when the dependence is strong, making it a powerful tool for variable selection. As an illustration, the CID is applied to variable selection in two aspects: classification and prediction. The analysis of actual data from a study of breast cancer gene expression is included. For the classification problem, we identify a pair of genes that best classify a patient's prognosis signature, and for the prediction problem, we identify a pair of genes that best relates to the expression of a specific gene.
2

Modelos baseados no planejamento para análise de populações finitas / Design-based models for the analysis of finite populations

González Garcia, Luz Mery 23 April 2008 (has links)
Estudamos o problema de obtenção de estimadores/preditores ótimos para combinações lineares de respostas coletadas de uma população finita por meio de amostragem aleatória simples. Nesse contexto, estendemos o modelo misto para populações finitas proposto por Stanek, Singer & Lencina (2004, Journal of Statistical Planning and Inference) para casos em que se incluem erros de medida (endógenos e exógenos) e informação auxiliar. Admitindo que as variâncias são conhecidas, mostramos que os estimadores/preditores propostos têm erro quadrático médio menor dentro da classe dos estimadores lineares não viciados. Por meio de estudos de simulação, comparamos o desempenho desses estimadores/preditores empíricos, i.e., obtidos com a substituição das componentes de variância por estimativas, com aquele de competidores tradicionais. Também, estendemos esses modelos para análise de estudos com estrutura do tipo pré-teste/pós-teste. Também por intermédio de simulação, comparamos o desempenho dos estimadores empíricos com o desempenho do estimador obtido por meio de técnicas clássicas de análise de medidas repetidas e com o desempenho do estimador obtido via análise de covariância por meio de mínimos quadrados, concluindo que os estimadores/ preditores empíricos apresentaram um menor erro quadrático médio e menor vício. Em geral, sugerimos o emprego dos estimadores/preditores empíricos propostos para dados com distribuição assimétrica ou amostras pequenas. / We consider optimal estimation of finite population parameters with data obtained via simple random samples. In this context, we extend a finite population mixed model proposed by Stanek, Singer & Lencina (2004, Journal of Statistical Planning and Inference) by including measurement errors (endogenous or exogenous) and auxiliary information. Assuming that variance components are known, we show that the proposed estimators/predictors have the smallest mean squared error in the class of unbiased estimators. Using simulation studies, we compare the performance of the empirical estimators/predictors obtained by replacing variance components with estimates with the performance of a traditional estimator. We also extend the finite population mixed model to data obtained via pretest-posttest designs. Through simulation studies, we compare the performance of the empirical estimator of the difference in gain between groups with the performance of the usual repeated measures estimator and with the performance of the usual analysis of covariance estimator obtained via ordinary least squares. The empirical estimator has smaller mean squared error and bias than the alternative estimators under consideration. In general, we recommend the use of the proposed estimators/ predictors for either asymmetric response distributions or small samples.
3

Modelos baseados no planejamento para análise de populações finitas / Design-based models for the analysis of finite populations

Luz Mery González Garcia 23 April 2008 (has links)
Estudamos o problema de obtenção de estimadores/preditores ótimos para combinações lineares de respostas coletadas de uma população finita por meio de amostragem aleatória simples. Nesse contexto, estendemos o modelo misto para populações finitas proposto por Stanek, Singer & Lencina (2004, Journal of Statistical Planning and Inference) para casos em que se incluem erros de medida (endógenos e exógenos) e informação auxiliar. Admitindo que as variâncias são conhecidas, mostramos que os estimadores/preditores propostos têm erro quadrático médio menor dentro da classe dos estimadores lineares não viciados. Por meio de estudos de simulação, comparamos o desempenho desses estimadores/preditores empíricos, i.e., obtidos com a substituição das componentes de variância por estimativas, com aquele de competidores tradicionais. Também, estendemos esses modelos para análise de estudos com estrutura do tipo pré-teste/pós-teste. Também por intermédio de simulação, comparamos o desempenho dos estimadores empíricos com o desempenho do estimador obtido por meio de técnicas clássicas de análise de medidas repetidas e com o desempenho do estimador obtido via análise de covariância por meio de mínimos quadrados, concluindo que os estimadores/ preditores empíricos apresentaram um menor erro quadrático médio e menor vício. Em geral, sugerimos o emprego dos estimadores/preditores empíricos propostos para dados com distribuição assimétrica ou amostras pequenas. / We consider optimal estimation of finite population parameters with data obtained via simple random samples. In this context, we extend a finite population mixed model proposed by Stanek, Singer & Lencina (2004, Journal of Statistical Planning and Inference) by including measurement errors (endogenous or exogenous) and auxiliary information. Assuming that variance components are known, we show that the proposed estimators/predictors have the smallest mean squared error in the class of unbiased estimators. Using simulation studies, we compare the performance of the empirical estimators/predictors obtained by replacing variance components with estimates with the performance of a traditional estimator. We also extend the finite population mixed model to data obtained via pretest-posttest designs. Through simulation studies, we compare the performance of the empirical estimator of the difference in gain between groups with the performance of the usual repeated measures estimator and with the performance of the usual analysis of covariance estimator obtained via ordinary least squares. The empirical estimator has smaller mean squared error and bias than the alternative estimators under consideration. In general, we recommend the use of the proposed estimators/ predictors for either asymmetric response distributions or small samples.

Page generated in 0.1075 seconds