1 |
A covariate-adjusted classification model for multiple biomarkers in disease screening and diagnosis / Yu, Suizhi / January 1900
Doctor of Philosophy / Department of Statistics / Wei-Wen Hsu / Classification methods based on a linear combination of multiple biomarkers have been widely used to improve accuracy in disease screening and diagnosis. However, covariates such as gender and age at diagnosis are seldom included in these classification procedures. Because biomarkers and patient outcomes are often associated with covariates in practice, including covariates may further improve predictive power as well as classification accuracy. In this study, we focus on classification methods for multiple biomarkers that adjust for covariates. First, we propose a covariate-adjusted classification model for multiple cross-sectional biomarkers. Technically, it is a two-stage method that first combines the biomarkers with a parametric or non-parametric approach and then incorporates covariates using maximum rank correlation estimators. Specifically, the coefficients associated with the covariates can be estimated by maximizing the area under the receiver operating characteristic (ROC) curve (AUC). The asymptotic properties of these estimators are also discussed. An intensive simulation study is conducted to evaluate the finite-sample performance of the proposed method. Colorectal cancer and pancreatic cancer data are used to illustrate the proposed methodology for multiple cross-sectional biomarkers. We further extend our classification method to longitudinal biomarkers. With a natural cubic spline basis, each subject's longitudinal biomarker profile can be characterized by spline coefficients, which significantly reduces the dimension of the data. Specifically, the maximum reduction can be achieved by controlling the number of knots or degrees of freedom in the spline approach, and the coefficients can be obtained by ordinary least squares.
We treat each spline coefficient as a "biomarker" in our previous method; the optimal linear combination of the spline coefficients can then be obtained using a stepwise method without any distributional assumption. Afterward, covariates are included in the second stage by maximizing the corresponding AUC. The proposed method is applied to longitudinal Alzheimer's disease data and to the primary biliary cirrhosis data for illustration. We conduct a simulation study to assess the finite-sample performance of the proposed method for longitudinal biomarkers.
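The second stage described above, choosing a covariate coefficient by maximizing the AUC, can be illustrated with a minimal Python sketch. It is not the thesis's estimator: the function names `empirical_auc` and `best_covariate_weight`, the single-covariate setup, and the plain grid search are all assumptions made for illustration. The AUC is estimated with the Mann-Whitney statistic.

```python
import numpy as np

def empirical_auc(scores, labels):
    """Mann-Whitney estimate of the AUC: P(case score > control score), ties count 1/2."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    cases, controls = scores[labels], scores[~labels]
    diff = cases[:, None] - controls[None, :]        # all case/control pairs
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def best_covariate_weight(combined_marker, covariate, labels, grid):
    """Hypothetical second stage: grid-search the weight beta that maximizes
    the empirical AUC of combined_marker + beta * covariate."""
    combined_marker = np.asarray(combined_marker, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    aucs = [empirical_auc(combined_marker + b * covariate, labels) for b in grid]
    k = int(np.argmax(aucs))                         # first maximizer on the grid
    return float(grid[k]), aucs[k]

# Toy data: the first-stage score carries no information, the covariate does.
beta, auc = best_covariate_weight([0, 0, 0, 0], [0, 1, 2, 3], [0, 0, 1, 1],
                                  grid=[-1.0, 0.0, 1.0])
print(beta, auc)
```

The empirical AUC is a step function of the weight, so gradient-based optimizers do not apply directly; a grid search is the simplest workaround, and it becomes expensive as the number of covariates grows.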
|
2 |
Contribution to Statistical Techniques for Identifying Differentially Expressed Genes in Microarray Data / Hossain, Ahmed / 30 August 2011
With the development of DNA microarray technology, scientists can now measure the expression levels of thousands of genes (features or genomic biomarkers) simultaneously in a single experiment. Robust and accurate gene selection methods are required to identify differentially expressed genes across different samples for disease diagnosis or prognosis. The problem of identifying significantly differentially expressed genes can be stated as follows: given gene expression measurements from an experiment with two (or more) conditions, find the subset of genes having significantly different expression levels across these conditions.
Analysis of genomic data is challenging because of the high dimensionality of the data and the small sample sizes. Several mathematical and statistical methods currently exist to identify significantly differentially expressed genes. These methods typically focus on gene-by-gene analysis within a parametric hypothesis testing framework. In this study, we propose three flexible procedures for analyzing microarray data.
The first is a parametric method based on a flexible distribution, the Generalized Logistic Distribution of Type II (GLDII), for which an approximate likelihood ratio test (ALRT) is developed. Although the method relies on gene-by-gene analysis, the ALRT with the GLDII distributional assumption appears to provide a favourable fit to microarray data.
In the second method, we propose a test statistic for testing whether the area under the receiver operating characteristic curve (AUC) for each gene is greater than 0.5, allowing different variances for each gene. This method is computationally less intensive and can identify genes that are reasonably stable with satisfactory prediction performance.
The third method compares the AUCs of a pair of genes and is designed for selecting highly correlated genes in microarray datasets. We propose a nonparametric procedure for selecting genes whose expression levels are correlated with that of a "seed" gene in microarray experiments. The test proposed by DeLong et al. (1988) is the conventional nonparametric procedure for comparing correlated AUCs. It uses a consistent variance estimator and relies on the asymptotic normality of the AUC estimator. Our proposed method incorporates DeLong's variance estimation technique when comparing a pair of genes and can identify genes with biologically sound implications.
In this thesis, we focus on the primary step in the gene selection process, namely the ranking of genes with respect to a statistical measure of differential expression. We assess the proposed approaches through extensive simulation studies and demonstrate the methods on real datasets. The simulation study indicates that the parametric method performs favorably across all settings of variance, sample size and treatment effect considered. Importantly, the method is found to be less sensitive to contamination by noise. The proposed nonparametric methods do not involve complicated formulas and do not require advanced programming skills. Both methods can identify a large fraction of truly differentially expressed (DE) genes, especially when the sample size is large or outliers are present. We conclude that the proposed methods offer good choices of analytical tools for identifying DE genes for further biological and clinical analysis.
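To make the idea of a per-gene AUC test with DeLong-style variance concrete, here is a minimal Python sketch. It is an assumption about the general shape of such a test, not the statistic derived in the thesis: the function name `delong_auc_test` is hypothetical, and the one-sided z-test against 0.5 is simply the standard construction from the DeLong et al. (1988) variance estimator.

```python
import math
import numpy as np

def delong_auc_test(case_values, control_values):
    """One-sided z-test of H0: AUC = 0.5 vs H1: AUC > 0.5 for one gene,
    using the DeLong et al. (1988) variance estimate of the empirical AUC.
    Returns (auc, variance, z, p_value)."""
    x = np.asarray(case_values, dtype=float)       # expression in one condition
    y = np.asarray(control_values, dtype=float)    # expression in the other
    # psi[i, j] = 1 if x_i > y_j, 1/2 if tied, 0 otherwise
    psi = (x[:, None] > y[None, :]) + 0.5 * (x[:, None] == y[None, :])
    v10 = psi.mean(axis=1)                         # structural components, cases
    v01 = psi.mean(axis=0)                         # structural components, controls
    auc = float(psi.mean())
    var = float(v10.var(ddof=1) / len(x) + v01.var(ddof=1) / len(y))
    if var == 0.0:                                 # perfectly separated or constant data
        return auc, var, float("nan"), float("nan")
    z = (auc - 0.5) / math.sqrt(var)
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability
    return auc, var, z, p_value

print(delong_auc_test([2, 3, 4], [1, 2, 3]))
```

Because each gene gets its own variance estimate, the resulting z-scores can be used to rank genes; with thousands of genes, the p-values would still need a multiple-testing correction, which this sketch does not attempt.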
|
4 |
Evaluation of a neural network classifier for pancreatic masses based on CT findings / 池田, 充 (Ikeda, Mitsuru); 伊藤, 茂樹 (Ito, Shigeki); 石垣, 武男 (Ishigaki, Takeo); 山内, 一信 (Yamauchi, Kazunobu) / 05 1900
No description available.
|
5 |
Diagnostic Utility of the Culture-Language Interpretive Matrix for the WISC-IV Among Referred Students / January 2012
abstract: The Culture-Language Interpretive Matrix (C-LIM) is a new tool hypothesized to help practitioners accurately determine whether students who are administered an IQ test are culturally and linguistically different from the normative comparison group (i.e., different) or culturally and linguistically similar to the normative comparison group and possibly have Specific Learning Disabilities (SLD) or other neurocognitive disabilities (i.e., disordered). Diagnostic utility statistics were used to test the ability of the Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV) C-LIM to accurately identify students from a referred sample of English language learners (ELLs) (n = 86) for whom Spanish was the primary language spoken at home, and a sample of students from the WISC-IV normative sample (n = 2,033), as either culturally and linguistically different from or culturally and linguistically similar to the WISC-IV normative sample. WISC-IV scores from three paired comparison groups were analyzed using the Receiver Operating Characteristic (ROC) curve: (a) ELLs with SLD and the WISC-IV normative sample, (b) ELLs without SLD and the WISC-IV normative sample, and (c) ELLs with SLD and ELLs without SLD. The ROC analysis yielded Area Under the Curve (AUC) values between 0.51 and 0.53 for the comparison between ELLs with SLD and the WISC-IV normative sample, between 0.48 and 0.53 for the comparison between ELLs without SLD and the WISC-IV normative sample, and between 0.49 and 0.55 for the comparison between ELLs with SLD and ELLs without SLD. Because an AUC of 0.5 corresponds to chance-level discrimination, these values indicate that the C-LIM has low diagnostic accuracy in differentiating between a sample of ELLs and the WISC-IV normative sample. Currently available evidence does not support use of the C-LIM in applied practice. / Dissertation/Thesis / Ph.D. Educational Psychology 2012
|
6 |
Bivariate Random Effects And Hierarchical Meta-analysis Of Summary Receiver Operating Characteristic Curve On Fine Needle Aspiration Cytology / Erte, Idil / 01 September 2011
In this study, the theory of meta-analysis of diagnostic tests, the Summary Receiver Operating Characteristic (SROC) curve, bivariate random effects and the Hierarchical Summary Receiver Operating Characteristic (HSROC) curve is discussed, and the accuracy reported in the literature for Fine Needle Aspiration (FNA) biopsy, which is used in the diagnosis of breast masses (malignant or benign), is analyzed. FNA Cytological (FNAC) examination of breast tumors is easy, effective, effortless, and does not require special training for clinicians. Because of the uncertainty in publications about the accuracy of FNAC, 25 FNAC studies were gathered for the meta-analysis. To plot the summary ROC curve, the differences and sums of the logits of the true positive rates and the false positive rates of the included studies were computed with SAS code. The formulas of the bivariate random effects model and the hierarchical summary ROC curve are presented in the context of the literature. The bivariate random effects model is then implemented with the new SAS PROC GLIMMIX. Moreover, the HSROC model is implemented with SAS PROC HSROC NLMIXED. Curves are plotted with RevMan Version 5 (2008). The meta-analytic results of the bivariate random effects model are found to be nearly identical to those of the HSROC approach. The results obtained with both random effects meta-analytic methods indicate that FNA cytology is a diagnostic test with a high level of discrimination for breast tumors.
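The logit differences and sums mentioned in this abstract are the usual inputs to a Moses-Littenberg style summary ROC line, D = a + b·S. The Python sketch below shows that computation under stated assumptions: the abstract does not name the Moses-Littenberg method, the function names are hypothetical, and the 0.5 continuity correction and ordinary least-squares fit are common conventions rather than details taken from the thesis (which used SAS).

```python
import numpy as np

def logit(p):
    """Log-odds of a probability (or array of probabilities)."""
    return np.log(p / (1.0 - p))

def d_and_s(tp, fn, fp, tn, correction=0.5):
    """Per-study D = logit(TPR) - logit(FPR) and S = logit(TPR) + logit(FPR).
    `correction` is added to every cell to avoid zero counts (an assumption)."""
    tp, fn, fp, tn = (np.asarray(a, dtype=float) + correction
                      for a in (tp, fn, fp, tn))
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return logit(tpr) - logit(fpr), logit(tpr) + logit(fpr)

def fit_sroc_line(d, s):
    """Least-squares fit of D = intercept + slope * S across studies."""
    slope, intercept = np.polyfit(s, d, 1)   # polyfit returns highest degree first
    return float(intercept), float(slope)

# Three made-up studies (counts are illustrative only, not from the thesis).
d, s = d_and_s(tp=[80, 45, 60], fn=[20, 5, 15], fp=[10, 8, 5], tn=[90, 92, 70])
print(fit_sroc_line(d, s))
```

D is the log diagnostic odds ratio of each study and S is a proxy for its positivity threshold, so a slope near zero suggests accuracy does not vary much with threshold. This simple regression ignores within-study sampling error, which is what the bivariate and HSROC models are designed to handle.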
|
7 |
Assessment of Residual Nonuniformity on Hyperspectral Target Detection Performance / Cusumano, Carl Joseph / January 2019
No description available.
|
8 |
Επεξεργασία και ανάλυση καρδιακού ρυθμού κατά την διάρκεια του τοκετού με τη χρήση μετασχηματισμού κυματιδίου (wavelet) / Processing and analysis of heart rate during childbirth using wavelet transform / Χατζής, Δημήτριος / 29 June 2007
This thesis uses heart rate signals that correspond to normal and acidemic cases. These signals are then processed with various techniques. The aim of the thesis is to separate these two groups.
|
9 |
PERFORMANCE ANALYSIS OF SRCP IMAGE BASED SOUND SOURCE DETECTION ALGORITHMS / Nalavolu, Praveen Reddy / 01 January 2010
Steered Response Power based algorithms are widely used for locating sound sources with microphone array systems. SRCP-PHAT is one such algorithm that performs robustly under noisy and reverberant conditions. The algorithm creates a likelihood function over the field of view. This thesis applies image processing methods to SRCP-PHAT images, exploiting the differences in power levels and pixel patterns to discriminate between sound source and background pixels. Hough Transform based ellipse detection is used to identify the sound source locations by finding the centers of the elliptical edge pixel regions typical of source patterns. Monte Carlo simulations of an eight-microphone perimeter array with single and multiple sound sources are used to simulate the test environment, and the area under the receiver operating characteristic curve (ROCA) is used to analyze algorithm performance. Performance was compared with that of a simpler algorithm involving Canny edge detection and image averaging and with that of an algorithm based simply on the magnitude of local maxima in the SRCP image. The analysis shows that the Canny edge detection based method performed better in the presence of coherent noise sources.
|
10 |
Predição de distribuição de espécies arbustivo-arbóreas no sul do Brasil / Prediction of distribution of shrub and tree species in southern Brazil / Verdi, Marcio / January 2013
In view of environmental change at the global level, providing ecological information and seeking a better understanding of the factors and processes that shape species distributions is an important initiative for planning conservation actions. In this context, the importance and scarcity of information about the geographical distribution of species motivated us to predict the potential distribution of shrubs and trees of the families Lauraceae and Myrtaceae in the Atlantic Forest in southern Brazil. Generalized linear models (GLM) were used to fit predictive models to the occurrence records of 88 species as a function of environmental variables. Predictor variables were selected based on the lowest corrected Akaike information criterion. We evaluated the performance of the models using 10-fold cross-validation to calculate the true skill statistic (TSS) and the area under the receiver operating characteristic curve (AUC). We used GLM to test the influence of the estimated area of occurrence, the number of records per species and the complexity of the models on the TSS and the AUC. Our results showed that climatic variables largely govern the distribution of species, but variables that capture local environmental variation are relatively important in the study area. The TSS was significantly influenced by the number of records and the complexity of the models, while the AUC was affected by all three factors evaluated. The interaction between these factors is an important issue that should be considered in further evaluations of both measures and with different modeling techniques. Our results also showed that the distributions of some species were overestimated, while others corresponded well with the occurrence known to us.
Our results provide a sound basis for new field surveys, the assessment of priority areas and conservation plans, and inferences about the effects of environmental change on the species of the Atlantic Forest.
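For readers unfamiliar with the true skill statistic, it is defined as sensitivity + specificity − 1 and, unlike the AUC, requires converting predicted probabilities to presence/absence. The Python sketch below shows the calculation; the function name `true_skill_statistic` and the fixed 0.5 threshold are assumptions for illustration, since the abstract does not say how the threshold was chosen.

```python
import numpy as np

def true_skill_statistic(observed, predicted_prob, threshold=0.5):
    """TSS = sensitivity + specificity - 1 for presence/absence predictions.
    `threshold` (assumed 0.5 here) converts probabilities to presences."""
    observed = np.asarray(observed, dtype=bool)
    predicted = np.asarray(predicted_prob, dtype=float) >= threshold
    tp = np.sum(predicted & observed)        # presences predicted as presences
    fn = np.sum(~predicted & observed)       # presences missed
    tn = np.sum(~predicted & ~observed)      # absences predicted as absences
    fp = np.sum(predicted & ~observed)       # absences predicted as presences
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return float(sensitivity + specificity - 1.0)

# Made-up held-out fold: 3 presences and 4 absences.
obs = [1, 1, 1, 0, 0, 0, 0]
prob = [0.9, 0.8, 0.2, 0.7, 0.3, 0.1, 0.4]
print(true_skill_statistic(obs, prob))
```

In a 10-fold cross-validation, this function would be applied to each held-out fold and the results averaged. TSS ranges from −1 to 1, with 0 indicating performance no better than random.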
|