Global ETD Search

11	Contribution to Statistical Techniques for Identifying Differentially Expressed Genes in Microarray Data Hossain, Ahmed 30 August 2011 (has links) With the development of DNA microarray technology, scientists can now measure the expression levels of thousands of genes (features or genomic biomarkers) simultaneously in a single experiment. Robust and accurate gene selection methods are required to identify differentially expressed genes across different samples for disease diagnosis or prognosis. The problem of identifying significantly differentially expressed genes can be stated as follows: given gene expression measurements from an experiment with two (or more) conditions, find the subset of genes having significantly different expression levels across these two (or more) conditions. Analysis of genomic data is challenging due to the high dimensionality of the data and the low sample size. Several mathematical and statistical methods currently exist to identify significantly differentially expressed genes. These methods typically focus on gene-by-gene analysis within a parametric hypothesis testing framework. In this study, we propose three flexible procedures for analyzing microarray data. The first is a parametric method based on a flexible distribution, the Generalized Logistic Distribution of Type II (GLDII), for which an approximate likelihood ratio test (ALRT) is developed. Though the method considers gene-by-gene analysis, the ALRT method with the GLDII distributional assumption appears to provide a favourable fit to microarray data. In the second method we propose a test statistic for testing whether the area under the receiver operating characteristic curve (AUC) for each gene is greater than 0.5, allowing different variances for each gene. This proposed method is computationally less intensive and can identify genes that are reasonably stable with satisfactory prediction performance. 
The third method is based on comparing two AUCs for a pair of genes and is designed for selecting highly correlated genes in microarray datasets. We propose a nonparametric procedure for selecting genes with expression levels correlated with that of a "seed" gene in microarray experiments. The test proposed by DeLong et al. (1988) is the conventional nonparametric procedure for comparing correlated AUCs. It uses a consistent variance estimator and relies on asymptotic normality of the AUC estimator. Our proposed method includes DeLong's variance estimation technique in comparing pairs of genes and can identify genes with biologically sound implications. In this thesis, we focus on the primary step in the gene selection process, namely, the ranking of genes with respect to a statistical measure of differential expression. We assess the proposed approaches by extensive simulation studies and demonstrate the methods on real datasets. The simulation study indicates that the parametric method performs well across all settings of variance, sample size and treatment effects. Importantly, the method is found to be less sensitive to contamination by noise. The proposed nonparametric methods do not involve complicated formulas and do not require advanced programming skills. Both methods can identify a large fraction of truly differentially expressed (DE) genes, especially when the sample size is large or outliers are present. We conclude that the proposed methods offer good choices of analytical tools to identify DE genes for further biological and clinical analysis. Microarray gene expression Generalized Logistic Distribution Receiver operating Characteristic curve FDR 0566
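The per-gene AUC criterion described in the abstract above can be illustrated with a minimal sketch. It computes only the empirical AUC (the Mann-Whitney estimate) for each gene and ranks genes by it; it is not the thesis's test statistic or its variance estimator. The function names `empirical_auc` and `rank_genes_by_auc` and the dictionary layout of the expression data are assumptions made for illustration.

```python
def empirical_auc(pos, neg):
    """Mann-Whitney estimate of AUC: P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_genes_by_auc(expr, labels):
    """Rank genes by empirical AUC, highest first.

    expr   : dict mapping a gene name to a list of expression values
             (hypothetical layout, one value per sample)
    labels : list of 0/1 condition labels, one per sample
    """
    scores = {}
    for gene, values in expr.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = empirical_auc(pos, neg)
    # Genes with AUC well above 0.5 are candidates for differential expression.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

An AUC near 0.5 means the gene does not separate the two conditions; a formal test would additionally need a variance estimate for each gene.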
12	Discrimination of High Risk and Low Risk Populations for the Treatment of STDs Zhao, Hui 05 August 2011 (has links) Discriminating truly diseased patients from healthy persons is an important step in clinical practice. Ideally, such discrimination could be obtained from common information such as personal information, lifestyle, and contact with diseased patients. In this study, a score is calculated for each patient from survey data using a generalized linear model, and the disease status is then decided according to previous sexually transmitted disease (STD) records. This study will help clinics group patients as truly diseased or healthy, which in turn will affect how clinics screen patients: complete screening for possibly diseased patients and some common screening for potentially healthy persons. Sexually transmitted diseases (STDs) Generalized linear model Mathematics
13	Evaluation of a neural network classifier for pancreatic masses based on CT findings 池田, 充, Ikeda, Mitsuru, 伊藤, 茂樹, Ito, Shigeki, 石垣, 武男, Ishigaki, Takeo, Yamauchi, Kazunobu, 山内, 一信 05 1900 (has links) No description available. Computer aided diagnosis Neural network Radiology and radiologists Pancreas
14	Diagnostic Utility of the Culture-Language Interpretive Matrix for the WISC-IV Among Referred Students January 2012 (has links) abstract: The Culture-Language Interpretive Matrix (C-LIM) is a new tool hypothesized to help practitioners accurately determine whether students who are administered an IQ test are culturally and linguistically different from the normative comparison group (i.e., different) or culturally and linguistically similar to the normative comparison group and possibly have Specific Learning Disabilities (SLD) or other neurocognitive disabilities (i.e., disordered). Diagnostic utility statistics were used to test the ability of the Wechsler Intelligence Scales for Children-Fourth Edition (WISC-IV) C-LIM to accurately identify students from a referred sample of English language learners (ELLs) (n = 86) for whom Spanish was the primary language spoken at home and a sample of students from the WISC-IV normative sample (n = 2,033) as either culturally and linguistically different from the WISC-IV normative sample or culturally and linguistically similar to the WISC-IV normative sample. WISC-IV scores from three paired comparison groups were analyzed using the Receiver Operating Characteristic (ROC) curve: (a) ELLs with SLD and the WISC-IV normative sample, (b) ELLs without SLD and the WISC-IV normative sample, and (c) ELLs with SLD and ELLs without SLD. The ROC analysis yielded Area Under the Curve (AUC) values that ranged between 0.51 and 0.53 for the comparison between ELLs with SLD and the WISC-IV normative sample, AUC values that ranged between 0.48 and 0.53 for the comparison between ELLs without SLD and the WISC-IV normative sample, and AUC values that ranged between 0.49 and 0.55 for the comparison between ELLs with SLD and ELLs without SLD. These values indicate that the C-LIM has low diagnostic accuracy in terms of differentiating between a sample of ELLs and the WISC-IV normative sample. 
Current available evidence does not support use of the C-LIM in applied practice at this time. / Dissertation/Thesis / Ph.D. Educational Psychology 2012 Educational psychology Educational tests & measurements assessment receiver operating characteristic curve special education specific learning disabilities
15	IMPROVED GENE PAIR BIOMARKERS FOR MICROARRAY DATA CLASSIFICATION Khamesipour, Alireza 01 August 2018 (has links) The Top Scoring Pair (TSP) classifier, based on the notion of relative ranking reversals in the expressions of two marker genes, has been proposed as a simple, accurate, and easily interpretable decision rule for classification and class prediction of gene expression profiles. We introduce the AUC-based TSP classifier, which is based on the Area Under the ROC (Receiver Operating Characteristic) Curve. The AUCTSP classifier works according to the same principle as TSP but differs from the latter in that the probabilities that determine the top scoring pair are computed based on the relative rankings of the two marker genes across all subjects as opposed to for each individual subject. Although the classification is still done on an individual subject basis, the generalization that the AUC-based probabilities provide during training yields an overall better and more stable classifier. Through extensive simulation results and case studies involving classification in ovarian, leukemia, colon, and breast and prostate cancers and diffuse large B-cell lymphoma, we show the superiority of the proposed approach in terms of improving classification accuracy, avoiding overfitting and being less prone to selecting non-informative pivot genes. The proposed AUCTSP is a simple yet reliable and robust rank-based classifier for gene expression classification. While the AUCTSP works by the same principle as TSP, its ability to determine the top scoring gene pair based on the relative rankings of two marker genes across all subjects, as opposed to each individual subject, results in significant performance gains in classification accuracy. 
In addition, the proposed method tends to avoid selection of non-informative (pivot) genes as members of the top-scoring pair. We have also proposed the use of the AUC test statistic in order to reduce the computational cost of the TSP in selecting the most informative pair of genes for diagnosing a specific disease. We have demonstrated the efficacy of our proposed method in selecting informative genes through case studies in ovarian, colon, leukemia, breast and prostate cancers and diffuse large B-cell lymphoma. We have compared the selected pairs, the computational cost and running time, and the classification performance of a subset of differentially expressed genes selected based on the AUC probability with those of the original TSP in the aforementioned datasets. The reduced-size TSP dramatically reduces the computational cost and time complexity of selecting the top scoring pair of genes in comparison to the original TSP in all of the case studies without degrading the performance of the classifier. Using the AUC probability, we were able to reduce the computational cost and CPU running time of the TSP by 79% and 84% respectively on average in the tested case studies. In addition, the use of the AUC probability prior to applying the TSP tends to avoid the selection of genes that are not expressed ("pivot" genes) due to the imposed condition. We have demonstrated through LOOCV and 5-fold cross validation that the reduced-size TSP and the TSP perform approximately the same in terms of classification accuracy for smaller threshold values. In conclusion, we suggest the use of the AUC test statistic in reducing the size of the dataset for the extensions of the TSP method, e.g. the k-TSP and TST, in order to make these methods feasible and cost effective. AUC Cancer diagnosis Gene expression Gene selection Microarray data analysis
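To make the ranking-reversal idea behind the original TSP concrete, here is a minimal sketch of the standard TSP score for a gene pair: the absolute difference, between the two classes, of the fraction of subjects in which the first gene is expressed below the second. It does not implement the AUC-based variant or the AUC pre-filter proposed in the abstract above. The names `tsp_score` and `top_scoring_pair` and the dictionary layout of the data are assumptions for illustration.

```python
from itertools import combinations


def tsp_score(xi, xj, labels):
    """|P(X_i < X_j | class 0) - P(X_i < X_j | class 1)| for one gene pair."""
    def reversal_fraction(cls):
        idx = [k for k, y in enumerate(labels) if y == cls]
        return sum(xi[k] < xj[k] for k in idx) / len(idx)
    return abs(reversal_fraction(0) - reversal_fraction(1))


def top_scoring_pair(expr, labels):
    """Exhaustively score every gene pair and return the best one.

    expr   : dict mapping a gene name to a list of expression values
             (hypothetical layout, one value per subject)
    labels : list of 0/1 class labels, one per subject
    """
    pairs = combinations(sorted(expr), 2)
    return max(pairs, key=lambda p: tsp_score(expr[p[0]], expr[p[1]], labels))
```

The exhaustive search visits every pair of genes, so its cost grows quadratically with the number of genes, which is why filtering genes before the search can reduce running time.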
16	Condition monitoring of pharmaceutical powder compression during tabletting using acoustic emission Eissa, Salah January 2003 (has links) This research project aimed to develop a condition monitoring system for the final production quality of pharmaceutical tablets and for the detection of capping and lamination during the powder compression process using the acoustic emission (AE) method. Pharmaceutical tablet manufacturers are obliged by regulatory bodies to test the tablet's physical properties such as hardness, dissolution and disintegration before the tablets are released to the market. Most of the existing methods and techniques for testing and monitoring these tablet properties are performed at the tablet post-compression stage. Furthermore, these tests are destructive in nature. Early experimental investigations revealed that the AE energy that is generated during powder compression is directly proportional to the peak force that is required to crush the tablet, i.e. crushing strength. Further laboratory and industrial experimental investigations have been conducted to study the relationship between the AE signals and the compression conditions. Traditional AE signal features such as energy, count, peak amplitude, average signal level, event duration and rise time were recorded. AE data analysis with the aid of an advanced classification algorithm, fuzzy C-means clustering, showed that the AE energy is a very useful parameter in tablet condition monitoring. It was found that the AE energy that is generated during powder compression is sensitive to the process and is directly proportional to the compression speed, particle size, homogeneity of mixture and the amount of material present. This AE signal is also dependent upon the type of material used as the tablet filler. Acoustic emission has been shown to be a useful technique for characterising some of the complex physical changes which occur during tabletting. 
Capping and lamination are serious problems that are encountered during tabletting. A capped or laminated tablet is one which no longer retains its mechanical integrity and exhibits low strength characteristics. Capping and lamination can be caused by a number of factors such as excessive pressure, insufficient binder in the granules and poor material flowability. However, capping and lamination can also occur randomly and they are also dependent upon the material used in tabletting. It was possible to identify a capped or laminated tablet by monitoring the AE energy level during continuous on-line monitoring of tabletting. Capped tablets were indicated by a low level of AE energy. The proposed condition monitoring system aimed to set the AE energy threshold that could discriminate between capped and non-capped tablets. This was based upon statistical distributions of the AE energy values for both the capped and non-capped tablets. The system aims to minimise the rate of false alarms (indication of capping when in reality capping has not occurred) and the rate of missed detection (an indication of non-capping when in reality capping has occurred). A novel approach that employs both the AE method and the receiver operating characteristic (ROC) curve was proposed for the on-line detection of capping and lamination during tabletting. The proposed system employs AE energy as the discriminating parameter to distinguish between capped and non-capped tablets. The ROC curve was constructed from the area under the two distributions of capped and non-capped tablets. This curve shows the trade-off between the true detection rate and the false alarm rate for capped and non-capped tablets. A two-graph receiver operating characteristic (ROC) curve was presented as a modification of the original ROC curve to enable an operator to directly select the desired energy threshold for tablet monitoring. 
This plot shows the ROC co-ordinates as a function of the threshold value over the entire threshold (AE energy) range for all test outcomes. An alternative way of deciding a threshold based on the slope of the ROC curve was also developed. The slope of the ROC curve identifies the optimal operating point on the curve; it depends upon the cost penalties of capping and the prevalence of capping. Sets of guidelines have been outlined for decision making, i.e. threshold setting. These guidelines take into account both the prevalence of capping in manufacturing and the cost associated with various outcomes of tablet formation. The proposed condition monitoring system also relates AE monitoring to non-AE measurement, as it enables an operator to predict tablet hardness and disintegration from the AE energy, a relationship which was established in this research. 620
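The threshold-setting idea in the abstract above, where the operating point depends on the costs of the outcomes and on the prevalence of capping, can be sketched with the standard decision-theoretic formulation. This is an illustration under stated assumptions, not the thesis's procedure: correct decisions are assumed to cost nothing, a tablet is flagged as capped when its AE energy falls below the threshold (consistent with capped tablets showing low AE energy), and the names `target_roc_slope`, `expected_cost` and `best_threshold` are hypothetical.

```python
def target_roc_slope(prevalence, cost_miss, cost_false_alarm):
    """Slope of the ROC curve at the cost-optimal operating point.

    Standard result when correct decisions cost nothing:
    slope = (cost_false_alarm / cost_miss) * (1 - prevalence) / prevalence
    """
    return (cost_false_alarm / cost_miss) * (1.0 - prevalence) / prevalence


def expected_cost(threshold, capped, intact, prevalence, cost_miss, cost_false_alarm):
    """Expected cost per tablet when energy < threshold is flagged as capped."""
    miss_rate = sum(e >= threshold for e in capped) / len(capped)
    false_alarm_rate = sum(e < threshold for e in intact) / len(intact)
    return (prevalence * cost_miss * miss_rate
            + (1.0 - prevalence) * cost_false_alarm * false_alarm_rate)


def best_threshold(capped, intact, prevalence, cost_miss, cost_false_alarm):
    """Scan the observed AE energies and keep the cheapest threshold."""
    candidates = sorted(set(capped) | set(intact))
    return min(
        candidates,
        key=lambda t: expected_cost(t, capped, intact, prevalence,
                                    cost_miss, cost_false_alarm),
    )
```

A rare fault or an expensive false alarm gives a steep target slope, which corresponds to a stricter (lower) energy threshold; a costly missed detection pushes the threshold the other way.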
17	Assessing computed tomography image quality for combined detection and estimation tasks Tseng, Hsin-Wu, Fan, Jiahua, Kupinski, Matthew A. 21 November 2017 (has links) Maintaining or even improving image quality while lowering patient dose is always the desire in clinical computed tomography (CT) imaging. Iterative reconstruction (IR) algorithms have been designed to allow for a reduced dose while maintaining or even improving an image. However, we have previously shown that the dose-saving capabilities allowed with IR are different for different clinical tasks. The channelized scanning linear observer (CSLO) was applied to study clinical tasks that combine detection and estimation when assessing CT image data. The purpose of this work is to illustrate the importance of task complexity when assessing dose savings and to move toward more realistic tasks when performing these types of studies. Human-observer validation of these methods will take place in a future publication. Low-contrast objects embedded in body-size phantoms were imaged multiple times and reconstructed by filtered back projection (FBP) and an IR algorithm. The task was to detect, localize, and estimate the size and contrast of low-contrast objects in the phantom. Independent signal-present and signal-absent regions of interest cropped from images were channelized by the dense-difference of Gauss channels for CSLO training and testing. Estimation receiver operating characteristic (EROC) curves and the areas under EROC curves (EAUC) were calculated by CSLO as the figure of merit. The one-shot method was used to compute the variance of the EAUC values. Results suggest that the IR algorithm studied in this work could efficiently reduce the dose by approximately 50% while maintaining an image quality comparable to conventional FBP reconstruction, warranting further investigation using real patient data. (C) The Authors. Published by SPIE under a Creative Commons Attribution 3.0 Unported License. 
Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI. computed tomography iterative reconstruction channelized scanning linear observer detection estimation EROC curves
18	Application of receiver operating characteristic analysis to a remote monitoring model for chronic obstructive pulmonary disease to determine utility and predictive value Brown Connolly, Nancy January 2013 (has links) This is a foundational study that applies Receiver Operating Characteristic (ROC) analysis to the evaluation of a chronic disease model that utilizes Remote Monitoring (RM) devices to identify clinical deterioration in a Chronic Obstructive Pulmonary Disease (COPD) population. Background: RM programmes in Disease Management (DM) are proliferating as one strategy to address management of chronic disease. The need to validate and quantify evidence-based value is acute. There is a need to apply new methods to better evaluate automated RM systems. ROC analysis is an engineering approach that has been widely applied to medical programmes but has not been applied to RM systems. Evaluation of classifiers, determination of thresholds and predictive accuracy for RM systems have not been evaluated using ROC analysis. Objectives: (1) apply ROC analysis to evaluation of a RM system; (2) analyse the performance of the model when applied to patient outcomes for a COPD population; (3) identify predictive classifier(s); (4) identify optimal threshold(s) and the predictive capacity of the classifiers. Methods: Parametric and non-parametric methods are utilized to determine accuracy, sensitivity, specificity and predictive capacity of classifiers Saturated Peripheral Oxygen (SpO2), Blood Pressure (BP), Pulse Rate (PR) based on event-based patient outcomes that include hospitalisation (IP), accident & emergency (A&E) and home visits (HH). Population: Patients identified with a primary diagnosis of COPD, monitored for a minimum of 183 days with at least one episode of in-patient (IP) hospitalisation for COPD in the 12 months preceding the monitoring period. 
Data Source: A subset of retrospective de-identified patient data from an NHS Direct evaluation of a COPD RM programme. Subsets utilized include classifiers, biometric readings, alerts generated by the system and resource utilisation. Contribution: Validates ROC methodology, identifies classifier performance and optimal threshold settings for the classifier, while making design recommendations and putting forth the next steps for research. This research shows that ROC analysis can provide additional information on the predictive capacity of RM systems. Justification of benefit: The results can be applied when evaluating health services and planning decisions on the costs and benefits. Methods can be applied to system design, protocol development, work flows and commissioning decisions based on value and benefit. Conclusion: Results validate the use of ROC analysis as a robust methodology for DM programmes that use RM devices to evaluate classifiers, thresholds and identification of the predictive capacity as well as identify areas where additional design may improve the predictive capacity of the model.
19	Design of Comprehensible Learning Machine Systems for Protein Structure Prediction Hu, Hae-Jin 06 August 2007 (has links) In the effort to understand protein structure, many computational approaches have been proposed recently. Among them, Support Vector Machine (SVM) methods have recently been applied and have shown successful performance compared with other machine learning schemes. However, despite their high performance, SVM approaches suffer from a lack of understandability because they are black-box models; the predictions made by an SVM cannot be interpreted in a biologically meaningful way. To overcome this limitation, a new association rule based classifier, PCPAR, was devised from the existing classifier CPAR to handle sequential data. The performance of PCPAR was further improved by designing the following two hybrid schemes. The PCPAR/SVM method is a parallel combination of the PCPAR and the SVM, and the PCPAR_SVM method is a sequential combination of the PCPAR and the SVM. To understand the SVM prediction, the SVM_PCPAR scheme was developed. The experimental results show that the PCPAR scheme performs better than the CPAR method with respect to accuracy and the number of generated patterns. The PCPAR/SVM scheme performs better than the PCPAR, PCPAR_SVM or SVM_PCPAR schemes and almost equally to the SVM. The generated patterns are easily understandable and biologically meaningful. The system sturdiness evaluation and the ROC curve analysis showed that this new scheme is robust and competent. Embedded Membrane Segment Prediction Transmembrane Receiver Operating Characteristic Association Rule Based Classifier Data Mining Support Vector Machines Computer Sciences
20	Bivariate Random Effects And Hierarchical Meta-analysis Of Summary Receiver Operating Characteristic Curve On Fine Needle Aspiration Cytology Erte, Idil 01 September 2011 (has links) (PDF) In this study, meta-analysis of diagnostic tests, the Summary Receiver Operating Characteristic (SROC) curve, bivariate random effects and Hierarchical Summary Receiver Operating Characteristic (HSROC) curve theories have been discussed, and the accuracy reported in the literature for Fine Needle Aspiration (FNA) biopsy, which is used in the diagnosis of breast masses (malignant or benign), has been analyzed. FNA Cytological (FNAC) examination of breast tumors is easy, effective, effortless, and does not require special training for clinicians. Because of the uncertainty related to FNAC's accurate usage in publications, 25 FNAC studies have been gathered in the meta-analysis. For plotting the summary ROC curve, the logit differences and sums of the true positive rates and the false positive rates used in the meta-analysis code have been generated with SAS. The formulas of the bivariate random effects model and the hierarchical summary ROC curve are presented in the context of the literature. Then a bivariate random effects implementation with the new SAS PROC GLIMMIX is generated. Moreover, an HSROC implementation is generated by SAS PROC HSROC NLMIXED. Curves are plotted with RevMan Version 5 (2008). The meta-analytic results of the bivariate random effects model are found to be nearly identical to the results from the HSROC approach. The results achieved through both random effects meta-analytic methods show that FNA Cytology is a diagnostic test with a high level of discrimination for breast tumors. QA Analysis 299.6-433
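The logit differences and sums mentioned in the abstract above are the quantities commonly used to plot a summary ROC curve. The sketch below computes them for a single study from its 2x2 counts, under the assumption that they follow the usual definitions D = logit(TPR) - logit(FPR) and S = logit(TPR) + logit(FPR). The thesis generated these with SAS; this Python version, the function names and the example counts are illustrative only, and no continuity correction is applied, so a study with a zero cell would need separate handling.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def sroc_coordinates(tp, fn, fp, tn):
    """Return (D, S) for one diagnostic study.

    D = logit(TPR) - logit(FPR) is the log diagnostic odds ratio.
    S = logit(TPR) + logit(FPR) reflects the positivity threshold.
    """
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return logit(tpr) - logit(fpr), logit(tpr) + logit(fpr)


# Hypothetical counts for one study: 90 true positives, 10 false negatives,
# 20 false positives, 80 true negatives.
d, s = sroc_coordinates(90, 10, 20, 80)
```

Fitting a line D = a + b * S across the studies and transforming it back to the (FPR, TPR) plane gives a summary ROC curve.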
