  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Contribution to Statistical Techniques for Identifying Differentially Expressed Genes in Microarray Data

Hossain, Ahmed 30 August 2011 (has links)
With the development of DNA microarray technology, scientists can now measure the expression levels of thousands of genes (features or genomic biomarkers) simultaneously in a single experiment. Robust and accurate gene selection methods are required to identify differentially expressed genes across different samples for disease diagnosis or prognosis. The problem of identifying significantly differentially expressed genes can be stated as follows: given gene expression measurements from an experiment with two (or more) conditions, find a subset of all genes having significantly different expression levels across these conditions. Analysis of genomic data is challenging due to the high dimensionality of the data and the low sample size. Several mathematical and statistical methods currently exist to identify significantly differentially expressed genes; they typically focus on gene-by-gene analysis within a parametric hypothesis-testing framework. In this study, we propose three flexible procedures for analyzing microarray data. The first is a parametric method based on a flexible distribution, the Generalized Logistic Distribution of Type II (GLDII), for which an approximate likelihood ratio test (ALRT) is developed. Though the method considers gene-by-gene analysis, the ALRT method under the GLDII assumption appears to provide a favourable fit to microarray data. In the second method we propose a test statistic for testing whether the area under the receiver operating characteristic curve (AUC) for each gene is greater than 0.5, allowing different variances for each gene. This method is computationally less intensive and can identify genes that are reasonably stable with satisfactory prediction performance. The third method compares the AUCs of a pair of genes and is designed for selecting highly correlated genes in microarray datasets.
We propose a nonparametric procedure for selecting genes whose expression levels are correlated with that of a "seed" gene in microarray experiments. The test proposed by DeLong et al. (1988) is the conventional nonparametric procedure for comparing correlated AUCs; it uses a consistent variance estimator and relies on asymptotic normality of the AUC estimator. Our proposed method incorporates DeLong's variance estimation technique when comparing a pair of genes and can identify genes with biologically sound implications. In this thesis, we focus on the primary step in the gene selection process, namely, the ranking of genes with respect to a statistical measure of differential expression. We assess the proposed approaches through extensive simulation studies and demonstrate the methods on real datasets. The simulation study indicates that the parametric method performs well across settings of variance, sample size and treatment effects. Importantly, the method is found to be less sensitive to contamination by noise. The proposed nonparametric methods do not involve complicated formulas and do not require advanced programming skills. Both methods can identify a large fraction of truly differentially expressed (DE) genes, especially when the sample size is large or outliers are present. We conclude that the proposed methods offer good choices of analytical tools to identify DE genes for further biological and clinical analysis.
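The second method above, screening each gene by testing whether its AUC exceeds 0.5, can be sketched in a few lines. This is a minimal illustration, not the thesis's own test: the empirical AUC is computed via its Mann-Whitney interpretation, and the Hanley-McNeil approximation stands in for the gene-specific variance the thesis derives.

```python
import math

def auc_mann_whitney(cases, controls):
    """Empirical AUC: P(case value > control value) + 0.5 * P(tie)."""
    wins = ties = 0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1
            elif x == y:
                ties += 1
    return (wins + 0.5 * ties) / (len(cases) * len(controls))

def auc_z_test(cases, controls):
    """One-sided z-test of H0: AUC = 0.5 for one gene.

    The Hanley-McNeil variance approximation used here is an
    assumption for illustration; the thesis develops its own
    per-gene variance estimate.
    """
    m, n = len(cases), len(controls)
    a = auc_mann_whitney(cases, controls)
    q1 = a / (2 - a)
    q2 = 2 * a * a / (1 + a)
    var = (a * (1 - a) + (m - 1) * (q1 - a * a)
           + (n - 1) * (q2 - a * a)) / (m * n)
    z = (a - 0.5) / math.sqrt(var) if var > 0 else 0.0
    p = 0.5 * math.erfc(z / math.sqrt(2))  # one-sided p-value
    return a, z, p
```

Running the test on every gene and ranking by p-value (or by AUC itself) gives the kind of gene ranking the thesis treats as the primary selection step.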
13

Discrimination of High Risk and Low Risk Populations for the Treatment of STDs

Zhao, Hui 05 August 2011 (has links)
It is an important step in clinical practice to discriminate truly diseased patients from healthy persons. Ideally, such discrimination could be made from commonly available information such as personal details, lifestyle, and contact with diseased patients. In this study, a score is calculated for each patient from survey data using a generalized linear model, and disease status is then decided according to previous sexually transmitted disease (STD) records. This study will facilitate clinics in classifying patients as likely diseased or healthy, which in turn will affect the screening method the clinic adopts: complete screening for patients who are likely diseased and routine screening for persons who are likely healthy.
14

Testing an Assumption of Non-Differential Misclassification in Case-Control Studies

Hui, Qin 01 August 2011 (has links)
One of the issues regarding misclassification in case-control studies is whether the misclassification error rates are the same for cases and controls. A common practice is to assume that the rates are the same (the "non-differential" assumption). However, it is questionable whether this assumption is valid in many case-control studies. Unfortunately, no test has been available to check the validity of the non-differential assumption when validation data are unavailable. We propose the first such method for a case-control study with a 2 × 2 contingency table. First, the Exposure Operating Characteristic curve is defined. Next, two non-parametric methods are applied to test the assumption of non-differential misclassification. Three examples from practical applications illustrate the methods, and a comparison is made.
15

Evaluation of a neural network classifier for pancreatic masses based on CT findings

池田, 充, Ikeda, Mitsuru, 伊藤, 茂樹, Ito, Shigeki, 石垣, 武男, Ishigaki, Takeo, Yamauchi, Kazunobu, 山内, 一信 05 1900 (has links)
No description available.
16

Diagnostic Utility of the Culture-Language Interpretive Matrix for the WISC-IV Among Referred Students

January 2012 (has links)
abstract: The Culture-Language Interpretive Matrix (C-LIM) is a new tool hypothesized to help practitioners accurately determine whether students who are administered an IQ test are culturally and linguistically different from the normative comparison group (i.e., different) or culturally and linguistically similar to the normative comparison group and possibly have Specific Learning Disabilities (SLD) or other neurocognitive disabilities (i.e., disordered). Diagnostic utility statistics were used to test the ability of the Wechsler Intelligence Scales for Children-Fourth Edition (WISC-IV) C-LIM to accurately identify students from a referred sample of English language learners (ELLs) (n = 86), for whom Spanish was the primary language spoken at home, and a sample of students from the WISC-IV normative sample (n = 2,033) as either culturally and linguistically different from or similar to the WISC-IV normative sample. WISC-IV scores from three paired comparison groups were analyzed using the Receiver Operating Characteristic (ROC) curve: (a) ELLs with SLD and the WISC-IV normative sample, (b) ELLs without SLD and the WISC-IV normative sample, and (c) ELLs with SLD and ELLs without SLD. The ROC analysis yielded Area Under the Curve (AUC) values of 0.51-0.53 for the comparison between ELLs with SLD and the WISC-IV normative sample, 0.48-0.53 for ELLs without SLD and the WISC-IV normative sample, and 0.49-0.55 for ELLs with SLD versus ELLs without SLD. These values indicate that the C-LIM has low diagnostic accuracy in differentiating between a sample of ELLs and the WISC-IV normative sample. Current available evidence does not support use of the C-LIM in applied practice at this time. / Dissertation/Thesis / Ph.D. Educational Psychology 2012
17

IMPROVED GENE PAIR BIOMARKERS FOR MICROARRAY DATA CLASSIFICATION

Khamesipour, Alireza 01 August 2018 (has links)
The Top Scoring Pair (TSP) classifier, based on the notion of relative ranking reversals in the expressions of two marker genes, has been proposed as a simple, accurate, and easily interpretable decision rule for classification and class prediction of gene expression profiles. We introduce the AUC-based TSP classifier (AUCTSP), which is based on the Area Under the ROC (Receiver Operating Characteristic) Curve. The AUCTSP classifier works according to the same principle as TSP but differs in that the probabilities that determine the top scoring pair are computed from the relative rankings of the two marker genes across all subjects, as opposed to within each individual subject. Although classification is still done on an individual-subject basis, the generalization provided by the AUC-based probabilities during training yields an overall better and more stable classifier. Through extensive simulation results and case studies involving classification in ovarian, leukemia, colon, breast and prostate cancers and diffuse large B-cell lymphoma, we show the superiority of the proposed approach in improving classification accuracy, avoiding overfitting and being less prone to selecting non-informative pivot genes. The proposed AUCTSP is a simple yet reliable and robust rank-based classifier for gene expression classification. While the AUCTSP works by the same principle as TSP, its ability to determine the top scoring gene pair from the relative rankings of the two marker genes across all subjects, rather than within each individual subject, results in significant gains in classification accuracy. In addition, the proposed method tends to avoid selecting non-informative (pivot) genes as members of the top-scoring pair. We have also proposed the use of the AUC test statistic to reduce the computational cost of the TSP in selecting the most informative pair of genes for diagnosing a specific disease.
We have demonstrated the efficacy of our proposed method through case studies in ovarian, colon, leukemia, breast and prostate cancers and diffuse large B-cell lymphoma. We compared the selected pairs, computational cost, running time and classification performance of a subset of differentially expressed genes selected by the AUC probability with the original TSP on the aforementioned datasets. The reduced-size TSP dramatically lowers the computational cost and time complexity of selecting the top scoring pair of genes compared with the original TSP in all case studies, without degrading classifier performance. Using the AUC probability, we reduced the computational cost and CPU running time of the TSP by 79% and 84%, respectively, on average across the tested case studies. In addition, applying the AUC probability before the TSP tends to avoid the selection of genes that are not expressed ("pivot" genes) due to the imposed condition. We have shown through LOOCV and 5-fold cross-validation that the reduced-size TSP and the TSP perform approximately the same in terms of classification accuracy for smaller threshold values. In conclusion, we suggest using the AUC test statistic to reduce the size of the dataset for extensions of the TSP method, e.g. the k-TSP and TST, in order to make these methods feasible and cost-effective.
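The contrast between TSP's per-subject scoring and an AUC-based pair score can be sketched as follows. The classic TSP score below follows the standard published definition; the AUCTSP-style score is only one plausible reading of the abstract's description (a difference of per-gene AUCs computed across all subjects), not necessarily the thesis's exact formula.

```python
def auc_gene(expr, labels):
    """Empirical AUC of one gene's expression against binary labels:
    probability that a class-1 sample ranks above a class-0 sample."""
    pos = [e for e, l in zip(expr, labels) if l == 1]
    neg = [e for e, l in zip(expr, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def tsp_score(gene_i, gene_j, labels):
    """Classic TSP score: between-class difference in the probability
    that gene i is expressed below gene j within the same subject."""
    def prob(cls):
        idx = [k for k, l in enumerate(labels) if l == cls]
        return sum(gene_i[k] < gene_j[k] for k in idx) / len(idx)
    return abs(prob(1) - prob(0))

def auctsp_score(gene_i, gene_j, labels):
    """AUCTSP-style score (an illustrative assumption): replace the
    per-subject indicator with AUC-based probabilities computed
    across all subjects."""
    return abs(auc_gene(gene_i, labels) - auc_gene(gene_j, labels))
```

A flat, uninformative "pivot" gene contributes an AUC near 0.5 in the second score, which is one way such genes end up penalized relative to the per-subject indicator.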
18

Condition monitoring of pharmaceutical powder compression during tabletting using acoustic emission

Eissa, Salah January 2003 (has links)
This research project aimed to develop a condition monitoring system for the final production quality of pharmaceutical tablets and for detecting capping and lamination during the powder compression process using the acoustic emission (AE) method. Pharmaceutical tablet manufacturers are obliged by regulatory bodies to test tablets' physical properties, such as hardness, dissolution and disintegration, before the tablets are released to the market. Most existing methods and techniques for testing and monitoring these properties are performed at the post-compression stage, and the tests are destructive in nature. Early experimental investigations revealed that the AE energy generated during powder compression is directly proportional to the peak force required to crush the tablet, i.e. the crushing strength. Further laboratory and industrial investigations were conducted to study the relationship between the AE signals and the compression conditions. Traditional AE signal features such as energy, count, peak amplitude, average signal level, event duration and rise time were recorded. AE data analysis with the aid of an advanced classification algorithm, fuzzy C-means clustering, showed that AE energy is a very useful parameter in tablet condition monitoring. The AE energy generated during powder compression is sensitive to the process and is directly proportional to the compression speed, particle size, homogeneity of the mixture and the amount of material present; it also depends on the type of material used as the tablet filler. Acoustic emission has thus been shown to be a useful technique for characterising some of the complex physical changes which occur during tabletting. Capping and lamination are serious problems encountered during tabletting.
A capped or laminated tablet is one which no longer retains its mechanical integrity and exhibits low strength characteristics. Capping and lamination can be caused by a number of factors, such as excessive pressure, insufficient binder in the granules and poor material flowability; they can also occur randomly and depend upon the material used in tabletting. It was possible to identify a capped or laminated tablet by monitoring the AE energy level during continuous on-line monitoring of tabletting: capped tablets are indicated by a low level of AE energy. The proposed condition monitoring system sets an AE energy threshold that discriminates between capped and non-capped tablets, based upon the statistical distributions of the AE energy values for the two groups. The system aims to minimise the rate of false alarms (an indication of capping when in reality capping has not occurred) and the rate of missed detections (an indication of non-capping when in reality capping has occurred). A novel approach employing both the AE method and the receiver operating characteristic (ROC) curve was proposed for the on-line detection of capping and lamination during tabletting. The proposed system uses AE energy as the discriminating parameter between capped and non-capped tablets. The ROC curve was constructed from the area under the two distributions of capped and non-capped tablets; it shows a trade-off between the probabilities of true detection and false alarm. A two-graph ROC curve was presented as a modification of the original ROC curve to enable an operator to directly select the desired energy threshold for tablet monitoring. This plot shows the ROC coordinates as a function of the threshold value over the entire threshold (AE energy) range for all test outcomes.
An alternative way of deciding a threshold, based on the slope of the ROC curve, was also developed. The slope of the ROC curve identifies the optimal operating point on the curve; it depends upon the penalty costs of capping and the prevalence of capping. Sets of guidelines have been outlined for decision making, i.e. threshold setting. These guidelines take into account both the prevalence of capping in manufacturing and the costs associated with the various outcomes of tablet formation. The proposed condition monitoring system also relates AE monitoring to non-AE measurements, as it enables an operator to predict tablet hardness and disintegration from the AE energy, a relationship established in this research.
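The threshold-setting logic described above can be sketched as follows: sweep a threshold over AE energy (capped tablets emit low AE energy, so a tablet is flagged when its energy falls at or below the threshold), then pick the operating point that minimises expected cost given capping prevalence and the costs of false alarms and missed detections. This cost-minimising point is the one where the ROC slope equals ((1 - prevalence)/prevalence) x (false-alarm cost / miss cost). The variable names and cost model here are illustrative assumptions, not the thesis's implementation.

```python
def roc_points(capped, normal, thresholds):
    """ROC coordinates (threshold, TPR, FPR) when a tablet is flagged
    'capped' if its AE energy is at or below the threshold."""
    pts = []
    for t in thresholds:
        tpr = sum(e <= t for e in capped) / len(capped)  # true detections
        fpr = sum(e <= t for e in normal) / len(normal)  # false alarms
        pts.append((t, tpr, fpr))
    return pts

def pick_threshold(pts, prevalence, cost_false_alarm, cost_miss):
    """Return the threshold minimising expected cost, equivalent to
    operating where the ROC slope matches the prevalence/cost ratio."""
    def expected_cost(p):
        _, tpr, fpr = p
        return (prevalence * cost_miss * (1 - tpr)
                + (1 - prevalence) * cost_false_alarm * fpr)
    return min(pts, key=expected_cost)[0]
```

With rare capping and a high miss cost, the chosen threshold shifts toward catching every capped tablet even at the price of more false alarms, which mirrors the trade-off the two-graph ROC plot exposes to the operator.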
19

Assessing computed tomography image quality for combined detection and estimation tasks

Tseng, Hsin-Wu, Fan, Jiahua, Kupinski, Matthew A. 21 November 2017 (has links)
Maintaining or even improving image quality while lowering patient dose is a constant goal in clinical computed tomography (CT) imaging. Iterative reconstruction (IR) algorithms have been designed to allow a reduced dose while maintaining or even improving image quality. However, we have previously shown that the dose-saving capabilities of IR differ across clinical tasks. The channelized scanning linear observer (CSLO) was applied to study clinical tasks that combine detection and estimation when assessing CT image data. The purpose of this work is to illustrate the importance of task complexity when assessing dose savings and to move toward more realistic tasks in these types of studies; human-observer validation of these methods will take place in a future publication. Low-contrast objects embedded in body-size phantoms were imaged multiple times and reconstructed by filtered back projection (FBP) and an IR algorithm. The task was to detect, localize, and estimate the size and contrast of low-contrast objects in the phantom. Independent signal-present and signal-absent regions of interest cropped from the images were channelized by dense-difference-of-Gauss channels for CSLO training and testing. Estimation receiver operating characteristic (EROC) curves and the areas under them (EAUC) were calculated by the CSLO as the figure of merit, and the one-shot method was used to compute the variance of the EAUC values. Results suggest that the IR algorithm studied in this work could efficiently reduce dose by approximately 50% while maintaining image quality comparable to conventional FBP reconstruction, warranting further investigation using real patient data. (C) The Authors. Published by SPIE under a Creative Commons Attribution 3.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
20

Analysis, modelling and spatial simulation of land-cover change between natural and anthropogenic areas in the province of Napo (Ecuador) for the period 1990-2020

Hurtado Pidal, Jorge 03 July 2014 (has links)
A geographic database with base and thematic cartography was compiled for the province of Napo (Ecuador), notably including land-cover maps for 2002 and 2008. As a first product, a land-cover map for 1990 was produced from TM sensor imagery (Landsat 4 and 5). A model of the probability of presence of anthropogenic cover was then built using multivariate logistic regression; the model was evaluated with the ROC (Relative Operating Characteristic) curve and found to have high predictive power (AUC 0.89), with distance to population centres and to roads identified as the most influential variables for the presence of anthropogenic cover. The map produced by the probability model was used as input to a cover-transition model combining Cellular Automata and Markov Chains, among other components, to simulate a cover-type map (natural or anthropogenic) for 2008. This simulated map was evaluated against a reference map using kappa indices, yielding an overall agreement of 93%, a good indicator. With a model capable of simulating at the required level of confidence, simulations were then run for 2015 and 2020. These cover-type scenarios show clear future pressure on the forests along the banks of the Napo River and on those near the main population centres, especially Tena. The protected areas, however, remain in a "natural" state of conservation in the simulations, owing to their inaccessibility (lack of road infrastructure) and their special environmental conditions.
Finally, it was verified that the deforestation rate (change from natural to anthropogenic cover) was 4661 ha/year in the period 1990-2008 and would be 3550 ha/year in 2008-2020, indicating that the trend over time shows, at best, a decrease, or at least a stabilisation, of deforestation processes.
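The probability-modelling step described in this abstract (a multivariate logistic regression scored by ROC AUC) can be illustrated with a minimal sketch. The gradient-descent fitting, toy data, and single "distance" predictor are assumptions for illustration only, not the thesis's actual GIS workflow or covariates.

```python
import math

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Plain stochastic-gradient logistic regression. Returns
    [bias, w1, ...]; a stand-in for the multivariate logistic model
    used in the thesis (whose predictors included distances to
    population centres and roads)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            p = 1 / (1 + math.exp(-z))
            g = p - yi  # gradient of the log-loss w.r.t. z
            w[0] -= lr * g
            for j, xj in enumerate(xi):
                w[j + 1] -= lr * g * xj
    return w

def predict(w, xi):
    """Probability that a cell has anthropogenic cover."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1 / (1 + math.exp(-z))

def auc(scores, labels):
    """Empirical AUC, the figure of merit reported in the abstract."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In the thesis's pipeline, the fitted probability surface then feeds the Cellular Automata / Markov Chain transition model; here the sketch stops at the AUC evaluation stage.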
