  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Prediction Intervals for Class Probabilities

Yu, Xiaofeng, January 2007
Prediction intervals for class probabilities are of interest in machine learning because they can quantify the uncertainty about the class probability estimate for a test instance. The idea is that all likely class probability values of the test instance are included, with a pre-specified confidence level, in the calculated prediction interval. This thesis proposes a probabilistic model for calculating such prediction intervals. Given the unobservability of class probabilities, a Bayesian approach is employed to derive a complete distribution of the class probability of a test instance based on a set of class observations of training instances in the neighbourhood of the test instance. A random decision tree ensemble learning algorithm is also proposed, whose prediction output constitutes the neighbourhood that is used by the Bayesian model to produce a prediction interval for the test instance. The Bayesian model, which is used in conjunction with the ensemble learning algorithm and the standard nearest-neighbour classifier, is evaluated on artificial datasets and modified real datasets.
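A minimal sketch of the kind of interval such a Bayesian model produces, assuming a Beta prior over the class probability and simple class counts among the neighbours (the uniform prior and the 95% level here are illustrative choices, not the thesis's actual model):

```python
from scipy.stats import beta

def class_probability_interval(k, n, level=0.95, prior=(1.0, 1.0)):
    """Equal-tailed Bayesian credible interval for the class probability
    of a test instance, given k positive labels among n neighbours.
    Under a Beta(a, b) prior the posterior is Beta(k + a, n - k + b)."""
    a, b = prior
    alpha = 1.0 - level
    lo = beta.ppf(alpha / 2, k + a, n - k + b)
    hi = beta.ppf(1 - alpha / 2, k + a, n - k + b)
    return lo, hi

# e.g. 7 of 10 neighbours carry the positive class
lo, hi = class_probability_interval(7, 10)
```

As the neighbourhood grows, the interval narrows around the observed class proportion, which is the behaviour the thesis exploits.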
2

Automated detection of breast cancer using SAXS data and wavelet features

Erickson, Carissa Michelle 02 August 2005
The overarching goal of this project was to improve breast cancer screening protocols, first by collecting small-angle x-ray scattering (SAXS) images from breast biopsy tissue, and second by applying pattern recognition techniques as a semi-automatic screen. Wavelet-based features were generated from the SAXS image data. The features were supplied to a classifier, which sorted the images into distinct groups, such as normal and tumor.

The main problem in the project was to find a set of features that provided sufficient separation for classification into groups of normal and tumor. In the original SAXS patterns, information useful for classification was obscured. The wavelet maps allowed new scale-based information to be uncovered from each SAXS pattern, which was subsequently used to define features that allowed for classification. Several calculations were tested to extract useful features from the wavelet decomposition maps, and the wavelet map average intensity feature was selected as the most promising. This feature was improved by using pre-processing to remove the high central intensities from the SAXS patterns, and by using different wavelet bases for the decomposition.

The investigation undertaken for this project showed very promising results. A classification rate of 100% was achieved for distinguishing between normal samples and tumor samples, and the system also showed promising results when tested on unrelated MRI data. In the future, the semi-automatic pattern recognition tool developed for this project could be automated. With a larger set of data for training and testing, the tool could be improved upon and used to assist radiologists in the detection and classification of breast lesions.
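The wavelet map average-intensity feature can be illustrated with a hand-rolled Haar decomposition (a simplified stand-in for the wavelet bases the thesis compared; it assumes image dimensions divisible by 2^levels):

```python
import numpy as np

def haar2d(img):
    """One level of the 2-D Haar wavelet transform: returns the
    approximation (LL) and the three detail sub-bands (LH, HL, HH)."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    ll = (a + b + c + d) / 4.0
    lh = (a + b - c - d) / 4.0   # horizontal detail
    hl = (a - b + c - d) / 4.0   # vertical detail
    hh = (a - b - c + d) / 4.0   # diagonal detail
    return ll, (lh, hl, hh)

def wavelet_intensity_features(img, levels=3):
    """Mean absolute intensity of each detail sub-band at each scale --
    the kind of scale-based feature vector described above."""
    feats = []
    approx = img.astype(float)
    for _ in range(levels):
        approx, details = haar2d(approx)
        feats.extend(np.abs(band).mean() for band in details)
    return np.array(feats)
```

A flat image yields an all-zero feature vector, while edges and ring-like scattering structure show up as non-zero detail energy at the corresponding scales.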
4

Predicting context specific enhancer-promoter interactions from ChIP-Seq time course data

Dzida, Tomasz, January 2017
We develop machine learning approaches to predict context specific enhancer-promoter interactions using evidence from changes in genomic protein occupancy over time. Occupancy of estrogen receptor alpha (ER-alpha), RNA polymerase (Pol II) and histone marks H2AZ and H3K4me3 were measured over time using ChIP-Seq experiments in MCF7 cells stimulated with estrogen. Two Bayesian classifiers were developed: one supervised and one unsupervised. The supervised approach uses the correlation of temporal binding patterns at enhancers and promoters, together with genomic proximity, as features to predict interactions. The method was trained using experimentally determined interactions from the same system and achieves much higher precision than predictions based on the genomic proximity of the nearest ER-alpha binding site. We use the method to identify a confident set of ER-alpha target genes and their regulatory enhancers genome-wide. Validation with publicly available GRO-Seq data shows that our predicted targets are much more likely to show early nascent transcription than predictions based on genomic ER-alpha binding proximity alone. The accuracy of the supervised model's predictions was compared against that of the second, more complex unsupervised generative approach, which uses a proximity-based prior and the temporal binding patterns at enhancers and promoters to infer protein-mediated regulatory complexes involving individual genes and their networks of multiple distant regulatory enhancers.
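As a toy illustration of the supervised features, one might score an enhancer-promoter pair by the correlation of their binding time courses weighted by a proximity prior (the exponential decay and its length scale are assumptions for illustration, not values from the thesis):

```python
import numpy as np

def ep_score(enh_ts, prom_ts, distance_bp, length_scale=1e5):
    """Toy score for an enhancer-promoter pair: Pearson correlation of
    the two ChIP-Seq time courses, down-weighted by genomic distance.
    Anti-correlated pairs score zero; co-varying nearby pairs score high."""
    r = np.corrcoef(enh_ts, prom_ts)[0, 1]
    proximity = np.exp(-distance_bp / length_scale)
    return max(r, 0.0) * proximity
```

In the thesis both signals enter a trained Bayesian classifier rather than a fixed product, but the two ingredients, temporal correlation and proximity, are the same.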
5

Holistic Face Recognition By Dimension Reduction

Gul, Ahmet Bahtiyar, 01 January 2003
Face recognition is a popular research area with many different approaches studied in the literature. In this thesis, a holistic Principal Component Analysis (PCA) based method, namely the Eigenface method, is studied in detail, and three methods based on it are compared: Bayesian PCA, where a Bayesian classifier is applied after dimension reduction with PCA; Subspace Linear Discriminant Analysis (LDA), where LDA is applied after PCA; and Eigenface, where a Nearest Mean Classifier is applied after PCA. All three methods are implemented on the Olivetti Research Laboratory (ORL) face database, the Face Recognition Technology (FERET) database and the CNN-TURK Speakers face database. The results are compared with respect to the effects of changes in illumination, pose and aging. Simulation results show that Subspace LDA and Bayesian PCA perform slightly better than PCA under changes in pose; however, even Subspace LDA and Bayesian PCA do not perform well under changes in illumination and aging, although they perform better than PCA.
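A minimal sketch of the Eigenface-plus-Nearest-Mean variant compared above (PCA via SVD on flattened images; a simplified illustration, not the thesis's exact implementation):

```python
import numpy as np

def fit_eigenfaces(X, n_components):
    """PCA ('Eigenface') projection: X is (n_samples, n_pixels).
    Rows of the returned component matrix are the eigenfaces."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def nearest_mean_classify(X_train, y_train, x_test, mean, components):
    """Project into face space, then assign the class whose projected
    class mean is nearest (the Nearest Mean Classifier step)."""
    Z = (X_train - mean) @ components.T
    z = (x_test - mean) @ components.T
    classes = np.unique(y_train)
    dists = [np.linalg.norm(z - Z[y_train == c].mean(axis=0)) for c in classes]
    return classes[int(np.argmin(dists))]
```

Subspace LDA would insert an LDA step on the projected coordinates `Z` before classification; Bayesian PCA would replace the nearest-mean rule with a Bayesian classifier in the same reduced space.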
7

Analyse de changements multiples : une approche probabiliste utilisant les réseaux bayésiens / Multiple Change Analysis: A Probabilistic Approach Using Bayesian Networks

Bali, Khaled, 12 1900
Software maintenance is a very important phase of the software life cycle. After the development and deployment phases, it is the longest-lasting phase and accounts for the majority of industry costs. These costs are largely due to the difficulty of making changes in software and of containing the effects of those changes. In this context, much research has targeted the analysis and prediction of change impact. Existing approaches require many inputs that are difficult to obtain. In this thesis, we use a probabilistic approach in which Bayesian classifiers are trained with historical data about changes. The classifiers consider the relations between the elements of a system (inputs) and the dependencies between past changes (outputs). More precisely, a complex change is divided into elementary changes, and a Bayesian classifier is created for each type of elementary change. To predict the impact of a complex change decomposed into elementary changes, the individual decisions of the classifiers are combined according to various strategies. Our working hypothesis is that the approach can be used in two scenarios. In the first, the training data are extracted from older versions of the software whose change impact we want to analyse. In the second, the training data come from other software systems; this scenario is interesting because it allows the approach to be applied to software without a change history. We were able to correctly predict the impacts of elementary changes, and the results showed that classifiers based on conceptual relations give the best results.
For the prediction of complex changes, the "Voting" and OR combination methods are preferable when the number of changes to analyse is large; when that number is limited, the Noisy-Or method or its modified version is recommended.
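The Noisy-Or and voting combination strategies mentioned above can be sketched as follows (a generic formulation of both combiners; the thesis's modified Noisy-Or variant is not reproduced here):

```python
def noisy_or(probs, leak=0.0):
    """Noisy-OR combination: the element is impacted if at least one
    elementary-change classifier 'fires'. leak models impact from
    unobserved causes (leak=0 gives the plain Noisy-OR)."""
    p_none = 1.0 - leak
    for p in probs:
        p_none *= (1.0 - p)
    return 1.0 - p_none

def vote(probs, threshold=0.5):
    """Majority voting over the individual classifier decisions."""
    votes = sum(1 for p in probs if p >= threshold)
    return votes > len(probs) / 2
```

Noisy-OR grows with every weakly positive classifier, which is why it suits a small number of elementary changes, whereas voting requires agreement from a majority.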
8

Algorithms For Geospatial Analysis Using Multi-Resolution Remote Sensing Data

Uttam Kumar, 03 1900
Geospatial analysis involves the application of statistical methods, algorithms and information retrieval techniques to geospatial data. It incorporates time into spatial databases and facilitates the investigation of land cover (LC) dynamics through data, models, and analytics. LC dynamics induced by human and natural processes play a major role in global as well as regional scale patterns, which in turn influence weather and climate. Hence, understanding LC dynamics at the local/regional as well as global levels is essential to evolve appropriate management strategies to mitigate the impacts of LC changes. These dynamics can be captured through multi-resolution remote sensing (RS) data. However, with the advancements in sensor technologies, suitable algorithms and techniques are required for the optimal integration of information from multi-resolution sensors that are cost effective while overcoming possible data and methodological constraints. In this work, several per-pixel traditional and advanced classification techniques have been evaluated on multi-resolution data, along with the role of ancillary geographical data in classifier performance. Techniques for linear and non-linear un-mixing, endmember variability and the determination of the spatial distribution of class components within a pixel have been applied and validated on multi-resolution data. An endmember estimation method is proposed and its performance compared with manual, semi-automatic and fully automatic methods of endmember extraction. A novel technique, the Hybrid Bayesian Classifier, is developed for per-pixel classification: the class prior probabilities are determined by un-mixing a low spatial / high spectral resolution multi-spectral image, while the likelihoods are derived from ground-based training data, and posterior class probabilities are assigned to every pixel in a high spatial / low spectral resolution multi-spectral image through Bayesian classification.
These techniques have been validated with multi-resolution data for various landscapes at varying altitudes. As a case study, spatial metrics and cellular automata based models have been applied to a rapidly urbanising landscape at moderate altitude.
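A sketch of the Hybrid Bayesian Classifier idea: class priors come from the abundance fractions of a coarse-resolution un-mixing step, likelihoods from training spectra. The Gaussian likelihood below is an assumed form for illustration, not the thesis's exact model:

```python
import numpy as np

def hybrid_bayes(pixel, class_means, class_stds, unmix_priors):
    """Per-pixel MAP classification: log-likelihood of the pixel's
    spectrum under each class's Gaussian, plus the log of a prior taken
    from un-mixed abundance fractions. Returns the winning class index."""
    pixel = np.asarray(pixel, float)
    log_post = []
    for mu, sd, prior in zip(class_means, class_stds, unmix_priors):
        ll = -0.5 * np.sum(((pixel - mu) / sd) ** 2 + np.log(2 * np.pi * sd ** 2))
        log_post.append(ll + np.log(prior))
    return int(np.argmax(log_post))
```

When the spectrum is ambiguous between classes, the un-mixing prior tips the decision, which is the point of combining the two resolutions.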
9

Pokročilé dolování v datech v kardiologii / Advanced Data Mining in Cardiology

Mézl, Martin, January 2009
The aim of this master's thesis is to analyse and search for unusual dependencies in a database of patients from the Internal Cardiology Clinic of the Faculty Hospital Brno. Part of the work is a theoretical overview of common data mining methods used in medicine, especially decision trees, the naive Bayesian classifier, artificial neural networks and association rules. The search for unusual dependencies between attributes is realised using association rules and the naive Bayesian classifier. The output of this work is a complete system for the knowledge discovery in databases process, applicable to any data set. The work was carried out in collaboration with the Internal Cardiology Clinic of the Faculty Hospital Brno. All programs were written in Matlab 7.0.1.
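The association-rule side of such a pipeline can be sketched with a minimal single-antecedent rule miner (item names and thresholds below are illustrative, not from the thesis's patient data):

```python
def association_rules(transactions, min_support=0.3, min_confidence=0.7):
    """Mine rules of the form A -> B over a list of item sets, keeping
    those whose support and confidence clear the given thresholds."""
    n = len(transactions)
    items = set().union(*transactions)
    rules = []
    for a in items:
        for b in items:
            if a == b:
                continue
            n_a = sum(1 for t in transactions if a in t)
            n_ab = sum(1 for t in transactions if a in t and b in t)
            if n_a == 0:
                continue
            support, confidence = n_ab / n, n_ab / n_a
            if support >= min_support and confidence >= min_confidence:
                rules.append((a, b, support, confidence))
    return rules
```

Rules with high confidence but low expected support are exactly the "unusual dependencies" such an analysis looks for.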
10

Využití prostředků umělé inteligence na kapitálových trzích / The Use of Means of Artificial Intelligence for the Decision Making Support on Stock Market

Hrach, Vlastimil, January 2011
The diploma thesis deals with the use of artificial intelligence for predictions on stock markets. The prediction is, unconventionally, based on Bayes' theorem and the Naive Bayes classifier derived from it. In the practical part, an algorithm is designed that uses recognised relations between technical analysis indicators; specifically, 20-day and 50-day exponential moving averages were used. The program's output is a graphical forecast of future stock development, constructed from the classification of the relations between the indicators.
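The exponential moving averages and their crossover relation can be sketched as follows (a generic EMA formulation with the usual 2/(span+1) smoothing; the thesis's exact parameterisation may differ):

```python
def ema(prices, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1 - alpha) * out[-1])
    return out

def crossover_signal(prices, fast=20, slow=50):
    """+1 while the fast EMA is above the slow EMA, else -1 -- the
    relation between the two averages that serves as a classifier feature."""
    f, s = ema(prices, fast), ema(prices, slow)
    return [1 if fi > si else -1 for fi, si in zip(f, s)]
```

In the thesis these indicator relations become inputs to the Naive Bayes classifier rather than being used directly as trading signals.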
