11 |
Deep Learning One-Class Classification With Support Vector Methods. Hampton, Hayden D. 01 January 2024.
Through the specialized lens of one-class classification, anomalies (irregular observations that diverge uncharacteristically from normative data patterns) are studied comprehensively. This dissertation focuses on advancing boundary-based methods in one-class classification, a critical approach to anomaly detection. These methodologies delineate optimal decision boundaries, facilitating a distinct separation between normal and anomalous observations. Encompassing traditional approaches such as the One-Class Support Vector Machine and Support Vector Data Description, recent adaptations in deep learning offer rich ground for innovation in anomaly detection. This dissertation proposes three novel deep learning methods for one-class classification, aiming to enhance the efficacy and accuracy of anomaly detection in an era where data volume and complexity present unprecedented challenges. The first two methods are designed for tabular data and are formulated from a least squares perspective. Casting these optimization problems in a least squares framework offers notable advantages: it yields closed-form expressions for the gradients that largely drive the optimization procedure, and it circumvents the degenerate or uninformative solutions that often afflict deep learning algorithms of this type. The third method is designed for second-order tensors; it has certain computational advantages and removes the need for vectorization, which can destroy structural information when spatial or contextual relationships exist in the data. The performance of the three proposed methods is demonstrated with simulation studies and real-world datasets. Compared with kernel-based one-class classification methods, the proposed deep learning methods achieve significantly better performance under the settings considered.
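For reference, the boundary-based objectives this abstract builds on can be stated in their standard forms. The following is a generic sketch of classical SVDD (Tax and Duin) and of the one-class Deep SVDD variant (Ruff et al.), not the dissertation's least squares formulations:

```latex
% Classical SVDD: smallest hypersphere with center a and radius R
% enclosing the mapped training points phi(x_i), with slacks xi_i
\min_{R,\,a,\,\xi}\ R^2 + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad \|\phi(x_i)-a\|^2 \le R^2+\xi_i,\ \ \xi_i\ge 0

% One-class Deep SVDD: the feature map is a network phi_W pulled toward
% a fixed center c, with weight decay lambda
\min_{W}\ \frac{1}{n}\sum_{i=1}^{n}\|\phi_W(x_i)-c\|^2
        + \frac{\lambda}{2}\sum_{l}\|W^{l}\|_F^2
```

In the deep variant, the constant map phi_W(x) = c is a trivial minimizer ("hypersphere collapse"); this is the degenerate, uninformative solution the abstract's least squares formulations are said to circumvent.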
|
12 |
Modelos de aprendizado supervisionado usando métodos kernel, conjuntos fuzzy e medidas de probabilidade / Supervised machine learning models using kernel methods, probability measures and fuzzy sets. Guevara Díaz, Jorge Luis. 04 May 2015.
This thesis proposes a methodology based on kernel methods, probability measures, and fuzzy sets for analyzing datasets whose individual observations are themselves sets of points rather than single points. Fuzzy sets and probability measures are used to model the observations, and kernel methods to analyze the data: fuzzy sets when an observation contains imprecise, vague, or linguistic values; probability measures when an observation is given as a set of multidimensional points in a D-dimensional Euclidean space. Kernels defined on probability measures, or on fuzzy sets, implicitly map these objects into reproducing kernel Hilbert spaces, where the analysis can be carried out with any kernel method. Using this methodology, a wide range of machine learning problems can be addressed for such datasets. In particular, the thesis presents data description models for observations modeled by probability measures, obtained by embedding the measures into Hilbert spaces and constructing minimum enclosing balls there; these models serve as one-class classifiers and are applied to the group anomaly detection task. The thesis also proposes a new class of kernels, the kernels on fuzzy sets, which are reproducing kernels able to map fuzzy sets into a geometric feature space and which act as similarity measures between fuzzy sets. It develops these kernels from basic definitions through to applications in machine learning problems such as supervised classification, regression, and the definition of distances between fuzzy sets; in particular, it applies them to supervised classification of interval data and to a kernel two-sample test for data with imprecise attributes. Potential applications include machine learning and pattern recognition tasks over fuzzy data, and computational tasks requiring a similarity measure between fuzzy sets.
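As a concrete illustration of the embedding this abstract describes, the minimal Python sketch below computes the kernel mean embedding inner product between two set-valued observations, so that a standard kernel method can compare them. The function names, the RBF choice, and the synthetic data are illustrative assumptions, not the thesis's code:

```python
import numpy as np

def rbf(X, Y, gamma=1.0):
    """Gaussian RBF kernel matrix between point sets X (n x d) and Y (m x d)."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def mean_embedding_kernel(X, Y, gamma=1.0):
    """Empirical kernel between two probability measures given by samples:
    K(P, Q) ~= mean over x in X, y in Y of k(x, y), i.e. <mu_P, mu_Q> in the RKHS."""
    return rbf(X, Y, gamma).mean()

# Two set-valued observations, each a cloud of 2-D points
rng = np.random.default_rng(0)
A = rng.normal(0.0, 1.0, size=(50, 2))
B = rng.normal(0.5, 1.0, size=(60, 2))
print(mean_embedding_kernel(A, B, gamma=0.5))
```

A Gram matrix of such values over a collection of point clouds can then be fed to any kernel machine, including the minimum enclosing ball (one-class) models the thesis applies to group anomaly detection.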
|
13 |
Defining GeoDesign and the emergent role of the sustainable sites initiative (SITES) for integrative project management. Risinger, Emily Diane. 16 March 2015.
This report discusses the multifarious applications of the modern geographic information system and how the technology's universal merit across disciplines has led to the emergence of GeoDesign. The purpose of this Master's professional report was to retrace the core conceptual framework and landmark events in the evolution of GIS technology, and to show how these factors have led to the recent creation of new performance-based rating systems and evidence-based design techniques. The Sustainable Sites Initiative (SITES), a performance-based rating system that emerged in response to the call for knowledge and best practices lacking in LEED, is discussed, along with integrated project management. The report is intended as an exploratory discussion of the larger theoretical forces fueling the shift toward mandating greater standards for sustainable design. It offers ideas for how GeoDesign should continue to evolve into the next century, and argues that new rating systems must acknowledge the growing importance of GeoDesign and ever-advancing imagery technologies for understanding complex system processes.
|
14 |
Modèles statistiques non linéaires pour l'analyse de formes : application à l'imagerie cérébrale / Non-linear statistical models for shape analysis: application to brain imaging. Sfikas, Giorgos. 07 September 2012.
This thesis concerns statistical shape analysis in the context of medical imaging, where shape analysis is used to describe the morphological variability of various organs and tissues. The focus is on constructing a generative and discriminative model, compact and non-linear, suited to representing shapes. The model is evaluated on a population of patients with Alzheimer's disease and a population of healthy control subjects. The principal interest is using the discriminative model to discover the most discriminative morphological differences between a given class of shapes and shapes not belonging to that class. The theoretical innovation of the model lies in two main points: first, a tool is proposed for extracting the discriminative difference within the Support Vector Data Description (SVDD) framework; second, all generated reconstructions are anatomically correct. The latter follows from the non-linear and compact character of the model, tied to the hypothesis that the data (the shapes) lie on a low-dimensional non-linear manifold. An application of the model to real medical data yields results consistent with medical knowledge.
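For context, in the standard SVDD framework this thesis builds on, the decision function and a first-order notion of the discriminative difference can be written as follows; this is a generic formulation under the usual kernel expansion of the center, not the thesis's exact construction:

```latex
% SVDD decision function: f(x) >= 0 inside the learned ball, < 0 outside,
% with center a = sum_i alpha_i phi(x_i) and radius R
f(x) \;=\; R^2 \;-\; \Big( k(x,x) \;-\; 2\sum_i \alpha_i\, k(x_i,x)
          \;+\; \sum_{i,j} \alpha_i \alpha_j\, k(x_i,x_j) \Big)

% A first-order "discriminative difference": the input-space gradient of f,
% pointing from a rejected shape toward the modeled class
\nabla_x f(x) \;=\; -\,\nabla_x k(x,x) \;+\; 2\sum_i \alpha_i\, \nabla_x k(x_i,x)
```

For a stationary kernel such as the Gaussian, the first gradient term vanishes, so the direction of change is a weighted combination of kernel gradients toward the support vectors.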
|
16 |
GIS využití krajiny obce Jinačovice / Land use GIS of Jinačovice municipality. Kelblová, Kristýna. January 2013.
This master's thesis focuses on the development of a GIS for the Jinačovice municipality, with particular attention to land use. The core data are aerial photographs of the area gathered since 1950; other sources include additional map types, especially the cadastral map, ZABAGED data, and statistical data. A land use analysis has been carried out, along with further analyses of the area, all conducted specifically for the Jinačovice municipality.
|
17 |
Mahalanobis kernel-based support vector data description for detection of large shifts in mean vector. Nguyen, Vu. 01 January 2015.
Statistical process control (SPC) applies statistics to process control in order to deliver higher-quality products and better services. The K chart is one of the many important tools that SPC offers; it is built on Support Vector Data Description (SVDD), a popular data-description method inspired by the Support Vector Machine (SVM). Like any SVM-based method, SVDD benefits from a wide variety of kernel choices, which determine the effectiveness of the whole model. Among the most popular is the Euclidean distance-based Gaussian kernel, which gives SVDD a flexible data description and thus enhances its overall predictive capability. This thesis explores a more robust approach by incorporating a Mahalanobis distance-based kernel (hereinafter the Mahalanobis kernel) into SVDD and compares it with SVDD under the traditional Gaussian kernel. The method's sensitivity is benchmarked by average run lengths obtained from multiple Monte Carlo simulations, with data generated from multivariate normal, multivariate Student's t, and multivariate gamma populations using R, a popular software environment for statistical computing. A case study is also discussed using a real dataset from the Halberg Chronobiology Center. Compared with the Gaussian kernel, the Mahalanobis kernel makes SVDD, and thus the K chart, significantly more sensitive to shifts in the mean vector and in the covariance matrix.
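A minimal sketch of the kernel substitution studied here, assuming the Mahalanobis kernel replaces the squared Euclidean distance inside the Gaussian kernel with a squared Mahalanobis distance computed from the in-control covariance. The function names and simulated data are illustrative, not the thesis's code:

```python
import numpy as np

def gaussian_kernel(X, Y, s=1.0):
    """Standard Gaussian kernel: exp(-||x - y||^2 / s^2)."""
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-d2 / s**2)

def mahalanobis_kernel(X, Y, cov, s=1.0):
    """Gaussian-type kernel with squared Mahalanobis distance
    (x - y)^T Sigma^{-1} (x - y) in place of the Euclidean distance."""
    P = np.linalg.inv(cov)
    Xp, Yp = X @ P, Y @ P  # precompute X Sigma^{-1} and Y Sigma^{-1}
    d2 = (np.sum(Xp * X, 1)[:, None] + np.sum(Yp * Y, 1)[None, :]
          - 2 * Xp @ Y.T)
    return np.exp(-d2 / s**2)

# In-control reference sample with correlated components;
# Sigma is estimated from it, as a Phase I step would do
rng = np.random.default_rng(1)
X0 = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=200)
Sigma = np.cov(X0, rowvar=False)
print(mahalanobis_kernel(X0[:3], X0[:3], Sigma))
```

Because the Mahalanobis distance whitens the data, the resulting SVDD boundary follows the correlation structure of the in-control process rather than treating all directions equally, which is what makes the resulting K chart more responsive to mean shifts along correlated directions.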
|
18 |
Computer aided diagnosis of epilepsy lesions based on multivariate and multimodality data analysis / Recherche de biomarqueurs par l'analyse multivariée d'images paramétriques multimodales pour le bilan non-invasif préchirurgical de l'épilepsie focale pharmaco-résistante. El Azami, Meriem. 23 September 2016.
One third of patients suffering from epilepsy are resistant to medication; in France alone, roughly 150,000 people live with drug-resistant partial epilepsy. For these patients, surgical removal of the epileptogenic zone offers the possibility of a cure, and surgical success relies heavily on accurate localization of that zone. The analysis of neuroimaging data such as anatomical magnetic resonance imaging (MRI) and FDG (fluorodeoxyglucose) positron emission tomography (PET) is increasingly used in the pre-surgical work-up of patients and may offer an alternative to the invasive reference standard, stereo-electroencephalography (SEEG) monitoring. To assist clinicians in screening for these lesions, a computer-aided diagnosis (CAD) system based on multivariate data analysis was developed. The first contribution is to formulate epileptogenic lesion detection as an outlier detection problem, a formulation motivated by the scarcity of labeled data and the class imbalance inherent to this detection task. The proposed system builds on the one-class support vector machine (OC-SVM) classifier: trained on multi-parametric features extracted from T1 MRI scans of healthy control subjects, it provides a voxelwise assessment of how far a test subject's pattern deviates from the learned normal patterns, highlighting suspect brain regions in patient images. System performance was evaluated on realistic simulations of challenging detection tasks as well as clinical data of patients with intractable epilepsy.
The outlier detection framework was then extended to account for the specificities of neuroimaging data and the detection task at hand: first, a reformulation of the support vector data description (SVDD) method that handles uncertain (noisily labeled) observations in the training data; second, an optimal fusion approach for combining multiple base one-class classifiers, each associated with one MRI sequence, to handle the multi-parametric nature of neuroimaging data; and third, a transformation of the detector's score outputs into well-calibrated probabilities, which aids score interpretation, threshold selection, and score combination, and eases integration of the CAD output into the overall pre-surgical work-up.
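A minimal sketch of the first contribution's pipeline, using scikit-learn's OneClassSVM on synthetic stand-ins for voxelwise features. The data, feature dimensions, and the ad hoc sigmoid mapping below are illustrative assumptions, not the thesis's calibrated-probability method:

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Hypothetical feature matrices: one row per voxel, columns are
# multi-parametric MRI features; controls define "normality".
rng = np.random.default_rng(2)
X_controls = rng.normal(size=(5000, 4))                       # healthy-control voxels
X_patient = np.vstack([rng.normal(size=(990, 4)),
                       rng.normal(3.0, 1.0, size=(10, 4))])   # a few deviant voxels

oc = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(X_controls)
scores = oc.decision_function(X_patient)   # negative = deviates from normality

# Crude monotone mapping of scores to pseudo-probabilities, standing in for
# the well-calibrated probabilities the thesis proposes (scale is ad hoc)
proba = 1.0 / (1.0 + np.exp(scores / np.std(scores)))
print("suspect voxels:", int((proba > 0.5).sum()))
```

In practice one such score map per voxel would be thresholded or fused across MRI sequences, which is exactly where the thesis's optimal fusion and probability calibration extensions come in.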
|