  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

Clustering Via Supervised Support Vector Machines

Merat, Sepehr 07 August 2008
An SVM-based clustering algorithm is introduced that clusters data with no a priori knowledge of input classes. The algorithm initializes by first running a binary SVM classifier against a data set with each vector in the set randomly labeled. Once this initialization step is complete, the SVM confidence parameters for classification on each of the training instances can be accessed. The lowest confidence data (e.g., the worst of the mislabeled data) then has its labels switched to the other class label. The SVM is then re-run on the data set (with partly re-labeled data). The repetition of the above process improves the separability until there is no misclassification. Variations on this type of clustering approach are shown.
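The relabeling loop described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not say which SVM formulation is used or how many labels are switched per round, so a small linear SVM trained by subgradient descent on the hinge loss is swapped in, and the names `train_linear_svm`, `svm_relabel_clustering`, `flip_fraction` and `max_rounds` are hypothetical.

```python
import random


def train_linear_svm(points, labels, epochs=200, lam=0.01, lr=0.1):
    """Assumed stand-in trainer: batch subgradient descent on the
    L2-regularised hinge loss of a linear classifier w.x + b."""
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # point violates the margin, so it contributes
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_relabel_clustering(points, seed=0, flip_fraction=0.5, max_rounds=50):
    """Start from random labels, then repeatedly flip the labels of the
    lowest-confidence misclassified points and retrain."""
    rng = random.Random(seed)
    labels = [rng.choice((-1, 1)) for _ in points]
    for _ in range(max_rounds):
        w, b = train_linear_svm(points, labels)
        # Signed confidence: negative means the point sits on the wrong side.
        conf = [y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
                for x, y in zip(points, labels)]
        wrong = sorted((i for i, c in enumerate(conf) if c < 0),
                       key=lambda i: conf[i])
        if not wrong:  # no misclassification left: stop
            break
        n_flip = max(1, int(len(wrong) * flip_fraction))
        for i in wrong[:n_flip]:
            labels[i] = -labels[i]
    return labels
```

Note that a loop of this kind can also end in the degenerate state where every point carries the same label, so a practical version would need a guard against that outcome.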
82

Data Driven Visual Recognition

Aghazadeh, Omid January 2014
This thesis is mostly about supervised visual recognition problems. Based on a general definition of categories, the contents are divided into two parts: one which models categories and one which is not category based. We are interested in data driven solutions for both kinds of problems. In the category-free part, we study novelty detection in temporal and spatial domains as a category-free recognition problem. Using data driven models, we demonstrate that based on a few reference exemplars, our methods are able to detect novelties in ego-motions of people, and changes in the static environments surrounding them. In the category level part, we study object recognition. We consider both object category classification and localization, and propose scalable data driven approaches for both problems. A mixture of parametric classifiers, initialized with a sophisticated clustering of the training data, is demonstrated to adapt to the data better than various baselines such as the same model initialized with less subtly designed procedures. A nonparametric large margin classifier is introduced and demonstrated to have a multitude of advantages in comparison to its competitors: better training and testing time costs, the ability to make use of indefinite/invariant and deformable similarity measures, and adaptive complexity are the main features of the proposed model. We also propose a rather realistic model of recognition problems, which quantifies the interplay between representations, classifiers, and recognition performances. Based on data-describing measures which are aggregates of pairwise similarities of the training data, our model characterizes and describes the distributions of training exemplars. The measures are shown to capture many aspects of the difficulty of categorization problems and correlate significantly to the observed recognition performances. 
Utilizing these measures, the model predicts the performance of particular classifiers on distributions similar to the training data. These predictions, when compared to the test performance of the classifiers on the test sets, are reasonably accurate. We discuss various aspects of visual recognition problems: what is the interplay between representations and classification tasks, how can different models better adapt to the training data, etc. We describe and analyze the aforementioned methods that are designed to tackle different visual recognition problems, but share one common characteristic: being data driven.
83

Classifying natural forests using LiDAR data / Klassificering av nyckelbiotoper med hjälp av LiDAR-data

Arvidsson, Simon, Gullstrand, Marcus January 2019
In forestry, natural forests are forest areas with high biodiversity, in need of preservation. The current mapping of natural forests is a tedious task that requires manual labor that could possibly be automated. In this paper we explore the main features used by a random forest algorithm to classify natural forest and managed forest in northern Sweden. The goal was to create a model with a substantial strength of agreement, meaning a Kappa value of 0.61 or higher, placing the model in the same range as models produced in previous research. We used raster data gathered from airborne LiDAR, combined with labeled sample areas, both supplied by the Swedish Forest Agency. Two experiments were performed with different features. Experiment 1 used features extracted using methods inspired by previous research, while Experiment 2 added further features to those. Of the total number of sample areas used (n=2882), 70% were used to train the models and 30% for evaluation. The result was a Kappa value of 0.26 for Experiment 1 and 0.32 for Experiment 2. The most prominent features were those derived from canopy height, where the supplied data also had the highest resolution. Percentiles, kurtosis and canopy crown areas derived from the canopy height were shown to be the most important for classification. The results fell short of our goal, possibly indicating a range of flaws in the data used. The size of the sample areas and the resolution of the raster data are likely important factors when extracting features, playing a large role in the produced model's performance.
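For readers unfamiliar with the Kappa value used as the target above, the following is a minimal sketch of Cohen's kappa for two label sequences. The function name `cohen_kappa` and the toy labels are illustrative assumptions; the abstract does not describe how the authors computed the statistic.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for the
    agreement expected by chance from the label frequencies alone."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Both labelings use a single identical class; this sketch
        # reports full agreement rather than dividing by zero.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy labels: 1 = natural forest, 0 = managed forest.
kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
meets_goal = kappa >= 0.61  # the threshold named in the abstract
```

In this toy case three of four labels agree (0.75) while chance agreement is 0.5, so kappa is 0.5, which would still fall short of the 0.61 goal.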
84

Time Series Forecasting of House Prices: An evaluation of a Support Vector Machine and a Recurrent Neural Network with LSTM cells

Rostami, Jako, Hansson, Fredrik January 2019
In this thesis, we examine the performance of different forecasting methods. We use data of monthly house prices from the larger Stockholm area and the municipality of Uppsala between 2005 and early 2019 as the time series to be forecast. Firstly, we compare the performance of two machine learning methods, the Long Short-Term Memory and the Support Vector Machine methods. The two methods' forecasts are compared, and the model with the lowest forecasting error measured by three metrics is chosen to be compared with a classic seasonal ARIMA model. We find that the Long Short-Term Memory method is the better performing machine learning method for a twelve-month forecast, but that it still does not forecast as well as the ARIMA model for the same forecast period.
85

Expansão de recursos para análise de sentimentos usando aprendizado semi-supervisionado / Extending sentiment analysis resources using semi-supervised learning

Brum, Henrico Bertini 23 March 2018
The high volume of data available on the Internet can be a good resource for studies of several tasks in Natural Language Processing, such as Sentiment Analysis. Unfortunately, annotating new corpora is costly, involving financial investment and lengthy revision processes. Our work proposes an approach for semi-supervised labeling, that is, the automatic annotation of a large unlabeled set of documents starting from a manually annotated corpus. To that end, we introduce TweetSentBR, a corpus of tweets in the TV show domain annotated for 3-point (positive, neutral and negative) sentiment classification and partially reviewed by up to seven annotators. The corpus is an important linguistic resource for Brazilian Portuguese and ranks among the largest annotated corpora for polarity classification. Beyond the manual annotation, we implemented a semi-supervised learning framework that uses this labeled data and iteratively extends it using unlabeled data. The TweetSentBR corpus, containing 15,000 annotated tweets, was thereby expanded about eightfold. For the expansion, we trained classification models using six polarity classifiers and evaluated different parameters and representation schemes in order to obtain a reliable corpus. We ran experiments generating an expanded corpus for each classifier, both for 3-point and for binary classification. We evaluated the generated corpora using a held-out set and compared the F-Measure obtained when training on the manually annotated corpus and on the semi-automatically annotated corpora. The semi-supervised corpus that obtained the best results for 3-point classification achieved an average F-Measure of 62.14%, surpassing the average obtained with the manually annotated corpus (61.02%). In binary classification, the best expanded corpus achieved an average F-Measure of 83.11%, surpassing the average obtained with the manually annotated corpus (79.80%). Furthermore, we simulated our expansion on annotated corpora from the literature, measuring how correct the semi-automatically assigned labels are. Our best result was in the expansion of a product review corpus, which achieved an F1-Measure of 93.15% on binary data. Finally, we compared a corpus from the literature labeled by distant supervision with our semi-supervised framework, and the latter outperformed the former in cross-domain binary polarity classification.
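The iterative expansion described in this abstract resembles a generic self-training loop, which can be sketched as below. This is an assumption-laden toy, not the TweetSentBR framework: the nearest-centroid classifier on one-dimensional features, the confidence margin, and the names `centroid_classifier`, `self_train` and `threshold` are all hypothetical stand-ins for the six polarity classifiers the abstract mentions.

```python
def centroid_classifier(labeled):
    """Build a toy nearest-centroid predictor from (value, label) pairs."""
    sums, counts = {}, {}
    for x, lab in labeled:
        sums[lab] = sums.get(lab, 0.0) + x
        counts[lab] = counts.get(lab, 0) + 1
    centroids = {lab: sums[lab] / counts[lab] for lab in sums}

    def predict(x):
        dists = sorted((abs(x - c), lab) for lab, c in centroids.items())
        best_dist, best_lab = dists[0]
        runner_up = dists[1][0] if len(dists) > 1 else float("inf")
        # Confidence = how much closer the best centroid is than the next.
        return best_lab, runner_up - best_dist

    return predict


def self_train(labeled, unlabeled, threshold, max_rounds=10):
    """Repeatedly move confidently predicted items into the labeled set."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        predict = centroid_classifier(labeled)
        confident, rest = [], []
        for x in pool:
            lab, conf = predict(x)
            if conf >= threshold:
                confident.append((x, lab))
            else:
                rest.append(x)
        if not confident:  # nothing new can be labeled reliably
            break
        labeled.extend(confident)
        pool = rest
    return labeled, pool


expanded, remaining = self_train([(0.0, "neg"), (10.0, "pos")],
                                 [1.0, 9.0, 5.0], threshold=2.0)
```

In this toy run the items near each seed are absorbed, while the ambiguous midpoint stays unlabeled, which mirrors the reliability concern the abstract raises about automatically assigned labels.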
86

Interpretação de clusters gerados por algoritmos de clustering hierárquico / Interpreting clusters generated by hierarchical clustering algorithms

Metz, Jean 04 August 2006
The Data Mining (DM) process consists of the automated extraction of patterns representing knowledge implicitly stored in large databases. In general, DM tasks can be classified into two categories: predictive and descriptive. Tasks in the first category, such as classification and prediction, perform inference on the data in order to make predictions, while tasks in the second category, such as clustering, characterize the general properties of the data. Unlike classification and prediction, which analyze class-labeled data objects, clustering analyzes data objects without a known class label. Clusters are formed so that objects in the same cluster are highly similar to one another but very dissimilar to objects in other clusters. Clustering can also facilitate the organization of clusters into a hierarchy that groups similar events together, creating a taxonomy that can simplify the interpretation of clusters. In this work, we propose and develop an unsupervised learning module that combines hierarchical clustering algorithms with several cluster analysis tools, aiming to help the domain specialist interpret the clustering results. Because hierarchical clustering groups objects according to similarity measures and organizes the clusters into a hierarchy, the user/specialist can analyze and explore this hierarchy at different levels in order to discover concepts described by the structure. The proposed module is integrated into a larger system under development at the Computational Intelligence Laboratory (LABIC), which covers all steps of the DM process, from data pre-processing to knowledge post-processing. To evaluate the implemented module and its use for discovering concepts from the hierarchical structure of clusters, several experiments on natural data sets were carried out, as well as a case study using a real data set. Results show the viability of the proposed methodology for cluster interpretation, although the complexity of the process depends on the characteristics of the data set.
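To make the idea of exploring a cluster hierarchy at different levels concrete, here is a small sketch of agglomerative clustering that records every level of the hierarchy. It assumes single linkage and one-dimensional points purely for brevity; the abstract does not state which linkage criteria or tools the module uses, and `single_linkage_levels` is a hypothetical name.

```python
def single_linkage_levels(points):
    """Agglomerative single-linkage clustering of 1-D points.

    Returns a list of levels; levels[k] holds the clusters (as sorted
    lists of point indices) after k merges, so levels[0] has one
    cluster per point and the last level has a single cluster.
    """
    clusters = [[i] for i in range(len(points))]
    levels = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        levels.append([sorted(c) for c in clusters])
    return levels


levels = single_linkage_levels([0.0, 0.1, 5.0, 5.2, 9.0])
three_clusters = sorted(levels[2])  # inspect the hierarchy after 2 merges
two_clusters = sorted(levels[3])    # and one level higher
```

Keeping all levels, rather than a single flat partition, is what lets a domain specialist compare a coarse grouping with a finer one when interpreting the result.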
87

Impacto da geração de grafos na classificação semissupervisionada / Impact of graph construction on semi-supervised classification

Sousa, Celso André Rodrigues de 18 July 2013
A variety of graph-based semi-supervised learning algorithms and graph construction methods have been proposed by the research community in the last few years. Despite its apparent empirical success, the field of semi-supervised learning lacks a detailed empirical study that evaluates the influence of graph construction on semi-supervised classification. In this work we provide such an empirical study. To this end, we combine a variety of graph construction methods with a variety of graph-based semi-supervised learning algorithms in order to compare them empirically on six benchmark data sets widely used in the semi-supervised learning literature. The algorithms are evaluated on digit, character, text, and image classification tasks, as well as on classification of Gaussian distributions. The experimental evaluation proposed in this work is subdivided into four parts: (1) best case analysis; (2) evaluation of classifier stability; (3) evaluation of the influence of graph construction on semi-supervised classification; (4) evaluation of the influence of regularization parameters on the classification performance of semi-supervised classifiers. In the best case analysis, we evaluate the lowest error rates of each semi-supervised learning algorithm combined with the graph construction methods using a variety of values for the sparsification parameter, which is related to the number of neighbors of each training example. In the evaluation of classifier stability, we evaluate the stability of the semi-supervised classifiers combined with the graph construction methods using a variety of sparsification parameter values. For this purpose, we fix the regularization parameters (if any) at the values that achieved the best results in the best case analysis. In the evaluation of the influence of graph construction, we evaluate the graph construction methods combined with the semi-supervised learning algorithms using a variety of sparsification parameter values. As in the evaluation of classifier stability, we fix the regularization parameters (if any) at the values that achieved the best results in the best case analysis. In the evaluation of the influence of regularization parameters, we evaluate the error surfaces generated by the semi-supervised classifiers on each graph and data set. For this purpose, we fix the graphs that achieved the best results in the best case analysis and vary the regularization parameter values. The aim of these experiments is to evaluate the trade-off between classification performance and stability of the graph-based semi-supervised learning algorithms across a variety of graph construction methods and parameter values (sparsification and regularization, if applicable). From the obtained results, we conclude that the mutual k-nearest neighbors (mutKNN) graph may be the best choice among adjacency graph construction methods, while the RBF kernel may be the best choice among weighted matrix generation methods. In addition, mutKNN tends to generate error surfaces that are smoother than those generated by the other adjacency graph construction methods. However, mutKNN is unstable for relatively small values of k. Our results indicate that the classification performance of graph-based semi-supervised learning algorithms is heavily influenced by parameter settings. We found only a few evident patterns that could help parameter selection. The consequences of this instability are discussed in this work in terms of research and practical applications.
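The two constructions singled out in these conclusions can be sketched in a few lines. The code below is a minimal illustration under common textbook definitions, not the thesis's code: an edge is kept in the mutual kNN graph only if each endpoint is among the other's k nearest neighbors, and the RBF weight is taken as exp(-d^2 / (2 sigma^2)). The function names and the parameter `sigma` are assumptions.

```python
import math


def knn_sets(points, k):
    """For each point, the index set of its k nearest other points."""
    result = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        result.append(set(others[:k]))
    return result


def mutual_knn_edges(points, k):
    """Undirected edges (i, j), i < j, where i and j pick each other."""
    nn = knn_sets(points, k)
    return {(i, j) for i in range(len(points)) for j in nn[i]
            if i < j and i in nn[j]}


def rbf_weight(p, q, sigma):
    """Gaussian (RBF) similarity between two points."""
    return math.exp(-math.dist(p, q) ** 2 / (2.0 * sigma ** 2))


pts = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (10.0, 0.0)]
edges_k1 = mutual_knn_edges(pts, k=1)
edges_k2 = mutual_knn_edges(pts, k=2)
```

With these toy points, the far point at x = 10 is never chosen as a neighbor by the others, so it stays isolated for both values of k; disconnected nodes of this kind are one plausible reason a mutual kNN graph can behave poorly when k is small.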
88

Técnica de aprendizado semissupervisionado para detecção de outliers / A semi-supervised technique for outlier detection

Zamoner, Fabio Willian 23 January 2014
Outlier detection plays an important role in knowledge discovery in large data sets. The study is motivated by a plethora of real applications such as credit card fraud, fault detection in industrial components, network intrusion detection, loan application processing and medical condition monitoring. An outlier is defined as an observation that deviates from other observations with respect to a measure and exerts a substantial influence on data analysis. Although numerous machine learning techniques have been developed to address this problem, most of them make no use of prior knowledge about the data. Semi-supervised outlier detection techniques are relatively new and use only a few labels of the normal class to build a classifier. Recently, a network-based semi-supervised model was proposed for data classification, employing a mechanism based on particle competition and cooperation. The particles are responsible for propagating labels throughout the network. In this work, we adapt this model to detect outliers by defining an outlier score based on visit frequency. The number of visits received by an outlier is significantly different from that of the remaining objects of the same class. This approach leads to an unorthodox way of dealing with outliers. Our empirical evaluations on both real and simulated data sets demonstrate that the proposed technique works well with unbalanced data sets and achieves precision comparable to that of traditional outlier detection techniques. Moreover, the technique might provide new insights into how to differentiate objects because it considers not only the physical distance but also the pattern formation of the data.
89

A Graph Theoretic Clustering Algorithm based on the Regularity Lemma and Strategies to Exploit Clustering for Prediction

Trivedi, Shubhendu 30 April 2012
The fact that clustering is perhaps the most used technique for exploratory data analysis is only a semaphore that underlines its fundamental importance. The general problem statement that broadly describes clustering as the identification and classification of patterns into coherent groups also implicitly indicates its utility in other tasks such as supervised learning. In the past decade and a half there have been two developments that have altered the landscape of research in clustering: one is improved results through the increased use of graph theoretic techniques such as spectral clustering, and the other is the study of clustering with respect to its relevance in semi-supervised learning, i.e., using unlabeled data to improve prediction accuracy. In this work an attempt is made to contribute to both of these aspects. Our contributions are thus two-fold. First, we identify some general issues with the spectral clustering framework and, while working towards a solution, introduce a new algorithm which we call "Regularity Clustering", which attempts to harness the power of the Szemeredi Regularity Lemma, a remarkable result from extremal graph theory, for the task of clustering. Second, we investigate some practical and useful strategies for using clustering of unlabeled data to boost prediction accuracy. For all of these contributions we evaluate our methods against existing ones and also apply these ideas in a number of settings.
90

The differential geometric structure in supervised learning of classifiers

Bai, Qinxun 12 May 2017
In this thesis, we study the overfitting problem in supervised learning of classifiers from a geometric perspective. As with many inverse problems, learning a classification function from a given set of example-label pairs is an ill-posed problem, i.e., there exist infinitely many classification functions that can correctly predict the class labels for all training examples. Among them, according to Occam's razor, simpler functions are favored since they are less overfitted to training examples and are therefore expected to perform better on unseen examples. The standard technique to enforce Occam's razor is to introduce a regularization scheme, which penalizes some type of complexity of the learned classification function. Some widely used regularization techniques are functional norm-based (Tikhonov) techniques, ensemble-based techniques, early stopping techniques, etc. However, there is important geometric information in the learned classification function that is closely related to overfitting, and has been overlooked by previous methods. In this thesis, we study the complexity of a classification function from a new geometric perspective. In particular, we investigate the differential geometric structure in the submanifold corresponding to the estimator of the class probability P(y|x), based on the observation that overfitting produces rapid local oscillations and hence large mean curvature of this submanifold. We also show that our geometric perspective of supervised learning is naturally related to an elastic model in physics, where our complexity measure is a high dimensional extension of the surface energy in physics. This study leads to a new geometric regularization approach for supervised learning of classifiers. In our approach, the learning process can be viewed as a submanifold fitting problem that is solved by a mean curvature flow method. 
In particular, our approach finds the submanifold by iteratively fitting the training examples in a curvature or volume decreasing manner. Our technique is unified for both binary and multiclass classification, and can be applied to regularize any classification function that satisfies two requirements: firstly, an estimator of the class probability can be obtained; secondly, first and second derivatives of the class probability estimator can be calculated. For applications, where we apply our regularization technique to standard loss functions for classification, our RBF-based implementation compares favorably to widely used regularization methods for both binary and multiclass classification. We also design a specific algorithm to incorporate our regularization technique into the standard forward-backward training of deep neural networks. For theoretical analysis, we establish Bayes consistency for a specific loss function under some mild initialization assumptions. We also discuss the extension of our approach to situations where the input space is a submanifold, rather than a Euclidean space. / 2018-11-30T00:00:00Z
