  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Incremental Learning with Large Datasets

Giritharan, Balathasan 05 1900 (has links)
This dissertation develops a novel learning strategy based on geometric support vector machines to address the difficulties of processing immense data sets. Support vector machines find the hyperplane that maximizes the margin between two classes, and because the decision boundary is represented by only a few training samples, they are a favorable choice for incremental learning. The dissertation presents a novel method, Geometric Incremental Support Vector Machines (GISVM), to address both efficiency and accuracy issues in handling massive data sets. In GISVM, the skin of a convex hull is defined, and an efficient method is designed to find the best skin approximation given the available examples. The set of extreme points is found by recursively searching along the direction defined by a pair of known extreme points. By identifying the skin of the convex hulls, incremental learning employs a much smaller number of samples with comparable or even better accuracy. When additional samples arrive, they are used together with the skin of the convex hull constructed from the previous data set, so only a small number of instances enter each incremental step of the training process. Experimental results on synthetic data sets, public benchmark data sets from UCI, and endoscopy videos show that GISVM produces satisfactory classifiers that closely model the underlying data distribution. GISVM improves sensitivity across the incremental steps, significantly reduces the demand for memory, and demonstrates the ability to recover from temporary performance degradation.
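The extreme-point idea behind the skin can be sketched as follows (an illustrative simplification, not the dissertation's recursive algorithm): the extreme point along a direction is simply the sample with the largest dot product with that direction, and scanning many directions yields a small subset that still bounds the data.

```python
import numpy as np

def extreme_point(points, direction):
    """Return the sample farthest along `direction`,
    i.e., an extreme point (vertex) of the convex hull."""
    return points[np.argmax(points @ direction)]

def skin_approximation(points, n_directions=16, seed=0):
    """Approximate the hull 'skin' by collecting extreme points along
    random unit directions -- a simplification of the recursive
    pairwise search described in the abstract."""
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(n_directions, points.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    # A set removes duplicates: many directions share an extreme point.
    skin = {tuple(extreme_point(points, d)) for d in dirs}
    return np.array(sorted(skin))

points = np.random.default_rng(1).normal(size=(200, 2))
skin = skin_approximation(points)
```

Training on `skin` instead of `points` is what keeps each incremental step small.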
2

Machine condition monitoring using artificial intelligence: The incremental learning and multi-agent system approach

Vilakazi, Christina Busisiwe 20 August 2008 (has links)
Machine condition monitoring is gaining importance in industry because of the need to increase machine reliability and decrease the production losses caused by machine breakdown. Often the data available to build a condition monitoring system do not fully represent the system. It is also common for the data to become available in small batches over a period of time. Hence, it is important to build a system that can accommodate new data as they become available without compromising performance on previously learned data. In real-world applications, more than one condition monitoring technology is used to monitor the condition of a machine. This leads to large amounts of data, which require a highly skilled diagnostic specialist to analyze. In this thesis, artificial intelligence (AI) techniques are used to build a condition monitoring system with incremental learning capabilities. Two incremental learning algorithms are implemented: the first uses the Fuzzy ARTMAP (FAM) algorithm and the second uses the Learn++ algorithm. In addition, intelligent agents and multi-agent systems are used to build a condition monitoring system that can accommodate various analysis techniques. Experiments were performed on two sets of condition monitoring data: dissolved gas analysis (DGA) data obtained from high-voltage bushings and vibration data obtained from a motor bearing. Results show that both Learn++ and FAM accommodate new data without compromising the performance of classifiers on previously learned information. Results also show that the intelligent-agent and multi-agent systems achieve modularity and flexibility.
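The batch-by-batch ensemble idea behind Learn++ can be sketched minimally (an illustration only, not the thesis implementation; the weak learner and combination rule here are simplified): each arriving batch trains a new ensemble member, and predictions are combined by majority vote, so old members preserve previously learned information.

```python
import numpy as np

class NearestCentroid:
    """A simple weak learner: classify by the nearest class mean."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None, :] - self.means_[None], axis=2)
        return self.classes_[np.argmin(d, axis=1)]

class IncrementalEnsemble:
    """Accommodate new batches by adding one learner per batch."""
    def __init__(self):
        self.members = []
    def partial_fit(self, X, y):
        self.members.append(NearestCentroid().fit(X, y))
    def predict(self, X):
        votes = np.stack([m.predict(X) for m in self.members])
        # Majority vote across ensemble members, per test sample.
        return np.array([np.bincount(col).argmax() for col in votes.T])

rng = np.random.default_rng(0)
ens = IncrementalEnsemble()
for _ in range(3):  # three batches arriving over time
    X0 = rng.normal(loc=-2, size=(30, 2)); X1 = rng.normal(loc=+2, size=(30, 2))
    ens.partial_fit(np.vstack([X0, X1]), np.array([0] * 30 + [1] * 30))
pred = ens.predict(np.array([[-2.0, -2.0], [2.0, 2.0]]))
```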
3

Incremental Learning and Online-Style SVM for Traffic Light Classification

Liu, Wen 28 January 2016 (has links)
Training on a large dataset has become a serious issue for researchers because it requires large amounts of memory and long computation times. People are trying to process large-scale datasets not only by changing the programming model, for example using MapReduce and Hadoop, but also by designing new algorithms that retain performance with less complexity and runtime. In this thesis, we present implementations of incremental learning and online learning methods to classify a large traffic-light dataset for traffic light recognition. The introduction covers the concepts of, and related work on, incremental and online learning. The main algorithm is a modification of the IMORL incremental learning model that enhances its performance over the learning process of our application. We then briefly discuss how the traffic light recognition algorithm works and the problems encountered during training. Beyond incremental learning, which consumes the data batch by batch during training, we introduce Pegasos, an online-style, primal, gradient-based support vector machine method. Pegasos achieves excellent classification performance while using a relatively small number of training instances, so it is the recommended solution to the large-dataset training problem.
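Pegasos itself has a compact update rule (Shalev-Shwartz et al.): at step t, sample one example, use step size 1/(λt), and take a stochastic sub-gradient step on the hinge loss. A minimal sketch, without the bias term or mini-batches:

```python
import numpy as np

def pegasos(X, y, lam=0.01, iters=2000, seed=0):
    """Pegasos: primal stochastic sub-gradient SVM. Labels must be +/-1."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for t in range(1, iters + 1):
        i = rng.integers(len(X))
        eta = 1.0 / (lam * t)              # step size 1/(lambda * t)
        if y[i] * (w @ X[i]) < 1:          # hinge loss active: margin violation
            w = (1 - eta * lam) * w + eta * y[i] * X[i]
        else:                              # only the regularizer contributes
            w = (1 - eta * lam) * w
        # Optional projection onto the ball of radius 1/sqrt(lambda)
        norm = np.linalg.norm(w)
        if norm > 1 / np.sqrt(lam):
            w *= (1 / np.sqrt(lam)) / norm
    return w

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([-1] * 100 + [1] * 100)
w = pegasos(X, y)
acc = np.mean(np.sign(X @ w) == y)
```

Note that each step touches a single example, which is why the number of instances needed per update is so small.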
4

Enhancement of Incremental Learning Algorithm for Support Vector Machines Using Fuzzy Set Theory

Chuang, Yu-Ming 03 February 2009 (has links)
Over the past few years, a considerable number of studies have examined Support Vector Machines (SVMs) in many domains to improve classification or prediction. However, SVMs require substantial computation time and memory when datasets are large. Although incremental learning techniques are viewed as one possible solution for reducing the computational complexity of this scalability problem, few studies have considered that examples close to the decision hyperplane, other than the support vectors (SVs), might also contribute to the learning process. Consequently, we propose three novel algorithms, named Mixed Incremental Learning (MIL), Half-Mixed Incremental Learning (HMIL), and Partition Incremental Learning (PIL), which improve Syed's incremental learning method based on fuzzy set theory. We expect to achieve better accuracy than other methods. In the experiments, the proposed algorithms are evaluated on five standard machine-learning benchmark datasets to demonstrate their effectiveness. Experimental results show that HMIL has superior classification accuracy to the other incremental or active learning algorithms. In particular, on datasets for which other studies already report high accuracy, HMIL and PIL can further improve performance.
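The notion of retaining near-hyperplane examples beyond the SVs can be illustrated with a simple fuzzy membership function over distance to the decision boundary (a hypothetical sketch, not the MIL/HMIL/PIL algorithms themselves): examples inside a band around the hyperplane get high membership and are kept for the next incremental step.

```python
import numpy as np

def margin_membership(X, w, b, band=1.5):
    """Fuzzy membership in [0, 1]: examples close to the hyperplane
    w.x + b = 0 get membership near 1 and are retained for retraining;
    examples farther than `band` away get membership 0."""
    dist = np.abs(X @ w + b) / np.linalg.norm(w)
    return np.clip(1.0 - dist / band, 0.0, 1.0)

w, b = np.array([1.0, 1.0]), 0.0
X = np.array([[0.1, 0.1], [3.0, 3.0], [-0.5, 0.2]])
m = margin_membership(X, w, b)
keep = X[m > 0.5]  # high-membership examples survive to the next step
```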
5

Audio Recognition in Incremental Open-set Environments

Jleed, Hitham 16 June 2022 (has links)
Machine learning algorithms have shown their ability to tackle difficult recognition problems, but they are still rife with challenges. Among these challenges is how to deal with problems where new categories constantly occur and the datasets can grow dynamically. Most contemporary learning algorithms developed to this point are governed by the assumption that all testing classes must be the same as the training classes, often with equal distribution. Under these assumptions, machine learning algorithms can perform very well, using their ability to handle large feature spaces and classify outliers. Systems built under these assumptions are called Closed Set Recognition (CSR) systems. However, these assumptions do not reflect practical applications, in which out-of-set data may be encountered, and this adversely affects recognition performance. When samples from a new class occur, they are classified as one of the known classes. Even if such a sample is far from any of the training samples, the algorithm may classify it with high probability; that is, the algorithm will not only be wrong, it may also be very confident in its result. A more practical setting is Open Set Recognition (OSR), where samples of classes not seen during training may show up at testing time. The inherent problems are how the system can identify novel sound classes and how it can update its models with the new classes. This thesis highlights the problems of multi-class recognition for OSR of sounds as well as incremental model adaptation, and proposes solutions to address them. The proposed solutions are validated through extensive experiments and are shown to provide improved performance over a wide range of openness values in sound classification scenarios.
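A common baseline for the open-set setting described above is a rejection threshold: if an input's best match is too weak, label it unknown rather than forcing it into a known class (an illustrative sketch, not the thesis method):

```python
import numpy as np

def open_set_predict(x, class_means, threshold):
    """Nearest-mean classification with a rejection option: if the best
    match is farther than `threshold`, return -1 (unknown) so the sample
    can later seed a new class during incremental model adaptation."""
    dists = np.linalg.norm(class_means - x, axis=1)
    k = int(np.argmin(dists))
    return k if dists[k] <= threshold else -1

means = np.array([[0.0, 0.0], [5.0, 5.0]])  # two known sound classes
pred_known = open_set_predict(np.array([0.2, -0.1]), means, threshold=2.0)
pred_novel = open_set_predict(np.array([9.0, -9.0]), means, threshold=2.0)
```

Without the threshold, the second input would be confidently, and wrongly, assigned to a known class, which is exactly the CSR failure mode the abstract describes.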
6

Dynamic protein classification: Adaptive models based on incremental learning strategies

Mohamed, Shakir 18 March 2008 (has links)
One of the major problems in computational biology is the inability of existing classification models to incorporate expanding and new domain knowledge. This problem of static classification models is addressed in this thesis by the introduction of incremental learning for problems in bioinformatics. The tools developed are applied to the problem of classifying proteins into a number of primary and putative families. This type of classification is of particular relevance because of its role in drug discovery programs and the cost and time savings it lends to that process. As a secondary problem, multi-class classification is also addressed. The standard approach to protein family classification is based on committees of binary classifiers. This one-vs-all approach is not ideal, and the classification systems presented here consist of classifiers that perform all-vs-all classification. Two incremental learning techniques are presented. The first is a novel algorithm based on the fuzzy ARTMAP classifier and an evolutionary strategy. The second applies the incremental learning algorithm Learn++. The two systems are tested using three datasets: data from the Structural Classification of Proteins (SCOP) database, the G-Protein Coupled Receptors (GPCR) database, and enzymes from the Protein Data Bank. The results show that the two techniques are comparable with each other, giving classification abilities comparable to those of single batch-trained classifiers, with the added ability of incremental learning.
Both techniques are shown to be useful for protein family classification, but they are also applicable to problems outside this area, with applications in proteomics, including the prediction of functions and of secondary and tertiary structures, and in genomics, such as promoter and splice-site prediction and the classification of gene microarrays.
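The one-vs-all committee construction mentioned above can be sketched with a toy nearest-mean scorer (illustrative only; the thesis classifiers are fuzzy-ARTMAP- and Learn++-based): one binary member scores "class k vs rest" for each class, and the most confident member wins.

```python
import numpy as np

def one_vs_all(train, X):
    """Committee of binary classifiers, one per class. Each member
    scores 'class k vs rest' (here: closer to the class mean than to
    the mean of the rest); the highest-scoring class wins."""
    classes = sorted({y for _, y in train})
    scores = []
    for k in classes:
        pos = np.mean([x for x, y in train if y == k], axis=0)
        neg = np.mean([x for x, y in train if y != k], axis=0)
        # Signed score: positive when closer to class k than to the rest.
        scores.append(np.linalg.norm(X - neg, axis=1)
                      - np.linalg.norm(X - pos, axis=1))
    return np.array(classes)[np.argmax(np.stack(scores), axis=0)]

train = [(np.array([0.0, 0.0]), "A"), (np.array([4.0, 0.0]), "B"),
         (np.array([0.0, 4.0]), "C")]
pred = one_vs_all(train, np.array([[3.8, 0.2], [0.1, 0.1]]))
```

An all-vs-all classifier, by contrast, would make a single multi-class decision instead of reconciling independent binary votes.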
7

Incremental nonparametric discriminant analysis based active learning and its applications

Dhoble, Kshitij January 2010 (has links)
Learning is an innate general cognitive ability that has empowered living entities, and especially humans, with intelligence. It is exercised by acquiring new knowledge and skills that enable them to adapt and survive. With the advancement of technology, a large amount of information is amassed, and due to its sheer volume its analysis is humanly infeasible. Therefore, for the analysis of massive data we need machines (such as computers) with the ability to learn and evolve in order to discover new knowledge from the analysed data. The majority of traditional machine learning algorithms function optimally on parametric (static) data. However, the datasets acquired in practice are often vast, inaccurate, inconsistent, non-parametric, and highly volatile. A learning algorithm's optimized performance can therefore only be transitory, which calls for a learning algorithm that can constantly evolve and adapt according to the data it processes. In light of the need for such a machine learning algorithm, we look for inspiration in humans' innate cognitive learning ability. Active learning is one such biologically inspired model, designed to mimic humans' dynamic, evolving, adaptive, and intelligent cognitive learning ability. Active learning is a class of learning algorithms that aim to create an accurate classifier by iteratively selecting important unlabeled data points by means of adaptive querying and training the classifier on those data points that are potentially useful for the targeted learning task (Tong & Koller, 2002). Traditional active learning techniques are implemented under supervised or semi-supervised learning settings (Pang et al., 2009). 
Our proposed model performs active learning in an unsupervised setting by introducing a discriminative selective sampling criterion, which reduces the computational cost by substantially decreasing the number of irrelevant instances the classifier must learn. Methods based on passive learning (which assume that the entire training dataset is truly informative and is presented in advance) prove inadequate in real-world applications (Pang et al., 2009). To overcome this limitation, we have developed Active Mode Incremental Nonparametric Discriminant Analysis (aIncNDA), which performs adaptive discriminative selection of instances for incremental NDA learning. NDA is a discriminant analysis method incorporated in our selective sampling technique to reduce the effect of outliers (anomalous observations in a dataset); it works efficiently on anomalous datasets, thereby minimizing computational cost (Raducanu & Vitrià, 2008). This thesis presents research on discrimination-based active learning in which NDA is extended for fast discriminant analysis and data sampling. In addition to NDA, a base classifier (such as a Support Vector Machine (SVM) or k-Nearest Neighbor (k-NN)) is applied to discover and merge the knowledge from newly acquired data. The performance of the proposed method is evaluated on benchmark University of California, Irvine (UCI) datasets and on face-image and object-image category datasets. The assessment on the UCI datasets showed that aIncNDA performs on par with, and in many cases better than, incremental NDA while using a smaller number of instances. Additionally, aIncNDA performs efficiently under different levels of redundancy and often shows better discrimination performance than passive incremental NDA. 
In an application to face-image and object-image recognition and retrieval, the proposed multi-example active learning system dynamically and incrementally learns from newly obtained images, gradually reducing its retrieval (classification) error rate by means of iterative refinement. The results of the empirical investigation show that the proposed active learning model can be used for classification with increased efficiency. Furthermore, given that network data are large, streaming, and constantly changing, we believe the method can find practical application in the field of Internet security.
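The discriminative selective sampling idea can be illustrated with a toy criterion (a hypothetical sketch, not aIncNDA itself): rank unlabeled points by how evenly they sit between their two nearest class means, and keep only the most ambiguous ones for the learner, discarding the uninformative rest.

```python
import numpy as np

def selective_sample(X, class_means, budget):
    """Discriminative selective sampling (illustrative): rank unlabeled
    points by the gap between their two nearest class means and keep
    the `budget` most ambiguous points; the rest are skipped, which is
    where the computational saving comes from."""
    d = np.linalg.norm(X[:, None, :] - class_means[None], axis=2)
    d.sort(axis=1)
    ambiguity = d[:, 1] - d[:, 0]           # small gap = near a boundary
    return np.argsort(ambiguity)[:budget]   # indices of most ambiguous

means = np.array([[0.0, 0.0], [4.0, 0.0]])
X = np.array([[2.0, 0.1], [0.2, 0.0], [3.9, 0.0], [1.9, -0.2]])
picked = selective_sample(X, means, budget=2)
```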
9

An Incremental Multilinear System for Human Face Learning and Recognition

Wang, Jin 05 November 2010 (has links)
This dissertation establishes a novel system for human face learning and recognition based on incremental multilinear Principal Component Analysis (PCA). Most existing face recognition systems need extensive training data during the learning process. The system proposed in this dissertation utilizes an unsupervised or weakly supervised learning approach in which the learning phase requires only a minimal amount of training data. It also overcomes the inability of traditional systems to adapt during the testing phase, where the decision process for newly acquired images continues to rely on the original training set. Consequently, when a new training set is to be used, the traditional approach requires regenerating the entire eigensystem. To speed up this computation, the proposed method uses the eigensystem generated from the old training set together with the new images to generate the new eigensystem more efficiently, in a so-called incremental learning process. In the empirical evaluation, two key factors are essential in assessing the proposed method: (1) recognition accuracy and (2) computational complexity. To establish the most suitable algorithm for this research, a comparative analysis of the best-performing methods was carried out first; its results advocated the initial use of multilinear PCA. To address the computational complexity of the subspace update procedure, a novel incremental algorithm was established that combines the traditional sequential Karhunen-Loeve (SKL) algorithm with a newly developed incremental modified fast PCA algorithm. To use multilinear PCA in the incremental process, a new unfolding method was developed that appends the newly added data to the end of the previous data. 
Results of the incremental process based on these two methods bear out the theoretical improvements. Object-tracking results on video images are also provided as a further challenging task to demonstrate the soundness of this incremental multilinear learning method.
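The sequential Karhunen-Loeve (SKL) update the dissertation builds on can be sketched as follows (a standard simplified form that ignores the mean update; not the dissertation's combined algorithm): fold a new data block into an existing basis via a small SVD instead of re-decomposing all the raw data.

```python
import numpy as np

def skl_update(U, S, B):
    """One sequential Karhunen-Loeve step (simplified): given an existing
    basis U with singular values S, fold in a new data block B (columns
    are samples) using an SVD of a small matrix K rather than of all
    raw data seen so far."""
    proj = U.T @ B                      # component of B inside the basis
    resid = B - U @ proj                # component orthogonal to the basis
    Q, _ = np.linalg.qr(resid)          # orthonormal basis for the residual
    K = np.block([[np.diag(S), proj],
                  [np.zeros((Q.shape[1], len(S))), Q.T @ B]])
    Uk, Sk, _ = np.linalg.svd(K, full_matrices=False)
    return np.hstack([U, Q]) @ Uk, Sk

rng = np.random.default_rng(0)
A = rng.normal(size=(10, 6))            # old data: 6 samples in R^10
B = rng.normal(size=(10, 3))            # newly arriving block
U0, S0, _ = np.linalg.svd(A, full_matrices=False)
U_new, S_new = skl_update(U0, S0, B)
# Compare against a batch SVD over the concatenated data
S_batch = np.linalg.svd(np.hstack([A, B]), compute_uv=False)
```

The update matches the batch decomposition while only ever factorizing the small matrix K, which is the source of the speed-up the abstract describes.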
10

Temporal Data Mining in a Dynamic Feature Space

Wenerstrom, Brent K. 22 May 2006 (has links) (PDF)
Many interesting real-world applications of temporal data mining are hindered by concept drift. One particular form of concept drift is characterized by changes to the underlying feature space, and seemingly little has been done to address this issue. This thesis presents FAE, an incremental ensemble approach to mining data subject to concept drift. FAE achieves better accuracies on four large datasets when compared with a similar incremental learning algorithm.
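The dynamic-feature-space difficulty FAE targets can be illustrated with a toy ensemble (a hypothetical sketch, not FAE itself): each member remembers the feature set of its own batch and projects new instances onto it, so later batches may introduce features unseen earlier without breaking older members.

```python
import numpy as np

def fit_centroids(batch):
    """Train one ensemble member on a batch whose instances are
    feature-name -> value dicts (the feature space may change per batch)."""
    feats = sorted({f for x, _ in batch for f in x})
    centroids = {}
    for label in {y for _, y in batch}:
        rows = [[x.get(f, 0.0) for f in feats] for x, y in batch if y == label]
        centroids[label] = np.mean(rows, axis=0)
    return feats, centroids

def predict(member, x):
    feats, centroids = member
    v = np.array([x.get(f, 0.0) for f in feats])  # project onto member's space
    return min(centroids, key=lambda c: np.linalg.norm(v - centroids[c]))

# Batch 2 introduces a feature ("c") unseen in batch 1.
b1 = [({"a": 1.0, "b": 0.0}, "pos"), ({"a": 0.0, "b": 1.0}, "neg")]
b2 = [({"a": 1.0, "c": 1.0}, "pos"), ({"b": 1.0, "c": 0.0}, "neg")]
members = [fit_centroids(b) for b in (b1, b2)]
votes = [predict(m, {"a": 0.9, "c": 0.8}) for m in members]
```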
