1 |
Real-Time Evaluation in Online Continual Learning: A New Hope. Ghunaim, Yasir, 02 1900
Current evaluations of Continual Learning (CL) methods typically assume that there is no constraint on training time and computation. This is an unrealistic assumption for any real-world setting, which motivates us to propose a practical real-time evaluation of continual learning, in which the stream does not wait for the model to complete training before revealing the next data for prediction. To do this, we evaluate current CL methods with respect to their computational costs. We conduct extensive experiments on CLOC, a large-scale dataset containing 39 million time-stamped images with geolocation labels. We show that a simple baseline outperforms state-of-the-art CL methods under this evaluation, questioning the applicability of existing methods in realistic settings. In addition, we explore various CL components commonly used in the literature, including memory sampling strategies and regularization approaches. We find that all considered methods fail to be competitive against our simple baseline. This surprisingly suggests that the majority of existing CL literature is tailored to a specific class of streams that is not practical. We hope that the evaluation we provide will be the first step towards a paradigm shift that considers computational cost in the development of online continual learning methods.
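As a hedged illustration of the real-time evaluation described above (this is a generic sketch, not the paper's benchmark code; the predict/update interface and the fixed training delay are assumptions), a stream that keeps revealing samples while a slow learner is still training can be simulated as follows:

```python
def real_time_eval(stream, model, train_cost_steps=2):
    """Simulate a stream that does not wait for the model to finish training.

    train_cost_steps: number of stream steps one model update is assumed to take.
    A computationally heavy method updates less often, so it predicts with a
    staler model than a cheap baseline does.
    """
    busy_until = 0
    correct = total = 0
    for t, (x, y) in enumerate(stream):
        pred = model.predict(x)            # a prediction is required for every sample
        correct += int(pred == y)
        total += 1
        if t >= busy_until:                # the learner is free: start a new update
            model.update(x, y)
            busy_until = t + train_cost_steps
        # otherwise the labelled sample arrives while the learner is still busy
    return correct / total
```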
|
2 |
Deep Learning Approaches for Time-Evolving Scenarios. Bertugli, Alessia, 18 April 2023
One of the most challenging topics in deep learning (DL) is the analysis of time series in complex real-world scenarios. The majority of proposed DL methods tend to simplify such environments, ignoring several relevant factors. The first part of this thesis focuses on developing video surveillance and sports analytics systems, in which obstacles, social interactions, and flow directions are relevant aspects. A DL model is proposed to predict future paths, taking human interactions into account by sharing a common memory and favouring the most common paths through belief maps. A second model is proposed that additionally considers agents' goals. This aspect is particularly relevant in sports games, where players can share objectives and tactics. Both proposed models rely on the common hypothesis that all labelled data is available from the beginning of the analysis and does not evolve. This can be a strong simplification for most real-world scenarios, where data is available as a stream and changes over time. A theoretical model for continual learning is therefore developed to address problems where only a few data arrive as a stream and labelling them is hard. Finally, continual learning strategies are applied to one of the most challenging scenarios for DL: financial market prediction. A collection of state-of-the-art continual learning techniques is applied to financial indicators representing temporal data. Results achieved during this PhD show how artificial intelligence algorithms can help to solve real-world problems in complex and time-evolving scenarios.
|
3 |
Machines Do Not Have Little Gray Cells: Analysing Catastrophic Forgetting in Cross-Domain Intrusion Detection Systems. Valieh, Ramin; Esmaeili Kia, Farid, January 2023
Cross-domain intrusion detection, a critical component of cybersecurity, involves evaluating the performance of neural networks across diverse datasets or databases. The ability of intrusion detection systems to effectively adapt to new threats and data sources is paramount for safeguarding networks and sensitive information. This research delves into the intricate world of cross-domain intrusion detection, where neural networks must demonstrate their versatility and adaptability. The results of our experiments expose a significant challenge: the phenomenon known as catastrophic forgetting. This is the tendency of neural networks to forget previously acquired knowledge when exposed to new information. In the context of intrusion detection, it means that as models are sequentially trained on different intrusion detection datasets, their performance on earlier datasets degrades drastically. This degradation poses a substantial threat to the reliability of intrusion detection systems. In response to this challenge, this research investigates potential solutions to mitigate the effects of catastrophic forgetting. We propose the application of continual learning techniques as a means to address this problem. Specifically, we explore the Elastic Weight Consolidation (EWC) algorithm as an example of preserving previously learned knowledge while allowing the model to adapt to new intrusion detection tasks. By examining the performance of neural networks on various intrusion detection datasets, we aim to shed light on the practical implications of catastrophic forgetting and the potential benefits of adopting EWC as a memory-preserving technique. This research underscores the importance of addressing catastrophic forgetting in cross-domain intrusion detection systems. It provides a stepping stone for future endeavours in enhancing multi-task learning and adaptability within the critical domain of intrusion detection, ultimately contributing to the ongoing efforts to fortify cybersecurity defences.
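As a hedged illustration of the Elastic Weight Consolidation penalty explored in this thesis (a minimal PyTorch sketch with a diagonal Fisher estimate; the function names and regularization strength are assumptions, not the authors' code):

```python
import torch
import torch.nn.functional as F

def estimate_fisher(model, loader):
    """Diagonal Fisher estimate from squared gradients on the previous dataset."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters() if p.requires_grad}
    model.eval()
    for x, y in loader:
        model.zero_grad()
        F.cross_entropy(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2 / len(loader)
    return fisher

def ewc_penalty(model, old_params, fisher, lam=1000.0):
    """Quadratic penalty anchoring parameters that were important to the old task."""
    penalty = 0.0
    for n, p in model.named_parameters():
        if n in fisher:
            penalty = penalty + (fisher[n] * (p - old_params[n]) ** 2).sum()
    return lam / 2 * penalty

# Training on a new intrusion detection dataset (illustrative):
# loss = F.cross_entropy(model(x_new), y_new) + ewc_penalty(model, old_params, fisher)
```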
|
4 |
Continual Object Learning. Erculiani, Luca, 10 June 2021
This work focuses on building frameworks to strengthen the relationship between human and machine learning. This is achieved by proposing a new category of algorithms and a new theory to formalize the perception and categorization of objects. On the algorithmic side, we developed a series of procedures to perform Interactive Continuous Open World learning from the point of view of a single user. As for humans, the inputs to the algorithms are continuous streams of visual information (sequences of frames), which enable the extraction of richer representations by exploiting the persistence of the same object in the input data. Our approaches are able to incrementally learn and recognize collections of objects, starting from zero knowledge, and to organize them in a hierarchy that follows the will of the user. We then present a novel knowledge representation theory that formalizes the properties of our setting and enables learning over it. The theory is based on the notion of separating the visual representation of objects from the semantic meaning associated with them. This distinction makes it possible to treat both instances and classes of objects as elements of the same kind, and allows objects to be dynamically rearranged according to the needs of the user. The whole framework is introduced gradually throughout the thesis and is coupled with an extensive series of experiments to demonstrate its working principles. The experiments also demonstrate the role of a developmental learning policy, in which new objects are introduced regularly, increasing recognition performance while reducing the amount of supervision provided by the user.
|
5 |
CONTINUAL LEARNING: TOWARDS IMAGE CLASSIFICATION FROM SEQUENTIAL DATA. Jiangpeng He (13157496), 28 July 2022
Though modern deep learning based approaches have achieved remarkable progress in the computer vision community, such as image classification on a static image dataset, they suffer from catastrophic forgetting when learning new classes incrementally in a phase-by-phase fashion, in which only data for new classes are provided at each learning phase. In this work we focus on continual learning with the objective of learning new tasks from sequentially available data without forgetting the learned knowledge. We study this problem from three perspectives: (1) continual learning in the online scenario, where each data point is used only once for training; (2) continual learning in the unsupervised scenario, where no class labels are provided; and (3) continual learning in real-world applications. Specifically, for problem (1), we proposed a variant of the knowledge distillation loss together with a two-step learning technique to efficiently maintain the learned knowledge, and a novel candidate selection algorithm to reduce the prediction bias towards new classes. For problem (2), we introduced a new framework for unsupervised continual learning that uses pseudo-labels obtained from cluster assignments, and designed an efficient out-of-distribution detector to identify whether each new data point belongs to new or previously learned classes. For problem (3), we proposed a novel training regime for food images using balanced training batches and a more efficient exemplar selection algorithm. In addition, we proposed an exemplar-free continual learning approach to address the memory cost and privacy concerns caused by storing part of the old data as exemplars.
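As a hedged sketch of the distillation idea described for the online scenario (the temperature, weighting, and function name are illustrative assumptions, not the thesis implementation), a cross-entropy term on the incoming labels can be combined with a distillation term that preserves the old model's soft outputs on previously learned classes:

```python
import torch
import torch.nn.functional as F

def incremental_loss(new_logits, old_logits, targets, n_old_classes, T=2.0, alpha=0.5):
    """Cross-entropy on all classes plus distillation on the old-class logits.

    new_logits: output of the model being trained (old + new classes)
    old_logits: output of a frozen copy trained on the previous classes
    """
    ce = F.cross_entropy(new_logits, targets)
    # Soften both distributions over the old classes and match them.
    log_p_new = F.log_softmax(new_logits[:, :n_old_classes] / T, dim=1)
    p_old = F.softmax(old_logits[:, :n_old_classes] / T, dim=1)
    kd = F.kl_div(log_p_new, p_old, reduction="batchmean") * (T * T)
    return (1 - alpha) * ce + alpha * kd
```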
In addition to the work related to continual learning, we study image-based dietary assessment with the objective of determining what someone eats and how much energy is consumed over the course of a day, using food or eating scene images. Specifically, we proposed a multi-task framework for simultaneous classification and portion size estimation using feature fusion and soft parameter sharing between backbone networks. In addition, we introduce the RGB-Distribution image, formed by concatenating the RGB image with the energy distribution map as a fourth channel, which is then used for end-to-end multi-food recognition and portion size estimation.
|
6 |
Self-supervised Learning Methods for Vision-based Tasks. Turrisi Da Costa, Victor Guilherme, 22 May 2024
Dealing with large amounts of unlabeled data is a very challenging task. Recently, many different approaches have been proposed to leverage this data for training many machine learning models. Among them, self-supervised learning appears as an efficient solution capable of training powerful and generalizable models. More specifically, instead of relying on human-generated labels, it proposes training objectives that use "labels" generated from the data itself, either via data augmentation or by masking the data in some way and trying to reconstruct it. Apart from being able to train models from scratch, self-supervised methods can also be used in specific applications to further improve a pre-trained model. In this thesis, we propose to leverage self-supervised methods in novel ways to tackle different application scenarios. We present four published papers: an open-source library for self-supervised learning that is flexible, scalable, and easy to use; two papers tackling unsupervised domain adaptation in action recognition; and one paper on self-supervised learning for continual learning. The published papers highlight that self-supervised techniques can be leveraged for many scenarios, yielding state-of-the-art results.
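As one hedged illustration of a self-supervised objective built from data augmentation (a generic SimCLR-style contrastive loss; this is not the code from the published papers, and the temperature value is an assumption):

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """Contrastive loss between two augmented views of the same batch."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, d), unit-norm embeddings
    sim = z @ z.t() / temperature                        # pairwise cosine similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))           # remove self-similarity
    # The positive for the i-th view in z1 is the i-th view in z2, and vice versa.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```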
|
7 |
Referencing Unlabelled World Data to Prevent Catastrophic Forgetting in Class-incremental Learning. Li, Xuan, 24 June 2022
This thesis presents a novel strategy to address the challenge of "catastrophic forgetting" in deep continual-learning systems. The term refers to severe performance degradation for older tasks, as a system learns new tasks that are presented sequentially. Most previous techniques have emphasized preservation of existing knowledge while learning new tasks, in some cases advocating a memory buffer that grows in proportion to the number of tasks. However, we offer another perspective, which is that mitigating local-task fitness during learning is as important as attempting to preserve existing knowledge. We posit the existence of a consistent, unlabelled world environment that the system uses as an easily-accessible reference to avoid favoring spurious properties over more generalizable ones. Based on this assumption, we have developed a novel method called Learning with Reference (LwR), which delivers substantial performance gains relative to its state-of-the-art counterparts. The approach does not involve a growing memory buffer, and therefore promotes better performance at scale. We present an extensive empirical evaluation on real-world datasets. / Master of Science / Rome was not built in a day, and in nature knowledge is acquired and consolidated gradually over time. Evolution has taught biological systems how to address emerging challenges by building on past experience, adapting quickly while retaining known skills. Modern artificial intelligence systems also seek to amortize the learning process over time. Specifically, one large learning task can be divided into many smaller non-overlapping tasks. For example, a classification task of two classes, tiger and horse, is divided into two tasks, where the classifier only sees and learns from tiger data in the first task and horse data in the second task. The systems are expected to sequentially acquire knowledge from these smaller tasks. Such a learning strategy is known as continual learning and provides three meaningful benefits: higher resource efficiency, a progressively better knowledge base, and strong adaptability. In this thesis, we investigate the class-incremental learning problem, a subset of continual learning, which refers to learning a classification model from a sequence of tasks.
Different from transfer learning, which targets better performance in new domains, continual learning emphasizes the preservation of knowledge from both old and new tasks. In deep neural networks, one challenge to this preservation is "catastrophic forgetting", which refers to severe performance degradation for older tasks, as a system learns new ones that are presented sequentially. An intuitive explanation is that old task data is missing in the new tasks under the continual learning setting, and the model is optimized toward new tasks without regard for the old ones. To overcome this, most previous techniques have emphasized the preservation of existing knowledge while learning new tasks, in some cases advocating old-data replay with a memory buffer, which grows in proportion to the number of tasks.
In this thesis, we offer another perspective, which is that mitigating local-task fitness during learning is as important as attempting to preserve existing knowledge. We notice that local task data always has strong biases because of its smaller size. Optimization on it leads the model to local optima, therefore losing a holistic view that is crucial for other tasks. To mitigate this, a reliable reference should be enforced across tasks and the model should consistently learn all new knowledge based on this. With this assumption, we have developed a novel method called Learning with Reference (LwR), which posits the existence of a consistent, unlabelled world environment that the system uses as an easily-accessible reference to avoid favoring spurious properties over more generalizable ones. Our extensive empirical experiments show that it significantly outperforms state-of-the-art counterparts in real-world datasets.
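One generic way to use an unlabelled reference set in the spirit described above is to keep the updated model's predictions on that reference data consistent with the previous model's predictions while training on a new task. This is only a hedged illustration of such a consistency term, not the LwR algorithm itself; the function name, temperature, and weighting are assumptions:

```python
import torch
import torch.nn.functional as F

def reference_consistency_loss(model, old_model, ref_images, T=2.0):
    """Penalize drift on an unlabelled reference (world) set relative to the old model."""
    with torch.no_grad():
        p_old = F.softmax(old_model(ref_images) / T, dim=1)
    log_p_new = F.log_softmax(model(ref_images) / T, dim=1)
    return F.kl_div(log_p_new, p_old, reduction="batchmean") * (T * T)

# Illustrative total loss on a new-task batch:
# loss = F.cross_entropy(model(x_new), y_new) + beta * reference_consistency_loss(model, old_model, x_ref)
```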
|
8 |
Continual Learning for Deep Dense Prediction. Lokegaonkar, Sanket Avinash, 11 June 2018
Transferring a deep learning model from old tasks to a new one is known to suffer from catastrophic forgetting effects. Such a forgetting mechanism is problematic, as it does not allow us to accumulate knowledge sequentially and requires retaining and retraining on all the training data. Existing techniques for mitigating the abrupt performance degradation on previously trained tasks are mainly studied in the context of image classification. In this work, we present a simple method to alleviate catastrophic forgetting for pixel-wise dense labeling problems. We build upon the regularization technique using knowledge distillation to minimize the discrepancy between the posterior distributions of pixel class labels for old tasks predicted from 1) the original and 2) the updated networks. This technique, however, might fail in circumstances where the source and target distributions differ significantly. To handle the above scenario, we further propose an improvement to the distillation-based approach by adding an adaptive L2 regularization whose strength depends on each parameter's importance to the older tasks. We train our model on FCN8s, but our approach can be generalized to stronger models like DeepLab, PSPNet, etc. Through extensive evaluation and comparisons, we show that our technique can incrementally train dense prediction models for novel object classes, different visual domains, and different visual tasks. / Master of Science / Modern deep networks have been successful on many important problems in computer vision, viz. object classification, object detection, pixel-wise dense labeling, etc. However, learning various tasks incrementally still remains a fundamental problem in computer vision. When trained incrementally on multiple tasks, deep networks have been known to suffer from catastrophic forgetting on the older tasks, which leads to a significant decrease in accuracy. Such forgetting is problematic and fundamentally constrains the knowledge that can be accumulated within these networks.
In this work, we present a simple algorithm to alleviate catastrophic forgetting for pixel-wise dense labeling problems. To prevent the network from forgetting connections important to the older tasks, we record the predicted labels/outputs of the older tasks for the images of the new task. While training on the images of the new task, we enforce a constraint to “remember” the recorded predictions of the older tasks while learning a new task. Additionally, we identify which connections in the deep network are important to the older tasks and prevent these connections from changing significantly. We show that our proposed algorithm can incrementally learn dense prediction models for novel object classes, different visual domains, and different visual tasks.
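As a hedged sketch of the two ingredients described in this abstract, pixel-wise distillation against the original network plus an importance-weighted L2 anchor (function names, the temperature, and the shape conventions are assumptions, not the thesis code):

```python
import torch
import torch.nn.functional as F

def dense_distillation_loss(new_logits, old_logits, T=2.0):
    """Match per-pixel posteriors of the updated network to the original network.

    Logits are assumed to have shape (N, C_old, H, W); every pixel is treated as a sample.
    """
    n, c, h, w = new_logits.shape
    new_flat = new_logits.permute(0, 2, 3, 1).reshape(-1, c)
    old_flat = old_logits.permute(0, 2, 3, 1).reshape(-1, c)
    log_p_new = F.log_softmax(new_flat / T, dim=1)
    p_old = F.softmax(old_flat / T, dim=1)
    return F.kl_div(log_p_new, p_old, reduction="batchmean") * (T * T)

def importance_weighted_l2(model, old_params, importance):
    """Adaptive L2 penalty: parameters important to older tasks are anchored more strongly."""
    penalty = 0.0
    for name, p in model.named_parameters():
        if name in importance:
            penalty = penalty + (importance[name] * (p - old_params[name]) ** 2).sum()
    return penalty
```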
|
9 |
Towards Robust Machine Learning Models for Data Scarcity. January 2020
A well-designed and well-trained neural network can now yield state-of-the-art results across many domains, including data mining, computer vision, and medical image analysis. But progress has been limited for tasks where labels are difficult or impossible to obtain. This reliance on exhaustive labeling is a critical limitation in the rapid deployment of neural networks. Moreover, current research scales poorly to a large number of unseen concepts and is passively spoon-fed with data and supervision.
To overcome the above data scarcity and generalization issues, in my dissertation I first propose two unsupervised conventional machine learning algorithms, hyperbolic stochastic coding and multi-resemble multi-target low-rank coding, to solve the incomplete-data and missing-label problems. I further introduce a deep multi-domain adaptation network to leverage the power of deep learning by transferring rich knowledge from a large labeled source dataset. I also invent a novel time-sequence dynamically hierarchical network that adaptively simplifies the network to cope with scarce data.
To learn a large number of unseen concepts, lifelong machine learning enjoys many advantages, including abstracting knowledge from prior learning and using that experience to help future learning, regardless of how much data is currently available. Incorporating this capability and making it versatile, I propose deep multi-task weight consolidation to accumulate knowledge continuously and significantly reduce data requirements in a variety of domains. Inspired by recent breakthroughs in automatically learning suitable neural network architectures (AutoML), I develop a nonexpansive AutoML framework to train an online model without an abundance of labeled data. This work automatically expands the network to increase model capability when necessary, then compresses the model to maintain efficiency.
In my current ongoing work, I propose an alternative method of supervised learning that does not require direct labels. It uses various forms of supervision derived from an image or object as target values for the target tasks, without explicit labels, and this turns out to be surprisingly effective. The proposed method requires only few-shot labeled data to train, and it can learn the information it needs in a self-supervised manner and generalize to datasets not seen during training. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2020
|
10 |
Efficient and Online Deep Learning through Model Plasticity and Stability. January 2020
The rapid advancement of Deep Neural Networks (DNNs), computing, and sensing technology has enabled many new applications, such as self-driving vehicles, surveillance drones, and robotic systems. Compared to conventional edge devices (e.g., cell phones or smart home devices), these emerging devices are required to deal with much more complicated and dynamic situations in real time with bounded computation resources. However, there are several challenges, including but not limited to efficiency, real-time adaptation, model stability, and automation of architecture design.
To tackle the challenges mentioned above, model plasticity and stability are leveraged to achieve efficient and online deep learning, especially in the scenario of learning streaming data at the edge:
First, a dynamic training scheme named Continuous Growth and Pruning (CGaP) is proposed to compress the DNNs through growing important parameters and pruning unimportant ones, achieving up to 98.1% reduction in the number of parameters.
Second, this dissertation presents Progressive Segmented Training (PST), which targets catastrophic forgetting problems in continual learning through importance sampling, model segmentation, and memory-assisted balancing. PST achieves state-of-the-art accuracy with 1.5X FLOPs reduction in the complete inference path.
Third, to facilitate online learning in real applications, acquisitive learning (AL) is further proposed to emphasize both knowledge inheritance and acquisition: the majority of the knowledge is first pre-trained in the inherited model and then adapted to acquire new knowledge. The inherited model's stability is monitored by noise injection and the landscape of the loss function, while the acquisition is realized by importance sampling and model segmentation. Compared to a conventional scheme, AL reduces accuracy drop by >10X on CIFAR-100 dataset, with 5X reduction in latency per training image and 150X reduction in training FLOPs.
Finally, this dissertation presents evolutionary neural architecture search in light of model stability (ENAS-S). ENAS-S uses a novel fitness score, which addresses not only accuracy but also model stability, to search for an optimal inherited model for the application of continual learning. ENAS-S outperforms hand-designed DNNs when learning from a data stream at the edge.
In summary, in this dissertation, several algorithms exploiting model plasticity and model stability are presented to improve the efficiency and accuracy of deep neural networks, especially for the scenario of continual learning. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2020
|