31

Learning to Adapt Neural Networks Across Visual Domains

Roy, Subhankar 29 September 2022 (has links)
In the field of machine learning (ML), a very commonly encountered problem is the lack of generalizability of learnt classification functions when subjected to new samples that are not representative of the training distribution. The discrepancy between the training (a.k.a. source) and test (a.k.a. target) distributions is caused by several latent factors such as changes in appearance, illumination and viewpoint, a phenomenon popularly known as domain-shift. To make a classifier cope with such domain-shifts, a sub-field of machine learning called domain adaptation (DA) has emerged that jointly uses the annotated data from the source domain together with the unlabelled data from the target domain of interest. Adapting a classifier to an unlabelled target data set is of tremendous practical significance because it incurs no labelling cost and allows for more accurate predictions in the environment of interest. A majority of DA methods address the single-source, single-target scenario and are not easily extendable to many practical DA scenarios. As the focus on making ML models deployable increases, improved methods are needed that can handle the inherently complex DA scenarios found in the real world. In this work we build towards this goal of addressing more practical DA settings and realize novel methods for real-world applications: (i) We begin by analyzing and addressing the single-source, single-target setting, proposing whitening-based embedded normalization layers to align the marginal feature distributions between the two domains. To better utilize the unlabelled target data we propose an unsupervised regularization loss that encourages both confident and consistent predictions. (ii) Next, we build on top of the proposed normalization layers and use them in a generative framework to address multi-source DA by posing it as an image translation problem. The proposed framework, TriGAN, allows a single generator to be learned from all the source-domain data in a single network, leading to better generation of target-like source data. (iii) We address multi-target DA by learning a single classifier for all of the target domains. Our proposed framework exploits feature aggregation with a graph convolutional network to align feature representations of similar samples across domains. Moreover, to counteract noisy pseudo-labels we propose a co-teaching strategy with a dual classifier head. To enable smoother adaptation, we propose domain curriculum learning, applicable when domain labels are available, which adapts to one target domain at a time in order of increasing domain gap. (iv) Finally, we address the challenging source-free DA setting, where the only source of supervision is a source-trained model. We propose to use the Laplace approximation to build a probabilistic source model that can quantify the uncertainty of the source-model predictions on the target data. The uncertainty is then used as importance weights during the target adaptation process, down-weighting target data that do not lie on the source manifold.
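As an illustration of the "confident and consistent" regularization in (i), a minimal sketch follows, assuming two stochastic augmentations of each unlabelled target batch: the entropy term rewards confident predictions and the symmetric KL term rewards consistency between views. The function name and loss weights are illustrative assumptions, not the thesis's exact formulation.

```python
import torch
import torch.nn.functional as F

def confident_consistent_loss(logits_a, logits_b, lambda_ent=1.0, lambda_con=1.0):
    """Unsupervised regularizer on unlabelled target data (illustrative sketch).

    logits_a, logits_b: model outputs for two augmented views of the same
    target batch, shape (batch, num_classes).
    """
    p_a = F.softmax(logits_a, dim=1)
    p_b = F.softmax(logits_b, dim=1)

    # Confidence: minimize the mean prediction entropy of each view.
    entropy = -(p_a * torch.log(p_a + 1e-8)).sum(dim=1).mean() \
              - (p_b * torch.log(p_b + 1e-8)).sum(dim=1).mean()

    # Consistency: symmetric KL divergence between the two views.
    kl_ab = F.kl_div(torch.log(p_a + 1e-8), p_b, reduction="batchmean")
    kl_ba = F.kl_div(torch.log(p_b + 1e-8), p_a, reduction="batchmean")

    return lambda_ent * entropy + lambda_con * (kl_ab + kl_ba)
```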
32

CONTINUAL LEARNING: TOWARDS IMAGE CLASSIFICATION FROM SEQUENTIAL DATA

Jiangpeng He (13157496) 28 July 2022 (has links)
Though modern deep learning based approaches have achieved remarkable progress in the computer vision community, such as image classification using a static image dataset, they suffer from catastrophic forgetting when learning new classes incrementally in a phase-by-phase fashion, in which only data for the new classes are provided at each learning phase. In this work we focus on continual learning, with the objective of learning new tasks from sequentially available data without forgetting the learned knowledge. We study this problem from three perspectives: (1) continual learning in the online scenario, where each sample is used only once for training; (2) continual learning in the unsupervised scenario, where no class label is provided; and (3) continual learning in real-world applications. Specifically, for problem (1) we proposed a variant of the knowledge distillation loss together with a two-step learning technique to efficiently maintain the learned knowledge, and a novel candidate-selection algorithm to reduce the prediction bias towards new classes. For problem (2) we introduced a new framework for unsupervised continual learning that uses pseudo-labels obtained from cluster assignments, and designed an efficient out-of-distribution detector to identify whether each new sample belongs to a new or a learned class. For problem (3) we proposed a novel training regime targeted at food images using balanced training batches and a more efficient exemplar-selection algorithm. We further proposed an exemplar-free continual learning approach to address the memory issue and privacy concerns caused by storing part of the old data as exemplars.

In addition to the work on continual learning, we study image-based dietary assessment, with the objective of determining what someone eats and how much energy is consumed during the course of a day from food or eating-scene images. Specifically, we proposed a multi-task framework for simultaneous classification and portion-size estimation through feature fusion and soft parameter sharing between backbone networks. We also introduce the RGB-Distribution image, formed by concatenating the RGB image with the energy distribution map as a fourth channel, which is then used for end-to-end multi-food recognition and portion-size estimation.
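The baseline form of the distillation term that problem (1) builds on can be sketched as follows. This is the standard temperature-scaled distillation on the old-class logits against a frozen copy of the previous-phase model; the thesis proposes a variant of this loss, so the weighting and names here are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def distillation_loss(new_logits, old_logits, temperature=2.0):
    """Standard knowledge-distillation term for class-incremental learning.

    new_logits: current model outputs restricted to the old classes.
    old_logits: outputs of the frozen previous-phase model on the same batch.
    """
    log_p_new = F.log_softmax(new_logits / temperature, dim=1)
    p_old = F.softmax(old_logits / temperature, dim=1)
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_new, p_old, reduction="batchmean") * temperature ** 2

def incremental_loss(logits, labels, old_logits, num_old_classes, alpha=0.5):
    # Cross-entropy on all classes plus distillation on the old-class slice.
    ce = F.cross_entropy(logits, labels)
    kd = distillation_loss(logits[:, :num_old_classes], old_logits)
    return (1 - alpha) * ce + alpha * kd
```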
33

A novel application of deep learning with image cropping: a smart cities use case for flood monitoring

Mishra, Bhupesh K., Thakker, Dhaval, Mazumdar, S., Neagu, Daniel, Gheorghe, Marian, Simpson, Sydney 13 February 2020 (has links)
Event monitoring is an essential application of Smart City platforms. Real-time monitoring of gully and drainage blockage is an important part of flood monitoring applications. Building viable IoT sensors for detecting blockage is a complex task due to the limitations of deploying such sensors in situ. Image classification with deep learning is a potential alternative solution. However, there are no image datasets of gullies and drainages. We faced these challenges while developing a flood monitoring application in a European Union-funded project. To address them, we propose a novel image classification approach based on deep learning with an IoT-enabled camera to monitor gullies and drainages. The approach uses deep learning to develop an effective image classification model that assigns blockage images to class labels according to severity. To handle the complexity of video-based images, and the consequent poor classification accuracy of the model, we carried out experiments in which image edges are removed by cropping. The cropping in our experimentation aims to concentrate only on the regions of interest within images, leaving out a proportion of the image edges. An image dataset curated from crowd-sourced, publicly accessible images was used to train and test the proposed model. For validation, model accuracies were compared with and without image cropping. The cropping-based image classification showed an improvement in classification accuracy. This paper outlines the lessons from our experimentation that have a wider impact on many similar use cases involving IoT-based cameras as part of smart city event monitoring platforms. / European Regional Development Fund Interreg project Smart Cities and Open Data REuse (SCORE).
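The edge-cropping preprocessing step can be sketched in a few lines. The fixed 20% margin below is an assumed value for illustration; the paper experiments with different cropping proportions rather than a single setting.

```python
from PIL import Image

def crop_margins(image: Image.Image, margin_fraction: float = 0.2) -> Image.Image:
    """Remove a fixed fraction of each image edge, keeping the centre region.

    margin_fraction=0.2 drops 20% of width/height from every border,
    so the returned crop covers the central 60% x 60% of the frame.
    """
    width, height = image.size
    dx = int(width * margin_fraction)
    dy = int(height * margin_fraction)
    return image.crop((dx, dy, width - dx, height - dy))

# Usage (hypothetical file name): the cropped image, rather than the full
# frame, is what gets fed to the blockage classifier.
# cropped = crop_margins(Image.open("gully_frame.jpg"))
```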
34

Parkinson's Disease Automated Hand Tremor Analysis from Spiral Images

DeSipio, Rebecca E. 05 1900 (has links)
Parkinson’s Disease is a neurological degenerative disease affecting more than six million people worldwide. It is a progressive disease, impacting a person’s movements and thought processes. In recent years, computer vision and machine learning researchers have been developing techniques to aid in the diagnosis. This thesis is motivated by the exploration of hand tremor symptoms in Parkinson’s patients from the Archimedean Spiral test, a paper-and-pencil test used to evaluate hand tremors. This work presents a novel Fourier Domain analysis technique that transforms the pencil content of hand-drawn spiral images into frequency features. Our technique is applied to an image dataset consisting of spirals drawn by healthy individuals and people with Parkinson’s Disease. The Fourier Domain analysis technique achieves 81.5% accuracy predicting images drawn by someone with Parkinson’s, a result 6% higher than previous methods. We compared this method against the results using extracted features from the ResNet-50 and VGG16 pre-trained deep network models. The VGG16 extracted features achieve 95.4% accuracy classifying images drawn by people with Parkinson’s Disease. The extracted features of both methods were also used to develop a tremor severity rating system which scores the spiral images on a scale from 0 (no tremor) to 1 (severe tremor). The results show correlation to the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS) developed by the International Parkinson and Movement Disorder Society. These results can be useful for aiding in early detection of tremors, the medical treatment process, and symptom tracking to monitor the progression of Parkinson’s Disease. / M.S. / Parkinson’s Disease is a neurological degenerative disease affecting more than six million people worldwide. It is a progressive disease, impacting a person’s movements and thought processes. In recent years, computer vision and machine learning researchers have been developing techniques to aid in the diagnosis. This thesis is motivated by the exploration of hand tremor symptoms in Parkinson’s patients from the Archimedean Spiral test, a paper-and-pencil test used to evaluate hand tremors. This work presents a novel spiral analysis technique that converts the pencil content of hand-drawn spirals into numeric values, called features. The features measure spiral smoothness. Our technique is applied to an image dataset consisting of spirals drawn by healthy and Parkinson’s individuals. The spiral analysis technique achieves 81.5% accuracy predicting images drawn by someone with Parkinson’s. We compared this method against the results using extracted features from pre-trained deep network models. The VGG16 pre-trained model extracted features achieve 95.4% accuracy classifying images drawn by people with Parkinson’s Disease. The extracted features of both methods were also used to develop a tremor severity rating system which scores the spiral images on a scale from 0 (no tremor) to 1 (severe tremor). The results show a similar trend to the tremor evaluations rated by the Unified Parkinson’s Disease Rating Scale (MDS-UPDRS) developed by the International Parkinson and Movement Disorder Society. These results can be useful for aiding in early detection of tremors, the medical treatment process, and symptom tracking to monitor the progression of Parkinson’s Disease.
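One plausible form of the Fourier-domain feature extraction is sketched below: the magnitude spectrum of the spiral image is averaged over concentric frequency rings, since tremor tends to add energy at higher spatial frequencies. The radial binning scheme and bin count are assumptions for illustration, not the thesis's exact pipeline.

```python
import numpy as np

def fourier_features(spiral: np.ndarray, num_bins: int = 16) -> np.ndarray:
    """Radially binned magnitude spectrum of a grayscale spiral image.

    spiral: 2-D array with pencil strokes dark on a light background.
    Returns num_bins averaged magnitudes, ordered low to high spatial frequency.
    """
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(spiral)))
    h, w = spectrum.shape
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h // 2, xx - w // 2)
    # Assign every pixel to one of num_bins concentric frequency rings.
    ring = np.minimum((radius / radius.max() * num_bins).astype(int),
                      num_bins - 1)
    return np.array([spectrum[ring == i].mean() for i in range(num_bins)])
```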
35

Color Invariant Skin Segmentation

Xu, Han 25 March 2022 (has links)
This work addresses the problem of automatically detecting human skin in images without reliance on color information. Unlike previous methods, we present a new approach that performs well in the absence of such information. A key aspect of the work is that color-space augmentation is applied strategically during training, with the goal of reducing the influence of features based entirely on color and encouraging more semantic understanding. The resulting system exhibits a dramatic improvement in performance for images in which color details are diminished. We demonstrate the concept using the U-Net architecture, and experimental results show improvements in evaluations for all Fitzpatrick skin tones in the ECU dataset. We further tested the system on the RFW dataset to show that the proposed method is consistent across different ethnicities and reduces bias toward particular skin tones. This work therefore has strong potential to help mitigate bias in automated systems across many applications, including surveillance and biometrics. / Master of Science / Skin segmentation is the classification of the skin and non-skin pixels and regions in an image. Although most previous skin-detection methods have used color cues almost exclusively, they are vulnerable to external factors (e.g., poor or unnatural illumination) and to skin-tone variation. In this work, we present a new approach based on U-Net that performs well in the absence of color information. Specifically, we apply a new color-space augmentation during training to improve the performance of the skin segmentation system under diverse illumination and skin tones. The system was trained and tested with both the original and color-modified ECU dataset. We also tested our system on the RFW dataset, a larger dataset covering four ethnic groups with different skin tones. The experimental results show improvements in evaluations across skin tones and complex illuminations.
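One plausible form of such color-space augmentation is sketched below using torchvision: aggressive jitter of hue and saturation plus occasional full grayscale conversion, so the network cannot rely on color alone and must learn shape and texture cues. The jitter magnitudes and grayscale probability are assumed values for illustration, not the settings used in the thesis.

```python
from torchvision import transforms

# Illustrative color-space augmentation pipeline for training a
# color-invariant skin segmenter; magnitudes are assumptions.
color_invariant_augmentation = transforms.Compose([
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.8, hue=0.5),
    transforms.RandomGrayscale(p=0.3),
    transforms.ToTensor(),
])
```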
36

Advanced retinal vessel segmentation methods in colour fundus images

Svoboda, Ondřej January 2013 (has links)
Segmentation of the vascular tree is an important step in retinal image processing. There are many methods of automatic blood vessel segmentation, based on matched filters, pattern recognition or image classification. Automatic retinal image processing greatly simplifies and accelerates retinal image diagnosis. A key step of automatic segmentation algorithms is thresholding. This work primarily deals with retinal image thresholding. We discuss several works that use local and global thresholding and supervised image classification to segment the blood vessel tree from retinal images. We then apply image classification to the result sets of two different methods and discuss the effectiveness of the vessel segmentation. Using image classification instead of global thresholding changed the statistics of the first method on the healthy part of the HRF dataset: sensitivity and accuracy decreased to 62.32% and 94.99%, respectively, while specificity increased to 95.75%. The second method achieved 69.24% sensitivity, 98.86% specificity and 95.29% accuracy. Combining the results of both methods raised sensitivity to 72.48%, specificity to 98.59% and accuracy to 95.75%. This confirmed the assumption that the classifier would achieve better results, and showed that extending the feature vector by combining the results of both methods increased sensitivity, specificity and accuracy.
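A minimal sketch of the global-thresholding step and the sensitivity/specificity/accuracy evaluation used to compare the methods follows; the threshold value and array conventions are assumptions for illustration.

```python
import numpy as np

def threshold_and_evaluate(prob_map, ground_truth, threshold=0.5):
    """Globally threshold a vessel-probability map and score it.

    prob_map: 2-D array of per-pixel vessel probabilities in [0, 1].
    ground_truth: binary array of the same shape (1 = vessel pixel).
    Returns (sensitivity, specificity, accuracy) as fractions.
    """
    prediction = prob_map >= threshold
    truth = ground_truth.astype(bool)

    tp = np.sum(prediction & truth)
    tn = np.sum(~prediction & ~truth)
    fp = np.sum(prediction & ~truth)
    fn = np.sum(~prediction & truth)

    sensitivity = tp / (tp + fn)   # fraction of vessel pixels found
    specificity = tn / (tn + fp)   # fraction of background kept clean
    accuracy = (tp + tn) / truth.size
    return sensitivity, specificity, accuracy
```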
37

A New Look Into Image Classification: Bootstrap Approach

Ochilov, Shuhratchon January 2012 (has links)
Scene classification is performed on countless remote sensing images in support of operational activities. Automating this process is preferable since manual pixel-level classification is not feasible for large scenes. However, developing such an algorithmic solution is a challenging task due to both scene complexities and sensor limitations. The objective is to develop efficient and accurate unsupervised methods for classification (i.e., assigning each pixel to an appropriate generic class) and for labeling (i.e., properly assigning true labels to each class). Unlike traditional approaches, the proposed bootstrap approach achieves classification and labeling without training data. Here, the full image is partitioned into subimages and the true classes found in each subimage are provided by the user. After these steps, the rest of the process is automatic. Each subimage is individually classified into regions, and then, using the joint information from all subimages and regions, the optimal configuration of labels is found with an objective function derived from a Markov random field (MRF) model. The bootstrap approach has been successfully demonstrated with SAR sea-ice and lake-ice images, which represent challenging scenes used operationally for ship navigation, climate study, and ice fraction estimation. Accuracy assessment is based on evaluation conducted by third-party experts. The bootstrap method is also demonstrated on synthetic and natural images. The impact of this technique is a repeatable and accurate methodology that generates classified maps faster than the standard methodology.
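The MRF objective that the label-configuration search optimizes can be sketched as a per-pixel data term plus a Potts smoothness prior. This is only the core energy, stated here as an assumption for illustration; the thesis optimizes over joint subimage and region label assignments rather than raw pixels.

```python
import numpy as np

def mrf_energy(labels, unary_cost, beta=1.0):
    """Energy of a label map under a simple Potts-model MRF.

    labels: (H, W) integer class map.
    unary_cost: (H, W, K) per-pixel cost of assigning each of K classes
                (e.g., negative class log-likelihoods).
    beta: penalty for each pair of unlike 4-connected neighbours.
    Lower energy = better fit to the data and smoother labelling.
    """
    h, w = labels.shape
    rows, cols = np.indices((h, w))
    data_term = unary_cost[rows, cols, labels].sum()

    # Potts pairwise term over horizontal and vertical neighbours.
    smooth_term = beta * (
        np.sum(labels[:, 1:] != labels[:, :-1]) +
        np.sum(labels[1:, :] != labels[:-1, :])
    )
    return data_term + smooth_term
```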
39

Recognizing describable attributes of textures and materials in the wild and clutter

Cimpoi, Mircea January 2015 (has links)
Visual textures play an important role in image understanding because they are a key component of the semantics of many images. Furthermore, texture representations, which pool local image descriptors in an orderless manner, have had a tremendous impact on a wide range of computer vision problems, from texture recognition to object detection. In this thesis we make several contributions to the area of texture understanding. First, we add a new semantic dimension to texture recognition. Instead of focusing on instance or material recognition, we propose a human-interpretable vocabulary of texture attributes, inspired by studies in Cognitive Science, to describe common texture patterns. We also develop a corresponding dataset, the Describable Texture Dataset (DTD), for benchmarking. We show that these texture attributes produce intuitive descriptions of textures. We also show that they can be used to extract a very low-dimensional representation of any texture that is very effective in other texture analysis tasks, including improving the state of the art in material recognition on the most challenging datasets available today. Second, we look at the problem of recognizing texture attributes and materials in realistic, uncontrolled imaging conditions, including when textures appear in clutter. We build on top of the recently proposed Open Surfaces dataset, introduced by the graphics community, by deriving a corresponding benchmark for material recognition. In addition to material labels, we also augment a subset of Open Surfaces with semantic attributes. Third, we propose a novel texture representation, combining recent advances in deep learning with the power of Fisher Vector pooling. We provide a thorough evaluation of the new representation, and revisit classic texture representations in general, including bag-of-visual-words, VLAD and Fisher Vectors, in the context of deep learning. We show that these pooling mechanisms have excellent efficiency and generalisation properties if the convolutional layers of a deep model are used as local features. We obtain in this manner state-of-the-art performance on numerous datasets, both in texture recognition and in image understanding in general. We show through our experiments that the proposed representation is an efficient way to apply deep features to image regions, and that it is an effective manner of transferring deep features from one domain to another.
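The core idea of pooling convolutional activations in an orderless way can be sketched as below. For brevity this uses a bag-of-visual-words histogram in place of the Fisher Vector pooling the thesis actually studies; the codebook size and array shapes are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def orderless_pool(conv_features: np.ndarray, codebook: KMeans) -> np.ndarray:
    """Pool a conv feature map into an orderless histogram descriptor.

    conv_features: (H, W, C) activations from a CNN convolutional layer,
    treated as H*W local C-dimensional descriptors.
    codebook: a KMeans model fit beforehand on descriptors from training images.
    """
    h, w, c = conv_features.shape
    descriptors = conv_features.reshape(h * w, c)
    words = codebook.predict(descriptors)
    histogram = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return histogram / histogram.sum()   # L1-normalized image signature

# Codebook construction (on descriptors pooled from many training images):
# codebook = KMeans(n_clusters=64).fit(all_training_descriptors)
```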
40

Seeking for Relevant Descriptors and Classification for Content Based Image Retrieval

Vieux, Rémi 30 March 2011 (has links)
The explosive development of affordable, high-quality image acquisition devices has made available a tremendous amount of digital content. Large industrial companies are in need of efficient methods to exploit this content and transform it into valuable knowledge. This PhD has been accomplished in the context of the X-MEDIA project, a large European project with two major industrial partners, FIAT for the automotive industry and Rolls-Royce plc. for the aircraft industry. The project has been the trigger for research linked with strong industrial requirements. Although those user requirements can be very specific, they covered more generic research topics. Hence, we bring several contributions in the general context of Content-Based Image Retrieval (CBIR), Indexing and Classification. In the first part of the manuscript we propose contributions based on the extraction of global image descriptors. We rely on well-known descriptors from the literature to propose models for the indexing of image databases, and for the approximation of a user-defined categorisation. Additionally, we propose a new descriptor for a CBIR system which has to process a very specific image modality, for which traditional descriptors are irrelevant. In the second part of the manuscript, we focus on the task of image classification. Industrial requirements on this topic go beyond global image classification, so we developed two methods to localize and classify the local content of images, i.e. image regions, using supervised machine learning algorithms (Support Vector Machines). In the last part of the manuscript, we propose a model for Content-Based Image Retrieval based on the construction of a visual dictionary of image regions. We experiment extensively with the model in order to identify the parameters most influential on retrieval efficiency.
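The region-dictionary retrieval model in the last part can be illustrated by its query side: images are represented as histograms over the visual dictionary (for instance as produced by a pooling step like the one sketched for the previous entry) and ranked by signature similarity. The cosine-similarity ranking below is an assumed concrete choice for illustration.

```python
import numpy as np

def retrieve(query_signature, database_signatures, top_k=5):
    """Rank database images by cosine similarity of visual-word signatures.

    query_signature: (D,) histogram of dictionary-word occurrences for the query.
    database_signatures: (N, D) matrix of precomputed image signatures.
    Returns indices of the top_k most similar database images.
    """
    q = query_signature / (np.linalg.norm(query_signature) + 1e-12)
    db = database_signatures / (
        np.linalg.norm(database_signatures, axis=1, keepdims=True) + 1e-12)
    similarity = db @ q
    return np.argsort(similarity)[::-1][:top_k]
```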
