  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Deep neural networks for food waste analysis and classification : Subtraction-based methods for the case of data scarcity

Brunell, David January 2022
Machine learning generally requires large amounts of data, but data is often limited; on the whole, the amount of data needed grows with the complexity of the problem to be solved. By utilising transfer learning, data augmentation and problem reduction, acceptable performance can be achieved with limited data for a multitude of tasks. The goal of this master project is to develop an artificial neural network-based model for food waste analysis, an area in which large quantities of data are not yet readily available. Given two images, an algorithm is expected to identify what has changed between them, ignore the unchanged areas even though they might contain classifiable objects, and finally classify the change. The approach chosen in this project was to reduce the problem the machine learning algorithm has to solve by subtracting the images before they are handled by the neural network. In theory, this resolves both object localisation and the filtering of uninteresting objects, leaving only classification to the neural network. Such a procedure significantly simplifies the task the neural network has to solve, which reduces the need for training data while keeping data gathering relatively simple and fast. Several models were assessed, and adaptations of the neural network to this particular task were evaluated. A test accuracy of at best 78.9% was achieved on a limited dataset of about 1000 images spanning 10 classes. This performance was accomplished by a Siamese neural network based on VGG19, trained with triplet loss on data in which image subtraction served as the basis for creating ground-truth masks, which were then multiplied with the image containing the changed object.
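As a rough illustration of the subtraction step described above, the sketch below assumes two aligned RGB images held as NumPy arrays; the per-pixel differencing, the threshold value and the example data are illustrative choices, not details taken from the thesis.

```python
import numpy as np

def isolate_change(before: np.ndarray, after: np.ndarray, thresh: float = 30.0) -> np.ndarray:
    """Subtract two aligned images and keep only the changed region of `after`.

    `before` and `after` are HxWx3 uint8 arrays of the same scene; the
    threshold is an assumed value, not one reported in the thesis.
    """
    diff = np.abs(after.astype(np.float32) - before.astype(np.float32))
    # Collapse the colour channels and threshold to obtain a rough change mask.
    mask = (diff.mean(axis=-1) > thresh).astype(np.float32)
    # Multiply the mask with the image containing the changed object, so that
    # only classification is left for the neural network.
    return (after.astype(np.float32) * mask[..., None]).astype(np.uint8)

if __name__ == "__main__":
    before = np.zeros((64, 64, 3), dtype=np.uint8)
    after = before.copy()
    after[20:40, 20:40] = 200              # a "new" object appears
    masked = isolate_change(before, after)
    print(masked.any())                    # True: only the changed patch survives
```

The masked crop would then be passed to the classification network, in this case the Siamese VGG19 trained with triplet loss.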
2

Computer Vision Based Model for Art Skills Assessment

Alghamdi, Asaad 20 December 2022
No description available.
3

Product Matching through Multimodal Image and Text Combined Similarity Matching / Produktmatchning Genom Multimodal Kombinerad Bild- och Textlikhetsmatchning

Ko, E Soon January 2021
Product matching in e-commerce faces growing challenges as the e-commerce marketplace expands and the quality of the data available online for each product varies. Product matching provides competitive possibilities for vendors and flexibility for customers by identifying identical products from different sources. Traditional product matching is often rule-based, and machine learning methods usually tackle the issue through unimodal systems. Moreover, existing methods often rely on product identifiers, which are not always unified for each product across sources. This thesis provides multimodal approaches, based on product name, description, and image, that outperform unimodal approaches. Three multimodal approaches were taken: one unsupervised and two supervised. The unsupervised approach performs a straightforward nearest neighbor search in the embedding space and provides better results than the unimodal approaches. One of the supervised multimodal approaches uses a Siamese network on the embedding space, which outperforms the unsupervised multimodal approach. Finally, the last supervised approach instead exploits distance differences in each modality through logistic regression and a decision system, which provided the best results.
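As a hedged sketch of the best-performing decision layer described above, the snippet below assumes that distances between two product listings have already been computed in each modality (name, description, image); the synthetic data, the feature layout and the scikit-learn usage are illustrative assumptions, not the thesis implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy per-modality distances for candidate product pairs: columns are
# (name distance, description distance, image-embedding distance).
rng = np.random.default_rng(0)
matching_pairs = rng.normal(loc=0.2, scale=0.1, size=(200, 3))      # small distances
non_matching_pairs = rng.normal(loc=0.8, scale=0.2, size=(200, 3))  # large distances
X = np.vstack([matching_pairs, non_matching_pairs])
y = np.array([1] * 200 + [0] * 200)                                 # 1 = same product

# Logistic regression over the modality distances acts as the decision system.
clf = LogisticRegression().fit(X, y)
candidate_pair = np.array([[0.25, 0.30, 0.15]])
print(clf.predict_proba(candidate_pair)[0, 1])                      # probability of a match
```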
4

Počítání unikátních aut ve snímcích / Unique Car Counting

Uhrín, Peter January 2021
Current systems for counting cars in parking lots usually use specialized equipment, such as barriers at the parking lot entrance. Such equipment is not suitable for free or residential parking areas. However, even in these car parks it is useful to keep track of occupancy and other data. The system designed in this thesis uses the YOLOv4 model for visual detection of cars in photos. It then calculates an embedding vector for each vehicle, which is used to describe the car and to determine whether the car at the same parking spot has changed over time. This information is stored in a database and used to calculate various statistics such as total car count, average occupancy, or average stay time. These values can be retrieved through a REST API or viewed in the web application.
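A small sketch of the embedding comparison step, assuming an embedding vector has already been extracted for the crop detected at a given parking spot in each frame; the cosine-similarity measure and the threshold are assumptions for illustration, since the abstract only states that embeddings are compared over time.

```python
import numpy as np

def same_car(prev_emb: np.ndarray, curr_emb: np.ndarray, threshold: float = 0.85) -> bool:
    """Decide whether the car at a parking spot is unchanged between two frames.

    Compares the two embedding vectors by cosine similarity; the threshold is
    an assumed value, not one reported in the thesis.
    """
    cos = float(np.dot(prev_emb, curr_emb) /
                (np.linalg.norm(prev_emb) * np.linalg.norm(curr_emb)))
    return cos >= threshold

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    prev = rng.normal(size=128)
    print(same_car(prev, prev))                   # True: identical embedding
    print(same_car(prev, rng.normal(size=128)))   # very likely False: unrelated embedding
```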
5

Utility-Preserving Face Redaction and Change Detection For Satellite Imagery

Hanxiang Hao 22 November 2021
Face redaction is needed by law enforcement and mass media outlets to guarantee privacy. In this thesis, a performance analysis of several face redaction/obscuration methods, such as blurring and pixelation, is presented. The analysis is based on various threat models and obscuration attackers to achieve a comprehensive evaluation. We show that the traditional blurring and pixelation methods cannot guarantee privacy. To provide more secure privacy protection, we propose two novel obscuration methods based on generative adversarial networks. The proposed methods not only remove the identifiable information, but also preserve the non-identifiable facial information (also known as the utility information), such as expression, age, skin tone and gender.

We also propose methods for change detection in satellite imagery. In this thesis, we consider two types of building changes: 2D appearance change and 3D height change. We first present a model with an attention mechanism to detect building appearance changes caused by natural disasters. Furthermore, to detect changes in building height, we present a height estimation model based on building shadows and solar angles that does not rely on height annotation. Both change detection methods require good building segmentation performance, which can be hard to achieve for low-quality images, such as off-nadir images. To solve this issue, we use uncertainty modeling and satellite imagery metadata to achieve accurate building segmentation for noisy images taken from large off-nadir angles.
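The shadow-based height estimation rests on simple shadow geometry: with a flat ground plane, height = shadow length * tan(solar elevation). The sketch below shows only that relation and leaves out the segmentation, metadata handling and off-nadir corrections the thesis deals with.

```python
import math

def building_height(shadow_length_m: float, solar_elevation_deg: float) -> float:
    """Estimate building height from shadow length and solar elevation angle.

    Assumes a flat ground plane and a vertical building wall; any corrections
    used in the thesis for off-nadir viewing are omitted here.
    """
    return shadow_length_m * math.tan(math.radians(solar_elevation_deg))

if __name__ == "__main__":
    # A 20 m shadow under a 45-degree sun implies a roughly 20 m tall building.
    print(round(building_height(20.0, 45.0), 1))
```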
6

Unsupervised representation learning for anomaly detection on neuroimaging. Application to epilepsy lesion detection on brain MRI / Apprentissage de représentations non supervisé pour la détection d'anomalies en neuro-imagerie. Application à la détection de lésions d’épilepsie en IRM

Alaverdyan, Zaruhi 18 January 2019
This work is an attempt to develop a computer-aided diagnosis (CAD) system for epilepsy lesion detection based on neuroimaging data, in particular T1-weighted and FLAIR MR sequences. Given the complexity of the task and the lack of a representative voxel-level labeled data set, the adopted approach, first introduced in Azami et al., 2016, consists in casting the lesion detection task as a per-voxel outlier detection problem. The system is based on training a one-class SVM model for each voxel in the brain on a set of healthy controls, so as to model the normality of that voxel. The main focus of this work is to design representation learning mechanisms that capture the most discriminative information from multimodal imaging. Manual features, designed to mimic the characteristics of certain epilepsy lesions, such as focal cortical dysplasia (FCD), are tailored to individual pathologies and cannot discriminate a large range of epilepsy lesions. Such features reflect the known characteristics of lesion appearance; however, they might not be optimal for the task at hand. Our first contribution consists in proposing various unsupervised neural architectures as potential feature extraction mechanisms and, eventually, introducing a novel configuration of Siamese networks to be plugged into the outlier detection context.

The proposed system, evaluated on a set of T1-weighted MRIs of epilepsy patients, showed promising performance but also room for improvement. To this end, we considered extending the CAD system to accommodate multimodal data, which offers complementary information on the problem at hand. Our second contribution therefore consists in proposing strategies to combine representations of different imaging modalities into a single framework for anomaly detection. The extended system showed a significant improvement on the task of epilepsy lesion detection on T1-weighted and FLAIR MR images. Our last contribution focuses on the integration of PET data into the system. Given the small number of available PET images, we attempt to synthesize PET data from the corresponding MRI acquisitions. We show that the system trained on a mixture of synthesized and real images performs notably better than the system trained on real images only.
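A toy sketch of the per-voxel one-class SVM idea described above, assuming a fixed-length feature vector has been extracted at one voxel location for each healthy control; the feature dimensionality and the SVM hyperparameters are illustrative assumptions, not values from the thesis.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Features for one voxel location across 40 healthy controls (8-d vectors).
rng = np.random.default_rng(0)
healthy_features = rng.normal(loc=0.0, scale=1.0, size=(40, 8))

# One model per voxel learns the "normality" of that voxel from controls only.
model = OneClassSVM(nu=0.05, kernel="rbf", gamma="scale").fit(healthy_features)

# A patient voxel far from the learned distribution is flagged as an outlier.
patient_voxel = rng.normal(loc=4.0, scale=1.0, size=(1, 8))
print(model.predict(patient_voxel))   # [-1] marks an outlier (candidate lesion voxel)
```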
7

Automatic non linear metric learning : Application to gesture recognition / Apprentissage automatique de métrique non linéaire : Application à la reconnaissance de gestes

Berlemont, Samuel 11 February 2016
As consumer devices become more and more ubiquitous, new interaction solutions are required. In this thesis, we explore inertial-based gesture recognition on Smartphones, where gestures holding a semantic value are drawn in the air with the device in hand. In our research, the speed and delay constraints required by an application are critical, leading us to the choice of neural-based models. Thus, our work focuses on metric learning between gesture sample signatures using the "Siamese" architecture (Siamese Neural Network, SNN), applied to the MultiLayer Perceptron, which aims at modelling semantic relations between classes in order to extract discriminative features. Contrary to some popular versions of this algorithm, we opt for a strategy that does not require fine-tuning an additional parameter during training, namely a preset threshold on dissimilar outputs. After a preprocessing step in which the data is filtered and normalised spatially and temporally, the SNN is trained from sets of samples, composed of similar and dissimilar examples, to compute a higher-level representation of the gesture in which features are collinear for similar gestures and orthogonal for dissimilar ones. While the original model already works for classification, we identify multiple mathematical problems which can impair its learning capabilities.

Consequently, instead of the classical input-set selection strategies (similar/dissimilar pairs, or triplets formed of a reference with one similar and one dissimilar sample), we propose to include samples from every available dissimilar class, resulting in a better structuring of the output space. Moreover, we apply a regularisation on the outputs to better determine the objective function. Furthermore, the notion of polar sine enables a redefinition of the angular problem by maximising a normalised volume induced by the outputs of the reference and dissimilar samples, which effectively results in a Supervised Non-Linear Independent Component Analysis. Finally, we assess the unexplored potential of the Siamese network and its higher-level representation for novelty and error detection and rejection. With the help of two real-world inertial datasets, the Multimodal Human Activity Dataset and the Orange Dataset, gathered specifically for the Smartphone inertial symbolic gesture interaction paradigm, we characterise the performance of each contribution and demonstrate the higher novelty detection and rejection rate of our model, with protocols aiming at modelling unknown gestures and open-world configurations. To summarise, the proposed SNN allows for supervised non-linear similarity metric learning, which extracts discriminative features, improving both inertial gesture classification and rejection.
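The polar sine mentioned above generalises the sine to a set of vectors: it is the volume they span divided by the product of their norms, equal to 1 for mutually orthogonal vectors and 0 for collinear ones. The sketch below computes it from a Gram determinant; the actual training objective of the thesis is left aside.

```python
import numpy as np

def polar_sine(vectors: np.ndarray) -> float:
    """Polar sine of a set of row vectors: the normalised volume they span."""
    V = np.asarray(vectors, dtype=float)
    gram = V @ V.T                                   # Gram matrix of the vectors
    volume = np.sqrt(max(np.linalg.det(gram), 0.0))  # volume of the spanned parallelotope
    return float(volume / np.prod(np.linalg.norm(V, axis=1)))

if __name__ == "__main__":
    orthogonal = np.eye(3)                                       # mutually orthogonal outputs
    print(round(polar_sine(orthogonal), 3))                      # 1.0, the maximum
    collinear = np.array([[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
    print(round(polar_sine(collinear), 3))                       # 0.0 for collinear outputs
```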
