Global ETD Search

161	Knowledge transfer for image understanding / Transfert de connaissance pour la compréhension des images Kulkarni, Praveen 23 January 2017 Knowledge transfer (transfer learning) is a promising solution to the difficult problem of training deep convolutional neural networks (CNNs) using only small training datasets with high intra-class visual variability. 
In this thesis, we explore this paradigm to extend the ability of state-of-the-art CNNs for image classification. First, we propose several effective techniques to reduce the training and test-time computational burden associated with CNNs, a known limitation: (i) using a hybrid method that combines conventional, unsupervised aggregators such as Bag-of-Words (BoW) with CNNs; (ii) introducing a novel pooling method within a CNN framework along with non-linear part-based models. The key contribution is a technique able to discover, per image, the useful regions involved in the pooling of local representations. In addition, we propose a novel method to learn the structure of weights in deep neural networks. Experiments are run on challenging datasets with comparisons against state-of-the-art methods. The proposed methods are shown to generalize to different visual recognition tasks, such as object, scene or action classification. Computer Vision Machine Learning Image Classification Transfer Learning Part-Based Models
162	Implications of Conversational AI on Humanoid Robots Soudamalla, Sharath Kumar 09 October 2020 Humanizing Technologies GmbH develops intelligent software for the humanoid robots from Softbank Robotics. The main objective of this thesis is to develop and deploy conversational artificial intelligence software on the humanoid robots using deep learning techniques. Developing conversational agents with machine learning or artificial intelligence is an intriguing problem in Natural Language Processing, and considerable research and experimentation are being conducted in this area. Currently, most chatbots are built with rule-based programming and cannot hold a conversation that replicates real human interaction. This thesis addresses the issue by developing a deep learning conversational AI based on sequence-to-sequence modeling, an attention mechanism, transfer learning, active learning and beam search decoding, which emulates human-like conversation. The complete end-to-end conversational AI software is designed, implemented and deployed in this thesis according to the conceptual specifications. The research objectives are accomplished, and the results of the proposed concept are discussed in detail. info:eu-repo/classification/ddc/004 ddc:004
163	Improving Biometric Log Detection with Partitioning and Filtering of the Search Space Rajabli, Nijat January 2021 Tracking tree logs from the harvesting site to the processing site is a legal requirement for timber-based industries for social and economic reasons. Biometric tree log detection systems use images of the tree logs to track them by checking whether a given log image matches any of the logs registered in the system. However, as the number of registered tree logs in the database increases, the number of pairwise comparisons, and consequently the search time, increases proportionally. A growing search space degrades the accuracy and the response time of matching queries and slows down the tracking process, costing time and resources. This work introduces database filtering and partitioning approaches based on discriminative log-end features to reduce the search space of biometric log identification algorithms. In this study, 252 unique log images are used to train and test models for extracting features from the log images and to filter and cluster a database of logs. Experiments are carried out to show the end-to-end accuracy and speed-up impact of the individual approaches as well as of their combinations. The findings indicate that the proposed approaches are suited for speeding up tree log identification systems and highlight further opportunities in this field. tree log tracking tree log identification cross-section segmentation search space reduction transfer learning Computer Sciences Datavetenskap (datalogi)
164	Sentiment Analysis of Financial News with Supervised Learning Syeda, Farha Shazmeen January 2020 Financial data in banks are unstructured and complicated. It is challenging to analyze these texts manually, and only a small amount of labeled training data is available for financial text. Moreover, financial text uses the language of the economic domain, for which a general-purpose model is not efficient. In this thesis, data were collected from MFN (Modular Finance) financial news; the news items were scraped and persisted in a database, and price indices were collected from a Bloomberg terminal. A comprehensive study and tests are conducted to find state-of-the-art results for classifying sentiment using traditional classifiers such as Naive Bayes and transfer learning models such as BERT and FinBERT. FinBERT outperforms the Naive Bayes and BERT classifiers. Time-series indices for sentiment are built, and their correlations with the price indices are calculated using Pearson correlation. The Augmented Dickey-Fuller (ADF) test is used to check whether both time series are stationary. Finally, the Granger causality test, a statistical hypothesis test, determines whether the sentiment time series helps predict price. The results show that there is a significant correlation and causal relation between sentiment and price. Financial news Transfer learning Sentiment classification BERT FinBERT Time series indices Causal inference Computer Sciences Datavetenskap (datalogi)
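The correlation step named in the abstract above can be illustrated with a minimal sketch. It is not the thesis code: the function name `pearson` and the toy series are hypothetical, and the sketch assumes two aligned series of equal length. The ADF and Granger causality tests are not shown here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy values, not data from the thesis.
sentiment_index = [0.1, 0.3, 0.2, 0.5, 0.4]
price_index = [100.0, 102.0, 101.0, 105.0, 103.0]
r = pearson(sentiment_index, price_index)
```

A value of `r` near 1 or -1 indicates a strong linear relationship; by itself it says nothing about causality, which is why the abstract pairs it with a separate causality test.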
165	Detecting gastrointestinal abnormalities with binary classification of the Kvasir-Capsule dataset : A TensorFlow deep learning study / Detektering av gastrointenstinentala abnormaliteter med binär klassificering av datasetet Kvasir-Capsule : En TensoFlow djupinlärning studie Hollstensson, Mathias January 2022 The early discovery of gastrointestinal (GI) disorders can significantly decrease the fatality rate of severe afflictions. Video capsule endoscopy (VCE) is a technique that produces an eight-hour-long recording of the GI tract that needs to be manually reviewed. This has led to demand for AI-based solutions, but the lack of labeled data has been a major obstacle. In 2020 the Kvasir-Capsule dataset was produced, which is the largest labeled dataset of GI abnormalities to date, but challenges still exist. The dataset suffers from unbalanced and very similar data created from labeled video frames. To avoid specialization to the specific data, the creators of the set constructed an official split whose use for testing is encouraged. This study evaluates the use of transfer learning, data augmentation and binary classification to detect GI abnormalities. The performance of machine learning (ML) classification is explored with and without official split-based testing. The performance evaluation focuses specifically on achieving a low rate of false negatives. The reasoning behind this is that the most important aspect of an automated detection system for GI abnormalities is a low miss rate for possibly lethal abnormalities. The results from the controlled experiments conducted in this study clearly show the importance of using official split-based testing. The difference in performance between a model trained and tested on the same set and a model that uses official split-based testing is significant. 
This reinforces that, without official split-based testing, the model will not produce reliable and generalizable results. When using official split-based testing, the performance is improved compared to the initial baseline presented with the Kvasir-Capsule set. Some experiments in the study produced false negative rates as low as 1.56%, but at the cost of lowered performance for the normal class. TensorFlow Image classification Transfer learning Binary classification Data augmentation Video capsule endoscopy Kvasir-Capsule. Computer Sciences Datavetenskap (datalogi)
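As an illustration of the metric the abstract above focuses on, the following minimal sketch computes a false negative rate (miss rate) for a binary classifier. The function name, the label encoding (1 for abnormal, 0 for normal) and the toy labels are assumptions made for this example, not details from the thesis.

```python
def false_negative_rate(y_true, y_pred, positive=1):
    """Fraction of truly positive samples that the classifier missed: FN / (FN + TP)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    # A false negative is a positive sample predicted as anything else.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    if fn + tp == 0:
        raise ValueError("no positive samples in y_true")
    return fn / (fn + tp)


# Hypothetical toy labels: 1 = abnormal, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
fnr = false_negative_rate(y_true, y_pred)  # one of four abnormal frames is missed
```

Lowering this rate typically means accepting more false alarms on the normal class, which matches the trade-off the abstract reports.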
166	Measuring Porosity in Ceramic Coating using Convolutional Neural Networks and Semantic Segmentation Isaksson, Filip January 2022 Ceramic materials contain several defects, one of which is porosity. At the time of writing, porosity measurement is a manual and time-consuming process performed by a human operator. With advances in deep learning for computer vision, this thesis explores to what degree convolutional neural networks and semantic segmentation can reliably measure porosity from microscope images. Combining classical image processing techniques with deep learning, images were automatically labeled and then used for training semantic segmentation neural networks leveraging transfer learning. Deep learning-based methods were more robust and could more reliably identify porosity in a larger variety of images than solely relying on classical image processing techniques. Deep Learning Image processing Transfer Learning Material Science
167	OSPREY: Person Re-Identification in the sport of Padel : Utilizing One-Shot Person Re-identification with locally aware transformers to improve tracking Svensson, Måns, Hult, Jim January 2022 This thesis is concerned with person re-identification. Many tracking algorithms today cannot keep track of players re-entering the scene from different angles and at different times. Therefore, in this thesis, the current literature is explored to gather information about the topic, and a current state-of-the-art model is tested. The person re-identification techniques are applied to Padel games as part of a collaboration with PadelPlay AB. The purpose of the thesis is to keep track of players with correct identities during full Padel matches. To this end, a current state-of-the-art model is applied to an existing tracking algorithm to enhance its capabilities. The purpose is broken down into two research questions. First, how well does an existing person re-id model perform on Padel matches when it comes to keeping a consistent and accurate id on all players? Second, how can this model be improved to perform better in the new domain, the sport of Padel? To answer the research questions, a Padel dataset is created for benchmarking purposes. The state-of-the-art model is tested on the new dataset to see how it handles a new domain. Additionally, the same state-of-the-art model is retrained on the Padel dataset to answer the second research question. The results show that the state-of-the-art model previously trained on the Market-1501 dataset generalizes well to the Padel dataset and performs close to the new model trained purely on the Padel dataset. Although they perform alike, the new model trained on the Padel dataset is slightly better, as seen in both the quantitative and qualitative evaluations. 
Furthermore, applying re-identification technology to keep track of players yielded significantly better results than conventional solutions such as YOLOv5 with Deepsort. AI Person Re-identification Re-id Transformer Padel Computer Vision Transfer-learning Deep Learning Computer Sciences Datavetenskap (datalogi)
168	Multi-task regression QSAR/QSPR prediction utilizing text-based Transformer Neural Network and single-task using feature-based models Dimitriadis, Spyridon January 2021 With the recent advances of machine learning in cheminformatics, the drug discovery process has been accelerated, with a high impact on the field of medicine and public health. Molecular property and activity prediction are key elements in the early stages of drug discovery, helping to prioritize experiments and reduce experimental work. In this thesis, a novel approach for multi-task regression using a text-based Transformer model is introduced and thoroughly explored for training on a number of properties or activities simultaneously. This multi-task regression with a Transformer-based model is inspired by the field of Natural Language Processing (NLP) and uses prefix tokens to distinguish between tasks. To investigate the architecture, two data categories are used: 133 biological activities from the ExCAPE database and three physical chemistry properties from the MoleculeNet benchmark datasets. The Transformer model consists of an embedding layer with positional encoding, a number of encoder layers, and a Feedforward Neural Network (FNN) that turns it into a regression model. The molecules are represented as strings of characters using the Simplified Molecular-Input Line-Entry System (SMILES), a "chemistry language" with its own syntax. In addition, the effect of transfer learning is explored by experimenting with two pretrained Transformer models, pretrained on 1.5 million and on 100 million molecules. The text-based Transformer models are compared with a feature-based Support Vector Regression (SVR) model with the Tanimoto kernel, where the input molecules are encoded as Extended Connectivity Fingerprints (ECFP), which are calculated features. 
The results show that transfer learning is crucial for improving performance on both property and activity prediction. On bioactivity tasks, the larger Transformer, pretrained on 100 million molecules, achieved performance comparable to the feature-based SVR model; overall, however, SVR performed better on the majority of the bioactivity tasks. On the physicochemical property tasks, by contrast, the larger pretrained Transformer outperformed SVR on all three tasks. In conclusion, the multi-task regression architecture with the prefix token performed comparably to the traditional feature-based approach when predicting different molecular properties or activities. Lastly, larger models pretrained on a wide chemical space can play a key role in improving the performance of Transformer models on these tasks. multi-task regression QSAR QSPR deep learning attention based models transfer learning Other Computer and Information Science Annan data- och informationsvetenskap
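The Tanimoto kernel used by the SVR baseline in the abstract above compares two fingerprints by the ratio of shared to total set bits. The sketch below shows that similarity on plain Python sets; the function name and the bit indices are hypothetical, and real ECFP fingerprints would come from a cheminformatics toolkit rather than being written by hand.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity |A & B| / |A | B| of two sets of 'on' fingerprint bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Both fingerprints are empty; returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return len(a & b) / union


# Hypothetical bit indices standing in for two molecules' fingerprints.
mol_a = {1, 2, 3}
mol_b = {2, 3, 4}
similarity = tanimoto(mol_a, mol_b)  # 2 shared bits out of 4 distinct bits
```

Evaluating this function for every pair of training molecules yields the kind of precomputed kernel matrix that a kernel-based regressor can consume.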
169	Transforming Legal Entity Recognition Andersson-Säll, Tim January 2021 Transformer-based architectures have in recent years advanced state-of-the-art performance in Natural Language Processing. Researchers have successfully adapted such models to downstream tasks within NLP in a domain-specific setting. This thesis examines the application of these models to the legal domain by doing Named Entity Recognition (NER) in a setting of scarce training data. Three different pre-trained BERT models are fine-tuned on a set of 101 court case documents, whereof one model is pre-trained on legal corpora and the other two on general corpora. Experiments are run to evaluate the models’ predictive performance given smaller or larger quantities of data to fine-tune on. Results show that BERT models work reasonably well for NER with legal data. Unlike many other domain-specific BERT models, the BERT model trained on legal corpora does not outperform the base models. Modest amounts of annotated data seem sufficient for reasonably good performance. Natural Language Processing BERT Transformer Legal AI Transfer Learning Neural Networks Named Entity Recognition Probability Theory and Statistics Sannolikhetsteori och statistik
170	Investigating techniques for improving accuracy and limiting overfitting for YOLO and real-time object detection on iOS Güven, Jakup January 2019 This paper presents the creation of a real-time object detection system for iOS using YOLO, a state-of-the-art one-stage object detector and convolutional neural network that far surpasses other real-time object detectors in speed and accuracy. In this process, an object detection model is trained to detect doors. The machine learning process is outlined, and practices to combat overfitting and to increase accuracy and speed are discussed and applied. 
A series of experiments is conducted, the results of which suggest that data augmentation, including negative data in a dataset, hyperparameter optimisation and transfer learning are viable techniques for improving the performance of an object detection model. The author is able to increase mAP, a measurement of accuracy for object detectors, from 63.76% to 86.73% based on the results of the experiments. The tendency to overfit is also explored, and the results suggest that training beyond 300 epochs is likely to produce an overfitted model. YOLO object detection overfitting dataset composition hyperparameter optimisation transfer learning iOS real-time improving accuracy Engineering and Technology Teknik och teknologier
