21 |
Fine-tuned convolutional neural networks for improved glaucoma prediction
Smedjegård, Filip, January 2024
Early detection is crucial for effectively treating glaucoma, a leading cause of irreversible blindness. Diagnosing glaucoma can be challenging because its early symptoms are subtle. This study aims to enhance glaucoma prediction by fine-tuning pre-trained convolutional neural networks. Several networks were re-trained and tested on publicly available retinal image datasets. Additionally, the models were evaluated on fundus images from patients at Region Västernorrland (RVN). The methodology involved exploring how to effectively process and prepare patient data for prediction purposes. The results showed that a majority voting ensemble of the fine-tuned models produced the highest performance, achieving an accuracy of approximately 0.94, with a specificity and sensitivity of 0.97 and 0.90, respectively. The ensemble also correctly identified 90% of the glaucomatous images from RVN. In terms of specificity and sensitivity, all models outperformed the results of specialist ophthalmologists described in a previous study. The findings suggest that transfer learning is effective in enhancing the diagnostic accuracy of glaucoma. They also underscore the importance of proper storage and preparation of medical data for developing predictive machine learning models. /
Glaucoma (commonly known in Swedish as grön starr) is one of the most common eye diseases that cause blindness. It is important to diagnose glaucoma early in the course of the disease so that treatment can slow or stop further vision loss. Diagnosing glaucoma can be challenging, since it usually shows no early symptoms. Artificial intelligence (AI), or more specifically machine learning (ML), can help physicians reach the correct diagnosis when it is used as decision support. Convolutional neural networks (CNNs) can learn to recognize patterns in images and thereby classify images into different categories. One way to diagnose glaucoma is to examine the retina and the optic nerve at the back of the eye, known as the fundus. In this study, pre-trained CNNs were fine-tuned to predict glaucoma from fundus images. This was achieved by re-training the models on publicly available fundus images. The goal was to compare the networks' accuracy on a subset of the images and to evaluate them on fundus images from hospitals in Region Västernorrland (RVN). To this end, the methodology also included exploring the limitations and possibilities regarding how patient data may be used, and investigating how the data should be stored and prepared to enable the development of prediction models. The aim of the study was to increase the accuracy of glaucoma diagnosis. The results showed that a majority-voting ensemble of all models gave the best accuracy, approximately 0.94. The sensitivity and specificity were 0.90 and 0.97, respectively. Furthermore, 90% of the fundus images from RVN were classified correctly. The results indicate that machine learning is effective for improving the diagnostic accuracy of glaucoma. They also underscore the importance of strategic storage and preparation of medical data for developing predictive machine learning models in the future.
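The majority voting ensemble and the reported metrics (accuracy, sensitivity, specificity) can be illustrated with a minimal sketch. This is not the thesis's code: the function names `majority_vote` and `binary_metrics` are hypothetical, labels are assumed to be binary (1 = glaucoma, 0 = healthy), and an odd number of models is assumed so that votes cannot tie.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per image by majority vote.

    predictions_per_model: one list of labels per model, all the same length.
    Assumes an odd number of models with binary labels, so no ties occur.
    """
    n_images = len(predictions_per_model[0])
    combined = []
    for i in range(n_images):
        votes = Counter(model_preds[i] for model_preds in predictions_per_model)
        combined.append(votes.most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity (true positive rate) and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against classes that are absent from y_true.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }
```

Given the predictions of three hypothetical models, `binary_metrics(y_true, majority_vote([m1, m2, m3]))` returns the three scores for the ensemble, which can then be compared with each model's individual scores.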
|
22 |
Graph neural networks for prediction of formation energies of crystals / Graf-neuronnät för prediktion av kristallers formationsenergier
Ekström, Filip, January 2020
Predicting the formation energies of crystals is a common but computationally expensive task. This work therefore investigates how a neural network can be used to predict formation energies at lower computational cost than conventional methods. The investigated model shows promising results, reaching a mean absolute error (MAE) below 0.05 eV/atom with fewer than 4000 training data points. The model also transfers well, reaching an MAE below 0.1 eV/atom with fewer than 100 training points when starting from a pre-trained model. A drawback of the model, however, is that it relies on descriptions of the crystal structures that include interatomic distances. Since these are not always accurately known, the work investigates how inaccurate structure descriptions affect the model's performance. The results show that less accurate descriptions reduce the model's accuracy. They can nevertheless be used to reduce the search space when constructing phase diagrams: the proposed workflow, which combines conventional density functional theory with machine learning, reduces the time needed to create a ternary phase diagram by more than 50% compared with using density functional theory alone.
|
23 |
Interpretation of Swedish Sign Language using Convolutional Neural Networks and Transfer Learning
Halvardsson, Gustaf; Peterson, Johanna, January 2020
The automatic interpretation of signs in a sign language involves image recognition. An appropriate approach for this task is deep learning, in particular convolutional neural networks. This method typically needs large amounts of data to perform well. Transfer learning could be a feasible approach to achieving high accuracy despite a small data set. This thesis tests the hypothesis that transfer learning works well for interpreting the hand alphabet of Swedish Sign Language. The goal of the project is to implement a model that can interpret signs, as well as to build a user-friendly web application for this purpose. The final testing accuracy of the model is 85%. Since this accuracy is comparable to those obtained in other studies, the project's hypothesis is supported. The final network is based on the pre-trained model InceptionV3 with five frozen layers, and the optimization algorithm mini-batch gradient descent with a batch size of 32 and a step-size factor of 1.2. Transfer learning is used, but not to the extent that the network becomes too specialized in the pre-trained model and its data. The network has been shown to be unbiased on diverse testing data sets. Suggestions for future work include integrating dynamic signing data to interpret words and sentences, evaluating the method on another sign language's hand alphabet, and integrating dynamic interpretation in the web application so that several letters or words can be interpreted one after another. In the long run, this research could benefit deaf people who have access to technology and promote good health, quality education, decent work, and reduced inequalities. /
The automatic interpretation of signs in a sign language involves image recognition. A suitable approach for this task is deep learning, and more specifically convolutional neural networks. This method generally needs large amounts of data to perform well. Transfer learning can therefore be a reasonable way to reach high accuracy despite a small amount of data. The thesis evaluates the hypothesis that transfer learning works for interpreting the hand alphabet of Swedish Sign Language. The goal of the project is to implement a model that can interpret signs and to build a user-friendly web application for this purpose. The model classifies 85% of the test instances correctly. Since this accuracy is comparable to that reported in other studies, the project's hypothesis appears to be correct. The final network is based on the pre-trained model InceptionV3 with five frozen layers, and the optimization algorithm mini-batch gradient descent with a batch size of 32 and a step-size factor of 1.2. Transfer learning was used, but not to the extent that the network became too specialized in the pre-trained model and its data. The network has been shown to be unbiased on the diverse test data set. Suggestions for future work include integrating dynamic signing data to interpret words and sentences, evaluating the method on other sign languages' hand alphabets, and integrating dynamic interpretation in the web application so that several letters or words can be interpreted in sequence. In the long run, this study could benefit deaf people who have access to technology, and thereby increase the chances of good health, quality education, decent work, and reduced inequalities.
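The optimizer named in the abstract, mini-batch gradient descent with a batch size of 32, can be illustrated on a toy problem. The sketch below fits a straight line rather than fine-tuning InceptionV3, so it shows only the update rule, not the thesis's training setup. The function name, the step size of 0.2, and the number of epochs are assumptions chosen so the toy problem converges; they are not values from the thesis.

```python
import random


def minibatch_gradient_descent(xs, ys, batch_size=32, step_size=0.2,
                               epochs=500, seed=0):
    """Fit y ~ w * x + b by mini-batch gradient descent on mean squared error.

    Each epoch shuffles the data and performs one parameter update per
    mini-batch, using the gradient averaged over that batch only.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the batch-mean squared error with respect to w and b.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= step_size * grad_w
            b -= step_size * grad_b
    return w, b
```

In a transfer-learning setting the same loop would update only the layers that are not frozen; the frozen layers keep their pre-trained weights.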
|
24 |
Evaluating CNN Architectures on the CSAW-M Dataset / Evaluering av olika CNN Arkitekturer på CSAW-M
Kristoffersson, Ludwig; Zetterman, Noa, January 2022
CSAW-M is a dataset that contains about 10,000 X-ray images created from mammograms. Mammograms are used to identify patients with breast cancer through a screening process, with the goal of catching cancer tumours early. Modern convolutional neural networks are very sophisticated and capable of identifying patterns nearly indistinguishable to humans. CSAW-M does not contain images of active cancer tumours; rather, it records whether the patient will develop cancer or not. Classification tasks such as this are known to require large datasets for training, which are cumbersome to acquire in the biomedical domain. In this paper we investigate how classification performance on non-trivial classification tasks scales with the number of available annotated images. To research this, a wide range of datasets is generated from CSAW-M, with varying sample size and cancer types. Three different convolutional neural networks were trained on all datasets. The study showed that classification performance does increase with the size of the annotated dataset. All three networks generally improved their prediction on the supplied benchmarking dataset. However, the improvements were very small and the research question could not be conclusively answered. The primary reasons for this were the challenging nature of the classification task and the size of the dataset. Further research is required to gain more understanding of how much data is needed to yield a usable model. /
CSAW-M is a dataset that contains approximately 10,000 X-ray images created from a large number of mammograms. Mammography is used to identify patients with breast cancer through a screening process, with the goal of catching cancer cases early. Modern convolutional neural networks are very sophisticated and can be trained to identify patterns in images much better than humans. CSAW-M contains no images of cancer tumours, but instead data on whether the patient will develop cancer or not. Classification tasks such as this are known to require large amounts of training data, which are difficult to acquire in the biomedical domain. In this paper we investigate how classification performance on difficult classification tasks scales with the amount of available annotated data. To investigate this, a number of new datasets were generated from CSAW-M, with varying sample size and cancer type. Three different convolutional neural networks were trained on all the new datasets. The study shows that classification performance increases with the size of the annotated data. All three networks generally improved their classification performance as larger samples were drawn from CSAW-M. However, the improvements were small and the research question could not be fully answered. The main reasons for this were the challenging nature of the classification task and the amount of data available in CSAW-M. Further research is required to better understand how much data is needed to create a usable model.
|
25 |
Fuzzy transfer learning
Shell, Jethro, January 2013
The use of machine learning to predict output from data, using a model, is a well-studied area. There are, however, a number of real-world applications that require a model to be produced but have little or no data available on the specific environment. These situations are prominent in Intelligent Environments (IEs). The sparsity of the data can be a result of the physical nature of the implementation, such as sensors placed into disaster recovery scenarios, or of data acquisition that focuses on very narrowly defined user groups, as in the case of disabled individuals. Standard machine learning approaches require training data to come from the same domain. The restrictions of the physical nature of these environments can severely reduce data acquisition, making it extremely costly or, in certain situations, impossible. This impedes the ability of these approaches to model the environments. It is on this problem, in the area of IEs, that this thesis focuses. To address complex and uncertain environments, humans have learnt to use previously acquired information to reason about and understand their surroundings. Knowledge from different but related domains can be used to aid the ability to learn. For example, the ability to ride a road bicycle can help when acquiring the more sophisticated skills of mountain biking. This humanistic approach to learning can be used to tackle real-world problems where a priori labelled training data is either difficult or impossible to obtain. The transferral of knowledge from a related but differing context can allow for the reuse and repurposing of known information. In this thesis, a novel composition of methods is brought together that is broadly based on a humanist approach to learning. Two concepts, Transfer Learning (TL) and Fuzzy Logic (FL), are combined in a framework, Fuzzy Transfer Learning (FuzzyTL), to address the problem of learning tasks that have no prior direct contextual knowledge.
Through the use of an FL-based learning method, the uncertainty that is evident in dynamic environments is represented. By combining labelled data from a contextually related source task with little or no unlabelled data from a target task, the framework is shown to be able to accomplish predictive tasks using models learned from contextually different data. The framework incorporates an additional novel five-stage online adaptation process. By adapting the underlying fuzzy structure through the use of previous labelled knowledge and new unlabelled information, an increase in predictive performance is shown. The framework outlined is applied to two differing real-world IEs to demonstrate its ability to predict in uncertain and dynamic environments. Through a series of experiments, it is shown that the framework is capable of predicting output using differing contextual data.
|
26 |
Transfer learning for object category detection
Aytar, Yusuf, January 2014
Object category detection, the task of determining if one or more instances of a category are present in an image with their corresponding locations, is one of the fundamental problems of computer vision. The task is very challenging because of the large variations in imaged object appearance, particularly due to the changes in viewpoint, illumination and intra-class variance. Although successful solutions exist for learning object category detectors, they require massive amounts of training data. Transfer learning builds upon previously acquired knowledge and thus reduces training requirements. The objective of this work is to develop and apply novel transfer learning techniques specific to the object category detection problem. This thesis proposes methods which not only address the challenges of performing transfer learning for object category detection such as finding relevant sources for transfer, handling aspect ratio mismatches and considering the geometric relations between the features; but also enable large scale object category detection by quickly learning from considerably fewer training samples and immediate evaluation of models on web scale data with the help of part-based indexing. Several novel transfer models are introduced such as: (a) rigid transfer for transferring knowledge between similar classes, (b) deformable transfer which tolerates small structural changes by deforming the source detector while performing the transfer, and (c) part level transfer particularly for the cases where full template transfer is not possible due to aspect ratio mismatches or not having adequately similar sources. Building upon the idea of using part-level transfer, instead of performing an exhaustive sliding window search, part-based indexing is proposed for efficient evaluation of templates enabling us to obtain immediate detection results in large scale image collections. 
Furthermore, easier and more robust optimization methods are developed with the help of feature maps defined between proposed transfer learning formulations and the “classical” SVM formulation.
|
27 |
Bayesian Learning with Dependency Structures via Latent Factors, Mixtures, and Copulas
Han, Shaobo, January 2016
Bayesian methods offer a flexible and convenient probabilistic learning framework to extract interpretable knowledge from complex and structured data. Such methods can characterize dependencies among multiple levels of hidden variables and share statistical strength across heterogeneous sources. In the first part of this dissertation, we develop two dependent variational inference methods for full posterior approximation in non-conjugate Bayesian models through hierarchical mixture- and copula-based variational proposals, respectively. The proposed methods move beyond the widely used factorized approximation to the posterior and provide generic applicability to a broad class of probabilistic models with minimal model-specific derivations. In the second part of this dissertation, we design probabilistic graphical models to accommodate multimodal data, describe dynamical behaviors and account for task heterogeneity. In particular, the sparse latent factor model is able to reveal common low-dimensional structures from high-dimensional data. We demonstrate the effectiveness of the proposed statistical learning methods on both synthetic and real-world data. / Dissertation
|
28 |
Image enhancement effect on the performance of convolutional neural networks
Chen, Xiaoran, January 2019
Context. Image enhancement algorithms can be used to enhance the visual effect of images for human vision. Can image enhancement algorithms also be used in computer vision? The convolutional neural network (CNN), currently the most powerful image classifier, performs excellently in image recognition. This paper explores whether image enhancement algorithms can be used to improve the performance of convolutional neural networks. Objectives. The purpose of this paper is to explore the effect of image enhancement algorithms on the performance of CNN models in deep learning and transfer learning, respectively. Five image enhancement algorithms were selected: contrast limited adaptive histogram equalization (CLAHE), the successive mean quantization transform (SMQT), adaptive gamma correction, the wavelet transform, and the Laplace operator. Methods. This paper uses experiments as its research method. Three groups of experiments are designed. They explore, respectively, whether enhancing grayscale images can improve the performance of a CNN in deep learning, whether enhancing color images can improve the performance of a CNN in deep learning, and whether enhancing RGB images can improve the performance of a CNN in transfer learning. Results. In deep learning, when training a complete CNN model, using the Laplace operator to enhance grayscale images can improve the recall of the CNN. However, the remaining image enhancement algorithms do not improve the performance of the CNN on either the grayscale or the color image datasets. In addition, in transfer learning, when fine-tuning the pre-trained CNN model, using CLAHE, SMQT, the wavelet transform, and the Laplace operator reduces the performance of the CNN.
Conclusions. Experiments show that in deep learning, image enhancement algorithms may improve CNN performance when training complete CNN models, but not all image enhancement algorithms do so; in transfer learning, when fine-tuning the pre-trained CNN model, image enhancement algorithms may reduce the performance of the CNN.
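Of the five enhancement algorithms listed, the Laplace operator is the simplest to show in a few lines. The sketch below is one common formulation, not necessarily the one used in the thesis: it assumes an 8-bit grayscale image stored as a list of rows, the 4-neighbour kernel, replicated borders, and sharpening by subtracting the Laplacian from the image. The names `laplacian` and `sharpen` are hypothetical.

```python
# 4-neighbour discrete Laplace kernel (an assumption; 8-neighbour variants exist).
LAPLACIAN_KERNEL = [[0, 1, 0],
                    [1, -4, 1],
                    [0, 1, 0]]


def laplacian(image):
    """Apply the Laplace kernel to a grayscale image given as a list of rows.

    Pixels outside the image are taken from the nearest border pixel.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    acc += LAPLACIAN_KERNEL[dy + 1][dx + 1] * image[yy][xx]
            out[y][x] = acc
    return out


def sharpen(image):
    """Enhance edges by subtracting the Laplacian, then clip to the 0..255 range."""
    lap = laplacian(image)
    return [[min(255, max(0, image[y][x] - lap[y][x]))
             for x in range(len(image[0]))]
            for y in range(len(image))]
```

A uniform image is left unchanged by `sharpen`, while an isolated bright pixel is pushed further from its neighbours, which is the edge-emphasizing effect that such preprocessing contributes before the images are passed to a CNN.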
|
29 |
Transfer Learning for Image Classification / Transfert de connaissances pour la classification des images
Lu, Ying, 09 November 2017
When learning a classification model for a new target domain with only a small number of training samples, applying machine learning algorithms generally yields over-fitted classifiers with poor generalization ability. Moreover, collecting a sufficient number of manually labeled training samples can be very costly. Transfer learning methods aim to solve this type of problem by transferring knowledge from a related source domain that contains much more data in order to ease classification in the target domain. Depending on the assumptions made about the target and source domains, transfer learning can be divided into three categories: inductive transfer learning, transductive transfer learning (domain adaptation), and unsupervised transfer learning. We concentrate on the first, which assumes that the target task and the source task are different but related. More precisely, we assume that both the target task and the source task are classification tasks, while the target categories and the source categories are different but related. We propose two different methods for this problem. In the first work, we propose a new discriminative transfer learning method, DTL (Discriminative Transfer Learning), which combines a series of hypotheses made both by the model learned with the target samples and by additional models learned with samples from the source categories. More precisely, we use the sparse reconstruction residual as the basic discriminant and improve its discriminative power by comparing two residuals, one from a positive dictionary and one from a negative dictionary.
On this basis, we exploit similarities and dissimilarities by choosing positively correlated and negatively correlated source categories to form additional dictionaries. A new cost function based on the Wilcoxon-Mann-Whitney statistic is proposed for choosing the additional dictionaries with unbalanced data. In addition, two parallel boosting processes are applied to the positive and negative data distributions to further improve classifier performance. On two different image classification databases, the proposed DTL consistently outperforms other state-of-the-art transfer learning methods while maintaining a very efficient runtime. In the second work, we combine the power of optimal transport (OT) and deep neural networks (DNNs) to solve the inductive transfer learning (ITL) problem. More precisely, we propose a new method for jointly fine-tuning a neural network with source data and target data. By adding an optimal transport loss (OT loss) between the predictions of the source and target classifiers as a constraint on the source classifier, the proposed JTLN (Joint Transfer Learning Network) can effectively learn knowledge useful for target classification from the source data. Furthermore, by using different metrics as the cost matrix for the OT loss, JTLN can incorporate different prior knowledge about the relationship between the target categories and the source categories. We carried out experiments with an AlexNet-based JTLN on image classification datasets, and the results verify the effectiveness of the proposed JTLN.
To our knowledge, the proposed JTLN is the first work to address ITL with deep neural networks while incorporating prior knowledge about the relationship between the target and source categories. / When learning a classification model for a new target domain with only a small number of training samples, brute-force application of machine learning algorithms generally leads to over-fitted classifiers with poor generalization skills. On the other hand, collecting a sufficient number of manually labeled training samples may prove very expensive. Transfer learning methods aim to solve this kind of problem by transferring knowledge from a related source domain, which has much more data, to help classification in the target domain. Depending on the assumptions made about the target domain and the source domain, transfer learning can be divided into three categories: Inductive Transfer Learning (ITL), Transductive Transfer Learning (Domain Adaptation) and Unsupervised Transfer Learning. We focus on the first one, which assumes that the target task and source task are different but related. More specifically, we assume that both the target task and the source task are classification tasks, while the target categories and source categories are different but related. We propose two different methods to approach this ITL problem. In the first work we propose a new discriminative transfer learning method, namely DTL, combining a series of hypotheses made by both the model learned with target training samples and the additional models learned with source category samples. Specifically, we use the sparse reconstruction residual as a basic discriminant, and enhance its discriminative power by comparing two residuals from a positive and a negative dictionary. On this basis, we make use of similarities and dissimilarities by choosing both positively correlated and negatively correlated source categories to form additional dictionaries.
A new cost function based on the Wilcoxon-Mann-Whitney statistic is proposed to choose the additional dictionaries with unbalanced training data. Also, two parallel boosting processes are applied to both the positive and negative data distributions to further improve classifier performance. On two different image classification databases, the proposed DTL consistently outperforms other state-of-the-art transfer learning methods, while at the same time maintaining a very efficient runtime. In the second work we combine the power of Optimal Transport and Deep Neural Networks to tackle the ITL problem. Specifically, we propose a novel method to jointly fine-tune a Deep Neural Network with source data and target data. By adding an Optimal Transport loss (OT loss) between source and target classifier predictions as a constraint on the source classifier, the proposed Joint Transfer Learning Network (JTLN) can effectively learn useful knowledge for target classification from source data. Furthermore, by using different kinds of metrics as the cost matrix for the OT loss, JTLN can incorporate different prior knowledge about the relatedness between target categories and source categories. We carried out experiments with JTLN based on AlexNet on image classification datasets, and the results verify the effectiveness of the proposed JTLN in comparison with standard consecutive fine-tuning. To the best of our knowledge, the proposed JTLN is the first work to tackle ITL with Deep Neural Networks while incorporating prior knowledge on relatedness between target and source categories. This Joint Transfer Learning with OT loss is general and can also be applied to other kinds of neural networks.
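The abstract does not give the DTL cost function itself, but the Wilcoxon-Mann-Whitney statistic it builds on is easy to state. In its normalized form it is the fraction of (positive, negative) sample pairs that a scoring function ranks in the right order, which equals the area under the ROC curve and does not depend on how unbalanced the two classes are. The sketch below computes only this statistic; the function name and the convention of counting ties as one half are assumptions, and it is not the thesis's cost function.

```python
def wmw_statistic(positive_scores, negative_scores):
    """Normalized Wilcoxon-Mann-Whitney statistic for two lists of scores.

    Returns the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; tied pairs count as one half.
    """
    total = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(positive_scores) * len(negative_scores))
```

A value of 1.0 means every positive sample outscores every negative one, and 0.5 corresponds to a random ranking. Because the statistic is a ratio over pairs, adding more negative samples with the same score distribution leaves it essentially unchanged, which is why it suits unbalanced training data.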
|
30 |
A deep learning model for scene recognition
Meng, Zhaoxin, January 2019
Scene recognition is an active research topic in the field of image recognition. Research on scene recognition is important because it supports scene understanding and can provide important contextual information for object recognition. Traditional approaches to scene recognition still have many shortcomings. In recent years, deep learning methods based on convolutional neural networks (CNNs) have achieved state-of-the-art results in this area. This thesis constructs a model for scene recognition tasks based on multi-layer CNN feature extraction and transfer learning. Because scene images often contain multiple objects, the convolutional layers of the network may hold useful local semantic information that is lost in the fully connected layers. Therefore, this thesis modified the traditional CNN architecture, adopted an existing improvement that enhances the convolutional layer information, and extracted that information using Fisher Vectors. The thesis then introduced the idea of transfer learning, drawing on knowledge from two different fields, scenes and objects. We combined the outputs of these two networks to achieve better results. Finally, the method was implemented using Python and PyTorch and applied to two well-known scene datasets, UIUC-Sports and Scene-15. Compared with the traditional AlexNet CNN architecture, we improve the result from 81% to 93% on UIUC-Sports and from 79% to 91% on Scene-15. This shows that our method performs well on scene recognition tasks.
|