161 |
Reinforcement Learning for Control of a Multi-Input, Multi-Output Model of the Human ArmCrowder, Douglas Cale 01 September 2021 (has links)
No description available.
|
162 |
Feature Fusion Deep Learning Method for Video and Audio Based Emotion RecognitionYanan Song (11825003) 20 December 2021 (has links)
In this thesis, we propose a deep learning based emotion recognition system to improve the classification rate. We first use transfer learning to extract visual features and Mel-frequency cepstral coefficients (MFCC) to extract audio features, and then apply recurrent neural networks (RNN) with an attention mechanism to process the sequential inputs. After that, the outputs of both channels are fused in a concatenation layer, which is processed using batch normalization to reduce internal covariate shift. Finally, the classification result is obtained from the softmax layer. In our experiments, the video and audio subsystems achieve 78% and 77% accuracy respectively, and the feature fusion system with video and audio achieves 92% accuracy on the RAVDESS dataset for eight emotion classes. Our proposed feature fusion system outperforms conventional methods in classification accuracy.
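The fusion pipeline described above, concatenating the visual and audio feature vectors, normalizing the result, and classifying with a softmax layer, can be sketched in NumPy. This is an illustrative sketch only: the feature dimensions, weights, and batch size below are placeholders, not values from the thesis.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature across the batch to zero mean and unit
    # variance, reducing internal covariate shift as in the abstract.
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fuse_and_classify(video_feats, audio_feats, weights, bias):
    # Late fusion: concatenate the two modality features, normalize,
    # then classify with a linear layer followed by softmax.
    fused = np.concatenate([video_feats, audio_feats], axis=1)
    fused = batch_norm(fused)
    return softmax(fused @ weights + bias)

rng = np.random.default_rng(0)
video = rng.normal(size=(4, 16))   # 4 samples, 16-dim visual features (placeholder)
audio = rng.normal(size=(4, 8))    # 4 samples, 8-dim MFCC-style features (placeholder)
W = rng.normal(size=(24, 8))       # 8 output classes, as in RAVDESS's eight emotions
b = np.zeros(8)
probs = fuse_and_classify(video, audio, W, b)
print(probs.shape)        # (4, 8)
print(probs.sum(axis=1))  # each row sums to 1
```

In a real system the visual features would come from a transfer-learned CNN and the classifier would be trained end to end; here random values stand in for both.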
|
163 |
A Transfer Learning Approach to Object Detection Acceleration for Embedded ApplicationsLauren M Vance (10986807) 05 August 2021 (has links)
<p>Deep learning solutions to computer vision tasks
have revolutionized many industries in recent years, but embedded systems have
too many restrictions to take advantage of current state-of-the-art configurations.
Typical embedded processor hardware configurations must meet very low power and
memory constraints to maintain small and lightweight packaging, and the
architectures of the current best deep learning models are too computationally
intensive for these hardware configurations. Current research shows that
convolutional neural networks (CNNs) can be deployed with a few architectural
modifications on Field-Programmable Gate Arrays (FPGAs) resulting in minimal
loss of accuracy, similar or decreased processing speeds, and lower power
consumption when compared to general-purpose Central Processing Units (CPUs)
and Graphics Processing Units (GPUs). This research contributes further to
these findings with the FPGA implementation of a YOLOv4 object detection model
that was developed with the use of transfer learning. The transfer-learned
model uses the weights of a model pre-trained on the MS-COCO dataset as a
starting point then fine-tunes only the output layers for detection on more
specific objects of five classes. The model architecture was then modified slightly
for compatibility with the FPGA hardware using techniques such as weight
quantization and replacing unsupported activation layer types. The model was deployed
on three different hardware setups (CPU, GPU, FPGA) for inference on a test set
of images. It was found that the FPGA achieved real-time inference speeds of 33.77 frames per second, a speedup of 7.74 frames per second over GPU deployment. The model also consumed 96% less power than the GPU configuration, with only an approximately 4% average loss in accuracy across all 5 classes. The results are even more striking when compared to CPU deployment, with a 131.7-times speedup in inference throughput. CPUs have long since been
outperformed by GPUs for deep learning applications but are used in most
embedded systems. These results further illustrate the advantages of FPGAs for
deep learning inference on embedded systems even when transfer learning is used
for an efficient end-to-end deployment process. This work advances current
state-of-the-art with the implementation of a YOLOv4 object detection model developed
with transfer learning for FPGA deployment.</p>
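Weight quantization, one of the architectural modifications mentioned above for FPGA compatibility, can be illustrated with a minimal 8-bit affine quantizer in NumPy. This is a generic sketch under assumed conventions (per-tensor scale and zero point, int8 range); the thesis's actual quantization scheme and bit widths are not reproduced here.

```python
import numpy as np

def quantize_int8(w):
    # Affine (scale/zero-point) quantization of float weights to int8,
    # a common step when preparing a model for FPGA deployment.
    scale = (w.max() - w.min()) / 255.0
    zero_point = np.round(-w.min() / scale) - 128
    q = np.clip(np.round(w / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # Map the integer codes back to approximate float values.
    return (q.astype(np.float32) - zero_point) * scale

w = np.linspace(-1.0, 1.0, 9, dtype=np.float32)  # toy weight tensor
q, s, zp = quantize_int8(w)
w_hat = dequantize(q, s, zp)
print(np.abs(w - w_hat).max() < s)  # reconstruction error stays below one step
```

The small, bounded reconstruction error is what makes the "minimal loss of accuracy" reported above plausible: each weight moves by at most one quantization step.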
|
164 |
Inspecting product quality with computer vision techniques : Comparing traditional image processing methods with deep learning methods on small datasets in finding surface defectsHult, Jim, Pihl, Pontus January 2021 (has links)
Quality control is an important part of any production line. It can be done manually but is most efficient if automated. Inspecting quality can include many different processes, but this thesis is focused on the visual inspection for cracks and scratches. The best way of doing this at the time of writing is with the help of Artificial Intelligence (AI), more specifically Deep Learning (DL). However, these methods need a training dataset to train on, and for some smaller companies this might not be an option. This study tries to find an alternative visual inspection method that does not rely on a trained deep learning model, for when training data is severely limited. Our method is to use edge detection algorithms in combination with a template to find any edge that does not belong. These include scratches, cracks, or misaligned stickers. These anomalies are then highlighted in the original picture to show where the defect is. Since deep learning is the state of the art in visual inspection, it is expected to outperform template matching when sufficiently trained. To find where this occurs, the accuracy of template matching is compared to the accuracy of a deep learning model at different training levels. The deep learning model was trained on image-augmented datasets of size 6, 12, 24, 48, 84, 126, 180, 210, 315, and 423. Both template matching and the deep learning model were tested on the same balanced dataset of size 216. Half of the dataset was images of scratched units, and the other half was of unscratched units. This gave a baseline of 50%, where anything under would be worse than just guessing. Template matching achieved an accuracy of 88%, and the deep learning model's accuracy rose from 51% to 100% as the training set increased. This means template matching has better accuracy than a model trained on a dataset of 84 images or smaller, but a deep learning model trained on 126 images starts to outperform template matching.
Template matching did perform well where no data was available and training a deep learning model is not an option. But unlike a deep learning model, template matching would not need retraining to find other kinds of surface defects. Template matching could also be used to find, for example, misplaced stickers: due to the use of a template, any edge that does not match is detected. The ways to train a deep learning model are highly customizable to the user's needs; due to resource and knowledge restrictions, a deep dive into this subject was not conducted. For template matching, only Canny edge detection was used when measuring accuracy. Other edge detection methods, such as Sobel and Prewitt, were ruled out earlier in this study.
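The template-based approach described above, flagging any edge in the inspected image that is absent from the clean template, can be sketched as follows. To keep the sketch dependency-free, a simple finite-difference edge detector stands in for the Canny detector actually used in the thesis, and the images and threshold are invented for illustration.

```python
import numpy as np

def edge_map(img, thresh=0.5):
    # Simple gradient-magnitude edge detector (a stand-in for Canny).
    gx = np.abs(np.diff(img, axis=1, prepend=img[:, :1]))
    gy = np.abs(np.diff(img, axis=0, prepend=img[:1, :]))
    return (gx + gy) > thresh

def defect_mask(test_img, template_img):
    # Any edge present in the test image but absent from the template
    # is flagged as a potential scratch, crack, or misaligned sticker.
    return edge_map(test_img) & ~edge_map(template_img)

template = np.zeros((8, 8))
template[2:6, 2:6] = 1.0   # a clean unit: one square feature
scratched = template.copy()
scratched[7, 1:7] = 1.0    # add a scratch along the bottom row

mask = defect_mask(scratched, template)
print(mask.any())            # True: the scratch is detected
print(not mask[:7].any())    # True: edges shared with the template are ignored
```

The resulting boolean mask is exactly what would be overlaid on the original picture to highlight where the defect is.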
|
165 |
Regularization schemes for transfer learning with convolutional networks / Stratégies de régularisation pour l'apprentissage par transfert des réseaux de neurones à convolutionLi, Xuhong 10 September 2019 (has links)
Transfer learning with deep convolutional neural networks significantly reduces the computation and data overhead of the training process and boosts performance on the target task, compared to training from scratch. However, transfer learning with a deep network may cause the model to forget the knowledge acquired when learning the source task, leading to so-called catastrophic forgetting. Since the efficiency of transfer learning derives from the knowledge acquired on the source task, this knowledge should be preserved during transfer. This thesis addresses this forgetting problem by proposing two regularization schemes that preserve knowledge during transfer. First, we investigate several forms of parameter regularization, all of which explicitly promote the similarity of the final solution to the initial model, based on the L1, L2, and Group-Lasso penalties. We also propose variants that use Fisher information as a metric for measuring the importance of parameters. We validate these parameter regularization approaches on various tasks, including semantic image segmentation and optical flow estimation. The second regularization scheme is based on the theory of optimal transport, which enables estimating the dissimilarity between two distributions. We use optimal transport to penalize deviations of the high-level representations between the source and target tasks, with the same objective of preserving knowledge during transfer learning. With a mild increase in computation time during training, this novel regularization approach improves the performance of the target tasks, and yields higher accuracy on image classification tasks compared to the parameter regularization approaches.
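The core idea of the first regularization scheme, penalizing deviation of the fine-tuned weights from the pre-trained starting point rather than from zero, can be sketched as an L2 penalty in NumPy. This is a generic illustration of the "-SP" (starting point) idea; the L1, Group-Lasso, and Fisher-weighted variants studied in the thesis are not shown.

```python
import numpy as np

def l2_sp_penalty(weights, initial_weights, alpha=0.01):
    # L2-SP regularization: instead of shrinking weights toward zero,
    # penalize deviation from the pre-trained initial model, which
    # preserves source-task knowledge during fine-tuning.
    return alpha * sum(np.sum((w - w0) ** 2)
                       for w, w0 in zip(weights, initial_weights))

rng = np.random.default_rng(0)
w0 = [rng.normal(size=(4, 4)), rng.normal(size=4)]  # pre-trained weights
w_near = [w0[0] + 0.01, w0[1] + 0.01]  # stayed close to the source model
w_far = [w0[0] + 1.0, w0[1] + 1.0]     # drifted far away (forgetting)

print(l2_sp_penalty(w_near, w0) < l2_sp_penalty(w_far, w0))  # True
```

During fine-tuning, this term is simply added to the target-task loss, so gradient descent trades off target accuracy against staying near the source solution.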
|
166 |
On Transfer Learning Techniques for Machine LearningDebasmit Das (8314707) 30 April 2020 (has links)
<pre><pre><p>Recent progress in machine learning has been mainly due to
the availability of large amounts of annotated data used for training complex
models with deep architectures. Annotating this training data becomes
burdensome and creates a major bottleneck in maintaining machine-learning
databases. Moreover, these trained models fail to generalize to new categories
or new varieties of the same categories. This is because new categories or new
varieties have data distribution different from the training data distribution.
To tackle these problems, this thesis proposes to develop a family of
transfer-learning techniques that can deal with different training (source) and
testing (target) distributions with the assumption that the availability of
annotated data is limited in the testing domain. This is done by using the
auxiliary data-abundant source domain from which useful knowledge is
transferred that can be applied to the data-scarce target domain. This transferable
knowledge serves as a prior that biases target-domain predictions and prevents
the target-domain model from overfitting. Specifically, we explore structural
priors that encode relational knowledge between different data entities, which
provides more informative bias than traditional priors. The choice of the
structural prior depends on the information availability and the similarity
between the two domains. Depending on the domain similarity and the information
availability, we divide the transfer learning problem into four major
categories and propose different structural priors to solve each of these
sub-problems.</p><p>This thesis first focuses on the
unsupervised-domain-adaptation problem, where we propose to minimize domain
discrepancy by transforming labeled source-domain data to be close to unlabeled
target-domain data. For this problem,
the categories remain the same across the two domains and hence we assume that
the structural relationship between the source-domain samples is carried over
to the target domain. Thus, a graph or hyper-graph is constructed as the
structural prior from both domains and a graph/hyper-graph matching formulation
is used to transform samples in the source domain to be closer to samples in
the target domain. An efficient optimization scheme is then proposed to tackle
the time and memory inefficiencies associated with the matching problem. The
few-shot learning problem is studied next, where we propose to transfer
knowledge from source-domain categories containing abundantly labeled data to
novel categories in the target domain that contain only a few labeled samples. The
knowledge transfer biases the novel category predictions and prevents the model
from overfitting. The knowledge is encoded using a neural-network-based prior
that transforms a data sample to its corresponding class prototype. This neural
network is trained from the source-domain data and applied to the target-domain
data, where it transforms the few-shot samples to the novel-class prototypes
for better recognition performance. The few-shot learning problem is then
extended to the situation, where we do not have access to the source-domain
data but only have access to the source-domain class prototypes. In this limited
information setting, parametric neural-network-based priors would overfit to
the source-class prototypes and hence we seek a non-parametric-based prior
using manifolds. A piecewise linear manifold is used as a structural prior to
fit the source-domain-class prototypes. This structure is extended to the
target domain, where the novel-class prototypes are found by projecting the
few-shot samples onto the manifold. Finally, the zero-shot learning problem is
addressed, which is an extreme case of the few-shot learning problem where we
do not have any labeled data in the target domain. However, we have high-level
information for both the source and target domain categories in the form of
semantic descriptors. We learn the relation between the sample space and the
semantic space, using a regularized neural network so that classification of
the novel categories can be carried out in a common representation space. This
same neural network is then used in the target domain to relate the two spaces.
In case we want to generate data for the novel categories in the target domain,
we can use a constrained generative adversarial network instead of a
traditional neural network. Thus, we use structural priors like graphs, neural
networks and manifolds to relate various data entities like samples, prototypes
and semantics for these different transfer learning sub-problems. We explore
additional post-processing steps like pseudo-labeling, domain adaptation and
calibration and enforce algorithmic and architectural constraints to further
improve recognition performance. Experimental results on standard transfer
learning image recognition datasets produced competitive results with respect
to previous work. Further experimentation and analyses of these methods
provided a better understanding of machine learning as well.</p></pre></pre>
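The neural-network prior for few-shot learning described above maps samples to class prototypes; its simplest non-parametric cousin, representing each novel class by the mean of its few labeled samples and classifying by nearest prototype, can be sketched in NumPy. The features and classes below are toy placeholders, not data from the thesis.

```python
import numpy as np

def class_prototypes(features, labels):
    # A prototype is the mean feature vector of a class's labeled samples.
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def nearest_prototype(query, prototypes):
    # Classify a query by its closest prototype in feature space.
    return min(prototypes, key=lambda c: np.linalg.norm(query - prototypes[c]))

# Two novel classes with a few labeled ("few-shot") samples each.
feats = np.array([[0.0, 0.1], [0.1, 0.0],    # class 0: cluster near the origin
                  [1.0, 1.1], [0.9, 1.0]])   # class 1: cluster near (1, 1)
labels = np.array([0, 0, 1, 1])

protos = class_prototypes(feats, labels)
print(nearest_prototype(np.array([0.05, 0.05]), protos))  # 0
print(nearest_prototype(np.array([1.0, 0.9]), protos))    # 1
```

The knowledge-transfer step in the thesis replaces the plain mean with a transformation learned on the data-abundant source categories; the nearest-prototype decision rule stays the same.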
|
167 |
Knowledge transfer for image understanding / Transfert de connaissance pour la compréhension des imagesKulkarni, Praveen 23 January 2017 (has links)
Knowledge transfer is a promising solution to the difficult problem of training deep convolutional neural networks (CNNs) using only small training datasets with high intra-class visual variability. In this thesis work, we explore this paradigm to extend the ability of state-of-the-art CNNs for image classification. First, we propose several effective techniques to reduce the training- and test-time computational burden associated with CNNs: (i) using a hybrid method that combines conventional, unsupervised aggregators such as Bag-of-Words (BoW) with CNNs; (ii) introducing novel pooling methods within a CNN framework, along with non-linear part-based models. The key contribution lies in a technique able to discover the useful regions per image involved in the pooling of local representations. In addition, we propose a novel method to learn the structure of weights in deep neural networks. Experiments are run on challenging datasets, with comparisons against state-of-the-art methods. The proposed methods are shown to generalize to different visual recognition tasks, such as object, scene, or action classification.
|
168 |
Implications of Conversational AI on Humanoid RobotsSoudamalla, Sharath Kumar 09 October 2020 (has links)
Humanizing Technologies GmbH develops intelligent software for the humanoid robots from SoftBank Robotics. The main objective of this thesis is to develop and deploy conversational artificial intelligence software on the humanoid robots using deep learning techniques. The development of conversational agents using machine learning or artificial intelligence is an intriguing problem in natural language processing, and considerable research and experimentation is being conducted in this area. Currently, most chatbots are developed with rule-based programming and cannot hold a conversation that replicates real human interaction. This issue is addressed in this thesis with the development of a deep learning conversational AI based on sequence-to-sequence models, attention mechanisms, transfer learning, active learning, and beam search decoding, which emulates human-like conversation. The complete end-to-end conversational AI software is designed, implemented, and deployed in this thesis work according to the conceptual specifications. The research objectives are successfully accomplished, and the results of the proposed concept are discussed in detail.
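Beam search decoding, one of the techniques listed above, can be illustrated with a toy next-token table. The vocabulary and probabilities are invented for the example; a real system would score candidates with the trained sequence-to-sequence model instead of a lookup table.

```python
import math

def beam_search(step_fn, start, beam_width=2, max_len=3):
    # Keep the `beam_width` highest log-probability partial sequences at
    # each step instead of greedily committing to the single best token.
    beams = [([start], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for token, prob in step_fn(seq):
                candidates.append((seq + [token], score + math.log(prob)))
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_width]
    return beams[0][0]

# Toy "model": the next-token distribution depends only on the last token.
table = {
    "<s>": [("hello", 0.6), ("hi", 0.4)],
    "hello": [("there", 0.9), ("world", 0.1)],
    "hi": [("there", 0.5), ("friend", 0.5)],
    "there": [("</s>", 1.0)],
    "friend": [("</s>", 1.0)],
    "world": [("</s>", 1.0)],
}
best = beam_search(lambda seq: table[seq[-1]], "<s>")
print(best)  # ['<s>', 'hello', 'there', '</s>']
```

Scoring whole sequences by summed log-probability is what lets beam search recover fluent replies that a purely greedy decoder can miss.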
|
169 |
Improving Biometric Log Detection with Partitioning and Filtering of the Search SpaceRajabli, Nijat January 2021 (has links)
Tracking tree logs from a harvesting site to the processing site is a legal requirement for timber-based industries, for social and economic reasons. Biometric tree log detection systems use images of the tree logs to track them, by checking whether a given log image matches any of the logs registered in the system. However, as the number of registered tree logs in the database increases, the number of pairwise comparisons, and consequently the search time, increases proportionally. A growing search space degrades the accuracy and response time of matching queries and slows down the tracking process, costing time and resources. This work introduces database filtering and partitioning approaches based on discriminative log-end features to reduce the search space of biometric log identification algorithms. In this study, 252 unique log images are used to train and test models for extracting features from the log images and to filter and cluster a database of logs. Experiments are carried out to show the end-to-end accuracy and speed-up impact of the individual approaches as well as combinations thereof. The findings of this study indicate that the proposed approaches are suited to speeding up tree log identification systems, and they highlight further opportunities in this field.
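The partitioning idea, clustering the database by log-end features so that a query is matched only against logs in its own partition, can be sketched in NumPy. Everything below is synthetic: the centroids, feature dimensions, and cluster layout are invented, whereas the thesis derives features from trained models on real log images.

```python
import numpy as np

def assign_partitions(db_feats, centroids):
    # Partition the database by nearest centroid so that a query is only
    # compared against logs in its own partition.
    d = np.linalg.norm(db_feats[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def identify(query, db_feats, partitions, centroids):
    # 1) Filter: pick the query's partition. 2) Match only inside it.
    part = np.linalg.norm(centroids - query, axis=1).argmin()
    idx = np.where(partitions == part)[0]
    best = idx[np.linalg.norm(db_feats[idx] - query, axis=1).argmin()]
    return best, len(idx)  # matched id and number of comparisons made

rng = np.random.default_rng(1)
db = np.vstack([rng.normal(0, 0.1, size=(50, 4)),    # cluster A of log features
                rng.normal(5, 0.1, size=(50, 4))])   # cluster B of log features
centroids = np.array([[0.0] * 4, [5.0] * 4])
parts = assign_partitions(db, centroids)

query = db[60] + 0.01  # a slightly perturbed image of a known log
match, n_compared = identify(query, db, parts, centroids)
print(match, n_compared)  # recovers log 60 with half the comparisons
```

Halving the comparisons here is the toy analogue of the speed-up the thesis measures end to end; with more partitions the saving grows accordingly.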
|
170 |
Sentiment Analysis of Financial News with Supervised LearningSyeda, Farha Shazmeen January 2020 (has links)
Financial data in banks are unstructured and complicated, and it is challenging to analyze these texts manually due to the small amount of labeled training data in the financial domain. Moreover, financial text uses language from the economic domain, where a general-purpose model is not efficient. In this thesis, data were collected from MFN (Modular Finance) financial news; these data were scraped and persisted in a database, and price indices were collected from a Bloomberg terminal. A comprehensive study and tests are conducted to find state-of-the-art results for classifying the sentiments, using traditional classifiers like Naive Bayes and transfer learning models like BERT and FinBERT. FinBERT outperforms the Naive Bayes and BERT classifiers. Time-series indices for the sentiments are built, and their correlations with the price indices are calculated using Pearson correlation. The Augmented Dickey-Fuller (ADF) test is used to check whether both time series are stationary. Finally, the Granger causality statistical hypothesis test determines whether the sentiment time series helps predict price. The results show that there is a significant correlation and a causal relation between sentiments and price.
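The correlation step above, comparing a sentiment index with a price index, including the lagged relationship that motivates a Granger causality test, can be sketched in NumPy. The series below are synthetic, constructed so that price follows sentiment with a one-step lag; the thesis uses real MFN news sentiment and Bloomberg price indices.

```python
import numpy as np

def pearson_r(x, y):
    # Pearson correlation coefficient between two equal-length series.
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))

# Toy series: price follows sentiment with a one-day lag plus small noise.
rng = np.random.default_rng(0)
sentiment = rng.normal(size=101)
price = 0.8 * sentiment[:-1] + 0.1 * rng.normal(size=100)

r_lagged = pearson_r(sentiment[:-1], price)  # yesterday's sentiment vs price
r_same = pearson_r(sentiment[1:], price)     # same-day sentiment vs price
print(r_lagged > 0.9)     # True: strong lagged correlation
print(abs(r_same) < 0.5)  # True: weak contemporaneous correlation
```

A strong lagged correlation alongside a weak same-day one is exactly the pattern a Granger causality test formalizes: past sentiment values improve the prediction of price.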
|