  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Towards Data-efficient Graph Learning

Zhang, Qiannan 05 1900 (has links)
Graphs are commonly employed to model complex data and discover latent patterns and relationships between entities in the real world. Canonical graph learning models have achieved remarkable progress in modeling and inference on graph-structured data that consists of nodes connected by edges. Generally, they leverage abundant labeled data for model training and thus inevitably suffer from the label scarcity issue due to the expense and hardship of data annotation in practice. Data-efficient graph learning attempts to address the prevailing data scarcity issue in graph mining problems. Its key idea is to transfer knowledge from related resources to obtain models that generalize well to target graph-related tasks with only a few annotations. However, the generalization of the models to data-scarce scenarios faces challenges including 1) dealing with graph structure and structural heterogeneity to extract transferable knowledge; 2) selecting beneficial and fine-grained knowledge for effective transfer; 3) addressing the divergence across different resources to promote knowledge transfer. Motivated by the aforementioned challenges, the dissertation mainly focuses on three perspectives, i.e., knowledge extraction with graph heterogeneity, knowledge selection, and knowledge transfer. The proposed models are applied to various node classification and graph classification tasks in low-data regimes, evaluated on a variety of datasets, and have shown their effectiveness compared with the state-of-the-art baselines.
2

Learning with Limited Labeled Data: Techniques and Applications

Lei, Shuo 11 October 2023 (has links)
Recent advances in large neural network-style models have demonstrated great performance in various applications, such as image generation, question answering, and audio classification. However, these deep and high-capacity models require a large amount of labeled data to function properly, rendering them inapplicable in many real-world scenarios. This dissertation focuses on the development and evaluation of advanced machine learning algorithms to solve the following research questions: (1) How to learn novel classes with limited labeled data, (2) How to adapt a large pre-trained model to the target domain if only unlabeled data is available, (3) How to boost the performance of the few-shot learning model with unlabeled data, and (4) How to utilize limited labeled data to learn new classes without the training data in the same domain. First, we study few-shot learning in text classification tasks. Meta-learning is becoming a popular approach for addressing few-shot text classification and has achieved state-of-the-art performance. However, the performance of existing approaches heavily depends on the interclass variance of the support set. To address this problem, we propose a TART network for few-shot text classification. The model enhances the generalization by transforming the class prototypes to per-class fixed reference points in task-adaptive metric spaces. In addition, we design a novel discriminative reference regularization to maximize divergence between transformed prototypes in task-adaptive metric spaces to improve performance further. For the second problem, we focus on self-learning in cross-lingual transfer tasks. Our goal here is to develop a framework that lets the pre-trained cross-lingual model continue learning from a large amount of unlabeled data. Existing self-learning methods in cross-lingual transfer tasks suffer from the large number of incorrectly pseudo-labeled samples used in the training phase. 
We first design an uncertainty-aware cross-lingual transfer framework with pseudo-partial-labels. We also propose a novel pseudo-partial-label estimation method that considers prediction confidences and the limitation to the number of candidate classes. Next, to boost the performance of the few-shot learning model with unlabeled data, we propose a semi-supervised approach for the few-shot semantic segmentation task. Existing solutions for few-shot semantic segmentation cannot easily be applied to utilize image-level weak annotations. We propose a class-prototype augmentation method to enrich the prototype representation by utilizing a few image-level annotations, achieving superior performance in one-/multi-way and weak annotation settings. We also design a robust strategy with soft-masked average pooling to handle the noise in image-level annotations, which considers the prediction uncertainty and employs a task-specific threshold to mask the distraction. Finally, we study cross-domain few-shot learning in the semantic segmentation task. Most existing few-shot segmentation methods consider a setting where base classes are drawn from the same domain as the new classes. Nevertheless, gathering enough training data for meta-learning is either unattainable or impractical in many applications. We extend few-shot semantic segmentation to a new task, called Cross-Domain Few-Shot Semantic Segmentation (CD-FSS), which aims to generalize the meta-knowledge from domains with sufficient training labels to low-resource domains. Then, we establish a new benchmark for the CD-FSS task and evaluate both representative few-shot segmentation methods and transfer-learning-based methods on the proposed benchmark. We then propose a novel Pyramid-Anchor-Transformation-based few-shot segmentation network (PATNet), in which domain-specific features are transformed into domain-agnostic ones so that downstream segmentation modules can adapt quickly to unseen domains. 
/ Doctor of Philosophy / Nowadays, deep learning techniques play a crucial role in our everyday existence. In addition, they are crucial to the success of many e-commerce and local businesses for enhancing data analytics and decision-making. Notable applications include intelligent transportation, intelligent healthcare, the generation of natural language, and intrusion detection, among others. To achieve reasonable performance on a new task, these deep and high-capacity models require thousands of labeled examples, which increases the data collection effort and computation costs associated with training a model. Moreover, in many disciplines, it might be difficult or even impossible to obtain data due to concerns such as privacy and safety. This dissertation focuses on learning with limited labeled data in natural language processing and computer vision tasks. To recognize novel classes with a few examples in text classification tasks, we develop a deep learning-based model that can capture both cross-task transferable knowledge and task-specific features. We also build an uncertainty-aware self-learning framework and a semi-supervised few-shot learning method, which allow us to boost the pre-trained model with easily accessible unlabeled data. In addition, we propose a cross-domain few-shot semantic segmentation method to generalize the model to different domains with a few examples. By handling these unique challenges in learning with limited labeled data and developing suitable approaches, we hope to improve the efficiency and generalization of deep learning methods in the real world.
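The soft-masked average pooling mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption: the function name `soft_masked_average_pooling`, the flat list-of-locations layout, and the rule that mask values below the threshold are zeroed before averaging are all hypothetical choices made for illustration.

```python
def soft_masked_average_pooling(features, soft_mask, threshold=0.5):
    """Pool per-location feature vectors into one class prototype.

    features:  list of feature vectors, one per spatial location
    soft_mask: list of foreground probabilities in [0, 1], one per location
    threshold: assumed cut-off; locations below it are ignored as distraction
    """
    # Keep the soft probability as a weight, but drop low-confidence locations.
    weights = [m if m >= threshold else 0.0 for m in soft_mask]
    total = sum(weights)
    if total == 0:
        raise ValueError("no location passes the threshold")
    dim = len(features[0])
    # Weighted mean of the feature vectors, one dimension at a time.
    return [
        sum(w * f[d] for w, f in zip(weights, features)) / total
        for d in range(dim)
    ]


# Toy example: the third location is a noisy outlier with low mask confidence.
prototype = soft_masked_average_pooling(
    [[1.0, 0.0], [3.0, 2.0], [100.0, 100.0]],
    [1.0, 1.0, 0.1],
)
```

In this toy input the low-confidence outlier is excluded, so the prototype is the mean of the first two feature vectors.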
3

Evaluating Transcription of Ciphers with Few-Shot Learning

Milioni, Nikolina January 2022 (has links)
Ciphers are encrypted documents created to hide their content from those who were not the receivers of the message. Different types of symbols, such as zodiac signs, alchemical symbols, alphabet letters or digits are exploited to compose the encrypted text which needs to be decrypted to gain access to the content of the documents. The first step before decryption is the transcription of the cipher. The purpose of this thesis is to evaluate an automatic transcription tool from image to a text format to provide a transcription of the cipher images. We implement a supervised few-shot deep-learning model which is tested on different types of encrypted documents and use various evaluation metrics to assess the results. We show that the few-shot model presents promising results on seen data with Symbol Error Rates (SER) ranging from 8.21% to 47.55% and accuracy scores from 80.13% to 90.27%, whereas SER in out-of-domain datasets reaches 79.91%. While a wide range of symbols are correctly transcribed, the erroneous symbols mainly contain diacritics or are punctuation marks.
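The abstract above reports Symbol Error Rate (SER) without defining it. A minimal sketch, assuming the common definition (edit distance between the reference and predicted symbol sequences, divided by the reference length), is shown below; the function names are hypothetical and the thesis may compute SER differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def symbol_error_rate(ref, hyp):
    """Assumed SER: edit distance normalized by the reference length."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# One substituted symbol out of four reference symbols.
ser = symbol_error_rate(["a", "b", "c", "d"], ["a", "b", "x", "d"])
```

Under this definition SER can exceed 100% when the prediction contains many spurious insertions, which is one reason it is often reported alongside accuracy.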
4

The "What"-"Where" Network: A Tool for One-Shot Image Recognition and Localization

Hurlburt, Daniel 06 January 2021 (has links)
One common shortcoming of modern computer vision is the inability of most models to generalize to new classes from one or a few examples (one-/few-shot image recognition). We propose a new problem formulation for this task and present a network architecture and training methodology to solve it. Further, we provide insights into how careful attention not just to the data, but also to the way the data is presented to the model, can have a significant impact on performance. Using these methods, we achieve high accuracy in few-shot image recognition tasks.
5

Evaluating and Fine-Tuning a Few-Shot Model for Transcription of Historical Ciphers

Eliasson, Ingrid January 2023 (has links)
Thousands of historical ciphers, encrypted manuscripts, are stored in archives across Europe. Historical cryptology is the research field concerned with studying these manuscripts, combining the interest of humanistic fields with methods of cryptography and computational linguistics. Before a cipher can be decrypted by automatic means, it must first be transcribed into machine-readable digital text. Image processing techniques and Deep Learning have enabled transcription of handwritten text to be performed automatically, but the task faces challenges when ciphers constitute the target data. The main reason is a lack of labeled data, caused by the heterogeneity of handwriting and the tendency of ciphers to employ unique symbol sets. Few-Shot Learning is a machine learning framework which reduces the need for labeled data, using pretrained models in combination with support sets containing a few labeled examples from the target data set. This project is concerned with evaluating a Few-Shot model on the task of transcription of historical ciphers. The model is tested on pages from three in-domain ciphers which vary in handwriting style and symbol sets. The project also investigates further fine-tuning of the model by training it on a limited number of labeled symbol examples from the respective target ciphers. We find that the performance of the model is dependent on the handwriting style of the target document, and that certain model parameters should be explored individually for each data set. We further show that fine-tuning the model is indeed efficient, lowering the Symbol Error Rate (SER) by up to 27.6 percentage points.
6

Few-Shot Malware Detection Using A Novel Adversarial Reprogramming Model

Kumar, Ekula Praveen January 2022 (has links)
No description available.
7

Few-Shot Learning for Quality Inspection

Palmér, Jesper, Alsalehy, Ahmad January 2023 (has links)
The goal of this project is to find a suitable Few-Shot Learning (FSL) model that can be used in a fault detection system for use in an industrial setting. A dataset of Printed Circuit Board (PCB) images has been created to train different FSL models. This dataset is meant for evaluating FSL models in the specialized setting of fault detection in PCB manufacturing. FSL is a part of deep learning that has seen a large amount of development recently. Few-shot learning allows neural networks to learn on small datasets. In this thesis, various state-of-the-art FSL algorithms are implemented and tested on the custom PCB dataset. Different backbones are used to establish a benchmark for the tested FSL algorithms on three different datasets. Those datasets are ImageNet, PCB Defects, and the created PCB dataset. Our results show that ProtoNets combined with a ResNet12 backbone achieved the highest accuracy in two test scenarios. In those tests, the model combination achieved 87.20% and 92.27% in 1-shot and 5-shot test scenarios, respectively. This thesis presents a Few-Shot Anomaly Detection (FSAD) model based on Vision Transformers (ViT). The model is compared to the state-of-the-art FSAD model DevNet on the MVTec-AD dataset. DevNet and ViT are chosen for comparison because they both approach the problem by dividing images into patches. How the models handle the image patches is however very different. The results indicate that ViT Deviation does not obtain as high AUC-ROC and AUC-PR scores as DevNet. This is because of the use of the very deep ViT architecture in the ViT Deviation model. A shallower transformer-based model is believed to be better suited for FSAD. Improvements for ViT Deviation are suggested for future work. The most notable suggested improvement is the use of the FS-CT architecture as an FSAD model because of the high accuracy it achieves in classification.
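The ProtoNets approach named in the abstract above classifies a query by comparing it with class prototypes. A minimal sketch of that idea follows, assuming the embeddings have already been produced by some backbone (ResNet12 in the thesis; toy 2-D vectors here). The helper names and the labels "ok" and "defect" are hypothetical, and the episodic training of the backbone is not shown.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def build_prototypes(support):
    """Map each class label to the mean of its support embeddings."""
    return {label: mean_vector(vectors) for label, vectors in support.items()}


def classify(query, prototypes):
    """Return the label whose prototype is nearest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# 2-shot support set with two classes (toy embeddings, not real PCB features).
support = {
    "ok": [[0.0, 0.0], [0.0, 2.0]],
    "defect": [[4.0, 4.0], [6.0, 4.0]],
}
prototypes = build_prototypes(support)
label_near_ok = classify([1.0, 1.0], prototypes)
label_near_defect = classify([4.0, 5.0], prototypes)
```

In a 1-shot scenario each prototype is simply the single support embedding; with 5 shots the mean averages out some of the per-example noise, which is consistent with the higher 5-shot accuracy reported above.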
8

A comparative evaluation of machine learning models for engagement classification during presentations: A comparison of distance- and non-distance-based machine learning models for presentation classification and class likelihood estimation

Ali Omer Bajallan, Rebwar January 2022 (has links)
In recent years, there has been a significant increase in the usage of audience engagement platforms, which have allowed for engaging interactions between presenters and their audiences. The increased popularity of the platforms comes from the fact that engaging and interactive presentations have been shown to improve learning outcomes and create positive presentation experiences. However, using the platforms does not guarantee that your audience is engaged and participating. Given that the added value of engaging presentations only applies if the audience is actually engaged, it increases the need to know if and how engaged your audience is. The usage of audience engagement platforms has allowed for new ways of engagement to be studied. By utilizing the data gathered from the interactive presentation sessions, engagement can be studied and quantified through the modeling of the data. As the usage of audience engagement platforms and the study of presentation engagement is relatively new, there exists a limited amount of labeled data quantifying the level of engagement during presentations. To model the data, machine learning models should therefore be trained to generalize by being exposed to a limited number of presentation samples. This technique of training machine learning models is also referred to as few-shot learning. Distance-based machine learning models are defined in this study as models that make classifications and inferences by calculating distances between observations or observation class representations. Distance-based models have previously shown relatively good performance in few-shot learning applications, and interest therefore lies in expanding their application areas. This study presents a comparative evaluation of distance- and non-distance-based machine learning models given the problem of classifying presentations as being engaged or non-engaged, and estimating presentation class likelihoods in a few-shot learning context. 
A presentation-level dataset was gathered from the interactive presentation sessions, and each presentation observation was labeled as being engaged or non-engaged. The machine learning models were then trained to model the data and evaluated in terms of how well they were able to generalize to unseen testing samples by being exposed to a limited number of training observations. In particular, their classification and class likelihood estimation performances were evaluated. The results show that the distance-based models outperformed the non-distance-based models (artificial neural network and relevance vector machine) on the presentation class likelihood estimation problem. The metric learning nearest neighbor classifier was the only distance-based model that outperformed all the non-distance-based models on both the presentation classification and class likelihood estimation problems.
9

Metric Learning via Linear Embeddings for Human Motion Recognition

Kong, ByoungDoo 18 December 2020 (has links)
We consider the application of Few-Shot Learning (FSL) and dimensionality reduction to the problem of human motion recognition (HMR). The structure of human motion has unique characteristics such as its dynamic and high-dimensional nature. Recent research on human motion recognition uses deep neural networks with multiple layers. Most importantly, large datasets will need to be collected to use such networks to analyze human motion. This process is both time-consuming and expensive since a large motion capture database must be collected and labeled. Despite significant progress having been made in human motion recognition, state-of-the-art algorithms still misclassify actions because of factors such as the difficulty of obtaining large-scale labeled human motion datasets. To address these limitations, we use metric-based FSL methods that use small-size data in conjunction with dimensionality reduction. We also propose a modified dimensionality reduction scheme based on the preservation of secants tailored to arbitrary useful distances, such as the geodesic distance learned by ISOMAP. We provide multiple experimental results that demonstrate improvements in human motion classification.
10

On Transfer Learning Techniques for Machine Learning

Debasmit Das (8314707) 30 April 2020 (has links)
Recent progress in machine learning has been mainly due to the availability of large amounts of annotated data used for training complex models with deep architectures. Annotating this training data becomes burdensome and creates a major bottleneck in maintaining machine-learning databases. Moreover, these trained models fail to generalize to new categories or new varieties of the same categories. This is because new categories or new varieties have data distributions that differ from the training data distribution. To tackle these problems, this thesis proposes to develop a family of transfer-learning techniques that can deal with different training (source) and testing (target) distributions with the assumption that the availability of annotated data is limited in the testing domain. This is done by using the auxiliary data-abundant source domain from which useful knowledge is transferred that can be applied to the data-scarce target domain. This transferable knowledge serves as a prior that biases target-domain predictions and prevents the target-domain model from overfitting. Specifically, we explore structural priors that encode relational knowledge between different data entities, which provides more informative bias than traditional priors. The choice of the structural prior depends on the information availability and the similarity between the two domains. Depending on the domain similarity and the information availability, we divide the transfer learning problem into four major categories and propose different structural priors to solve each of these sub-problems.
This thesis first focuses on the unsupervised-domain-adaptation problem, where we propose to minimize domain discrepancy by transforming labeled source-domain data to be close to unlabeled target-domain data. For this problem, the categories remain the same across the two domains and hence we assume that the structural relationship between the source-domain samples is carried over to the target domain. Thus, a graph or hyper-graph is constructed as the structural prior from both domains and a graph/hyper-graph matching formulation is used to transform samples in the source domain to be closer to samples in the target domain. An efficient optimization scheme is then proposed to tackle the time and memory inefficiencies associated with the matching problem. The few-shot learning problem is studied next, where we propose to transfer knowledge from source-domain categories containing abundantly labeled data to novel categories in the target domain that contain only a few labeled data. The knowledge transfer biases the novel category predictions and prevents the model from overfitting. The knowledge is encoded using a neural-network-based prior that transforms a data sample to its corresponding class prototype. This neural network is trained from the source-domain data and applied to the target-domain data, where it transforms the few-shot samples to the novel-class prototypes for better recognition performance. The few-shot learning problem is then extended to the situation where we do not have access to the source-domain data but only have access to the source-domain class prototypes. In this limited information setting, parametric neural-network-based priors would overfit to the source-class prototypes and hence we seek a non-parametric-based prior using manifolds. A piecewise linear manifold is used as a structural prior to fit the source-domain-class prototypes. This structure is extended to the target domain, where the novel-class prototypes are found by projecting the few-shot samples onto the manifold. Finally, the zero-shot learning problem is addressed, which is an extreme case of the few-shot learning problem where we do not have any labeled data in the target domain. However, we have high-level information for both the source and target domain categories in the form of semantic descriptors. We learn the relation between the sample space and the semantic space, using a regularized neural network so that classification of the novel categories can be carried out in a common representation space. This same neural network is then used in the target domain to relate the two spaces. In case we want to generate data for the novel categories in the target domain, we can use a constrained generative adversarial network instead of a traditional neural network. Thus, we use structural priors like graphs, neural networks and manifolds to relate various data entities like samples, prototypes and semantics for these different transfer learning sub-problems. We explore additional post-processing steps like pseudo-labeling, domain adaptation and calibration and enforce algorithmic and architectural constraints to further improve recognition performance. Experimental results on standard transfer learning image recognition datasets produced competitive results with respect to previous work. Further experimentation and analyses of these methods provided better understanding of machine learning as well.
