1 |
Towards Data-efficient Graph Learning
Zhang, Qiannan 05 1900
Graphs are commonly employed to model complex data and discover latent patterns and relationships between entities in the real world. Canonical graph learning models have achieved remarkable progress in modeling and inference on graph-structured data that consists of nodes connected by edges. Generally, they rely on abundant labeled data for model training and thus inevitably suffer from label scarcity, since data annotation is expensive and difficult in practice. Data-efficient graph learning attempts to address the prevailing data scarcity issue in graph mining problems. Its key idea is to transfer knowledge from related resources to obtain models that generalize well to target graph-related tasks with only a few annotations. However, generalizing models to data-scarce scenarios faces several challenges: 1) dealing with graph structure and structural heterogeneity to extract transferable knowledge; 2) selecting beneficial and fine-grained knowledge for effective transfer; 3) addressing the divergence across different resources to promote knowledge transfer. Motivated by these challenges, the dissertation focuses on three perspectives: knowledge extraction with graph heterogeneity, knowledge selection, and knowledge transfer. The proposed models are applied to various node classification and graph classification tasks in low-data regimes, evaluated on a variety of datasets, and shown to be effective compared with state-of-the-art baselines.
|
2 |
Few-Shot and Zero-Shot Learning for Information Extraction
Gong, Jiaying 31 May 2024
Information extraction aims to automatically extract structured information from unstructured texts.
Supervised information extraction requires large quantities of labeled training data, which are time-consuming and labor-intensive to obtain. This dissertation focuses on information extraction, especially relation extraction and attribute-value extraction in e-commerce, with few labeled (few-shot learning) or even no labeled (zero-shot learning) training data. We explore multi-source auxiliary information and novel learning techniques to integrate semantic auxiliary information with the input text to improve few-shot learning and zero-shot learning.
For zero-shot and few-shot relation extraction, the first method explores the existing data statistics and leverages auxiliary information including labels, synonyms of labels, keywords, and hypernyms of named entities to enable zero-shot learning for the unlabeled data. We build an automatic hypernym extraction framework to help acquire hypernyms of different entities directly from the web. The second method explores the relations between seen classes and new classes. We propose a prompt-based model with semantic knowledge augmentation to recognize new relation triplets under the zero-shot setting. In this method, we transform the problem of zero-shot learning into supervised learning with the generated augmented data for new relations. We design the prompts for training using the auxiliary information based on an external knowledge graph to integrate semantic knowledge learned from seen relations. The third work utilizes auxiliary information from images to enhance few-shot learning. We propose a multi-modal few-shot relation extraction model that leverages both textual and visual semantic information to learn a multi-modal representation jointly. To supplement the missing contexts in text, this work integrates both local features (object-level) and global features (pixel-level) from different modalities through image-guided attention, object-guided attention, and hybrid feature attention to solve the problem of sparsity and noise.
We then explore few-shot and zero-shot aspect (attribute-value) extraction in the e-commerce application field. The first work studies multi-label few-shot learning by leveraging the auxiliary information of anchor (label) and category description based on prototypical networks, where hybrid attention helps alleviate ambiguity and capture more informative semantics by calculating both the label-relevant and query-related weights. A dynamic threshold is learned by integrating the semantic information from support and query sets to achieve multi-label inference. The second work explores multi-label zero-shot learning via semi-inductive link prediction on a heterogeneous hypergraph. The heterogeneous hypergraph is built with higher-order relations (generated by the auxiliary information of user behavior data and product inventory data) to capture the complex and interconnected relations between users and products. / Doctor of Philosophy / Information extraction is the process of automatically extracting structured information from unstructured sources, such as plain text documents, web pages, images, and so on. In this dissertation, we first focus on general relation extraction, which aims at identifying and classifying semantic relations between entities. For example, given the sentence "Peter was born in Manchester." in a newspaper, the structured information (Peter, place of birth, Manchester) can be extracted. Then, we focus on attribute-value (aspect) extraction in the application field, which aims at extracting attribute-value pairs from product descriptions or images on e-commerce websites. For example, given a product description or image of a handbag, the brand (e.g., brand: Chanel), color (e.g., color: black), and other structured information can be extracted from the product, which provides a better search and recommendation experience for customers.
With the advancement of deep learning techniques, machines (models) trained with large quantities of example input data and the corresponding desired output data can perform automatic information extraction tasks with high accuracy. Such example input data and the corresponding desired output data are also called annotated data. However, with technological innovation and social change, new data (e.g., articles, products, etc.) is being generated continuously. It is difficult, time-consuming, and costly to annotate large quantities of new data for training. In this dissertation, we explore several different methods to help the model achieve good performance with only a few (few-shot learning) or even no labeled data (zero-shot learning) for training.
Humans are born with no prior knowledge, but they can still recognize new information based on their existing knowledge by continuously learning. Inspired by how human beings learn new knowledge, we explore different auxiliary information that can benefit few-shot and zero-shot information extraction. We study auxiliary information from existing data statistics, knowledge graphs, corresponding images, labels, user behavior data, product inventory data, optical characters, etc. We enable few-shot and zero-shot learning by adding auxiliary information to the training data. For example, we study the data statistics of both labeled and unlabeled data. We use data augmentation and prompts to generate training samples for unlabeled data. We utilize graphs to learn general patterns and representations that can potentially transfer to unseen nodes and relations. This dissertation explores how the auxiliary information above can help improve the performance of information extraction with few or even no annotated training data.
|
3 |
Learning with Limited Labeled Data: Techniques and Applications
Lei, Shuo 11 October 2023
Recent advances in large neural network-style models have demonstrated great performance in various applications, such as image generation, question answering, and audio classification. However, these deep and high-capacity models require a large amount of labeled data to function properly, rendering them inapplicable in many real-world scenarios. This dissertation focuses on the development and evaluation of advanced machine learning algorithms to solve the following research questions: (1) How to learn novel classes with limited labeled data, (2) How to adapt a large pre-trained model to the target domain if only unlabeled data is available, (3) How to boost the performance of the few-shot learning model with unlabeled data, and (4) How to utilize limited labeled data to learn new classes without the training data in the same domain.
First, we study few-shot learning in text classification tasks. Meta-learning is becoming a popular approach for addressing few-shot text classification and has achieved state-of-the-art performance. However, the performance of existing approaches heavily depends on the interclass variance of the support set. To address this problem, we propose a TART network for few-shot text classification. The model enhances the generalization by transforming the class prototypes to per-class fixed reference points in task-adaptive metric spaces. In addition, we design a novel discriminative reference regularization to maximize divergence between transformed prototypes in task-adaptive metric spaces to improve performance further.
In the second problem, we focus on self-learning in cross-lingual transfer tasks. Our goal here is to develop a framework that lets the pretrained cross-lingual model continue learning from a large amount of unlabeled data. Existing self-learning methods in cross-lingual transfer tasks suffer from the large number of incorrectly pseudo-labeled samples used in the training phase. We first design an uncertainty-aware cross-lingual transfer framework with pseudo-partial-labels. We also propose a novel pseudo-partial-label estimation method that considers prediction confidences and a limit on the number of candidate classes.
Next, to boost the performance of the few-shot learning model with unlabeled data, we propose a semi-supervised approach for the few-shot semantic segmentation task. Existing solutions for few-shot semantic segmentation cannot easily be applied to utilize image-level weak annotations. We propose a class-prototype augmentation method to enrich the prototype representation by utilizing a few image-level annotations, achieving superior performance in one-/multi-way and weak annotation settings. We also design a robust strategy with soft-masked average pooling to handle the noise in image-level annotations, which considers the prediction uncertainty and employs a task-specific threshold to mask the distraction.
Finally, we study cross-domain few-shot learning in the semantic segmentation task. Most existing few-shot segmentation methods consider a setting where base classes are drawn from the same domain as the new classes. Nevertheless, gathering enough training data for meta-learning is either unattainable or impractical in many applications. We extend few-shot semantic segmentation to a new task, called Cross-Domain Few-Shot Semantic Segmentation (CD-FSS), which aims to generalize the meta-knowledge from domains with sufficient training labels to low-resource domains. Then, we establish a new benchmark for the CD-FSS task and evaluate both representative few-shot segmentation methods and transfer learning based methods on the proposed benchmark. We then propose a novel Pyramid-Anchor-Transformation based few-shot segmentation network (PATNet), in which domain-specific features are transformed into domain-agnostic ones so that downstream segmentation modules can adapt quickly to unseen domains. / Doctor of Philosophy / Nowadays, deep learning techniques play a crucial role in our everyday existence. In addition, they are crucial to the success of many e-commerce and local businesses for enhancing data analytics and decision-making. Notable applications include intelligent transportation, intelligent healthcare, the generation of natural language, and intrusion detection, among others. To achieve reasonable performance on a new task, these deep and high-capacity models require thousands of labeled examples, which increases the data collection effort and computation costs associated with training a model. Moreover, in many disciplines, it might be difficult or even impossible to obtain data due to concerns such as privacy and safety.
This dissertation focuses on learning with limited labeled data in natural language processing and computer vision tasks. To recognize novel classes with a few examples in text classification tasks, we develop a deep learning-based model that can capture both cross-task transferable knowledge and task-specific features. We also build an uncertainty-aware self-learning framework and a semi-supervised few-shot learning method, which allow us to boost the pre-trained model with easily accessible unlabeled data. In addition, we propose a cross-domain few-shot semantic segmentation method to generalize the model to different domains with a few examples. By handling these unique challenges in learning with limited labeled data and developing suitable approaches, we hope to improve the efficiency and generalization of deep learning methods in the real world.
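The abstract mentions soft-masked average pooling with a task-specific threshold but gives no formula. As a rough illustration only, the sketch below pools per-pixel feature vectors into one class prototype, weighting each pixel by a soft mask value and ignoring pixels whose weight falls below a threshold. The function name, the list-based data layout, and the hard cut-off rule are all assumptions made for this example, not the dissertation's actual method.

```python
def soft_masked_average_pooling(features, mask, threshold=0.5):
    """Pool per-pixel features into a single prototype vector.

    features:  list of feature vectors, one per pixel (hypothetical layout)
    mask:      list of soft weights in [0, 1], one per pixel
    threshold: weights below this value are treated as distraction and dropped
               (an assumed rule; the source only says a task-specific
               threshold is used)
    """
    dim = len(features[0])
    weighted_sum = [0.0] * dim
    weight_total = 0.0
    for vector, weight in zip(features, mask):
        if weight < threshold:
            continue  # mask out low-confidence pixels
        for k in range(dim):
            weighted_sum[k] += weight * vector[k]
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no pixel passed the threshold")
    return [value / weight_total for value in weighted_sum]


# Two confident foreground pixels and one noisy pixel that gets masked out.
prototype = soft_masked_average_pooling(
    features=[[1.0, 0.0], [3.0, 0.0], [100.0, 100.0]],
    mask=[1.0, 1.0, 0.1],
    threshold=0.5,
)
```

With the threshold set to 0.0 the noisy pixel would still contribute, scaled by its small weight, which is the plain soft-masked average.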
|
4 |
Evaluating Transcription of Ciphers with Few-Shot Learning
Milioni, Nikolina January 2022
Ciphers are encrypted documents created to hide their content from those who were not the receivers of the message. Different types of symbols, such as zodiac signs, alchemical symbols, alphabet letters or digits are exploited to compose the encrypted text which needs to be decrypted to gain access to the content of the documents. The first step before decryption is the transcription of the cipher. The purpose of this thesis is to evaluate an automatic transcription tool from image to a text format to provide a transcription of the cipher images. We implement a supervised few-shot deep-learning model which is tested on different types of encrypted documents and use various evaluation metrics to assess the results. We show that the few-shot model presents promising results on seen data with Symbol Error Rates (SER) ranging from 8.21% to 47.55% and accuracy scores from 80.13% to 90.27%, whereas SER in out-of-domain datasets reaches 79.91%. While a wide range of symbols are correctly transcribed, the erroneous symbols mainly contain diacritics or are punctuation marks.
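The abstract reports Symbol Error Rates without defining the metric. A common definition, assumed here, is the edit (Levenshtein) distance between the predicted and reference symbol sequences divided by the reference length; the thesis may compute it differently. The function names below are illustrative.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two symbol sequences."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_symbol in enumerate(reference, start=1):
        current = [i]
        for j, hyp_symbol in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (0 if ref_symbol == hyp_symbol else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def symbol_error_rate(reference, hypothesis):
    """SER under the assumed definition: edit distance / reference length."""
    if not reference:
        raise ValueError("reference must contain at least one symbol")
    return edit_distance(reference, hypothesis) / len(reference)


# One substituted symbol and one missing symbol out of four: SER = 0.5
ser = symbol_error_rate(["a", "b", "c", "d"], ["a", "x", "c"])
```

Under this definition the SER can exceed 100% when the hypothesis contains many inserted symbols, so it is not simply the complement of an accuracy score.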
|
5 |
The "What"-"Where" Network: A Tool for One-Shot Image Recognition and Localization
Hurlburt, Daniel 06 January 2021
One common shortcoming of modern computer vision is the inability of most models to generalize to new classes, i.e., one-/few-shot image recognition. We propose a new problem formulation for this task and present a network architecture and training methodology to solve it. Further, we provide insights into how careful attention not just to the data, but to the way the data is presented to the model, can have a significant impact on performance. Using these methods, we achieve high accuracy in few-shot image recognition tasks.
|
6 |
Evaluating and Fine-Tuning a Few-Shot Model for Transcription of Historical Ciphers
Eliasson, Ingrid January 2023
Thousands of historical ciphers, encrypted manuscripts, are stored in archives across Europe. Historical cryptology is the research field concerned with studying these manuscripts, combining the interest of humanistic fields with methods of cryptography and computational linguistics. Before a cipher can be decrypted by automatic means, it must first be transcribed into machine-readable digital text. Image processing techniques and Deep Learning have enabled transcription of handwritten text to be performed automatically, but the task faces challenges when ciphers constitute the target data. The main reason is a lack of labeled data, caused by the heterogeneity of handwriting and the tendency of ciphers to employ unique symbol sets. Few-Shot Learning is a machine learning framework which reduces the need for labeled data, using pretrained models in combination with support sets containing a few labeled examples from the target data set. This project is concerned with evaluating a Few-Shot model on the task of transcription of historical ciphers. The model is tested on pages from three in-domain ciphers which vary in handwriting style and symbol sets. The project also investigates further fine-tuning of the model by training it on a limited number of labeled symbol examples from the respective target ciphers. We find that the performance of the model is dependent on the handwriting style of the target document, and that certain model parameters should be explored individually for each data set. We further show that fine-tuning the model is indeed efficient, lowering the Symbol Error Rate (SER) by up to 27.6 percentage points.
|
7 |
Few-Shot Learning for Quality Inspection
Palmér, Jesper, Alsalehy, Ahmad January 2023
The goal of this project is to find a suitable Few-Shot Learning (FSL) model that can be used in a fault detection system in an industrial setting. A dataset of Printed Circuit Board (PCB) images has been created to train different FSL models. This dataset is meant for evaluating FSL models in the specialized setting of fault detection in PCB manufacturing. FSL is a part of deep learning that has seen a large amount of development recently. Few-shot learning allows neural networks to learn on small datasets. In this thesis, various state-of-the-art FSL algorithms are implemented and tested on the custom PCB dataset. Different backbones are used to establish a benchmark for the tested FSL algorithms on three different datasets. Those datasets are ImageNet, PCB Defects, and the created PCB dataset. Our results show that ProtoNets combined with the ResNet12 backbone achieved the highest accuracy in two test scenarios. In those tests, the model combination achieved 87.20% and 92.27% in the 1-shot and 5-shot test scenarios, respectively. This thesis presents a Few-Shot Anomaly Detection (FSAD) model based on Vision Transformers (ViT). The model is compared to the state-of-the-art FSAD model DevNet on the MVTec-AD dataset. DevNet and ViT are chosen for comparison because they both approach the problem by dividing images into patches. How the models handle the image patches is, however, very different. The results indicate that ViT Deviation does not obtain as high AUC-ROC and AUC-PR scores as DevNet. This is because of the use of the very deep ViT architecture in the ViT Deviation model. A shallower transformer-based model is believed to be better suited for FSAD. Improvements for ViT Deviation are suggested for future work. The most notable suggested improvement is the use of the FS-CT architecture as an FSAD model because of the high accuracy it achieves in classification.
/ The goal of this project is to find a suitable Few-Shot Learning (FSL) model that can be used in a fault detection system in an industrial setting. A dataset of Printed Circuit Board (PCB) images has been created to train different FSL models. This dataset is intended for evaluating FSL models in the specialized area of fault detection in PCB manufacturing. FSL is a part of deep learning that has developed considerably in recent times. FSL allows neural networks to learn on small amounts of data. In this thesis, various state-of-the-art FSL algorithms are implemented and tested on the custom PCB dataset. Different backbone models are used to establish a benchmark for the tested FSL algorithms on three different datasets. These datasets are ImageNet [6], PCB Defects [14], and the created PCB dataset. Our results show that ProtoNets combined with the ResNet12 backbone achieved the highest accuracy in two test scenarios. In those tests, the model combination achieved 87.20% and 92.27% in the 1-shot and 5-shot test scenarios, respectively. This thesis presents a Few-Shot Anomaly Detection (FSAD) model based on Vision Transformers (ViT). The model is compared with the FSAD model DevNet on the MVTec-AD dataset. DevNet and ViT are chosen for comparison because both approach the problem by dividing images into smaller patches. How the models handle the patches is, however, very different. The results indicate that ViT Deviation does not achieve as high AUC-ROC and AUC-PR as DevNet. This is due to the use of the very deep ViT architecture in the ViT Deviation model. A shallower ViT-based model is believed to be better suited for FSAD. Improvements to ViT Deviation are suggested for future work. The most notable suggested improvement is the use of the FS-CT architecture as an FSAD model because of the promising results it achieves in classification.
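The best-performing classifier in this abstract is ProtoNets (prototypical networks). The core idea can be sketched as follows: each class prototype is the mean of the embedded support examples for that class, and a query is assigned to the nearest prototype. The sketch assumes embeddings have already been produced by some backbone (ResNet12 in the thesis), and the labels and vectors are made up for illustration.

```python
import math


def class_prototypes(support):
    """Average the support embeddings of each class.

    support: dict mapping a class label to a list of embedding vectors
             (the embeddings would come from a backbone network, omitted here)
    """
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(vector[k] for vector in vectors) / len(vectors) for k in range(dim)
        ]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# A 2-way, 2-shot toy episode with hypothetical 2-D embeddings.
support_set = {
    "ok": [[0.0, 0.0], [0.0, 2.0]],
    "defect": [[10.0, 10.0], [12.0, 10.0]],
}
prototypes = class_prototypes(support_set)
prediction = classify([1.0, 1.0], prototypes)
```

In a 1-shot scenario each prototype is simply the single support embedding, which is why accuracy typically rises when moving from 1-shot to 5-shot, as in the figures reported above.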
|
8 |
Bridging Machine Learning and Experimental Design for Enhanced Data Analysis and Optimization
Guo, Qing 19 July 2024
Experimental design is a powerful tool for gathering highly informative observations using a small number of experiments. The demand for smart data collection strategies is increasing due to the need to save time and budget, especially in online experiments and machine learning. However, the traditional experimental design method falls short in systematically assessing the effects of changing variables. Specifically within Artificial Intelligence (AI), the challenge lies in assessing the impacts of model structures and training strategies on task performance with a limited number of trials. This shortfall underscores the need for novel approaches. On the other hand, the optimal design criterion has typically been model-based in the classic design literature, which restricts the flexibility of experimental design strategies. However, machine learning's inherent flexibility can enable efficient estimation of metrics using nonparametric and optimization techniques, thereby broadening the horizons of experimental design possibilities.
In this dissertation, the aim is to develop a set of novel methods to bridge the merits between these two domains: 1) applying ideas from statistical experimental design to enhance data efficiency in machine learning, and 2) leveraging powerful deep neural networks to optimize experimental design strategies.
This dissertation consists of 5 chapters. Chapter 1 provides a general introduction to mutual information, fractional factorial design, hyper-parameter tuning, multi-modality, etc. In Chapter 2, I propose a new mutual information estimator FLO by integrating techniques from variational inference (VAE), contrastive learning, and convex optimization. I apply FLO to broad data science applications, such as efficient data collection, transfer learning, fair learning, etc. Chapter 3 introduces a new design strategy called multi-layer sliced design (MLSD) with the application of AI assurance. It focuses on exploring the effects of hyper-parameters under different models and optimization strategies. Chapter 4 investigates classic vision challenges via multimodal large language models by implicitly optimizing mutual information and thoroughly exploring training strategies. Chapter 5 concludes this proposal and discusses several future research topics. / Doctor of Philosophy / In the digital age, artificial intelligence (AI) is reshaping our interactions with technology through advanced machine learning models. These models are complex, often opaque mechanisms that present challenges in understanding their inner workings. This complexity necessitates numerous experiments with different settings to optimize performance, which can be costly. Consequently, it is crucial to strategically evaluate the effects of various strategies on task performance using a limited number of trials. The Design of Experiments (DoE) offers invaluable techniques for investigating and understanding these complex systems efficiently. Moreover, integrating machine learning models can further enhance the DoE. Traditionally, experimental designs pre-specify a model and focus on finding the best strategies for experimentation. This assumption can restrict the adaptability and applicability of experimental designs. 
However, the inherent flexibility of machine learning models can enhance the capabilities of DoE, unlocking new possibilities for efficiently optimizing experimental strategies through an information-centric approach. Moreover, the information-based method can also be beneficial in other AI applications, including self-supervised learning, fair learning, transfer learning, etc. The research presented in this dissertation aims to bridge machine learning and experimental design, offering new insights and methodologies that benefit both AI techniques and DoE.
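Mutual information is the quantity that the FLO estimator in Chapter 2 targets. FLO itself is a variational estimator and is not reproduced here; the sketch below only evaluates the textbook definition, I(X;Y) = sum over x,y of p(x,y) log[p(x,y) / (p(x)p(y))], on a small discrete joint distribution that is assumed to be fully known.

```python
import math


def mutual_information(joint):
    """Mutual information in nats of a discrete joint distribution.

    joint: 2-D list where joint[i][j] = p(X = i, Y = j); entries sum to 1
    """
    p_x = [sum(row) for row in joint]
    p_y = [sum(column) for column in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p_xy in enumerate(row):
            if p_xy > 0.0:  # zero-probability cells contribute nothing
                total += p_xy * math.log(p_xy / (p_x[i] * p_y[j]))
    return total


independent = mutual_information([[0.25, 0.25], [0.25, 0.25]])
dependent = mutual_information([[0.5, 0.0], [0.0, 0.5]])
```

Independent variables give zero, while two perfectly coupled fair coins give log 2 nats. In practice the joint distribution is unknown and high-dimensional, which is why estimators such as the one proposed in the dissertation are needed.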
|
9 |
Metric Learning via Linear Embeddings for Human Motion Recognition
Kong, ByoungDoo 18 December 2020
We consider the application of Few-Shot Learning (FSL) and dimensionality reduction to the problem of human motion recognition (HMR). The structure of human motion has unique characteristics such as its dynamic and high-dimensional nature. Recent research on human motion recognition uses deep neural networks with multiple layers. Most importantly, large datasets need to be collected to use such networks to analyze human motion. This process is both time-consuming and expensive since a large motion capture database must be collected and labeled. Although significant progress has been made in human motion recognition, state-of-the-art algorithms still misclassify actions, in part because of the difficulty of obtaining large-scale labeled human motion datasets. To address these limitations, we use metric-based FSL methods that use small-size data in conjunction with dimensionality reduction. We also propose a modified dimensionality reduction scheme based on the preservation of secants tailored to arbitrary useful distances, such as the geodesic distance learned by ISOMAP. We provide multiple experimental results that demonstrate improvements in human motion classification.
|
10 |
On Transfer Learning Techniques for Machine Learning
Debasmit Das (8314707) 30 April 2020
Recent progress in machine learning has been mainly due to the availability of large amounts of annotated data used for training complex models with deep architectures. Annotating this training data becomes burdensome and creates a major bottleneck in maintaining machine-learning databases. Moreover, these trained models fail to generalize to new categories or new varieties of the same categories. This is because new categories or new varieties have a data distribution different from the training data distribution. To tackle these problems, this thesis proposes to develop a family of transfer-learning techniques that can deal with different training (source) and testing (target) distributions with the assumption that the availability of annotated data is limited in the testing domain. This is done by using the auxiliary data-abundant source domain, from which useful knowledge is transferred and applied to the data-scarce target domain. This transferable knowledge serves as a prior that biases target-domain predictions and prevents the target-domain model from overfitting. Specifically, we explore structural priors that encode relational knowledge between different data entities, which provides more informative bias than traditional priors. The choice of the structural prior depends on the information availability and the similarity between the two domains. Depending on the domain similarity and the information availability, we divide the transfer learning problem into four major categories and propose different structural priors to solve each of these sub-problems.
This thesis first focuses on the unsupervised-domain-adaptation problem, where we propose to minimize domain discrepancy by transforming labeled source-domain data to be close to unlabeled target-domain data. For this problem, the categories remain the same across the two domains and hence we assume that the structural relationship between the source-domain samples is carried over to the target domain. Thus, a graph or hyper-graph is constructed as the structural prior from both domains and a graph/hyper-graph matching formulation is used to transform samples in the source domain to be closer to samples in the target domain. An efficient optimization scheme is then proposed to tackle the time and memory inefficiencies associated with the matching problem. The few-shot learning problem is studied next, where we propose to transfer knowledge from source-domain categories containing abundantly labeled data to novel categories in the target domain that contain only a few labeled data. The knowledge transfer biases the novel category predictions and prevents the model from overfitting. The knowledge is encoded using a neural-network-based prior that transforms a data sample to its corresponding class prototype. This neural network is trained from the source-domain data and applied to the target-domain data, where it transforms the few-shot samples to the novel-class prototypes for better recognition performance. The few-shot learning problem is then extended to the situation where we do not have access to the source-domain data but only have access to the source-domain class prototypes. In this limited information setting, parametric neural-network-based priors would overfit to the source-class prototypes and hence we seek a non-parametric-based prior using manifolds. A piecewise linear manifold is used as a structural prior to fit the source-domain-class prototypes. This structure is extended to the target domain, where the novel-class prototypes are found by projecting the few-shot samples onto the manifold. Finally, the zero-shot learning problem is addressed, which is an extreme case of the few-shot learning problem where we do not have any labeled data in the target domain. However, we have high-level information for both the source and target domain categories in the form of semantic descriptors. We learn the relation between the sample space and the semantic space, using a regularized neural network so that classification of the novel categories can be carried out in a common representation space. This same neural network is then used in the target domain to relate the two spaces. In case we want to generate data for the novel categories in the target domain, we can use a constrained generative adversarial network instead of a traditional neural network. Thus, we use structural priors like graphs, neural networks and manifolds to relate various data entities like samples, prototypes and semantics for these different transfer learning sub-problems. We explore additional post-processing steps like pseudo-labeling, domain adaptation and calibration and enforce algorithmic and architectural constraints to further improve recognition performance. Experimental results on standard transfer learning image recognition datasets produced competitive results with respect to previous work. Further experimentation and analyses of these methods provided better understanding of machine learning as well.
|