Global ETD Search

1	Towards Data-efficient Graph Learning Zhang, Qiannan 05 1900 (has links) Graphs are commonly employed to model complex data and discover latent patterns and relationships between entities in the real world. Canonical graph learning models have achieved remarkable progress in modeling and inference on graph-structured data that consists of nodes connected by edges. Generally, they rely on abundant labeled data for model training and thus inevitably suffer from label scarcity, since data annotation is expensive and difficult in practice. Data-efficient graph learning attempts to address the prevailing data scarcity issue in graph mining problems. Its key idea is to transfer knowledge from related resources to obtain models that generalize well to target graph-related tasks with only a few annotations. However, generalizing models to data-scarce scenarios faces challenges including 1) dealing with graph structure and structural heterogeneity to extract transferable knowledge; 2) selecting beneficial and fine-grained knowledge for effective transfer; 3) addressing the divergence across different resources to promote knowledge transfer. Motivated by these challenges, the dissertation focuses on three perspectives, i.e., knowledge extraction with graph heterogeneity, knowledge selection, and knowledge transfer. The proposed models are applied to various node classification and graph classification tasks in low-data regimes, evaluated on a variety of datasets, and shown to be effective compared with state-of-the-art baselines. Graph learning few-shot learning heterogeneous graph learning
2	Few-Shot and Zero-Shot Learning for Information Extraction Gong, Jiaying 31 May 2024 (has links) Information extraction aims to automatically extract structured information from unstructured texts. Supervised information extraction requires large quantities of labeled training data, which are time-consuming and labor-intensive to produce. This dissertation focuses on information extraction, especially relation extraction and attribute-value extraction in e-commerce, with few labeled (few-shot learning) or even no labeled (zero-shot learning) training data. We explore multi-source auxiliary information and novel learning techniques to integrate semantic auxiliary information with the input text to improve few-shot learning and zero-shot learning. For zero-shot and few-shot relation extraction, the first method explores the existing data statistics and leverages auxiliary information including labels, synonyms of labels, keywords, and hypernyms of named entities to enable zero-shot learning for the unlabeled data. We build an automatic hypernym extraction framework to help acquire hypernyms of different entities directly from the web. The second method explores the relations between seen classes and new classes. We propose a prompt-based model with semantic knowledge augmentation to recognize new relation triplets under the zero-shot setting. In this method, we transform the problem of zero-shot learning into supervised learning with the generated augmented data for new relations. We design the prompts for training using the auxiliary information based on an external knowledge graph to integrate semantic knowledge learned from seen relations. The third work utilizes auxiliary information from images to enhance few-shot learning. We propose a multi-modal few-shot relation extraction model that leverages both textual and visual semantic information to learn a multi-modal representation jointly. 
To supplement the missing contexts in text, this work integrates both local features (object-level) and global features (pixel-level) from different modalities through image-guided attention, object-guided attention, and hybrid feature attention to solve the problem of sparsity and noise. We then explore the few-shot and zero-shot aspect (attribute-value) extraction in the e-commerce application field. The first work studies the multi-label few-shot learning by leveraging the auxiliary information of anchor (label) and category description based on the prototypical networks, where the hybrid attention helps alleviate ambiguity and capture more informative semantics by calculating both the label-relevant and query-related weights. A dynamic threshold is learned by integrating the semantic information from support and query sets to achieve multi-label inference. The second work explores multi-label zero-shot learning via semi-inductive link prediction of the heterogeneous hypergraph. The heterogeneous hypergraph is built with higher-order relations (generated by the auxiliary information of user behavior data and product inventory data) to capture the complex and interconnected relations between users and the products. / Doctor of Philosophy / Information extraction is the process of automatically extracting structured information from unstructured sources, such as plain text documents, web pages, images, and so on. In this dissertation, we will first focus on general relation extraction, which aims at identifying and classifying semantic relations between entities. For example, given the sentence `Peter was born in Manchester.' in the newspaper, structured information (Peter, place of birth, Manchester) can be extracted. Then, we focus on attribute-value (aspect) extraction in the application field, which aims at extracting attribute-value pairs from product descriptions or images on e-commerce websites. 
For example, given a product description or image of a handbag, the brand (e.g., brand: Chanel), color (e.g., color: black), and other structured information can be extracted from the product, which provides a better search and recommendation experience for customers. With the advancement of deep learning techniques, machines (models) trained with large quantities of example input data and the corresponding desired output data can perform automatic information extraction tasks with high accuracy. Such example input data and the corresponding desired output data are also called annotated data. However, with technological innovation and social change, new data (e.g., articles, products, etc.) is being generated continuously. It is difficult, time-consuming, and costly to annotate large quantities of new data for training. In this dissertation, we explore several different methods to help the model achieve good performance with only a few (few-shot learning) or even no labeled data (zero-shot learning) for training. Humans are born with no prior knowledge, but they can still recognize new information based on their existing knowledge by continuously learning. Inspired by how human beings learn new knowledge, we explore different auxiliary information that can benefit few-shot and zero-shot information extraction. We studied the auxiliary information from existing data statistics, knowledge graphs, corresponding images, labels, user behavior data, product inventory data, optical characters, etc. We enable few-shot and zero-shot learning by adding auxiliary information to the training data. For example, we study the data statistics of both labeled and unlabeled data. We use data augmentation and prompts to generate training samples when no labeled data is available. We utilize graphs to learn general patterns and representations that can potentially transfer to unseen nodes and relations. 
This dissertation explores how the different kinds of auxiliary information above can be used to improve the performance of information extraction with few or even no annotated training data. Information Extraction Few-Shot Learning Zero-Shot Learning
3	Learning with Limited Labeled Data: Techniques and Applications Lei, Shuo 11 October 2023 (has links) Recent advances in large neural network-style models have demonstrated great performance in various applications, such as image generation, question answering, and audio classification. However, these deep and high-capacity models require a large amount of labeled data to function properly, rendering them inapplicable in many real-world scenarios. This dissertation focuses on the development and evaluation of advanced machine learning algorithms to address the following research questions: (1) how to learn novel classes with limited labeled data, (2) how to adapt a large pre-trained model to the target domain if only unlabeled data is available, (3) how to boost the performance of the few-shot learning model with unlabeled data, and (4) how to utilize limited labeled data to learn new classes without training data in the same domain. First, we study few-shot learning in text classification tasks. Meta-learning is becoming a popular approach for addressing few-shot text classification and has achieved state-of-the-art performance. However, the performance of existing approaches heavily depends on the interclass variance of the support set. To address this problem, we propose a TART network for few-shot text classification. The model enhances generalization by transforming the class prototypes to per-class fixed reference points in task-adaptive metric spaces. In addition, we design a novel discriminative reference regularization to maximize divergence between transformed prototypes in task-adaptive metric spaces to improve performance further. In the second problem, we focus on self-learning in cross-lingual transfer tasks. Our goal here is to develop a framework that lets a pretrained cross-lingual model continue learning from a large amount of unlabeled data. 
Existing self-learning methods in cross-lingual transfer tasks suffer from the large number of incorrectly pseudo-labeled samples used in the training phase. We first design an uncertainty-aware cross-lingual transfer framework with pseudo-partial-labels. We also propose a novel pseudo-partial-label estimation method that considers prediction confidences and a limit on the number of candidate classes. Next, to boost the performance of the few-shot learning model with unlabeled data, we propose a semi-supervised approach for the few-shot semantic segmentation task. Existing solutions for few-shot semantic segmentation cannot easily be applied to utilize image-level weak annotations. We propose a class-prototype augmentation method to enrich the prototype representation by utilizing a few image-level annotations, achieving superior performance in one-/multi-way and weak annotation settings. We also design a robust strategy with soft-masked average pooling to handle the noise in image-level annotations, which considers the prediction uncertainty and employs a task-specific threshold to mask the distraction. Finally, we study cross-domain few-shot learning in the semantic segmentation task. Most existing few-shot segmentation methods consider a setting where base classes are drawn from the same domain as the new classes. Nevertheless, gathering enough training data for meta-learning is either unattainable or impractical in many applications. We extend few-shot semantic segmentation to a new task, called Cross-Domain Few-Shot Semantic Segmentation (CD-FSS), which aims to generalize the meta-knowledge from domains with sufficient training labels to low-resource domains. Then, we establish a new benchmark for the CD-FSS task and evaluate both representative few-shot segmentation methods and transfer-learning-based methods on the proposed benchmark. 
We then propose a novel Pyramid-Anchor-Transformation based few-shot segmentation network (PATNet), in which domain-specific features are transformed into domain-agnostic ones so that downstream segmentation modules can quickly adapt to unseen domains. / Doctor of Philosophy / Nowadays, deep learning techniques play a crucial role in our everyday existence. In addition, they are crucial to the success of many e-commerce and local businesses for enhancing data analytics and decision-making. Notable applications include intelligent transportation, intelligent healthcare, the generation of natural language, and intrusion detection, among others. To achieve reasonable performance on a new task, these deep and high-capacity models require thousands of labeled examples, which increases the data collection effort and computation costs associated with training a model. Moreover, in many disciplines, it might be difficult or even impossible to obtain data due to concerns such as privacy and safety. This dissertation focuses on learning with limited labeled data in natural language processing and computer vision tasks. To recognize novel classes with a few examples in text classification tasks, we develop a deep learning-based model that can capture both cross-task transferable knowledge and task-specific features. We also build an uncertainty-aware self-learning framework and a semi-supervised few-shot learning method, which allow us to boost the pre-trained model with easily accessible unlabeled data. In addition, we propose a cross-domain few-shot semantic segmentation method to generalize the model to different domains with a few examples. By handling these unique challenges in learning with limited labeled data and developing suitable approaches, we hope to improve the efficiency and generalization of deep learning methods in the real world. few-shot learning self-learning semantic segmentation natural language processing
4	Evaluating Transcription of Ciphers with Few-Shot Learning Milioni, Nikolina January 2022 (has links) Ciphers are encrypted documents created to hide their content from those who were not the receivers of the message. Different types of symbols, such as zodiac signs, alchemical symbols, alphabet letters or digits are exploited to compose the encrypted text which needs to be decrypted to gain access to the content of the documents. The first step before decryption is the transcription of the cipher. The purpose of this thesis is to evaluate an automatic transcription tool from image to a text format to provide a transcription of the cipher images. We implement a supervised few-shot deep-learning model which is tested on different types of encrypted documents and use various evaluation metrics to assess the results. We show that the few-shot model presents promising results on seen data with Symbol Error Rates (SER) ranging from 8.21% to 47.55% and accuracy scores from 80.13% to 90.27%, whereas SER in out-of-domain datasets reaches 79.91%. While a wide range of symbols are correctly transcribed, the erroneous symbols mainly contain diacritics or are punctuation marks. Ciphers Automatic Transcription Decrypt project Few-shot learning
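The Symbol Error Rate reported in the abstract above can be illustrated with a minimal sketch. It assumes SER is the Levenshtein edit distance between the reference and predicted symbol sequences, divided by the reference length; this is a common convention, but the abstract does not state the exact formula used in the thesis. The function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of symbols."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def symbol_error_rate(ref, hyp):
    """Assumed definition: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# One substituted symbol and one missing symbol against a 4-symbol reference.
print(symbol_error_rate(["a", "b", "c", "d"], ["a", "x", "c"]))
```

Under this definition, SER can exceed 100% when the prediction contains many spurious symbols, which is why it is usually reported alongside accuracy rather than instead of it.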
5	The "What"-"Where" Network: A Tool for One-Shot Image Recognition and Localization Hurlburt, Daniel 06 January 2021 (has links) One common shortcoming of modern computer vision is the inability of most models to generalize to new classes, i.e., one-/few-shot image recognition. We propose a new problem formulation for this task and present a network architecture and training methodology to solve it. Further, we provide insights into how careful focus on not just the data, but also the way the data are presented to the model, can have a significant impact on performance. Using these methods, we achieve high accuracy in few-shot image recognition tasks. computer vision semantic segmentation few-shot learning one-shot learning embedding Physical Sciences and Mathematics
6	Evaluating and Fine-Tuning a Few-Shot Model for Transcription of Historical Ciphers Eliasson, Ingrid January 2023 (has links) Thousands of historical ciphers, encrypted manuscripts, are stored in archives across Europe. Historical cryptology is the research field concerned with studying these manuscripts, combining the interests of humanistic fields with methods of cryptography and computational linguistics. Before a cipher can be decrypted by automatic means, it must first be transcribed into machine-readable digital text. Image processing techniques and Deep Learning have enabled transcription of handwritten text to be performed automatically, but the task faces challenges when ciphers constitute the target data. The main reason is a lack of labeled data, caused by the heterogeneity of handwriting and the tendency of ciphers to employ unique symbol sets. Few-Shot Learning is a machine learning framework which reduces the need for labeled data, using pretrained models in combination with support sets containing a few labeled examples from the target data set. This project is concerned with evaluating a Few-Shot model on the task of transcription of historical ciphers. The model is tested on pages from three in-domain ciphers which vary in handwriting style and symbol sets. The project also investigates further fine-tuning the model by training it on a limited number of labeled symbol examples from the respective target ciphers. We find that the performance of the model is dependent on the handwriting style of the target document, and that certain model parameters should be explored individually for each data set. We further show that fine-tuning the model is indeed efficient, lowering the Symbol Error Rate (SER) by up to 27.6 percentage points. historical ciphers few-shot learning automatic transcription
7	Few-Shot Malware Detection Using A Novel Adversarial Reprogramming Model Kumar, Ekula Praveen January 2022 (has links) No description available. Computer Science Computer Engineering Information Technology malware malware detection few-shot cybersecurity machine learning adversarial reprogramming
8	Few-Shot Learning for Quality Inspection Palmér, Jesper, Alsalehy, Ahmad January 2023 (has links) The goal of this project is to find a suitable Few-Shot Learning (FSL) model that can be used in a fault detection system for use in an industrial setting. A dataset of Printed Circuit Board (PCB) images has been created to train different FSL models. This dataset is meant for evaluating FSL models in the specialized setting of fault detection in PCB manufacturing. FSL is a part of deep learning that has seen a large amount of development recently. Few-shot learning allows neural networks to learn on small datasets. In this thesis, various state-of-the-art FSL algorithms are implemented and tested on the custom PCB dataset. Different backbones are used to establish a benchmark for the tested FSL algorithms on three different datasets. Those datasets are ImageNet, PCB Defects, and the created PCB dataset. Our results show that ProtoNets combined with a ResNet12 backbone achieved the highest accuracy in two test scenarios. In those tests, the model combination achieved 87.20% and 92.27% in the 1-shot and 5-shot test scenarios, respectively. This thesis presents a Few-Shot Anomaly Detection (FSAD) model based on Vision Transformers (ViT). The model is compared to the state-of-the-art FSAD model DevNet on the MVTec-AD dataset. DevNet and ViT are chosen for comparison because they both approach the problem by dividing images into patches. How the models handle the image patches is, however, very different. The results indicate that ViT Deviation does not obtain as high AUC-ROC and AUC-PR scores as DevNet. This is because of the use of the very deep ViT architecture in the ViT Deviation model. A shallower transformer-based model is believed to be better suited for FSAD. Improvements for ViT Deviation are suggested for future work. The most notable suggested improvement is the use of the FS-CT architecture as an FSAD model because of the high accuracy it achieves in classification. 
/ The goal of this project is to find a suitable Few-Shot Learning (FSL) model that can be used in a fault detection system in an industrial setting. A dataset of Printed Circuit Board (PCB) images has been created to train different FSL models. This dataset is intended for evaluating FSL models in the specialized area of fault detection in PCB manufacturing. FSL is a part of deep learning that has developed considerably in recent years. FSL allows neural networks to learn from small datasets. In this thesis, various state-of-the-art FSL algorithms are implemented and tested on the custom PCB dataset. Different backbones are used to establish a benchmark for the tested FSL algorithms on three different datasets. These datasets are ImageNet[6], PCB Defects[14], and the created PCB dataset. Our results show that ProtoNets combined with the ResNet12 backbone achieved the highest accuracy in two test scenarios. In those tests, the model combination achieved 87.20% and 92.27% in the 1-shot and 5-shot test scenarios, respectively. This thesis presents a Few-Shot Anomaly Detection (FSAD) model based on Vision Transformers (ViT). The model is compared with the FSAD model DevNet on the MVTec-AD dataset. DevNet and ViT are chosen for comparison because they both approach the problem by dividing images into smaller patches. How the models handle the patches is, however, very different. The results indicate that ViT Deviation does not achieve as high AUC-ROC and AUC-PR scores as DevNet. This is due to the use of the very deep ViT architecture in the ViT Deviation model. A shallower ViT-based model is believed to be better suited for FSAD. Improvements to ViT Deviation are suggested for future work. The most notable suggested improvement is the use of the FS-CT architecture as an FSAD model because of the promising results it achieves in classification. Few-Shot Learning AI Transformers ViT Deviation Vision Transformers Computer Sciences Datavetenskap (datalogi)
9	Bridging Machine Learning and Experimental Design for Enhanced Data Analysis and Optimization Guo, Qing 19 July 2024 (has links) Experimental design is a powerful tool for gathering highly informative observations using a small number of experiments. The demand for smart data collection strategies is increasing due to the need to save time and budget, especially in online experiments and machine learning. However, the traditional experimental design method falls short in systematically assessing changing variables' effects. Specifically within Artificial Intelligence (AI), the challenge lies in assessing the impacts of model structures and training strategies on task performances with a limited number of trials. This shortfall underscores the necessity for the development of novel approaches. On the other side, the optimal design criterion has typically been model-based in classic design literature, which leads to restricting the flexibility of experimental design strategies. However, machine learning's inherent flexibility can empower the estimation of metrics efficiently using nonparametric and optimization techniques, thereby broadening the horizons of experimental design possibilities. In this dissertation, the aim is to develop a set of novel methods to bridge the merits between these two domains: 1) applying ideas from statistical experimental design to enhance data efficiency in machine learning, and 2) leveraging powerful deep neural networks to optimize experimental design strategies. This dissertation consists of 5 chapters. Chapter 1 provides a general introduction to mutual information, fractional factorial design, hyper-parameter tuning, multi-modality, etc. In Chapter 2, I propose a new mutual information estimator FLO by integrating techniques from variational inference (VAE), contrastive learning, and convex optimization. I apply FLO to broad data science applications, such as efficient data collection, transfer learning, fair learning, etc. 
Chapter 3 introduces a new design strategy called multi-layer sliced design (MLSD) with the application of AI assurance. It focuses on exploring the effects of hyper-parameters under different models and optimization strategies. Chapter 4 investigates classic vision challenges via multimodal large language models by implicitly optimizing mutual information and thoroughly exploring training strategies. Chapter 5 concludes this proposal and discusses several future research topics. / Doctor of Philosophy / In the digital age, artificial intelligence (AI) is reshaping our interactions with technology through advanced machine learning models. These models are complex, often opaque mechanisms that present challenges in understanding their inner workings. This complexity necessitates numerous experiments with different settings to optimize performance, which can be costly. Consequently, it is crucial to strategically evaluate the effects of various strategies on task performance using a limited number of trials. The Design of Experiments (DoE) offers invaluable techniques for investigating and understanding these complex systems efficiently. Moreover, integrating machine learning models can further enhance the DoE. Traditionally, experimental designs pre-specify a model and focus on finding the best strategies for experimentation. This assumption can restrict the adaptability and applicability of experimental designs. However, the inherent flexibility of machine learning models can enhance the capabilities of DoE, unlocking new possibilities for efficiently optimizing experimental strategies through an information-centric approach. Moreover, the information-based method can also be beneficial in other AI applications, including self-supervised learning, fair learning, transfer learning, etc. The research presented in this dissertation aims to bridge machine learning and experimental design, offering new insights and methodologies that benefit both AI techniques and DoE. 
Mutual Information Sliced Design Bayesian Optimal Design Induced Lasso Few-shot Learning Variational Inference Contrastive Learning
10	A comparative evaluation of machine learning models for engagement classification during presentations : A comparison of distance- and non-distance-based machine learning models for presentation classification and class likelihood estimation / En jämförande utvärdering av maskininlärningsmodeller för engagemangsklassificering under presentationer : En jämförelse av distans- och icke-distansbaserade maskininlärningsmodeller för presentationsklassificering och klasssannolikhetsuppskattning Ali Omer Bajallan, Rebwar January 2022 (has links) In recent years, there has been a significant increase in the usage of audience engagement platforms, which have allowed for engaging interactions between presenters and their audiences. The increased popularity of the platforms comes from the fact that engaging and interactive presentations have been shown to improve learning outcomes and create positive presentation experiences. However, using the platforms does not guarantee that your audience is engaged and participating. Given that the added value of engaging presentations only applies if the audience is actually engaged, it increases the need to know if and how engaged your audience is. The usage of audience engagement platforms has allowed for new ways of engagement to be studied. By utilizing the data gathered from the interactive presentation sessions, engagement can be studied and quantified through the modeling of the data. As the usage of audience engagement platforms and the study of presentation engagement is relatively new, there exists a limited amount of labeled data quantifying the level of engagement during presentations. To model the data, machine learning models should therefore be trained to generalize by being exposed to a limited number of presentation samples. This technique of training machine learning models is also referred to as few-shot learning. 
Distance-based machine learning models are defined in this study as models that make classifications and inferences by calculating distances between observations or observation class representations. Distance-based models have previously shown relatively good performance in few-shot learning applications, and interest therefore lies in expanding their application areas. This study presents a comparative evaluation of distance- and non-distance-based machine learning models on the problem of classifying presentations as engaged or non-engaged, and estimating presentation class likelihoods in a few-shot learning context. A presentation-level dataset was gathered from the interactive presentation sessions, and each presentation observation was labeled as engaged or non-engaged. The machine learning models were then trained to model the data and evaluated in terms of how well they were able to generalize to unseen testing samples after being exposed to a limited number of training observations. In particular, their classification and class likelihood estimation performances were evaluated. The results show that the distance-based models outperformed the non-distance-based models (artificial neural network and relevance-vector machine) on the presentation class likelihood estimation problem. The metric learning nearest neighbor classifier was the only distance-based model that outperformed all the non-distance-based models on both the presentation classification and class likelihood estimation problems. / In recent years, there has been a significant increase in the use of audience engagement platforms, which has enabled engaging interactions between presenters and their audiences. The platforms' increased popularity stems from the fact that engaging and interactive presentations have been shown to improve learning outcomes and create positive presentation experiences. 
However, using the platforms does not guarantee that your audience is engaged and participating. Given that the added value of engaging presentations only applies if the audience is actually engaged, the need to know whether and how engaged your audience is increases. The use of audience engagement platforms has made it possible to study engagement in new ways. By using the data gathered from the interactive presentation sessions, engagement can be studied and quantified through modeling of the data. Because the use of audience engagement platforms and the study of presentation engagement are relatively new, there is a limited amount of labeled data quantifying the level of engagement during presentations. To model the data, machine learning models should therefore be trained to generalize by being exposed to a limited number of presentation observations. This technique for training learning models is also called few-shot learning. Distance-based machine learning models are defined in this study as models that make classifications by calculating distances between observations or observation class representations. Distance-based models have previously shown relatively good results on few-shot learning problems, and interest therefore lies in expanding their application areas. This study presents a comparative evaluation of distance- and non-distance-based machine learning models on the problem of classifying presentations as engaged or non-engaged, and estimating presentation class likelihoods in a few-shot learning context. A presentation-level dataset was gathered from the interactive presentation sessions, and each presentation was labeled as engaged or non-engaged. The machine learning models were then trained to model the data and evaluated in terms of how well they could generalize to unseen test observations when exposed to a limited number of training observations. 
In particular, their classification and class likelihood estimation performances were evaluated. The results showed that all distance-based models outperformed the non-distance-based models (artificial neural network and relevance-vector machine) on the class likelihood estimation problem. The distance-based metric learning nearest neighbor classifier was the only distance-based model that outperformed all non-distance-based models on both the presentation classification and class likelihood estimation problems. Audience engagement Engagement quantification Machine learning Distance-based models Few-shot learning Statistical analysis Publiksengagemang Engagemangskvantifiering Maskininlärning Distansbaserade modeller Few-shot inlärning Statistisk analys Computer and Information Sciences Data- och informationsvetenskap
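The abstract's definition of a distance-based model, one that classifies by computing distances between an observation and class representations, can be sketched as a nearest-class-mean classifier. This is an illustrative example only, not one of the models evaluated in the thesis; the feature vectors, the function names, and the softmax-over-negative-distances likelihood are all assumptions made for the sketch.

```python
import math


def class_prototypes(support):
    """Average the few labeled vectors of each class into one prototype.

    `support` maps a class label to a list of equal-length feature vectors.
    """
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [sum(v[k] for v in vectors) / len(vectors)
                             for k in range(dim)]
    return prototypes


def classify(x, prototypes):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(x, prototypes[label]))


def class_likelihoods(x, prototypes):
    """Assumed scoring rule: softmax over negative distances to prototypes."""
    scores = {label: -math.dist(x, p) for label, p in prototypes.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Hypothetical two-feature presentation data with two labeled examples per class.
support = {
    "engaged": [[1.0, 1.0], [3.0, 1.0]],
    "non-engaged": [[-1.0, -1.0], [-3.0, -1.0]],
}
protos = class_prototypes(support)
print(classify([1.5, 0.5], protos))
print(class_likelihoods([1.5, 0.5], protos))
```

Because each class is summarized by a single averaged vector, the classifier needs only a handful of labeled observations per class, which is one reason distance-based approaches are attractive in few-shot settings.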
