231 |
A Mixed Methods Study of Ranger Attrition: Examining the Relationship of Candidate Attitudes, Attributions and Goals
Coombs, Aaron Keith 01 May 2023 (has links)
Elite military selection programs like the 75th Ranger Regiment's Ranger Assessment and Selection Program (RASP) are known for their difficulty and high attrition rates, despite substantial candidate screening just to get into such programs. The current study analyzes Ranger candidates' attitudes, attributions, and goals (AAGs) and their relationship with successful completion of pre-RASP, a preparation phase for the demanding eight-week RASP program. Candidates' entry and exit surveys were analyzed using natural language processing (NLP), as well as more traditional statistical analyses of Likert-measured survey items, to determine which reasons for joining and which individual goals were related to candidate success. Candidates' Intrinsic Motivations and Satisfaction as measured on entry surveys were the strongest predictors of success. Specifically, candidates' desire to deploy or serve in combat, and the goal of earning credibility in the Rangers, were the most important reasons and goals provided through candidates' open-text responses. Additionally, between-groups analyses comparing Black, Hispanic, and White candidates showed that differences in candidate abilities and motivations better explain pre-RASP attrition than demographic proxies such as race or ethnicity. The study's use of NLP demonstrates the practical utility of applying machine learning to quantitatively analyze open-text responses that have traditionally been limited to qualitative analysis or subject to human coding, although predictive models using more traditional Likert measurement of AAGs had better predictive accuracy. / Doctor of Philosophy / Elite military selection programs like the 75th Ranger Regiment's Ranger Assessment and Selection Program (RASP) are known for their difficulty and high attrition rates, despite substantial candidate screening just to get into such programs. The current study analyzes Ranger candidates' attitudes and goals and their relationship with successful completion of pre-RASP, a preparation phase for the demanding eight-week RASP program. Candidates' entry and exit surveys were analyzed to better understand the relationship between candidates' reasons for volunteering and their goals in the organization. Candidates' Intrinsic Motivations and their Satisfaction upon arrival for pre-RASP best predicted candidate success. Specifically, candidates' desires to deploy or serve in combat, and the goal of earning credibility in the Rangers, were the most important reasons and goals provided through candidates' open-text responses. Additionally, between-groups analyses comparing Black, Hispanic, and White candidates showed that differences in candidate abilities and motivations better explain pre-RASP attrition than demographic proxies such as race or ethnicity.
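For context, the sketch below illustrates the general kind of NLP pipeline the abstract describes: scoring open-text survey responses with bag-of-words features and relating them to a completion outcome. The responses, labels, and feature choices are invented for illustration and are not the study's actual data or models.

```python
# A minimal, illustrative sketch (not the study's actual pipeline): score free-text
# "reasons for joining" responses with TF-IDF features and a logistic regression
# that predicts program completion. All data here are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical open-text responses and completion labels (1 = completed pre-RASP).
responses = [
    "I want to deploy and serve in combat with the best unit",
    "Looking for a bigger paycheck and bonuses",
    "Earn my tab and credibility in the Rangers",
    "My recruiter suggested it",
]
completed = [1, 0, 1, 0]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),  # unigram + bigram features
    LogisticRegression(max_iter=1000),
)
scores = cross_val_score(model, responses, completed, cv=2)  # tiny toy cross-validation
print("toy cross-validation accuracy:", scores.mean())
```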
|
232 |
Mutual Learning Algorithms in Machine Learning
Chowdhury, Sabrina Tarin 05 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / A mutual learning algorithm is a machine learning approach in which multiple learning agents train on different sources and then share their knowledge among themselves, so that all the agents can improve their classification and prediction accuracies simultaneously. Mutual learning can be an efficient mechanism for improving machine learning and neural network performance in a multi-agent system. Usually, in knowledge distillation algorithms, a big network plays the role of a static teacher and passes its knowledge to smaller networks, known as student networks, to improve the efficiency of the latter. In this thesis, it is shown that two small networks can dynamically and interchangeably play the roles of teacher and student to share their knowledge, and hence the efficiency of both networks improves simultaneously. This type of dynamic learning mechanism can be very useful in mobile environments, where resource constraints limit training with big datasets. Data exchange in a multi-agent, teacher-student network system can lead to efficient learning. The concept and the proposed mutual learning algorithm are demonstrated using convolutional neural networks (CNNs) and support vector machines (SVMs) on the pattern recognition problem posed by the MNIST handwriting dataset.
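A rough sketch of the dynamic teacher/student idea described above, in the spirit of deep mutual learning: two small networks train together, and each additionally matches the other's predictions through a KL-divergence term. The architectures (tiny MLPs standing in for the thesis's CNN and SVM agents), loss weighting, and data are assumptions for illustration, not the thesis's configuration.

```python
# Illustrative sketch of mutual learning between two small networks (assumed setup).
# Each network serves as "teacher" for the other by providing soft targets via KL divergence.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_net():
    # Tiny MLP stand-in for a small image classifier, for brevity.
    return nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))

net_a, net_b = make_net(), make_net()
opt_a = torch.optim.Adam(net_a.parameters(), lr=1e-3)
opt_b = torch.optim.Adam(net_b.parameters(), lr=1e-3)

def mutual_step(x, y, alpha=0.5):
    """One training step in which each network learns from labels and from its peer."""
    logits_a, logits_b = net_a(x), net_b(x)
    ce_a = F.cross_entropy(logits_a, y)          # supervised losses
    ce_b = F.cross_entropy(logits_b, y)
    # Peer (mutual) losses: each network mimics the other's current predictions.
    kl_a = F.kl_div(F.log_softmax(logits_a, dim=1),
                    F.softmax(logits_b.detach(), dim=1), reduction="batchmean")
    kl_b = F.kl_div(F.log_softmax(logits_b, dim=1),
                    F.softmax(logits_a.detach(), dim=1), reduction="batchmean")
    loss_a, loss_b = ce_a + alpha * kl_a, ce_b + alpha * kl_b
    opt_a.zero_grad(); loss_a.backward(); opt_a.step()
    opt_b.zero_grad(); loss_b.backward(); opt_b.step()
    return loss_a.item(), loss_b.item()

# Toy batch standing in for MNIST images (28x28) and labels.
x = torch.randn(32, 1, 28, 28)
y = torch.randint(0, 10, (32,))
print(mutual_step(x, y))
```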
Machine learning is also applied in the field of natural language processing (NLP). Machines with a basic understanding of human language are becoming increasingly common in day-to-day life; therefore, NLP-enabled machines with memory-efficient training can potentially become an indispensable part of our lives in the near future. A classic problem in NLP is news classification, where news articles are assigned to news categories by machine learning algorithms. In this thesis, we show news classification implemented using the Naïve Bayes and support vector machine (SVM) algorithms. We then show that two small networks can dynamically play the changing roles of teacher and student to share their knowledge on news classification, and hence the efficiency of both networks improves simultaneously. The mutual learning algorithm is applied between homogeneous agents first, i.e., between two Naïve Bayes agents and between two SVM agents. Mutual learning is then demonstrated between heterogeneous agents, i.e., between one Naïve Bayes agent and one SVM agent, and the relative efficiency increase of the agents before and after mutual learning is discussed. / 2025-04-04
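As a point of reference, here is a small sketch of the baseline news-classification setup named above, using Naïve Bayes and a linear SVM on TF-IDF features. The tiny corpus and category labels are invented; the thesis's dataset and the mutual-learning exchange between the agents are not reproduced here.

```python
# Simplified sketch of the news-classification baselines (Naive Bayes and SVM)
# on TF-IDF features. The corpus and labels below are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

articles = [
    "The central bank raised interest rates again this quarter",
    "The striker scored twice in the championship final",
    "New smartphone chips promise faster on-device AI",
    "Parliament passed the new budget after a long debate",
]
labels = ["business", "sports", "tech", "politics"]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(articles)

nb = MultinomialNB().fit(X, labels)     # Naive Bayes classifier
svm = LinearSVC().fit(X, labels)        # linear SVM classifier

test = vectorizer.transform(["The goalkeeper saved a penalty in extra time"])
print("Naive Bayes:", nb.predict(test)[0], "| SVM:", svm.predict(test)[0])
```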
|
233 |
Evaluation of Automatic Text Summarization Using Synthetic Facts
Ahn, Jaewook 01 June 2022 (has links) (PDF)
Automatic text summarization has achieved remarkable success with the development of deep neural networks and the availability of standardized benchmark datasets, and it can generate fluent, human-like summaries. However, the unreliability of existing evaluation metrics hinders its practical usage and slows down its progress. To address this issue, we propose an automatic, reference-less text summarization evaluation system based on dynamically generated synthetic facts. We hypothesize that if a system can verify that a summary contains all the facts that are 100% known in the synthetic document, it can provide natural interpretability and high feasibility in measuring factual consistency and comprehensiveness. To our knowledge, ours is the first system that measures the overarching quality of text summarization models in terms of factual consistency, comprehensiveness, and compression rate. We validate our system by comparing its correlation with human judgment against that of existing N-gram overlap-based metrics such as ROUGE and BLEU and a BERT-based evaluation metric, BERTScore. In our experimental evaluation of PEGASUS, BART, and T5, our system outperforms the current evaluation metrics in measuring factual consistency by a noticeable margin and shows statistically significant gains in measuring comprehensiveness and overall summary quality.
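To make the idea concrete, here is a heavily simplified sketch of reference-less evaluation against a fully known set of synthetic facts: comprehensiveness as the fraction of planted facts recoverable from the summary, factual consistency as the fraction of summary statements supported by known facts, plus a compression rate. Exact string matching is a stand-in for the paper's actual fact-verification machinery.

```python
# Toy sketch of evaluating a summary against a synthetic document whose facts are
# 100% known in advance. String containment stands in for real fact matching.
from dataclasses import dataclass

@dataclass
class SyntheticDoc:
    text: str
    facts: list  # every fact planted in the document is known in advance

def evaluate(summary: str, doc: SyntheticDoc) -> dict:
    covered = [f for f in doc.facts if f.lower() in summary.lower()]
    comprehensiveness = len(covered) / len(doc.facts)
    # With fully known facts, any summary sentence not supported by them is suspect.
    sentences = [s.strip() for s in summary.split(".") if s.strip()]
    consistent = [s for s in sentences if any(f.lower() in s.lower() for f in doc.facts)]
    factual_consistency = len(consistent) / max(len(sentences), 1)
    compression_rate = len(summary.split()) / len(doc.text.split())
    return {"comprehensiveness": comprehensiveness,
            "factual_consistency": factual_consistency,
            "compression_rate": compression_rate}

doc = SyntheticDoc(
    text="Acme Corp was founded in 1990. Acme Corp is based in Oslo. Acme Corp makes robots.",
    facts=["founded in 1990", "based in Oslo", "makes robots"],
)
print(evaluate("Acme Corp was founded in 1990 and makes robots.", doc))
```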
|
234 |
Improving Vulnerability Description Using Natural Language Generation
Althebeiti, Hattan 01 January 2023 (links) (PDF)
Software plays an integral role in powering numerous everyday computing gadgets. As our reliance on software continues to grow, so does the prevalence of software vulnerabilities, with significant implications for organizations and users. As such, documenting vulnerabilities and tracking their development becomes crucial. Vulnerability databases address this issue by storing a record with various attributes for each discovered vulnerability. However, their contents suffer from several drawbacks, which we address in our work. In this dissertation, we investigate the weaknesses associated with vulnerability descriptions in public repositories and alleviate such weaknesses through Natural Language Processing (NLP) approaches. The first contribution examines vulnerability descriptions in those databases and approaches to improve them. We propose a new automated method that leverages external sources to enrich the scope and context of a vulnerability description. Moreover, we exploit fine-tuned pretrained language models for normalizing the resulting description. The second contribution investigates the need for a uniform and normalized structure in vulnerability descriptions. We address this need by breaking the description of a vulnerability into multiple constituents and developing a multi-task model that creates a new uniform and normalized summary, maintaining the necessary attributes of the vulnerability using the extracted features while ensuring a consistent vulnerability description. Our method proved effective in generating new summaries with the same structure across a collection of various vulnerability descriptions and types. Our final contribution investigates the feasibility of assigning the Common Weakness Enumeration (CWE) attribute to a vulnerability based on its description. CWE offers a comprehensive framework that categorizes similar exposures into classes, representing the types of exploitation associated with such vulnerabilities. Our approach, utilizing pre-trained language models, is shown to outperform a Large Language Model (LLM) on this task. Overall, this dissertation provides various technical approaches exploiting advances in NLP to improve publicly available vulnerability databases.
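A condensed sketch of the kind of CWE assignment the final contribution describes: fine-tuning a pre-trained encoder to classify a vulnerability description into a CWE class. The model name, label set, example descriptions, and training loop are illustrative assumptions rather than the dissertation's exact configuration.

```python
# Illustrative sketch (assumed setup): fine-tune a pre-trained encoder to map a
# vulnerability description to a CWE class. Labels and hyperparameters are made up.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

cwe_labels = ["CWE-79 (XSS)", "CWE-89 (SQL injection)", "CWE-125 (Out-of-bounds read)"]
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(cwe_labels))

descriptions = [
    "Improper neutralization of user input allows script injection in the web UI.",
    "Unsanitized parameters in the login form permit arbitrary SQL queries.",
    "A crafted packet causes a read past the end of the allocated buffer.",
]
labels = torch.tensor([0, 1, 2])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # tiny demo loop; real fine-tuning uses many batches and epochs
    enc = tokenizer(descriptions, padding=True, truncation=True, return_tensors="pt")
    out = model(**enc, labels=labels)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.eval()
with torch.no_grad():
    test = tokenizer(["User-controlled HTML is echoed back without escaping."],
                     return_tensors="pt")
    pred = model(**test).logits.argmax(dim=-1).item()
print("predicted class:", cwe_labels[pred])
```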
|
235 |
Disambiguating natural language via aligning meaningful descriptions
Xin, Yida 07 February 2024 (has links)
Artificial Intelligence (AI) technologies are increasingly pervading aspects of our lives. Because people use natural language to communicate with each other, computers should also use natural language to communicate with us. One of the principal obstacles to achieving this is the ambiguity of natural language, evidenced in problems such as prepositional phrase attachment and pronoun coreference. Current methods rely on the statistical frequency of word patterns, but this is often brittle and opaque to people.
In this thesis, I explore the idea of using commonsense knowledge to resolve linguistic ambiguities. I introduce PatchComm, which invokes explicit commonsense assertions to solve context-independent ambiguities. When commonsense assertions are missing, I invoke RetroGAN-DRD, which leverages state-of-the-art inference techniques such as retrofitting and generative adversarial networks (GANs) to infer commonsense assertions. I build upon that with ProGeneXP, which brings state-of-the-art language models to the task of describing its inputs and implicit knowledge in natural language, providing meaningful descriptions for PatchComm to align with in order to further resolve linguistic ambiguities. Finally, I introduce DialComm to lay the groundwork for moving from single-sentence disambiguation to discourse. Specifically, DialComm builds upon PatchComm to obtain information from single sentences and integrates that information with additional commonsense assertions to build integral frame representations for discourses. I illustrate DialComm's ability with an application to end-user programming in natural language.
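To ground the idea, the toy sketch below shows how explicit commonsense assertions might be aligned against the alternatives of a classic prepositional-phrase attachment ambiguity. The assertion store, relations, and scoring function are hypothetical stand-ins invented for illustration, not PatchComm's actual interface or knowledge base.

```python
# Hypothetical sketch of commonsense-guided PP-attachment disambiguation for
# "I ate the pasta with a fork" vs. "I ate the pasta with sauce". The assertion
# store and scoring are invented stand-ins for a real commonsense knowledge base.
COMMONSENSE = {
    ("fork", "UsedFor", "eat"): 0.9,     # instruments tend to attach to the verb
    ("sauce", "PartOf", "pasta"): 0.8,   # ingredients tend to attach to the noun
}

def assertion_score(head: str, relation: str, dependent: str) -> float:
    return COMMONSENSE.get((dependent, relation, head), 0.0)

def attach_pp(verb: str, obj: str, pp_noun: str) -> str:
    """Pick the attachment whose supporting commonsense assertion scores higher."""
    verb_attach = assertion_score(verb, "UsedFor", pp_noun)   # e.g. fork UsedFor eat
    noun_attach = assertion_score(obj, "PartOf", pp_noun)     # e.g. sauce PartOf pasta
    return "verb" if verb_attach >= noun_attach else "noun"

print(attach_pp("eat", "pasta", "fork"))   # -> verb  (ate ... with a fork)
print(attach_pp("eat", "pasta", "sauce"))  # -> noun  (pasta with sauce)
```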
The contributions of this dissertation lie in showing how commonsense inference can be integrated with parsing to resolve ambiguities in natural language in a transparent manner. I have implemented three candidate systems with increasingly sophisticated approaches. I verified that they perform well on some standard tests and that they operate in a way that is understandable to people. This counters the supposed inevitability of an interpretability-performance tradeoff. I have shown how my techniques can be used in a candidate application, programming in natural language.
My work leaves us in a good position to exploit further advances in natural language understanding and commonsense inference. I am optimistic that natural, transparent communication with computers will help make the world a better place.
|
236 |
Incorporating semantic and syntactic information into document representation for document clustering
Wang, Yong 06 August 2005 (has links)
Document clustering is a widely used strategy for information retrieval and text data mining. In traditional document clustering systems, documents are represented as a bag of independent words. In this project, we propose to enrich the representation of a document by incorporating semantic and syntactic information. Semantic analysis and syntactic analysis are performed on the raw text to identify this information. A detailed survey of current research in natural language processing, syntactic analysis, and semantic analysis is provided. Our experimental results demonstrate that incorporating semantic and syntactic information can improve the performance of our document clustering system for most of our data sets, and a statistically significant improvement can be achieved when both are combined. Our experimental results with compound words show that using only compound words does not improve clustering performance on our data sets. When compound words are combined with the original single words, the combined feature set achieves slightly better performance for most data sets, but this improvement is not statistically significant. In order to select the best clustering algorithm for our document clustering system, a comparison of several widely used clustering algorithms is performed. Although the bisecting K-means method has advantages when working with large datasets, a traditional hierarchical clustering algorithm still achieves the best performance for our small datasets.
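For illustration, a small sketch comparing the two clustering families mentioned above, bisecting K-means and hierarchical (agglomerative) clustering, on plain bag-of-words document vectors. The toy corpus is invented, and this sketch omits the semantic and syntactic enrichment the project actually studies.

```python
# Toy sketch comparing bisecting K-means with hierarchical (agglomerative)
# clustering on TF-IDF document vectors. The corpus is invented; the project's
# semantic and syntactic enrichment of the representation is not reproduced.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import AgglomerativeClustering, BisectingKMeans  # BisectingKMeans needs scikit-learn >= 1.1

docs = [
    "the stock market rallied after strong earnings reports",
    "investors worry about inflation and interest rates",
    "the team won the cup after a dramatic penalty shootout",
    "the striker signed a new contract with the club",
]

X = TfidfVectorizer().fit_transform(docs).toarray()

bisecting = BisectingKMeans(n_clusters=2, random_state=0).fit_predict(X)
hierarchical = AgglomerativeClustering(n_clusters=2).fit_predict(X)

print("bisecting k-means :", bisecting)
print("hierarchical      :", hierarchical)
```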
|
237 |
Introducing Semantic Role Labels and Enhancing Dependency Parsing to Compute Politeness in Natural Language
Dua, Smrite 13 August 2015 (has links)
No description available.
|
238 |
Improving NLP Systems Using Unconventional, Freely-Available Data
Huang, Fei January 2013 (has links)
Sentence labeling is a type of pattern recognition task that involves assigning a categorical label to each word in an observed sentence. Standard supervised sentence-labeling systems often have poor generalization: it is difficult to estimate parameters for words which appear in the test set but seldom (or never) appear in the training set, because such systems use only words as features in their prediction tasks. Representation learning is a promising technique for discovering features that allow a supervised classifier to generalize from a source domain dataset to arbitrary new domains. We demonstrate that features learned from distributional representations of unlabeled data can be used to improve performance on out-of-vocabulary words and help the model generalize. We also argue that it is important for a representation learner to be able to incorporate expert knowledge during its search for helpful features. We investigate techniques for building open-domain sentence labeling systems that approach the ideal of a system whose accuracy is high and consistent across domains. In particular, we investigate unsupervised techniques for language model representation learning that provide new features which are stable across domains, in that they are predictive in both the training and out-of-domain test data. In experiments, our best system with the proposed techniques reduces error by as much as 11.4% relative to the previous system using traditional representations on the Part-of-Speech tagging task. Moreover, we leverage the Posterior Regularization framework and develop an architecture for incorporating biases from prior knowledge into representation learning. We investigate three types of biases: entropy bias, distance bias, and predictive bias. Experiments on two domain adaptation tasks show that our biased learners identify significantly better sets of features than unbiased learners, resulting in a relative reduction in error of more than 16% for both tasks with respect to existing state-of-the-art representation learning techniques. We also extend the idea of using additional unlabeled data to improve the system's performance to a different NLP task, word alignment. Traditional word alignment takes only a sentence-level aligned parallel corpus as input and generates word-level alignments. However, with the increasing integration of different cultures, more and more people are competent in multiple languages and often use elements of multiple languages in conversation. Linguistic Code Switching (LCS) is such a situation, where two or more languages show up in the context of a single conversation. Traditional machine translation (MT) systems treat LCS data as noise, or just as regular sentences. However, if LCS data is processed intelligently, it can provide a useful signal for training word alignment and MT models. In this work, we first extract constraints from this code-switching data and then incorporate them into a word alignment model training procedure. We also show that by using the code-switching data, we can jointly train a word alignment model and a language model using co-training. Our techniques for incorporating LCS data improve BLEU score by 2.64 over a baseline MT system trained using only standard sentence-aligned corpora. / Computer and Information Science
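As a rough illustration of the representation-learning idea, the sketch below derives word representations from unlabeled text, clusters them, and uses the cluster identity as a feature alongside the word itself, so that an out-of-vocabulary word can still fire a useful shared feature. Word2vec vectors clustered with k-means stand in for the thesis's language-model representations; the corpus, settings, and tagger are simplified assumptions.

```python
# Simplified sketch: learn distributional word representations from unlabeled text,
# cluster them, and add cluster IDs as tagger features so that out-of-vocabulary
# words still share features with training words. Not the thesis's exact method.
from gensim.models import Word2Vec
from sklearn.cluster import KMeans
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

unlabeled = [
    "the doctor examined the patient".split(),
    "the nurse examined the chart".split(),
    "a surgeon examined an x-ray".split(),
]
w2v = Word2Vec(unlabeled, vector_size=25, min_count=1, seed=0)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(
    [w2v.wv[w] for w in w2v.wv.index_to_key])

def cluster_of(word):
    # Unknown words fall back to a dedicated "UNK" cluster feature.
    if word in w2v.wv:
        return int(kmeans.predict([w2v.wv[word]])[0])
    return -1

def features(word):
    return {"word=" + word: 1, "cluster=%d" % cluster_of(word): 1}

# Tiny labeled set (noun vs. determiner), just to exercise the feature pipeline.
train_words  = ["the", "doctor", "a", "nurse"]
train_labels = ["DET", "NOUN", "DET", "NOUN"]
tagger = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
tagger.fit([features(w) for w in train_words], train_labels)

# "surgeon" never appears in the labeled data; its prediction leans on the cluster
# feature (on real corpora, distributionally similar words tend to share clusters).
print(tagger.predict([features("surgeon")]))
```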
|
239 |
Knowledge intensive natural language generation with revision
Cline, Ben E. 09 September 2008 (has links)
Traditional natural language generation systems use a pipelined architecture. Two problems with this architecture are poor task decomposition and the lack of interaction between conceptual and stylistic decision making. A revision architecture operating in a knowledge intensive environment is proposed as a means to deal with these two problems. In a revision system, text is produced and refined iteratively. A text production cycle consists of two steps. First, the text generators produce initial text. Second, this text is examined for defects by revisors. When defects are found, the revisors make suggestions for the regeneration of the text. The text generator/revision cycle continues to polish the text iteratively until no more defects can be found. Although previous research has focused on stylistic revisions only, this paper describes techniques for both stylistic and conceptual revisions.
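A schematic sketch of the generate-and-revise cycle described above: generators produce a draft, revisors look for defects and suggest regeneration, and the cycle repeats until no defects remain. The generator, revisor, and defect type here are invented placeholders, not Kalos's actual components.

```python
# Schematic sketch of a revision-based generation loop (hypothetical components,
# not Kalos itself): generate a draft, let revisors flag defects and suggest fixes,
# then regenerate until no revisor finds a defect.
def generate(knowledge, suggestions):
    # Stand-in generator: start verbose, then apply any revisor suggestions.
    text = f"The {knowledge['device']} device, which is a device, runs at {knowledge['clock']}."
    for fix in suggestions:
        text = fix(text)
    return text

def redundancy_revisor(text):
    """Stylistic revisor: flag the repeated word 'device' and suggest a rewrite."""
    if text.count("device") > 1:
        return lambda t: t.replace(", which is a device,", "")
    return None

def revise_until_clean(knowledge, revisors, max_drafts=5):
    suggestions = []
    for draft_no in range(1, max_drafts + 1):
        text = generate(knowledge, suggestions)
        new = [s for s in (r(text) for r in revisors) if s is not None]
        if not new:                       # no defects found: the draft is final
            return draft_no, text
        suggestions.extend(new)           # feed suggestions into the next draft
    return max_drafts, text

print(revise_until_clean({"device": "MC68000", "clock": "8 MHz"}, [redundancy_revisor]))
```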
Using revision to produce extended natural language text through a series of drafts provides three significant advantages over a traditional natural language generation system. First, it reduces complexity through task decomposition. Second, it promotes text polishing techniques that benefit from the ability to examine generated text in the context of the underlying knowledge from which it was generated. Third, it provides a mechanism for the integrated handling of conceptual and stylistic decisions.
For revision to operate intelligently and efficiently, the revision component must have access to both the surface text and the underlying knowledge from which it was generated. A knowledge intensive architecture with a uniform knowledge base allows the revision software to quickly locate referents, choices made in producing the defective text, alternatives to the decisions made at both the conceptual and stylistic levels, and the intent of the text. The revisors use this knowledge, along with facts about the topic at hand and knowledge about how text is produced, to select alternatives for improving the text.
The Kalos system was implemented to illustrate revision processing in a natural language generation system. It produces advanced draft quality text for a microprocessor users' guide from a knowledge base describing the microprocessor. It uses revision techniques in a knowledge intensive environment to iteratively polish its initial generation. The system performs both conceptual and stylistic revisions. Example output from the system, showing both types of revision, is presented and discussed. Techniques for dealing with the computational problems caused by the system's uniform knowledge base are described. / Ph. D.
|
240 |
Summarizing Legal Depositions
Chakravarty, Saurabh 18 January 2021 (has links)
Documents like legal depositions are used by lawyers and paralegals to ascertain the facts pertaining to a case. These documents capture the conversation between a lawyer and a deponent, which is in the form of questions and answers. Applying current automatic summarization methods to these documents results in low-quality summaries. Though extensive research has been performed in the area of summarization, not all methods succeed in all domains. Accordingly, this research focuses on developing methods to generate high-quality summaries of depositions. As part of our work related to legal deposition summarization, we propose a solution in the form of a pipeline of components, each addressing a sub-problem; we argue that a pipeline-based framework can be tuned to summarize documents from any domain.
First, we developed methods to parse the depositions, accounting for different document formats; we were able to successfully parse both a proprietary and a public dataset with our methods. Second, we developed methods to anonymize the personal information present in the deposition documents, achieving 95% accuracy on the anonymization using a random-sampling-based evaluation. Third, we developed an ontology to define dialog acts for the questions and answers present in legal depositions. Fourth, we developed classifiers based on this ontology and achieved F1-scores of 0.84 and 0.87 on the public and proprietary datasets, respectively. Fifth, we developed methods to transform a question-answer pair into a canonical/simple form; based on the dialog acts for the question and answer combination, we developed transformation methods using both traditional NLP and deep learning techniques, and achieved good scores on the ROUGE and semantic similarity metrics for most of the dialog act combinations. Sixth, we developed methods based on deep learning, heuristics, and machine translation to correct the transformed declarative sentences; the sentence correction improved the readability of the transformed sentences. Seventh, we developed a methodology to break a deposition into its topical aspects; an ontology for aspects was defined for legal depositions, and classifiers were developed that achieved an F1-score of 0.89. Eighth, we developed methods to segment the deposition into parts that share the same thematic context; the segments helped in augmenting candidate summary sentences with surrounding context, which leads to a more readable summary. Ninth, we developed a pipeline to integrate all of these methods and generate summaries from the depositions. We were able to outperform the baseline and state-of-the-art summarization methods in a majority of the cases based on the F1, Recall, and ROUGE-2 scores, and the performance gains were statistically significant for all of the scores. The summaries generated by our system can be arranged based on the same thematic context or aspect and hence should be much easier to read and follow, compared to the baseline methods.
As part of our future work, we will improve upon these methods. We will refine our methods to identify the important parts using additional documents related to a deposition. In addition, we will work to improve the compression ratio of the generated summaries by reducing the number of unimportant sentences. We will expand the training dataset to learn and tune the coverage of the aspects for various deponent types using empirical methods. Our system has demonstrated effectiveness in transforming a QA pair into a declarative sentence; having such a capability could enable us to generate a narrative summary from the depositions, a first for legal depositions. We will also expand our dataset for evaluation to ensure that our methods are indeed generalizable, and that they work well when experts subjectively evaluate the quality of the deposition summaries. / Doctor of Philosophy /
Documents in the legal domain are of various types. One set of documents includes trial and deposition transcripts. These documents capture the proceedings of a trial or a deposition by note-taking, often over many hours, and contain the conversational sentences spoken during the trial or deposition, involving multiple actors. One of the greatest challenges with these documents is that they are generally long, which is a source of pain for attorneys and paralegals who work with the information contained in them.
Text summarization techniques have been successfully used to compress a document and capture its salient parts, while reducing redundancy in summary sentences and focusing on coherence and proper sentence formation. Summarizing trial and deposition transcripts would be immensely useful for law professionals, reducing the time needed to identify and disseminate salient information in case-related documents, as well as reducing costs and trial preparation time. Processing deposition documents using traditional text processing techniques is a challenge because of their form. Having the deposition conversations transformed into a suitable declarative form, in which they can be easily comprehended, can pave the way for the use of extractive and abstractive summarization methods.
As part of our work, we identified the different discourse structures present in a deposition in the form of dialog acts, and developed methods based on those dialog acts to transform the deposition into a declarative form. We achieved an accuracy of 87% on the dialog act classification and were able to transform the conversational question-answer (QA) pairs into declarative forms for 10 of the top-11 dialog act combinations; our transformation methods performed better than the baselines for 8 out of the 10 QA pair types. We also developed methods to classify the deposition QA pairs according to their topical aspects, and generated summaries using aspects by defining the relative coverage of each aspect that should be present in a summary. Another set of methods can segment the depositions into parts that share the same thematic context. These segments aid in augmenting the candidate summary sentences, creating a summary where information is surrounded by associated context. This makes the summary more readable and informative; based on our evaluations, we were able to significantly outperform the state-of-the-art methods.
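To make the QA-to-declarative transformation step concrete, here is a toy, rule-based sketch that turns a deposition question-answer pair into a declarative sentence keyed on a (question act, answer act) combination. The dialog-act labels and rules are simplified stand-ins invented for illustration, not the ontology or trained models described above.

```python
# Toy rule-based sketch (not the dissertation's models): transform a deposition
# question-answer pair into a declarative sentence based on its dialog acts.
import re

def classify_dialog_acts(question, answer):
    """Very rough stand-in for the trained dialog-act classifiers."""
    q_act = "yes_no" if re.match(r"(?i)\s*(did|do|were|was|is|are|have|has)\b", question) else "wh"
    a_act = "confirm" if answer.strip().lower().startswith(("yes", "correct")) else "inform"
    return q_act, a_act

def to_declarative(question, answer, deponent="The deponent"):
    q_act, a_act = classify_dialog_acts(question, answer)
    if q_act == "yes_no" and a_act == "confirm":
        # "Did you sign the contract?" + "Yes." -> "The deponent sign the contract."
        # (The verb is left uninflected by this toy rule; a later sentence-correction
        # step, as in the pipeline above, would repair it to "signed".)
        body = re.sub(r"(?i)^\s*(did|do|were|was|is|are|have|has)\s+you\s+", "", question)
        return f"{deponent} {body.rstrip('?').strip()}."
    if q_act == "wh":
        # Wh-questions keep both sides so no information is lost.
        return f"{deponent}, asked '{question.strip()}', stated: {answer.strip()}"
    return f"{deponent} responded: {answer.strip()}"

print(to_declarative("Did you sign the contract in March?", "Yes, I did."))
print(to_declarative("Where were you on the night of May 4th?", "At the office."))
```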
|