251. Deep learning for reading and understanding language. Kočiský, Tomáš. January 2017.
This thesis presents novel tasks and deep learning methods for machine reading comprehension and question answering, with the goal of achieving natural language understanding. First, we consider a semantic parsing task in which the model understands sentences and translates them into a logical form or instructions. We present a novel semi-supervised sequential autoencoder that treats language as a discrete sequential latent variable and semantic parses as the observations. This model allows us to leverage synthetically generated unpaired logical forms, thereby alleviating the lack of supervised training data. We show that the semi-supervised model outperforms a supervised model when trained with the additional generated data. Second, reading comprehension requires integrating information and reasoning about events, entities, and their relations across a full document. Question answering is conventionally used to assess reading comprehension ability, in both artificial agents and children learning to read. We propose a new, challenging, supervised reading comprehension task. We gather a large-scale dataset of news stories from the CNN and Daily Mail websites, with Cloze-style questions created from the story highlights. This dataset made it possible, for the first time, to train deep learning models for reading comprehension. We also introduce novel attention-based models for this task and present a qualitative analysis of the attention mechanism. Finally, following recent advances in reading comprehension in both models and task design, we propose a further task for understanding complex narratives, NarrativeQA, consisting of the full texts of books and movie scripts. We collect human-written questions and answers based on high-level plot summaries. This task is designed to encourage the development of models for language understanding: successfully answering its questions requires understanding the underlying narrative rather than relying on shallow pattern matching or salience. We show that although humans solve the tasks easily, standard reading comprehension models struggle on the tasks presented here.
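As an illustration of how such Cloze-style questions can be formed, the sketch below masks one entity in a highlight and anonymizes the rest, in the spirit of the CNN/Daily Mail setup; the entity list, marker format, and helper name are assumptions for exposition, not the dataset's released tooling.

```python
# Minimal sketch: turning a story highlight into a Cloze-style query by
# masking the answer entity and anonymizing the others. Entity spans are
# assumed to be given (e.g., from a named-entity tagger).

def make_cloze(highlight: str, entities: list[str], answer: str) -> tuple[str, dict]:
    """Anonymize entities and blank out the answer entity."""
    mapping = {ent: f"@entity{i}" for i, ent in enumerate(sorted(set(entities)))}
    query = highlight
    for ent, marker in mapping.items():
        query = query.replace(ent, marker)
    # The answer entity becomes the placeholder the model must fill in.
    query = query.replace(mapping[answer], "@placeholder")
    return query, mapping

query, mapping = make_cloze(
    "Tom Kocisky defends his thesis at Oxford",
    entities=["Tom Kocisky", "Oxford"],
    answer="Tom Kocisky",
)
print(query)  # "@placeholder defends his thesis at @entity0"
```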

252. Topic Segmentation and Medical Named Entities Recognition for Pictorially Visualizing Health Record Summary System. Ruan, Wei. 3 April 2019.
Medical Information Visualization makes optimized use of digitized medical record data, e.g. the Electronic Medical Record. This thesis extends the Pictorial Information Visualization System (PIVS) developed by Yongji Jin (Jin, 2016) and Jiaren Suo (Suo, 2017), a graphical visualization system that picturizes a patient's medical history summary, depicting the patient's medical information so that patients and doctors can easily grasp past and present conditions. Previously, the summary information had to be entered into the interface manually, drawn from clinical notes.
This study proposes a methodology for automatically extracting medical information from patients' clinical notes using Natural Language Processing techniques, in order to produce a medical history summarization from past medical records. We develop a Named Entity Recognition system to extract information about medical imaging procedures (performance date, human body location, imaging results, and so on) and medications (medication names, frequencies, and quantities) by applying a conditional random field model with three main groups of features: word-based, part-of-speech, and MetaMap semantic features. Adding MetaMap semantic features is a novel idea that raised accuracy compared to previous studies. Our evaluation shows that our model achieves higher accuracy than others on medication extraction as a case study.
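A minimal sketch of a CRF tagger in this spirit is shown below, using the third-party sklearn-crfsuite package; the feature names, toy tags, and mocked MetaMap semantic types are illustrative assumptions rather than the thesis's actual feature set.

```python
# Hedged sketch of a CRF medication tagger with word, POS, and semantic-type
# features. The MetaMap semantic type is assumed precomputed here; a real
# system would obtain it by calling MetaMap.
import sklearn_crfsuite

def token_features(sent, i):
    word, pos, semtype = sent[i]          # (token, POS tag, semantic type)
    feats = {
        "word.lower": word.lower(),       # word-based feature
        "word.isdigit": word.isdigit(),
        "pos": pos,                       # part-of-speech feature
        "semtype": semtype,               # MetaMap semantic feature
    }
    if i > 0:
        feats["-1:word.lower"] = sent[i - 1][0].lower()
    return feats

# Toy training data: one annotated sentence in (token, POS, semtype) form.
sent = [("ibuprofen", "NN", "orch"), ("200mg", "CD", "qnco"), ("daily", "RB", "tmco")]
labels = ["B-DRUG", "B-DOSE", "B-FREQ"]

X = [[token_features(sent, i) for i in range(len(sent))]]
y = [labels]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, y)
print(crf.predict(X))
```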
To further improve the accuracy of entity extraction, we also propose a Topic Segmentation methodology for clinical notes that detects boundaries from differences in the classification probabilities of adjacent subsequences, which differs from traditional Topic Segmentation approaches such as TextTiling, TopicTiling, and the Beeferman statistical model. When Topic Segmentation is combined with Named Entity Extraction, we observe higher accuracy for medication extraction than without segmentation.
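The sketch below illustrates the general idea of probability-difference boundary detection under simplifying assumptions (a generic classify function, an L1 gap, and a fixed threshold); it is not the thesis's exact algorithm.

```python
# Illustrative sketch: place a topic boundary between sentences where the
# topic-classification probabilities of the left and right windows differ
# most. `classify` is an assumed stand-in for any classifier that returns a
# probability distribution over topics.
import numpy as np

def boundaries(sentences, classify, window=3, threshold=0.5):
    cuts = []
    for i in range(window, len(sentences) - window):
        left = classify(" ".join(sentences[i - window:i]))
        right = classify(" ".join(sentences[i:i + window]))
        # L1 distance between the two topic distributions; a large gap
        # suggests the windows discuss different topics.
        gap = float(np.abs(np.asarray(left) - np.asarray(right)).sum())
        if gap > threshold:
            cuts.append(i)
    return cuts
```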
Finally, we present a prototype that integrates our information extraction system with PIVS by simply building a database of interface coordinates and human body part terms.

253. Pivot-based Statistical Machine Translation for Morphologically Rich Languages. Kholy, Ahmed El. January 2016.
This thesis describes research on pivot-based statistical machine translation (SMT) for morphologically rich languages (MRLs). We provide a framework for translating to and from morphologically rich languages, especially in the context of having little or no parallel corpora between the source and target languages. We address three main challenges: the sparsity of data resulting from morphological richness; maximizing the precision and recall of the pivoting process itself; and making use of any parallel data between the source and target languages. To address data sparsity, we explored a space of tokenization schemes and normalization options. We also examined a set of six detokenization techniques to evaluate detokenized and orthographically corrected (enriched) output, and we provide a recipe of the best settings for translating into one of the most challenging languages, namely Arabic. Our best model improves translation quality over the baseline by 1.3 BLEU points. We also investigated the idea of separating translation from morphology generation, comparing three methods of modeling morphological features: features can be modeled as part of the core translation, generated using target monolingual context, or predicted using both source and target information. In our experimental results, we outperform the vanilla factored translation model. Deciding which features to translate, generate, or predict requires a detailed error analysis of the system output, so we present AMEANA, an open-source tool for error analysis of natural language processing tasks targeting morphologically rich languages. The second challenge is the pivoting process itself. We discuss several techniques to improve the precision and recall of pivot matching. One technique for improving recall works at the level of word alignment, treating pivoting as an optimization process driven by generating phrase pairs between the source and target languages. Although improving the recall of pivot matching improves overall translation quality, we also need to increase its precision. To achieve this, we introduce quality constraint scores that assess the quality of the pivot phrase pairs between source and target languages. We show positive results for different language pairs, demonstrating the consistency of our approaches; one of our best models achieves an improvement of 1.2 BLEU points. The third challenge is how to make use of any parallel data between the source and target languages. We build on the approach of improving the precision of the pivoting process and on methods for combining the pivot system with a direct system built from the parallel data. In one approach, we introduce morphology constraint scores, added to the log-linear feature space, to assess the quality of pivot phrase pairs. We compare two methods of generating the morphology constraints: one is based on hand-crafted rules relying on our knowledge of the source and target languages, while in the other the constraints are induced from available parallel data between the source and target languages, which we also use to build a direct translation model.
We then combine the pivot and direct models to achieve better coverage and overall translation quality. Using induced morphology constraints outperformed the hand-crafted rules and improved over our best model from all previous approaches by 0.6 BLEU points (7.2 and 6.7 BLEU points over the direct and pivot baselines, respectively). Finally, we apply smart techniques for combining pivot and direct models, showing that selective combination can greatly reduce the size of the pivot model without hurting performance, and in some cases even improves it.
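For intuition, the following sketch shows the standard phrase-table triangulation that underlies pivoting: source-pivot and pivot-target phrase pairs are composed through matching pivot phrases, and their probabilities are multiplied. The toy phrases and scores are invented, and the thesis's quality and morphology constraint scores are omitted.

```python
# Rough sketch of phrase-table triangulation through a pivot language.
# Constraint scores from the thesis would enter as extra factors.
from collections import defaultdict

def triangulate(src_piv, piv_tgt):
    """src_piv, piv_tgt: dicts mapping (phrase, phrase) -> probability."""
    src_tgt = defaultdict(float)
    for (s, p), p_sp in src_piv.items():
        for (p2, t), p_pt in piv_tgt.items():
            if p == p2:                         # pivot phrases must match
                src_tgt[(s, t)] += p_sp * p_pt  # marginalize over pivots
    return dict(src_tgt)

src_piv = {("kitab", "book"): 0.8, ("kitab", "books"): 0.2}
piv_tgt = {("book", "livre"): 0.9, ("books", "livres"): 0.9}
print(triangulate(src_piv, piv_tgt))
# {('kitab', 'livre'): 0.72, ('kitab', 'livres'): 0.18}
```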

254. Grammar-Based Semantic Parsing Into Graph Representations. Bauer, Daniel. January 2017.
Directed graphs are an intuitive and versatile representation of natural language meaning because they can capture relationships between instances of events and entities, including cases where entities play multiple roles. Yet few approaches in natural language processing use graph manipulation techniques for semantic parsing. This dissertation studies graph-based representations of natural language meaning, discusses a formal-grammar-based approach to the semantic construction of graph representations, and develops methods for open-domain semantic parsing into such representations. To perform string-to-graph translation I use synchronous hyperedge replacement grammars (SHRG). The thesis studies this grammar formalism from formal, linguistic, and algorithmic perspectives. It proposes a new lexicalized variant of the formalism (LSHRG), inspired by tree insertion grammar, which provides a clean syntax/semantics interface. The thesis develops a new method for automatically extracting SHRG and LSHRG grammars from annotated "graph banks", using existing syntactic derivations to structure the extracted grammar. It also discusses a new method for semantic parsing with large, automatically extracted grammars that translates syntactic derivations into derivations of the synchronous grammar, as well as initial work on parse reranking and selection using a graph model. I evaluate this work on the Abstract Meaning Representation (AMR) dataset. The results indicate that the grammar-based approach holds promise as a technique for semantic parsing and that string-to-graph grammars can be induced efficiently. Taken together, the thesis lays the foundation for future work on graph methods in natural language semantics.
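As a flavor of the target representation, here is a minimal sketch of an AMR-style graph for "The boy wants to go", encoded as a plain Python dict; the encoding is an illustrative assumption, not the thesis's data structure.

```python
# Nodes are concept instances; labeled edges are semantic roles. Note that
# the boy instance "b" plays two roles (a re-entrancy), which trees cannot
# capture but directed graphs can.
amr = {
    "w": {"concept": "want-01", "edges": {"ARG0": "b", "ARG1": "g"}},
    "g": {"concept": "go-01",   "edges": {"ARG0": "b"}},   # same boy instance
    "b": {"concept": "boy",     "edges": {}},
}

def triples(graph):
    """Flatten the graph into (head, role, dependent) triples."""
    for node, data in graph.items():
        for role, target in data["edges"].items():
            yield (data["concept"], role, graph[target]["concept"])

print(list(triples(amr)))
# [('want-01', 'ARG0', 'boy'), ('want-01', 'ARG1', 'go-01'), ('go-01', 'ARG0', 'boy')]
```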

255. Exponential Family Embeddings. Rudolph, Maja. January 2018.
Word embeddings are a powerful approach for capturing semantic similarity among terms in a vocabulary. Exponential family embeddings extend the idea of word embeddings to other types of high-dimensional data. Exponential family embeddings have three ingredients: embeddings as latent variables, a predefined conditioning set for each observation called the context, and a conditional likelihood from the exponential family. The embeddings are inferred with a scalable algorithm. This thesis highlights three advantages of the exponential family embeddings model class. (A) The approximations used in existing methods such as word2vec can be understood as a biased stochastic gradient procedure on a specific type of exponential family embedding model, the Bernoulli embedding. (B) By choosing different likelihoods from the exponential family, we can generalize the task of learning distributed representations to different application domains. For example, we can learn embeddings of grocery items from shopping data, embeddings of movies from click data, or embeddings of neurons from recordings of zebrafish brains. In all three applications, we find exponential family embedding models to be more effective than other types of dimensionality reduction: they better reconstruct held-out data and find interesting qualitative structure. (C) Finally, the probabilistic modeling perspective allows us to incorporate structure and domain knowledge into the embedding space. We develop models for studying how language varies over time, how it differs between related groups of data, and how word usage differs between languages. Key to the success of these methods is that the embeddings share statistical information through hierarchical priors or neural networks. We demonstrate the benefits of this approach in empirical studies of Senate speeches, scientific abstracts, and shopping baskets.
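A minimal numerical sketch of the Bernoulli embedding case may help: each binary observation is modeled with a natural parameter that couples the item's embedding vector with the summed context vectors, and a stochastic gradient step follows the (x - p) error signal. Dimensions, initialization, and the simplified update shown are assumptions for illustration.

```python
# Toy Bernoulli embedding: p(x_i = 1 | context) = sigmoid(rho_i . sum_j alpha_j).
import numpy as np

rng = np.random.default_rng(0)
V, K = 5, 3                                 # vocabulary size, embedding dimension
rho = 0.1 * rng.standard_normal((V, K))     # embedding vectors (latent)
alpha = 0.1 * rng.standard_normal((V, K))   # context vectors (latent)

def log_lik(i, context, x):
    """Bernoulli log-likelihood of x in {0,1} for item i given its context set."""
    eta = rho[i] @ alpha[context].sum(axis=0)   # natural parameter
    p = 1.0 / (1.0 + np.exp(-eta))
    return np.log(p if x == 1 else 1.0 - p)

def sgd_step(i, context, x, lr=0.1):
    """One stochastic gradient step on rho[i]; alpha updates are analogous."""
    ctx_sum = alpha[context].sum(axis=0)
    p = 1.0 / (1.0 + np.exp(-(rho[i] @ ctx_sum)))
    rho[i] += lr * (x - p) * ctx_sum            # d log p / d rho_i

print(log_lik(0, [1, 2], 1))
sgd_step(i=0, context=[1, 2], x=1)
```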

256. Cross-Lingual Transfer of Natural Language Processing Systems. Rasooli, Mohammad Sadegh. January 2019.
Accurate natural language processing systems rely heavily on annotated datasets. In the absence of such datasets, transfer methods can help develop a model by transferring annotations from one or more rich-resource languages to the target language of interest. These methods generally fall into two approaches: 1) annotation projection from translation data, also known as parallel data, using supervised models in rich-resource languages, and 2) direct model transfer from annotated datasets in rich-resource languages.
In this thesis, we demonstrate different methods for the transfer of dependency parsers and sentiment analysis systems. We propose an annotation projection method that performs well in scenarios where a large amount of in-domain parallel data is available. We also propose a method combining annotation projection and direct transfer that can leverage a minimal amount of information from a small out-of-domain parallel dataset to develop highly accurate transfer models. Furthermore, we propose an unsupervised syntactic reordering model to improve the accuracy of dependency parser transfer for non-European languages. Finally, we conduct a diverse set of experiments on the transfer of sentiment analysis systems in different data settings.
A summary of our contributions is as follows:
* We develop accurate dependency parsers using parallel text in an annotation projection framework, making use of the fact that the density of word alignments is a valuable indicator of projection reliability (see the sketch after this list).
* We develop accurate dependency parsers in the absence of a large amount of parallel data. We use Bible data, which is orders of magnitude smaller than a conventional parallel dataset, to provide minimal cues for creating cross-lingual word representations. Our model is also capable of boosting the performance of annotation projection when a large amount of parallel data is available, and it develops cross-lingual word representations that go beyond traditional delexicalized direct transfer methods. Moreover, we propose a simple but effective word translation approach that brings explicit lexical features from the target language into our direct transfer method.
* We develop syntactic reordering models that can transform source treebanks in rich-resource languages, preventing the learning of a wrong model for an unrelated target language. Our experimental results show substantial improvements for non-European languages.
* We develop transfer methods for sentiment analysis in different data availability scenarios. We show that we can leverage cross-lingual word embeddings to create accurate sentiment analysis systems in the absence of annotated data in the target language of interest.
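The sketch below illustrates annotation projection for dependency heads through one-to-one word alignments, with a density filter in the spirit of the reliability heuristic above; the function and threshold are simplified assumptions, not the thesis's implementation.

```python
# Schematic annotation projection: heads predicted by a source-language
# parser are copied to target words through word alignments, and sparsely
# aligned sentences are discarded as unreliable.
def project(src_heads, alignment, tgt_len, min_density=0.8):
    """src_heads: head index per source token (-1 = root);
    alignment: dict source index -> target index (one-to-one)."""
    if len(alignment) / tgt_len < min_density:
        return None                      # too sparse to trust the projection
    tgt_heads = [None] * tgt_len
    for s, t in alignment.items():
        h = src_heads[s]
        if h == -1:
            tgt_heads[t] = -1            # root projects to root
        elif h in alignment:
            tgt_heads[t] = alignment[h]  # head follows the alignment link
    return tgt_heads

# "the dog runs" -> hypothetical target sentence with the same word order
print(project(src_heads=[1, 2, -1], alignment={0: 0, 1: 1, 2: 2}, tgt_len=3))
# [1, 2, -1]
```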
We believe that the novelties introduced in this thesis demonstrate the usefulness of transfer methods. This is appealing in practice, especially since we suggest eliminating the requirement of annotating new datasets for low-resource languages, for which annotation is expensive, if not impossible, to obtain.

257. Forced Attention for Image Captioning. Hemanth Devarapalli. 17 January 2019.
Automatic generation of captions for a given image is an active research area in Artificial Intelligence. Architectures have evolved from classical machine learning over image metadata to neural networks. Two styles of architecture have emerged in the neural network space for image captioning: the Encoder-Attention-Decoder architecture and the Transformer architecture. This study attempts to modify the attention mechanism to allow any object to be specified. An archetypical Encoder-Attention-Decoder architecture (Show, Attend, and Tell (Xu et al., 2015)) is employed as a baseline, and a modification of the Show, Attend, and Tell architecture is proposed. Both architectures are evaluated on the MSCOCO (Lin et al., 2014) dataset, and seven metrics are calculated: BLEU-1, 2, 3, and 4 (Papineni, Roukos, Ward & Zhu, 2002), METEOR (Banerjee & Lavie, 2005), ROUGE-L (Lin, 2004), and CIDEr (Vedantam, Lawrence & Parikh, 2015). Finally, the statistical significance of the results is evaluated with paired t-tests.
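The following sketch conveys one plausible reading of "forced" attention in a Show, Attend, and Tell style decoder: standard soft attention scores over image regions are computed, then masked so only a user-specified object region can receive weight. Shapes, names, and the masking rule are assumptions for illustration, not the architecture proposed in the study.

```python
# Soft attention vs. a masked ("forced") variant over R image regions.
import torch
import torch.nn.functional as F

def soft_attention(features, hidden, W_a, W_h, w):
    """features: (R, D) region features; hidden: (H,) decoder state."""
    scores = torch.tanh(features @ W_a + hidden @ W_h) @ w   # (R,) scores
    return F.softmax(scores, dim=0)

def forced_attention(features, hidden, W_a, W_h, w, region_mask):
    scores = torch.tanh(features @ W_a + hidden @ W_h) @ w
    scores = scores.masked_fill(~region_mask, float("-inf"))  # zero weight off-object
    return F.softmax(scores, dim=0)

R, D, H, A = 4, 8, 6, 5
feats, h = torch.randn(R, D), torch.randn(H)
W_a, W_h, w = torch.randn(D, A), torch.randn(H, A), torch.randn(A)
mask = torch.tensor([False, True, True, False])  # attend only regions 1 and 2
print(forced_attention(feats, h, W_a, W_h, w, mask))  # nonzero only where True
```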

258. Evaluation of the usability of constraint diagrams as a visual modelling language: theoretical and empirical investigations. Fetais, Noora. January 2013.
This research evaluates the constraint diagrams (CD) notation, a formal representation for program specification that shows promise for use by people who are not experts in software design. Multiple methods were adopted in order to provide triangulated evidence of the potential benefits of constraint diagrams compared with other notational systems. Three main approaches were adopted. The first was a semantic and task analysis of the CD notation, conducted by applying the Cognitive Dimensions framework, which was used to examine the relative strengths and weaknesses of constraint diagrams and conventional notations in terms of the perceptive facilitation or impediments of these different representations. From this systematic analysis, we found that CD reduced the cognitive cost of exploratory design, modification, incrementation, searching, and transcription activities with regard to the cognitive dimensions: consistency, visibility, abstraction, closeness of mapping, secondary notation, premature commitment, role-expressiveness, progressive evaluation, diffuseness, provisionality, hidden dependency, viscosity, hard mental operations, and error-proneness. The second approach was an empirical evaluation of the comprehension of CD compared with natural language (NL), conducted with computer science students. This experiment took the form of a web-based competition in which 33 participants were given instructions and training on either CD or the equivalent NL specification expressions; after each example, they answered three multiple-choice questions requiring the interpretation of expressions in their particular notation. Although the CD group spent more time on the training and reported less confidence, they obtained interpretation scores comparable to the NL group's and took less time to answer the questions, despite having no prior experience of CD notation. The third approach was an experiment on the construction of CD. Twenty participants were given instructions and training on either CD or the equivalent NL specification expressions; after each example, they answered three questions requiring the construction of expressions in their particular notation. We built an editor to allow the construction of the two notations, which automatically logged the participants' interactions. In general, for constructing program specifications, the CD group gave more accurate answers, spent less time in training, and returned to the training examples less often than the NL group. Overall, it was found that CD is understandable, usable, intuitive, and expressive, with unambiguous semantic notation.

259. A robust unification-based parser for Chinese natural language processing. Chan Shuen-ti Roy. January 2001.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2001. Includes bibliographical references (leaves 168-175). Abstracts in English and Chinese.
Contents:
Chapter 1. Introduction (p.12)
  1.1. The nature of natural language processing (p.12)
  1.2. Applications of natural language processing (p.14)
  1.3. Purpose of study (p.17)
  1.4. Organization of this thesis (p.18)
Chapter 2. Organization and methods in natural language processing (p.20)
  2.1. Organization of natural language processing system (p.20)
  2.2. Methods employed (p.22)
  2.3. Unification-based grammar processing (p.22)
    2.3.1. Generalized Phrase Structure Grammar (GPSG) (p.27)
    2.3.2. Head-driven Phrase Structure Grammar (HPSG) (p.31)
    2.3.3. Common drawbacks of UBGs (p.33)
  2.4. Corpus-based processing (p.34)
    2.4.1. Drawback of corpus-based processing (p.35)
Chapter 3. Difficulties in Chinese language processing and its related works (p.37)
  3.1. A glance at the history (p.37)
  3.2. Difficulties in syntactic analysis of Chinese (p.37)
    3.2.1. Writing system of Chinese causes segmentation problem (p.38)
    3.2.2. Words serving multiple grammatical functions without inflection (p.40)
    3.2.3. Word order of Chinese (p.42)
    3.2.4. The Chinese grammatical word (p.43)
  3.3. Related works (p.45)
    3.3.1. Unification grammar processing approach (p.45)
    3.3.2. Corpus-based processing approach (p.48)
  3.4. Restatement of goal (p.50)
Chapter 4. SERUP: Statistical-Enhanced Robust Unification Parser (p.54)
Chapter 5. Step One: automatic preprocessing (p.57)
  5.1. Segmentation of lexical tokens (p.57)
  5.2. Conversion of date, time and numerals (p.61)
  5.3. Identification of new words (p.62)
    5.3.1. Proper nouns: Chinese names (p.63)
    5.3.2. Other proper nouns and multi-syllabic words (p.67)
  5.4. Defining smallest parsing unit (p.82)
    5.4.1. The Chinese sentence (p.82)
    5.4.2. Breaking down the paragraphs (p.84)
    5.4.3. Implementation (p.87)
Chapter 6. Step Two: grammar construction (p.91)
  6.1. Criteria in choosing a UBG model (p.91)
  6.2. The grammar in details (p.92)
    6.2.1. The PHON feature (p.93)
    6.2.2. The SYN feature (p.94)
    6.2.3. The SEM feature (p.98)
    6.2.4. Grammar rules and features principles (p.99)
    6.2.5. Verb phrases (p.101)
    6.2.6. Noun phrases (p.104)
    6.2.7. Prepositional phrases (p.113)
    6.2.8. "Ba2" and "Bei4" constructions (p.115)
    6.2.9. The terminal node S (p.119)
    6.2.10. Summary of phrasal rules (p.121)
    6.2.11. Morphological rules (p.122)
Chapter 7. Step Three: resolving structural ambiguities (p.128)
  7.1. Sources of ambiguities (p.128)
  7.2. The traditional practices: an illustration (p.132)
  7.3. Deficiency of current practices (p.134)
  7.4. A new point of view: Wu (1999) (p.140)
  7.5. Improvement over Wu (1999) (p.142)
  7.6. Conclusion on semantic features (p.146)
Chapter 8. Implementation, performance and evaluation (p.148)
  8.1. Implementation (p.148)
  8.2. Performance and evaluation (p.150)
    8.2.1. The test set (p.150)
    8.2.2. Segmentation of lexical tokens (p.150)
    8.2.3. New word identification (p.152)
    8.2.4. Parsing unit segmentation (p.156)
    8.2.5. The grammar (p.158)
  8.3. Overall performance of SERUP (p.162)
Chapter 9. Conclusion (p.164)
  9.1. Summary of this thesis (p.164)
  9.2. Contribution of this thesis (p.165)
  9.3. Future work (p.166)
References (p.168)
Appendix I (p.176)
Appendix II (p.181)
Appendix III (p.183)

260. Automatic construction and adaptation of wrappers for semi-structured web documents. Wong Tak Lam. January 2003.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2003. Includes bibliographical references (leaves 88-94). Abstracts in English and Chinese.
Contents:
Chapter 1. Introduction (p.1)
  1.1. Wrapper Induction for Semi-structured Web Documents (p.1)
  1.2. Adapting Wrappers to Unseen Web Sites (p.6)
  1.3. Thesis Contributions (p.7)
  1.4. Thesis Organization (p.8)
Chapter 2. Related Work (p.10)
  2.1. Related Work on Wrapper Induction (p.10)
  2.2. Related Work on Wrapper Adaptation (p.16)
Chapter 3. Automatic Construction of Hierarchical Wrappers (p.20)
  3.1. Hierarchical Record Structure Inference (p.22)
  3.2. Extraction Rule Induction (p.30)
  3.3. Applying Hierarchical Wrappers (p.38)
Chapter 4. Experimental Results for Wrapper Induction (p.40)
Chapter 5. Adaptation of Wrappers for Unseen Web Sites (p.52)
  5.1. Problem Definition (p.52)
  5.2. Overview of Wrapper Adaptation Framework (p.55)
  5.3. Potential Training Example Candidate Identification (p.58)
    5.3.1. Useful Text Fragments (p.58)
    5.3.2. Training Example Generation from the Unseen Web Site (p.60)
    5.3.3. Modified Nearest Neighbour Classification (p.63)
  5.4. Machine Annotated Training Example Discovery and New Wrapper Learning (p.64)
    5.4.1. Text Fragment Classification (p.64)
    5.4.2. New Wrapper Learning (p.69)
Chapter 6. Case Study and Experimental Results for Wrapper Adaptation (p.71)
  6.1. Case Study on Wrapper Adaptation (p.71)
  6.2. Experimental Results (p.73)
    6.2.1. Book Domain (p.74)
    6.2.2. Consumer Electronic Appliance Domain (p.79)
Chapter 7. Conclusions and Future Work (p.83)
Bibliography (p.88)
Appendix A. Detailed Performance of Wrapper Induction for Book Domain (p.95)
Appendix B. Detailed Performance of Wrapper Induction for Consumer Electronic Appliance Domain (p.99)