111

Using a rewriting system to model individual writing styles

Lin, Jing January 2012 (has links)
Each individual has a distinctive writing style, but natural language generation systems produce text with much less variety. Is it possible to produce more human-like text from natural language generation systems by mimicking the style of particular authors? We start by analysing the text of real authors. We collect a corpus of texts from a single genre (food recipes), with each text identified with its author, and summarise a variety of writing features in these texts. Each author's writing style is the combination of a set of features. Analysis of the writing features shows that not only does each individual author write differently, but the differences are consistent over the whole of their corpus. Hence we conclude that authors do maintain a consistent style composed of a variety of different features. When we discuss notions such as the style and meaning of texts, we are referring to the reaction that readers have to them. It is important, therefore, in the field of computational linguistics to experiment by showing texts to people and assessing their interpretation of the texts. In our research we move the thesis from simple discussion and statistical analysis of the properties of text and NLG systems to experiments that verify the actual impact that lexical preference has on real readers. Through experiments that require participants to follow a recipe and prepare food, we conclude that it is possible to alter the lexicon of a recipe without altering the actions performed by the cook, hence that word choice is an aspect of style rather than semantics; and also that word choice is one of the writing features employed by readers in identifying the author of a text. Among all writing features, individual lexical preference is very important both for analysing and for generating texts, so we choose individual lexical choice as our principal topic of research. Using a modified version of distributional similarity (DS) helps us to choose words used by individual authors without the limitations of many other solutions, such as the need for a pre-built thesaurus. We present an algorithm for analysis and rewriting, and assess the results. Based on the results we propose some further improvements.
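As a rough sketch of the distributional-similarity idea (not the thesis's modified DS measure), words can be represented by context-count vectors built from a corpus, and cosine similarity between those vectors then suggests substitutable words for author-specific lexical choice; the toy recipe corpus below is invented for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt

def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within +/- `window` positions."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(c1, c2):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(c1[k] * c2[k] for k in c1 if k in c2)
    n1 = sqrt(sum(v * v for v in c1.values()))
    n2 = sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

# Invented toy corpus in the recipe genre.
corpus = [
    "chop the onion finely".split(),
    "dice the onion finely".split(),
    "chop the carrot and add to the pan".split(),
    "dice the carrot and add to the pan".split(),
]
vectors = context_vectors(corpus)
# High similarity suggests "dice" as a candidate substitution for "chop".
print(cosine(vectors["chop"], vectors["dice"]))
```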
112

Domain independent generation from RDF instance data

Sun, Xiantang January 2008 (has links)
The next generation of the web, the Semantic Web, integrates distributed web resources from various domains by allowing data (instantial and ontological data) to be shared and reused across application, enterprise and community boundaries, based on the Resource Description Framework (RDF). Nevertheless, RDF was not developed for casual users who are unfamiliar with it but are interested in data represented using RDF. Natural language generation (NLG) may be a way to bridge the gap between casual users and RDF data, but the cost of separately applying fine-grained NLG techniques for every domain in the Semantic Web would be extremely high, and hence not realistic.
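To make the setting concrete, here is a minimal sketch of RDF instance data and a naive per-triple text realisation, assuming the Python rdflib library is available; the ex: vocabulary and the one-clause-per-triple rule are illustrative assumptions, not the thesis's domain-independent generator.

```python
from rdflib import Graph

# A tiny piece of RDF instance data in Turtle; the ex: vocabulary is invented.
turtle = """
@prefix ex: <http://example.org/> .
ex:Edinburgh ex:isCapitalOf ex:Scotland .
ex:Edinburgh ex:hasPopulation "488000" .
"""

g = Graph()
g.parse(data=turtle, format="turtle")

def surface(term):
    """Use the local name of a URI (or a literal's value) as a surface word."""
    return str(term).rsplit("/", 1)[-1]

# Naive domain-independent realisation: one clause per triple.
for s, p, o in g:
    print(f"{surface(s)} {surface(p)} {surface(o)}.")
```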
113

Automatic Tagging of Communication Data

Hoyt, Matthew Ray 08 1900 (has links)
Globally distributed software teams are widespread throughout industry, but finding reliable methods that can properly assess a team's activities is a real challenge. Methods such as surveys and manual coding of activities are too time-consuming and are often unreliable. Recent advances in information retrieval and linguistics, however, suggest that automated and/or semi-automated text classification algorithms could be an effective way of finding differences in the communication patterns among individuals and groups. Communication among group members is frequent and generates a significant amount of data, so a web-based tool that can automatically analyze the communication patterns of global software teams could lead to a better understanding of group performance. The goal of this thesis, therefore, is to compare automatic and semi-automatic measures of communication and evaluate their effectiveness in classifying different types of group activities that occur within a global software development project. To achieve this goal, we developed a web-based component that can be used to help clean and classify communication activities. The component was then used to compare different automated text classification techniques on various group activities to determine their effectiveness in correctly classifying data from a global software development team project.
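As a rough illustration of the kind of automated text classification compared in this work, the following sketch trains a TF-IDF plus logistic-regression baseline with scikit-learn; the example messages and activity labels are invented, and the thesis's own pipeline and feature set are not reproduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented communication messages and activity labels.
messages = [
    "fixed the null pointer exception in the login module",
    "can we schedule the sprint planning call for Monday?",
    "merged the feature branch after the code review",
    "who is taking notes in tomorrow's standup?",
]
labels = ["coding", "coordination", "coding", "coordination"]

# TF-IDF features plus a linear classifier: one simple automated baseline.
classifier = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
classifier.fit(messages, labels)
print(classifier.predict(["please review my pull request for the parser"]))
```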
114

Content-Based Geolocation Prediction of Canadian Twitter Users and Their Tweets

Metin, Ali Mert 13 August 2019 (has links)
The last decade witnessed the rise of online social networks, especially Twitter. Today, Twitter is a giant social platform with over 250 million users who produce massive amounts of data every day. This creates many research opportunities, specifically for Natural Language Processing (NLP), in which text is utilized to extract information that can be used in many applications. One problem NLP might help solve is geolocation inference, or geolocation detection, from online social networks. Detecting the location of Twitter users based on the text of their tweets is useful since not many users publicly declare their locations or geotag their tweets. Location information is crucial for a variety of applications such as event detection, disease and illness tracking, and user profiling. These tasks are not trivial, because online content is often noisy; it includes misspellings, incomplete words or phrases, idiomatic expressions, abbreviations, acronyms, and Twitter-specific literature. In this work, we attempted to detect the location of Canadian users, and of tweets sent from Canada, at the metropolitan-area and province level; to the best of our knowledge, this had not been done before. To do this, we collected two different datasets and applied a variety of machine learning methods, including deep learning. In addition, as a novel approach, we also attempted to geolocate users based on their social graph (i.e., a user's friends and followers).
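The social-graph approach mentioned at the end can be sketched as a simple majority vote over the declared locations of a user's friends and followers; the toy graph and province labels below are invented, and this is only one plausible reading of such a baseline, not the thesis's implementation.

```python
from collections import Counter

def geolocate_by_graph(user, friends_of, location_of):
    """Predict a user's region as the most common declared region among
    their friends and followers (a simple majority-vote baseline)."""
    votes = Counter(
        location_of[f] for f in friends_of.get(user, []) if f in location_of
    )
    return votes.most_common(1)[0][0] if votes else None

# Invented toy data: user -> friends/followers, account -> declared province.
friends_of = {"user_1": ["a", "b", "c", "d"]}
location_of = {"a": "Ontario", "b": "Ontario", "c": "Quebec"}
print(geolocate_by_graph("user_1", friends_of, location_of))  # -> "Ontario"
```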
115

Geographic referring expressions : doing geometry with words

Gomes de Oliveira, Rodrigo January 2017 (has links)
No description available.
116

GLR parsing with multiple grammars for natural language queries.

January 2000 (has links)
Luk Po Chui. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2000. / Includes bibliographical references (leaves 97-100). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Efficiency and Memory --- p.2 / Chapter 1.2 --- Ambiguity --- p.3 / Chapter 1.3 --- Robustness --- p.4 / Chapter 1.4 --- Thesis Organization --- p.5 / Chapter 2 --- Background --- p.7 / Chapter 2.1 --- Introduction --- p.7 / Chapter 2.2 --- Context-Free Grammars --- p.8 / Chapter 2.3 --- The LR Parsing Algorithm --- p.9 / Chapter 2.4 --- The Generalized LR Parsing Algorithm --- p.12 / Chapter 2.4.1 --- Graph-Structured Stack --- p.12 / Chapter 2.4.2 --- Packed Shared Parse Forest --- p.14 / Chapter 2.5 --- Time and Space Complexity --- p.16 / Chapter 2.6 --- Related Work on Parsing --- p.17 / Chapter 2.6.1 --- GLR* --- p.17 / Chapter 2.6.2 --- TINA --- p.18 / Chapter 2.6.3 --- PHOENIX --- p.19 / Chapter 2.7 --- Chapter Summary --- p.21 / Chapter 3 --- Grammar Partitioning --- p.22 / Chapter 3.1 --- Introduction --- p.22 / Chapter 3.2 --- Motivation --- p.22 / Chapter 3.3 --- Previous Work on Grammar Partitioning --- p.24 / Chapter 3.4 --- Our Grammar Partitioning Approach --- p.26 / Chapter 3.4.1 --- Definitions and Concepts --- p.26 / Chapter 3.4.2 --- Guidelines for Grammar Partitioning --- p.29 / Chapter 3.5 --- An Example --- p.30 / Chapter 3.6 --- Chapter Summary --- p.34 / Chapter 4 --- Parser Composition --- p.35 / Chapter 4.1 --- Introduction --- p.35 / Chapter 4.2 --- GLR Lattice Parsing --- p.36 / Chapter 4.2.1 --- Lattice with Multiple Granularity --- p.36 / Chapter 4.2.2 --- Modifications to the GLR Parsing Algorithm --- p.37 / Chapter 4.3 --- Parser Composition Algorithms --- p.45 / Chapter 4.3.1 --- Parser Composition by Cascading --- p.46 / Chapter 4.3.2 --- Parser Composition with Predictive Pruning --- p.48 / Chapter 4.3.3 --- Comparison of Parser Composition by Cascading and Parser Composition with Predictive Pruning --- p.54 / Chapter 4.4 --- Chapter Summary --- p.54 / Chapter 5 --- Experimental Results and Analysis --- p.56 / Chapter 5.1 --- Introduction --- p.56 / Chapter 5.2 --- Experimental Corpus --- p.57 / Chapter 5.3 --- ATIS Grammar Development --- p.60 / Chapter 5.4 --- Grammar Partitioning and Parser Composition on ATIS Domain --- p.62 / Chapter 5.4.1 --- ATIS Grammar Partitioning --- p.62 / Chapter 5.4.2 --- Parser Composition on ATIS --- p.63 / Chapter 5.5 --- Ambiguity Handling --- p.66 / Chapter 5.6 --- Semantic Interpretation --- p.69 / Chapter 5.6.1 --- Best Path Selection --- p.69 / Chapter 5.6.2 --- Semantic Frame Generation --- p.71 / Chapter 5.6.3 --- Post-Processing --- p.72 / Chapter 5.7 --- Experiments --- p.73 / Chapter 5.7.1 --- Grammar Coverage --- p.73 / Chapter 5.7.2 --- Size of Parsing Table --- p.74 / Chapter 5.7.3 --- Computational Costs --- p.76 / Chapter 5.7.4 --- Accuracy Measures in Natural Language Understanding --- p.81 / Chapter 5.7.5 --- Summary of Results --- p.90 / Chapter 5.8 --- Chapter Summary --- p.91 / Chapter 6 --- Conclusions --- p.92 / Chapter 6.1 --- Thesis Summary --- p.92 / Chapter 6.2 --- Thesis Contributions --- p.93 / Chapter 6.3 --- Future Work --- p.94 / Chapter 6.3.1 --- Statistical Approach on Grammar Partitioning --- p.94 / Chapter 6.3.2 --- Probabilistic modeling for Best Parse Selection --- p.95 / Chapter 6.3.3 --- Robust Parsing Strategies --- p.96 / Bibliography --- p.97 / Chapter A --- ATIS-3 Grammar --- p.101 / Chapter A.1 --- English ATIS-3 Grammar Rules --- p.101 / Chapter A.2 --- Chinese ATIS-3 Grammar Rules --- p.104
117

Natural language understanding across application domains and languages.

January 2002 (has links)
Tsui Wai-Ching. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. / Includes bibliographical references (leaves 115-122). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Overview --- p.1 / Chapter 1.2 --- Natural Language Understanding Using Belief Networks --- p.5 / Chapter 1.3 --- Integrating Speech Recognition with Natural Language Understanding --- p.7 / Chapter 1.4 --- Thesis Goals --- p.9 / Chapter 1.5 --- Thesis Organization --- p.10 / Chapter 2 --- Background --- p.12 / Chapter 2.1 --- Natural Language Understanding Approaches --- p.13 / Chapter 2.1.1 --- Rule-based Approaches --- p.15 / Chapter 2.1.2 --- Stochastic Approaches --- p.16 / Chapter 2.1.3 --- Mixed Approaches --- p.18 / Chapter 2.2 --- Portability of Natural Language Understanding Frameworks --- p.19 / Chapter 2.2.1 --- Portability across Domains --- p.19 / Chapter 2.2.2 --- Portability across Languages --- p.20 / Chapter 2.2.3 --- Portability across both Domains and Languages --- p.21 / Chapter 2.3 --- Spoken Language Understanding --- p.21 / Chapter 2.3.1 --- Integration of Speech Recognition Confidence into Natural Language Understanding --- p.22 / Chapter 2.3.2 --- Integration of Other Potential Confidence Features into Natural Language Understanding --- p.24 / Chapter 2.4 --- Belief Networks --- p.24 / Chapter 2.4.1 --- Overview --- p.24 / Chapter 2.4.2 --- Bayesian Inference --- p.26 / Chapter 2.5 --- Transformation-based Parsing Technique --- p.27 / Chapter 2.6 --- Chapter Summary --- p.28 / Chapter 3 --- Portability of the Natural Language Understanding Framework across Application Domains and Languages --- p.31 / Chapter 3.1 --- Natural Language Understanding Framework --- p.32 / Chapter 3.1.1 --- Semantic Tagging --- p.33 / Chapter 3.1.2 --- Informational Goal Inference with Belief Networks --- p.34 / Chapter 3.2 --- The ISIS Stocks Domain --- p.36 / Chapter 3.3 --- A Unified Framework for English and Chinese --- p.38 / Chapter 3.3.1 --- Semantic Tagging for the ISIS domain --- p.39 / Chapter 3.3.2 --- Transformation-based Parsing --- p.40 / Chapter 3.3.3 --- Informational Goal Inference with Belief Networks for the ISIS domain --- p.43 / Chapter 3.4 --- Experiments --- p.45 / Chapter 3.4.1 --- Goal Identification Experiments --- p.45 / Chapter 3.4.2 --- A Cross-language Experiment --- p.49 / Chapter 3.5 --- Chapter Summary --- p.55 / Chapter 4 --- Enhancement in the Belief Networks for Informational Goal Inference --- p.57 / Chapter 4.1 --- Semantic Concept Selection in Belief Networks --- p.58 / Chapter 4.1.1 --- Selection of Positive Evidence --- p.58 / Chapter 4.1.2 --- Selection of Negative Evidence --- p.62 / Chapter 4.2 --- Estimation of Statistical Probabilities in the Enhanced Belief Networks --- p.64 / Chapter 4.2.1 --- Estimation of Prior Probabilities --- p.65 / Chapter 4.2.2 --- Estimation of Posterior Probabilities --- p.66 / Chapter 4.3 --- Experiments --- p.73 / Chapter 4.3.1 --- Belief Networks Developed with Positive Evidence --- p.74 / Chapter 4.3.2 --- Belief Networks with the Injection of Negative Evidence --- p.76 / Chapter 4.4 --- Chapter Summary --- p.82 / Chapter 5 --- Integration between Speech Recognition and Natural Language Understanding --- p.84 / Chapter 5.1 --- The Speech Corpus for the Chinese ISIS Stocks Domain --- p.86 / Chapter 5.2 --- Our Extended Natural Language Understanding Framework for Spoken Language Understanding --- p.90 / Chapter 5.2.1 --- Integrated Scoring for Chinese Speech Recognition and Natural Language Understanding --- p.92 / Chapter 5.3 --- Experiments --- p.92 / Chapter 5.3.1 --- Training and Testing on the Perfect Reference Data Sets --- p.93 / Chapter 5.3.2 --- Mismatched Training and Testing Conditions - Perfect Reference versus Imperfect Hypotheses --- p.93 / Chapter 5.3.3 --- Comparing Goal Identification between the Use of Single-best versus N-best Recognition Hypotheses --- p.95 / Chapter 5.3.4 --- Integration of Speech Recognition Confidence Scores into Natural Language Understanding --- p.97 / Chapter 5.3.5 --- Feasibility of Our Approach for Spoken Language Understanding --- p.99 / Chapter 5.3.6 --- Justification of Using Max-of-max Classifier in Our Single Goal Identification Scheme --- p.107 / Chapter 5.4 --- Chapter Summary --- p.109 / Chapter 6 --- Conclusions and Future Work --- p.110 / Chapter 6.1 --- Conclusions --- p.110 / Chapter 6.2 --- Contributions --- p.112 / Chapter 6.3 --- Future Work --- p.113 / Bibliography --- p.115 / Chapter A --- Semantic Frames for Chinese --- p.123 / Chapter B --- Semantic Frames for English --- p.127 / Chapter C --- The Concept Set of Positive Evidence for the Nine Goals in English --- p.131 / Chapter D --- The Concept Set of Positive Evidence for the Ten Goals in Chinese --- p.133 / Chapter E --- The Complete Concept Set including Both the Positive and Negative Evidence for the Ten Goals in English --- p.135 / Chapter F --- The Complete Concept Set including Both the Positive and Negative Evidence for the Ten Goals in Chinese --- p.138 / Chapter G --- The Assignment of Statistical Probabilities for Each Selected Concept under the Corresponding Goals in Chinese --- p.141 / Chapter H --- The Assignment of Statistical Probabilities for Each Selected Concept under the Corresponding Goals in English --- p.146
118

Why did they cite that?

Lovering, Charles 26 April 2018 (has links)
We explore a machine learning task, evidence recommendation (ER): the extraction of evidence from a source document to support an external claim. This task is an instance of the question-answering machine learning task. We apply ER to academic publications because they cite other papers for the claims they make. Reading cited papers to corroborate claims is time-consuming, and an automated ER tool could expedite it. Thus, we propose a methodology for collecting a dataset of academic papers and their references. We explore deep learning models for ER and achieve 77% accuracy with pairwise models and 75% pairwise accuracy with document-wise models.
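A minimal similarity baseline for evidence recommendation (not the pairwise or document-wise deep models described above) might rank the sentences of a cited paper by TF-IDF cosine similarity to the claim; the claim and candidate sentences below are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def recommend_evidence(claim, source_sentences, top_k=1):
    """Rank the sentences of a cited paper by TF-IDF cosine similarity to the claim."""
    vectorizer = TfidfVectorizer().fit(source_sentences + [claim])
    similarities = cosine_similarity(
        vectorizer.transform([claim]), vectorizer.transform(source_sentences)
    )[0]
    ranked = sorted(zip(similarities, source_sentences), reverse=True)
    return [sentence for _, sentence in ranked[:top_k]]

# Invented claim and candidate evidence sentences.
claim = "Dropout reduces overfitting in deep neural networks."
sentences = [
    "We introduce dropout, a simple technique that prevents overfitting.",
    "The dataset contains 60,000 training images.",
    "Our model uses rectified linear units throughout.",
]
print(recommend_evidence(claim, sentences))
```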
119

Towards the Automatic Classification of Student Answers to Open-ended Questions

Alvarado Mantecon, Jesus Gerardo 24 April 2019 (has links)
One of the main research challenges in the context of Massive Open Online Courses (MOOCs) is the effective automation of the evaluation of text-based assessments. Text-based assessments, such as essay writing, have been shown to be better indicators of higher levels of understanding than machine-scored assessments (e.g., multiple-choice questions). Nonetheless, due to the rapid growth of MOOCs, text-based evaluation has become a difficult task for human markers, creating the need for automated grading systems. In this thesis, we focus on the automated short answer grading (ASAG) task, which automatically classifies natural language answers to open-ended questions as correct or incorrect. We propose an ensemble supervised machine learning approach that relies on two types of classifiers: a response-based classifier, which centers on feature extraction from the available responses, and a reference-based classifier, which considers the relationships between responses, model answers and questions. For each classifier, we explored a set of features based on words and entities. For the response-based classifier, we tested and compared five features: traditional n-gram models, entity URIs (Uniform Resource Identifiers) and entity mentions (both extracted using a semantic annotation API), entity mention embeddings based on GloVe, and entity URI embeddings extracted from Wikipedia. For the reference-based classifier, we explored fourteen features: cosine similarity between sentence embeddings of student answers and model answers, the number of overlapping elements (words, entity URIs, entity mentions) between student answers and model answers or question text, the Jaccard similarity coefficient between student answers and model answers or question text (based on words, entity URIs or entity mentions), and a sentence embedding representation. We evaluated our classifiers on three datasets, two of which belong to the SemEval ASAG competition (Dzikovska et al., 2013). Our results show that, in general, reference-based features perform much better than response-based features in terms of accuracy and macro-averaged F1-score. Within the reference-based approach, we observe that the S6 embedding representation, which considers the question text, the student answer and the model answer, generated the best-performing models; nonetheless, combining it with other similarity features helped build more accurate classifiers. As for the response-based classifiers, models based on traditional n-gram features remained the best. Finally, we combined our best reference-based and response-based classifiers using an ensemble learning model. Our ensemble classifiers combining both approaches achieved the best results for one of the evaluation datasets, but underperformed on the remaining two. We also compared our best two classifiers with some of the main state-of-the-art results from the SemEval competition. Our final embedded meta-classifier outperformed the top-ranking result on the SemEval Beetle dataset, and our top classifier on SemEval SciEntsBank, trained on reference-based features, obtained second position. In conclusion, the reference-based approach, powered mainly by sentence-level embeddings and other similarity features, proved to generate the most effective models on two out of three datasets, and the ensemble model was the best on the SemEval Beetle dataset.
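Two of the reference-based features described above, word overlap and the Jaccard coefficient between a student answer and a model answer, can be computed with a few lines of Python; the whitespace tokenisation and the example answers here are simplified assumptions, not the thesis's exact feature extraction.

```python
def tokens(text):
    """Lower-cased word tokens (a crude stand-in for real tokenisation)."""
    return set(text.lower().split())

def overlap_and_jaccard(student_answer, reference_text):
    """Word-overlap count and Jaccard coefficient between two texts."""
    a, b = tokens(student_answer), tokens(reference_text)
    intersection, union = a & b, a | b
    jaccard = len(intersection) / len(union) if union else 0.0
    return len(intersection), jaccard

# Invented student answer and model answer.
student = "the bulb lights because the circuit is closed"
model_answer = "the circuit is closed so current flows and the bulb lights"
print(overlap_and_jaccard(student, model_answer))
```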
120

Robust parsing with confluent preorder parser. / CUHK electronic theses & dissertations collection

January 1996 (has links)
by Ho, Kei Shiu Edward. / "June 1996." / Thesis (Ph.D.)--Chinese University of Hong Kong, 1996. / Includes bibliographical references (p. 186-193). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Mode of access: World Wide Web.
