Global ETD Search

61	Robust parsing with confluent preorder parser. / CUHK electronic theses & dissertations collection January 1996 by Ho, Kei Shiu Edward. / "June 1996." / Thesis (Ph.D.)--Chinese University of Hong Kong, 1996. / Includes bibliographical references (p. 186-193). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Mode of access: World Wide Web. Parsing (Computer grammar) Robust statistics
62	Conditional random fields with dynamic potentials for Chinese named entity recognition. January 2008 Wu, Yiu Kei. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (p. 69-75). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Chinese NER Problem --- p.1 / Chapter 1.2 --- Contribution of Our Proposed Framework --- p.3 / Chapter 2 --- Related Work --- p.6 / Chapter 2.1 --- Hidden Markov Models --- p.7 / Chapter 2.2 --- Maximum Entropy Models --- p.8 / Chapter 2.3 --- Conditional Random Fields --- p.10 / Chapter 3 --- Our Proposed Model --- p.14 / Chapter 3.1 --- Background --- p.14 / Chapter 3.1.1 --- Problem Formulation --- p.14 / Chapter 3.1.2 --- Conditional Random Fields --- p.16 / Chapter 3.1.3 --- Semi-Markov Conditional Random Fields --- p.26 / Chapter 3.2 --- The Formulation of Our Proposed Model --- p.28 / Chapter 3.2.1 --- The Main Principle --- p.28 / Chapter 3.2.2 --- The Detailed Formulation --- p.36 / Chapter 3.2.3 --- Adapting Features from Original CRF to CRFDP --- p.51 / Chapter 4 --- Experiments --- p.54 / Chapter 4.1 --- Datasets --- p.55 / Chapter 4.2 --- Features --- p.57 / Chapter 4.3 --- Evaluation Metrics --- p.61 / Chapter 4.4 --- Results and Discussion --- p.63 / Chapter 5 --- Conclusions and Future Work --- p.67 / Bibliography --- p.69 / A --- p.76 / B --- p.78 / C --- p.88 Computational linguistics Random fields Parsing (Computer grammar) Names, Chinese
63	A Detailed Analysis of Semantic Dependency Parsing with Deep Neural Networks / En detaljerad analys av semantisk dependensparsning med djupa neuronnät Roxbo, Daniel January 2019 The use of Long Short Term Memory (LSTM) networks continues to yield better results in natural language processing tasks. One area which has recently seen significant improvements is semantic dependency parsing, where the current state-of-the-art model uses a multilayer LSTM combined with an attention-based scoring function to predict the dependencies. In this thesis the state-of-the-art model is first replicated and then extended to include features based on syntactic trees, which were found to be useful in a similar model. In addition, the effect of part-of-speech tags is studied. The replicated model achieves a labeled F1 score of 93.6 on the in-domain data and 89.2 on the out-of-domain data on the DM dataset, which shows that the model is indeed replicable. Using multiple features extracted from syntactic gold standard trees of the DELPH-IN Derivation Tree (DT) type increased the labeled scores to 97.1 and 94.1 respectively, while the use of predicted trees of the Stanford Basic (SB) type did not improve the results at all. The usefulness of part-of-speech tags was found to be diminished in the presence of other features. Semantic Dependency Parsing LSTM Computer Sciences Datavetenskap (datalogi)
64	Missing the point : the effect of punctuation on reading performance Grindlay, Benjamin James William. January 2002 Bibliography: p. 249-261. Reading comprehension Punctuation Grammar, comparative and general Parsing Psycholinguistics
65	Missing the point : the effect of punctuation on reading performance / Benjamin J. W. Grindlay. / Effect of punctuation on reading performance Grindlay, Benjamin James William January 2002 Bibliography: p. 249-261. / viii, 271 p. : tables ; 30 cm. / Title page, contents and abstract only. The complete thesis in print form is available from the University Library. / Thesis (Ph.D.)--University of Adelaide, Dept. of Psychology, 2003? Reading comprehension. Punctuation. Psycholinguistics.
66	Parsing and Generating English Using Commutative Transformations Katz, Boris, Winston, Patrick H. 01 May 1982 This paper is about an implemented natural language interface that translates from English into semantic net relations and from semantic net relations back into English. The parser and companion generator were implemented for two reasons: (a) to enable experimental work in support of a theory of learning by analogy; (b) to demonstrate the viability of a theory of parsing and generation built on commutative transformations. The learning theory was shaped to a great degree by experiments that would have been extraordinarily tedious to perform without the English interface with which the experimental data base was prepared, revised, and revised again. Inasmuch as current work on the learning theory is moving toward a tenfold increase in data-base size, the English interface is moving from a facilitating role to an enabling one. The parsing and generation theory has two particularly important features: (a) the same grammar is used for both parsing and generation; (b) the transformations of the grammar are commutative. The language generation procedure converts a semantic network fragment into kernel frames, chooses the set of transformations that should be performed upon each frame, executes the specified transformations, combines the altered kernels into a sentence, performs a pronominalization process, and finally produces the appropriate English word string. Parsing is essentially the reverse of generation. The first step in the parsing process is splitting a given sentence into a set of kernel clauses along with a description of how those clauses are hierarchically related to each other. The clauses are used to produce a matrix embedded kernel frames, which in turn supply arguments to relation-creating functions. 
The evaluation of the relation-creating functions results in the construction of the semantic net fragments. parsing generation natural language semantic networks commutative transformations language understanding
67	Learning by Failing to Explain Hall, Robert Joseph 01 May 1986 Explanation-based Generalization requires that the learner obtain an explanation of why a precedent exemplifies a concept. It is, therefore, useless if the system fails to find this explanation. However, it is not necessary to give up and resort to purely empirical generalization methods. In fact, the system may already know almost everything it needs to explain the precedent. Learning by Failing to Explain is a method that exploits current knowledge to prune complex precedents, isolating the mysterious parts of the precedent. The idea has two parts: the notion of partially analyzing a precedent to get rid of the parts which are already explainable, and the notion of re-analyzing old rules in terms of new ones, so that more general rules are obtained. learning explanation heuristic parsing design graph grammars subgraph isomorphism
68	The Multilingual Forest : Investigating High-quality Parallel Corpus Development Adesam, Yvonne January 2012 This thesis explores the development of parallel treebanks, collections of language data consisting of texts and their translations, with syntactic annotation and alignment, linking words, phrases, and sentences to show translation equivalence. We describe the semi-manual annotation of the SMULTRON parallel treebank, consisting of 1,000 sentences in English, German and Swedish. This description is the starting point for answering the first of two questions in this thesis. What issues need to be considered to achieve a high-quality, consistent, parallel treebank? The units of annotation and the choice of annotation schemes are crucial for quality, and some automated processing is necessary to increase the size. Automatic quality checks and evaluation are essential, but manual quality control is still needed to achieve high quality. Additionally, we explore improving the automatically created annotation for one language, using information available from the annotation of the other languages. This leads us to the second of the two questions in this thesis. Can we improve automatic annotation by projecting information available in the other languages? Experiments with automatic alignment, which is projected from two language pairs, L1–L2 and L1–L3, onto the third pair, L2–L3, show an improvement in precision, in particular if the projected alignment is intersected with the system alignment. We also construct a test collection for experiments on annotation projection to resolve prepositional phrase attachment ambiguities. While majority vote projection improves the annotation, compared to the basic automatic annotation, using linguistic clues to correct the annotation before majority vote projection is even better, although more laborious. 
However, some structural errors cannot be corrected by projection at all, as different languages have different wording, and thus different structures. / I denna doktorsavhandling utforskas skapandet av parallella trädbanker. Dessa är språkliga data som består av texter och deras översättningar, som har märkts upp med syntaktisk information samt länkar mellan ord, fraser och meningar som motsvarar varandra i översättningarna. Vi beskriver den delvis manuella uppmärkningen av den parallella trädbanken SMULTRON, med 1.000 engelska, tyska och svenska meningar. Denna beskrivning är utgångspunkt för att besvara den första av två frågor i avhandlingen. Vilka frågor måste beaktas för att skapa en högkvalitativ parallell trädbank? De enheter som märks upp samt valet av uppmärkningssystemet är viktiga för kvaliteten, och en viss andel automatisk bearbetning är nödvändig för att utöka storleken. Automatiska kvalitetskontroller och automatisk utvärdering är av vikt, men viss manuell granskning är nödvändig för att uppnå hög kvalitet. Vidare utforskar vi att använda information som finns i uppmärkningen, för att förbättra den automatiskt skapade uppmärkningen för ett annat språk. Detta leder oss till den andra av de två frågorna i avhandlingen. Kan vi förbättra automatisk uppmärkning genom att överföra information som finns i de andra språken? Experimenten visar att automatisk länkning som överförs från två språkpar, L1–L2 och L1–L3, till det tredje språkparet, L2–L3, får förbättrad precision, framför allt för skärningspunkten mellan den överförda länkningen och den automatiska länkningen. Vi skapar även en testsamling för experiment med överföring av uppmärkning för att lösa upp strukturella flertydigheter hos prepositionsfraser. 
Överföring enligt majoritetsprincipen förbättrar uppmärkningen, jämfört med den grundläggande automatiska uppmärkningen, men att använda språkliga ledtrådar för att korrigera uppmärkningen innan majoritetsöverföring är ännu bättre, om än mer arbetskrävande. Vissa felaktiga strukturer kan dock inte korrigeras med hjälp av överföring, eftersom de olika språken använder olika formuleringar, och därmed har olika strukturer. treebank syntax alignment corpus annotation projection multilingual tagging parsing
69	NOVEL APPROACH TO STORAGE AND SORTING OF NEXT GENERATION SEQUENCING DATA FOR THE PURPOSE OF FUNCTIONAL ANNOTATION TRANSFER Candelli, Tito January 2012 The problem of functional annotation of novel sequences has been a significant issue for many laboratories that decided to apply next generation sequencing techniques to less studied species. In particular, experiments such as transcriptome analysis suffer heavily from this problem because their results cannot be placed in a relevant biological context. Several tools have been proposed to solve this problem through homology annotation transfer. The principle behind this strategy is that homologous genes share common functions in different organisms, and therefore annotations are transferable between these genes. Commonly, BLAST reports are used to identify a suitable homologous gene in a well annotated species and the annotation is then transferred from the homologue to the novel sequence. Not all homologues, however, possess valid functional annotations. The aim of this project was to devise an algorithm to process BLAST reports and provide a criterion to discriminate between homologues with a biologically informative and uninformative annotation, respectively. In addition, all data obtained from the BLAST report is to be stored in a relational database for ease of consultation and visualization. In order to test the solidity of the system, we utilized 750 novel sequences obtained through application of next generation sequencing techniques to Avena sativa samples. This species particularly suits our needs as it represents the typical target for homology annotation transfer: lack of a reference genome and difficulty in attributing functional annotation. The system was able to perform all the required tasks. 
Comparisons between best hits as determined by BLAST and best hits as determined by the algorithm showed a significant increase in the biological significance of the results when the algorithm sorting system was applied. homology annotation transfer blast parsing relational database functional information
70	MaltParser -- An Architecture for Inductive Labeled Dependency Parsing Hall, Johan January 2006 This licentiate thesis presents a software architecture for inductive labeled dependency parsing of unrestricted natural language text, which achieves a strict modularization of parsing algorithm, feature model and learning method such that these parameters can be varied independently. The architecture is based on the theoretical framework of inductive dependency parsing by Nivre (2006) and has been realized in MaltParser, a system that supports several parsing algorithms and learning methods, for which complex feature models can be defined in a special description language. Special attention is given in this thesis to learning methods based on support vector machines (SVM). The implementation is validated in three sets of experiments using data from three languages (Chinese, English and Swedish). First, we check if the implementation realizes the underlying architecture. The experiments show that the MaltParser system outperforms the baseline and satisfies the basic constraints of well-formedness. Furthermore, the experiments show that it is possible to vary parsing algorithm, feature model and learning method independently. Secondly, we focus on the special properties of the SVM interface. It is possible to reduce the learning and parsing time without sacrificing accuracy by dividing the training data into smaller sets, according to the part-of-speech of the next token in the current parser configuration. Thirdly, the last set of experiments presents a broad empirical study that compares SVM to memory-based learning (MBL) with five different feature models, where all combinations have gone through parameter optimization for both learning methods. The study shows that SVM outperforms MBL for more complex and lexicalized feature models with respect to parsing accuracy. 
There are also indications that SVM, with a splitting strategy, can achieve faster parsing than MBL. The parsing accuracy achieved is the highest reported for the Swedish data set and very close to the state of the art for Chinese and English. / Denna licentiatavhandling presenterar en mjukvaruarkitektur för datadriven dependensparsning, dvs. för att automatiskt skapa en syntaktisk analys i form av dependensgrafer för meningar i texter på naturligt språk. Arkitekturen bygger på idén att man ska kunna variera parsningsalgoritm, särdragsmodell och inlärningsmetod oberoende av varandra. Till grund för denna arkitektur har vi använt det teoretiska ramverket för induktiv dependensparsning presenterat av Nivre (2006). Arkitekturen har realiserats i programvaran MaltParser, där det är möjligt att definiera komplexa särdragsmodeller i ett speciellt beskrivningsspråk. I denna avhandling kommer vi att lägga extra tyngd vid att beskriva hur vi har integrerat inlärningsmetoden supportvektor-maskiner (SVM). MaltParser valideras med tre experimentserier, där data från tre språk används (kinesiska, engelska och svenska). I den första experimentserien kontrolleras om implementationen realiserar den underliggande arkitekturen. Experimenten visar att MaltParser utklassar en trivial metod för dependensparsning (eng. baseline) och de grundläggande kraven på välformade dependensgrafer uppfylls. Dessutom visar experimenten att det är möjligt att variera parsningsalgoritm, särdragsmodell och inlärningsmetod oberoende av varandra. Den andra experimentserien fokuserar på de speciella egenskaperna för SVM-gränssnittet. Experimenten visar att det är möjligt att reducera inlärnings- och parsningstiden utan att förlora i parsningskorrekthet genom att dela upp träningsdata enligt ordklasstaggen för nästa ord i nuvarande parsningskonfiguration. 
Den tredje och sista experimentserien presenterar en empirisk undersökning som jämför SVM med minnesbaserad inlärning (MBL). Studien använder sig av fem särdragsmodeller, där alla kombinationer av språk, inlärningsmetod och särdragsmodell har genomgått omfattande parameteroptimering. Experimenten visar att SVM överträffar MBL för mer komplexa och lexikaliserade särdragsmodeller med avseende på parsningskorrekthet. Det finns även vissa indikationer på att SVM, med en uppdelningsstrategi, kan parsa en text snabbare än MBL. För svenska kan vi rapportera den högsta parsningskorrektheten hittills och för kinesiska och engelska är resultaten nära de bästa som har rapporterats. Dependency Parsing Support Vector Machines Machine Learning Language technology Språkteknologi
