Global ETD Search

1	Conditional random fields with dynamic potentials for Chinese named entity recognition. January 2008 Wu, Yiu Kei. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2008. / Includes bibliographical references (p. 69-75). / Abstracts in English and Chinese. / Chapter 1 --- Introduction / Chapter 1.1 --- Chinese NER Problem / Chapter 1.2 --- Contribution of Our Proposed Framework / Chapter 2 --- Related Work / Chapter 2.1 --- Hidden Markov Models / Chapter 2.2 --- Maximum Entropy Models / Chapter 2.3 --- Conditional Random Fields / Chapter 3 --- Our Proposed Model / Chapter 3.1 --- Background / Chapter 3.1.1 --- Problem Formulation / Chapter 3.1.2 --- Conditional Random Fields / Chapter 3.1.3 --- Semi-Markov Conditional Random Fields / Chapter 3.2 --- The Formulation of Our Proposed Model / Chapter 3.2.1 --- The Main Principle / Chapter 3.2.2 --- The Detailed Formulation / Chapter 3.2.3 --- Adapting Features from Original CRF to CRFDP / Chapter 4 --- Experiments / Chapter 4.1 --- Datasets / Chapter 4.2 --- Features / Chapter 4.3 --- Evaluation Metrics / Chapter 4.4 --- Results and Discussion / Chapter 5 --- Conclusions and Future Work / Bibliography / A / B / C / Subjects: Computational linguistics; Random fields; Parsing (Computer grammar); Names, Chinese
2	The effects of echoic training on the emergence of naming in a second language by monolingual English-speaking preschool children. Cao, Yu. January 2016. I conducted two experiments to investigate the emergence of Naming in a second language by monolingual English-speaking preschool children who demonstrated Naming in English with non-contrived visual stimuli. In Experiment I, I tested for the presence of full echoic responses in Chinese with 32 monolingual English-speaking children. The participants were randomly assigned to two groups. Group I received echoic probes in Chinese phonemes with English approximations, while Group II received echoic probes in distinctive Chinese phonemes. Participants in both groups were probed for their echoic responses in English. Results showed that Group I outperformed Group II in the number of correct echoic responses in Chinese phonemes, suggesting that the number of correct echoic responses in Chinese was affected by the distinctiveness of the phonemes as well as participants’ echoic responses in English. In Experiment II, I tested the effects of echoic training on the acquisition of Naming in Chinese with contrived and non-contrived visual stimuli by eight monolingual English-speaking preschool children. A multiple probe design was implemented for experimental control. I conducted naming probes in English with contrived and non-contrived visual stimuli, as well as naming probes in Chinese with contrived and non-contrived visual stimuli with all participants. Six out of eight participants received echoic training, while the other two participants went through repeated probes due to a lack of stable responding. The intervention consisted of the experimenter teaching the participants to echo Chinese consonant-vowel combinations that shared the same phonemes with the probe sets but with different consonant-vowel combinations.
The participants were taught to say the target consonant-vowel combinations independently without an echoic model with 100% accuracy across three sessions during delayed probes in order to master a training set. Prior to the intervention, all participants demonstrated naming in English with non-contrived stimuli, but none of the participants demonstrated naming in English with contrived stimuli, or in Chinese with contrived or non-contrived stimuli. The results from post-intervention probes showed that echoic training was functionally related to the emergence of naming in Chinese with non-contrived stimuli for six participants, as well as naming in Chinese with contrived stimuli for five out of six participants. / Subjects: Names, Chinese; Education, Preschool; Learning, Psychology of; Preschool children; Psychology; Education, Bilingual; Special education
3	Probabilistic models for information extraction: from cascaded approach to joint approach. / CUHK electronic theses & dissertations collection. January 2010. Based on these observations and analysis, we propose a joint discriminative probabilistic framework to optimize all relevant subtasks simultaneously. This framework defines a joint probability distribution for both segmentations in sequence data and relations of segments in the form of an exponential family. This model allows tight interactions between segmentations and relations of segments and it offers a natural way for IE tasks. Since exact parameter estimation and inference are intractable, a structured variational inference algorithm is developed to perform parameter estimation approximately. For inference, we propose a strong bi-directional MH approach to find the MAP assignments for joint segmentations and relations to explore mutual benefits in both directions, such that segmentations can aid relations, and vice-versa. / Information Extraction (IE) aims at identifying specific pieces of information (data) in an unstructured or semi-structured textual document and transforming unstructured information in a corpus of documents or Web pages into a structured database. There are several representative tasks in IE: named entity recognition (NER), which aims at identifying phrases that denote types of named entities; entity relation extraction, which aims at discovering the events or relations related to the entities; and coreference resolution, which aims at determining whether two extracted mentions of entities refer to the same object. IE is useful for a wide variety of applications. / The end-to-end performance of high-level IE systems for compound tasks is often hampered by the use of cascaded frameworks. The integrated model we proposed can alleviate some of these problems, but it is only loosely coupled.
Parameter estimation is performed independently and it only allows information to flow in one direction. In this top-down integration model, the decision of the bottom sub-model could guide the decision of the upper sub-model, but not vice-versa. Thus, deep interactions and dependencies between different tasks are hard to capture. / We have investigated and developed a cascaded framework in an attempt to consider entity extraction and qualitative domain knowledge based on undirected, discriminatively-trained probabilistic graphical models. This framework consists of two stages and it is the combination of statistical learning and first-order logic. As a pipeline model, the first stage is a base model and the second stage is used to validate and correct the errors made in the base model. We incorporated domain knowledge that can be well formulated into first-order logic to extract entity candidates from the base model. We have applied this framework and achieved encouraging results in Chinese NER on the People's Daily corpus. / We perform extensive experiments on three important IE tasks using real-world datasets, namely Chinese NER, entity identification and relationship extraction from Wikipedia's encyclopedic articles, and citation matching, to test our proposed models, including the bidirectional model, the integrated model, and the joint model. Experimental results show that our models significantly outperform current state-of-the-art probabilistic models, such as decoupled and joint models, illustrating the feasibility and promise of our proposed approaches. (Abstract shortened by UMI.) / We present a general, strongly-coupled, and bidirectional architecture based on discriminatively trained factor graphs for information extraction, which consists of two components---segmentation and relation. First we introduce joint factors connecting variables of relevant subtasks to capture dependencies and interactions between them.
We then propose a strong bidirectional Markov chain Monte Carlo (MCMC) sampling inference algorithm which allows information to flow in both directions to find the approximate maximum a posteriori (MAP) solution for all subtasks. Notably, our framework is considerably simpler to implement, and outperforms previous ones. / Yu, Xiaofeng. / Adviser: Zam Wai. / Source: Dissertation Abstracts International, Volume: 72-04, Section: B, page: . / Thesis (Ph.D.)--Chinese University of Hong Kong, 2010. / Includes bibliographical references (leaves 109-123). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. Ann Arbor, MI : ProQuest Information and Learning Company, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese. / Subjects: Graphical modeling (Statistics); Names, Chinese; Random fields; Text processing (Computer science)
