1 |
Multilingual information retrieval on the world wide web: the development of a Cantonese-Dagaare-English trilingual electronic lexicon
Mok, Yuen-kwan, Sally., 莫婉君. January 2006
published_or_final_version / abstract / Linguistics / Master / Master of Philosophy
|
2 |
An investigation into lemmatization in Southern Sotho
Makgabutlane, Kelebohile Hilda 01 1900
Lemmatization is the process by which a lexicographer assigns a
place in a dictionary to the word form regarded as the most basic
among a set of related forms. In Bantu languages, formative elements
can be added to one another in an often seemingly interminable
series until quite long words are produced, which raises questions
for lemmatization. Given the productive nature of Southern Sotho,
it is interesting to observe how lexicographers handle the
morphological complexities they normally face when arranging
lexical items. This study shows that adhering to the traditional
method of alphabetization presents some difficulties. It does not
aim to propose solutions, but it does point out some considerations
that should be borne in mind in the process of lemmatization. / African Languages / M.A. (African Languages)
|
4 |
Lexicographic path searches for FPGA routing
So, Keith Kam-Ho, Computer Science & Engineering, Faculty of Engineering, UNSW January 2008
This dissertation reports on studies of the application of lexicographic graph searches to problems in FPGA detailed routing. Our contributions include the derivation of iteration limits for scalar implementations of negotiated congestion for standard floating point types and the identification of pathological cases for path choice. In the study of the routability-driven detailed FPGA routing problem, we show that universal detailed routability is NP-complete, based on a related proof by Lee and Wong. We describe the design of a lexicographic composition operator of totally-ordered monoids as path cost metrics and show its optimality under an adapted A* search. Our new router, CornNC, based on lexicographic composition of congestion and wirelength, established a new minimum track count for the FPGA Place and Route Challenge. For the problem of long-path timing-driven FPGA detailed routing, we show that long-path budgeted detailed routability is NP-complete by reduction to universal detailed routability. We generalise the lexicographic composition to any finite length and verify its optimality under A* search. The timing budget solution of Ghiasi et al. is applied to solve the long-path timing budget problem for FPGA connections. Our delay-clamped spiral lexicographic composition design, SpiralRoute, ensures that connection-based budgets are always met, and thus achieves timing closure whenever it successfully routes. For 113 test routing instances derived from standard benchmarks, SpiralRoute found 13 routable instances with timing closure that were unroutable by a scalar negotiated congestion router and achieved timing closure in another 27 cases where the scalar router did not, at the expense of increased runtime.
We also study techniques to improve SpiralRoute runtimes, including a data structure of a trie augmented by data stacks for minimum element retrieval, and the technique of step tomonoid elimination in reducing the retrieval depth in a trie of stacks structure.
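The idea of a lexicographically composed path cost can be illustrated with a small sketch. It is not the CornNC or SpiralRoute implementation: it assumes each edge carries a hypothetical (congestion, wirelength) pair of non-negative numbers, adds pairs component-wise, compares them lexicographically, and runs a Dijkstra-style search (A* with a zero heuristic). The names `lex_add`, `lex_shortest_path`, and `GRAPH` are invented for illustration.

```python
import heapq


def lex_add(a, b):
    """Combine two composite costs component-wise (the assumed monoid operation)."""
    return tuple(x + y for x, y in zip(a, b))


def lex_shortest_path(graph, source, target, zero=(0, 0)):
    """Dijkstra-style search over tuple costs compared lexicographically.

    graph maps node -> list of (neighbour, cost_tuple); all cost
    components are assumed non-negative.
    Returns (cost, path), or (None, []) if the target is unreachable.
    """
    best = {source: zero}
    parent = {source: None}
    frontier = [(zero, source)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if cost > best[node]:
            continue  # stale queue entry, a cheaper cost was already found
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return cost, path[::-1]
        for nxt, edge_cost in graph.get(node, []):
            new_cost = lex_add(cost, edge_cost)
            # Python tuples compare lexicographically: first component
            # decides, later components only break ties.
            if nxt not in best or new_cost < best[nxt]:
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(frontier, (new_cost, nxt))
    return None, []


# Toy graph: the route via "a" is longer but uncongested, the route
# via "b" is short but passes through one congested edge.
GRAPH = {
    "s": [("a", (0, 5)), ("b", (1, 1))],
    "a": [("t", (0, 5))],
    "b": [("t", (0, 1))],
}
```

In this toy graph, `lex_shortest_path(GRAPH, "s", "t")` prefers the uncongested route `s, a, t` with cost `(0, 10)` over the shorter but congested route with cost `(1, 2)`, because wirelength is consulted only when congestion ties.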
|
6 |
Improving Topic Tracking with Domain Chaining
Yang, Li 08 1900
Topic Detection and Tracking (TDT) research has produced some successful statistical tracking systems. While lexical chaining, a non-statistical approach, has also been applied to the task of tracking by Carthy and Stokes for the 2001 TDT evaluation, an efficient tracking system based on this technology has yet to be developed. In this thesis we investigate two new techniques which can improve Carthy's original design. First, at the core of our system is a semantic domain chainer. This chainer relies not only on the WordNet database for semantic relationships but also on Magnini's semantic domain database, which is an extension of WordNet. The domain-chaining algorithm runs in linear time. Second, to handle proper nouns, we gather all of those that occur in a news story into a single chain reserved for proper nouns. We also discuss the linguistic limitations of lexical chainers in representing textual meaning.
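The two techniques described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the thesis's chainer: `TOY_DOMAINS` is a hypothetical hand-made lookup standing in for a semantic domain database, and proper nouns are detected with a crude capitalisation test that would misfire on sentence-initial words.

```python
from collections import defaultdict

# Hypothetical toy lookup; a real system would query a semantic
# domain database rather than a hand-written dictionary.
TOY_DOMAINS = {
    "bank": "economy",
    "loan": "economy",
    "interest": "economy",
    "match": "sport",
    "goal": "sport",
}

PROPER_NOUN_CHAIN = "PROPER_NOUNS"


def build_chains(tokens, domains=TOY_DOMAINS):
    """Group tokens into one chain per domain in a single linear pass.

    Capitalised tokens are treated as proper nouns (an assumption made
    for this sketch) and collected in one reserved chain.
    Tokens with no known domain are skipped.
    """
    chains = defaultdict(list)
    for token in tokens:
        if token[:1].isupper():
            chains[PROPER_NOUN_CHAIN].append(token)
            continue
        domain = domains.get(token.lower())
        if domain is not None:
            chains[domain].append(token)
    return dict(chains)


story = ["Reuters", "bank", "raised", "loan", "interest",
         "before", "the", "match", "Arsenal", "goal"]
chains = build_chains(story)
```

Each token is visited once and assigned with a dictionary lookup, so the pass is linear in the length of the story. Two stories could then be compared by how much their chains overlap, although the scoring used in the thesis is not given here.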
|
7 |
Statistical modeling for lexical chains for automatic Chinese news story segmentation. January 2010
Chan, Shing Kai. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2010. / Includes bibliographical references (leaves 106-114). / Abstracts in English and Chinese.
Abstract
Acknowledgements
Chapter 1 --- Introduction
Chapter 1.1 --- Problem Statement
Chapter 1.2 --- Motivation for Story Segmentation
Chapter 1.3 --- Terminologies
Chapter 1.4 --- Thesis Goals
Chapter 1.5 --- Thesis Organization
Chapter 2 --- Background Study
Chapter 2.1 --- Coherence-based Approaches
Chapter 2.1.1 --- Defining Coherence
Chapter 2.1.2 --- Lexical Chaining
Chapter 2.1.3 --- Cosine Similarity
Chapter 2.1.4 --- Language Modeling
Chapter 2.2 --- Feature-based Approaches
Chapter 2.2.1 --- Lexical Cues
Chapter 2.2.2 --- Audio Cues
Chapter 2.2.3 --- Video Cues
Chapter 2.3 --- Pros and Cons and Hybrid Approaches
Chapter 2.4 --- Chapter Summary
Chapter 3 --- Experimental Corpora
Chapter 3.1 --- The TDT2 and TDT3 Multi-language Text Corpus
Chapter 3.1.1 --- Introduction
Chapter 3.1.2 --- Program Particulars and Structures
Chapter 3.2 --- Data Preprocessing
Chapter 3.2.1 --- Challenges of Lexical Chain Formation on Chinese Text
Chapter 3.2.2 --- Word Segmentation for Word Units Extraction
Chapter 3.2.3 --- Part-of-speech Tagging for Candidate Words Extraction
Chapter 3.3 --- Chapter Summary
Chapter 4 --- Indication of Lexical Cohesiveness by Lexical Chains
Chapter 4.1 --- Lexical Chain as a Representation of Cohesiveness
Chapter 4.1.1 --- Choice of Word Relations for Lexical Chaining
Chapter 4.1.2 --- Lexical Chaining by Connecting Repeated Lexical Elements
Chapter 4.2 --- Lexical Chain as an Indicator of Story Segments
Chapter 4.2.1 --- Indicators of Absence of Cohesiveness
Chapter 4.2.2 --- Indicator of Continuation of Cohesiveness
Chapter 4.3 --- Chapter Summary
Chapter 5 --- Indication of Story Boundaries by Lexical Chains
Chapter 5.1 --- Formal Definition of the Classification Procedures
Chapter 5.2 --- Theoretical Framework for Segmentation Based on Lexical Chaining
Chapter 5.2.1 --- Evaluation of Story Segmentation Accuracy
Chapter 5.2.2 --- Previous Approach of Story Segmentation Based on Lexical Chaining
Chapter 5.2.3 --- Statistical Framework for Story Segmentation based on Lexical Chaining
Chapter 5.2.4 --- Post Processing of Ratio for Boundary Identification
Chapter 5.3 --- Comparing Segmentation Models
Chapter 5.4 --- Chapter Summary
Chapter 6 --- Analysis of Lexical Chains Features as Boundary Indicators
Chapter 6.1 --- Error Analysis
Chapter 6.2 --- Window Length in the LRT Model
Chapter 6.3 --- The Relative Importance of Each Set of Features
Chapter 6.4 --- The Effect of Removing Timing Information
Chapter 6.5 --- Chapter Summary
Chapter 7 --- Conclusions and Future Work
Chapter 7.1 --- Contributions
Chapter 7.2 --- Future Works
Chapter 7.2.1 --- Further Extension of the Framework
Chapter 7.2.2 --- Wider Applications of the Framework
Bibliography
|