11 |
On-line Chinese character recognition using tree classifier approach. / Online Chinese character recognition using tree classification approach
January 1993
by Wong Tsz Kin. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1993. / Includes bibliographical references (leaves 45-47). / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Characteristics of Chinese Character --- p.2 / Chapter 1.1.1 --- The Nature of Chinese Language --- p.2 / Chapter 1.1.2 --- The Structure of Chinese Characters --- p.3 / Chapter 1.1.3 --- Basic Writing Strokes --- p.3 / Chapter 1.1.4 --- Writing Stroke Sequencing --- p.3 / Chapter 1.1.5 --- Geographic Structure of Components --- p.4 / Chapter 1.2 --- Stroke Distribution of Chinese Characters --- p.5 / Chapter 1.3 --- Radical --- p.5 / Chapter 1.4 --- Overview --- p.6 / Chapter 1.5 --- Objective --- p.10 / Chapter 2 --- Preprocessing --- p.12 / Chapter 2.1 --- Smoothing and Sampling --- p.12 / Chapter 2.2 --- Interpolation --- p.13 / Chapter 2.3 --- Dehooking --- p.13 / Chapter 2.4 --- Normalization --- p.14 / Chapter 2.5 --- Stroke Segmentation --- p.15 / Chapter 3 --- Preclassification --- p.18 / Chapter 3.1 --- Feature Analysis --- p.18 / Chapter 3.2 --- Radical Detection --- p.20 / Chapter 3.3 --- Description of The Preclassification Component --- p.22 / Chapter 3.4 --- Results and Conclusions --- p.23 / Chapter 4 --- The Recognition Stage --- p.25 / Chapter 4.1 --- Introduction --- p.25 / Chapter 4.2 --- Stroke Match Algorithm --- p.26 / Chapter 4.3 --- Relation Match Stage --- p.30 / Chapter 4.3.1 --- Introduction --- p.30 / Chapter 4.4 --- Final Classification --- p.35 / Chapter 5 --- Results and Conclusions --- p.39 / Chapter 5.1 --- Experiment Results --- p.39 / Chapter 5.2 --- Analysis --- p.39 / Chapter 5.3 --- Conclusions
|
12 |
Design and implementation of multistage tree classifier for Chinese character recognition.
January 1992
Yeung Lap Kei. / Thesis (M.Sc.)--Chinese University of Hong Kong, 1992. / Includes bibliographical references (leaves [14-15]). / PREFACE / ABSTRACT / CONTENT / Chapter §1. --- INTRODUCTION / Chapter §1.1 --- The Chinese language --- p.1 / Chapter §1.2 --- Chinese information processing system --- p.2 / Chapter §1.3 --- Chinese character recognition --- p.4 / Chapter §1.4 --- Multi-stage tree classifier Vs Single-stage tree classifier in Chinese character recognition --- p.6 / Chapter §1.5 --- Decision Tree / Chapter §1.5.1 --- Basic Terminology of a decision tree --- p.7 / Chapter §1.5.2 --- Structure design of a decision tree --- p.10 / Chapter §1.6 --- Motivation of the project --- p.12 / Chapter §1.7 --- Objects of the project --- p.14 / Chapter §1.8 --- Development environment --- p.14 / Chapter §2. --- APPROACH 1 - UNSUPERVISED LEARNING --- p.15 / Chapter §3. --- APPROACH 2 - SUPERVISED LEARNING / Chapter §3.1 --- Idea --- p.17 / Chapter §3.2 --- The 3 Corner Code --- p.20 / Chapter §3.3 --- Feature Extraction & Selection --- p.22 / Chapter §3.4 --- Decision at Each Node / Chapter §3.4.1 --- Statistical Linear Discriminant Analysis --- p.22 / Chapter §3.4.2 --- Optimization of the Number of Misclassification --- p.24 / Chapter §3.5 --- Implementation / Chapter §3.5.1 --- Training Data --- p.36 / Chapter §3.5.2 --- Clustering with the Use of SAS --- p.38 / Chapter §3.5.3 --- Building the Decision Trees --- p.42 / Chapter §3.5.4 --- Description of the Classifier --- p.45 / Chapter §3.6 --- Experiments and Testing Result / Chapter §3.6.1 --- Performance Parameters being Measured --- p.47 / Chapter §3.6.2 --- Testing by Resubstitution Method --- p.50 / Chapter §3.6.3 --- Noise Model --- p.52 / Chapter §4. --- POSSIBLE IMPROVEMENT --- p.55 / Chapter §5. --- EXPERIMENTAL RESULTS & THE IMPROVED MULTISTAGE CLASSIFIER / Chapter §5.1 --- Experimental Results --- p.59 / Chapter §5.2 --- Conclusion --- p.70 / Chapter §6. 
--- IMPROVED MULTISTAGE TREE CLASSIFIER / Chapter §6.1 --- The Optimal Multistage Tree Classifier --- p.72 / Chapter §6.2 --- Performance Analysis --- p.73 / Chapter §7. --- FURTHER DISCRIMINATION BY CONTEXT CONSIDERATION / Chapter §7.1 --- Idea --- p.76 / Chapter §7.2 --- Description of Algorithm --- p.78 / Chapter §7.3 --- Performance Analysis --- p.81 / Chapter §8. --- CONCLUSION / Chapter §8.1 --- Advantage of the Classifier --- p.84 / Chapter §8.2 --- Limitation of the Classifier --- p.85 / Chapter §9. --- AREA OF FUTURE RESEARCH AND IMPROVEMENT / Chapter §9.1 --- Detailed Analysis at Each Terminal Node --- p.86 / Chapter §9.2 --- Improving the Noise Filtering Technique --- p.87 / Chapter §9.3 --- The Use of 4 Corner Code --- p.88 / Chapter §9.4 --- Increase in the Dimension of the Feature Space --- p.90 / Chapter §9.5 --- 1-Tree Protocol with Entropy Reduction --- p.91 / Chapter §9.6 --- The Use of Human Intelligence --- p.92 / APPENDICES / Chapter A.1 --- K-MEANS / Chapter A.2 --- Unsupervised Learning Approach / Chapter A.3 --- Other Algorithms (Maximum Distance & ISODATA) / Chapter A.4 --- Possible Improvement / Chapter A.5 --- Theories on Statistical Discriminant Analysis / Chapter A.6 --- Passage used in Testing the Performance of the Classifier with Context Consideration / Chapter A.7 --- A Partial List of Semantically Related Chinese Characters / Chapter A.8 --- An Example of Misclassification Table / Chapter A.9 --- "Listing of the Program ""CHDIS.C""" / REFERENCE
|
13 |
A new approach to the generation of Gray scale Chinese fonts.
January 1993
by Poon Chi-cheung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1993. / Includes bibliographical references (leaves 82-84). / Abstract / Acknowledgments / Preface / Chapter Chapter 1: --- Font Systems --- p.1 / Representations of Character Images --- p.1 / Characteristics of Chinese Font System --- p.3 / Large Character Set --- p.3 / Condensed Strokes --- p.4 / Low Repetition Rate --- p.5 / WYSIWYG (What You See Is What You Get) --- p.6 / Chapter Chapter 2: --- Human Visual System and Gray Scale Font --- p.9 / Human Visual System --- p.9 / Physiology --- p.9 / Spatial Frequencies --- p.10 / How much resolution is enough --- p.11 / Screen and Printer --- p.12 / Raster Display Devices --- p.13 / Printer --- p.14 / Resolution --- p.15 / Gray Scale Font --- p.15 / Generation of Gray Scale Font --- p.18 / Chapter Chapter 3: --- Digital Filtering Method for Gray Scale Font --- p.19 / Filtering Process --- p.19 / Weighted Functions --- p.21 / Generation of Gray Scale Character --- p.23 / Results --- p.24 / More Experiments --- p.24 / Problems --- p.26 / Speed and Storage --- p.26 / Impression of Strokes --- p.27 / Thin strokes in the small-size character --- p.30 / New Approach to Generate Gray Scale Font --- p.30 / Chapter Chapter 4: --- Rasterization Algorithms --- p.32 / Outline Font --- p.32 / TrueType Font --- p.33 / Scan Conversion --- p.35 / Basic Outline-to-Bitmap Conversion --- p.35 / Scan-converting Polygon --- p.36 / Rasterization of a character --- p.36 / Intersecting Points and Ranges --- p.37 / Straight Lines --- p.37 / Quadratic Bezier Curves --- p.38 / Implementation Techniques --- p.39 / Approximation of quadratic Bezier curve by straight lines --- p.39 / Simplification of the Filling Process --- p.41 / The Rasterization Algorithm --- p.45 / Chapter Chapter 5: --- Direct Rasterization with Gray Scale --- p.46 / Rasterization with Gray Scale --- p.46 / Determination of Gray Value of Boundary-pixel --- p.50 / Preliminary Results --- p.54 / Hinting 
--- p.56 / Rasterization with Hinting --- p.56 / Strokes Migration --- p.57 / Hints Finding --- p.59 / Chapter Chapter 6: --- Results and Conclusion --- p.62 / Quality --- p.66 / Comparison with Black-and-White Character --- p.66 / Hinted Against Unhinted --- p.71 / Generation Speeds --- p.75 / Discussion and Comments --- p.78 / Practical Font System --- p.79 / Conclusion --- p.80 / Bibliography --- p.82
|
14 |
Codes of Modernity: Infrastructures of Language and Chinese Scripts in an Age of Global Information Revolution
Kuzuoglu, Ulug January 2018
This dissertation explores the global history of Chinese script reforms, the effort to phoneticize the Chinese language and/or simplify the writing system, from its inception in the 1890s to its demise in the 1980s. These reforms took place at the intersection of industrialization, colonialism, and new information technologies, such as alphabet-based telegraphy and breakthroughs in printing technologies. As these social and technological transformations put unprecedented pressure on knowledge management and the use of mental and clerical labor, many Chinese intellectuals claimed that learning Chinese characters consumed too much time and mental energy. Chinese script reforms, this dissertation argues, were an effort to increase speed in producing, transmitting, and accessing information, and thus meet the demands of the industrializing knowledge economy.
The industrializing knowledge economy that this dissertation explores was built on and sustained by a psychological understanding of the human subject as a knowledge machine, and it was part of a global moment in which the optimization of labor in knowledge production was a key concern for all modernizing economies. While Chinese intellectuals were inventing new signs of inscription, American behavioral psychologists, Soviet psycho-economists, and Central Asian and Ottoman technicians were all experimenting with new scripts in order to increase mental efficiency and productivity. This dissertation reveals the intimate connections between the Chinese and non-Chinese script engineering projects that were taking place synchronically across the world. The chapters of this work demonstrate for the first time, for instance, that the simplification of Chinese characters in the 1920s and 1930s was intimately connected to the discipline of behavioral psychology in the US. The first generation of Chinese psychologists employed the American psychologists’ methods to track eye movements, count word-frequencies, and statistically analyze the speed of reading, writing, and memorizing in order to simplify and “rationalize” the Chinese writing system in an effort to discipline and optimize mental labor. Other chapters explore the issue of mental and clerical optimization by finding the origins of the Chinese Latin Alphabet (CLA), the mother of pinyin, in hitherto unknown Eurasian connections. The CLA, as the pages of this work show, was the product of a transnational exchange that involved Ottoman and Transcaucasian typographers as well as Russian engineers and Chinese communists who sought efficiency in knowledge production through inventing new scripts.
Situating the Chinese script reforms at this global intersection of psychology, economy, and linguistics, this dissertation examines the global connections and forces that turned the human subject into a knowledge worker who was cognitively managed through education, literacy, propaganda, and other measures of organizing information, all of which had the script at the center.
The search for efficiency and productivity, the core values of industrialism, lay at the heart of script reforms in China, but this search was inseparable from linguistic orders and political ambitions. Even if writing, transmitting, and learning a phonetic script could theoretically be easier and more efficient than the Chinese characters, the alphabet opened a veritable Pandora’s Box around the issue of selection: given the complex linguistic landscape in China, which speech was a phonetic script supposed to represent? There were myriad languages spoken throughout the empire and the subsequent nation-state, most of which were mutually incomprehensible. Mandarin as spoken in Beijing was different from that spoken in the south, and “topolects” or regional languages such as Min or Cantonese were to Mandarin what Romanian is to English. As a linguistic life-or-death issue, phonetic scripts stood for the infrastructural possibilities and limitations in the representation of speeches. Some scripts, such as Lao Naixuan’s phonetic script composed of more than a hundred signs, were capable of representing multiple Mandarin and non-Mandarin speeches, whereas others, such as the Phonetic Symbols, which have only thirty-seven syllabic signs, represented only one speech, i.e., Mandarin. Using Mandarin-oriented scripts to transcribe non-Mandarin speeches was like writing English with fifteen letters, hence the acrimonious disputes that fill the pages of this dissertation. Succinctly put, it was at the level of script invention that Chinese and non-Chinese actors engineered different infrastructures not only for laboring minds but also for the social world of Chinese languages. The history of information technologies and knowledge economy in China was thus inseparable from the world of speech and language, as each script offered a new potential to reassemble the written matter and the speaking mind in a different way.
“Codes of Modernity” thus conceptualizes the script itself as an infrastructural medium. A script was not merely a passive carrier of information, but an existential artifact. Building on an expanding literature on infrastructures, it endorses the observation that infrastructures, technologies, and the social world around them work in a recursive loop. An infrastructure is not just the physical object that permits the flow of information, goods, ideas, and people, but a sociotechnical product that enables the experience of culture, while imposing constraints on it at the same time. Like electricity grids, transportation systems, and sewage canals, the experience of scripts as infrastructures is the experience of thought worlds. After a long tradition of structuralism and poststructuralism that sought to understand the world through the semiotic prism of language, “Codes of Modernity” argues that it is time for an infrastructuralism that excavates the indispensable media that enable the production of language and thought.
|
15 |
華語教學中漢字書寫與字感建立之研究 / On writing Chinese characters and building Chinese character perception (zìgăn) in teaching Chinese as a second language
楊惠雯, Yang, Huei Wen Unknown Date
本研究旨在透過漢字字感教學法,試圖解決前人研究中各式漢字教學法之侷限性,進而發展出能在有限華語教學時數內完成、具有教學成效、符合各類漢字字源演變與特質,且能引起學習者動機,建立字感的漢字教學設計。本研究對字感的定義也同樣是教學目的:學習者經過漢字教學,掌握漢字的大概念後,將所學的知識應用到未學過的漢字上。學習者因而能夠有系統的分析、推測新字的形音義,或者有能力檢視漢字的形音義是否正確合理,如此有助於增進漢字學習效率。
本研究採用教學實驗法,以自編之漢字字感教材,連續十週開設免費班課程,每次上課50分鐘,對初、中級華語學習者進行教學實驗。教材字例以教育部華測會主辦之華語文能力測驗《基礎八百詞》中出現的漢字為主,總共分為五個主題:象形字例教學、指事字例教學、會意字例教學、形聲字例教學、假借字例教學。各教學主題內容主要分為:(1)教學前教師漢字知能建立與教案設計、(2)教學中活動操作步驟、(3)教學後學習評量施測與檢討、(4)學習者課程回饋單。本研究實驗課程合計教授142個漢字。
本研究主要結果如下:
一、字感教學確有教學成效。字感教學可建立教師正確的文字學知識與漢字釋義能力,並協助教師在有限教學時間內,運用本身知能有效率的進行漢字教學,減少學習者學習負擔。
二、字感教學符合教學需求與學習需要。字感教學透過為教育部華語文能力測驗(TOCFL)測驗公布之《基礎八百詞》中常用漢字量身打造教學活動,可以符合華語文教師實際教學需求、學習者學習需要,且讓學習者願意接受、提高學習興趣。
三、字感教學可引起學習動機,有助後續漢字學習。字感教學中的漢字書寫教學讓華語文學習者建立推測漢字「字音、字形、字義」的判斷、自我糾正、自主學習能力,破除漢字難學之迷思。經過有步驟、有系統、有意義、有樂趣、有文化的字感教學後,從客觀的學習評量分析可發現學習者確實能將課堂所學應用至推測與分析未學過的漢字,且可提升華語文學習者漢字書寫能力,從根本改善「動口不動手」的學習結果。
最後,本研究對往後教學實驗可修正與改進的部分提出建議,並期許藉由字感教學,讓全球華語熱因漢字的特色與文化更熱,讓世界各國感受到中華文化的美、智慧與溫度。 / The Chinese Character Perception Teaching Approach was developed to overcome the limitations of existing approaches to teaching Chinese characters. Because only limited time is devoted to character teaching in teaching Chinese as a second language (CSL), a set of lesson plans was designed to be efficient, to fit the different origins and properties of characters in the six categories of Chinese characters (liùshū), to arouse learners’ motivation, and to build a solid perception of Chinese characters. The goal of the teaching approach, which also serves as a more detailed definition of Chinese character perception, is this: after receiving character instruction and mastering the big idea of each category of Chinese characters, learners can transfer their knowledge to characters they have not yet learned. Learners can thus analyze characters systematically, connect the sound, meaning, and structure of characters, and check whether a character is correctly written or pronounced according to its properties, which in turn improves the efficiency of character learning.
To test the effects of the Chinese Character Perception Teaching Approach, self-designed lesson plans and teaching materials were used with beginning and intermediate learners in a CSL classroom. The experiment lasted ten weeks, with 50 minutes per session. The characters chosen for the experiment come from the “Standard 800 Phrases,” one of the bases of the Test of Chinese as a Foreign Language (TOCFL). The teaching experiment was divided into five themes: pictographs, self-explanatory characters, associative compounds, pictophonetic characters, and phonetic loan characters. Each theme’s lesson plan contains: 1. before teaching: building the teacher’s competence in Chinese characters; 2. during teaching: the activities and steps of teaching; 3. after teaching: assessment and review; 4. student feedback sheets and analysis of teaching efficiency. In total, 142 Chinese characters were taught in the experiment.
The main results of this study are as follows:
1. The Chinese Character Perception Teaching Approach proved effective and practical. It provides teachers with sound knowledge of etymology and the competence to explain the big ideas of the different Chinese character categories in a way that is comprehensible to beginning and intermediate learners.
2. The Chinese Character Perception Teaching Approach meets the needs of both teaching and learning. The approach is tailored to the requirements of practical teaching and helps learners prepare for the TOCFL. In addition, the feedback sheets show that students’ interest in Chinese characters and related cultural issues was aroused.
3. The Chinese Character Perception Teaching Approach can help learners analyze characters systematically and connect the sound, meaning, and structure of characters even before those characters are taught. Through adequate writing practice, students also become familiar with the strokes of Chinese characters, so the characters they write are not only correct but also better looking.
Finally, the research offers suggestions for modifying and improving the Chinese Character Perception Teaching Approach. The author expects that this effective and interesting way of teaching characters will dispel the myth that “Chinese characters are hard to learn” and let students from all over the world truly feel the warmth and beauty of Chinese language and culture.
|
16 |
Breaking the learning barrier of Chinese Changjei input method
Wong Kun-wing, Peter., 黃冠榮. January 1998
published_or_final_version / Education / Master / Master of Education
|
17 |
Chinese character synthesis : towards universal Chinese information exchange
Yiu, Lai Kuen Candy 01 January 2003
No description available.
|
18 |
Applications of neural networks for industrial and office automation
葉慶輝, Yip, Hing-fai, Devil. January 2001
published_or_final_version / Industrial and Manufacturing Systems Engineering / Doctoral / Doctor of Philosophy
|
19 |
Learning Chinese keyboarding skill: Cangjie input method
Chan, Kam-kong, Angus, 陳錦江 January 2006
published_or_final_version / Education / Master / Master of Science in Information Technology in Education
|
20 |
Hemispheric processing in reading Chinese characters : statistical, experimental, and cognitive modeling
Hsiao, Janet Hui-wen January 2006
In Chinese orthography, phonetic compounds comprise about 80% of the most frequent characters. They contain separate phonological and semantic elements, referred to as phonetic and semantic radicals respectively. A dominant type exists in which the semantic radical appears on the left and the phonetic radical on the right (SP characters); an opposite, minority structure also exists in which the semantic radical appears on the right and the phonetic radical on the left (PS characters). Through statistical analyses, connectionist modelling, behavioural experiments, and neuroimaging studies, this dissertation demonstrates that the distinct structures of these two types of characters allow us crucial insights into the relationship between brain structure and reading processes. The statistical analyses of a Chinese lexical database show that, because of the different information profiles of SP and PS characters and the imbalanced distribution between them in the lexicon, the overall information is skewed to the right. This information skew provides important opportunities to examine the interaction between foveal splitting and the information structure of the characters. The foveal splitting hypothesis assumes a vertical meridian split in the foveal representation and the consequent contralateral projection to the two cerebral hemispheres; it has been shown to have important implications for visual word recognition. The square shape and the condensed structure of Chinese characters make them a severe test case for the split fovea claim. Through a lateralized cueing examination and a TMS study of the semantic radical combinability effect with foveally presented characters in character semantic judgements, a flexible division of labour between the hemispheres in character recognition is demonstrated, with each hemisphere responding optimally to the information in the contralateral visual hemifield.
The interaction between stimulation site and radical combinability in the TMS study also provides further support for the split fovea claim, suggesting functional foveal splitting as a universal processing constraint in reading. Even if foveal splitting is true, it is still unclear how far the effects of foveal splitting can extend from the retina into the process of character recognition. We show that, in naming isolated, foveally presented SP and PS characters, adult male and female readers process them differently, with opposite patterns of ease and difficulty: males responded significantly faster to SP than PS characters; females showed a non-significant tendency in the opposite direction. This result is also supported by a corresponding ERP study showing larger N350 amplitude elicited by PS characters than SP characters in the male brain, and an opposite pattern in the female brain. The split fovea claim suggests that the two halves of a centrally fixated character are initially processed in different hemispheres. The male brain typically relies more on the left hemisphere for phonological processing compared with the female brain, causing this gender difference to emerge. This interaction is also predicted by an implemented computational model, contrasting a split cognitive architecture, in which the mapping from orthography to phonology is mediated by two partially encapsulated, interconnected processing domains, and a non-split cognitive architecture, in which the mapping is mediated by a single, undifferentiated processing domain. Thus, the effects of foveal splitting in reading extend far enough to interact with the gender of the reader in a naturalistic reading task.
In short, this dissertation demonstrates that foveal splitting is a universal language processing phenomenon, precise enough to project the two radicals of a centrally-fixated Chinese character to different hemispheres to allow a flexible division of labour between the two hemispheres to emerge, and its effects in reading extend far enough into word recognition to interact with the gender of the reader in a naturalistic reading task. The results can also be extrapolated to Chinese word and sentence processing as well as to other languages. This dissertation thus has contributed to a better understanding of the relationship between brain structure and language processes.
|