  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Computational approaches to figurative language

Shutova, Ekaterina. January 2011
No description available.
2

Tree encoding of speech signals at low bit rates

Chu, Chung Cheung. January 1986
No description available.
3

The word segmentation & part-of-speech tagging system for the modern Chinese

January 1994
Liu Hon-lung. Title also in Chinese characters. Thesis (M.Phil.)--Chinese University of Hong Kong, 1994. Includes bibliographical references (leaves [58-59]).

Contents:
Chapter 1. Introduction (p. 1)
Chapter 2. Word Segmentation and Part-of-Speech Tagging: Techniques, Current Researches and The Embraced Problems (p. 6)
Chapter 2.1. Various Methods on Word Segmentation and Part-of-Speech Tagging (p. 6)
Chapter 2.2. Current Researches on Word Segmentation and Part-of-Speech Tagging (p. 9)
Chapter 2.3. Embraced Problems in Word Segmentation and Part-of-Speech Tagging (p. 9)
Chapter 3. Branch-and-Bound Algorithm for Combinational Optimization of the Probabilistic Scoring Function (p. 15)
Chapter 3.1. Definition of Word Segmentation and Part-of-Speech Tagging (p. 15)
Chapter 3.2. Framework (p. 17)
Chapter 3.3. Weight Assignment, Intermediate Score Computation & Optimization (p. 20)
Chapter 4. Implementation Issues of the Proposed Word Segmentation and Part-of-Speech Tagging System (p. 26)
Chapter 4.1. Design of System Dictionary and Data Structure (p. 30)
Chapter 4.2. Training Process (p. 33)
Chapter 4.3. Tagging Process (p. 35)
Chapter 4.4. Tagging Samples of the Word Segmentation & Part-of-Speech Tagging System (p. 39)
Chapter 5. Experiments on the Proposed Word Segmentation and Part-of-Speech Tagging System (p. 41)
Chapter 5.1. Closed Test (p. 41)
Chapter 5.2. Open Test (p. 42)
Chapter 6. Testing and Statistics (p. 43)
Chapter 7. Conclusions and Discussions (p. 47)
References
Appendices: A: sysdict.tag Sample; B: econ.tag Sample; C: open.tag Sample; D: 漢語分詞及詞性標注系統 for Windows; E: Neural Network
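Chapter 3 of the outline above centres on a branch-and-bound search over a probabilistic scoring function for joint word segmentation and part-of-speech tagging. The thesis's actual dictionary, weights and scoring function are not reproduced in this listing, so the following is only a minimal Python sketch of the general idea: candidate (word, tag) paths are scored with toy unigram log-probabilities, and branches are pruned once even an optimistic bound cannot beat the best complete path found so far. The lexicon entries and probabilities are invented for illustration.

```python
import math

# Toy lexicon: word -> {POS tag: probability}. Purely illustrative values,
# not drawn from the thesis's system dictionary (sysdict.tag).
LEXICON = {
    "现代": {"JJ": 0.9, "NN": 0.1},
    "汉语": {"NN": 1.0},
    "分词": {"NN": 0.6, "VV": 0.4},
    "系统": {"NN": 1.0},
}

def score(word, tag):
    """Log-probability score of assigning `tag` to `word` (unigram only)."""
    return math.log(LEXICON[word][tag])

def best_path(sentence):
    """Branch-and-bound search over all dictionary segmentations and taggings.

    Since log-probabilities are <= 0, the accumulated score of a partial path
    is an optimistic bound on any of its completions; a branch is pruned when
    that bound cannot beat the best complete path found so far.
    """
    best = {"score": -math.inf, "path": None}

    def expand(pos, path, acc):
        if acc <= best["score"]:
            return  # prune: no completion can improve on the incumbent
        if pos == len(sentence):
            best["score"], best["path"] = acc, list(path)
            return
        for end in range(pos + 1, len(sentence) + 1):
            word = sentence[pos:end]
            if word not in LEXICON:
                continue
            for tag in LEXICON[word]:
                path.append((word, tag))
                expand(end, path, acc + score(word, tag))
                path.pop()

    expand(0, [], 0.0)
    return best["path"], best["score"]

if __name__ == "__main__":
    print(best_path("现代汉语分词系统"))
```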
4

Machine Learning Methods for Articulatory Data

Berry, Jeffrey James. January 2012
Humans make use of more than just the audio signal to perceive speech. Behavioral and neurological research has shown that a person's knowledge of how speech is produced influences what is perceived. With methods for collecting articulatory data becoming more ubiquitous, methods for extracting useful information are needed to make this data useful to speech scientists and for speech technology applications. This dissertation presents feature extraction methods for ultrasound images of the tongue and for data collected with an Electro-Magnetic Articulograph (EMA). The usefulness of these features is tested in several phoneme classification tasks.

The feature extraction methods for ultrasound tongue images presented here consist of automatically tracing the tongue surface contour using a modified Deep Belief Network (DBN) (Hinton et al. 2006), and of methods inspired by research in face recognition which use the entire image. The tongue tracing method consists of training a DBN as an autoencoder on concatenated images and traces, and then retraining the first two layers to accept only the image at runtime. This 'translational' DBN (tDBN) method is shown to produce traces comparable to those made by human experts. An iterative bootstrapping procedure is presented for using the tDBN to assist a human expert in labeling a new data set. Tongue contour traces are compared with the Eigentongues method of Hueber et al. (2007) and a Gabor Jet representation in a 6-class phoneme classification task using Support Vector Classifiers (SVC), with Gabor Jets performing the best. These SVC methods are compared to a tDBN classifier, which extracts features from raw images and classifies them with accuracy only slightly lower than the Gabor Jet SVC method.

For EMA data, supervised binary SVC feature detectors are trained for each feature in three versions of Distinctive Feature Theory (DFT): Preliminaries (Jakobson et al. 1954), The Sound Pattern of English (Chomsky and Halle 1968), and Unified Feature Theory (Clements and Hume 1995). Each of these feature sets, together with a fourth, unsupervised feature set learned using Independent Components Analysis (ICA), is compared on its usefulness in a 46-class phoneme recognition task. Phoneme recognition is performed using a linear-chain Conditional Random Field (CRF) (Lafferty et al. 2001), which takes advantage of the temporal nature of speech by looking at observations adjacent in time. Results of the phoneme recognition task show that Unified Feature Theory performs slightly better than the other versions of DFT. Surprisingly, ICA actually performs worse than running the CRF on raw EMA data.
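The abstract mentions Gabor Jet features fed to Support Vector Classifiers for the 6-class phoneme task. As a rough illustration only — the dissertation's actual jet parameters, sampling grid and preprocessing are not given here — the sketch below pools the magnitude responses of a small Gabor filter bank over each (randomly generated, stand-in) ultrasound frame and cross-validates an RBF-kernel SVC on the result.

```python
import numpy as np
from skimage.filters import gabor
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def gabor_jet(image, frequencies=(0.1, 0.2, 0.3),
              thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Very reduced 'Gabor jet' descriptor: mean magnitude response of a small
    filter bank, pooled over the whole image. The dissertation's jets are
    richer (responses sampled at grid points); this only illustrates the idea."""
    feats = []
    for f in frequencies:
        for theta in thetas:
            real, imag = gabor(image, frequency=f, theta=theta)
            feats.append(np.mean(np.hypot(real, imag)))
    return np.array(feats)

# Hypothetical data: a stack of grayscale frames standing in for real
# ultrasound images, with 6-class phoneme labels.
rng = np.random.default_rng(0)
frames = rng.random((60, 64, 64))
labels = rng.integers(0, 6, size=60)

X = np.stack([gabor_jet(fr) for fr in frames])
clf = SVC(kernel="rbf", C=1.0)
print(cross_val_score(clf, X, labels, cv=5).mean())
```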
5

Automatic phonological transcription using forced alignment : FAVE toolkit performance on four non-standard varieties of English

Sella, Valeria. January 2018
Forced alignment, a speech recognition technique that performs semi-automatic phonological transcription, constitutes a methodological revolution in the recent history of linguistic research. Its use is progressively becoming the norm in research fields such as sociophonetics, but its general performance and range of applications have been relatively understudied. This thesis investigates the performance and portability of the Forced Alignment and Vowel Extraction program suite (FAVE), an aligner that was trained on, and designed to study, American English. FAVE was tested on four non-American varieties of English (Scottish, Irish, Australian and Indian English) and a control variety (General American). First, the performance of FAVE was compared with that of human annotators; it was then tested on three potentially problematic variables: /p, t, k/ realization, rhotic consonants and /l/. Although FAVE was found to perform significantly differently from human annotators on identical datasets, further analysis revealed that the aligner performed quite similarly on the non-standard varieties and the control variety, suggesting that the difference in accuracy does not constitute a major drawback to its extended usage. The study discusses the implications of the findings in relation to doubts expressed about the usage of such technology and argues for a wider implementation of forced alignment tools such as FAVE in sociophonetic research.
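One standard way to quantify agreement between an aligner such as FAVE and human annotators is the share of phone boundaries that fall within a fixed tolerance (20 ms is a common choice). The sketch below, with invented boundary times and a hypothetical one-to-one pairing of automatic and manual tiers, illustrates that measure; it is not the evaluation code used in the thesis.

```python
def boundary_agreement(auto_bounds, manual_bounds, tolerance=0.020):
    """Share of automatic phone boundaries lying within `tolerance` seconds of
    the corresponding manual boundary. Assumes the two lists are already
    paired one-to-one (same segment sequence for the same utterance)."""
    assert len(auto_bounds) == len(manual_bounds)
    hits = sum(abs(a - m) <= tolerance for a, m in zip(auto_bounds, manual_bounds))
    return hits / len(auto_bounds)

# Hypothetical boundary times (in seconds) for one utterance.
fave = [0.12, 0.25, 0.40, 0.58, 0.71]
human = [0.11, 0.27, 0.40, 0.55, 0.72]
print(f"{boundary_agreement(fave, human):.0%} of boundaries within 20 ms")
```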
6

Effective automatic speech recognition data collection for under-resourced languages

De Vries, Nicolaas Johannes. January 2011
As building transcribed speech corpora for under-resourced languages plays a pivotal role in developing automatic speech recognition (ASR) technologies for such languages, a key step in developing these technologies is the effective collection of ASR data, consisting of transcribed audio and associated metadata. The problem is that no suitable tool currently exists for effectively collecting ASR data for such languages. The specific context and requirements for effectively collecting ASR data for under-resourced languages render all currently known solutions unsuitable for such a task. Such requirements include portability, Internet independence and an open-source code base. This work documents the development of such a tool, called Woefzela, from the determination of the requirements necessary for effective data collection in this context, to the verification and validation of its functionality. The study demonstrates the effectiveness of using smartphones without any Internet connectivity for ASR data collection for under-resourced languages. It introduces a semi-real-time quality control philosophy which increases the amount of usable ASR data collected from speakers. Woefzela was developed for the Android operating system and is freely available for use on Android smartphones, with its source code also being made available. A total of more than 790 hours of ASR data for the eleven official languages of South Africa have been successfully collected with Woefzela. As part of this study a benchmark for the performance of a new National Centre for Human Language Technology (NCHLT) English corpus was established. / Thesis (M.Ing. (Electrical Engineering))--North-West University, Potchefstroom Campus, 2012.
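Woefzela itself is an Android application, and its actual quality-control checks and thresholds are not described in this listing. Purely as an illustration of the kind of semi-real-time checks such a tool might run on each recording, here is a desktop Python sketch that flags recordings that are too short or long, near-silent, or clipped; the function name, thresholds and the 16-bit mono PCM WAV assumption are all hypothetical.

```python
import wave
import numpy as np

def quality_check(path, min_sec=1.0, max_sec=15.0, silence_rms=100, clip_level=32000):
    """Crude per-recording checks in the spirit of on-device quality control:
    reject recordings that are too short/long, near-silent, or clipped.
    Assumes 16-bit mono PCM WAV; thresholds are illustrative guesses."""
    with wave.open(path, "rb") as w:
        rate, n = w.getframerate(), w.getnframes()
        samples = np.frombuffer(w.readframes(n), dtype=np.int16)
    duration = n / rate
    rms = float(np.sqrt(np.mean(samples.astype(np.float64) ** 2)))
    problems = []
    if not (min_sec <= duration <= max_sec):
        problems.append(f"duration {duration:.2f}s out of range")
    if rms < silence_rms:
        problems.append("recording is near-silent")
    if np.max(np.abs(samples)) >= clip_level:
        problems.append("recording appears clipped")
    return problems  # an empty list means the recording passes

# Example (hypothetical file name):
# print(quality_check("prompt_001.wav"))
```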
7

Automatic Speech Recognition System Continually Improving Based on Subtitled Speech Data

Kocour, Martin. January 2019
Nowadays, large-vocabulary speech recognition systems achieve fairly high accuracy. Their results, however, often rest on tens or even hundreds of hours of manually annotated training data. Such data are frequently not readily available, or do not exist at all for the required language. A possible solution is to use commonly available but lower-quality audiovisual data. This thesis deals with techniques for processing exactly such data and with their use for training acoustic models. It further discusses the possible use of these data for continual improvement of the models, since such data are practically inexhaustible. For this purpose, a new approach to data selection was designed as part of this work.
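The abstract does not spell out the proposed data-selection approach, so the sketch below shows one common lightly supervised filter for subtitled speech as an assumption-laden stand-in: segments are kept only when the current model's hypothesis agrees closely enough with the subtitle text, measured by word error rate. The segment structure, field names and the 0.2 threshold are invented for illustration.

```python
def wer(ref_words, hyp_words):
    """Word error rate via edit distance (insertions + deletions + substitutions)."""
    d = [[0] * (len(hyp_words) + 1) for _ in range(len(ref_words) + 1)]
    for i in range(len(ref_words) + 1):
        d[i][0] = i
    for j in range(len(hyp_words) + 1):
        d[0][j] = j
    for i in range(1, len(ref_words) + 1):
        for j in range(1, len(hyp_words) + 1):
            cost = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[-1][-1] / max(1, len(ref_words))

def select_segments(segments, max_wer=0.2):
    """Keep subtitle/audio segments whose current ASR hypothesis agrees closely
    enough with the subtitle text; the threshold is an illustrative guess."""
    keep = []
    for seg in segments:
        if wer(seg["subtitle"].lower().split(), seg["hypothesis"].lower().split()) <= max_wer:
            keep.append(seg)
    return keep

# Hypothetical subtitled segments with hypotheses from the current model.
segments = [
    {"subtitle": "good morning everyone", "hypothesis": "good morning everyone"},
    {"subtitle": "the weather will be sunny", "hypothesis": "the leather will be funny"},
]
print(len(select_segments(segments)))  # -> 1 segment retained for retraining
```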
