1 |
A study of Kansas speech with information and exercises for its improvement / Watt, Elaine Harris. January 1951
Call number: LD2668 .T4 1951 W38 / Master of Science
|
2 |
An empirical investigation of the effect of the number of speakers and speaker position on the results of individual events contests held at the 1967 Western Speech Association Tournament / Rybacki, Donald J. (Donald Jay), 1945-. January 1968
No description available.
|
3 |
Some considerations of deaf speech / Nolan, M. Helena. January 2010
Digitized by Kansas Correctional Industries
|
4 |
Linguistic analysis of children's speech : effects of stimulus media on elicited samples / Ahmed, S. Esther. January 2010
Digitized by Kansas Correctional Industries
|
5 |
Acoustic characteristics of stop consonants: a controlled study / Zue, V. W. (Victor Waito). January 1976
Thesis (Sc. D.)—Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1976. / Includes bibliographical references (p. 146-149). / This electronic version was scanned from a copy of the thesis on file at the Speech Communication Group. The certified thesis is available in the Institute Archives and Special Collections
|
6 |
Speech production in French: an investigation of the units involved in the phonological encoding of words / Evinck, Sylvie. January 1997
Doctorate in psychological sciences / Unpublished
|
7 |
Multi-dialect Arabic broadcast speech recognition / Ali, Ahmed Mohamed Abdel Maksoud. January 2018
Dialectal Arabic speech research suffers from a lack of labelled resources and standardised orthography. There are three main challenges in dialectal Arabic speech recognition: (i) finding labelled dialectal Arabic speech data, (ii) training robust dialectal speech recognition models from limited labelled data, and (iii) evaluating speech recognition for dialects with no orthographic rules. This thesis makes three contributions. Arabic Dialect Identification: We deal mainly with Arabic speech without prior knowledge of the spoken dialect. Arabic dialects can be sufficiently diverse that one can argue they are different languages rather than dialects of the same language. We make two contributions here. First, we use crowdsourcing to annotate a multi-dialectal speech corpus collected from the Al Jazeera TV channel: from almost 1,000 hours of material, we obtained utterance-level dialect labels for 57 hours of high-quality speech covering four major varieties of dialectal Arabic (DA): Egyptian, Levantine, Gulf (Arabian peninsula) and North African (Moroccan). Second, we build an Arabic dialect identification (ADI) system. We explore two main groups of features: acoustic and linguistic. For the linguistic features, we examine a wide range of units, addressing words, characters and phonemes. For the acoustic features, we examine raw features such as mel-frequency cepstral coefficients combined with shifted delta cepstra (MFCC-SDC), bottleneck features, and the i-vector as a latent variable. We study both generative and discriminative classifiers, in addition to deep learning approaches, namely deep neural networks (DNNs) and convolutional neural networks (CNNs). We propose Arabic dialect identification as a five-class challenge comprising the four dialects above plus Modern Standard Arabic.
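The MFCC-SDC features mentioned in the abstract stack "shifted delta cepstra" onto each frame so a frame-level feature captures longer-span dynamics. A minimal sketch of SDC stacking, using the common N-d-P-k parameter convention (here 7-1-3-7); the toy frame values below are illustrative placeholders, not real MFCCs, and the thesis's actual feature pipeline may differ in detail:

```python
def sdc(cepstra, d=1, P=3, k=7):
    """Stack k shifted delta vectors per frame (SDC, N-d-P-k convention).

    cepstra: list of frames, each a list of N cepstral coefficients.
    For frame t, the i-th delta is c[t + i*P + d] - c[t + i*P - d];
    frames whose deltas would index outside the utterance are dropped.
    """
    n_frames = len(cepstra)
    out = []
    for t in range(n_frames):
        deltas = []
        ok = True
        for i in range(k):
            lo, hi = t + i * P - d, t + i * P + d
            if lo < 0 or hi >= n_frames:
                ok = False
                break
            # Elementwise simple difference between the two shifted frames.
            deltas.extend(ch - cl for ch, cl in zip(cepstra[hi], cepstra[lo]))
        if ok:
            out.append(deltas)
    return out

# Toy 30-frame "utterance" with 7 coefficients per frame.
frames = [[float(t + c) for c in range(7)] for t in range(30)]
feats = sdc(frames)
print(len(feats), len(feats[0]))  # each surviving frame carries k*N = 49 values
```

Each SDC frame would then typically be concatenated with the original MFCCs before classifier training.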
Arabic Speech Recognition: We describe our effort in building Arabic automatic speech recognition (ASR) and in creating an open research community to advance it. This part has two main goals. First, creating a framework for Arabic ASR that is publicly available for research: we describe our effort in building two multi-genre broadcast (MGB) challenges. MGB-2 focuses on broadcast news, using more than 1,200 hours of speech and 130M words of text collected from the broadcast domain. MGB-3, by contrast, focuses on dialectal multi-genre data with limited non-orthographic speech collected from YouTube, with special attention paid to transfer learning. Second, building a robust Arabic ASR system and reporting a competitive word error rate (WER) to serve as a benchmark for advancing the state of the art in Arabic ASR. Our overall system combines five acoustic models (AMs): unidirectional long short-term memory (LSTM), bidirectional LSTM (BLSTM), time-delay neural network (TDNN), TDNN layers followed by LSTM layers (TDNN-LSTM), and TDNN layers followed by BLSTM layers (TDNN-BLSTM). The AMs are trained with the purely sequence-level lattice-free maximum mutual information (LF-MMI) objective. The generated lattices are rescored using a four-gram language model (LM) and a recurrent neural network with maximum entropy (RNNME) LM. Our official WER is 13%, the lowest WER reported on this task. Evaluation: The third part of the thesis addresses our effort in evaluating dialectal speech with no orthographic rules. Our methods learn from multiple transcribers and align the speech hypothesis to overcome the non-orthographic aspects. Our multi-reference WER (MR-WER) approach is similar to the BLEU score used in machine translation (MT). We have also automated this process by learning different spelling variants from Twitter data: we mine a huge collection of tweets automatically, in an unsupervised fashion, to build more than 11M n-to-m lexical pairs, and we propose a new evaluation metric, dialectal WER (WERd). Finally, we estimate the word error rate (e-WER) with no reference transcription, using decoding and language features. We show that our word error rate estimation is robust across many scenarios, with and without the decoding features.
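The WER and multi-reference WER metrics discussed above can be illustrated in a few lines. A minimal sketch: standard WER as word-level Levenshtein distance normalised by reference length, and an MR-WER-style score that takes the best-matching reference. This is only the core idea; the thesis's actual alignment and WERd spelling-variant handling are considerably richer:

```python
def wer(ref, hyp):
    """Word error rate = word-level edit distance(ref, hyp) / len(ref)."""
    r, h = ref.split(), hyp.split()
    # prev[j] = edit distance between the first i-1 ref words and first j hyp words.
    prev = list(range(len(h) + 1))
    for i in range(1, len(r) + 1):
        cur = [i] + [0] * len(h)
        for j in range(1, len(h) + 1):
            sub = prev[j - 1] + (r[i - 1] != h[j - 1])  # substitution (or match)
            cur[j] = min(sub, prev[j] + 1, cur[j - 1] + 1)  # vs deletion, insertion
        prev = cur
    return prev[-1] / len(r)

def mr_wer(refs, hyp):
    """Score the hypothesis against its closest reference transcription."""
    return min(wer(ref, hyp) for ref in refs)

refs = ["i am going home now", "im going home now"]
print(wer(refs[0], "i am going house now"))  # one substitution among five words
print(mr_wer(refs, "im going home now"))     # matches the second transcriber exactly
```

With a single reference, a hypothesis penalised for writing "im" instead of "i am" is counted as an error; scoring against multiple transcribers, as in MR-WER, absorbs such legitimate spelling variation.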
|
8 |
Experimental phonetics in Britain, 1890-1940 / Ashby, Michael. January 2016
This study provides the first critical history of British developments in phonetic science from 1890 to the beginning of the Second World War. It draws on both published and unpublished documentary evidence, and on original digital analyses of contemporary images, experimental data, and sound recordings. Experimental phonetics had diverse origins embracing medicine, physics and philology. A survey of the nineteenth-century background shows that by 1890 significant British contributions in all three fields could have furnished the makings of a native approach to phonetics as an experimental science, but they failed to come together for a variety of bureaucratic, professional and personal reasons. Experimental phonetics, an academic fashion as much as a scientific specialism, was instead imported from Germany and France, and it had little continuity with British antecedents. The study details the earliest British phonetics laboratories, their personnel, equipment, and research programmes, providing the first extensive account of the UCL laboratory, and bringing to light a forgotten 1930s laboratory in Newcastle. The major methods of empirical investigation of the period are scrutinised, rehabilitating long-neglected British origins. The early work of Daniel Jones is extensively re-evaluated, establishing his scientific credentials, and the career of Stephen Jones, the first academic in Britain to earn a salary as an experimental phonetician, receives detailed treatment. New light is thrown on many neglected figures, including W. A. Aikin, E. R. Edwards, John G. McKendrick, and Wilfred Perrett, while a detailed investigation of the work of Sir Richard Paget reveals the astonishing accuracy of his auditory analyses. The study concludes with an account of the career of Robert Curry, the first recognisably modern and professional speech scientist to emerge in Britain.
|
9 |
Effects of voice coding and speech rate on a synthetic speech display in a telephone information system / Herlong, David W. January 1988
Despite the lack of formal guidelines, synthetic speech displays are used in a growing variety of applications. Telephone information systems permitting human-computer interaction from remote locations are an especially popular implementation of computer-generated speech. Currently, human factors research is needed to specify design characteristics providing usable telephone information systems as defined by task performance and user ratings. Previous research used nonintegrated tasks such as transcription of phonetic syllables, words, or sentences to assess task performance or user preference differences. This study used a computer-driven telephone information system as a real-time, human-computer interface to simulate applications where synthetic speech is used to access data. Subjects used a telephone keypad to navigate through an automated, department store database to locate and transcribe specific information messages. Because speech provides a sequential and transient information display, users may have difficulty navigating through auditory databases. One issue investigated in this study was whether use of alternating male and female voices to code different levels in the database hierarchy would improve user search performance. Other issues investigated were basic intelligibility of these male and female voices as influenced by different levels of speech rate. All factors were assessed as functions of search or transcription task performance and user preference. Analysis of transcription accuracy, search efficiency and time, and subjective ratings revealed an overall significant effect of speech rate on all groups of measures but no significant effects for voice type or coding scheme. Results were used to recommend design guidelines for developing speech displays for telephone information systems. / Master of Science
|
10 |
The effects of speech rate, message repetition, and information placement on synthesized speech intelligibility / Merva, Monica Ann. 12 March 2013
Recent improvements in speech technology have made synthetic speech a viable I/O alternative. However, little research has focused on optimizing the various speech parameters which influence system performance. This study examined the effects of speech rate, message repetition, and the placement of information in a message. Briefly, subjects heard messages generated by a speech synthesizer and were asked to transcribe what they had heard. After entering each transcription, subjects rated the perceived difficulty of the preceding message, and how confident they were of their response. The accuracy of their response, system response time, and response latency were recorded.
Transcription accuracy was best for messages spoken at 150 or 180 wpm and for messages repeated either twice or three times. Words at the end of messages were transcribed more accurately than words at the beginning of messages. Response latencies were fastest at 180 wpm with 3 repetitions and rose as the number of repetitions decreased. System response times were shortest when a message was repeated only once. The subjective certainty and difficulty ratings indicated that subjects were aware of errors when incorrectly transcribing a message. These results suggest that a) message rates should lie below 210 wpm, b) a repeat feature should be included in speech interface designs, and c) important information should be contained at the end of messages. / Master of Science
|