Global ETD Search

201	Speech enhancement using microphone array Cho, Jaeyoun, January 2005 (has links) Thesis (Ph. D.)--Ohio State University, 2005. / Title from first page of PDF file. Includes bibliographical references (p. 114-117).
202	Acoustic characteristics of stop consonants: a controlled study. Zue, V. W. (Victor Waito) January 1976 (has links) Thesis (Sc. D.)—Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1976. / Includes bibliographical references (p. 146-149). / This electronic version was scanned from a copy of the thesis on file at the Speech Communication Group. The certified thesis is available in the Institute Archives and Special Collections Speech processing systems. English language -- Consonants. Speech -- Research.
203	Automatic syllabification of untranscribed speech Nel, Pieter Willem 03 1900 (has links) Thesis (MScEng)--Stellenbosch University, 2005. / ENGLISH ABSTRACT: The syllable has been proposed as a unit of automatic speech recognition due to its strong links with human speech production and perception. Recently, it has been proved that incorporating information from syllable-length time-scales into automatic speech recognition improves results in large vocabulary recognition tasks. It was also shown to aid in various language recognition tasks and in foreign accent identification. Therefore, the ability to automatically segment speech into syllables is an important research tool. Where most previous studies employed knowledge-based methods, this study presents a purely statistical method for the automatic syllabification of speech. We introduce the concept of hierarchical hidden Markov model structures and show how these can be used to implement a purely acoustical syllable segmenter based, on general sonority theory, combined with some of the phonotactic constraints found in the English language. The accurate reporting of syllabification results is a problem in the existing literature. We present a well-defined dynamic time warping (DTW) distance measure used for reporting syllabification results. We achieve a token error rate of 20.3% with a 42ms average boundary error on a relatively large set of data. This compares well with previous knowledge-based and statistically- based methods. / AFRIKAANSE OPSOMMING: Die syllabe is voorheen voorgestel as 'n basiese eenheid vir automatiese spraakherkenning weens die sterk verwantwskap wat dit het met spraak produksie en persepsie. Onlangs is dit bewys dat die gebruik van informasie van syllabe-lengte tydskale die resultate verbeter in groot woordeskat herkennings take. Dit is ook bewys dat die gebruik van syllabes automatiese taalherkenning en vreemdetaal aksent herkenning vergemaklik. Dit is daarom belangrik om vir navorsingsdoeleindes syllabes automaties te kan segmenteer. Vorige studies het kennisgebaseerde metodes gebruik om hierdie segmentasie te bewerkstellig. Hierdie studie gebruik 'n suiwer statistiese metode vir die automatiese syllabifikasie van spraak. Ons gebruik die konsep van hierargiese verskuilde Markov model strukture en wys hoe dit gebruik kan word om 'n suiwer akoestiese syllabe segmenteerder te implementeer. Die model word gebou deur dit te baseer op die teorie van sonoriteit asook die fonotaktiese beperkinge teenwoordig in die Engelse taal. Die akkurate voorstelling van syllabifikasie resultate is problematies in die bestaande literatuur. Ons definieer volledig 'n DTW (Dynamic Time Warping) afstands funksie waarmee ons ons syllabifikasie resultate weergee. Ons behaal 'n TER (Token Error Rate) van 20.3% met 'n 42ms gemiddelde grens fout op 'n relatiewe groot stel data. Dit vergelyk goed met vorige kennis-gebaseerde en statisties-gebaseerde metodes. Automatic speech recognition Speech processing systems Dissertations -- Electronic engineering
204	Enkelsybanddemodulasie met behulp van syferseinverwerking Kruger, Johannes Petrus 12 June 2014 (has links) M.Ing. (Electrical and Electronic Engineering) / The feasibility of modulation and demodulation of speech signals within a microprocessor is invertigated in the following study. Existing modulation and demodulation techniques are investigated and new techniques. suitable for microprocessor implementation, described. Finally a single sideband demodulator was built using the TMS32010 microprocessor with results being better or comparable than existing analog techniques. Modulation (Electronics) Demodulation (Electronics) Speech synthesis Speech processing systems
205	A rule-based system to automatically segment and label continuous speech of known text / Boissonneault, Paul G. January 1984 (has links) No description available. Speech synthesis. Automatic speech recognition. Speech processing systems.
206	Vector quantization in residual-encoded linear prediction of speech Abramson, Mark January 1983 (has links) No description available. Speech processing systems. Vocoder. Data compression (Telecommunication) Coding theory.
207	The effects of recognition accuracy and vocabulary size of a speech recognition system on task performance and user acceptance Casali, Sherry P. 22 June 2010 (has links) Automatic speech recognition systems have at last advanced to the state that they are now a feasible alternative for human-machine communication in selected applications. As such, research efforts are now beginning to focus on characteristics of the human, the recognition device, and the interface which optimize the system performance, rather than the previous trend of determining factors affecting recognizer performance alone. This study investigated two characteristics of the recognition device, the accuracy level at which it recognizes speech, and the vocabulary size of the recognizer as a percent of task vocabulary size to determine their effects on system performance. In addition, the study considered one characteristic of the user, age. Briefly, subjects performed a data entry task under each of the treatment conditions. Task completion time and the number of errors remaining at the end of each session were recorded. After each session, subjects rated the recognition device used as to its acceptability for the task. The accuracy level at which the recognizer was performing significantly influenced the task completion time as well as the user's acceptability ratings, but had only a small effect on the number of errors left uncorrected. The available vocabulary size also significantly affected the task completion time; however, its effect on the final error rate and on the acceptability ratings was negligible. The age of the subject was also found to influence both objective and subjective measures. Older subjects in general required longer times to complete the tasks; however, they consistently rated the speech input systems more favorably than the younger subjects. / Master of Science LD5655.V855 1988.C382 Automatic speech recognition Speech processing systems
208	An Analog Architecture for Auditory Feature Extraction and Recognition Smith, Paul Devon 22 November 2004 (has links) Speech recognition systems have been implemented using a wide range of signal processing techniques including neuromorphic/biological inspired and Digital Signal Processing techniques. Neuromorphic/biologically inspired techniques, such as silicon cochlea models, are based on fairly simple yet highly parallel computation and/or computational units. While the area of digital signal processing (DSP) is based on block transforms and statistical or error minimization methods. Essential to each of these techniques is the first stage of extracting meaningful information from the speech signal, which is known as feature extraction. This can be done using biologically inspired techniques such as silicon cochlea models, or techniques beginning with a model of speech production and then trying to separate the the vocal tract response from an excitation signal. Even within each of these approaches, there are multiple techniques including cepstrum filtering, which sits under the class of Homomorphic signal processing, or techniques using FFT based predictive approaches. The underlying reality is there are multiple techniques that have attacked the problem in speech recognition but the problem is still far from being solved. The techniques that have shown to have the best recognition rates involve Cepstrum Coefficients for the feature extraction and Hidden-Markov Models to perform the pattern recognition. The presented research develops an analog system based on programmable analog array technology that can perform the initial stages of auditory feature extraction and recognition before passing information to a digital signal processor. The goal being a low power system that can be fully contained on one or more integrated circuit chips. Results show that it is possible to realize advanced filtering techniques such as Cepstrum Filtering and Vector Quantization in analog circuitry. Prior to this work, previous applications of analog signal processing have focused on vision, cochlea models, anti-aliasing filters and other single component uses. Furthermore, classic designs have looked heavily at utilizing op-amps as a basic core building block for these designs. This research also shows a novel design for a Hidden Markov Model (HMM) decoder utilizing circuits that take advantage of the inherent properties of subthreshold transistors and floating-gate technology to create low-power computational blocks. Neuromorphic cochlea Hidden Markov Model (HMM) Cepstrum Speech processing systems Cochlea Models Speech processing systems Markov processes Signal processing Digital techniques Neural networks (Computer science)
209	Evaluation of two tactile speech displays Clements, Mark Andrew. January 1978 (has links) Thesis: Elec. E., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 1978 / Bibliography: leaves 57-59. / by Mark Andrew Clements. / Elec. E. / Elec. E. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science Deaf Means of communication. Speech processing systems. Vibrators. Touch. Touch. fast Vibrators. fast Speech processing systems. fast Deaf fast
210	Independent formant and pitch control applied to singing voice Calitz, Wietsche Roets 12 1900 (has links) Thesis (MScIng)--University of Stellenbosch, 2004. / ENGLISH ABSTRACT: A singing voice can be manipulated artificially by means of a digital computer for the purposes of creating new melodies or to correct existing ones. When the fundamental frequency of an audio signal that represents a human voice is changed by simple algorithms, the formants of the voice tend to move to new frequency locations, making it sound unnatural. The main purpose is to design a technique by which the pitch and formants of a singing voice can be controlled independently. / AFRIKAANSE OPSOMMING: Onafhanklike formant- en toonhoogte beheer toegepas op ’n sangstem: ’n Sangstem kan deur ’n digitale rekenaar gemanipuleer word om nuwe melodie¨e te skep, of om bestaandes te verbeter. Wanneer die fundamentele frekwensie van ’n klanksein (wat ’n menslike stem voorstel) deur ’n eenvoudige algoritme verander word, skuif die oorspronklike formante na nuwe frekwensie gebiede. Dit veroorsaak dat die resultaat onnatuurlik klink. Die hoof oogmerk is om ’n tegniek te ontwerp wat die toonhoogte en die formante van ’n sangstem apart kan beheer. Speech processing systems Vocoder Signal processing Singing -- Data processing Theses -- Electrical engineering Dissertations -- Electrical engineering

Search results