201 |
Speech enhancement using microphone array
Cho, Jaeyoun, January 2005
Thesis (Ph. D.)--Ohio State University, 2005. / Title from first page of PDF file. Includes bibliographical references (p. 114-117).
|
202 |
Acoustic characteristics of stop consonants: a controlled study.
Zue, V. W. (Victor Waito) January 1976
Thesis (Sc. D.)—Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1976. / Includes bibliographical references (p. 146-149). / This electronic version was scanned from a copy of the thesis on file at the Speech Communication Group. The certified thesis is available in the Institute Archives and Special Collections
|
203 |
Automatic syllabification of untranscribed speech
Nel, Pieter Willem 03 1900
Thesis (MScEng)--Stellenbosch University, 2005. / ENGLISH ABSTRACT: The syllable has been proposed as a unit of automatic speech recognition due to its
strong links with human speech production and perception. Recently, it has been proved
that incorporating information from syllable-length time-scales into automatic speech
recognition improves results in large vocabulary recognition tasks. It was also shown to
aid in various language recognition tasks and in foreign accent identification. Therefore,
the ability to automatically segment speech into syllables is an important research tool.
Where most previous studies employed knowledge-based methods, this study presents a
purely statistical method for the automatic syllabification of speech.
We introduce the concept of hierarchical hidden Markov model structures and show
how these can be used to implement a purely acoustical syllable segmenter based on
general sonority theory, combined with some of the phonotactic constraints found in the
English language.
The accurate reporting of syllabification results is a problem in the existing literature.
We present a well-defined dynamic time warping (DTW) distance measure used for
reporting syllabification results.
We achieve a token error rate of 20.3% with a 42ms average boundary error on a
relatively large set of data. This compares well with previous knowledge-based and
statistically based methods. / AFRIKAANS ABSTRACT (translated): The syllable has previously been proposed as a basic unit for automatic speech recognition
because of its strong relationship with speech production and perception. Recently,
it has been shown that using information from syllable-length time scales improves
results in large-vocabulary recognition tasks. It has also been shown that the use of
syllables facilitates automatic language recognition and foreign-accent recognition. It
is therefore important, for research purposes, to be able to segment syllables automatically.
Previous studies used knowledge-based methods to accomplish this segmentation.
This study uses a purely statistical method for the automatic syllabification
of speech.
We use the concept of hierarchical hidden Markov model structures and show how
it can be used to implement a purely acoustic syllable segmenter. The
model is built on the theory of sonority as well as the phonotactic
constraints present in the English language.
The accurate representation of syllabification results is problematic in the existing
literature. We fully define a DTW (Dynamic Time Warping) distance function
with which we report our syllabification results.
We achieve a TER (Token Error Rate) of 20.3% with a 42 ms average boundary
error on a relatively large set of data. This compares well with previous knowledge-based and
statistically based methods.
|
204 |
Enkelsybanddemodulasie met behulp van syferseinverwerking [Single-sideband demodulation using digital signal processing]
Kruger, Johannes Petrus 12 June 2014
M.Ing. (Electrical and Electronic Engineering) / This study investigates the feasibility of modulating and demodulating speech signals within a microprocessor. Existing modulation and demodulation techniques are investigated, and new techniques suitable for microprocessor implementation are described. Finally, a single-sideband demodulator was built using the TMS32010 microprocessor, with results better than or comparable to those of existing analog techniques.
|
205 |
A rule-based system to automatically segment and label continuous speech of known text
Boissonneault, Paul G. January 1984
No description available.
|
206 |
Vector quantization in residual-encoded linear prediction of speech
Abramson, Mark January 1983
No description available.
|
207 |
The effects of recognition accuracy and vocabulary size of a speech recognition system on task performance and user acceptance
Casali, Sherry P. 22 June 2010
Automatic speech recognition systems have at last advanced to the state that they are now a feasible alternative for human-machine communication in selected applications. As such, research efforts are now beginning to focus on characteristics of the human, the recognition device, and the interface which optimize the system performance, rather than the previous trend of determining factors affecting recognizer performance alone. This study investigated two characteristics of the recognition device, the accuracy level at which it recognizes speech, and the vocabulary size of the recognizer as a percent of task vocabulary size to determine their effects on system performance. In addition, the study considered one characteristic of the user, age. Briefly, subjects performed a data entry task under each of the treatment conditions. Task completion time and the number of errors remaining at the end of each session were recorded. After each session, subjects rated the recognition device used as to its acceptability for the task.
The accuracy level at which the recognizer was performing significantly influenced the task completion time as well as the user's acceptability ratings, but had only a small effect on the number of errors left uncorrected. The available vocabulary size also significantly affected the task completion time; however, its effect on the final error rate and on the acceptability ratings was negligible. The age of the subject was also found to influence both objective and subjective measures. Older subjects in general required longer times to complete the tasks; however, they consistently rated the speech input systems more favorably than the younger subjects. / Master of Science
|
208 |
An Analog Architecture for Auditory Feature Extraction and Recognition
Smith, Paul Devon 22 November 2004
Speech recognition systems have been implemented using a wide range of signal processing techniques, including neuromorphic/biologically inspired and digital signal processing (DSP) techniques. Neuromorphic/biologically inspired techniques, such as silicon cochlea models, are based on fairly simple yet highly parallel computation and/or computational units, while DSP is based on block transforms and statistical or error-minimization methods.
Essential to each of these techniques is the first stage of extracting meaningful information from the speech signal, which is known as feature extraction. This can be done using biologically inspired techniques such as silicon cochlea models, or techniques that begin with a model of speech production and then try to separate the vocal tract response from an excitation signal. Even within each of these approaches, there are multiple techniques, including cepstrum filtering, which falls under the class of homomorphic signal processing, and FFT-based predictive approaches. The underlying reality is that multiple techniques have attacked the problem of speech recognition, but the problem is still far from solved. The techniques shown to have the best recognition rates use cepstrum coefficients for feature extraction and hidden Markov models for pattern recognition.
The presented research develops an analog system based on programmable analog array technology that can perform the initial stages of auditory feature extraction and recognition before passing information to a digital signal processor. The goal is a low-power system that can be fully contained on one or more integrated circuit chips. Results show that it is possible to realize advanced filtering techniques such as cepstrum filtering and vector quantization in analog circuitry. Prior to this work, applications of analog signal processing focused on vision, cochlea models, anti-aliasing filters and other single-component uses. Furthermore, classic designs have relied heavily on op-amps as the core building block. This research also shows a novel design for a hidden Markov model (HMM) decoder utilizing circuits that take advantage of the inherent properties of subthreshold transistors and floating-gate technology to create low-power computational blocks.
|
209 |
Evaluation of two tactile speech displays
Clements, Mark Andrew. January 1978
Thesis: Elec. E., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 1978 / Bibliography: leaves 57-59. / by Mark Andrew Clements.
|
210 |
Independent formant and pitch control applied to singing voice
Calitz, Wietsche Roets 12 1900
Thesis (MScIng)--University of Stellenbosch, 2004. / ENGLISH ABSTRACT: A singing voice can be manipulated artificially by means of a digital computer for the
purposes of creating new melodies or to correct existing ones. When the fundamental frequency
of an audio signal that represents a human voice is changed by simple algorithms,
the formants of the voice tend to move to new frequency locations, making it sound unnatural.
The main purpose is to design a technique by which the pitch and formants of a
singing voice can be controlled independently. / AFRIKAANS ABSTRACT (translated): Independent formant and pitch control applied to a singing voice: A singing voice can be
manipulated by a digital computer to create new melodies or to improve existing
ones. When the fundamental frequency of an audio signal (representing a human voice)
is changed by a simple algorithm, the original formants shift
to new frequency regions. This causes the result to sound unnatural. The main
aim is to design a technique that can control the pitch and the formants of a singing voice
separately.
|