  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
161

A Design of Speech Recognition System for Two-Word Mandarin Phrases

Jheng, He-de 06 September 2007
The objective of this thesis is to increase the correct recognition rate for two-word Mandarin phrases. Recognition errors arise from ambiguities in the syllables and the intonations. For the syllable ambiguity, a balanced speech training dataset is designed and the weights of the state observation probabilities on vowels and consonants are adjusted. For the tone ambiguity, both the pitch contour and the spectrum evolution property derived from the Karhunen-Loève transform are applied. The experimental results indicate that an 85% correct rate can be achieved, a 6% increase over the system without the above improvements.
162

Music Processing in Deaf Adults with Cochlear Implants

Saindon, Mathieu R. 11 January 2011
Cochlear implants (CIs) provide coarse representations of pitch, which are adequate for speech but not for music. Despite increasing interest in music processing by CI users, the available information is fragmentary. The present experiment attempted to fill this void by conducting a comprehensive assessment of music processing in adult CI users. CI users (n = 6) and normally hearing (NH) controls (n = 12) were tested on several tasks involving melody and rhythm perception, recognition of familiar music, and recognition of emotion in speech and music. CI performance was substantially poorer than NH performance and at chance levels on pitch processing tasks. Performance was highly variable, however, with one individual achieving NH performance levels on some tasks, probably because of low-frequency residual hearing in his unimplanted ear. Future research with a larger sample of CI users can shed light on factors associated with good and poor music processing in this population.
164

Real-time monitoring of voice characteristics using accelerometer and microphone measurements

Virebrand, Marcus January 2011
VoxLog is a portable voice accumulator that collects data using both an accelerometer, which measures skin vibrations, and a regular microphone. The goal of the thesis was to implement and evaluate methods that use this data to estimate three voice parameters: fundamental frequency, phonation, and sound pressure level. For pitch, three different methods were evaluated. All of them require relatively low computational power, since the goal was to implement at least one of them on the digital signal processor in the VoxLog. The results of these evaluations show that the best pitch estimates were obtained with an FFT-based approach that uses phase information to achieve high frequency resolution. Phonation is estimated with an energy-based voice activity detection method. This estimate is then used to choose when sound pressure level should be estimated. Here one of the main problems was to distinguish between when sound pressure level should be estimated for the wearer of the VoxLog and when it should be estimated for the background noise. This was solved by implementing a time window before and after phonation in which neither is estimated. For both pitch and sound pressure level, a feedback function was implemented. Feedback is given to the user via vibrations in the VoxLog when estimated parameters fall outside set limits on pitch or sound pressure level.
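The energy-based phonation detection and the guard window described in this abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the fixed threshold, the non-overlapping frames, and the reference value in `level_db` are all assumptions chosen for illustration.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame (assumed framing)."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def label_frames(energies, threshold, guard):
    """Label each frame 'voice', 'guard', or 'noise'.

    A frame whose energy exceeds `threshold` (hypothetical fixed value) is
    'voice'. An unvoiced frame within `guard` frames of a voiced frame is
    'guard', meaning neither the wearer's level nor the background level
    is estimated there. Every other frame is 'noise'.
    """
    voiced = [e > threshold for e in energies]
    labels = []
    for i, is_voiced in enumerate(voiced):
        if is_voiced:
            labels.append("voice")
        elif any(voiced[max(0, i - guard):i + guard + 1]):
            labels.append("guard")
        else:
            labels.append("noise")
    return labels


def level_db(energy, ref=1.0):
    """Level in dB of a mean-square energy relative to `ref` (assumed reference)."""
    return 10.0 * math.log10(energy / ref) if energy > 0 else float("-inf")


energies = [0.01, 0.01, 0.01, 0.5, 0.6, 0.01, 0.01, 0.01]
labels = label_frames(energies, threshold=0.1, guard=1)
# Level is computed only for 'voice' (wearer) and 'noise' (background) frames.
levels = [(lab, level_db(e)) for lab, e in zip(labels, energies) if lab != "guard"]
```

With a guard of one frame, the frames immediately before and after the two loud frames are excluded from both estimates, which is the role the abstract gives to the time window around phonation.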
165

Investigating the Perceptual Effects of Multi-rate Stimulation in Cochlear Implants and the Development of a Tuned Multi-rate Sound Processing Strategy

Stohl, Joshua Simeon January 2009
It is well established that cochlear implants (CIs) are able to provide many users with excellent speech recognition ability in quiet conditions; however, the ability to correctly identify speech in noisy conditions or appreciate music is generally poor for implant users with respect to normal-hearing listeners. This discrepancy has been hypothesized to be in part a function of the relative decrease in spectral information available to implant users (Rubinstein and Turner, 2003; Wilson et al., 2004). One method that has been proposed for increasing the amount of spectral information available to CI users is to include time-varying stimulation rate in addition to changes in the place of stimulation. However, previous implementations of multi-rate strategies have failed to result in an improvement in speech recognition over the clinically available, fixed-rate strategies (Fearn, 2001; Nobbe, 2004). It has been hypothesized that this lack of success was due to a failure to consider the underlying perceptual responses to multi-rate stimulation.
In this work, psychophysical experiments were implemented with the goal of achieving a better understanding of the interaction of place and rate of stimulation and the effects of duration and context on CI listeners' ability to detect changes in stimulation rate. Results from those experiments were utilized in the implementation of a tuned multi-rate sound processing strategy for implant users in order to potentially "tune" multi-rate strategies and improve speech recognition performance.
In an acute study with quiet conditions, speech recognition performance with a tuned multi-rate implementation was better than performance with a clinically available, fixed-rate strategy, although the difference was not statistically significant. These results suggest that utilizing time-varying pulse rates in a subject-specific implementation of a multi-rate algorithm may offer improvements in speech recognition over clinically available strategies. A longitudinal study was also performed to investigate the potential benefit from training to speech recognition. General improvements in speech recognition ability were observed as a function of time; however, final scores with the tuned multi-rate algorithm never surpassed performance with the fixed-rate algorithm for noisy conditions.
The ability to improve upon speech recognition scores for quiet conditions with respect to the fixed-rate algorithm suggests that using time-varying stimulation rates potentially provides additional, usable information to listeners. However, performance with the fixed-rate algorithm proved to be more robust to noise, even after three weeks of training. This lack of robustness to noise may be in part a result of the frequency estimation technique used in the multi-rate strategy, and thus more sophisticated techniques for real-time frequency estimation should be explored in the future. / Dissertation
166

Instrument Timbres and Pitch Estimation in Polyphonic Music

Loeffler, Dominik B. 14 April 2006
In the past decade, the availability of digitally encoded, downloadable music has increased dramatically, pushed mainly by the release of the now famous MP3 compression format (Fraunhofer-Gesellschaft, 1994). Online sales of music in the US doubled in 2005, according to a recent news article (*), while the number of files exchanged on P2P platforms is much higher, but hard to estimate. The existing and coming informational flood in digital music prompts the need for sophisticated content-based information retrieval. Query-by-Humming is a prototypical technique aimed at locating pieces of music by melody; automatic annotation algorithms seek to enable finer search criteria, such as instruments, genre, or meter. Score transcription systems strive for an abstract, compressed form of a piece of music understandable by composers and musicians. Much research still has to be performed to achieve these goals. This thesis connects essential knowledge about music and human auditory perception with signal processing algorithms to solve the specific problem of pitch estimation. The designed algorithm obtains an estimate of the magnitude spectrum via STFT and models the harmonic structure of each pitch contained in the magnitude spectrum with Gaussian density mixtures, whose parameters are subsequently estimated via an Expectation-Maximization (EM) algorithm. Heuristics for EM initialization are formulated mathematically. The system is implemented in MATLAB, featuring a GUI that provides for visual (spectrogram) and numerical (console) verification of results. The algorithm is tested using an array of data ranging from single to triple superposed instrument recordings. Its advantages and limitations are discussed, and a brief outlook over potential future research is given. (*) "Online and Wireless Music Sales Tripled in 2005"; Associated Press; January 19, 2006
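As a rough illustration of modelling harmonic structure with Gaussian mixtures and fitting it by EM, the sketch below estimates a single fundamental frequency from a few spectral peaks. It is a simplified stand-in, not the thesis's algorithm (which is implemented in MATLAB and handles several simultaneous pitches): it assumes one pitch, equal harmonic weights, and a fixed shared width `sigma`, and the function name and all parameter values are hypothetical.

```python
import math


def em_harmonic_f0(freqs, mags, f0_init, n_harmonics=4, sigma=15.0, n_iter=20):
    """Estimate one fundamental frequency (Hz) from spectral peaks via EM.

    Assumed model: the magnitude spectrum is a mixture of Gaussians centred
    at h * f0 for h = 1..n_harmonics, with equal weights and a fixed shared
    standard deviation `sigma` in Hz. Only f0 is re-estimated.
    """
    f0 = f0_init
    for _ in range(n_iter):
        num = 0.0
        den = 0.0
        for x, w in zip(freqs, mags):
            # E-step: responsibility of each harmonic for this frequency.
            lik = [
                math.exp(-((x - h * f0) ** 2) / (2.0 * sigma ** 2))
                for h in range(1, n_harmonics + 1)
            ]
            total = sum(lik)
            if total == 0.0:
                continue  # too far from every harmonic to contribute
            for h, value in enumerate(lik, start=1):
                r = value / total
                # M-step accumulators: magnitude-weighted least squares for f0.
                num += w * r * h * x
                den += w * r * h * h
        if den == 0.0:
            break
        f0 = num / den
    return f0


# Three peaks of a hypothetical 220 Hz tone, starting from a rough guess.
f0_hat = em_harmonic_f0([220.0, 440.0, 660.0], [1.0, 0.6, 0.3], f0_init=210.0)
```

Starting from 210 Hz, the estimate settles at roughly 220 Hz for these peaks. The result depends on the initial guess, which is consistent with the abstract's emphasis on heuristics for EM initialization.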
167

A Design of Speech Recognition System for the Mandarin Toponyms

Wei, Hong-jhang 31 August 2006
In this thesis, a Mandarin toponym speech recognition system is developed using MFCC, LPC and HMM under Red Hat Linux 9.0. The system uses monosyllable HMMs to select the initial toponym candidates, and the final classification result is obtained by further pitch identification mechanisms. For the speaker-dependent case, a correct rate of approximately 90% can be achieved, and the recognition process completes within 1.5 seconds on average.
168

A Design of Speech Recognition System for Three-word and Four-word Mandarin Phrases

Sue, Ji-sin 10 September 2006
In this thesis, a speech recognition system for three-word and four-word Mandarin phrases is developed. The database contains two recordings of twenty-four thousand three-word phrases and twenty-two thousand four-word phrases. The system applies MFCC, monosyllable HMMs and a speech-text alignment scheme to select the initial phrase candidates. A wavelet-transform-based vowel segmentation technique and a Mandarin pitch identification method are then applied to increase the correct phrase identification rate and obtain the final answer. Experimental results indicate that 92% and 96% correct rates can be achieved for three-word and four-word phrase recognition respectively, when the first recording of the database is used for training and the second for testing. For the speaker-dependent case, the correct phrase can be found within 1 second, using a PC with an Intel Celeron 2.4 GHz CPU and the Red Hat Linux 9.0 operating system.
169

Effect of visually induced self-motion perception (vection) on upright standing posture

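渡邉, 悟, 市川, 真澄, WATANABE, Satoru, ICHIKAWA, Masumi 12 1900
Nagoya University doctoral dissertation. Degree type: Doctor of Medical Science (by dissertation). Date degree conferred: 22 December 1992 (Heisei 4). Submitted as the doctoral dissertation of Masumi Ichikawa.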
170

Comparison of aural and visual instructional methodologies designed to improve the intonation accuracy of seventh grade violin and viola instrumentalists

Núñez, Mario Leoncio. January 2002
Thesis (Ph. D.)--University of North Texas, 2002. / Includes bibliographical references (p. 302-314).
