  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

The perception of foreign language speech at the pre-meaning level /

André, Elise January 1975
No description available.
82

Speech perception of Canadian English sibilants: Processing of acoustic information or underlying (articulatory) vocal tract configurations?

Luk, San-Hei Kenny January 2018
Acoustic and articulatory theories of speech perception have been proposed to explain how listeners map acoustic signals to phonetic categories. Unlike acoustic theories, articulatory theories hypothesize that listeners recover articulatory information during the mapping. To test this hypothesis, we altered the acoustic information of the Canadian English sibilants /s/ and /ʃ/ while preserving the articulatory information that signals the place of articulation of the original sibilants. The manipulated sibilants were articulatorily cued as the original sibilants, but acoustically cued as the alternative sibilants (/s/ as /ʃ/ and /ʃ/ as /s/). We conducted an identification task to examine whether altering acoustic information would switch our Canadian English listeners’ identification. The listeners identified the acoustically /s/-like /ʃ/ entirely as the alternative sibilant /s/, but identified the acoustically /ʃ/-like /s/ as the alternative sibilant /ʃ/ 60% of the time and as the original sibilant /s/ 40% of the time. There was a categorical switch in the /ʃ/ stimuli but not the /s/ stimuli. This asymmetry of identification between the two sibilants can be explained by two accounts: an acoustic plus articulatory account would be that the listeners relied more on articulatory information when identifying /s/ but not /ʃ/; and a purely acoustic account would be that the asymmetry was only a result of small residual acoustic differences. While the acoustic plus articulatory account cannot explain why articulatory information influenced only the identification of the /s/ stimuli, even after adding a set of assumptions, the purely acoustic account allows us to explain our results consistently without additional assumptions.
Although our results cannot be used as evidence to reject the possibility that listeners will recover articulatory information, the results do suggest that even if we assume that articulatory information is recovered, acoustic information is more dominant than articulatory information in the identification process, at least for Canadian English /s/ and /ʃ/. / Thesis / Master of Science (MSc)
83

Spoken word recognition : a combined computational and experimental approach

Gaskell, Mark Gareth January 1994
The research reported in this thesis examines issues of word recognition in human speech perception. The main aim of the research is to assess the effect of regular variation in speech on lexical access. In particular, the effect of a type of neutralising phonological variation, assimilation of place of articulation, is examined. This variation occurs regressively across word boundaries in connected speech, altering the surface phonetic form of the underlying words. Two methods of investigation are used to explore this issue. Firstly, experiments using cross-modal priming and phoneme monitoring techniques are used to examine the effect of variation on the matching process between speech input and lexical form. Secondly, simulated experiments are performed using two computational models of speech recognition: TRACE (McClelland & Elman, 1986) and a simple recurrent network. The priming experiments show that the mismatching effects of a phonological change on the word-recognition process depend on their viability, as defined by phonological constraints. This implies that speech perception involves a process of context-dependent inference that recovers the abstract underlying representation of speech. Simulations of these and other experiments are then reported using a simple recurrent network model of speech perception. The model accommodates the results of the priming studies and predicts that similar phonological context effects will occur in nonwords. Two phoneme monitoring studies support this prediction, but also show interaction between lexical status and viability, implying that phonological inference relies on both lexical and phonological constraints. A revision of the network model is proposed which learns the mapping from the surface form of speech to semantic and phonological representations.
84

Effects of speech and noise on Cantonese speech intelligibility

Mak, Cheuk-yan, Charin., 麥芍欣. January 2006
published_or_final_version / abstract / Speech and Hearing Sciences / Master / Master of Science in Audiology
85

Developmental and cultural factors of audiovisual speech perception in noise

Reetzke, Rachel Denise 16 September 2014
The aim of this project is two-fold: 1) to investigate developmental differences in intelligibility gains from visual cues in speech perception in noise, and 2) to examine how different types of maskers modulate visual enhancement across age groups. A secondary aim of this project is to investigate whether or not bilingualism differentially modulates audiovisual integration during speech-in-noise tasks. To that end, child and adult, monolingual and bilingual participants completed speech-perception-in-noise tasks with three within-subject variables: (1) masker type: pink noise or two-talker babble, (2) modality: audio-only (AO) and audiovisual (AV), and (3) signal-to-noise ratio (SNR): 0 dB, -4 dB, -8 dB, -12 dB, and -16 dB. The findings revealed that, although both children and adults benefited from visual cues in speech-in-noise tasks, adults showed greater benefit at lower SNRs. Moreover, although child monolingual and bilingual participants performed comparably across all conditions, monolingual adults outperformed simultaneous bilingual adult participants. These results may indicate that the divergent use of visual cues in speech perception between bilingual and monolingual speakers occurs later in development. / text
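The SNR conditions listed in the abstract above imply that the masker level is set relative to the speech level. As a rough illustration only, the following sketch scales a masker so that the speech-to-masker power ratio matches a target SNR in dB, using SNR = 10·log10(P_speech / P_masker). The function names `mean_power` and `scale_masker_to_snr` are hypothetical and are not taken from the thesis, which does not describe its stimulus preparation at this level of detail.

```python
import math

def mean_power(samples):
    """Average power (mean of squared sample values)."""
    return sum(v * v for v in samples) / len(samples)

def scale_masker_to_snr(speech, masker, snr_db):
    """Return a scaled copy of `masker` so that
    10 * log10(P_speech / P_masker) equals `snr_db`.
    Hypothetical helper for illustration; assumes a non-silent masker."""
    target_ratio = 10.0 ** (snr_db / 10.0)  # desired P_speech / P_masker
    gain = math.sqrt(mean_power(speech) / (mean_power(masker) * target_ratio))
    return [gain * v for v in masker]

# Example: a synthetic "speech" tone and an alternating-sign masker at -8 dB SNR.
speech = [math.sin(0.1 * n) for n in range(1000)]
masker = [0.3 if n % 2 == 0 else -0.3 for n in range(1000)]
scaled = scale_masker_to_snr(speech, masker, -8.0)
achieved_db = 10.0 * math.log10(mean_power(speech) / mean_power(scaled))
```

Scaling the masker rather than the speech keeps the speech level constant across conditions; whether the study did so is an assumption here.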
86

Natural speech segmentation and literacy

Wood, Clare Patricia January 1996
No description available.
87

Telephone use and performance in cochlear implant candidates

Allen, Karen January 2007
Telephones are an integral part of everyday life in today's society. It is well known that hearing impaired people have difficulty understanding speech on the telephone. The ability to use the telephone is commonly reported as one of the many benefits of cochlear implantation. Assessment for a cochlear implant (CI) includes a variety of aspects related to communication and hearing ability. The case history notes whether the person can use the telephone. The first purpose of the present study was to identify whether the inability to use the telephone could be used as a predictor of suitability for a cochlear implant. The second was to determine whether telephone ability could be assessed by self-reported measures. The participants were 13 severely to profoundly hearing impaired people who had previously undergone candidacy assessment for a cochlear implant. Each participant was evaluated on their use and understanding of speech on the telephone. Participants were separated into two groups: those who were candidates for a cochlear implant and those who were not. Speech perception was evaluated using a recording of CUNY sentences on the telephone. Results indicated that cochlear implant candidates correctly perceived a significantly lower number of words on the telephone than non-candidates. Use of the telephone was evaluated using a 51-item questionnaire. Results indicated that there was no significant difference in self-reported use of the telephone between cochlear implant candidates and non-candidates. The differences in speech perception on the telephone were most likely due to the overall better hearing levels of the non-candidates. The clinical implications of the present study are considered.
88

Testing template and testing concept of operations for speaker authentication technology

Sipko, Marek M. 09 1900
This thesis documents the findings of developing a generic testing template and supporting concept of operations for speaker verification technology as part of the Iraqi Enrollment via Voice Authentication Project (IEVAP). The IEVAP is an Office of the Secretary of Defense sponsored research project commissioned to study the feasibility of speaker verification technology in support of the Global War on Terrorism security requirements. The intent of this project is to contribute toward the future employment of speech technologies in a variety of coalition military operations by developing a pilot proof-of-concept system that integrates speaker verification and automated speech recognition technology into a mobile platform to enhance warfighting capabilities. In this phase of the IEVAP, NPS developed a generic testing template and supporting concept of operations for speaker authentication technology. The intent of this project was to contribute toward the future employment of speech technologies in a variety of coalition military operations by developing a testing template along with a concept of operations to conduct such testing.
89

Recognition of in-ear microphone speech data using multi-layer neural networks

Bulbuller, Gokhan. 03 1900
Speech collected through a microphone placed in front of the mouth has been the primary source of data collection for speech recognition. There are only a few speech recognition studies using speech collected from the human ear canal. In this study, a speech recognition system is presented, specifically an isolated word recognizer which uses speech collected from the external auditory canals of the subjects via an in-ear microphone. Currently, the vocabulary is limited to seven words that can be used as control commands for a wide variety of applications. The speech segmentation task is achieved by using the short-time signal energy parameter and the short-time energy-entropy feature (EEF), and by incorporating some heuristic assumptions. Multi-layer feedforward neural networks with two-layer and three-layer network configurations are selected for the word recognition task and use real cepstrum (RC) and mel-frequency cepstral coefficients (MFCCs) extracted from each segmented utterance as characteristic features for the word recognizer. Results show that the neural network configurations investigated are viable choices for this specific recognition task as the average recognition rates obtained with the MFCCs as input features for the two-layer and three-layer networks are 94.731% and 94.61% respectively on the data investigated. Average recognition rates obtained using the RCs as features on the same network configurations are 86.252% and 86.7% respectively.
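The abstract above mentions segmenting utterances with the short-time signal energy. As a minimal sketch of that idea only, the code below computes frame energies and marks the region whose energy exceeds a fraction of the maximum. The frame length, hop size, relative threshold, and function names are assumptions made for illustration; the thesis additionally uses an energy-entropy feature and heuristics that are not reproduced here.

```python
import math

def short_time_energy(signal, frame_len, hop):
    """Energy (sum of squares) of each frame of `signal`."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies

def segment_by_energy(signal, frame_len=200, hop=100, rel_threshold=0.1):
    """Return (start_sample, end_sample) spanning all frames whose energy
    exceeds rel_threshold * max frame energy, or None for a silent signal.
    The default parameter values are illustrative assumptions."""
    energies = short_time_energy(signal, frame_len, hop)
    if not energies or max(energies) == 0:
        return None
    threshold = rel_threshold * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    first, last = active[0], active[-1]
    return first * hop, last * hop + frame_len

# Example: silence, then a 200 Hz tone sampled at 8 kHz, then silence.
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(2000)]
utterance = [0.0] * 1000 + tone + [0.0] * 1000
bounds = segment_by_energy(utterance)
```

The detected boundaries are only accurate to within about one frame, which is why a real segmenter would typically refine them with further features.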
90

Speech periodicity enhancement based on transform-domain signal decomposition and robust pitch estimation. / CUHK electronic theses & dissertations collection

January 2012
周期性是語音信號一個重要的特徵。周期性對於聲調語言更是不可或缺。在聲調語言中,音調的輪廓形狀決定了發音的語義。語音的周期性增強旨在修復受噪聲影響的語音信號的波形周期性,從而增強音調和聲調的聽覺感知。 / 本論文提出了一種新的語音周期性增強方法。在該方法中,語音的周期性增強通過對線性預測殘差信號在變換域的周期/非周期分解來達到。線性預測殘差信號一般被認為是語音周期性的主要載體,該信號以基音同步的方式通過重疊頻率變換在變換域被分解。周期性增強通過對變換參數的加權來達到:代表周期性成分的變換參數被增強,而代表非周期成分的變換參數被削弱。本文提出和評估了三種設置變換參數權重的方法。這三種權重分別是固定權重、自適應權重和維納(Wiener)濾波器參數。 / 為了保証有效的周期/非周期分解,本研究提出了一種新的基音周期估計方法。該方法使用瞬時累積峰值譜作為語音的諧波特徵表示;噪聲對峰值譜影響的概率分布用高斯混合密度模型來表示。基音周期的估計問題則表示為l₁規則化的最大釋然估計。對於該非凸優化問題,本文提出了兩種凸優化方法來近似求解。本文提出的基音周期估計方法優于傳統方法,它在低信噪比的條件下能夠取得較高的估計准確率。 / 本文對提出的語音周期性增強方法進行了全面的實驗和評估。實驗結果表明,該方法能夠有效地修復受損語音的諧波結搆和波形周期性。對比其他測試的語音和語音周期性增強方法,本文的新方法能夠更顯著地提高語音的質量。其輸出語音音質的客觀測量參數,例如SNR和PESQ,優于其他方法。 / Periodicity is an important attribute of speech signals. It is an essential element of tonal languages, where the meaning of a word is determined by the pitch contour. Speech periodicity enhancement is the process of restoring waveform periodicity of noise-corrupted speech, in order to improve human perception of pitch and tone in noisy environments. / This thesis presents a novel approach to speech periodicity enhancement. The enhancement is achieved through periodic-aperiodic decomposition of the linear prediction residual signal in a transform domain. Transform coefficients that represent the periodic component are amplified to enhance the periodicity, and those coefficients representing the aperiodic components are attenuated to suppress the noise. We propose and evaluate different methods of assigning coefficient weights for periodicity enhancement. These methods include simple fixed weights, adaptive weights, and transform-domain Wiener filtering. / As a key component for periodic-aperiodic decomposition, a novel method of robust pitch estimation is developed. The temporally accumulated peak spectrum is proposed as a robust representation of speech harmonics. A Gaussian mixture model is employed to model the effect of noise on the peak spectrum. 
Pitch estimation is formulated as a problem of l₁-regularized maximum likelihood estimation, in which prior information is exploited. Two convex optimization approaches are developed to solve the associated non-convex optimization problem. The proposed pitch estimation method significantly outperforms the conventional methods. It attains high estimation accuracy for various types of noise at very low signal-to-noise ratio (e.g., -5 dB). / Experimental results confirm that with the proposed approach of periodicity enhancement, speech harmonic structure and waveform periodicity can be effectively restored. Compared with other speech and periodicity enhancement methods evaluated in this study, the proposed method can produce speech outputs with noticeably higher quality in terms of different objective measurements, such as SNR and PESQ. / Detailed summary in vernacular field only. / Huang, Feng. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2012. / Includes bibliographical references (leaves 130-143). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese. 
/ Chapter 1 --- Introduction / Chapter 1.1 --- Speech enhancement / Chapter 1.2 --- Speech periodicity and enhancement / Chapter 1.3 --- Research motivations and objectives / Chapter 1.4 --- Thesis outline
/ Chapter I --- Transform-domain signal decomposition
/ Chapter 2 --- Speech representation model / Chapter 2.1 --- Overview of speech modeling / Chapter 2.2 --- Speech analysis / Chapter 2.2.1 --- Linear prediction analysis / Chapter 2.2.2 --- Constant-pitch warping / Chapter 2.2.3 --- Two-stage lapped frequency transforms / Chapter 2.2.4 --- Signal segmentation / Chapter 2.3 --- Speech synthesis / Chapter 2.4 --- Applications / Chapter 2.4.1 --- Speech coding / Chapter 2.4.2 --- Speech modification
/ Chapter 3 --- Signal decomposition for periodicity enhancement / Chapter 3.1 --- Periodic-aperiodic decomposition / Chapter 3.2 --- Periodicity enhancement of noisy speech / Chapter 3.2.1 --- Noise effect on transform coefficients / Chapter 3.2.2 --- Principle of periodicity enhancement / Chapter 3.2.3 --- Research focuses / Chapter 3.3 --- Related issues for noisy speech / Chapter 3.3.1 --- LP coefficient estimation / Chapter 3.3.2 --- Model adjustments
/ Chapter II --- Robust pitch estimation
/ Chapter 4 --- Pitch estimation with temporally accumulated peak spectrum / Chapter 4.1 --- Review of pitch estimation methods / Chapter 4.2 --- Peak spectrum and inter-frame spectrum similarity / Chapter 4.3 --- Temporally accumulated peak spectrum / Chapter 4.4 --- Pitch estimation using autocorrelation of TAPS / Chapter 4.5 --- Experimental evaluation / Chapter 4.5.1 --- Test data / Chapter 4.5.2 --- Performance metrics / Chapter 4.5.3 --- Accumulated frame number / Chapter 4.5.4 --- Experimental results
/ Chapter 5 --- Pitch estimation using sparse estimation techniques / Chapter 5.1 --- Sparse representation of TAPS / Chapter 5.2 --- Sparse weight estimation using l₁-regularized minimization / Chapter 5.2.1 --- Least absolute shrinkage and selection operator / Chapter 5.2.2 --- Gaussian mixture distribution for noise effect / Chapter 5.2.3 --- Convex approximation / Chapter 5.2.4 --- Difference-of-convex programming / Chapter 5.3 --- Pitch estimation from sparse weight vector / Chapter 5.4 --- Experimental evaluation / Chapter 5.4.1 --- Experiment settings / Chapter 5.4.2 --- Peak spectrum exemplar set / Chapter 5.4.3 --- Gaussian mixture density / Chapter 5.4.4 --- Experimental results / Chapter 5.5 --- Summary of robust pitch estimation
/ Chapter III --- Speech periodicity enhancement
/ Chapter 6 --- Transform-domain coefficient weighting / Chapter 6.1 --- Overview of the proposed framework / Chapter 6.2 --- Transform coefficient weighting / Chapter 6.3 --- Experimental evaluation / Chapter 6.3.1 --- Experiment settings / Chapter 6.3.2 --- Experimental results
/ Chapter 7 --- Adaptive coefficient weighting / Chapter 7.1 --- Motivation of adaptive weights / Chapter 7.2 --- Energy concentration of voiced and unvoiced speech / Chapter 7.2.1 --- Energy concentration measures / Chapter 7.2.2 --- Voiced/Unvoiced discrimination / Chapter 7.3 --- Pitch estimation confidence / Chapter 7.3.1 --- Basis of confidence measure / Chapter 7.3.2 --- Robustness of the confidence measure / Chapter 7.4 --- Adaptive coefficient weighting / Chapter 7.5 --- Experimental evaluation
/ Chapter 8 --- Transform-domain Wiener filtering / Chapter 8.1 --- Transform-domain Wiener filtering for periodicity enhancement / Chapter 8.1.1 --- The MMSE optimal Wiener filter / Chapter 8.1.2 --- Wiener filter for periodic component / Chapter 8.1.3 --- Wiener filter for aperiodic components / Chapter 8.2 --- Filter parameter estimation / Chapter 8.2.1 --- Filter parameters for aperiodic components / Chapter 8.2.2 --- Filter parameters for periodic component / Chapter 8.3 --- Experimental evaluation / Chapter 8.4 --- Summary of speech periodicity enhancement
/ Chapter 9 --- Conclusions and future directions / Chapter 9.1 --- Conclusions / Chapter 9.2 --- Contributions / Chapter 9.3 --- Future directions
/ Chapter A --- Algorithms for LP coefficient estimation / Chapter A.1 --- Kalman filtering of noisy speech / Chapter A.2 --- Codebook driven approach
/ Bibliography
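The record above treats pitch estimation as the key ingredient of periodicity enhancement. For orientation only, the sketch below shows a conventional time-domain autocorrelation pitch estimator of the general kind such work is usually compared against. It is not the thesis's method (which uses a temporally accumulated peak spectrum and l₁-regularized maximum likelihood estimation); the function name, search range, and test signal are all assumptions.

```python
import math

def autocorr_pitch(signal, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency in Hz as sample_rate / lag, where
    lag maximizes the energy-normalized autocorrelation within the search
    range [f_min, f_max]. Baseline illustration only, not the thesis method."""
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(signal) - 1)
    energy = sum(x * x for x in signal)
    if energy == 0 or lag_min < 1 or lag_min > lag_max:
        return None
    best_lag, best_score = None, -float("inf")
    for lag in range(lag_min, lag_max + 1):
        # Correlate the signal with a copy of itself delayed by `lag` samples.
        score = sum(signal[n] * signal[n + lag]
                    for n in range(len(signal) - lag)) / energy
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Example: a clean 100 Hz tone with a weaker second harmonic, sampled at 8 kHz.
rate = 8000
voiced = [math.sin(2 * math.pi * 100 * n / rate)
          + 0.5 * math.sin(2 * math.pi * 200 * n / rate)
          for n in range(800)]
f0 = autocorr_pitch(voiced, rate)
```

A plain autocorrelation peak like this degrades quickly in noise, which is the situation the robust estimator described in the abstract is designed for.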
