1 |
Digital speech processing techniques
Solanki, Niranjan U January 2010
Digitized by Kansas Correctional Industries
|
2 |
Model-based single-microphone speech separation using conditional random fields
January 2014
Single-microphone speech separation aims to reconstruct two or more sources from a single speech mixture. It can serve as the front end for speech applications that demand robustness against interfering signals, such as information extraction from the sound streams of multimedia. Because this is an extreme case of the under-determined source separation problem, a unique solution for source reconstruction is unlikely to be achieved, but the most probable source observations can be obtained through statistical inference, given prior information about the sources in a statistical model-based setting. / The performance of statistical model-based methods has been progressively improved by the use of graphical models to organize the prior information. In this thesis, the exact and approximate statistical inference algorithms for single-microphone speech separation with factorial Hidden Markov models (HMM) are evaluated in terms of speech quality and computational complexity. The important role of state transitions in the source models is also investigated. / Model mis-specification is a major problem in model-based speech separation. These mis-specifications are caused by various factors, including a limited amount of training data and a finite number of acoustic states. Compared with generative approaches such as the factorial HMM, direct models such as conditional random fields (CRF) are considered more robust to model mis-specification because of their inherent discriminative ability. In this thesis, the application of CRFs to single-microphone speech separation is investigated. The posterior probabilities of acoustic states given the mixture, which are essential to minimum mean-square error estimation of the sources, are modeled as a maximum entropy probability distribution. The performance of the CRF formulations is further improved with a large-margin approach to parameter estimation. / Experimental results confirm that the CRF formulations achieve improved objective quality measures and automatic speech recognition accuracy for the reconstructed sources, especially when the sources compete at similar signal-to-signal ratios. Even with a simplified CRF formulation, the performance is still comparable to that of the factorial HMM. / Detailed summary in vernacular field only. / Yeung, Yu Ting. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2014. / Includes bibliographical references (leaves 102-118). / Abstracts also in Chinese.
|
3 |
A production system version of the Hearsay-II speech understanding system
McCracken, Donald L. January 1900
Revision of Thesis (Ph. D.)--Carnegie-Mellon University, 1978. / Includes bibliographical references (p. [133]-135) and index.
|
4 |
Pitch tracking and speech enhancement in noisy and reverberant environments
Wu, Mingyang, January 2003
Thesis (Ph. D.)--Ohio State University, 2003. / Title from first page of PDF file. Document formatted into pages; contains xvi, 149 p.; also includes graphics. Includes abstract and vita. Advisor: DeLiang Wang, Dept. of Computer and Information Science. Includes bibliographical references (p. 136-149).
|
5 |
Sub-band coding of speech with dynamic bit allocation
Rabipour, Rafi. January 1982
No description available.
|
6 |
Two dimensional prediction for data rate compression of LPC parameters
Marr, James Douglas 08 1900
No description available.
|
7 |
A low delay 16 kbit/sec coder for speech signals
Iyengar, Vasu January 1987
No description available.
|
8 |
Analysis of predictor mistracking in ADPCM speech coders
Yatrou, Paul M. January 1987
No description available.
|
9 |
Residual-excited linear predictive coding of speech
Kubina, James, 1956- January 1981
No description available.
|
10 |
Objective measures of speech quality
Quackenbush, Schuyler Reynier 05 1900
No description available.
|