Global ETD Search

21	Vector quantization applied to speech coding in the wireless environment Morgenstern, Robert M. 29 July 2009 This thesis describes the development of the Voice Coding Development and Research (VoCoDeR) System, a software tool for testing and development of new speech coding methods. This tool enables a researcher to build a voice encoder and speech filters, determine an optimal bit allocation scheme, and/or create an error correction scheme, as desired. Using a channel simulation tool such as BERSIM, the user can create bit error patterns to corrupt the data and then decode the speech for playback and analysis. The system is based upon the North American Digital Cellular (NADC) 8 kbps Vector-Sum Excited Linear Prediction (VSELP) speech coder and is currently capable of simulating the complete IS-54 source and channel coding scheme. The system is tested using Multi-Stage Vector Quantization and Finite-State Vector Quantization (FSVQ) applied to the linear prediction coefficients. FSVQ provides significant bit rate savings over previous methods of quantization. A variety of coefficient representations are compared, including log-area ratios, arcsine reflection coefficients, line spectrum pairs and immittance spectrum pairs. This has allowed the recently introduced immittance spectrum pairs to be tested using vector quantization. Multiple distortion measures are also examined. The VoCoDeR System provides a tool that will allow an engineer to work on new speech coding algorithms or to determine an optimal source and channel coding scheme. / Master of Science. LD5655.V855 1994.M674. Coding theory. Vocoder.
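Illustration for entry 21: multi-stage vector quantization encodes a vector with a first codebook and then encodes the remaining residual with the next codebook. The Python sketch below shows that general idea only. The codebooks, the squared-error distortion, and the function names are illustrative assumptions and are not taken from the thesis or the VoCoDeR System.

```python
def nearest(codebook, vec):
    """Return the index of the codeword closest to vec (squared error)."""
    def dist(cw):
        return sum((c - v) ** 2 for c, v in zip(cw, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def msvq_encode(stages, vec):
    """Encode vec stage by stage; each stage quantizes the previous residual."""
    indices = []
    residual = list(vec)
    for codebook in stages:
        idx = nearest(codebook, residual)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def msvq_decode(stages, indices):
    """Reconstruct the vector as the sum of the selected codewords."""
    out = [0.0] * len(stages[0][0])
    for codebook, idx in zip(stages, indices):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical two-stage codebooks for 2-dimensional vectors.
STAGES = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],    # coarse stage
    [[0.0, 0.0], [0.25, 0.0], [0.0, -0.25]],  # refinement stage
]

indices = msvq_encode(STAGES, [1.2, 0.9])
reconstruction = msvq_decode(STAGES, indices)
```

Only the list of stage indices needs to be transmitted; the decoder holds the same codebooks and sums the selected codewords.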
22	Audio compression and speech enhancement using temporal masking models Gunawan, Teddy Surya, Electrical Engineering & Telecommunications, Faculty of Engineering, UNSW January 2007 Of the few existing models of temporal masking applicable to problems such as compression and enhancement, none are based on empirical data from the psychoacoustic literature, presumably because the multidimensional nature of the data makes the derivation of tractable functional models difficult. This thesis presents two new functional models of the temporal masking effect of the human auditory system, and their exploitation in audio compression and speech enhancement applications. Traditional audio compression algorithms do not completely utilise the temporal masking properties of the human auditory system, relying solely on simultaneous masking models. A perceptual wavelet packet-based audio coder has been devised that incorporates the first of the developed temporal masking models, combined with simultaneous masking models in a novel manner. An evaluation of the coder using both objective (PEAQ, ITU-R BS.1387) and extensive subjective tests (ITU-R BS.1116) revealed a bitrate reduction of more than 17% compared with existing simultaneous masking-based audio coders, while preserving transparent quality. In addition, the oversampled wavelet packet transform (ODWT) has been newly applied to obtain alias-free coefficients for more accurate masking threshold calculation. Finally, a low-complexity scalable audio coding algorithm using the ODWT-based thresholds and temporal masking has been investigated. Currently, there is a strong need for innovative speech enhancement algorithms that exploit the auditory masking effects of the human auditory system and perform well at very low signal-to-noise ratios. Existing competitive noise suppression algorithms and those that incorporate simultaneous masking were examined and evaluated for their suitability as baseline algorithms. Objective measures using PESQ (ITU-T P.862) and subjective measures (ITU-T P.835) demonstrate that the proposed enhancement scheme, based on a second new masking model, outperformed the seven baseline speech enhancement methods by at least 6-20% depending on the SNR. Hence, the proposed speech enhancement scheme exploiting temporal masking effects has good potential across many types and intensities of environmental noise. Keywords: human auditory system; temporal masking; simultaneous masking; audio compression; speech enhancement; subjective test; objective test. Coding theory. Signal processing. Speech processing systems. Vocoder. Wavelets (Mathematics)
23	A comparative study of time-stretching algorithms for audio signals Markle, Blake L. January 2001 Algorithms exist which will perform independent transformations on the frequency or duration of a digital audio signal. These processes have different results on different types of audio signals. A comparative study of granular and phase vocoder algorithms, their implementation, and their respective effects on audio signals was made to determine which algorithm is best suited to a particular type of audio signal. Computer sound processing. Vocoder.
24	Vector quantization in residual-encoded linear prediction of speech Abramson, Mark January 1983 No description available. Speech processing systems. Coding theory. Data compression (Telecommunication) Vocoder.
25	Audio compression and speech enhancement using temporal masking models Gunawan, Teddy Surya, Electrical Engineering & Telecommunications, Faculty of Engineering, UNSW January 2007 Of the few existing models of temporal masking applicable to problems such as compression and enhancement, none are based on empirical data from the psychoacoustic literature, presumably because the multidimensional nature of the data makes the derivation of tractable functional models difficult. This thesis presents two new functional models of the temporal masking effect of the human auditory system, and their exploitation in audio compression and speech enhancement applications. Traditional audio compression algorithms do not completely utilise the temporal masking properties of the human auditory system, relying solely on simultaneous masking models. A perceptual wavelet packet-based audio coder has been devised that incorporates the first of the developed temporal masking models, combined with simultaneous masking models in a novel manner. An evaluation of the coder using both objective (PEAQ, ITU-R BS.1387) and extensive subjective tests (ITU-R BS.1116) revealed a bitrate reduction of more than 17% compared with existing simultaneous masking-based audio coders, while preserving transparent quality. In addition, the oversampled wavelet packet transform (ODWT) has been newly applied to obtain alias-free coefficients for more accurate masking threshold calculation. Finally, a low-complexity scalable audio coding algorithm using the ODWT-based thresholds and temporal masking has been investigated. Currently, there is a strong need for innovative speech enhancement algorithms that exploit the auditory masking effects of the human auditory system and perform well at very low signal-to-noise ratios. Existing competitive noise suppression algorithms and those that incorporate simultaneous masking were examined and evaluated for their suitability as baseline algorithms. Objective measures using PESQ (ITU-T P.862) and subjective measures (ITU-T P.835) demonstrate that the proposed enhancement scheme, based on a second new masking model, outperformed the seven baseline speech enhancement methods by at least 6-20% depending on the SNR. Hence, the proposed speech enhancement scheme exploiting temporal masking effects has good potential across many types and intensities of environmental noise. Keywords: human auditory system; temporal masking; simultaneous masking; audio compression; speech enhancement; subjective test; objective test. Coding theory. Signal processing. Speech processing systems. Vocoder. Wavelets (Mathematics)
26	Feature preservation and negated music in a phase vocoder sound representation Apel, Theodore R. January 2008 Thesis (Ph. D.)--University of California, San Diego, 2008. / Title from first page of PDF file (viewed Jun. 17, 2008). Available via ProQuest Digital Dissertations. Vita. Includes bibliographical references: p. 92-98.
27	Novas abordagens para codificação de voz e reconhecimento automático de locutor projetadas via mascaramento pleno em frequência por oitava / New approaches to voice coding and automatic speaker recognition designed via full frequency masking per octave SOTERO FILHO, Roberto Fernando Batista 30 October 2009 CAPES / Digital processing of speech signals (DPSS) is one of the most important areas of digital signal processing. Voice coding and automatic speaker recognition (ASR) are relevant DPSS sub-fields. This dissertation introduces a new vocoder scheme based on full frequency masking per octave (FFMO), together with a new spectral stuffing technique that uses the beta probability distribution. The FFMO method simplifies the magnitude of the voice spectrum by retaining just one spectral sample per octave. This approach offers a tradeoff between bit rate (e.g., 2.7 kbit/s), complexity, intelligibility and voice quality, and led to a new binary file format for digital voice, termed voz. A novel, low-complexity ASR technique based on one of the key properties of human auditory perception, auditory frequency masking, is also presented. The feature vectors of voice frames are represented by the average amplitude of the largest spectral samples within each octave. Both text-dependent and text-independent speaker recognition are investigated. The results show that the proposed algorithm offers a tradeoff between complexity and the rate of correct identifications (typically 85%), making it attractive for embedded systems. Electrical Engineering. Vocoder. Automatic speaker recognition. Frequency masking.
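Illustration for entry 27: the one-sample-per-octave simplification can be sketched in Python as follows. This is a minimal sketch under stated assumptions: octave bands are taken as the bin ranges [1, 2), [2, 4), [4, 8), and so on, the retained sample is the largest magnitude in each band, and the function name `keep_peak_per_octave` is hypothetical. The dissertation's actual band edges, sample selection rule, and beta-distribution spectral stuffing are not reproduced here.

```python
def keep_peak_per_octave(magnitudes):
    """Zero every bin except the largest one in each octave band.

    magnitudes[k] is the spectral magnitude at bin k; bin 0 (DC) is dropped.
    Assumed bands: bins [1, 2), [2, 4), [4, 8), ... up to the spectrum length.
    """
    n = len(magnitudes)
    simplified = [0.0] * n
    peaks = []  # (bin index, magnitude) retained for each octave
    lo = 1
    while lo < n:
        hi = min(2 * lo, n)
        k = max(range(lo, hi), key=lambda i: magnitudes[i])
        simplified[k] = magnitudes[k]
        peaks.append((k, magnitudes[k]))
        lo = hi
    return simplified, peaks


spectrum = [0.1, 0.9, 0.2, 0.7, 0.3, 0.1, 0.8, 0.4]
simplified, peaks = keep_peak_per_octave(spectrum)
```

Because the number of octaves grows only logarithmically with the number of bins, a frame is described by a handful of values, which is consistent with the low bit rate quoted in the abstract.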
28	Hlasové kodéry pro nízké přenosové rychlosti / Low bit rate voice encoders Leitner, Jakub January 2009 This thesis deals with coders and voice coders used in speech signal processing. The aim is to create a comprehensive overview of coders and voice coders, including a description of their properties; in the second part of the thesis, speech processing algorithms and methods are simulated in Matlab Simulink. The basic methods of speech processing and a parametric LPC voice coder were simulated in the time domain. The LPC voice coder model implements the algorithms for obtaining speech segment parameters: classification of voiced and unvoiced speech segments, LPC analysis, and pitch detection. The output is a parametric signal that enables a receiver to synthesize a speech signal. Appendix 1 lists the names or standard numbers of coders and their properties; Appendix 2 gives an overview of speech processing methods.
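Illustration for entry 28: LPC analysis of a speech segment is commonly carried out with the autocorrelation method and the Levinson-Durbin recursion, and the Python sketch below shows that standard approach. Whether the thesis's Simulink model uses this exact method is an assumption, and the function names are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] such that
    x[n] is approximated by sum(a[k] * x[n - k]).

    Assumes r[0] > 0 (a non-silent frame).
    Returns (coefficients, residual prediction error energy).
    """
    a = []          # a[j] holds the coefficient for lag j + 1
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - 1 - j] for j in range(i - 1))
        k = acc / error  # reflection coefficient for this step
        a = [a[j] - k * a[i - 2 - j] for j in range(i - 1)] + [k]
        error *= 1.0 - k * k
    return a, error


# Autocorrelation of an ideal first-order process with correlation 0.5.
coeffs, residual_energy = levinson_durbin([1.0, 0.5, 0.25], 2)
```

For this input the recursion recovers a single non-zero predictor coefficient, as expected for a first-order process; the second coefficient comes out as zero.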
29	Vector quantization in residual-encoded linear prediction of speech Abramson, Mark January 1983 No description available. Speech processing systems. Vocoder. Data compression (Telecommunication) Coding theory.
30	A comparative study of time-stretching algorithms for audio signals Markle, Blake L. January 2001 No description available. Computer sound processing. Vocoder.
