  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Digital signal processing techniques for speech enhancement in hearing aids

Canagarajah, Cedric Nishanthan January 1993
No description available.
2

The Cocktail Party Problem: Solutions and Applications.

Wiklund, Karl 02 1900
The human auditory system is remarkable in its ability to function in busy acoustic environments. It is able to selectively focus attention on and extract a single source of interest in the midst of competing acoustic sources, reverberation and motion. Yet this problem, which is so elementary for most human listeners, has proven to be a very difficult one to solve computationally. Even more difficult has been the search for practical solutions to problems to which digital signal processing can be applied. Many applications that would benefit from a solution, such as hearing aid systems, industrial noise control, or audio surveillance, require that any such solution be able to operate in real time and consume only a minimal amount of computational resources.
In this thesis, a novel solution to the cocktail party problem is proposed. This solution is rooted in the field of Computational Auditory Scene Analysis, and makes use of insights regarding the processing carried out by the early human auditory system in order to effectively suppress interference. These neurobiological insights have been adapted so as to produce a solution to the cocktail party problem that is practical from an engineering point of view. The proposed solution has been found to be robust under a wide range of realistic environmental conditions, including spatially distributed interference and reverberation. / Thesis / Doctor of Philosophy (PhD)
3

Ritual Patterns in "The Cocktail Party"

Miller, David L. January 1963
No description available.
4

Metaphysical Parallels Between The Cocktail Party and The Book of Job

Pak, Tae-yong January 1966
No description available.
5

MELHORAMENTO DO SINAL DE VOZ POR INIBIÇÃO LATERAL E MASCARAMENTO BINAURAL / IMPROVEMENT OF THE SIGNAL VOICE BY LATERAL INHIBITION AND BINAURAL MASKING

Nascimento, Edil James de Jesus 02 April 2004
The human auditory system can perform several tasks that would be useful in engineering applications. One of them is the ability to separate sound sources, allowing a listener to focus on a single source in a noisy environment. Large investments have been made in developing technologies for machine speech recognition in real environments. To that end, various computational processing techniques have been proposed to reduce ambient noise and enhance the desired signal in complex acoustic environments (the cocktail party). These techniques are motivated by the model of the human auditory system in its different stages. In this work, we developed an algorithm to improve the processing of a speech signal based on the binaural hearing model. After receiving the mixed signals from two microphones, the algorithm increases the intelligibility of the higher-energy signal at one of the receivers. Using two speakers, each assumed to be closer to one of the microphones, we applied the concepts of lateral inhibition and binaural masking to recover the higher-energy speech signal at one of the receivers. The algorithm was developed in MATLAB and compared with another algorithm that does not use lateral inhibition to recover the desired signal. The results, evaluated by computing the relative error and the MOS scale, showed that using lateral inhibition in the recovery improves the relative error between the desired and recovered signals and, consequently, the quality of the recovered signal.
6

Vícekanálové metody zvýrazňování řeči / Multi-channel Methods of Speech Enhancement

Zitka, Adam January 2008
This thesis deals with multi-channel methods of speech enhancement, which record signals with several microphones. From the recorded mixtures, individual speakers can be separated and noise can be reduced, for example with the use of neural networks. The task of separating speakers is known as the cocktail-party effect. The main method for solving this problem is independent component analysis (ICA). The thesis first describes its theoretical foundations and presents the conditions and requirements for its application. ICA methods separate mixtures by searching for components that are as non-Gaussian as possible, using mathematical properties of the signals such as kurtosis and entropy. Signals mixed artificially on a computer can be separated relatively well using, for example, the FastICA algorithm or gradient-ascent ICA. The situation is more difficult for signals recorded in a real environment, because the separation of people speaking at the same time is affected by many factors, such as the acoustic properties of the room, noise, delays, reflections from the walls, and the position or type of the microphones. The thesis presents an approach to independent component analysis in the frequency domain that can also successfully separate recordings made in a real environment.
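The idea of separating mixtures by maximizing non-Gaussianity can be illustrated with a minimal sketch. It is not the thesis's method (the abstract names FastICA and gradient-ascent ICA): it assumes two artificially and instantaneously mixed channels, whitens them, and then brute-force searches a rotation angle that maximizes the absolute excess kurtosis of the outputs. The function names, the mixing coefficients, and the uniform test sources are all illustrative assumptions.

```python
import math
import random


def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (zero for a Gaussian signal)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    fourth = sum((v - mean) ** 4 for v in x) / n
    return fourth / var ** 2 - 3.0


def correlation(x, y):
    """Pearson correlation, used here only to check the separation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rotate(u, v, angle):
    """Apply a 2-D rotation to the pair of signals (u, v)."""
    c, s = math.cos(angle), math.sin(angle)
    return ([c * a + s * b for a, b in zip(u, v)],
            [-s * a + c * b for a, b in zip(u, v)])


def whiten(x1, x2):
    """Centre the two channels, decorrelate them, and scale to unit variance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    a = sum(v * v for v in c1) / n
    d = sum(v * v for v in c2) / n
    b = sum(u * v for u, v in zip(c1, c2)) / n
    # Rotate onto the principal axes of the covariance [[a, b], [b, d]].
    p1, p2 = rotate(c1, c2, 0.5 * math.atan2(2.0 * b, a - d))
    sd1 = math.sqrt(sum(v * v for v in p1) / n)
    sd2 = math.sqrt(sum(v * v for v in p2) / n)
    return [v / sd1 for v in p1], [v / sd2 for v in p2]


def separate(x1, x2, steps=90):
    """Grid-search the rotation of the whitened data with maximal |kurtosis|."""
    z1, z2 = whiten(x1, x2)
    best_score, best = -1.0, None
    for k in range(steps):
        y1, y2 = rotate(z1, z2, (math.pi / 2) * k / steps)
        score = abs(excess_kurtosis(y1)) + abs(excess_kurtosis(y2))
        if score > best_score:
            best_score, best = score, (y1, y2)
    return best


def demo(n=2000, seed=0):
    """Mix two uniform (non-Gaussian) sources and try to recover them."""
    rng = random.Random(seed)
    s1 = [rng.uniform(-1, 1) for _ in range(n)]
    s2 = [rng.uniform(-1, 1) for _ in range(n)]
    x1 = [1.0 * a + 0.6 * b for a, b in zip(s1, s2)]  # assumed mixing matrix
    x2 = [0.4 * a + 1.0 * b for a, b in zip(s1, s2)]
    return (s1, s2), separate(x1, x2)
```

Calling `demo()` returns the original sources and the estimated components; each estimate should correlate strongly with one source, up to sign and ordering, which ICA cannot determine. The sketch covers only artificial instantaneous mixing; the real-room case described in the abstract involves delays and reflections that it does not model.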
7

Time-Frequency Masking Performance for Improved Intelligibility with Microphone Arrays

Morgan, Joshua P. 01 January 2017
Time-Frequency (TF) masking is an audio processing technique useful for isolating an audio source from interfering sources. TF masking has been applied and studied in monaural and binaural applications, but has only recently been applied to distributed microphone arrays. This work focuses on evaluating the TF masking technique's ability to isolate human speech and improve speech intelligibility in an immersive "cocktail party" environment. In particular, an upper-bound on TF masking performance is established and compared to the traditional delay-sum and general sidelobe canceler (GSC) beamformers. Additionally, the novel technique of combining the GSC with TF masking is investigated and its performance evaluated. This work presents a resource-efficient method for studying the performance of these isolation techniques and evaluates their performance using both virtually simulated data and data recorded in a real-life acoustical environment. Further, methods are presented to analyze speech intelligibility post-processing, and automated objective intelligibility measurements are applied alongside informal subjective assessments to evaluate the performance of these processing techniques. Finally, the causes for subjective/objective intelligibility measurement disagreements are discussed, and it was shown that TF masking did enhance intelligibility beyond delay-sum beamforming and that the utilization of adaptive beamforming can be beneficial.
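As a rough illustration of TF masking, the sketch below builds an ideal binary mask, one commonly used oracle that keeps a time-frequency cell only when the target dominates the interference. Whether the thesis derives its upper bound this way is not stated in the abstract, so this is an assumption, as are the function names, the 0 dB threshold, and the toy power grids (rows are time frames, columns are frequency bins).

```python
import math


def ideal_binary_mask(target_power, interferer_power, threshold_db=0.0):
    """Return 1.0 where the local SNR exceeds threshold_db, else 0.0."""
    mask = []
    for t_row, i_row in zip(target_power, interferer_power):
        row = []
        for t, i in zip(t_row, i_row):
            # A small floor avoids taking the logarithm of zero.
            snr_db = 10.0 * math.log10((t + 1e-12) / (i + 1e-12))
            row.append(1.0 if snr_db > threshold_db else 0.0)
        mask.append(row)
    return mask


def apply_mask(mixture, mask):
    """Multiply each time-frequency cell of the mixture by its mask value."""
    return [[m * x for m, x in zip(m_row, x_row)]
            for m_row, x_row in zip(mask, mixture)]


# Toy example: 2 frames x 2 bins of target and interferer power.
target = [[4.0, 0.1], [0.2, 9.0]]
interferer = [[1.0, 2.0], [3.0, 0.5]]
mixture = [[5.0, 2.1], [3.2, 9.5]]  # assumes uncorrelated powers simply add

mask = ideal_binary_mask(target, interferer)
masked = apply_mask(mixture, mask)
```

The mask needs the separate target and interferer powers, which are available only in simulation. That is why it serves as an oracle reference and not as a deployable method.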
8

Perceptual Ruler for Quantifying Speech Intelligibility in Cocktail Party Scenarios

Brangers, Kirstin M 01 January 2013
Systems designed to enhance intelligibility of speech in noise are difficult to evaluate quantitatively because intelligibility is subjective and often requires feedback from large populations for consistent evaluations. Attempts to quantify the evaluation have included related measures such as the Speech Intelligibility Index (SII). These require separating speech and noise signals, which precludes their use on experimental recordings. This thesis develops a procedure using an Intelligibility Ruler (IR) for efficiently quantifying intelligibility. A calibrated Mean Opinion Score (MOS) method is also implemented in order to compare repeatability over a population of 24 subjective listeners. Results showed that subjects using the IR consistently estimated SII values of the test samples with an average standard deviation of 0.0867 between subjects on a scale from zero to one and R² = 0.9421. After a calibration procedure from a subset of subjects, the MOS method yielded similar results with an average standard deviation of 0.07620 and R² = 0.9181. While results suggest good repeatability of the IR method over a broad range of subjects, the calibrated MOS method is capable of producing results more closely related to actual SII values and is a simpler procedure for human subjects.
9

A Deep Learning Approach to Brain Tracking of Sound

Hermansson, Oscar January 2022
Objectives: Development of accurate auditory attention decoding (AAD) algorithms, capable of identifying the attended sound source from the speech-evoked electroencephalography (EEG) responses, could lead to new solutions for hearing-impaired listeners: neuro-steered hearing aids. Many of the existing AAD algorithms are either inaccurate or very slow. Therefore, there is a need to develop new EEG-based AAD methods. The first objective of this project was to investigate deep neural network (DNN) models for AAD and compare them to the state-of-the-art linear models. The second objective was to investigate whether generative adversarial networks (GANs) could be used for speech-evoked EEG data augmentation to improve the AAD performance.
Design: The proposed methods were tested on a dataset of 34 participants who performed an auditory attention task. They were instructed to attend to one of the two talkers in the front and ignore the talker on the other side and background noise behind them, while high-density EEG was recorded.
Main Results: The linear models had an average attended vs ignored speech classification accuracy of 95.87% and 50% for ∼30-second and 8-second time windows, respectively. A DNN model designed for AAD resulted in an average classification accuracy of 82.32% and 58.03% for ∼30-second and 8-second time windows, respectively, when trained only on the real EEG data. The results show that GANs generated relatively realistic speech-evoked EEG signals. A DNN trained with GAN-generated data resulted in an average accuracy of 90.25% for 8-second time windows. On shorter trials, the GAN-generated EEG data were shown to significantly improve classification performance when compared to models trained only on real EEG data.
Conclusion: The results suggest that DNN models can outperform linear models in AAD tasks, and that GAN-based EEG data augmentation can be used to further improve DNN performance. These results extend prior work and bring us closer to the use of EEG for decoding auditory attention in next-generation neuro-steered hearing aids.
10

Análise de componentes independentes aplicada à separação de sinais de áudio. / Independent component analysis applied to separation of audio signals.

Moreto, Fernando Alves de Lima 19 March 2008
This work studies Independent Component Analysis (ICA) for instantaneous mixtures, applied to the separation of audio signals (sources). Three algorithms for separating instantaneous mixtures are evaluated: FastICA, PP (Projection Pursuit) and PearsonICA. They share two basic principles: the sources must be statistically independent and non-Gaussian. To analyze the separation capability of each algorithm, two groups of experiments were carried out. In the first group, instantaneous mixtures were generated synthetically from predefined audio signals. In addition, instantaneous mixtures were generated from synthetic signals with specific characteristics, to evaluate the behavior of the algorithms in specific situations. In the second group, convolutive mixtures were generated in the acoustics laboratory of LPS at EPUSP. The PP algorithm, based on the Projection Pursuit technique commonly used in exploration and classification systems, is proposed for the separation of multiple sources as an alternative to the ICA model. Although the proposed PP method can be used to separate sources, it cannot be considered an ICA method, and extraction of the sources is not guaranteed. Finally, the experiments validate the algorithms studied.
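The abstract distinguishes instantaneous from convolutive mixtures. The following minimal sketch shows the two mixing models side by side; the function names, the mixing matrix, and the short filters are hypothetical values chosen for illustration and do not come from the thesis.

```python
def mix_instantaneous(A, sources):
    """Instantaneous model: x_i[n] = sum_j A[i][j] * s_j[n]."""
    n = len(sources[0])
    return [[sum(A[i][j] * sources[j][t] for j in range(len(sources)))
             for t in range(n)]
            for i in range(len(A))]


def mix_convolutive(filters, sources):
    """Convolutive model: x_i[n] = sum_j sum_k filters[i][j][k] * s_j[n - k].

    filters[i][j] is the FIR impulse response from source j to microphone i,
    standing in for delays and reflections on that acoustic path.
    """
    n = len(sources[0])
    out = []
    for mic_filters in filters:
        x = [0.0] * n
        for h, s in zip(mic_filters, sources):
            for k, tap in enumerate(h):
                for t in range(k, n):
                    x[t] += tap * s[t - k]
        out.append(x)
    return out


# Two unit impulses at different times make the filter taps visible.
sources = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
A = [[1.0, 0.25], [0.5, 1.0]]
filters = [[[1.0, 0.5], [0.25]], [[0.5], [1.0, 0.5]]]

x_inst = mix_instantaneous(A, sources)
x_conv = mix_convolutive(filters, sources)
```

With single-tap filters the convolutive model reduces to the instantaneous one, which is the case the three algorithms in the abstract are designed for. Longer filters add delayed copies of each source to every microphone signal.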
