Global ETD Search

1	Evaluation of Methods for Sound Source Separation in Audio Recordings Using Machine Learning
Gidlöf, Amanda. January 2023.

Sound source separation is a popular and active research area, especially with modern machine learning techniques. In this thesis, the focus is on single-channel separation of two speakers into individual streams, and specifically considering the case where two speakers are also accompanied by background noise. There are different methods to separate speakers and in this thesis three different methods are evaluated: the Conv-TasNet, the DPTNet, and the FaSNetTAC. The methods were used to train models to perform the sound source separation. These models were evaluated and validated through three experiments. Firstly, previous results for the chosen separation methods were reproduced. Secondly, appropriate models applicable for NFC's datasets and applications were created, to fulfill the aim of this thesis. Lastly, all models were evaluated on an independent dataset, similar to datasets from NFC. The results were evaluated using the metrics SI-SNRi and SDRi. This thesis provides recommended models and methods suitable for NFC applications, especially concluding that the Conv-TasNet and the DPTNet are reasonable choices.

Keywords: Sound source separation; signal processing; audio processing; speech enhancement; speaker identification; speech separation; speech processing; speaker separation; single channel; multi channel; cocktail party problem; machine learning; neural network; deep learning; deep neural network; convolutional network; dual-path network; filter-and-sum network; recurrent network; beamforming

Subject: Communication Systems (Kommunikationssystem)
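
For readers unfamiliar with the SI-SNRi metric named in the abstract, the following is a minimal sketch of how the scale-invariant signal-to-noise ratio and its improvement over the unprocessed mixture are commonly computed. It is not taken from the thesis: the function names `si_snr` and `si_snr_improvement`, the `eps` stabilizer, and the use of plain Python lists are assumptions made for illustration only.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences.

    Illustrative sketch only; `eps` is an assumed stabilizer that avoids
    division by zero and log of zero.
    """
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    n = len(reference)
    # Remove the mean so a constant offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to obtain the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    # Whatever is left after removing the target component counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: gain in SI-SNR of the separated estimate over the raw mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)


# Toy usage with synthetic signals (hypothetical data, not from the thesis).
reference = [math.sin(0.1 * i) for i in range(200)]
interferer = [math.sin(0.37 * i + 1.0) for i in range(200)]
mixture = [r + v for r, v in zip(reference, interferer)]
estimate = [r + 0.1 * v for r, v in zip(reference, interferer)]
improvement_db = si_snr_improvement(estimate, mixture, reference)
```

SDRi follows the same "improvement over the mixture" pattern, but with a signal-to-distortion ratio in place of SI-SNR; its exact definition depends on the evaluation toolkit used, so it is not sketched here.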
