Global ETD Search

Evaluation of Methods for Sound Source Separation in Audio Recordings Using Machine Learning

Sound source separation is a popular and active research area, particularly with modern machine learning techniques. This thesis focuses on single-channel separation of two speakers into individual streams, specifically the case where the two speakers are accompanied by background noise. Three separation methods are evaluated: the Conv-TasNet, the DPTNet, and the FaSNetTAC. Each method was used to train models that perform the sound source separation. These models were evaluated and validated in three experiments. First, previous results for the chosen separation methods were reproduced. Second, models suited to NFC's datasets and applications were created, fulfilling the aim of the thesis. Finally, all models were evaluated on an independent dataset similar to the datasets from NFC. The results were evaluated using the metrics SI-SNRi and SDRi. The thesis recommends models and methods suitable for NFC applications and concludes that the Conv-TasNet and the DPTNet are reasonable choices.
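For readers unfamiliar with the SI-SNRi metric named in the abstract, the following is a minimal sketch of how scale-invariant SNR and its improvement over the unprocessed mixture are commonly computed. It is not taken from the thesis: the function names `si_snr` and `si_snr_improvement`, the `eps` stabilizer, and the use of plain Python lists are assumptions made for illustration, and the thesis may use a different implementation or averaging scheme.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences.

    Illustrative helper (hypothetical name), following the common definition:
    project the zero-mean estimate onto the zero-mean reference and compare
    the energy of that projection with the energy of the residual.
    """
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    n = len(reference)

    # Remove the mean so a constant offset does not influence the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]

    # Target component: projection of the estimate onto the reference.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]

    # Everything in the estimate that is not explained by the reference.
    residual = [e - t for e, t in zip(est, target)]

    target_energy = sum(t * t for t in target)
    residual_energy = sum(x * x for x in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: gain in SI-SNR of the separated estimate over the raw mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)
```

SDRi is typically defined in the same way, as the difference between the signal-to-distortion ratio of the separated output and that of the mixture, so a positive value of either metric indicates that separation brought the output closer to the reference speaker.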

http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-192849

Sound source separation

signal processing

audio processing

speech enhancement

speaker identification

cocktail party problem

convolutional network

dual-path network

filter-and-sum network

recurrent network

beamforming

Communication Systems

Kommunikationssystem

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-192849
Date	January 2023
Creators	Gidlöf, Amanda
Publisher	Linköpings universitet, Kommunikationssystem
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
