
Audio fingerprinting for speech reconstruction and recognition in noisy environments

Liu, Feng 13 April 2017
Audio fingerprinting is a highly specific content-based audio retrieval technique. Given a short audio fragment as a query, an audio fingerprinting system can identify the particular file that contains the fragment in a large library potentially consisting of millions of audio files. In this thesis, we investigate the feasibility of applying audio fingerprinting to speech recognition in noisy environments by way of speech reconstruction. To reconstruct noisy speech, the speech is first divided into small segments of equal length. Audio fingerprinting is then used to find, for each noisy segment, the most similar segment in a large dataset of clean speech files. If the similarity is above a threshold, the noisy segment is replaced with the clean segment. Finally, all the segments, after this conditional replacement, are concatenated to form the reconstructed speech, which is sent to a traditional speech recognition system. The critical step in this procedure is using audio fingerprinting to find the clean speech segment in the dataset. To test its performance, we build a landmark-based audio fingerprinting system. Experimental results show that this baseline system performs well in traditional applications, but its accuracy in this new application is not as good as we expected. We then propose three strategies to improve the system, which yield better accuracy than the baseline. Finally, we integrate the improved audio fingerprinting system into a traditional speech recognition system and evaluate the performance of the whole system. / Graduate
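
The reconstruction procedure described in the abstract (segment, look up the most similar clean segment, replace if the similarity exceeds a threshold, concatenate) can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the landmark-based fingerprint lookup is swapped for a plain cosine similarity over raw samples, and every name here (`split_segments`, `best_match`, `reconstruct`, `seg_len`, `threshold`) is hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Segment = List[float]


def split_segments(signal: Sequence[float], seg_len: int) -> List[Segment]:
    """Split a signal into consecutive segments of seg_len samples.

    The abstract assumes equal-length segments; in this sketch a trailing
    segment may be shorter if the signal length is not a multiple of seg_len.
    """
    return [list(signal[i:i + seg_len]) for i in range(0, len(signal), seg_len)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Stand-in similarity measure (ASSUMPTION: not audio fingerprinting)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(segment: Segment,
               clean_segments: Sequence[Segment]) -> Tuple[Optional[Segment], float]:
    """Return the most similar clean segment of the same length and its score."""
    best: Optional[Segment] = None
    best_score = -1.0
    for candidate in clean_segments:
        if len(candidate) != len(segment):
            continue  # only compare segments of equal length
        score = cosine_similarity(segment, candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


def reconstruct(noisy: Sequence[float],
                clean_segments: Sequence[Segment],
                seg_len: int,
                threshold: float) -> List[float]:
    """Conditionally replace noisy segments with clean ones, then concatenate."""
    output: List[float] = []
    for segment in split_segments(noisy, seg_len):
        match, score = best_match(segment, clean_segments)
        if match is not None and score > threshold:
            output.extend(match)    # similar enough: use the clean segment
        else:
            output.extend(segment)  # otherwise keep the noisy segment as-is
    return output
```

In a real system the linear scan in `best_match` would be replaced by a fingerprint index lookup, and the output of `reconstruct` would be passed to the speech recognizer.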
