1. Controlo remoto de presenças recorrendo à tecnologia de speaker verification / Remote attendance control using speaker verification technology. Moura, Paulo André Alves, January 2010.
Internship carried out at PT Inovação, supervised by Eng. Sérgio Ramalho / Integrated master's thesis. Electrical and Computer Engineering (Telecommunications major). Faculty of Engineering, University of Porto. 2010
2. Towards the use of sub-band processing in automatic speaker recognition. Finan, Robert Andrew, January 1998.
No description available.
3. Speaker Verification Systems Under Various Noise and SNR Conditions. Wan, Qianhui, January 2017.
In speaker verification, mismatches between the training speech and the testing speech can greatly affect the robustness of classification algorithms; these mismatches are mainly caused by changes in noise type and signal-to-noise ratio. This thesis aims at finding the most robust classification methods under multi-noise and multiple signal-to-noise-ratio conditions. Several well-known state-of-the-art classification algorithms and features for speaker verification are compared by examining the performance of a small-set speaker verification system (e.g. a voice lock for a family). The effect of the testing speech length is also examined. The i-vector/Probabilistic Linear Discriminant Analysis method with compensation strategies is shown to provide stable performance for both previously seen and previously unseen noise scenarios, and a C++ implementation with online processing and multi-threading is developed for this approach.
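As a simplified illustration of scoring in such a small-set verification system, the sketch below (not from the thesis) uses length-normalized cosine similarity between an averaged enrollment i-vector and a test i-vector in place of the full PLDA back-end; the vectors and the threshold are placeholders.

```python
# Minimal sketch: cosine scoring of speaker vectors, a simplified stand-in
# for the i-vector/PLDA back-end described above. All values are placeholders.
import numpy as np

def length_normalize(v):
    """Project a vector onto the unit sphere (standard i-vector preprocessing)."""
    return v / np.linalg.norm(v)

def verify(enroll_ivectors, test_ivector, threshold=0.5):
    """Accept the claimed identity if the averaged enrollment model is close
    enough to the test i-vector in cosine similarity."""
    model = length_normalize(np.mean(enroll_ivectors, axis=0))
    test = length_normalize(test_ivector)
    score = float(np.dot(model, test))
    return score, score >= threshold

# Toy usage with random 400-dimensional "i-vectors"
rng = np.random.default_rng(0)
enroll = rng.normal(size=(3, 400))
score, accepted = verify(enroll, enroll[0] + 0.1 * rng.normal(size=400))
print(f"score={score:.3f}, accepted={accepted}")
```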
4. A Nonlinear Mixture Autoregressive Model for Speaker Verification. Srinivasan, Sundararajan, 30 April 2011.
In this work, we apply a nonlinear mixture autoregressive (MixAR) model to supplant the Gaussian mixture model for speaker verification. MixAR is a statistical model that is a probabilistically weighted combination of components, each of which is an autoregressive filter in addition to a mean. The probabilistic mixing and the data-dependent weights are responsible for the nonlinear nature of the model. Our experiments with synthetic as well as real speech data from standard speech corpora show that the MixAR model outperforms the GMM, especially under unseen noisy conditions. Moreover, MixAR did not require delta features and used 2.5x fewer parameters to achieve comparable or better performance than a GMM using static as well as delta features. MixAR also suffered less from overfitting than the GMM when training data was sparse. However, MixAR performance deteriorated more quickly than that of the GMM when the evaluation data duration was reduced. This could pose limitations on the minimum amount of evaluation data required when using the MixAR model for speaker verification.
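A minimal sketch (not from the thesis) of how a MixAR conditional density could be evaluated: each component makes an autoregressive prediction from the recent history, and data-dependent mixing weights come from a softmax gate over the same history. The gating form and all parameter values are illustrative assumptions.

```python
# Minimal sketch: conditional density of one sample under a mixture
# autoregressive (MixAR) model with data-dependent (gated) mixing weights.
import numpy as np

def gaussian_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def mixar_density(x_t, history, ar_coeffs, means, variances, gate_w, gate_b):
    """p(x_t | history) = sum_k w_k(history) * N(x_t; a_k . history + mu_k, sigma_k^2),
    where the mixing weights w_k depend on the history through a softmax gate."""
    logits = gate_w @ history + gate_b          # data-dependent mixing
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    preds = ar_coeffs @ history + means         # per-component AR prediction
    return float(np.sum(weights * gaussian_pdf(x_t, preds, variances)))

# Toy 2-component, order-3 model with placeholder parameters
K, p = 2, 3
rng = np.random.default_rng(1)
density = mixar_density(
    x_t=0.2,
    history=np.array([0.1, -0.3, 0.05]),
    ar_coeffs=rng.normal(size=(K, p)) * 0.3,
    means=np.zeros(K),
    variances=np.array([0.5, 1.0]),
    gate_w=rng.normal(size=(K, p)),
    gate_b=np.zeros(K),
)
print(f"conditional density: {density:.4f}")
```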
5. Text-Dependent Speaker Verification Implemented in Matlab Using MFCC and DTW. Tolunay, Atahan, January 2010.
Even though speaker verification is a broad subject, commercial and personal-use implementations are rare. Several problems need to be solved before speaker verification can become more useful. The number of pattern-matching and feature-extraction techniques is large, and the decision of which ones to use is debatable. One of the main problems of speaker verification in general is the impact of noise. The very popular feature-extraction technique MFCC is inherently sensitive to mismatch between training and verification conditions. MFCC is used in many speech recognition applications and is not only useful in text-dependent speaker verification. However, the most reliable verification techniques are text-dependent. One of the most popular pattern-matching techniques in text-dependent speaker verification is DTW. Although it has limitations outside text-dependent applications, it is a reliable way of matching templates even with a limited amount of training material. The signal-processing techniques MFCC and DTW are explained and discussed in detail, along with a Matlab program in which these techniques have been implemented. The choices made in signal processing, feature extraction, and pattern matching are motivated by discussions of available studies on these topics. The results indicate that it is possible to program text-dependent speaker verification systems that are functional in clean conditions with tools like Matlab.
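A minimal sketch (not the thesis's Matlab program) of the DTW matching step on MFCC feature matrices; the feature values are random placeholders standing in for real MFCC extraction (e.g. from librosa or Matlab).

```python
# Minimal sketch: dynamic time warping between two MFCC sequences, the
# pattern-matching step of a text-dependent verifier like the one above.
import numpy as np

def dtw_distance(ref, test):
    """Accumulated DTW cost between two (frames x coefficients) MFCC matrices,
    using Euclidean frame distance and the standard step pattern."""
    n, m = len(ref), len(test)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(ref[i - 1] - test[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m] / (n + m)   # length-normalized score

rng = np.random.default_rng(2)
template = rng.normal(size=(80, 13))   # enrollment utterance (80 frames, 13 MFCCs)
attempt = rng.normal(size=(95, 13))    # verification attempt
print(f"normalized DTW distance: {dtw_distance(template, attempt):.3f}")
```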
6. Robustní rozpoznávání mluvčího pomocí neuronových sítí / Robust Speaker Verification with Deep Neural Networks. Profant, Ján, January 2019.
The objective of this work is to study state-of-the-art deep-neural-network-based speaker verification systems, called x-vectors, under various conditions, such as wideband and narrowband data, and to develop a system that is robust to unseen languages, specific noise, or speech codecs. The system takes a variable-length audio recording and maps it into a fixed-length embedding that is then used to represent the speaker. We compared our systems to BUT's 2016 submission to the Speakers in the Wild Speaker Recognition Challenge (SITW), which used the previously popular statistical model, i-vectors. We observed that, when comparing single best systems, the recently published x-vectors achieved a more than 4.38 times lower Equal Error Rate on the SITW core-core condition than BUT's SITW submission. Moreover, we find that diarization substantially reduces the error rate when there are multiple speakers (the SITW core-multi condition), but we could not see the same trend on the NIST SRE 2018 VAST data.
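For reference, a minimal sketch (not from the thesis) of how the Equal Error Rate quoted above can be estimated from a list of trial scores and target/non-target labels; the scores below are made up.

```python
# Minimal sketch: Equal Error Rate (EER) from verification trial scores.
import numpy as np

def equal_error_rate(scores, labels):
    """Approximate the point where the false-acceptance and false-rejection
    rates cross by sweeping a threshold over the sorted trial scores."""
    scores, labels = np.asarray(scores, float), np.asarray(labels, int)
    thresholds = np.sort(scores)
    far = np.array([(scores[labels == 0] >= t).mean() for t in thresholds])
    frr = np.array([(scores[labels == 1] < t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0

# Toy trials: 1 marks a target (same-speaker) trial, 0 a non-target trial.
scores = [0.9, 0.8, 0.75, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1,   1,   0,    1,   0,    0,   1,   0]
print(f"EER ~ {equal_error_rate(scores, labels):.3f}")
```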
7. Rozpoznávání mluvčího ve Skype hovorech / Speaker Recognition in Skype Calls. Kaňok, Tomáš, January 2011.
This diploma thesis is concerned with machine identification and verification of speakers, its theory, and its applications. It evaluates an existing implementation by the Speech@FIT group and considers plugins for the Skype program. A functioning plugin that makes speaker identification possible is then proposed, implemented, and evaluated. Towards the end of the thesis, suggestions for future development are presented.
8. Defending against Adversarial Attacks in Speaker Verification Systems. Li-Chi Chang (11178210), 26 July 2021.
With the advance of Internet of Things technologies, smart devices and virtual personal assistants at home, such as Google Assistant, Apple Siri, and Amazon Alexa, have been widely used to control and access different objects like door locks, bulbs, air conditioners, and even bank accounts, which makes our life convenient. Because of its ease of operation, voice control has become a main interface between users and these smart devices. To make voice control more secure, speaker verification systems have been researched, applying the human voice as a biometric to accurately identify a legitimate user and prevent illegal access. Recent studies, however, have shown that speaker verification systems are vulnerable to different security attacks such as replay, voice cloning, and adversarial attacks. Among all attacks, adversarial attacks are the most dangerous and the most challenging to defend against. Currently, there is no known method that can effectively defend against such an attack in speaker verification systems.
The goal of this project is to design and implement a defense system that is simple, lightweight, and effective against adversarial attacks on speaker verification. To achieve this goal, we study the audio samples from adversarial attacks in both the time domain and the Mel spectrogram, and find that the generated adversarial audio is simply a clean illegal audio sample with small perturbations that are similar to white noise but well designed to fool speaker verification. Our intuition is that if these perturbations can be removed or modified, adversarial attacks can potentially lose their attacking ability. Therefore, we propose to add a plugin-function module that preprocesses the input audio before it is fed into the verification system. As a first attempt, we study two opposite plugin functions: denoising, which attempts to remove or reduce the perturbations, and noise-adding, which adds small Gaussian noise to the input audio. We show through experiments that both methods can significantly degrade the performance of a state-of-the-art adversarial attack. Specifically, denoising and noise-adding reduce the targeted success rate of the attack from 100% to only 56% and 5.2%, respectively. Moreover, noise-adding slows the attack down by a factor of 25 and has only a minor effect on the normal operation of a speaker verification system. Therefore, we believe that noise-adding can be applied to any speaker verification system to defend against adversarial attacks. To the best of our knowledge, this is the first attempt to apply the noise-adding method to defend against adversarial attacks in speaker verification systems.
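A minimal sketch (not the author's implementation) of the noise-adding plugin idea described above: small Gaussian noise is mixed into the waveform before it reaches the verification front-end. The function name and the SNR setting are placeholder assumptions to be tuned so that legitimate users are still accepted.

```python
# Minimal sketch: defensive noise-adding preprocessing for speaker verification.
import numpy as np

def add_defensive_noise(waveform, snr_db=30.0, rng=None):
    """Return a copy of the waveform with white Gaussian noise at the given SNR."""
    rng = rng or np.random.default_rng()
    signal_power = np.mean(waveform ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(scale=np.sqrt(noise_power), size=waveform.shape)
    return waveform + noise

# Toy usage on a synthetic 1-second, 16 kHz "utterance"
t = np.linspace(0, 1, 16000, endpoint=False)
audio = 0.1 * np.sin(2 * np.pi * 220 * t)
defended = add_defensive_noise(audio, snr_db=30.0)
# `defended` would then be passed to the verification front-end instead of `audio`.
```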
9. How do voiceprints age? Nachesa, Maya Konstantinovna, January 2023.
Voiceprints, like fingerprints, are a biometric. Where fingerprints record a person's unique pattern on their finger, voiceprints record what a person's voice "sounds like", abstracting away from what the person said. They have been used in speaker recognition, including verification and identification; in other words, they have been used to ask "is this speaker who they say they are?" or "who is this speaker?", respectively. However, people age, and so do their voices. Do voiceprints age, too? That is, can a person's voice change enough that, after a while, the original voiceprint can no longer be used to identify them? In this thesis, I use Swedish audio recordings of debate speeches from the Riksdag (the Swedish parliament) to test this idea. Depending on the answer, this influences how well we can search the database for previously unmarked speeches. I find that speaker verification performance decreases as the age gap between voiceprints increases, and that it decreases more strongly after roughly five years. Additionally, I grouped the speakers into age groups spanning five years and found that speaker verification performs best for those whose initial voiceprint was recorded at 29-33 years of age. Longer input speech also provides higher-quality voiceprints, with performance improvements stagnating once the input exceeds 30 seconds. Finally, voiceprints for men age more strongly than those for women after roughly five years. I also investigated how emotions are encoded in voiceprints, since this could potentially impede speaker recognition. I found that it is possible to train a classifier to recognise emotions from voiceprints, and that this classifier does better when recognising emotions from known speakers. That is, emotions are encoded more characteristically per person than per emotion itself. As such, they are unlikely to interfere with speaker recognition.
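A minimal sketch (not from the thesis) of the kind of age-gap analysis described above: verification trials are bucketed by the number of years between the enrollment and test recordings, and the scores are summarized per bucket. The trial data are made-up placeholders.

```python
# Minimal sketch: summarizing verification scores by enrollment/test age gap.
import numpy as np
from collections import defaultdict

def scores_by_gap(trials, bucket_years=5):
    """trials: iterable of (gap_in_years, similarity_score) pairs."""
    buckets = defaultdict(list)
    for gap, score in trials:
        buckets[int(gap // bucket_years) * bucket_years].append(score)
    return {start: float(np.mean(s)) for start, s in sorted(buckets.items())}

trials = [(0.5, 0.82), (2.0, 0.80), (4.5, 0.74), (6.0, 0.65), (9.0, 0.61), (12.0, 0.55)]
for start, mean_score in scores_by_gap(trials).items():
    print(f"gap {start}-{start + 5} years: mean score {mean_score:.2f}")
```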
10. Vision-Based Force Planning and Voice-Based Human-Machine Interface of an Assistive Robotic Exoskeleton Glove for Brachial Plexus Injuries. Guo, Yunfei, 18 October 2023.
This dissertation focuses on improving the capabilities of an assistive robotic exoskeleton glove designed for patients with Brachial Plexus Injuries (BPI). The aim of this research is to develop a force control method, an automatic force planning method, and a Human-Machine Interface (HMI) to refine the grasping functionalities of the exoskeleton glove, thus supporting rehabilitation and independent living for individuals with BPI. The exoskeleton glove is a useful tool in post-surgery therapy for patients with BPI, as it helps counteract hand muscle atrophy by allowing controlled and assisted hand movements. This study introduces an assistive exoskeleton glove with rigid side-mounted linkages driven by Series Elastic Actuators (SEAs) to perform five different types of grasps. For force control, data-driven SEA fingertip force prediction methods were developed to assist force control with the Linear Series Elastic Actuators (LSEAs); this data-driven approach provides precise prediction of the SEA fingertip force while accounting for the deformation and friction forces in the exoskeleton glove. For force planning, a slip-grasp force planning method with hybrid slip detection is implemented. This method incorporates a vision-based approach to estimate object properties and refine grasp force predictions, mimicking the human grasping process, reducing the trial-and-error iterations required for the slip-grasp method, and increasing the grasp success rate from 71.9% to 87.5%. For the HMI, the Configurable Voice Activation and Speaker Verification (CVASV) system was developed to control the proposed exoskeleton glove; it was then complemented by a one-shot-learning-based alternative, which proved more effective than CVASV in terms of training time and connectivity requirements. Clinical trials were conducted successfully with patients with BPI, demonstrating the effectiveness of the exoskeleton glove. / Doctor of Philosophy / This dissertation focuses on improving the capabilities of a robotic exoskeleton glove designed to assist individuals with Brachial Plexus Injuries (BPI). The goal is to enhance the glove's ability to grasp and manipulate objects, which can help in the recovery process and enable patients with BPI to live more independently. The exoskeleton glove is a tool for patients with BPI to use after surgery to prevent the muscles of the hand from weakening due to lack of use. This research introduces an exoskeleton glove that utilizes special mechanisms to perform various types of grasps. The study has three main components. First, it focuses on ensuring that the glove can accurately control its grip strength. This is achieved through a method that takes into account factors such as how the materials in the glove change when it moves and the amount of friction present. Second, the study develops a method for planning how much force the glove should use to hold objects without letting them slip. This method uses camera-based object and material detection to estimate the weight and size of the target object, making the glove better at holding things without dropping them. Third, the study designs how people can instruct the glove what to do. Commands can be sent to the robot by voice, and the study proposes a new method that quickly learns how a user talks and recognizes their voice. The exoskeleton glove was tested on patients with BPI, and the results showed that it successfully helps them.
This study advances assistive technology, especially in the field of assistive exoskeleton gloves, making it more effective and beneficial for individuals with hand disabilities.
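A minimal sketch (not from the dissertation) of the kind of data-driven SEA fingertip-force prediction described above: a least-squares fit that maps spring deflection and motion direction to measured force, so that deformation and friction are absorbed into learned coefficients. The feature choice, units, and calibration data are illustrative assumptions.

```python
# Minimal sketch: data-driven fingertip-force prediction for a series elastic
# actuator, with a simple velocity-direction term standing in for friction.
import numpy as np

def fit_force_model(deflection, velocity, measured_force):
    """Fit force ~ k*deflection + c*sign(velocity) + b and return the coefficients."""
    X = np.column_stack([deflection, np.sign(velocity), np.ones_like(deflection)])
    coeffs, *_ = np.linalg.lstsq(X, measured_force, rcond=None)
    return coeffs

def predict_force(coeffs, deflection, velocity):
    return coeffs[0] * deflection + coeffs[1] * np.sign(velocity) + coeffs[2]

# Toy calibration data (deflection in mm, velocity in mm/s, force in N)
rng = np.random.default_rng(3)
defl = rng.uniform(0, 5, 200)
vel = rng.uniform(-10, 10, 200)
force = 2.0 * defl + 0.4 * np.sign(vel) + 0.1 + 0.05 * rng.normal(size=200)
coeffs = fit_force_model(defl, vel, force)
print("predicted force at 3 mm, +5 mm/s:",
      round(float(predict_force(coeffs, 3.0, 5.0)), 2), "N")
```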