1 |
Neural networks for speaker identification
Fredrickson, Steven Eric January 1995
No description available.
|
2 |
Packet voice communication on carrier sense multiple access local area networks
Rashid, M. A. January 1986
No description available.
|
3 |
Prospects for applying speaker verification to unattended secure banking
Hannah, Malcolm Ian January 1996
No description available.
|
4 |
Implementación de una herramienta de integración de varios tipos de interacción humano-computadora para el desarrollo de nuevos sistemas multimodales / Implementation of an integration tool of several types of human-computer interaction for the development of new multimodal systems
Alzamora M., Alzamora, Manuel I., Huamán, Andrés E., Barrientos, Alfredo, Villalta Riega, Rosario del Pilar January 2018
People interact with their environment multimodally, that is, by using their senses simultaneously. In recent years, researchers have pursued multimodal human-computer interaction by developing new devices and using different communication channels, with the aim of providing a more natural interactive user experience. This work presents a tool that integrates different types of human-computer interaction and tests it on a multimodal solution. / Peer reviewed
|
5 |
Vocal combo android application
Mathukumalli, Sravya January 1900
Master of Science / Department of Computing and Information Sciences / Mitchell L. Neilsen / Nowadays people from various backgrounds need different information on demand, and they rely on the web as their source for it. Staying connected to the internet through a computer at all times, however, is not possible. Android, being open source, has already made its mark in mobile application development. Having the largest smartphone user base, together with the ease of developing applications for it, benefits both users and Android developers.
Vocal Combo is an Android application that provides the required functionality on an Android smartphone or tablet, putting information at the user's fingertips. It is built with the Android SDK, which makes it easy to deploy on any Android-powered device. Vocal Combo combines several voice-based applications, including a text-to-voice converter and a voice-to-text converter. It helps users learn the pronunciation of words and check their own pronunciation. It also offers a meaning check, in which users look up the meaning of a word they type or speak. At any time, users can review the history of words whose meaning or pronunciation they have checked. The application also includes guidance on how to use it.
|
6 |
Multibiometric security in wireless communication systems
Sepasian, Mojtaba January 2010
This thesis explores an application of multibiometrics to secured wireless communications. The media studied were Wi-Fi, 3G, and WiMAX, over which simulations and experimental studies were carried out to assess performance. Specifically, access is restricted to authorized users by a technique referred to hereafter as a multibiometric cryptosystem. In brief, the system is built upon a complete challenge/response methodology in order to obtain a high level of security: the user is identified by fingerprint and then confirmed through text-dependent speaker recognition. The first phase is enrolment, in which the database of fingerprints watermarked with memorable texts, along with the voice features based on the same texts, is created by sending them to the server through a wireless channel. The second phase is verification, in which claimed users (those who claim to be genuine) are verified against the database; it consists of five steps. At the identification level, the user is first asked to present a fingerprint and a memorable word, the latter watermarked into the former, so that the system can authenticate the fingerprint, verify its validity, and retrieve the challenge for an accepted user. The following three steps involve speaker recognition: the user responds to the challenge by text-dependent voice, the server authenticates the response, and finally the server accepts or rejects the user.
To implement fingerprint watermarking, i.e. incorporating the memorable word as a watermark message into the fingerprint image, a five-step algorithm has been developed. The first three steps, which are novel, enhance the fingerprint image (CLAHE with 'Clip Limit', standard deviation analysis and sliding neighborhood); two further steps embed the watermark into, and extract it from, the enhanced fingerprint image using the Discrete Wavelet Transform (DWT). In the speaker recognition stage, the limitations of this technique in wireless communication have been addressed by sending voice features (cepstral coefficients) instead of raw samples. This scheme reduces transmission time and the dependency of the data on the communication channel, and avoids packet loss. Finally, the results obtained have verified these claims.
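The abstract does not say which kind of cepstral coefficients are sent to the server. As a minimal sketch, assuming the simplest variant (the real cepstrum, i.e. the inverse DFT of the log magnitude spectrum), the per-frame feature extraction could look like the following. The function names, the 12-coefficient default and the log floor are illustrative assumptions, not details taken from the thesis.

```python
import cmath
import math


def dft(samples):
    """Plain O(n^2) discrete Fourier transform; fine for short frames."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(frame, n_coeffs=12, floor=1e-10):
    """Return the first n_coeffs real-cepstrum coefficients of one frame.

    The log magnitude spectrum of a real signal is symmetric, so its
    inverse DFT is real and reduces to a cosine sum.
    """
    n = len(frame)
    # The floor (an assumed value) avoids log(0) for empty spectral bins.
    log_mag = [math.log(max(abs(v), floor)) for v in dft(frame)]
    return [
        sum(log_mag[k] * math.cos(2 * math.pi * k * q / n) for k in range(n)) / n
        for q in range(n_coeffs)
    ]


# Usage: a 64-sample frame is reduced to 12 numbers before transmission.
frame = [math.sin(2 * math.pi * 5 * t / 64) + 0.5 * math.sin(2 * math.pi * 11 * t / 64)
         for t in range(64)]
features = real_cepstrum(frame)
print(len(frame), "samples ->", len(features), "coefficients")
```

Sending a handful of coefficients per frame rather than every sample is what makes the payload smaller; production systems usually add windowing and a mel filter bank, which this sketch omits.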
|
7 |
Ethnography of San : minority recognition and voice in Botswana
Lawy, Jenny January 2016
Over the last sixty years anthropological interest in San has focused on their status as hunter-gatherers and, more recently, as an economically and socially marginalised minority group. In this thesis, I examine the different ways in which this indigenous minority population in Botswana manage and negotiate their relations with one another and with the broader society in which they are embedded. The research comprised eighteen months of fieldwork (April 2010 to December 2011) in Gaborone city, and a largely Naro-speaking village in Gantsi District in the west of Botswana. The participants comprised a small but relatively highly-educated cadre of elite San men who self-presented as advocates for San-related issues in the wider community but also San men and women in the towns and villages of the region. Early in the research process I recognised the need to make sense of the ethnography in terms of a variety of markers. Whilst this included what San actually said it also encompassed what they did and how they did it: that is their behaviour, dress and bodily techniques and practices – all of which I describe as voice. The research intersects with issues of gender, language, culture, class, identity and self-representation in the daily lives of San. I emphasise the tensions that San face in their daily struggles for recognition as human beings of equal value in Botswana’s society. As the public face of this struggle, San advocates were in a difficult and ambiguous position in relation to the wider San community. As a consequence of this, I explore egalitarianism as a set of political and social relationships rather than as a ‘sharing practice’. I identify a number of areas for further research, for example, to work collaboratively with San to incorporate aspects of what San called ‘personal empowerment’ and training. 
I show that the research has wider implications for other minority groups and indigenous people worldwide who have also been subject to highly politicised and overly deterministic definitions of their identity. My work suggests possibilities for working with emerging indigenous ‘elites’, who mediate most visibly the contours of these categories of identity by purposefully combining, conflating and straddling these labels.
|
8 |
The Potential of Visual Features : to Improve Voice Recognition Systems in Vehicles Noisy Environment
Jafari Moghadamfard, Ramtin, Payvar, Saeid January 2014
Multimodal biometric systems have been a subject of study in recent decades. Their unique anti-spoofing and liveness-detection characteristics, plus their ability to deal with audio noise, make them candidate technologies for improving current systems such as voice recognition, verification and identification systems.
In this work we studied the feasibility of incorporating an audio-visual voice recognition system to deal with audio noise in the truck cab environment. Speech recognition systems suffer from excessive noise from the engine, road traffic and the car stereo system. To deal with this noise, different techniques, including active and passive noise cancelling, have been studied.
Our results showed that although audio-only systems perform better in a noise-free environment, their performance drops significantly as the noise level in truck cabins increases, whereas the performance of visual features is unaffected. The final fused system, comprising both visual and audio cues, proved superior to both audio-only and video-only systems.
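The abstract does not describe how the audio and visual cues are fused. One common option, shown here purely as an assumed illustration, is score-level fusion: each modality scores every candidate word, and the scores are combined with a weight that shifts toward the visual stream as cabin noise rises. All names and numbers below are hypothetical.

```python
def fuse_scores(audio_scores, visual_scores, audio_weight):
    """Weighted sum of per-label scores from the two modalities."""
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if audio_scores.keys() != visual_scores.keys():
        raise ValueError("both modalities must score the same labels")
    return {
        label: audio_weight * audio_scores[label]
        + (1.0 - audio_weight) * visual_scores[label]
        for label in audio_scores
    }


def recognise(audio_scores, visual_scores, audio_weight):
    """Return the label with the highest fused score."""
    fused = fuse_scores(audio_scores, visual_scores, audio_weight)
    return max(fused, key=fused.get)


# Usage: engine noise has corrupted the audio scores, lip features have not.
audio = {"stop": 0.4, "go": 0.6}
visual = {"stop": 0.9, "go": 0.1}
print(recognise(audio, visual, audio_weight=1.0))  # audio only
print(recognise(audio, visual, audio_weight=0.3))  # noisy cab: trust video more
```

With the weight at 1.0 the noisy audio decides alone; lowering it lets the visual stream correct the result, which matches the qualitative finding reported above.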
|
9 |
Text-Independent Speaker Recognition Using Source Based Features
Wildermoth, Brett Richard January 2001
The speech signal is primarily meant to carry the linguistic message, but it also contains speaker-specific information. It is generated by acoustically exciting the cavities of the mouth and nose, and can be used to recognize (identify or verify) a person. This thesis deals with the speaker identification task, i.e., finding the identity of a person from his or her speech among a group of persons already enrolled during the training phase. Listeners use many audible cues in identifying speakers. These range from high-level cues, such as the semantics and linguistics of the speech, to low-level cues relating to the speaker's vocal tract and voice source characteristics. In modern speaker identification systems, the vocal tract characteristics are generally modeled by cepstral coefficients. Although these coefficients represent vocal tract information well, they can be supplemented with pitch and voicing information. Pitch provides very important and useful information for identifying speakers, yet current speaker recognition systems rarely use it, because it cannot be reliably extracted and is not always present in the speech signal. In this thesis, an attempt is made to utilize pitch and voicing information for speaker identification. Using a text-independent speaker identification system, the thesis shows the reasonable performance of the cepstral coefficients, which achieve an identification error of 6%. Using pitch as a feature in a straightforward manner results in identification errors in the range of 86% to 94%, which is not very helpful. There are two main reasons why the direct use of pitch as a feature does not work for speaker recognition. First, speech is not always periodic; only about half of the frames are voiced. Thus, pitch cannot be estimated for half of the frames (i.e. the unvoiced frames), and the problem is how to account for pitch information in the unvoiced frames during the recognition phase. Second, pitch estimation methods are not very reliable. They classify some frames as unvoiced when they are really voiced, and they make pitch estimation errors (such as doubling or halving of the pitch value, depending on the method). To use pitch information for speaker recognition, these problems have to be overcome. We need a method that does not use the pitch value directly as a feature and that works reliably for voiced as well as unvoiced frames. We propose a method that uses the autocorrelation function of the given frame to derive pitch-related features, which we call the maximum autocorrelation value (MACV) features. These features can be extracted for voiced as well as unvoiced frames and do not suffer from pitch doubling or halving errors. Using the MACV features along with the cepstral features improves speaker identification performance by 45%.
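The abstract names the MACV features but does not give the exact recipe. A minimal sketch, assuming the features are the maxima of the normalized autocorrelation within equal sub-ranges of the plausible pitch-lag range, follows. The function names, the 60 to 400 Hz pitch range and the choice of five bands are illustrative assumptions.

```python
import math


def normalized_autocorrelation(frame, max_lag):
    """Return r[lag] / r[0] for lag = 0..max_lag (biased estimate)."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return [0.0] * (max_lag + 1)
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag)) / energy
        for lag in range(max_lag + 1)
    ]


def macv_features(frame, sample_rate, f_min=60.0, f_max=400.0, n_bands=5):
    """Maximum autocorrelation value in each of n_bands lag sub-ranges.

    No pitch decision is made, so the features exist for voiced and
    unvoiced frames alike (assumed reading of the abstract).
    """
    lag_lo = int(sample_rate / f_max)   # shortest pitch period considered
    lag_hi = int(sample_rate / f_min)   # longest pitch period considered
    if len(frame) <= lag_hi:
        raise ValueError("frame too short for the requested pitch range")
    acf = normalized_autocorrelation(frame, lag_hi)
    span = lag_hi - lag_lo + 1
    features = []
    for band in range(n_bands):
        start = lag_lo + band * span // n_bands
        stop = lag_lo + (band + 1) * span // n_bands
        features.append(max(acf[start:stop]))
    return features


# Usage: a 30 ms frame of a 200 Hz tone sampled at 8 kHz (period = 40 samples).
frame = [math.sin(2 * math.pi * 200.0 * t / 8000.0) for t in range(240)]
print(macv_features(frame, 8000))
```

Because each feature is a maximum over a range of lags rather than a single estimated pitch value, a peak at twice or half the true period still lands in one of the bands instead of producing a wrong pitch estimate.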
|
10 |
Balso atpažinimo programų lietuvinimo galimybių tyrimas / Speech recognition program's Lithuanization possibility survey
Bivainis, Robertas 30 September 2013
This thesis analyses how the speech recognition system HTK operates and which steps must be taken to successfully recognize spoken Lithuanian words. It also reviews the speech technology concepts needed to create a speech recognition program. Because speech signal recognition models and hidden Markov chains are central to speech recognition, the analysis covers their operating principles and algorithms.
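HTK builds its recognizers on hidden Markov models, and the standard way to find the most likely state sequence for an observation sequence is Viterbi decoding. The sketch below is a generic textbook version in plain Python, not HTK code; the two states, the three observation symbols and all probabilities are made-up toy values.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely state path, its probability) for a discrete HMM."""
    if not observations:
        raise ValueError("need at least one observation")
    # best[t][s]: probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] * trans_p[p][s])
            scores[s] = best[-1][prev] * trans_p[prev][s] * emit_p[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Usage with toy values: frame energy levels decoded into silence/speech.
states = ["sil", "speech"]
start_p = {"sil": 0.6, "speech": 0.4}
trans_p = {"sil": {"sil": 0.7, "speech": 0.3},
           "speech": {"sil": 0.4, "speech": 0.6}}
emit_p = {"sil": {"low": 0.5, "mid": 0.4, "high": 0.1},
          "speech": {"low": 0.1, "mid": 0.3, "high": 0.6}}
print(viterbi(["low", "mid", "high"], states, start_p, trans_p, emit_p))
```

Real recognizers work with log probabilities and continuous emission densities to avoid numerical underflow on long utterances; the products here are kept only for readability.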
|