  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Exploring Empirical Guidelines for Selecting Computer Assistive Technology for People with Disabilities

Border, Jennifer January 2011
No description available.
22

Dialects, Sex-specificity, and Individual Recognition in the Vocal Repertoire of the Puerto Rican Parrot (Amazona vittata)

Roberts, Briony Z. Jr. 23 December 1997
The following study is part of a larger study examining techniques that might be of use in the release program of the Puerto Rican Parrot (Amazona vittata), including marking, capturing, and radio-tracking. The portion of the study reported here documents the vocal behavior of A. vittata during the reproductive season and examines the possibility of using vocalizations to identify individuals, determine the sex of individuals, and determine the location of an individual's breeding territory. Objectives of this study included: 1) cataloguing and categorizing the vocal repertoire of A. vittata, 2) determining whether the vocal repertoire was sex-specific and region-specific, and 3) determining whether an individual's vocal repertoire could be used to identify it. The vocal repertoire was characterized using a hierarchical method, and 147 calls were described. The repertoire was found to contain a high percentage (76%) of graded calls. Evolutionary strategies that may explain the complexity of such a repertoire are discussed. The vocal repertoire was found to be both sex- and region-specific. Characteristics analyzed included time and frequency parameters of sonagrams. Three methods were used to determine the feasibility of vocal recognition of individuals: bird-call pairing, sonagraphic analysis, and linear predictive coding. Sonagraphic analyses in combination with linear predictive coding techniques show the most promise as tools in voice recognition of the parrot; however, further research will be necessary to determine how reliable voice recognition may be as a method for identifying individuals in the field. / Master of Science
23

Traduction dictée interactive : intégrer la reconnaissance vocale à l’enseignement et à la pratique de la traduction professionnelle

Zapata Rojas, Julian 30 August 2012
Translation dictation (TD) is a translation technique that was widely used before professional translators’ workstations witnessed the massive influx of typewriters and personal computers. In the current era of globalization and of information and communication technologies (ICT), and in response to the growing demand for translation, certain translators and translator trainers throughout the world are seeking to (re)integrate dictation into translation practice. Unlike a few decades ago, when the transcription of translated texts was typically carried out by professional typists, the translation industry is currently turning to voice recognition (VR) technologies, that is, computer tools that transcribe dictations automatically. Although off-the-shelf VR systems are not specifically designed for professional translation, they already seem to provide the translators who use them with a more ergonomic and efficient approach than the conventional method, i.e., typing on a computer keyboard. This thesis introduces the notion of Interactive Translation Dictation (ITD), a translation technique that involves interaction with a VR system. The literature review conducted for this research indicated that the idea of integrating VR technologies into the practice of translation is not new; however, it showed that past efforts have not met with lasting success. Moreover, an analysis of the needs of translators who use VR systems shed light on why translators have turned to VR software and what their opinions of these tools are. This analysis also allowed us to identify the challenges that VR technology currently presents for professional translation. This thesis is intended as a first step towards developing translation tools that are both ergonomic, i.e., that take into account the human factor, and efficient, allowing translators to meet the needs of the current translation market.
The thesis also advocates a renewal of translator training programs. Integrating ITD into translation training and practice means (re)integrating spoken translation techniques that were used in the past and VR technologies that are now emerging. For such integration to be effective, significant technical, cognitive and pedagogical challenges will first need to be overcome.
24

Developing Java Programs on Android Mobile Phones Using Speech Recognition

Gande, Santhrushna 01 September 2015
Mobile phones and tablets based on the Android operating system are now widely used and have millions of users around the world. The popularity of this operating system is due to its multitasking, ease of access, and diverse device options. “Java Programming Speech Recognition Application” is an Android application intended for individuals with disabilities who are unable, or find it difficult, to type on a keyboard. The application allows the user to write a computer program (in the Java language) by dictating the words, without using a keyboard. The user speaks the commands and symbols required for his/her program. The program has been designed to pick out the Java keywords (such as ‘boolean’, ‘break’, ‘if’ and ‘else’) that are similar to the word received by the speech recognizer in the application. The “Java Programming Speech Recognition Application” relies on external plug-ins, such as a programming editor and a speech recognizer, to record and write the program. These plug-ins come in the form of libraries and pre-coded folders that the developer has to attach to the main program.
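The keyword-matching step described in this abstract can be illustrated with a short sketch. This is not the thesis's implementation: the abstract does not say how similarity is measured, so the sketch assumes string similarity from Python's standard `difflib` module. The keyword subset, the `cutoff` value, and the function names `match_keyword` and `transcript_to_tokens` are all hypothetical.

```python
import difflib
from typing import List, Optional

# A small, illustrative subset of Java keywords (assumption: a real tool
# would carry the full list).
JAVA_KEYWORDS = [
    "boolean", "break", "class", "else", "for", "if",
    "int", "public", "return", "static", "void", "while",
]


def match_keyword(heard: str, cutoff: float = 0.75) -> Optional[str]:
    """Return the Java keyword most similar to a recognized word, or None.

    Similarity is difflib's ratio; the cutoff of 0.75 is an arbitrary
    illustrative threshold, not a value taken from the thesis.
    """
    matches = difflib.get_close_matches(
        heard.lower(), JAVA_KEYWORDS, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


def transcript_to_tokens(transcript: str) -> List[str]:
    """Replace each recognized word with its closest keyword, if any."""
    tokens = []
    for word in transcript.split():
        keyword = match_keyword(word)
        tokens.append(keyword if keyword is not None else word)
    return tokens
```

With this approach a misrecognized word such as "brake" can still be mapped to `break`, while words that resemble no keyword pass through unchanged, for example as identifiers.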
26

Using Speech Recognition Software to Increase Writing Fluency for Individuals with Physical Disabilities

Garrett, Jennifer Tumlin 03 July 2007
Writing is an important skill that is necessary throughout school and life. Many students with physical disabilities, however, have difficulty with writing skills due to disability-specific factors, such as motor coordination problems. Due to the difficulties these individuals have with writing, assistive technology is often utilized. One piece of assistive technology, speech recognition software, may help remove the motor demand of writing and help students become more fluent writers. Past research on the use of speech recognition software, however, reveals little information regarding its impact on individuals with physical disabilities. Therefore, this study involved students of high school age with physical disabilities that affected hand use. Using an alternating treatments design to compare the use of word processing with the use of speech recognition software, this study analyzed first-draft writing samples in the areas of fluency, accuracy, type of word errors, recall of intended meaning, and length. Data on fluency, calculated in words correct per minute (wcpm), indicated that all participants wrote much faster with speech recognition than with word processing. However, accuracy, calculated as percent correct, was much lower when participants used speech recognition than when they used word processing. Word errors and recall of intended meaning were coded based on type and varied across participants. In terms of length, all participants wrote longer drafts when using speech recognition software, primarily because their fluency was higher and they were therefore able to write more words. Although the results of this study indicated that participants wrote more fluently with speech recognition, their accuracy was low, so it is difficult to determine whether speech recognition is a viable solution for all individuals with physical disabilities. Therefore, additional research is needed that takes into consideration the editing and error correction time when using speech recognition software.
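The two measures named in this abstract, words correct per minute and percent correct, are simple ratios. The sketch below assumes the conventional definitions, since the abstract does not spell out the exact formulas used in the study. The function names are hypothetical, and the numbers in the usage note are invented for illustration rather than taken from the study.

```python
def words_correct_per_minute(words_correct: int, seconds: float) -> float:
    """Fluency: correctly written words divided by elapsed minutes."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return words_correct * 60.0 / seconds


def percent_correct(words_correct: int, words_total: int) -> float:
    """Accuracy: share of all written words that are correct, in percent."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_correct / words_total
```

For example, an invented draft with 45 correct words out of 50, written in 180 seconds, gives `words_correct_per_minute(45, 180)` = 15.0 wcpm and `percent_correct(45, 50)` = 90.0. The two measures can move in opposite directions, which is the trade-off the abstract reports.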
27

The use of voice recognition software as a compensatory strategy for postsecondary education students receiving services under the category of learning disabled

Roberts, Kelly Drew 08 1900
This study expands on the current literature base that investigates the use of voice recognition software (VRS) as a compensatory strategy for written language difficulties often experienced by postsecondary education students receiving services under the category of learning disabled. The current literature base is limited to one study (Higgins & Raskind, 1995), which found that subjects' writing samples completed with VRS had higher holistic scores than samples completed with a transcriber or without assistance. While these findings are positive, many questions remain unanswered. The research conducted in this dissertation investigated three such questions. The questions and corresponding findings follow. 1. After being trained on VRS, will persons in postsecondary education receiving services under the category of learning disabled continue to use it to complete their academic course work? Will they further use the software for purposes other than academic study? Two individuals continued to use the software. One of these two used the software for multiple purposes. 2. Does the ongoing use of VRS by postsecondary education students receiving services under the category of learning disabled improve their written performance when assessed with Fry's Readability Graph? Two subjects each submitted three writing samples: one completed without the use of VRS and two completed using VRS. One subject's grade level equivalency went from 4.5 (sample completed without using VRS) to 6.5 (samples completed using VRS). There was no change in the grade level equivalency of the writing samples for the second subject. 3. What are the contributing variables that influence the continued use, or non-use, of VRS by postsecondary education students receiving services under the category of learning disabled? Numerous variables emerged from the data, including: time, access to a personal computer, ease of use, personal issues, use of standard English, the specific limitations associated with a person's disability, whether or not the subjects had other compensatory strategies in place, and the acquisition of the skills necessary to use the software. The findings contribute to the field by providing a framework from which to assess who may or may not benefit from the use of VRS.
28

Sistema de reconhecimento de locutor utilizando redes neurais artificiais / Artificial neural networks speaker recognition system

Adami, Andre Gustavo January 1997
This work applies recent technologies from the promising field of Computational Intelligence and from the traditional field of Digital Signal Processing to a specific voice-processing application: speaker recognition. Many applications, mainly in security and control, become possible once speaker recognition technology is mastered, for both identification and verification of different speakers. There are two types of speaker recognition: identification (recognizing the speaker within a population) and verification (checking whether a claimed identity is true). The speaker recognition process is divided into two main phases: feature extraction (acquisition, pre-processing, and extraction of the characteristic parameters of the signal) and classification (classifying the sampled signal in order to identify or verify the speaker). In the extraction phase, the aim was to apply recent advances in digital signal processing to the problem. In this context, the fundamental frequency and the formant frequencies were used as the parameters that identify the speaker. The former was obtained through autocorrelation and the latter through the Fourier transform. These parameters were extracted from the portion of speech where the vocal tract presents a coarticulation between two vocalic sounds, an approach intended to capture the characteristics of this change in the vocal apparatus. Several techniques for representing the signal are presented, such as spectral analysis, energy measures, autocorrelation, and LPC (Linear Predictive Coding), along with techniques for extracting signal features such as the fundamental frequency and the formant frequencies. In the classification phase, several conventional methods can be used, including Markov chains and Euclidean distance. In addition, Artificial Neural Networks (ANNs), which are considered powerful classifiers, are already being used in problems that involve classifying voice signals. The models most commonly used for the speaker recognition problem are studied, and the main topic of this Master's dissertation is the implementation of a speaker recognition system that uses ANNs to classify the speaker. The Multi-Layer Perceptron (MLP) architecture was investigated in conjunction with the backpropagation learning algorithm. Selected features extracted from the voice signal were used as input parameters to the network. The output of the MLP, previously trained with the speakers' features, indicates the authenticity of the signal. Tests were performed with 10 different male speakers aged 18 to 24 years. The results are very promising. An approach to implementing a speaker recognition system with conventional methods for the classification process is also presented. The methods used are Dynamic Time Warping (DTW) and Vector Quantization (VQ).
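Two of the techniques named in this abstract, autocorrelation-based estimation of the fundamental frequency and Dynamic Time Warping, can be sketched briefly. This is a minimal illustration under stated assumptions, not the dissertation's implementation. The search range `f_min`/`f_max`, the use of one-dimensional feature sequences, the absolute-difference local cost, and the function names `estimate_f0` and `dtw_distance` are all assumptions made for the example.

```python
import math
from typing import Optional, Sequence


def estimate_f0(samples: Sequence[float], sample_rate: float,
                f_min: float = 60.0, f_max: float = 400.0) -> Optional[float]:
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak.

    The lag with the largest positive autocorrelation inside the assumed
    pitch range [f_min, f_max] is taken as the pitch period.
    """
    n = len(samples)
    if n < 2:
        return None
    mean = sum(samples) / n
    x = [s - mean for s in samples]          # remove the DC offset
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), n - 1)
    best_lag, best_value = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        value = sum(x[i] * x[i + lag] for i in range(n - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else None


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Dynamic Time Warping cost between two 1-D feature sequences."""
    rows, cols = len(a) + 1, len(b) + 1
    cost = [[math.inf] * cols for _ in range(rows)]
    cost[0][0] = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(cost[i - 1][j],      # insertion
                                     cost[i][j - 1],      # deletion
                                     cost[i - 1][j - 1])  # match
    return cost[len(a)][len(b)]
```

In a recognition setting, `dtw_distance` would compare a test utterance's feature sequence against stored templates for each speaker. Real systems typically use multi-dimensional feature vectors per frame rather than the scalar values assumed here.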
