21

Speech Encryption Using Wavelet Packets

Bopardikar, Ajit S 02 1900 (has links)
The aim of speech scrambling algorithms is to transform clear speech into an unintelligible signal so that it is difficult to decrypt it in the absence of the key. Most of the existing speech scrambling algorithms tend to retain considerable residual intelligibility in the scrambled speech and are easy to break. Typically, a speech scrambling algorithm involves permutation of speech segments in time, frequency or time-frequency domain or permutation of transform coefficients of each speech block. The time-frequency algorithms have given very low residual intelligibility and have attracted much attention. We first study the uniform filter bank based time-frequency scrambling algorithm with respect to the block length and number of channels. We use objective distance measures to estimate the departure of the scrambled speech from the clear speech. Simulations indicate that the distance measures increase as we increase the block length and the number of channels. This algorithm derives its security only from the time-frequency segment permutation and it has been estimated that the effective number of permutations which give a low residual intelligibility is much less than the total number of possible permutations. In order to increase the effective number of permutations, we propose a time-frequency scrambling algorithm based on wavelet packets. By using different wavelet packet filter banks at the analysis and synthesis end, we add an extra level of security since the eavesdropper has to choose the correct analysis filter bank, correctly rearrange the time-frequency segments, and choose the correct synthesis bank to get back the original speech signal. Simulations performed with this algorithm give distance measures comparable to those obtained for the uniform filter bank based algorithm. Finally, we introduce the 2-channel perfect reconstruction circular convolution filter bank and give a simple method for its design. The filters designed using this method satisfy the paraunitary properties on a discrete equispaced set of points in the frequency domain.
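A minimal sketch of the time-frequency scrambling idea described above, assuming PyWavelets and an integer key: the leaf coefficients of a wavelet packet tree are permuted with a key-derived permutation and the block is resynthesised. The wavelet, the decomposition depth and the single shared filter bank are illustrative simplifications, not the filter banks designed in the thesis.

```python
import numpy as np
import pywt

def wp_scramble(block, key, wavelet="db4", level=3):
    """Permute same-level wavelet packet coefficients with a key-derived permutation."""
    wp = pywt.WaveletPacket(data=block, wavelet=wavelet, mode="symmetric", maxlevel=level)
    leaves = wp.get_level(level, order="freq")               # 2**level sub-band nodes
    coeffs = [node.data.copy() for node in leaves]
    perm = np.random.RandomState(key).permutation(len(coeffs))  # key = integer seed
    for node, src in zip(leaves, perm):                       # write permuted coefficients back
        node.data = coeffs[src]
    # Descrambling would apply the inverse permutation before reconstruction.
    return wp.reconstruct(update=False)[: len(block)], perm
```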
22

Análise cepstral baseada em diferentes famílias transformada wavelet / Cepstral analysis based on different family of wavelet transform

Fabrício Lopes Sanchez 02 December 2008 (has links)
This work presents a comparative study of different wavelet transform families applied to the cepstral analysis of digital human speech signals, with the specific objective of determining their pitch period, and finally proposes a differential algorithm for this task that takes into account important computational aspects such as performance, algorithmic complexity and the target platform, among others. The results obtained with the new wavelet-based technique are also presented and compared with the traditional Fourier-based approach. The implementation was written in ANSI-standard C++ and tested on Windows XP Professional SP3, Windows Vista Business SP1, Mac OS X Leopard and Linux Mandriva 10.
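For reference, a bare-bones version of the traditional Fourier-based cepstral pitch estimator used as the baseline above might look as follows; the frame length handling, window and search range are illustrative choices, not parameters taken from the thesis.

```python
import numpy as np

def cepstral_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the pitch period (in seconds) of a voiced frame via the real cepstrum."""
    windowed = frame * np.hamming(len(frame))
    spectrum = np.fft.rfft(windowed)
    cepstrum = np.fft.irfft(np.log(np.abs(spectrum) + 1e-12))  # real cepstrum
    qmin, qmax = int(fs / fmax), int(fs / fmin)                # quefrency search range
    peak = qmin + np.argmax(cepstrum[qmin:qmax])               # dominant quefrency peak
    return peak / fs
```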
23

Extração de características do sinal de voz utilizando análise fatorial verdadeira. / Speech signal feature extraction using true factorial analysis

Matos, Adriano Nogueira 17 December 2008 (has links)
Funded by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES). / Digital speech signal processing is used in many computer applications, the main ones being speech recognition, synthesis and coding. All of these applications require the amount of data in the acoustic signal to be reduced so that it can be processed by a computer, and feature extraction from the speech signal, the subject of this study, performs that task. The extracted features should describe the speech signal well and contain no redundancy, so as to maximize the performance of the systems that use them. The Mel Frequency Cepstral Coefficients (MFCC) feature extraction method partially fulfills these requirements, but it is seriously degraded in the presence of noise. The statistical method of Factor Analysis is applied here with the aim of filtering noise components out of the utterances. The results of the experiments performed in this work show that this is a competitive method, especially when used to generate robust acoustic models under severe noise conditions.
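A rough sketch of the pipeline the abstract describes, MFCC features followed by a factor-analysis projection over the frame vectors, is given below; the use of librosa and scikit-learn, and all parameter values, are assumptions rather than the thesis's implementation.

```python
import librosa
import numpy as np
from sklearn.decomposition import FactorAnalysis

def mfcc_fa_features(path, n_mfcc=13, n_factors=8):
    """Extract MFCC frames from a recording and project them onto latent factors."""
    y, sr = librosa.load(path, sr=None)                       # load speech at its native rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)    # shape (n_mfcc, n_frames)
    frames = mfcc.T                                           # one observation per frame
    fa = FactorAnalysis(n_components=n_factors)
    return fa.fit_transform(frames)                           # latent factors per frame
```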
24

Identifikace pauz v rušeném řečovém signálu / Pause Identification in Degraded Speech Signal

Podloucká, Lenka January 2008 (has links)
This diploma thesis deals with pause identification in a degraded speech signal. The characteristics of speech and the principles of speech signal processing are described. The aim of the work was to create a reliable method for recognising speech and non-speech segments in both clean and degraded speech signals. Five empty-pause detectors were implemented in the MATLAB computing environment: an energy detector in the time domain, a two-step detector in the spectral domain, a one-step integral detector, a two-step integral detector, and a differential detector in the cepstrum. The spectral detector uses the energy characteristics of the speech signal in the first step and statistical analysis in the second step; the cepstral detectors use integral or differential algorithms. The robustness of the detectors was tested for different types of speech degradation and different signal-to-noise ratios, and their non-speech detection performance was compared using ROC (Receiver Operating Characteristic) curves.
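A minimal time-domain energy detector of the kind listed first above could be sketched as follows; the thesis's detectors were written in MATLAB, and the frame size and threshold used here are illustrative, not values from the work.

```python
import numpy as np

def detect_pauses(x, fs, frame_ms=20, threshold_db=-40.0):
    """Mark frames whose short-time energy falls below a relative threshold as pauses."""
    frame_len = int(fs * frame_ms / 1000)
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy_db = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    ref = energy_db.max()                     # reference level: the loudest frame
    return energy_db < ref + threshold_db     # True = non-speech (pause) frame
```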
25

Segmentace řeči / Speech segmentation

Andrla, Petr January 2010 (has links)
A programme for segmenting speech into phonemes was created as part of this master's thesis. The programme was written in MATLAB and consists of several scripts; it performs automatic segmentation. Speech segmentation is the process of identifying the boundaries between phonemes in spoken natural language. The automatic segmentation is based on vector quantization: in the first step of the algorithm, features are extracted; speech segments are then assigned to the computed centroids, and a position where the assigned centroid changes is marked as a phoneme boundary. Audio recordings were processed with the programme and the behaviour of the automatic segmentation was analysed. A detailed manual for the programme was also created. The thesis briefly describes the individual speech processing methods used, their implementation in the programme, and the reasons for the chosen parameter settings.
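The vector-quantization segmentation idea can be sketched as below; the original programme is in MATLAB, so the Python libraries, the MFCC features and the codebook size used here are assumptions, not the thesis's implementation.

```python
import librosa
import numpy as np
from sklearn.cluster import KMeans

def vq_segment(path, n_codewords=16, n_mfcc=13):
    """Return candidate phoneme-boundary times where the assigned codeword changes."""
    y, sr = librosa.load(path, sr=None)
    feats = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T       # one vector per frame
    labels = KMeans(n_clusters=n_codewords, n_init=10).fit_predict(feats)
    boundaries = np.flatnonzero(np.diff(labels)) + 1                 # frames where the codeword changes
    hop = 512                                                        # librosa's default hop length
    return boundaries * hop / sr                                     # boundary times in seconds
```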
26

• Metody potlačení šumu pro rozpoznávače řeči / Methods of noise suppression for speech recognition systems

Moldíková, Zuzana January 2014 (has links)
This diploma thesis deals with methods of noise suppression for speech recognition systems. The theoretical part introduces the basic terms of the topic and reviews noise suppression methods: spectral subtraction, Wiener filtering, RASTA, spectrogram mapping, and algorithms based on noise estimation. In the second part, the types of noise are analysed, and a spectral-subtraction noise suppression method for a speech recognition system is proposed and implemented. Extensive testing of spectral subtractive algorithms is carried out in comparison with the Wiener filter, and the results are assessed with defined metrics: recognition accuracy, recognition system score, and signal-to-noise ratio.
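A basic magnitude spectral-subtraction routine of the kind tested above might look like the following sketch; estimating the noise from the first few frames, the over-subtraction factor and the spectral floor are illustrative choices, not the implementation evaluated in the thesis.

```python
import numpy as np

def spectral_subtraction(x, fs, frame_len=512, hop=256, noise_frames=10,
                         alpha=2.0, floor=0.01):
    """Subtract an estimated noise magnitude from each frame's spectrum and overlap-add."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    spectra = np.array([np.fft.rfft(window * x[i * hop: i * hop + frame_len])
                        for i in range(n_frames)])
    noise_mag = np.abs(spectra[:noise_frames]).mean(axis=0)        # noise magnitude estimate
    mag = np.maximum(np.abs(spectra) - alpha * noise_mag, floor * noise_mag)
    cleaned = mag * np.exp(1j * np.angle(spectra))                 # keep the noisy phase
    out = np.zeros(len(x))
    for i, frame in enumerate(np.fft.irfft(cleaned, n=frame_len)):
        out[i * hop: i * hop + frame_len] += window * frame        # overlap-add synthesis
    return out
```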
27

Softwarový analyzátor a dolaďovač záznamenaného vokálu / Software analysator and tuner of vocal records

Smatana, Tomáš January 2015 (has links)
This thesis deals with analysis methods for fundamental frequency detection and with methods for changing the fundamental frequency of an audio signal containing vocals. It also covers general musical intonation theory. On the basis of this analysis, suitable methods are selected for the subsequent implementation of software for fine-tuning a recorded vocal signal.
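As a generic illustration of the fundamental frequency detection task mentioned above, a simple autocorrelation estimator is sketched below; it is not necessarily the method selected in the thesis, and the search range is an illustrative assumption.

```python
import numpy as np

def autocorr_f0(frame, fs, fmin=60.0, fmax=800.0):
    """Estimate f0 of a voiced frame (spanning a few pitch periods) from its autocorrelation peak."""
    frame = frame - np.mean(frame)
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]  # non-negative lags
    lag_min, lag_max = int(fs / fmax), int(fs / fmin)
    lag = lag_min + np.argmax(corr[lag_min:lag_max])                 # strongest periodic lag
    return fs / lag
```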
28

Vytvoření webové aplikace pro objektivní analýzu hypokinetické dysartrie ve frameworku Django / Django framework based web application for objective analysis of hypokinetic dysarthria

Čapek, Karel January 2017 (has links)
This master's thesis deals with the calculation of parameters capable of differentiating healthy speech from speech impaired by hypokinetic dysarthria, a motor disorder of speech and the vocal tract. Speech signal processing methods were studied, together with parameters that can distinguish healthy and impaired speech well. These parameters were then implemented in the Python programming language. The final step was to create a web application in the Django framework, which is used for the analysis of dysarthric speech.
29

Vokinesis : instrument de contrôle suprasegmental de la synthèse vocale / Vokinesis : an instrument for suprasegmental control of voice synthesis

Delalez, Samuel 28 November 2017 (has links)
This work belongs to the field of performative control of voice synthesis, and more precisely of real-time modification of pre-recorded voice signals. In a context where such systems could only modify pitch, duration and voice quality, our work addresses the question of performative modification of voice rhythm. A significant part of this thesis was devoted to the development of Vokinesis, a program for performative modification of pre-recorded voice. It was developed with four goals: to allow control of voice rhythm, and to obtain a modular system usable in concert situations as well as for research applications. This development required a reflection on the nature of vocal rhythm and how it should be controlled. It appeared that the basic cross-linguistic rhythmic unit is roughly syllable-sized, but that syllabification rules vary too much between languages to provide an invariant cross-linguistic rhythmic pattern. We showed that accurate and expressive sequencing of vocal rhythm is achieved by controlling the timing of two phases which together form a rhythmic group: the rhythmic nucleus and the rhythmic link. We developed several rhythm control methods and tested them with different control interfaces; an objective evaluation showed that one of these methods allows very accurate rhythmic control. New strategies for controlling pitch and voice quality parameters with a graphic tablet were also established. A reflection on the suitability of this interface, in view of the rise of new continuous musical interfaces, led us to conclude that the tablet is best suited to expressive control of intonation (speech), whereas PMCs (Polyphonic Multidimensional Controllers) are better suited to melodic control (singing, or other instruments). The development of Vokinesis also required the implementation of the VoPTiQ (Voice Pitch, Time and Quality modification) signal processing method, which combines an adaptation of the RT-PSOLA algorithm with specific filtering techniques for voice quality modulation. The use of Vokinesis as a musical instrument was successfully evaluated in public performances of the Chorus Digitalis ensemble, for singing styles ranging from pop to contemporary music; its use for electro music was also explored by interfacing the Ableton Live composition environment with Vokinesis. Application perspectives are diverse: scientific studies (research in prosody, expressive speech, neuroscience), sound and music production, language learning and teaching, and voice therapy.
30

A parametric monophone speech synthesis system

Klompje, Gideon 12 1900 (has links)
Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2006. / Speech is the primary and most natural means of communication between human beings. With the rapid spread of technology across the globe and the increased number of personal and public applications for digital equipment in recent years, the need for human/machine interaction has increased dramatically. Synthetic speech is audible speech produced by a machine automatically. A text-to-speech (TTS) system is one that converts bodies of text into digital speech signals which can be heard and understood by a person. Current TTS systems generally require large annotated speech corpora in the languages for which they are developed. For many languages these resources are not available. In their absence, a TTS system generates synthetic speech by means of mathematical algorithms constrained by certain rules. This thesis describes the design and implementation of a rule-based speech generation algorithm for use in a TTS system. The system allows the type, emphasis, pitch and other parameters associated with a sound and its particular mode of articulation to be specified. However, no attempt is made to model prosodic and other higher-level information. Instead, this is assumed known. The algorithm uses linear predictive (LP) models of monophone speech units, which greatly reduces the amount of data required for development in a new language. A novel approach to the interpolation of monophone speech units is presented to allow realistic transitions between monophone units. Additionally, novel algorithms for estimation and modelling of the harmonic and stochastic content of an excitation signal are presented. This is used to determine the amount of voiced and unvoiced energy present in individual speech sounds. Promising results were obtained when evaluating the developed system’s South African English speech output using two widely used speech intelligibility tests, namely the modified rhyme test (MRT) and semantically unpredictable sentences (SUS).
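As a toy illustration of the LP synthesis idea described above, the sketch below filters a mixed harmonic/stochastic excitation through an all-pole filter; the coefficients, pitch and voicing mix are made-up values, not the monophone models or interpolation scheme developed in the thesis.

```python
import numpy as np
from scipy.signal import lfilter

def synthesize(lp_coeffs, f0, voicing, duration, fs=16000):
    """Filter a mixed pulse-train/noise excitation through an all-pole LP filter."""
    n = int(duration * fs)
    excitation = np.zeros(n)
    excitation[:: int(fs / f0)] = 1.0                         # harmonic part: impulse train at f0
    excitation = voicing * excitation + (1 - voicing) * np.random.randn(n) * 0.01
    return lfilter([1.0], np.concatenate(([1.0], lp_coeffs)), excitation)

# e.g. a crude vowel-like resonance (illustrative coefficients only):
# y = synthesize(lp_coeffs=np.array([-1.3, 0.8]), f0=120, voicing=0.9, duration=0.5)
```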
