61

Elever med läs- och skrivsvårigheter och deras olika uppfattningar om användande av talsyntes / Students with reading and writing difficulties and their perceptions of the use of text-to-speech

Stengel, Marie January 2013
The aim of the study is to examine the different ways in which students with reading and writing difficulties perceive the use of text-to-speech. Qualitative interviews were conducted with nine students in grades three to nine of compulsory school. The study was based on a phenomenographic approach. The results reveal six different perceptions of the use of text-to-speech. The six description categories are: text-to-speech in use; significant others; autonomy and independence; learning; participation and change; and commitment and attitude. The majority of the students experience the use of text-to-speech positively. The study indicates that text-to-speech increases learning, motivation and participation for the vast majority of the students. Students with reading and writing difficulties are a heterogeneous group with different needs depending on what causes their difficulties, and the importance and uses of text-to-speech may therefore vary. The study shows that it is important that text-to-speech is introduced in dialogue with the student and that she or he has ample opportunity to decide when, how and where it is used. The results also show that good support (scaffolding) is important when the student first starts using it.
62

Grapheme-to-phoneme conversion and its application to transliteration

Jiampojamarn, Sittichai Unknown Date
No description available.
63

Automatic speech segmentation with limited data / by D.R. van Niekerk

Van Niekerk, Daniel Rudolph January 2009
The rapid development of corpus-based speech systems, such as concatenative synthesis systems for under-resourced languages, requires an efficient, consistent and accurate solution to phonetic speech segmentation. Manual development of phonetically annotated corpora is a time-consuming and expensive process which suffers from challenges regarding consistency and reproducibility, while automation of this process has only been satisfactorily demonstrated on large corpora of a select few languages, using techniques that require extensive and specialised resources. In this work we considered the problem of phonetic segmentation in the context of developing small prototypical speech synthesis corpora for new under-resourced languages. This was done through an empirical evaluation of existing segmentation techniques on typical speech corpora in three South African languages. In this process, the performance of these techniques was characterised under different data conditions, and their efficient application was investigated in order to improve the accuracy of the resulting phonetic alignments. We found that applying baseline speaker-specific Hidden Markov Models results in relatively robust and accurate alignments even under extremely limited data conditions, and demonstrated how such models can be developed and applied efficiently in this context. The result is segmentation of sufficient quality for synthesis applications, with alignment quality comparable to manual segmentation efforts in this context. Finally, possibilities for further automated refinement of phonetic alignments were investigated and an efficient corpus development strategy was proposed, with suggestions for further work in this direction. / Thesis (M.Ing. (Computer Engineering))--North-West University, Potchefstroom Campus, 2009.
65

Hlasem ovládaný elektronický zubní kříž / Voice controlled electronic health record in dentistry

Hippmann, Radek January 2012
Title: Voice controlled electronic health record in dentistry
Author: MUDr. Radek Hippmann
Department: Department of paediatric stomatology, Faculty hospital Motol
Supervisor: Prof. MUDr. Taťjana Dostalová, DrSc., MBA
Supervisor's e-mail: Tatjana.Dostalova@fnmotol.cz
This PhD thesis concerns the development of a comprehensive electronic health record (EHR) for dentistry. The system is enhanced with voice control based on an automatic speech recognition (ASR) system and a text-to-speech (TTS) module for speech synthesis. The first part of the thesis describes the problem as a whole and defines the particular areas whose combination is essential for creating an EHR system in this field, mainly the basic terms and areas of dentistry. Next, it addresses temporomandibular joint (TMJ) problems, which are often ignored, and describes trends in EHR and voice technologies. The methodological part describes the technologies used in creating the EHR system, in voice recognition and in TMJ disease classification. The following part describes the results, which correspond to the knowledge base in dentistry and TMJ. From this knowledge base originates the graphical user interface DentCross, which serves for dental data...
66

Tradução grafema-fonema para a língua portuguesa baseada em autômatos adaptativos. / Grapheme-phoneme translation for portuguese based on adaptive automata.

Danilo Picagli Shibata 25 March 2008
This work presents a study on the use of adaptive devices for text-to-speech translation. It focuses on the development of a grapheme-phoneme translation method for Portuguese based on adaptive automata and on the use of this method in text-to-speech translation software. The method seeks to mimic human behaviour in handling stress rules, syllable separation and the influence that syllables exert on their neighbours. This makes the method easy to use with other varieties of Portuguese, since these characteristics are invariant with respect to the locality and period of the chosen variety. The contemporary variety spoken in the city of São Paulo, Brazil was chosen as the target for analysis and tests. For this variety, the model gives satisfactory results, exceeding 95% accuracy in grapheme-phoneme translation of words and reaching 90% accuracy when the resolution of ambiguities raised by words that can have two phonetic representations is taken into account, and it generates speech output intelligible to native speakers through syllable-based concatenative synthesis. As results of the work, besides the model for grapheme-phoneme translation of words based on adaptive automata, a method was created for choosing the correct phonetic representation in ambiguous cases, along with two pieces of software: one for simulating adaptive automata and another for grapheme-phoneme translation using both the translation model and the disambiguation method. The latter was integrated with the speech synthesizer developed by Koike et al. (2007) to create a text-to-speech translator for Portuguese. The work shows the feasibility of using adaptive automata as the basis of, or as an auxiliary element in, text-to-speech translation for Portuguese.
67

A Research Bed For Unit Selection Based Text To Speech Synthesis System

Konakanchi, Parthasarathy 02 1900
After trying the Festival Speech Synthesis System, we decided to develop our own TTS framework, conducive to performing the research experiments needed to develop good-quality TTS for Indian languages. Most attempts at Indian language TTS have no prosody model, no provision for handling foreign-language words and no phrase break prediction that would allow appropriate pauses to be introduced in the synthesized speech. Further, in the Indian context, there is a real need for a bilingual TTS involving English along with the Indian language. In fact, it may be desirable to have a trilingual TTS, which also handles the language of the neighboring state or Hindi. Thus, there is a need for a full-fledged TTS development framework that lends itself to experimentation involving all the above issues and more. This thesis is therefore a serious attempt to develop a modular, unit selection based TTS framework. The developed system has been tested for its effectiveness in creating intelligible speech in Tamil and Kannada. It has also been used to carry out two research experiments on TTS. The first part of the work is the design and development of a corpus-based concatenative Tamil speech synthesizer in Matlab and C. A synthesis database has been created with 1027 phonetically rich, pre-recorded sentences, segmented at the phone level. From the sentence to be synthesized, specifications of the required target units are predicted. During synthesis, database units are selected that best match the target specification according to a distance metric and a concatenation quality metric. To accelerate matching, the features of the end frames of the database units have been precomputed and stored. The selected units are concatenated to produce synthetic speech. The high mean opinion scores obtained for the TTS output show that speech synthesized using our TTS is intelligible and acceptably natural, and can possibly be put to commercial use with some additional features. Experiments carried out by others using our TTS framework have shown that, whenever the required phonetic context is not available in the synthesis database, similar phones that are perceptually indistinguishable may be substituted. The second part of the work deals with the design and modification of the developed TTS framework so that it can be embedded in mobile phones. Commercial GSM FR, EFR and AMR speech codecs are used for compressing our synthesis database. Perception experiments reveal that speech synthesized using a highly compressed database is reasonably natural. This holds promise for reading SMSs and emails aloud on mobile phones in Indian languages in the future. Finally, we observe that incorporating prosody and pause models for Indian language TTS would further enhance the quality of the synthetic speech. These are some of the potential, unexplored areas ahead for research in speech synthesis in Indian languages.
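The unit selection step summarised in this abstract (choose database units that match the target specification under a distance metric while also joining well) is commonly posed as a shortest-path search. The following is only a minimal sketch of that general idea, not the thesis's Matlab/C implementation: the function name `select_units`, the tuple layout of a unit and both cost functions are hypothetical and chosen purely for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

U = TypeVar("U")  # a database unit (hypothetical representation)
T = TypeVar("T")  # a target specification (hypothetical representation)


def select_units(
    targets: Sequence[T],
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[T, U], float],
    join_cost: Callable[[U, U], float],
) -> List[U]:
    """Pick one candidate per target, minimising summed target and join costs."""
    if not targets:
        return []
    if len(targets) != len(candidates) or any(len(c) == 0 for c in candidates):
        raise ValueError("each target needs at least one candidate unit")

    # best[i][j]: cheapest total cost of any unit sequence ending in candidates[i][j]
    best = [[target_cost(targets[0], u) for u in candidates[0]]]
    # back[i][j]: index of the predecessor unit on that cheapest sequence
    back = [[-1] * len(candidates[0])]

    for i in range(1, len(targets)):
        row, ptr = [], []
        for u in candidates[i]:
            costs = [
                best[i - 1][k] + join_cost(prev, u)
                for k, prev in enumerate(candidates[i - 1])
            ]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[k_min] + target_cost(targets[i], u))
            ptr.append(k_min)
        best.append(row)
        back.append(ptr)

    # Trace the cheapest sequence back from the last position.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    path: List[U] = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = back[i][j]
    return path[::-1]


if __name__ == "__main__":
    # Toy data: a unit is (label, start_pitch, end_pitch); a target is a desired pitch.
    targets = [100, 110]
    candidates = [
        [("a1", 100, 140), ("a2", 104, 110)],
        [("b1", 110, 115), ("b2", 112, 118)],
    ]
    chosen = select_units(
        targets,
        candidates,
        target_cost=lambda t, u: abs(t - u[1]),
        join_cost=lambda p, u: abs(p[2] - u[1]),
    )
    print([u[0] for u in chosen])
```

In the toy data, the unit that best matches the first target on its own ("a1") joins poorly with every following unit, so the search prefers "a2": this is why the two costs are minimised jointly rather than greedily. Precomputing end-frame features, as the abstract mentions, would correspond to making `join_cost` a cheap lookup.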
68

Evaluating Multi-UAV System with Text to Speech for Situational Awareness and Workload

Lindgren, Viktor January 2021
With improvements to miniaturization technologies, the number of operators required per UAV has become increasingly smaller, at the cost of increased workload. Workload is an important factor to consider when designing the multi-UAV systems of tomorrow, as too much workload may decrease an operator's performance. This study proposes the use of text to speech, combined with an emphasis on a single-screen design, as a way of improving situational awareness and perceived workload. A controlled experiment with 18 participants was conducted inside a simulator. Their situational awareness and perceived workload were measured using SAGAT and NASA-TLX respectively. The results show that the use of text to speech led to a decrease in situational awareness for all elements inside the graphical user interface that were not directly handled by a text to speech event. All of the NASA-TLX measurements showed an improvement in perceived workload except for physical demand. Overall, an improvement in perceived workload was observed when text to speech was in use.
70

Grapheme-to-phoneme transcription of English words in Icelandic text

Ármannsson, Bjarki January 2021
Foreign words, such as names, locations or sometimes entire phrases, are a problem for any system that is meant to convert graphemes to phonemes (g2p; i.e. converting written text into phonetic transcription). In this thesis, we investigate both rule-based and neural methods of phonetically transcribing English words found in Icelandic text, taking into account the rules and constraints of how foreign phonemes can be mapped into Icelandic phonology. We implement a rule-based system by compiling grammars into finite-state transducers. In deciding which rules to include, and in evaluating their coverage, we use a list of the most frequently found English words in a corpus of Icelandic text. The output of the rule-based system is then manually evaluated and corrected (when needed) and subsequently used as data to train a simple bidirectional LSTM g2p model. We train models both with and without length and stress labels included in the gold annotated data. Although the scores for neither model are close to the state of the art for either Icelandic or English, both our rule-based system and our LSTM model show promising initial results and improve on the baseline of simply using an Icelandic g2p model, rule-based or neural, on English words. We find that the greater flexibility of the LSTM model seems to give it an advantage over our rule-based system when it comes to modeling certain phenomena. Most notable is the LSTM's ability to more accurately transcribe relations between graphemes and phonemes for English vowel sounds. Given that there is little previous work on g2p transcription that specifically handles English words within Icelandic phonological constraints, and that it remains an unsolved task, our findings provide a foundation for further research and contribute to improving g2p systems for Icelandic as a whole.
