Global ETD Search

1	Effects of Native-English Computer-Assisted Pronunciation Training in an Online Hybrid Learning Environment Singh, Bikram Kumar 07 1900 The purpose of this dissertation was to understand and compare the effect of training non-native English-speaking (NNES) learners (N = 480) in two distinct learning environments: (i) traditional face-to-face and (ii) online synchronous hybrid learning (SHL). In the traditional training mode, NNES learners (n = 360) were trained by NNES voice and accent (VANC) trainers in a physical, face-to-face setting. In the second, CAPT+SHL training mode, the NNES learners were trained by NNES VANC trainers with the help of a native-English computer-assisted pronunciation training (CAPT) tool in an online SHL environment. Factor analysis, higher-order factor analysis, hierarchical cluster analysis, and multidimensional scaling yielded a reliable scale, Eddie's Voice Test (EVT). Multiple regression yielded a predictive model relating NNES learners' pronunciation to their performance. In addition, the CAPT+SHL training mode produced higher scores on pronunciation and performance than the traditional training mode, suggesting that a combination of NES and NNES VANC instructors is more effective in training NNES learners than NNES instructors by themselves. The case study (n = 3) on VANC trainers' perception of CAPT and SHL yielded three themes: (1) challenges with synchronous hybrid learning (sub-themes include physical, social, and cognitive challenges); (2) the impact of CAPT on NNES learner pronunciation (sub-themes include self-paced pronunciation learning and pronunciation benchmarking); and (3) SHL as an equitable learning environment. Synchronous Hybrid Learning Computer Assisted Pronunciation Training Non-Native English Customer Experience Education, Technology Language, Linguistics Statistics
2	The Virtual Language Teacher : Models and applications for language learning using embodied conversational agents Wik, Preben January 2011 This thesis presents a framework for computer assisted language learning using a virtual language teacher. It is an attempt to create not only a new type of language learning software, but also a server-based application that collects large amounts of speech material for future research purposes. The motivation for the framework is to create a research platform for computer assisted language learning and computer assisted pronunciation training. Within the thesis, different feedback strategies and pronunciation error detectors are explored. This is a broad, interdisciplinary approach, combining research from a number of scientific disciplines, such as speech technology, game studies, cognitive science, phonetics, phonology, and second-language acquisition and teaching methodologies. The thesis discusses the paradigm both from a top-down point of view, where a number of functionally separate but interacting units are presented as part of a proposed architecture, and bottom-up, by demonstrating and testing an implementation of the framework. / QC 20110511 Language learning embodied conversational agents speech technology computer assisted language learning computer assisted pronunciation training Information technology
3	ComPron : Learning Pronunciation through Building Associations between Native Language and Second Language Speech Sounds Lessing, Sara January 2020 Current computer-assisted pronunciation training (CAPT) tools focus too much on what technologies can do, rather than on learner needs and pedagogy. They also lack an embodied perspective on learning. This thesis presents a Research through Design project exploring what kinds of interactive design features can support second language learners’ pronunciation learning of segmental speech sounds with embodiment in mind. The result is ComPron: an open simulated prototype that supports learners in learning perception and production of new segmental speech sounds in a second language by comparing them to native language speech sounds. ComPron was evaluated through think-aloud user tests and semi-structured interviews (N=4). The findings indicate that ComPron supports awareness of speech sound-movement connections, association building between sounds, and production of sounds. The design features that enabled awareness, association building, and speech sound production support are discussed, along with what ComPron offers in comparison to other CAPT tools. Research through Design (RtD) Co-design Embodied Learning Human Computer Interaction
4	A computational model for studying L1’s effect on L2 speech learning January 2018 Much evidence has shown that the first language (L1) plays an important role in the formation of the L2 phonological system during the second language (L2) learning process. Combined with the fact that different L1s have distinct phonological patterns, this implies diverse L2 speech learning outcomes for speakers from different L1 backgrounds. This dissertation hypothesizes that phonological distances between accented speech and speakers' L1 speech are also correlated with perceived accentedness, and that the correlations are negative for some phonological properties. Moreover, contrastive phonological distinctions between L1s and L2 will manifest themselves in the accented speech produced by speakers from these L1s. To test the hypotheses, this study proposes a computational model to analyze accented speech properties in both segmental (short-term speech measurements at the short-segment or phoneme level) and suprasegmental (long-term speech measurements at the word, long-segment, or sentence level) feature spaces. The benefit of using a computational model is that it enables quantitative analysis of L1's effect on accent in terms of different phonological properties. The core parts of this computational model are feature extraction schemes that extract pronunciation and prosody representations of accented speech based on existing techniques in the speech processing field. Correlation analysis on both the segmental and suprasegmental feature spaces is conducted to examine the relationship between acoustic measurements related to L1s and perceived accentedness across several L1s. Multiple regression analysis is employed to investigate how the L1's effect impacts the perception of foreign accent, and how accented speech produced by speakers from different L1s behaves distinctly in the segmental and suprasegmental feature spaces.
The results indicate the potential of this methodology to provide quantitative analysis of accented speech and to extend current studies in L2 speech learning theory to a large scale. Practically, this study further shows that the proposed computational model can benefit automatic accentedness evaluation systems by adding features related to speakers' L1s. / Dissertation/Thesis / Doctoral Dissertation Speech and Hearing Science 2018 Linguistics Electrical engineering Computer engineering Automatic speech recognition Computational model Computer-aided language learning Computer-assisted pronunciation training L1 and L2 interaction L2 speech learning
5	Mispronunciation Detection with SpeechBlender Data Augmentation Pipeline / Uttalsfelsdetektering med SpeechBlender data-förstärkning Elkheir, Yassine January 2023 The rise of multilingualism has fueled the demand for computer-assisted pronunciation training (CAPT) systems for language learning. CAPT systems make use of speech technology advancements and offer features such as learner assessment and curriculum management. Mispronunciation detection (MD) is a crucial aspect of CAPT, aimed at identifying and correcting mispronunciations in second language learners’ speech. One of the significant challenges in developing MD models is the limited availability of labeled second-language speech data. To overcome this, the thesis introduces SpeechBlender, a fine-grained data augmentation pipeline designed to generate mispronunciations. SpeechBlender targets different regions of a phonetic unit and blends raw speech signals through linear interpolation, resulting in erroneous pronunciation instances. This method provides more effective sample generation than traditional cut/paste methods. The thesis also explores the use of pre-trained automatic speech recognition (ASR) systems for MD, and examines various phone-level features that can be extracted from pre-trained ASR models and used for MD tasks. A deep neural model is proposed that enhances the representations of extracted acoustic features combined with positional phoneme embeddings. The efficacy of the augmentation technique is demonstrated through a phone-level pronunciation quality assessment task using only non-native good pronunciation speech data. Our proposed technique achieves state-of-the-art results on the Speechocean762 dataset [54] for ASR-dependent MD models at the phoneme level, with a 2.0% gain in Pearson Correlation Coefficient (PCC) compared to the previous state of the art [17].
Additionally, we demonstrate a 5.0% improvement at the phoneme level compared to our baseline. In this thesis, we developed the first pronunciation learning corpus for Arabic, AraVoiceL2, to demonstrate the generality of our proposed model and augmentation technique. We used the corpus to evaluate the effectiveness of our approach in improving mispronunciation detection for non-native learners of Arabic. Our experiments showed promising results, with a 4.6% increase in F1-score on the Arabic AraVoiceL2 test set, demonstrating the effectiveness of our model and augmentation technique in improving pronunciation learning for non-native speakers of Arabic. / Increasing multilingualism has raised the demand for computer-assisted pronunciation training (CAPT) systems for language learning. CAPT systems exploit advances in speech technology and offer features such as learner assessment and curriculum management. Mispronunciation detection is an important aspect of CAPT that aims to identify and correct mispronunciations in second-language learners' speech. One of the major challenges in developing MD models is the limited availability of labeled second-language speech data. To overcome this, the thesis introduces SpeechBlender, a fine-grained data augmentation pipeline designed to generate mispronunciations. SpeechBlender targets different regions of a phonetic unit and blends raw speech signals through linear interpolation, resulting in erroneous pronunciation instances. This method gives more effective sample generation than traditional cut/paste methods. The thesis investigates the use of pre-trained automatic speech recognition (ASR) systems for mispronunciation detection, and examines various phoneme-level features that can be extracted from pre-trained ASR models and used to detect mispronunciation.
An LSTM model is proposed that improves the representation of extracted acoustic features in combination with positional phoneme embeddings. The effectiveness of the augmentation technique is demonstrated through a phoneme-level pronunciation quality assessment task using only speech data from non-native speakers with good pronunciation. Our proposed technique achieves state-of-the-art results on the Speechocean762 dataset [54] for ASR-dependent mispronunciation detection models at the phoneme level, with a 2.0% increase in Pearson correlation coefficient (PCC) compared to the previous state of the art [17]. In addition, we show a 5.0% improvement at the phoneme level compared to our baseline. We also observed a 4.6% increase in F1 score on the Arabic AraVoiceL2 test set. Automatic Speech Recognition (ASR) Computer-assisted pronunciation training (CAPT) Electrical engineering and electronics
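
The SpeechBlender abstract in entry 5 describes blending raw speech signals through linear interpolation over a region of a phonetic unit. The sketch below only illustrates that general idea on plain sample lists. It is not the thesis's implementation: the function name `blend_region`, its parameters, and the toy signals are all hypothetical, and the abstract does not say how regions or mixing weights are chosen.

```python
def blend_region(good, donor, start, end, weight):
    """Linearly interpolate `good` toward `donor` over samples [start, end).

    Hypothetical helper for illustration: `good` and `donor` stand in for two
    equal-length raw waveforms of a phone, and `weight` in [0, 1] is the share
    of the donor signal mixed into the chosen region.
    """
    if len(good) != len(donor):
        raise ValueError("signals must have the same length")
    if not 0 <= start <= end <= len(good):
        raise ValueError("region must lie inside the signal")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")

    blended = list(good)  # copy so the input signal is left untouched
    for i in range(start, end):
        # Standard linear interpolation between the two samples.
        blended[i] = (1.0 - weight) * good[i] + weight * donor[i]
    return blended


# Toy signals (assumed values, not real speech data).
good_phone = [0.0, 0.2, 0.4, 0.2, 0.0, -0.2]
donor_phone = [0.0, -0.2, -0.4, -0.2, 0.0, 0.2]

# Blend only the middle region; samples outside [2, 5) keep their values.
mixed = blend_region(good_phone, donor_phone, start=2, end=5, weight=0.5)
```

Restricting the interpolation to a sub-range is what distinguishes this from replacing a whole segment, which is the cut/paste approach the abstract contrasts it with.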

