Global ETD Search

321	Sociolinguistic Knowledge of Albanian Heritage Speakers in the U.S. Dickerson, Carly January 2021 (has links) No description available. Language Linguistics Families and Family Life Foreign Language Minority and Ethnic Groups Sociolinguistics Albanian heritage language heritage speaker social meaning acquisition immigrant language sociolinguistics
322	Automatic Speech Recognition in Somali Gabriel, Naveen January 2020 (has links) The field of speech recognition during the last decade has left the research stage and found its way into the public market, and today, speech recognition software is ubiquitous around us. An automatic speech recognizer understands human speech and represents it as text. Most of the current speech recognition software employs variants of deep neural networks. Before the deep learning era, the hybrid of hidden Markov model and Gaussian mixture model (HMM-GMM) was a popular statistical model to solve speech recognition. In this thesis, automatic speech recognition using HMM-GMM was trained on Somali data which consisted of voice recordings and their transcriptions. HMM-GMM is a hybrid system in which the framework is composed of an acoustic model and a language model. The acoustic model represents the time-variant aspect of the speech signal, and the language model determines how probable the observed sequence of words is. This thesis begins with background about speech recognition. The literature survey covers some of the work that has been done in this field. This thesis evaluates how different language models and discounting methods affect the performance of speech recognition systems. Also, log scores were calculated for the top 5 predicted sentences, along with confidence measures of predicted sentences. The model was trained on 4.5 hrs of voiced data and its corresponding transcription. It was evaluated on 3 mins of testing data. The performance of the trained model on the test set was good, given that the data was free of background noise and lacked variability. The performance of the model is measured using word error rate (WER) and sentence error rate (SER). The performance of the implemented model is also compared with the results of other research work. 
This thesis also discusses why the log and confidence scores of a sentence might not be a good way to measure the performance of the resulting model. It also discusses the shortcomings of the HMM-GMM model, how the existing model can be improved, and different alternatives to solve the problem. automatic speech recognition speaker adaptation generative training gaussian mixture model kaldi finite-state transducers Probability Theory and Statistics Sannolikhetsteori och statistik
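The abstract above names word error rate (WER) and sentence error rate (SER) as its evaluation metrics without defining them. The sketch below shows the conventional definitions: WER as word-level edit distance divided by the number of reference words, SER as the fraction of sentences containing at least one error. The function names and the toy sentences are illustrative assumptions, not taken from the thesis or from Kaldi's scoring tools.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(refs, hyps):
    """Total word-level edits divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    words = sum(len(r.split()) for r in refs)
    return errors / words


def sentence_error_rate(refs, hyps):
    """Fraction of sentences whose hypothesis differs from the reference."""
    wrong = sum(r.split() != h.split() for r, h in zip(refs, hyps))
    return wrong / len(refs)


# Hypothetical toy data: one substitution in the second sentence.
refs = ["the cat sat", "hello world"]
hyps = ["the cat sat", "hello word"]
print(word_error_rate(refs, hyps))      # 1 error over 5 reference words
print(sentence_error_rate(refs, hyps))  # 1 of 2 sentences is wrong
```

Because SER counts a sentence as wrong on any single word error, it is always at least as harsh as WER on short test sets such as the 3-minute set mentioned above.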
323	Living in two worlds : experiences of non-native english speakers in an accelerated second-degree baccalaureate nursing program Dudas, Kimberly 01 January 2014 (has links) Background: Students of diverse ethnic backgrounds, including nonnative English speakers, also known as those who speak English as an additional language (EAL) are increasingly enrolling in prelicensure nursing programs. Information regarding success of EAL nursing students is limited, with emphasis on traditional prelicensure programs. Purpose: The purpose of this study was to explore the lived experience of recent EAL graduates of an accelerated second-degree baccalaureate nursing program by offering a firsthand account of being an EAL student in this type of nursing program. Theoretical Framework: Leininger's Theory of Cultural Care Diversity and Universality and Vygotsky's Theory of Socio-Historical Learning served as the theoretical framework. Methods: The research tradition of hermeneutic phenomenology utilizing the van Manen approach was applied to this study. Results: The study revealed five major themes: bridging cultures, needing more time, myriad of emotions, network of support, and finding my way. Several subthemes emerged to support major themes illustrating the complexity of being an EAL student in a fast-paced and challenging program. Conclusions: Exploring experiences of EAL graduates while enrolled in an accelerated second-degree baccalaureate nursing program offers insight into the challenges faced by EAL students and potentially influences nursing education, practice, and policy to improve the numbers of diverse nurses. Health and environmental sciences Education Accelerated second-degree baccalaureate English as a second language English as an additional language Non-native english speaker Nursing student Nursing student lived experience Nursing
324	Optimalizace modelování gaussovských směsí v podprostorech a jejich skórování v rozpoznávání mluvčího / Optimization of Gaussian Mixture Subspace Models and Related Scoring Algorithms in Speaker Verification Glembek, Ondřej January 2012 (has links) This thesis deals with subspace modeling of Gaussian mixture model parameters for speaker recognition. The thesis consists of three parts. The first part is devoted to scoring methods when joint factor analysis is used to model the speaker. The methods studied differ mainly in how they deal with the channel variability of the test recordings. The methods are presented in the context of the general form of the likelihood function for joint factor analysis and are compared in terms of both accuracy and speed. It is shown that using a linear approximation of the likelihood function gives results comparable to the standard likelihood evaluation while dramatically simplifying the mathematical formulation and thereby increasing evaluation speed. The second part deals with the extraction of so-called i-vectors, i.e. low-dimensional representations of recordings. The thesis presents two approaches to simplifying the extraction. The motivation for this part was to speed up i-vector extraction, to deploy this successful technique on simple devices such as mobile phones, and to obtain a mathematical simplification that enables the use of numerical optimization methods for discriminative training. The results show that on long recordings the speed-up comes at the cost of reduced recognition accuracy, whereas on short recordings, where recognition accuracy is low, the differences in accuracy disappear. The third part deals with discriminative training in the field of speaker recognition. It summarizes the findings of previous work on this topic. The chapter builds on the findings of the previous two parts and discusses discriminative training of the i-vector extractor parameters. 
The results show that, with classical training of the extractor followed by discriminative retraining, these methods increase recognition accuracy.
325	Constructing a Gay Persona: A Sociophonetic Case Study of an LGBT Talk Show in Taiwan Pan, Junquan, Pan 10 December 2018 (has links) No description available. Linguistics Sociolinguistics Asian Studies Language the LGBT Community Gay Men Taiwan Mandarin Taiwan Chinese Linguistics Sociolinguistics Sociophonetics Speaker Design Indexicality Language Performance Identity Style Persona
326	Deep CASA for Robust Pitch Tracking and Speaker Separation Liu, Yuzhou January 2019 (has links) No description available. Computer Science Engineering Computational auditory scene analysis speech separation talker-independent speaker separation permutation invariant training pitch estimation multi-pitch estimation machine learning deep learning neural networks
327	Teknik för dokumentering av möten och konferenser / Technology for documenting meetings and conferences Stojanovic, Milan January 2019 (has links) Documentation of meetings and conferences is performed at most companies by one or more people sitting at a computer and typing what has been said during the meeting. This may lead to typing mistakes or incorrect perception by the person who records. The human factor is quite large. This work will focus on developing proposals for new technologies that reduce or eliminate the human factor, thus improving the documentation of meetings and conferences. It represents a problem for many companies and institutions, including Seavus Stockholm, where this study is conducted. It is assumed that most companies do not document their meetings and conferences in video or audio format, so this study will only be about text-based documentation. The aim of this study was to investigate how to implement new features and build a modern conference system, using modern technologies and new applications to improve the documentation of meetings and conferences. Speech to text in combination with speaker recognition is something that has not yet been implemented for such a purpose, and it can facilitate documenting meetings and conferences. To complete the study, several methods were combined to achieve the desired goals. First, the project's scope and objectives were defined. Then, based on analysis of the observations made of the company's documentation process, a design proposal was created. Following this, interviews with the stakeholders were conducted where the proposals were presented and a requirement specification was created. 
Next, the theory was studied to understand how the different techniques work, in order to design and create a proposal for the architecture. The result of this study contains a proposal for an architecture that shows that it is possible to implement these techniques to improve the documentation process. Furthermore, possible use cases and interaction diagrams are presented that show how the system may work. Although the proof of concept is considered to be satisfactory, additional work and testing are needed to fully implement and integrate the concept into reality. / Dokumentering av möten och konferenser utförs på de flesta företag av en eller flera personer som sitter vid en dator och antecknar det som har sagts under mötet. Det kan medföra att det som skrivs ner inte stämmer med det som har sagts eller att det uppfattades felaktigt av personen som antecknar. Den mänskliga faktorn är ganska stor. Detta arbete kommer att fokusera på att ta fram förslag på nya tekniker som minskar eller eliminerar den mänskliga faktorn, och därmed förbättrar dokumenteringen av möten och konferenser. Det föreställer ett problem för många företag och institutioner, däribland för Seavus Stockholm, där denna studie utförs. Det antas att de flesta företag inte dokumenterar deras möten och konferenser i video eller ljudformat, och därmed kommer denna studie bara att handla om dokumentering i textformat. Målet med denna studie var att undersöka hur man, med hjälp av moderna tekniker och nya tillämpningar, kan implementera nya funktioner och bygga ett modernt konferenssystem, för att förbättra dokumenteringen av möten och konferenser. Tal till text i kombination med talarigenkänning är något som ännu inte har implementerats för ett sådant ändamål, och det kan underlätta dokumenteringen av möten och konferenser. För att slutföra studien kombinerades flera metoder för att uppnå de önskade målen. Först definierades projektens omfattning och mål. 
Därefter, baserat på analys och observationer av företagets dokumenteringsprocess, skapades ett designförslag. Därefter genomfördes intervjuer med intressenterna där förslagen presenterades och en kravspecifikation skapades. Då studerades teorin för att skapa förståelse för hur olika tekniker arbetar, för att sedan designa och skapa ett förslag till arkitekturen. Resultatet av denna studie innehåller ett förslag till arkitektur, som visar att det är möjligt att implementera dessa tekniker för att förbättra dokumentationsprocessen. Dessutom presenteras möjliga användningsfall och interaktionsdiagram som visar hur systemet kan fungera. Även om beviset av konceptet anses vara tillfredsställande, ytterligare arbete och test behövs för att fullt ut implementera och integrera konceptet i verkligheten. Speech-to-text Speaker recognition Software development Architecture Design Conference tool Tal-till-text Talarigenkänning Systemutveckling Arkitektur Design Konferensverktyg Computer and Information Sciences Data- och informationsvetenskap
328	Penetration testing of a smart speaker / Penetrationstestning av en smart högtalare Nouiser, Amin January 2023 (has links) Smart speakers are becoming increasingly ubiquitous. Previous research has studied the security of these devices; however, only some studies have employed a penetration testing methodology. Moreover, most studies have only investigated models by well-known brands such as Amazon or Google. Therefore, there is a research gap regarding penetration tests of less popular smart speaker models. This study aims to address this gap by conducting a penetration test on the less popular JBL Link Music with firmware version 23063250. The results show that the speaker is subject to several security threats and is vulnerable to some attacks. The Bluetooth Low Energy implementation is vulnerable to passive eavesdropping. Additionally, the speaker is vulnerable to an 802.11 denial of service attack, and a boot log containing sensitive information can be accessed through a serial communication interface. It is concluded that the speaker is, in some aspects, insecure. / Smarta högtalare blir alltmer närvarande. Tidigare forskning har undersökt säkerheten kring dessa, dock har endast några använt en penetrerings testnings metodologi. Därutöver har de flesta studier endast studerat modeller av välkända varumärken som Google eller Amazon. Därmed finns en vetenskaplig kunskapslucka kring penetrationstester av mindre populära modeller. Denna studie syftar till att bemöta denna lucka genom att utföra ett penetrationstest av den mindre populära JBL Link Music med mjukvaruversion 23063250. Resultaten visar att högtalaren är utsatt för flera säkerhetshot och är sårbar för några attacker. Bluetooth Low Energy implementationen är sårbar för passiv avlyssning. Därutöver är högtalaren sårbar för en 802.11 denial of service attack och en boot logg innehållande känslig information kan nås genom ett seriellt kommunikations gränssnitt. 
Slutsatsen dras att högtalaren, i vissa aspekter, är osäker. Penetration testing Ethical hacking Smart speaker Cybersecurity Internet of Things Penetrationstestning Etisk hackning Smart högtalare Cybersäkerhet Sakernas Internet Computer and Information Sciences Data- och informationsvetenskap
329	Intercultural Sensitivity in First-Generation College Students Hunkler, Cassidi L. 05 June 2023 (has links) No description available. Educational Technology Sociolinguistics Communication English As A Second Language first-generation college student non-native english speaker intercultural sensitivity intercultural communication intercultural competency international instructor international teaching assistant
330	Experiments in speaker diarization using speaker vectors / Experiment med talarvektorer för diarisering Cui, Ming January 2021 (has links) Speaker Diarization is the task of determining ‘who spoke when?’ in an audio or video recording that contains an unknown amount of speech and also an unknown number of speakers. It has emerged as an increasingly important and dedicated domain of speech research. Initially, it was proposed as a research topic related to automatic speech recognition, where speaker diarization serves as an upstream processing step. Over recent years, however, speaker diarization has become a key technology for many tasks, such as navigation, retrieval, or higher-level inference on audio data. Our research focuses on existing speaker diarization algorithms. In particular, the thesis targets the differences between supervised and unsupervised methods. The aim of this thesis is to survey the state-of-the-art algorithms and analyze which algorithm is most suitable for our application scenarios. Its main contributions are (1) an empirical study of speaker diarization algorithms; (2) appropriate corpus data pre-processing; (3) an audio embedding network for creating d-vectors; (4) experiments on different algorithms and corpora and a comparison of them; (5) a recommendation suited to our requirements. The empirical study shows that, for the embedding extraction module, because neural networks can be trained on large datasets, diarization performance can be significantly improved by replacing i-vectors with d-vectors. Moreover, the differences between supervised and unsupervised methods lie mostly in the clustering module. The thesis uses only d-vectors as the input to the diarization network and selects two main algorithms for comparison: Spectral Clustering represents the unsupervised approach and Unbounded Interleaved-state Recurrent Neural Network (UIS-RNN) represents the supervised approach. 
/ Talardiarisering är uppgiften att bestämma ”vem talade när?” i en ljud- eller videoinspelning som innehåller en okänd mängd tal och även ett okänt antal talare. Det har framstått som en allt viktigare och dedikerad domän inom talforskning. Ursprungligen föreslogs det som ett forskningsämne relaterat till automatisk taligenkänning, där talardiarisering fungerar som ett processteg upströms. Under de senaste åren har dock talardiarisering blivit en viktig nyckelteknik för många uppgifter, till exempel navigering, hämtning, eller högre nivå slutledning på ljuddata. Vår forskning fokuserar på de befintliga algoritmerna för talare diarisering. Speciellt riktar sig avhandlingen på skillnaderna mellan övervakade och oövervakade metoder. Syftet med denna avhandling är att kontrollera de mest avancerade algoritmerna och analysera vilken algoritm som passar bäst för våra applikationsscenarier. Dess huvudsakliga bidrag är (1) en empirisk studie av algoritmer för talare diarisering; (2) lämplig förbehandling av corpusdata; (3) ljudinbäddningsnätverk för att skapa d-vektorer; (4) experiment på olika algoritmer och corpus och jämförelse av dem; (5) en bra rekommendation för våra krav. Den empiriska studien visar att för inbäddning av extraktionsmodul, på grund av de neurala nätverken kan utbildas med stora datamängder, diariseringsprestandan kan förbättras avsevärt genom att ersätta i-vektorer med d-vektorer. Dessutom är skillnaderna mellan övervakade metoder och oövervakade metoder mestadels i klustermodulen. Avhandlingen använder endast d-vektorer som ingång till diariseringsnätverk och väljer två huvudalgoritmer som jämförobjekt: Spektralkluster representerar oövervakad metod och obegränsat återkommande neuralt nätverk (UIS-RNN) representerar övervakad metod. 
Speaker Diarization Embedding Extraction Module Deep Learning Supervised method Unsupervised method Talardiarisering inbäddning av extraktionsmodul djupinlärning övervakad metod oövervakad metod Computer and Information Sciences Data- och informationsvetenskap
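The abstract for entry 330 names spectral clustering as the unsupervised way to group per-segment speaker embeddings. As a rough illustration only, the sketch below partitions segments into two speakers using the sign of the second eigenvector of a normalized graph Laplacian built from cosine similarities. The two-speaker restriction, the affinity mapping, the function name, and the made-up 2-D vectors standing in for d-vectors are all assumptions for this example. It is not the thesis's implementation.

```python
import numpy as np


def spectral_bipartition(embeddings):
    """Split segment embeddings into two clusters (labels 0 and 1).

    Assumes exactly two speakers and that no segment ends up with
    zero total affinity to the others.
    """
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)

    # Cosine similarity mapped from [-1, 1] to [0, 1]; no self-loops.
    affinity = (x @ x.T + 1.0) / 2.0
    np.fill_diagonal(affinity, 0.0)

    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1))
    laplacian = np.eye(len(x)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; the eigenvector of the
    # second-smallest eigenvalue carries the two-way partition in its signs.
    _, eigvecs = np.linalg.eigh(laplacian)
    labels = (eigvecs[:, 1] > 0).astype(int)

    # Eigenvector sign is arbitrary, so fix the first segment to label 0.
    if labels[0] == 1:
        labels = 1 - labels
    return labels


# Hypothetical embeddings: three segments per speaker.
segments = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
            [0.0, 1.0], [0.1, 0.9], [0.2, 1.0]]
print(spectral_bipartition(segments))
```

With more than two speakers, the usual extension is to take several leading eigenvectors and run k-means on their rows. A supervised model such as UIS-RNN instead learns the assignment from labeled data.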
