331

The Rhetoric of Comparison in the YMCA: Belletristic Rhetoric and the Native Speaker Ideal

Cummings, Lance 23 July 2014
No description available.
332

On Generalization of Supervised Speech Separation

Chen, Jitong 30 August 2017
No description available.
333

Deictic Reference: Arabs vs. Arab Americans

Esseili, Fatima A. 20 June 2006
No description available.
334

“It’s quite usual that the pupils ask if they have to speak with British or American pronunciation” : A Qualitative Study Concerning the Role of Pronunciation in the Swedish Upper Secondary ELT Classroom / "Det är ganska vanligt att eleverna frågor om de måste tala med brittiskt eller amerikanskt uttal" : En Kvalitativ Studie om Uttalets Roll i den Svenska Gymnasieskolans Engelskundervisning

Löf, Hanna January 2024
This study explores how upper secondary school English teachers in Sweden view English pronunciation in the classroom, particularly in the context of English as a global language and the native speaker ideal. The results are analysed from a sociocultural perspective, highlighting contextual and cultural aspects of pronunciation teaching. Despite its importance for effective communication in a global context, the current English syllabi provide limited directives on teaching the productive skill of pronunciation. The study is based on eight semi-structured interviews with Swedish upper secondary school English teachers. Three research questions were formulated and addressed in the results section concerning the teachers’ perspectives on English pronunciation, their methods and strategies, and attitudes toward language varieties and the native speaker ideal. Furthermore, thematic analysis was utilised to identify themes that corresponded to the aim of the study. The findings indicate that pronunciation is primarily addressed alongside receptive skills, and the teachers prioritise communicative ability over adherence to a specific pronunciation norm. Teachers’ emphasis on pronunciation varies according to their pupils’ needs. While the teachers recognise the importance of exposing pupils to diverse English varieties, 60% give inner-circle forms as examples. Furthermore, the teachers perceive that pupils often aspire to sound native, yet the teachers accept different English varieties in the classroom. The study discusses the use of a Swedish variety of English as both an opportunity and a challenge. Moreover, various teaching strategies are reported, with reading aloud being employed by half of the teachers and a phonetic approach by two. The findings suggest that pronunciation teaching is a context-sensitive practice influenced by ongoing negotiations between teachers and pupils in the specific classroom.
335

Particularity, practicality and possibility: an investigation into the awareness and use of communicative language teaching methodology in a college of higher education in Oman

McLean, Alistair Charles 16 September 2011
This study investigates awareness and use of communicative language teaching methodology (CLT) in a foundation programme at an institution of higher learning in the Sultanate of Oman, where rapid expansion and a reliance on expatriate skills have resulted in the employment of predominantly native English teachers, many with inadequate formal teacher training. The qualitative research methodology employed involved a core of five teachers using three data-gathering instruments and ten additional English language teachers who responded to a questionnaire. The study finds that the majority of teachers have inadequate knowledge of the CLT approach and do not use it in the classroom. The findings suggest that an adapted version of CLT which embraces local contextual and sociocultural conditions may be pedagogically viable. The study draws comparisons between the idea of a hypothetical, “adapted” version of CLT and the notions of “particularity, practicality and possibility” as suggested by Kumaravadivelu (2006). / English Studies / M.A. (Specialisation in Teaching English to Speakers of Other Languages, TESOL)
336

Use of Coherent Point Drift in computer vision applications

Saravi, Sara January 2013
This thesis presents the novel use of Coherent Point Drift (CPD) in improving the robustness of a number of computer vision applications. The CPD approach includes two methods for registering two images - rigid and non-rigid point set approaches - which are based on the transformation model used. The key characteristic of a rigid transformation is that the distance between points is preserved, which means it can be used in the presence of translation, rotation, and scaling. Non-rigid transformations - or affine transforms - provide the opportunity of registering under non-uniform scaling and skew. The idea is to move one point set coherently to align with the second point set. The CPD method finds both the non-rigid transformation and the correspondence distance between two point sets at the same time without requiring an a priori declaration of the transformation model used. The first part of this thesis is focused on speaker identification in video conferencing. A real-time, audio-coupled, video-based approach is presented, which focuses more on the video analysis side than on the audio analysis, which is known to be prone to errors. CPD is effectively utilised for lip movement detection, and a temporal face detection approach is used to minimise false positives if the face detection algorithm fails to perform. The second part of the thesis is focused on multi-exposure and multi-focus image fusion with compensation for camera shake. Scale Invariant Feature Transforms (SIFT) are first used to detect keypoints in the images being fused. Subsequently this point set is reduced to remove outliers using RANSAC (RANdom Sample Consensus), and finally the point sets are registered using CPD with non-rigid transformations. The registered images are then fused with a Contourlet-based image fusion algorithm that makes use of a novel alpha blending and filtering technique to minimise artefacts.
The thesis evaluates the performance of the algorithm in comparison to a number of state-of-the-art approaches, including the key commercial products available in the market at present, showing significantly improved subjective quality in the fused images. The final part of the thesis presents a novel approach to Vehicle Make & Model Recognition in CCTV video footage. CPD is used to effectively remove skew of vehicles detected as CCTV cameras are not specifically configured for the VMMR task and may capture vehicles at different approaching angles. A LESH (Local Energy Shape Histogram) feature based approach is used for vehicle make and model recognition with the novelty that temporal processing is used to improve reliability. A number of further algorithms are used to maximise the reliability of the final outcome. Experimental results are provided to prove that the proposed system demonstrates an accuracy in excess of 95% when tested on real CCTV footage with no prior camera calibration.
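The distance-preservation property that the abstract above attributes to rigid transformations can be checked numerically. The following is a minimal sketch, not code from the thesis: the function names `rigid_transform` and `pairwise_distances` and the sample points are hypothetical, and it covers only rotation plus translation in 2D. Uniform scaling, by contrast, multiplies every distance by the scale factor.

```python
import math

def rigid_transform(points, angle_rad, translation):
    """Rotate 2D points by angle_rad about the origin, then translate them."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    tx, ty = translation
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def pairwise_distances(points):
    """Return the distances between every unordered pair of points."""
    return [
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    ]

# Illustrative point set (an assumption, not data from the thesis).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
moved = rigid_transform(points, math.pi / 3, (5.0, -2.0))

before = pairwise_distances(points)
after = pairwise_distances(moved)
preserved = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after))
print("distances preserved:", preserved)
```

Registration methods such as CPD estimate the rotation and translation from the data; this sketch only applies a known transform to show what the rigid model leaves unchanged.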
337

Speaker adaptation of deep neural network acoustic models using Gaussian mixture model framework in automatic speech recognition systems / Utilisation de modèles gaussiens pour l'adaptation au locuteur de réseaux de neurones profonds dans un contexte de modélisation acoustique pour la reconnaissance de la parole

Tomashenko, Natalia 01 December 2017
Differences between training and testing conditions may significantly degrade recognition accuracy in automatic speech recognition (ASR) systems. Adaptation is an efficient way to reduce the mismatch between models and data from a particular speaker or channel.
There are two dominant types of acoustic models (AMs) used in ASR: Gaussian mixture models (GMMs) and deep neural networks (DNNs). The GMM hidden Markov model (GMM-HMM) approach has been one of the most common techniques in ASR systems for many decades. Speaker adaptation is very effective for these AMs and various adaptation techniques have been developed for them. On the other hand, DNN-HMM AMs have recently achieved big advances and outperformed GMM-HMM models for various ASR tasks. However, speaker adaptation is still very challenging for these AMs. Many adaptation algorithms that work well for GMM systems cannot be easily applied to DNNs because of the different nature of these models. The main purpose of this thesis is to develop a method for efficient transfer of adaptation algorithms from the GMM framework to DNN models. A novel approach for speaker adaptation of DNN AMs is proposed and investigated. The idea of this approach is based on using so-called GMM-derived features as input to a DNN. The proposed technique provides a general framework for transferring adaptation algorithms, developed for GMMs, to DNN adaptation. It is explored for various state-of-the-art ASR systems and is shown to be effective in comparison with other speaker adaptation techniques and complementary to them.
339

Traitement neuronal des voix et familiarité : entre reconnaissance et identification du locuteur [Neural processing of voices and familiarity: between speaker recognition and identification]

Plante-Hébert, Julien 12 1900
The human ability to recognize and identify speakers by their voices is unique and can be critical in criminal investigations. However, the lack of knowledge about how this capacity works overshadows its application in the field of “forensic phonetics”. The main objective of this thesis is to characterize the processing of voices in the human brain and the parameters that influence it. In a first experiment, event-related potentials (ERPs) were used to establish that intimately familiar voices are processed differently from unknown voices, even when the latter are repeated. This experiment also served to establish a clear distinction between neural components of speaker recognition and identification supported by corresponding ERP components (respectively the P2 and the LPC). An essential contrast between the processes underlying the recognition of intimately familiar voices (P2) and that of unknown but previously heard voices (N250) was also observed. In addition to clarifying the terminology of voice processing, the first study in this thesis is the first to unambiguously distinguish between speaker recognition and identification in terms of ERPs.
This contribution is major, especially when it comes to applications of voice processing in forensic phonetics. A second experiment focused more specifically on the effects of learning modalities on later speaker identification. ERPs to trained voices were analysed along with behavioral responses of speaker identification following a learning phase where participants were trained on voices in three modalities: audio only, audiovisual and audiovisual interactive. Although the ERP responses for the trained voices showed effects on the same components (P2 and LPC) across the three training conditions, the range of these responses varied. The analysis of these components first revealed a face overshadowing effect (FOE) resulting in an impaired encoding of voice information. This well-documented effect resulted in a smaller LPC for the audiovisual condition compared to the audio only condition. However, effects of the audiovisual interactive condition appeared to minimize this FOE when compared to the passive audiovisual condition. Overall, the data presented in both experiments are generally congruent and indicate that the P2 and the LPC are reliable electrophysiological markers of speaker recognition and identification. The implications of these findings for current voice processing models and for the field of forensic phonetics are discussed.
340

Evaluation of Methods for Sound Source Separation in Audio Recordings Using Machine Learning

Gidlöf, Amanda January 2023
Sound source separation is a popular and active research area, especially with modern machine learning techniques. In this thesis, the focus is on single-channel separation of two speakers into individual streams, specifically considering the case where the two speakers are also accompanied by background noise. There are different methods to separate speakers, and in this thesis three are evaluated: the Conv-TasNet, the DPTNet, and the FaSNetTAC. The methods were used to train models to perform the sound source separation. These models were evaluated and validated through three experiments. Firstly, previous results for the chosen separation methods were reproduced. Secondly, appropriate models applicable to NFC's datasets and applications were created, to fulfill the aim of this thesis. Lastly, all models were evaluated on an independent dataset, similar to datasets from NFC. The results were evaluated using the metrics SI-SNRi and SDRi. This thesis provides recommended models and methods suitable for NFC applications, concluding in particular that the Conv-TasNet and the DPTNet are reasonable choices.
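For readers unfamiliar with the SI-SNRi metric named in the abstract above, the following is a minimal sketch of the commonly used definition: the scale-invariant signal-to-noise ratio (SI-SNR) of the separated estimate minus that of the unprocessed mixture, both measured against the clean reference. It is not code from the thesis; the function names, the `eps` stabiliser and the toy signals are assumptions for illustration.

```python
import math

def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length signals."""
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    n = len(reference)
    # Remove the mean so a constant offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    # Everything not explained by the reference counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise) + eps
    return 10.0 * math.log10(target_energy / noise_energy + eps)

def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: gain in SI-SNR of the estimate over the unprocessed mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)

# Toy signals (assumptions): a clean reference, a mixture with a strong
# interferer, and an estimate in which the interferer is mostly suppressed.
reference = [1.0, -1.0, 1.0, -1.0, 2.0, -2.0]
interferer = [0.5, 0.5, -0.5, -0.5, 0.5, -0.5]
mixture = [r + i for r, i in zip(reference, interferer)]
estimate = [r + 0.1 * i for r, i in zip(reference, interferer)]

print(f"SI-SNRi: {si_snr_improvement(estimate, mixture, reference):.2f} dB")
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is the property that distinguishes SI-SNR from a plain SNR.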
