  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
101

RAMBLE: robust acoustic modeling for Brazilian learners of English / RAMBLE: modelagem acústica robusta para estudantes brasileiros de Inglês

Shulby, Christopher Dane 08 August 2018 (has links)
The gains made by current deep-learning techniques have often come with the price tag of big data; where that data is not available, a new solution must be found. Such is the case for accented and noisy speech, where large databases do not exist and data augmentation techniques, which are less than perfect, present an even larger obstacle. Another problem is that state-of-the-art results are rarely reproducible, because they rely on proprietary datasets, pretrained networks, and/or weight initializations from other, larger networks. A low-resource scenario exists even in the fifth-largest country in the world, home to most of the speakers of the seventh most spoken language on earth. Brazil is the leader in the Latin American economy and, as a BRIC country, aspires to become an ever-stronger player in the global marketplace. Still, English proficiency is low, even for professionals in businesses and universities, and low intelligibility and strong accents can damage professional credibility. The foreign-language-teaching literature has established that adult learners should be made aware of their errors, as outlined by Noticing Theory, which holds that learners are more successful when they can learn from their own mistakes. An essential objective of this dissertation is to classify phonemes in the acoustic model, which is needed to identify phonemic errors automatically. A common belief in the community is that deep learning requires large datasets to be effective. This happens because brute-force methods create a highly complex hypothesis space, which requires large and complex networks, which in turn demand a great number of data samples to generate useful networks. Moreover, the loss functions used in neural learning do not provide statistical learning guarantees; they only guarantee that the network can memorize the training space well. 
In the case of accented or noisy speech, where a new sample can carry a great deal of variation from the training samples, the generalization of such models suffers. The main objective of this dissertation is to investigate how more robust acoustic generalizations can be made, even with little data and noisy, accented-speech data. The approach here is to take advantage of the raw feature extraction provided by deep-learning techniques and instead focus on how learning guarantees can be provided for small datasets, producing robust results for acoustic modeling without a dependency on big data. This is done by careful and intelligent parameter and architecture selection within the framework of statistical learning theory. Here, an intelligently defined CNN architecture, together with context windows and a knowledge-driven hierarchical tree of SVM classifiers, achieves nearly state-of-the-art frame-wise phoneme recognition results with absolutely no pretraining or external weight initialization. A goal of this thesis is to produce transparent and reproducible architectures with high frame-level accuracy, comparable to the state of the art. Additionally, a convergence analysis based on the learning guarantees of statistical learning theory is performed to demonstrate the generalization capacity of the model. The model achieves 39.7% error in frame-wise classification and a 43.5% phone error rate using deep feature extraction and SVM classification, even with little data (less than 7 hours). These results are comparable to studies which use well over ten times that amount of data. Beyond the intrinsic evaluation, the model also achieves an accuracy of 88% in the identification of epenthesis, the error which is most difficult for Brazilian speakers of English. This is a 69% relative gain over the previous values in the literature. 
The results are significant because they show how deep feature extraction can be applied to little-data scenarios, contrary to popular belief. The extrinsic, task-based results also show how this approach could be useful in tasks like automatic error diagnosis. Another contribution is the publication of a number of freely available resources which previously did not exist, meant to aid future research in dataset creation.
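The record above describes a pipeline of deep feature extraction, context windows, and margin-based classification. The following sketch illustrates only the general shape of that idea, under heavy assumptions: Gaussian synthetic vectors stand in for CNN-extracted frame features, and a single RBF-kernel SVM replaces the thesis's knowledge-driven hierarchical tree of classifiers. It is not the thesis's actual architecture.

```python
import numpy as np
from sklearn.svm import SVC

def context_windows(frames, width=5):
    """Stack each frame with its neighbours (edge-padded) into one
    vector, so the classifier sees local acoustic context."""
    pad = width // 2
    padded = np.pad(frames, ((pad, pad), (0, 0)), mode="edge")
    return np.stack([padded[i:i + width].ravel()
                     for i in range(len(frames))])

# Synthetic stand-in for CNN-extracted frame features: two "phoneme"
# classes of 13-dimensional frames (these data are invented).
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0.0, 1.0, (100, 13)),
                    rng.normal(3.0, 1.0, (100, 13))])
y = np.array([0] * 100 + [1] * 100)

Xw = context_windows(X)              # shape: (200, 5 * 13)
clf = SVC(kernel="rbf").fit(Xw, y)   # one SVM in place of the SVM tree
acc = clf.score(Xw, y)               # training accuracy on the toy data
```

On these well-separated synthetic clusters the SVM fits easily; the point is only the flow frames → context windows → margin-based classifier.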
102

Cortical and subcortical mechanisms of persistent stuttering / Kortikale und subkortikale Mechanismen bei persistentem Stottern

Neef, Nicole 10 January 2011 (has links)
No description available.
103

Implementing the teaching handwriting, reading and spelling skills programme with an intermediate phase deaf Gauteng learner using the spoken language approach

Mumford, Vivien Patricia 01 1900 (has links)
The rationale for this study was to investigate the implementation of the THRASS literacy programme with a deaf learner who uses the spoken language approach. Particular emphasis was given to the role played by the Phoneme Machine together with Cued Speech. THRASS focuses on phoneme-grapheme correspondence through explicit phonics instruction to develop word-analysis and word-recognition skills. Cued Speech is used as an instructional tool to facilitate visual access to auditory-based phonology. The research was framed within the interpretivist paradigm and a qualitative case-study design predominated, although the launch and landing of the study were quantitative in nature. The findings indicated that the auditory-based phonology of the English language may be accessed by a deaf learner, when supported by a visual instructional tool such as Cued Speech in synchrony with speech-reading, to develop print literacy skills. This study opens the gateway to further enquiry on enhancing deaf literacy levels. / Inclusive Education / M. Ed. (Inclusive Education)
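The phoneme-grapheme correspondences at the heart of the programme can be illustrated with a toy greedy decoder. The grapheme table below is invented for this sketch and is not THRASS's own chart:

```python
# Toy grapheme-to-phoneme table; the entries are invented for
# illustration and are not the programme's actual grapheme chart.
G2P = {"sh": "ʃ", "th": "θ", "ch": "tʃ", "ee": "iː",
       "s": "s", "t": "t", "p": "p", "i": "ɪ", "n": "n", "o": "ɒ"}

def to_phonemes(word):
    """Greedy longest-match segmentation of a written word into
    graphemes, then lookup of each grapheme's phoneme."""
    out, i = [], 0
    while i < len(word):
        for size in (2, 1):  # prefer digraphs like "sh" over single letters
            chunk = word[i:i + size]
            if chunk in G2P:
                out.append(G2P[chunk])
                i += size
                break
        else:
            raise KeyError(f"no grapheme match at {word[i:]!r}")
    return out

result = to_phonemes("sheep")  # segments "sh" + "ee" + "p"
```

The longest-match rule is what lets "sh" map to /ʃ/ rather than /s/ followed by /h/ — the kind of explicit correspondence the programme teaches.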
105

Spoken language identification in resource-scarce environments

Peche, Marius 24 August 2010 (has links)
South Africa has eleven official languages, ten of which are considered “resource-scarce”. For these languages, even the basic linguistic resources required for the development of speech-technology systems can be difficult or impossible to obtain. In this thesis, the process of developing Spoken Language Identification (S-LID) systems in resource-scarce environments is investigated. A Parallel Phoneme Recognition followed by Language Modeling (PPR-LM) architecture is utilized and three specific scenarios are investigated: (1) incomplete resources, including the lack of audio transcriptions and/or pronunciation dictionaries; (2) inconsistent resources, including the use of speech corpora that are unmatched with regard to domain or channel characteristics; and (3) poor-quality resources, such as wrongly labeled or poorly transcribed data. Each situation is analysed, techniques are defined to mitigate the effect of limited or poor-quality resources, and the effectiveness of these techniques is evaluated experimentally. Techniques evaluated include the development of orthographic tokenizers, bootstrapping of transcriptions, filtering of low-quality audio, diarization and channel-normalization techniques, and the human verification of misclassified utterances. The knowledge gained from this research is used to develop the first S-LID system able to distinguish between all South African languages. The system performs well: it differentiates among the eleven languages with an accuracy above 67%, and among the six primary South African language families with an accuracy higher than 80%, on speech segments between 2 s and 10 s in length. / Copyright / Dissertation (MEng)--University of Pretoria, 2010. / Electrical, Electronic and Computer Engineering / unrestricted
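The PPR-LM back-end can be sketched in a few lines. The toy below skips the parallel phoneme-recognizer front-ends entirely: it takes decoded phoneme strings as given and shows only the scoring step, where one smoothed bigram language model per candidate language scores the string and the highest-scoring language wins. The two "languages" and their training strings are invented.

```python
import math
from collections import Counter

def bigrams(seq):
    return list(zip(seq, seq[1:]))

class BigramLM:
    """Add-one-smoothed phoneme bigram model, one per candidate language."""
    def __init__(self, training_strings):
        self.counts = Counter(bg for s in training_strings for bg in bigrams(s))
        self.total = sum(self.counts.values())

    def score(self, phonemes):
        V = 50  # placeholder vocabulary size for the smoothing denominator
        return sum(math.log((self.counts[bg] + 1) / (self.total + V))
                   for bg in bigrams(phonemes))

def identify(phonemes, models):
    """Pick the language whose LM scores the decoded string highest."""
    return max(models, key=lambda name: models[name].score(phonemes))

# Invented phoneme strings for two hypothetical languages "A" and "B".
models = {
    "A": BigramLM([list("sasisa"), list("sisasi")]),
    "B": BigramLM([list("takota"), list("kotaka")]),
}
result = identify(list("sasasi"), models)  # decoded utterance to classify
```

In a full PPR-LM system each language would additionally have its own phoneme recognizer producing the input string; only the language-modeling half is shown here.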
106

韻尾類比訓練對國小六年級學生英文讀字能力之成效研究 / The effects of rime analogy training on word reading for EFL sixth graders

黃秀玉, Huang, Shiu Yu Unknown Date (has links)
The purpose of this study is to explore the effects of rime analogy training on sixth graders with respect to their decoding skills, attitudinal changes towards reading English words, and perceived difficulties with word reading. The study comprised two phases: the first a small-scale pilot study, the second a formal study. The pilot functioned as preparatory work for the formal study: the testing materials, instruments, and training activities were tested and revised to be more suitable for the formal study. From the students’ responses, the researcher obtained insights into their thinking processes and learning difficulties, which allowed for designing a more complete interview for the formal study. In the formal study, there was an experimental group and a control group, each comprising 25 sixth graders from two classes in one elementary school in Tao Yuan county. The experimental group received rime analogy training. The teaching materials were selected from the word banks of the participants’ second- to fifth-grade textbooks, as a basis for making analogies. The control group was taught with the same materials but received phonics instruction that focused only on grapheme-phoneme correspondence rules. 
Both groups received two 10-minute training sessions a week for 10 weeks, and were administered the same pre- and post-test (a generalization test) to assess decoding skills, and a pre- and post-training questionnaire on attitudes toward reading English words. After the training, six participants from each group were interviewed to understand their thinking processes and perceived difficulties. The findings are as follows. In terms of decoding skills, the post generalization test showed no statistically significant difference between the two groups. In terms of attitudinal changes, only the within-group comparison for the experimental group was significant. As for perceived difficulties, the interviews revealed that the difficulties in the control group were more complicated than those in the experimental group. The most noteworthy finding is that the lowest-proficiency participants in the experimental group not only outperformed their counterparts in the control group in decoding skills, but also demonstrated far more positive attitudinal changes after the training. The findings provide supporting evidence for the value of rime analogy training in promoting students’ decoding abilities and positively changing their learning attitudes. The nature of students’ perceived difficulties is also discussed, and several pedagogical implications and suggestions for future studies are outlined.
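The mechanics of rime analogy — pronouncing a new word by borrowing the rime of a known word and swapping the onset — can be sketched as follows. The taught lexicon and onset sound map below are invented for illustration; they are not the study's materials.

```python
# Hypothetical taught lexicon: word -> (onset sound, rime sounds), in IPA.
TAUGHT = {"cat": ("k", "æt"), "light": ("l", "aɪt"), "beak": ("b", "iːk")}

# Toy onset letter-to-sound map (illustrative only).
ONSET_SOUNDS = {"h": "h", "n": "n", "p": "p", "s": "s"}

VOWELS = "aeiou"

def split_onset_rime(word):
    """Split a written word at its first vowel letter into onset + rime."""
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[:i], word[i:]
    return word, ""

def read_by_analogy(new_word):
    """Pronounce new_word by borrowing the rime pronunciation of a
    taught word with the same spelling rime, swapping in the new onset."""
    onset, rime = split_onset_rime(new_word)
    for taught, (_, taught_rime_sounds) in TAUGHT.items():
        if split_onset_rime(taught)[1] == rime:
            return ONSET_SOUNDS[onset] + taught_rime_sounds
    raise KeyError(f"no analogy word for rime {rime!r}")

result = read_by_analogy("night")  # analogy with the taught word "light"
```

This is the strategy trained in the experimental group: a learner who can already read "light" decodes "night" without sounding out each grapheme.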
107

Examination of the (si) and (ʃi) confusion by Japanese ESL learners

Nogita, Akitsugu 30 August 2010 (has links)
It is a general belief in Japan that the English /s/ and /ʃ/ before high front vowels (as in "see" and "she") are problematic for Japanese ESL (English-as-a-second-language) learners, and some research has also reported /s/ and /ʃ/ confusion by Japanese ESL learners. These pronunciation errors are often explained in phonetic terms, but there are reasons to believe that the learners’ knowledge of the phonemes of the target words is at fault. This study examines 1) whether monolingual Japanese speakers distinguish the [si] and [ʃi] syllables in both perception and production in Japanese contexts, and 2) if they do, what the sources of Japanese speakers’ difficulty in mastering the [si]/[ʃi] distinction in their English production might be. Two experiments were conducted. In the first, 93 monolingual Japanese speakers between the ages of 17 and 89, in and around Tôkyô, read aloud written stimuli containing [si] and [ʃi] in Japanese contexts, repeated sound stimuli containing [si] and [ʃi] in Japanese contexts, and listened to the [si:] and [ʃi:] syllables in isolation, recorded by a native speaker of Canadian English. The results showed that all participants distinguished [si] and [ʃi] in both perception and production, regardless of age. Based on these results, I hypothesized that the [s] and [ʃ] confusion by Japanese ESL learners is caused by misunderstanding rather than an inability to articulate these sounds. In the second experiment, 27 Japanese ESL students were recorded reading an English passage containing /s/ (7 times) and /ʃ/ (11 times) before high front vowels. After the reading, the participants were taught the basic English phonological system and symbol-sound correspondence rules such as “s”-/s/ and “sh”-/ʃ/. 
The lesson lasted 40 minutes, during which the participants were also interviewed to probe their awareness of symbol-sound correspondence. No articulation explanations were given during the lesson. After the lesson, the participants read the same passage again. The results showed that /s/ and /ʃ/ were mispronounced 39 and 67 times respectively in total by the 27 participants before the lesson, but only 7 and 19 times after it. These changes are statistically significant. Moreover, the interviews during the lesson revealed that the participants lacked phonological awareness in English as well as knowledge of the symbol-sound correspondence rules. This study concludes that many of the mispronunciations by Japanese ESL learners, including /s/ and /ʃ/, can be resolved by teaching English phonics rules and some basic phonological rules, without teaching the articulation of these sounds.
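The abstract reports that the before/after error drops are statistically significant but does not name the test used. As a rough plausibility check, a simple two-proportion z-test on the pooled token counts (a hedged choice: it treats tokens as independent across readers, which repeated-measures data do not strictly satisfy) places both drops far beyond the conventional 1.96 threshold:

```python
import math

def two_proportion_z(err_before, err_after, n):
    """z-statistic for the drop in error rate over the same n tokens.
    One standard choice among many; not necessarily the thesis's test."""
    p1, p2 = err_before / n, err_after / n
    p = (err_before + err_after) / (2 * n)        # pooled proportion
    se = math.sqrt(p * (1 - p) * (2 / n))          # pooled standard error
    return (p1 - p2) / se

# 27 readers; /s/ occurs 7 times and /ʃ/ 11 times in the passage.
n_s, n_sh = 27 * 7, 27 * 11            # 189 and 297 tokens in total
z_s = two_proportion_z(39, 7, n_s)     # /s/: 39 errors before, 7 after
z_sh = two_proportion_z(67, 19, n_sh)  # /ʃ/: 67 errors before, 19 after
```

Both statistics come out around 5, consistent with the abstract's claim of significance.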
