Global ETD Search

1	The effects of part–of–speech tagging on text–to–speech synthesis for resource–scarce languages / G.I. Schlünz Schlünz, Georg Isaac January 2010 (has links) In the world of human language technology, resource–scarce languages (RSLs) suffer from the problem of little available electronic data and linguistic expertise. The Lwazi project in South Africa is a large–scale endeavour to collect and apply such resources for all eleven of the official South African languages. One of the deliverables of the project is more natural text–to–speech (TTS) voices. Naturalness is primarily determined by prosody and it is shown that many aspects of prosodic modelling is, in turn, dependent on part–of–speech (POS) information. Solving the POS problem is, therefore, a prudent first step towards meeting the goal of natural TTS voices. In a resource–scarce environment, obtaining and applying the POS information are not trivial. Firstly, an automatic tagger is required to tag the text to be synthesised with POS categories, but state–of–the–art POS taggers are data–driven and thus require large amounts of labelled training data. Secondly, the subsequent processes in TTS that are used to apply the POS information towards prosodic modelling are resource–intensive themselves: some require non–trivial linguistic knowledge; others require labelled data as well. The first problem asks the question of which available POS tagging algorithm will be the most accurate on little training data. This research sets out to answer the question by reviewing the most popular supervised data–driven algorithms. Since literature to date consists mostly of isolated papers discussing one algorithm, the aim of the review is to consolidate the research into a single point of reference. A subsequent experimental investigation compares the tagging algorithms on small training data sets of English and Afrikaans, and it is shown that the hidden Markov model (HMM) tagger outperforms the rest when using both a comprehensive and a reduced POS tagset. Regarding the second problem, the question arises whether it is perhaps possible to circumvent the traditional approaches to prosodic modelling by learning the latter directly from the speech data using POS information. In other words, does the addition of POS features to the HTS context labels improve the naturalness of a TTS voice? Towards answering this question, HTS voices are trained from English and Afrikaans prosodically rich speech. The voices are compared with and without POS features incorporated into the HTS context labels, analytically and perceptually. For the analytical experiments, measures of prosody to quantify the comparisons are explored. It is then also noted whether the results of the perceptual experiments correlate with their analytical counterparts. It is found that, when a minimal feature set is used for the HTS context labels, the addition of POS tags does improve the naturalness of the voice. However, the same effect can be accomplished by including segmental counting and positional information instead of the POS tags. / Thesis (M.Sc. Engineering Sciences (Electrical and Electronic Engineering))--North-West University, Potchefstroom Campus, 2011. part-of-speech tagging text-to-speech synthesis resource-scarce language naturalness prosody HTS context labels
2	The effects of part–of–speech tagging on text–to–speech synthesis for resource–scarce languages / G.I. Schlünz Schlünz, Georg Isaac January 2010 (has links) In the world of human language technology, resource–scarce languages (RSLs) suffer from the problem of little available electronic data and linguistic expertise. The Lwazi project in South Africa is a large–scale endeavour to collect and apply such resources for all eleven of the official South African languages. One of the deliverables of the project is more natural text–to–speech (TTS) voices. Naturalness is primarily determined by prosody and it is shown that many aspects of prosodic modelling is, in turn, dependent on part–of–speech (POS) information. Solving the POS problem is, therefore, a prudent first step towards meeting the goal of natural TTS voices. In a resource–scarce environment, obtaining and applying the POS information are not trivial. Firstly, an automatic tagger is required to tag the text to be synthesised with POS categories, but state–of–the–art POS taggers are data–driven and thus require large amounts of labelled training data. Secondly, the subsequent processes in TTS that are used to apply the POS information towards prosodic modelling are resource–intensive themselves: some require non–trivial linguistic knowledge; others require labelled data as well. The first problem asks the question of which available POS tagging algorithm will be the most accurate on little training data. This research sets out to answer the question by reviewing the most popular supervised data–driven algorithms. Since literature to date consists mostly of isolated papers discussing one algorithm, the aim of the review is to consolidate the research into a single point of reference. A subsequent experimental investigation compares the tagging algorithms on small training data sets of English and Afrikaans, and it is shown that the hidden Markov model (HMM) tagger outperforms the rest when using both a comprehensive and a reduced POS tagset. Regarding the second problem, the question arises whether it is perhaps possible to circumvent the traditional approaches to prosodic modelling by learning the latter directly from the speech data using POS information. In other words, does the addition of POS features to the HTS context labels improve the naturalness of a TTS voice? Towards answering this question, HTS voices are trained from English and Afrikaans prosodically rich speech. The voices are compared with and without POS features incorporated into the HTS context labels, analytically and perceptually. For the analytical experiments, measures of prosody to quantify the comparisons are explored. It is then also noted whether the results of the perceptual experiments correlate with their analytical counterparts. It is found that, when a minimal feature set is used for the HTS context labels, the addition of POS tags does improve the naturalness of the voice. However, the same effect can be accomplished by including segmental counting and positional information instead of the POS tags. / Thesis (M.Sc. Engineering Sciences (Electrical and Electronic Engineering))--North-West University, Potchefstroom Campus, 2011. part-of-speech tagging text-to-speech synthesis resource-scarce language naturalness prosody HTS context labels
3	Enkele tegnieke vir die ontwikkeling en benutting van etiketteringhulpbronne vir hulpbronskaars tale / A.C. Griebenow Griebenow, Annick January 2015 (has links) Because the development of resources in any language is an expensive process, many languages, including the indigenous languages of South Africa, can be classified as being resource scarce, or lacking in tagging resources. This study investigates and applies techniques and methodologies for optimising the use of available resources and improving the accuracy of a tagger using Afrikaans as resource-scarce language and aims to i) determine whether combination techniques can be effectively applied to improve the accuracy of a tagger for Afrikaans, and ii) determine whether structural semi-supervised learning can be effectively applied to improve the accuracy of a supervised learning tagger for Afrikaans. In order to realise the first aim, existing methodologies for combining classification algorithms are investigated. Four taggers, trained using MBT, SVMlight, MXPOST and TnT respectively, are then combined into a combination tagger using weighted voting. Weights are calculated by means of total precision, tag precision and a combination of precision and recall. Although the combination of taggers does not consistently lead to an error rate reduction with regard to the baseline, it manages to achieve an error rate reduction of up to 18.48% in some cases. In order to realise the second aim, existing semi-supervised learning algorithms, with specific focus on structural semi-supervised learning, are investigated. Structural semi-supervised learning is implemented by means of the SVD-ASO-algorithm, which attempts to extract the shared structure of untagged data using auxiliary problems before training a tagger. The use of untagged data during the training of a tagger leads to an error rate reduction with regard to the baseline of 1.67%. Even though the error rate reduction does not prove to be statistically significant in all cases, the results show that it is possible to improve the accuracy in some cases. / MSc (Computer Science), North-West University, Potchefstroom Campus, 2015 Hulpbronskaars taal Masjienleer Gedeeltelik-gekontroleerde leer Strukturele leer Mensetaaltegnologie Kombinasie-woordsoortetiketteerder Natuurliketaalverwerking Resource-scarce language Machine learning Semi-supervised learning Structural learning Human language technology Combination tagger Natural language processing
4	Enkele tegnieke vir die ontwikkeling en benutting van etiketteringhulpbronne vir hulpbronskaars tale / A.C. Griebenow Griebenow, Annick January 2015 (has links) Because the development of resources in any language is an expensive process, many languages, including the indigenous languages of South Africa, can be classified as being resource scarce, or lacking in tagging resources. This study investigates and applies techniques and methodologies for optimising the use of available resources and improving the accuracy of a tagger using Afrikaans as resource-scarce language and aims to i) determine whether combination techniques can be effectively applied to improve the accuracy of a tagger for Afrikaans, and ii) determine whether structural semi-supervised learning can be effectively applied to improve the accuracy of a supervised learning tagger for Afrikaans. In order to realise the first aim, existing methodologies for combining classification algorithms are investigated. Four taggers, trained using MBT, SVMlight, MXPOST and TnT respectively, are then combined into a combination tagger using weighted voting. Weights are calculated by means of total precision, tag precision and a combination of precision and recall. Although the combination of taggers does not consistently lead to an error rate reduction with regard to the baseline, it manages to achieve an error rate reduction of up to 18.48% in some cases. In order to realise the second aim, existing semi-supervised learning algorithms, with specific focus on structural semi-supervised learning, are investigated. Structural semi-supervised learning is implemented by means of the SVD-ASO-algorithm, which attempts to extract the shared structure of untagged data using auxiliary problems before training a tagger. The use of untagged data during the training of a tagger leads to an error rate reduction with regard to the baseline of 1.67%. Even though the error rate reduction does not prove to be statistically significant in all cases, the results show that it is possible to improve the accuracy in some cases. / MSc (Computer Science), North-West University, Potchefstroom Campus, 2015 Hulpbronskaars taal Masjienleer Gedeeltelik-gekontroleerde leer Strukturele leer Mensetaaltegnologie Kombinasie-woordsoortetiketteerder Natuurliketaalverwerking Resource-scarce language Machine learning Semi-supervised learning Structural learning Human language technology Combination tagger Natural language processing

1

Page generated in 0.0638 seconds