1 |
The effects of part-of-speech tagging on text-to-speech synthesis for resource-scarce languages / G.I. Schlünz
Schlünz, Georg Isaac, January 2010
In the world of human language technology, resource-scarce languages (RSLs) suffer from a shortage
of available electronic data and linguistic expertise. The Lwazi project in South Africa
is a large-scale endeavour to collect and apply such resources for all eleven of the official South
African languages. One of the deliverables of the project is more natural text-to-speech (TTS)
voices. Naturalness is primarily determined by prosody, and it is shown that many aspects of
prosodic modelling are, in turn, dependent on part-of-speech (POS) information. Solving the POS
problem is, therefore, a prudent first step towards meeting the goal of natural TTS voices.
In a resource-scarce environment, obtaining and applying the POS information is not trivial.
Firstly, an automatic tagger is required to tag the text to be synthesised with POS categories, but
state-of-the-art POS taggers are data-driven and thus require large amounts of labelled training
data. Secondly, the subsequent processes in TTS that are used to apply the POS information
towards prosodic modelling are resource-intensive themselves: some require non-trivial linguistic
knowledge; others require labelled data as well.
The first problem raises the question of which available POS tagging algorithm is the most
accurate on little training data. This research sets out to answer the question by reviewing the
most popular supervised data-driven algorithms. Since the literature to date consists mostly of isolated
papers discussing one algorithm each, the aim of the review is to consolidate the research into a single
point of reference. A subsequent experimental investigation compares the tagging algorithms on
small training data sets of English and Afrikaans, and it is shown that the hidden Markov model
(HMM) tagger outperforms the rest when using both a comprehensive and a reduced POS tagset.
Regarding the second problem, the question arises whether it is possible to circumvent
the traditional approaches to prosodic modelling by learning prosody directly from the speech
data using POS information. In other words, does the addition of POS features to the HTS context
labels improve the naturalness of a TTS voice? Towards answering this question, HTS voices are
trained from prosodically rich English and Afrikaans speech. The voices are compared with and
without POS features incorporated into the HTS context labels, analytically and perceptually. For
the analytical experiments, measures of prosody to quantify the comparisons are explored. It is
then also noted whether the results of the perceptual experiments correlate with their analytical
counterparts. It is found that, when a minimal feature set is used for the HTS context labels, the
addition of POS tags does improve the naturalness of the voice. However, the same effect can be
accomplished by including segmental counting and positional information instead of the POS tags.
/ Thesis (M.Sc. Engineering Sciences (Electrical and Electronic Engineering))--North-West University, Potchefstroom Campus, 2011.
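As an illustration of the kind of tagger the abstract reports as most accurate on small training sets, the following is a minimal sketch of a bigram HMM tagger with Viterbi decoding. It is not the implementation used in the thesis: the function names, the add-one smoothing, and the toy data in the usage note are all assumptions made for the example.

```python
import math
from collections import defaultdict

START = "<s>"  # hypothetical start-of-sentence marker


def train_hmm(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    transitions = defaultdict(lambda: defaultdict(int))
    emissions = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            transitions[prev][tag] += 1
            emissions[tag][word] += 1
            vocab.add(word)
            prev = tag
    tags = sorted(emissions)
    return transitions, emissions, tags, vocab


def log_prob(counts, key, n_outcomes):
    """Add-one smoothed log probability (smoothing choice is an assumption)."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + 1) / (total + n_outcomes))


def viterbi(words, model):
    """Return the most probable tag sequence for `words` under the model."""
    if not words:
        return []
    transitions, emissions, tags, vocab = model
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra outcome reserved for unseen words

    # best[i][tag] = (log score of best path ending in tag, previous tag)
    best = [{}]
    for tag in tags:
        score = (log_prob(transitions[START], tag, n_tags)
                 + log_prob(emissions[tag], words[0], n_words))
        best[0][tag] = (score, None)

    for i in range(1, len(words)):
        best.append({})
        for tag in tags:
            emit = log_prob(emissions[tag], words[i], n_words)
            best[i][tag] = max(
                (best[i - 1][prev][0]
                 + log_prob(transitions[prev], tag, n_tags) + emit, prev)
                for prev in tags
            )

    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]
```

A possible usage, on invented toy data: train with `train_hmm([[("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")], ...])` and decode with `viterbi(["the", "cat", "barks"], model)`. With only a handful of sentences the smoothing dominates, which is one reason tagger comparisons on small data sets need to be done empirically, as the abstract describes.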
|
3 |
Enkele tegnieke vir die ontwikkeling en benutting van etiketteringhulpbronne vir hulpbronskaars tale [Some techniques for the development and utilisation of tagging resources for resource-scarce languages] / A.C. Griebenow
Griebenow, Annick, January 2015
Because the development of resources in any language is an expensive process, many languages, including the indigenous languages of South Africa, can be classified as resource-scarce, or lacking in tagging resources. This study investigates and applies techniques and methodologies for optimising the use of available resources and improving the accuracy of a tagger, using Afrikaans as a resource-scarce language. It aims to i) determine whether combination techniques can be effectively applied to improve the accuracy of a tagger for Afrikaans, and ii) determine whether structural semi-supervised learning can be effectively applied to improve the accuracy of a supervised learning tagger for Afrikaans. In order to realise the first aim, existing methodologies for combining classification algorithms are investigated. Four taggers, trained using MBT, SVMlight, MXPOST and TnT respectively, are then combined into a combination tagger using weighted voting. Weights are calculated by means of total precision, tag precision and a combination of precision and recall. Although the combination of taggers does not consistently lead to an error rate reduction with regard to the baseline, it achieves an error rate reduction of up to 18.48% in some cases. In order to realise the second aim, existing semi-supervised learning algorithms, with specific focus on structural semi-supervised learning, are investigated. Structural semi-supervised learning is implemented by means of the SVD-ASO algorithm, which attempts to extract the shared structure of untagged data using auxiliary problems before training a tagger. The use of untagged data during the training of a tagger leads to an error rate reduction of 1.67% with regard to the baseline. Even though the error rate reduction does not prove to be statistically significant in all cases, the results show that it is possible to improve the accuracy in some cases.
/ MSc (Computer Science), North-West University, Potchefstroom Campus, 2015
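The weighted-voting combination and the error rate reduction mentioned in the abstract can be sketched as follows. This is an illustrative sketch only, not the study's code: the function names are hypothetical, the weights stand in for a per-tagger score such as total precision, and the error rate reduction is assumed to be the relative reduction in error with respect to the baseline.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine per-token tags from several taggers by summed tagger weight.

    predictions: dict mapping tagger name -> list of tags (one per token)
    weights:     dict mapping tagger name -> weight (e.g. total precision)
    """
    n_tokens = len(next(iter(predictions.values())))
    combined = []
    for i in range(n_tokens):
        scores = defaultdict(float)
        for tagger, tags in predictions.items():
            scores[tags[i]] += weights[tagger]
        # sorted() makes tie-breaking deterministic (alphabetical by tag)
        combined.append(max(sorted(scores), key=scores.get))
    return combined


def error_rate_reduction(baseline_accuracy, new_accuracy):
    """Relative reduction in error rate with respect to the baseline."""
    baseline_error = 1.0 - baseline_accuracy
    new_error = 1.0 - new_accuracy
    return (baseline_error - new_error) / baseline_error
```

For example, with three hypothetical taggers weighted 0.9, 0.5 and 0.45, two lower-weighted taggers that agree on a tag (0.5 + 0.45) outvote the single strongest tagger (0.9). Tag precision weighting would replace the single weight per tagger with one weight per tagger and tag.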
|