101 |
Suprasegmental representations for the modeling of fundamental frequency in statistical parametric speech synthesis
Fonseca De Sam Bento Ribeiro, Manuel January 2018
Statistical parametric speech synthesis (SPSS) has seen improvements over recent years, especially in terms of intelligibility. Synthetic speech is often clear and understandable, but it can also be bland and monotonous. Proper generation of natural speech prosody is still a largely unsolved problem. This is relevant especially in the context of expressive audiobook speech synthesis, where speech is expected to be fluid and captivating. In general, prosody can be seen as a layer that is superimposed on the segmental (phone) sequence. Listeners can perceive the same melody or rhythm in different utterances, and the same segmental sequence can be uttered with a different prosodic layer to convey a different message. For this reason, prosody is commonly accepted to be inherently suprasegmental. It is governed by longer units within the utterance (e.g. syllables, words, phrases) and beyond the utterance (e.g. discourse). However, common techniques for the modeling of speech prosody - and speech in general - operate mainly on very short intervals, either at the state or frame level, in both hidden Markov model (HMM) and deep neural network (DNN) based speech synthesis. This thesis presents contributions supporting the claim that stronger representations of suprasegmental variation are essential for the natural generation of fundamental frequency for statistical parametric speech synthesis. We conceptualize the problem by dividing it into three sub-problems: (1) representations of acoustic signals, (2) representations of linguistic contexts, and (3) the mapping of one representation to another. The contributions of this thesis provide novel methods and insights relating to these three sub-problems. In terms of sub-problem 1, we propose a multi-level representation of f0 using the continuous wavelet transform and the discrete cosine transform, as well as a wavelet-based decomposition strategy that is linguistically and perceptually motivated. 
In terms of sub-problem 2, we investigate additional linguistic features such as text-derived word embeddings and syllable bag-of-phones and we propose a novel method for learning word vector representations based on acoustic counts. Finally, considering sub-problem 3, insights are given regarding hierarchical models such as parallel and cascaded deep neural networks.
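The discrete cosine transform mentioned under sub-problem 1 can be illustrated with a small sketch. This is not the thesis's implementation: the function names, the eight log-f0 values and the choice of three coefficients are assumptions made only for illustration. It shows how a few low-order DCT coefficients summarize the coarse shape of an f0 contour over one unit such as a syllable.

```python
import math


def dct2(contour):
    """Orthonormal DCT-II of a list of f0 samples."""
    n = len(contour)
    coeffs = []
    for k in range(n):
        total = sum(
            value * math.cos(math.pi * (i + 0.5) * k / n)
            for i, value in enumerate(contour)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


def idct2(coeffs, n):
    """Rebuild n samples from the first len(coeffs) DCT-II coefficients."""
    samples = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        samples.append(total)
    return samples


# Hypothetical log-f0 values sampled over one syllable (illustrative only).
syllable_lf0 = [4.80, 4.86, 4.93, 4.97, 4.95, 4.88, 4.79, 4.70]

coeffs = dct2(syllable_lf0)
# Coarse contour shape from the first three coefficients (assumed cutoff).
smooth = idct2(coeffs[:3], len(syllable_lf0))
# Using every coefficient reproduces the input up to rounding error.
exact = idct2(coeffs, len(syllable_lf0))
```

The first coefficient tracks the mean level of the contour and the next ones its slope and curvature, which is why a short fixed-length vector can stand in for a variable-length stretch of f0.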
|
102 |
The Effect of Age of Acquisition and Second-Language Experience on Segments and Prosody: A Cross-Sectional Study of Korean Bilinguals' English and Korean Production
Oh, Grace Eunhae, 1980- 09 1900
xviii, 210 p. : ill. (some col.) / The current dissertation investigated segmental and prosodic aspects of first- (L1) and second-language (L2) speech production. Forty Korean-speaking adults and children varying in L2 experience (6 months-inexperienced vs. 6 years-experienced) as well as twenty age-matched native English speaking adults and children participated. Experienced children born in the U.S. were first exposed to English much earlier than inexperienced children. Group differences were investigated for insight into the effect of differing language experience on speech production.
For segmental aspects, spectral quality and duration of English and Korean vowels (Chapter II), the effect of English coda consonant voicing on vowel and consonant closure duration (Chapter III), and language-specific voice onset time (VOT) in English and Korean stops (Chapter IV) were examined. All Korean groups except the experienced children differed from the native English speakers in vowel spectral quality and coda voicing production. The experienced children showed native-like production of both English and Korean vowels and also used VOT to distinguish Korean aspirated and English voiceless stops. These results suggest that the experienced children have separate phonological representations for their two languages.
For prosodic aspects, stressed and unstressed vowels in English multisyllabic words (Chapter V) and Korean four-syllable phrases (Chapter VI) were elicited. The results of stressed and unstressed vowel production revealed that the Korean adults were able to acquire English prosody in a native-like manner, except for reduced vowel quality. Contrary to the little L1-L2 interaction in prosody for adults, Korean experienced children's production suggested a strong influence of English acquisition on the development of Korean prosody in terms of fundamental frequency, intensity, and duration patterns.
Different degrees of L1-L2 interaction between Korean experienced children's production of segments and prosody are discussed from the developmental standpoint of simultaneous bilingual children's language shift from the mother tongue to English. In addition to children's greater plasticity of language acquisition, external (e.g., peer pressure, language input) and internal (e.g., ethnic self-identity) factors are likely to have created a language learning environment different from that of the Korean adults. As a result, the degree and direction of L1-L2 interaction varied by linguistic domains, depending on the age of the learner and the language experience. / Committee in charge: Susan Guion-Anderson, Chairperson;
Melissa Redford, Member;
Vsevolod Kapatsinski, Member;
Kaori Idemaru, Outside Member
|
103 |
A interface prosódia-sintaxe na produção e no processamento de estruturas de Tópico e de SVO
Silva, Carolina Garcia de Carvalho 15 October 2015
Submitted by Renata Lopes (renatasil82@gmail.com) on 2016-04-27T15:10:49Z / Approved for entry into archive by Adriana Oliveira (adriana.oliveira@ufjf.edu.br) on 2016-05-02T01:02:15Z (GMT) / Made available in DSpace on 2016-05-02T01:02:15Z (GMT)
No. of bitstreams: 1
carolinagarciadecarvalhosilva.pdf: 4355278 bytes, checksum: 49744d937d77f51f06fa3dada064b83b (MD5)
Previous issue date: 2015-10-15 / CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / This study investigates the role of prosody in syntactic processing. Our hypothesis is that prosody not only prevents ambiguity but also guides the parser, providing cues for building the syntactic structure during processing by adults. We chose Topic (of an internal argument) and SVO (non-topicalized) structures because, in the contrast between the two, prosodic cues are accessible from the beginning of the sentence. A set of experiments was conducted with the following objectives. In production: (a) to verify the prosodic properties of SVO and Topic sentences in reading tasks; (b) to investigate whether there is a preference for a default prosody. In comprehension: (c) to verify whether there is a preference for one structure or the other by comparing the processing time of auditory stimuli with and without prosodic cues; (d) to verify whether modifying these cues could "trick" the parser's choices. To build the experimental stimuli, we selected words that can belong to either the Verb or the Adjective category. We built sentences with both types of syntactic structure: Topic – [A criança SUJA]IP a madrinha mandou ela para o banho; and SVO – [A criança]ϕ [SUJA a madrinha] com a comida do almoço. (Topic – The dirty child was sent to have a shower by their godmother; SVO – The child soils their godmother with the lunch food.) The results of the production experiments reveal acoustic cues that differentiate the two types of structure. They also suggest a preference, in reading aloud, for SVO prosody when participants do not know the full meaning of the sentences. The comprehension experiments likewise seem to show a preference for the SVO structure when prosodic cues are not accessible to the listener. When prosody is informative, on the other hand, the results suggest that it not only guides the parser in choosing the syntactic structure but can also "trick" it, leading it to process a different structure in cases of incongruence between syntax and prosody. Taken together, the results indicate that, on the one hand, there seems to be a default SVO structure but, on the other, Topic prosody seems to prevent, to some extent, the activation of this default. Given the structures analyzed and the experimental results, we argue that prosody guides these projections, since the acoustic cues are accessible from the beginning of the sentences. According to Bocci's (2008) proposal, Topic properties would be encoded in the syntax, guiding the syntactic derivation; these properties, in turn, would be made available beforehand via prosody. The results are discussed in the light of three theoretical approaches: the Prosodic/Phonological Bootstrapping Hypothesis (MORGAN & DEMUTH, 1996; CHRISTOPHE et al., 1997, 2008), which holds that prosodic cues promote segmentation of the speech stream, facilitating processing (in the case of adults); the Integrated Model of Online Computation (CORRÊA & AUGUSTO, 2006; 2007), according to which the syntactic tree is built in the course of processing; and the Auditory Language Comprehension Model (FRIEDERICI, 2011), which predicts the interaction of prosody with syntax at the neurophysiological level.
|
104 |
O papel das fronteiras de sintagma fonológico na restrição do processamento sintático e na delimitação das categorias lexicais
Silva, Carolina Garcia de Carvalho 03 July 2009
Submitted by Renata Lopes (renatasil82@gmail.com) on 2016-10-11T12:07:56Z / Approved for entry into archive by Adriana Oliveira (adriana.oliveira@ufjf.edu.br) on 2016-10-11T15:58:23Z (GMT) / Made available in DSpace on 2016-10-11T15:58:23Z (GMT)
No. of bitstreams: 1
carolinagarciadecarvalhosilva.pdf: 963240 bytes, checksum: 73b376e7be1b22197666cbad3e5ccb77 (MD5)
Previous issue date: 2009-07-03 / CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / This study investigates the influence of phonological phrase boundaries () on the identification of lexical categories in Brazilian Portuguese (BP). The starting assumption is that lexical categories are identified from the syntactic structure (BAKER, 2003), which is in turn mapped by the prosodic structure. The working hypothesis is therefore that the prosodic structure, by constraining syntactic processing, allows the lexical categories of ambiguous terms to be identified. The theoretical perspective integrates the Minimalist Program (CHOMSKY, 1995; 1999) and the Phonological Bootstrapping model (MORGAN and DEMUTH, 1996; CHRISTOPHE et al., 1997), in the terms of Corrêa (2006), together with a processing model (Integrated Model of Linguistic Competence, MICL: CORRÊA and AUGUSTO, 2006). We also adopt Prosodic Phonology (NESPOR and VOGEL, 1986), which holds that phonological units are hierarchically organized and that there is a relation, though not an obligatory one, between prosodic and syntactic constituents. Based on the studies of Millotte et al. (2007) on French, two experimental activities were developed to verify how sensitivity to prosodic cues may constrain the syntactic processing of sentences and consequently allow the identification of the lexical categories Adjective and Verb. Both experiments used sentences containing ambiguous words in the Verb condition – [a menina] [LIMPA...] (the girl CLEANS) – and in the Adjective condition – [a menina LIMPA] (the CLEAN girl). Experiment 1 sought to verify acoustic differences between the two conditions at phonological phrase boundaries. We measured duration, fundamental frequency and intensity at the ends of the prosodic boundaries. The analysis of these values revealed that (i) there are prosodic differences that signal the existence of a phonological phrase boundary; (ii) the lexical categories N, V and Adj behave differently in the prosodic structure. Experiment 2 tested whether, relying on the prosodic context alone, participants were able to identify the syntactic categories of the ambiguous elements. The results support the hypothesis that the prosodic cues present at phonological phrase boundaries help constrain syntactic processing and identify lexical categories.
|
105 |
Synchronic and diachronic morphoprosody : evidence from Mapudungun and Early English
Molineaux Ress, Benjamin Joseph January 2014
In the individual grammars of time-bound speakers, as well as in the historical transmission of a language, prosodic and morphological domains are forced to interact. This research focuses, in particular, on stress, and its instantiation in different domains of the morphological structure. It asks what factors are involved in prioritising one system – morphology or stress assignment – over the other and how radical the consequences of this may be on the overall structure of the language. The data comes from two typologically distinct languages: Mapudungun (previously 'Araucanian'), a polysynthetic and agglutinating language isolate from Chile and Argentina documented for over 400 years; and English, far further into the isolating and fusional spectra, and documented from the 7th century onwards. In both languages, we focus on morphologically complex words and how they evolve in relation to stress. In Mapudungun we examine the entire historical period, while in English we focus on the changes from Old to Middle English (8th -14th centuries). The analyses show how different types of data (from acoustics, to native and non-native intuitions; from historical corpora, to present-day experimentation techniques), can be used in order to assess whether the prosodic system will accommodate to the demarcation of morphological domains or whether morphological structure is to be shoehorned into the prosodic system's rhythmic pattern. Original contemporary field and experimental work on Mapudungun shows stress to fall on right-aligned moraic trochees in the stem and word domains. This contradicts claims in the foot-typology literature, where Araucanian stress goes from left to right, building quantity-insensitive iambs. 
A reconstruction of the history of the stress system suggests a transition from quantity insensitivity to sensitivity and the establishment of two domains of stress, which ultimately facilitates the parsing of word-internal structure, emphasising the demarcative function of stress. In the case of Early English, the focus is on the prefixal domain. Here the optimisation of the stress system – also trochaic – is shown to reduce the instances of clash in the language at large. As a result, a split in the prefixal system is identified, where prefixes constituting heavy, non-branching feet are avoided – and are ultimately lost – due to clash with root-initial stress, while light and branching feet remain in the language. In this case, it is the rhythmic or structural role of stress that is emphasised. Language internal factors are evaluated – in particular morphological type and stress properties – alongside external factors such as contact (with Chilean Spanish and Norman French), in order to provide a more general context for the observed changes and synchronic structure of the languages. A key concept in the analysis is that of 'pertinacity', the conservative nature of transmission in grammars, which leads learners to perpetuate perceived core elements of the system.
|
106 |
Emotionell prosodi efter högersidig cerebral stroke : Akustisk analys samt skattning av röstens uttrycksfullhet / Emotional prosody after right-hemisphere stroke : Acoustic analysis and rating of voice expression
Johansson, Inga-Lena January 2014
Important aspects of communication, including emotional prosody, are regulated from the right hemisphere. However, research on emotional prosody has so far been rather limited. One aspect that has not yet been examined is the comparison of participants' own ratings of voice expression with ratings by listeners. The aim of the study was to investigate the ability to express emotional prosody after right-hemisphere stroke. Participants were three patients with right-hemisphere stroke and three controls without neurological disease or injury or problems with speech or voice. The stroke and control groups were matched for sex, age, dialect area and level of education. Emotional prosody was examined using several methods: acoustic analysis of variation in fundamental frequency, and the participants' own as well as listeners' ratings of voice expression. The results show tendencies that indicate a difference between the participants with right-hemisphere stroke and the controls. The participants with stroke showed less variation in fundamental frequency and lower ratings of voice expression. Because the number of participants was small, the results should be interpreted with caution. However, the tendencies towards differences between the stroke and control participants justify further studies.
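The abstract does not say how variation in fundamental frequency was quantified. As a hedged sketch of one common option, the snippet below takes the standard deviation of voiced f0 values in semitones relative to the speaker's median. The function names, the convention that unvoiced frames are stored as 0, and the sample values are all assumptions for illustration, not the study's procedure.

```python
import math
import statistics


def hz_to_semitones(f0_hz, ref_hz):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def f0_variation_semitones(f0_track):
    """Sample standard deviation of voiced f0, in semitones re the median.

    Assumes unvoiced frames are marked with 0 (a convention chosen here).
    """
    voiced = [f for f in f0_track if f > 0]
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    ref = statistics.median(voiced)
    return statistics.stdev(hz_to_semitones(f, ref) for f in voiced)


# Hypothetical f0 tracks in Hz (0 = unvoiced frame); values are made up.
flat_voice = [118.0, 120.0, 0.0, 121.0, 119.0, 120.0]
lively_voice = [110.0, 150.0, 0.0, 190.0, 130.0, 95.0]

flat_sd = f0_variation_semitones(flat_voice)
lively_sd = f0_variation_semitones(lively_voice)
```

Working in semitones rather than Hz makes the measure comparable across speakers with different average pitch, which matters when groups are matched for sex and age but individual voices still differ.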
|
107 |
EXPLORING THE ROLE OF PROSODIC AWARENESS AND EXECUTIVE FUNCTIONS IN WORD READING AND READING COMPREHENSION: A STUDY OF COGNITIVE FLEXIBILITY IN ADULT READERS
Chan, JESSICA S. 20 December 2013
The current study examined the phonological process of prosodic ability in a model of adult word reading and reading comprehension ability. All phonological tasks involve executive functions (EF) reflected in an individual’s flexibility for manipulating different components of language. To account for the EF demands involved in phonological tasks of reading, EF was assessed using measures of inhibitory control and switching attention as both a control variable and predictor in each model of reading. Two research questions guided the study: 1) Do prosodic ability and EF make independent contributions to word reading, and reading comprehension ability when controlling for the other? 2) Do prosodic ability and EF make unique contributions to word reading, and reading comprehension ability when controlling for the other, in addition to controlling for vocabulary, fluid (nonverbal) intelligence, rapid automatized naming (RAN - Digits), and phonological short-term memory (PSTM)?
Participants were one hundred and three native-English speaking adults (18 to 55 years of age) recruited from Eastern Ontario. A total of 8 regression models were tested. The analyses revealed unique contributions of prosodic ability to adult word reading achievement, and of EF to silent reading comprehension. Prosody’s contribution to word reading above EF supports prosodic awareness as a phonological skill that can be used to explain individual differences in word reading, whereas EF’s contribution to reading comprehension supports its role in more complex reading tasks. Prosody and EF represent constructs that warrant future consideration in models of reading. / Thesis (Master, Education) -- Queen's University, 2013-12-19 16:15:50.64
|
108 |
Rytinių kauniškių Prienų šnektos fonologija / The phonological system of the eastern Kaunas Prienai subdialect
Jaroslavienė, Jurgita 01 December 2010
The dissertation is the first consistent synchronic description of the phonological system of the Prienai subdialect, part of the Eastern Kaunas subdialect of Western Aukštaitian. It contains: 1) an analysis of the stress system, the influence of stress on the prosodic structure of words, a review of the use of secondary stresses and a description of the acoustic features of primary stress; 2) an analysis of the syllable accent system, its phonological interpretation and acoustic features; 3) a description of the vowel system, a review of quantitative and qualitative peculiarities in the use of vowel allophones and a presentation of the typical acoustic and articulatory characteristics of vowels; 4) a description of the consonant system and a review of the structure of consonant clusters, together with a description of acoustic, articulatory and some other distinctive features of consonant phonemes.
The analysis of the present situation draws mainly on data from the subdialect collected individually since 1996. In addition to subjective methods (empirical observation and listening to subdialect texts and recordings of the research material), objective (instrumental and statistical) methods are also applied.
The sound analysis software PRAAT was used to measure and analyse the spectral properties of vowels and consonants, and numerous instrumental studies of prosody were carried out.
Some phenomena of the subdialect are compared with data from other, mostly neighbouring, dialects and subdialects and from standard Lithuanian.
|
109 |
The phonological system of the eastern Kaunas Prienai subdialect / Rytinių kauniškių Prienų šnektos fonologija
Jaroslavienė, Jurgita 01 December 2010
The dissertation is the first consistent synchronic description of the phonological system of the Prienai subdialect, part of the Eastern Kaunas subdialect of Western Aukštaitian. It contains: 1) an analysis of the stress system, the influence of stress on the prosodic structure of words, a review of the use of secondary stresses and a description of the acoustic features of primary stress; 2) an analysis of the syllable accent system, its phonological interpretation and acoustic features; 3) a description of the vowel system, a review of quantitative and qualitative peculiarities in the use of vowel allophones and a presentation of the typical acoustic and articulatory characteristics of vowels; 4) a description of the consonant system and a review of the structure of consonant clusters, together with a description of acoustic, articulatory and some other distinctive features of consonant phonemes.
The analysis of the present situation draws mainly on data from the subdialect collected individually since 1996. In addition to subjective methods (empirical observation and listening to subdialect texts and recordings of the research material), objective (instrumental and statistical) methods are also applied.
The sound analysis software PRAAT was used to measure and analyse the spectral properties of vowels and consonants, and numerous instrumental studies of prosody were carried out.
Some phenomena of the subdialect are compared with data from other, mostly neighbouring, dialects and subdialects and from standard Lithuanian.
|
110 |
Advanced natural language processing for improved prosody in text-to-speech synthesis / G. I. Schlünz
Schlünz, Georg Isaac January 2014
Text-to-speech synthesis enables the speech-impeded user of an augmentative and alternative communication system to partake in any conversation on any topic, because it can produce dynamic content. Current synthetic voices do not sound very natural, however, lacking in the areas of emphasis and emotion. These qualities are furthermore important to convey meaning and intent beyond that which can be achieved by the vocabulary of words only. Put differently, speech synthesis requires a more comprehensive analysis of its text input beyond the word level to infer the meaning and intent that elicit emphasis and emotion. The synthesised speech then needs to imitate the effects that these textual factors have on the acoustics of human speech. This research addresses these challenges by commencing with a literature study on the state of the art in the fields of natural language processing, text-to-speech synthesis and speech prosody. It is noted that the higher linguistic levels of discourse, information structure and affect are necessary for the text analysis to shape the prosody appropriately for more natural synthesised speech. Discourse and information structure account for meaning, intent and emphasis, and affect formalises the modelling of emotion. The OCC model is shown to be a suitable point of departure for a new model of affect that can leverage the higher linguistic levels. The audiobook is presented as a text and speech resource for the modelling of discourse, information structure and affect because its narrative structure is prosodically richer than the random constitution of a traditional text-to-speech corpus. A set of audiobooks are selected and phonetically aligned for subsequent investigation. The new model of discourse, information structure and affect, called e-motif, is developed to take advantage of the audiobook text. 
It is a subjective model that does not specify any particular belief system in order to appraise its emotions, but defines only anonymous affect states. Its cognitive and social features rely heavily on the coreference resolution of the text, but this process is found not to be accurate enough to produce usable feature values. The research concludes with an experimental investigation of the influence of the e-motif features on human speech and synthesised speech. The aligned audiobook speech is inspected for prosodic correlates of the cognitive and social features, revealing that some activity occurs in the intonational domain. However, when the aligned audiobook speech is used in the training of a synthetic voice, the e-motif effects are overshadowed by those of structural features that come standard in the voice building framework. / PhD (Information Technology), North-West University, Vaal Triangle Campus, 2014
|