Global ETD Search

11	Prosodia in L2: giudizi percettivi di italofoni sulla produzione di apprendenti svedesi : Fenomeni diatopici nella percezione degli italofoni / Prosody in L2: the perceptive judgments of native Italians on the productions by Swedish learners : Diatopic phenomena in the natives’ perception Greco, Alberto January 2017 In the L2 learning process, prosody is among the linguistic features that most strongly determine full pragmatic competence. It is also often crucial for disambiguating certain structures, and thus for perceptual acceptability. Italian prosodic configurations vary dramatically along the diatopic dimension. Nevertheless, certain prosodic structures and patterns can still be perceived as effective and acceptable. From this perspective, we explored the oral acceptability of the L2 Italian of advanced Swedish learners through perceptual judgments by native speakers. Despite the lack of acoustic analysis and the small sample size, the results strongly indicate regional variation in the perception of native Italian speakers, not only with respect to the learners' production but also with respect to the native speakers themselves. Finally, we discuss some didactic implications that may be particularly useful for native-speaker teachers in L2 contexts. L2 acquisition perception prosody linguistic variation Italian Swedish L2 acquisizione percezione prosodia variazione linguistica italiano svedese Languages and Literature Språk och litteratur
12	[en] EVALUATION OF POETRY TRANSLATION: ANNOTATION IN THE SEARCH FOR CONSENSUS / [pt] AVALIAÇÃO DE TRADUÇÃO DE POESIA: A ANOTAÇÃO NA BUSCA PELO CONSENSO JULIANA CUNHA MENEZES 01 June 2017 (has links) [pt] Este estudo, que se insere no viés pós-estruturalista, tem como hipótese a possibilidade de se estabelecerem categorias capazes de instrumentalizar avaliações minimamente consensuais de traduções de poesia. Assim, dadas duas ou mais traduções de um poema, submetidas a dois ou mais avaliadores que adotem categorias uniformes de análise, suas avaliações, ainda que não idênticas, terão em comum alguns pontos relevantes. A busca pelo consenso é feita através da anotação, uma das atividades da Linguística Computacional, que consiste em identificar e classificar um certo fenômeno linguístico, utilizando rótulos, etiquetas, categorias, em um determinado corpus para, assim, atingirmos um determinado objetivo. Os objetivos da tese são (a) fornecer, aos interessados em tradução de poesia, insumos para se poder avaliar, de forma minimamente consensual, traduções de poemas; e (b) explicitar, sistematizar e validar categorias do nível semântico-lexical, e descrever e confirmar categorias do nível formal (níveis métrico e rimático) e do plano de recursos sonoros, a fim de que possam ser usadas para embasar avaliações minimamente consensuais de traduções de poesia. A pesquisa apresenta três etapas. Na primeira, a anotação é utilizada como metodologia na busca pelo consenso. Nessa etapa, anotações de poemas originais e de traduções foram feitas por diferentes estudiosos. Ao comparar essas anotações em busca de consenso, confirmei/validei ou reformulei as categorias. O consenso permite confirmação e validação, já a falta dele abre espaço para reformulações e refinamentos. Na segunda etapa, a metodologia de Britto, com algumas observações adicionais, foi utilizada para analisar os resultados oriundos da primeira etapa. 
Objetiva-se, por meio de tal metodologia, verificar se os aspectos mais relevantes dos níveis métrico, rimático, semântico-lexical e do plano de recursos sonoros (aliterações, assonâncias e recursos afins) dos poemas originais foram recriados nas traduções. E na terceira, são utilizados os resultados da segunda etapa a fim de se produzir elementos para uma possível avaliação de traduções de poesia: entre duas traduções do soneto 130 de Shakespeare, verificar qual seria a mais fiel ao original. A hipótese foi comprovada quanto ao nível formal e ao plano de recursos sonoros, mas não quanto ao nível semântico-lexical. A validação das categorias do nível semântico-lexical pode prosseguir em pesquisas futuras, o que pode ou não resultar na possibilidade de concordância entre avaliações desse nível. O objetivo (a) foi atingido. Já o objetivo (b) foi atingido em parte: a explicitação, sistematização e validação das categorias do nível semântico-lexical iniciaram-se nesta pesquisa, e podem continuar em estudos futuros. Esta pesquisa pode ser vista como contribuição tanto para área de tradução de poesia, quanto para a Linguística Computacional. Quanto à primeira, a anotação prevê interpretações e tomadas de decisão, evidenciando, assim, as possíveis interpretações e decisões tomadas durante o processo de tradução. E em relação à segunda, o desenvolvimento de uma ferramenta para anotação de poemas, e de uma métrica para avaliação de traduções de poesia, utilizando as categorias presentes nesta tese, poderia ocorrer através de uma parceria com a Engenharia Computacional. / [en] This study, which can be included in the post-structuralist field, has as its hypothesis the possibility of establishing categories capable of making the following kind of evaluation possible: minimally consensual evaluations of poetry translations. 
Therefore, given two or more translations of a poem, submitted to two or more evaluators who adopt uniform categories of analysis, their evaluations, though not identical, will have some relevant aspects in common. The search for consensus is carried out through annotation, one of the activities of Computational Linguistics, which consists of identifying and classifying a certain linguistic phenomenon, using labels, tags or categories, in a given corpus, so as to achieve a certain goal. The aims of this dissertation are (a) to provide those interested in poetry translation with tools for evaluating translations of poems in a minimally consensual way; and (b) to define, systematize and validate the categories of the semantic-lexical level, and to describe and confirm the categories of the formal level (metrical and rhyme levels) and of the poetic field of sound resources, so that they can all be used to support minimally consensual evaluations of poetry translation. The research consists of three steps. In the first, annotation is used as a methodology in the search for consensus. In this step, annotations of original poems and of their translations were made by different annotators. By comparing these annotations in search of consensus, categories were confirmed/validated or reformulated. Consensus allows confirmation and validation, whereas its absence opens the way for reformulations and refinements. In the second step, Britto's methodology, with some additional observations, was used to analyse the results from the first step. This methodology aims at verifying whether the most relevant aspects of the metrical, rhyme and semantic-lexical levels and of the poetic field of sound resources (alliterations, assonances, and the like) of the original poems have been re-created in the translations. In the third step, the results of the second are used to produce resources for a possible evaluation of poetry translations: between two translations of Shakespeare's Sonnet 130, determining which one is more faithful to the original.
The hypothesis was confirmed for the formal level and the poetic field of sound resources, but not for the semantic-lexical level. The validation of the categories of the semantic-lexical level can continue in future research, which may or may not make agreement among evaluations at this level possible. Aim (a) was achieved. Aim (b) was partially achieved: the definition, systematization and validation of the categories of the semantic-lexical level began in this research and can continue in future studies. This research can be seen as a contribution not only to the field of poetry translation, but also to Computational Linguistics. Regarding the former, annotation requires interpretation and decision-making, thus highlighting the possible interpretations and decisions made during the translation process. Concerning the latter, a tool for poem annotation and a metric for evaluating poetry translations, both using the categories presented in this dissertation, could be developed in partnership with Computer Engineering. [pt] AVALIACAO [en] EVALUATION [pt] LINGUISTICA COMPUTACIONAL [en] COMPUTATIONAL LINGUISTICS [pt] TRADUCAO DE POESIA [pt] PROSODIA POETICA COMPARADA [pt] ANOTACAO [en] ANNOTATION
13	ANALISI PROSODICA DELLE DOMANDE RETORICHE NEL BUNDESTAG DAMIAZZI, VINCENZO 04 June 2021 (has links) Le domande retoriche (DR) sono parte integrante delle sedute plenarie nel Bundestag e vengono qui analizzate dal punto di vista prosodico. Il corpus comprende 40 DR in senso stretto e 60 domande topic-setting (DTS). L’analisi prosodica è stata svolta con il software PRAAT e si concentra principalmente sugli accenti nucleari, sui toni di confine e sui contorni nucleari. A seguito di un’analisi percettiva preliminare, la successiva analisi acustica si è concentrata sulla misurazione e l’osservazione delle variazioni di F0 e intensità in relazione ai contorni nucleari e prenucleari. I risultati mostrano che le DR sono realizzate in modo quasi esclusivo con contorni discendenti e si è osservata anche la presenza di numerosi toni nucleari con downstep. Un altro elemento caratteristico è la presenza di due o più accenti enfatici in posizione prenucleare che rompono la regolarità dello schema intonativo del tedesco. Sono stati inoltre osservati accenti sul pronome interrogativo, picchi di F0 sui deittici e realizzazioni contrastive (o verum focus) e tutti sono stati associati a una funzione di persuasione e mantenimento dell’attenzione. Nelle DTS sono stati osservati in gran parte gli stessi schemi. A differenza delle DR, tuttavia, circa un quarto delle DTS presenta un contorno ascendente o ascendente-progrediente. Tale contorno contribuisce a marcare le domande come espediente narrativo e come parte di un più ampio contesto prosodico. / Rhetorical questions (RQs) are an integral part of the plenary sessions of the Bundestag and are here analysed in their prosodic realisations. The corpus comprises 40 RQs and 60 topic-setting questions (TSQs). The prosodic analysis has been carried out using the software PRAAT and focuses on nuclear accents, boundary tones and nuclear contours. 
After a preliminary perceptual analysis, the subsequent acoustic analysis measured and observed the variations of F0 and intensity in relation to nuclear and prenuclear contours. The results show that RQs are almost exclusively produced with a falling contour; a widespread use of downstepped nuclear tones was also observed. Another key feature was the occurrence of two or more emphatic accents in prenuclear position, which break the regular pattern of German intonation. Other prosodic patterns, such as accents on the wh-word, F0 peaks on deictics and contrastive realisations (or verum focus), were also observed, and all were associated with persuasion and with maintaining attention. TSQs largely showed the same patterns but, unlike RQs, about one quarter of TSQs had a rising or rising-progredient contour. This contour contributes to marking the question as a narrative device and as part of a broader prosodic context.
14	Prosodia y fonación no modal de vocales en shiwilu (jebero) [Prosody and non-modal phonation of vowels in Shiwilu (Jebero)] Madalengoitia Barúa, María Gracia 02 August 2018 This study offers an acoustic description of how the glottal stop of Shiwilu is realized according to its position in the metrical structure of the word. The phonological system of Shiwilu includes a glottal stop /ʔ/. This stop, which can occur as a syllable coda in (C)VC syllables, is not always realized as a full glottal closure, although the preceding vowel consistently shows non-modal phonation over part or all of its duration. This non-modal phonation is always laryngealization, which in some cases has the features of creaky voice. The occurrence of the various realizations of the sequence /Vʔ/, which underlyingly consists of a modal vowel followed by a glottal stop, is related to the position of the stop in the metrical structure of the word. In the prominent position of the metrical structure, that is, in the stressed syllable, the realization of this sequence tends to preserve the underlying features: a glottal closure is produced and, although the vowel is laryngealized, the laryngealization is restricted to its final part, preceded by a modal vocalic portion. By contrast, in the non-prominent positions of the metrical structure, that is, in the extrametrical syllable and in the unstressed syllable of the foot, the underlying features of the sequence /Vʔ/ tend to be lost: no glottal closure is produced, and the laryngealization of the vowel may extend over part or even all of the vocalic segment. The study also shows that, in Shiwilu, spectral tilt is an acoustic parameter that distinguishes modal voice from creaky voice. / Thesis Prosody Phonology
15	[en] SYNTACTIC IMPAIRMENT AND READING ABILITIES: POSSIBLE RELATIONS / [pt] POSSÍVEIS RELAÇÕES ENTRE DISTÚRBIOS DA LINGUAGEM NO DOMÍNIO DA SINTAXE E HABILIDADES DE LEITURA NOELLE CASTRO FERREIRA 16 August 2017 (has links) [pt] Essa dissertação investiga uma possível relação entre comprometimentos no domínio sintático e dificuldades de leitura. Mais especificamente, busca-se verificar se dificuldades na compreensão oral de estruturas altamente custosas – interrogativas QU mais N de objeto (OWH mais N) e relativas de objeto (ORCs) – preveem problemas na compreensão leitora, quando tais estruturas estão envolvidas. 78 alunos (idade média: 12) do sexto ano de três escolas públicas do Rio de Janeiro participaram desse estudo. Suas habilidades sintáticas foram inicialmente testadas. Dois grupos foram criados para participarem dos testes de leitura: com possível comprometimento sintático (SI) (n igual 25) e controle (CT) (n igual 53). Um teste de reconhecimento de palavras e outro de leitura de palavras/pseudopalavras isoladas foram elaborados e realizados, uma vez que a fluência em leitura requer que dificuldades nessas habilidades sejam superadas. Novos grupos (SI e CT), sem problemas nesse nível, foram definidos (com 12 participantes cada). Dois experimentos foram conduzidos, buscando testar suas habilidades de fluência (velocidade, precisão e prosódia) em leitura com OWH mais N e ORCs em sentenças isoladas (tarefa de identificação de imagens) e no discurso (tarefa de leitura automonitorada). No último caso, o sujeito interveniente foi manipulado quanto à complexidade estrutural (DP completo ou pronome). Um aspecto da prosódia (uso adequado de pitch) distinguiu os grupos, com menor desempenho no grupo SI em ambas as tarefas. Um aspecto da precisão (número de disfluências) também os distinguiu quando as sentenças investigadas foram apresentadas no discurso (mais disfluências no grupo SI). 
A compreensão foi particularmente afetada quando as sentenças foram lidas isoladamente. Fatores relacionados à escolha lexical, bem como fatores discursivos podem criar demandas diferenciadas para as sentenças no discurso e isso pode ter minimizado efeitos de grupo. O efeito de intervenção foi obtido na direção prevista nas interrogativas QU mais N (OWH mais N), em que uma demanda maior foi observada nas condições de DP completos. Já no caso das relativas de objeto (ORC), um efeito significativo não foi obtido, possivelmente devido a dificuldades na atribuição de força ilocucionária interrogativa às perguntas SIM/NÃO. Uma amostra de sentenças com diferentes estruturas e com pontuação variada foi examinada, a fim de se obter uma análise geral de fluência em leitura. Novamente o pitch é indicativo de comprometimento sintático. Comprometimentos na interface sintaxe-prosódia podem explicar esses resultados. / [en] This dissertation investigates whether syntactic impairment, as detected in the oral comprehension of costly structures, namely object WH plus N questions (OWH plus N) and object relative clauses (ORCs), predicts difficulties in reading fluency and comprehension. The syntactic abilities of 6th graders from three public schools in Rio de Janeiro were evaluated, giving rise to syntactically impaired (SI) and control (CT) groups. Those who satisfied word-level criteria in word-recognition and word-reading tasks proceeded to the reading experiments. Two experiments assessed reading fluency (rate, accuracy, prosody) and the comprehension of each of the target structures in isolation (picture-identification task) and in discourse (self-paced reading task). In the latter, the intervening subject was manipulated for structural complexity (full nominal phrase or pronoun). One aspect of prosody (pitch contour) distinguished the groups, with lower scores in the SI group in both tasks.
An aspect of accuracy (number of disfluencies) distinguished them in the discourse task (more disfluencies in the SI group). Comprehension of the target sentences in isolation was harder for the SI group. Lexical and discourse factors can create differential demands for sentences in discourse, minimizing group effects. The intervention effect was in the predicted direction for OWH plus N sentences (greater demands with full nominal phrase subjects). For ORCs, this effect was not significant, possibly due to difficulties in ascribing illocutionary force to YES/NO questions. In an overall analysis of reading fluency in discourse (different structures and punctuation marks), pitch contour again indicated syntactic impairment. Impairment at the syntax-prosody interface can account for these results. [pt] FLUENCIA EM LEITURA [en] READING FLUENCY [pt] COMPROMETIMENTO SINTATICO [en] SYNTACTIC IMPAIRMENT [pt] DEL [en] SLI [pt] PROSODIA [en] PROSODY [pt] HIPOTESE DA INTERVENCAO [en] INTERVENTION HYPOTHESIS
16	La lingua dell'insegnante: Un modello per l'insegnamento e per l'apprendimento. Fondamenti metodologici dell'insegnamento CLIL. / DIE SPRACHE DER LEHRPERSON: EIN LEHR-LERN-MODELL METHODISCHE GRUNDLAGEN DES BILINGUALEN SACHFACHUNTERRICHTS / The language of the teacher: A model for teaching and learning. Methodological principles of CLIL. ZANIN, RENATA 23 March 2015 (has links) L’insegnamento disciplinare in una lingua straniera (Content and Language Integrated Learning - CLIL) richiede al docente DNL (discipline non linguistiche) una particolare attenzione alla lingua comune, alla lingua della disciplina, ma soprattutto alle frasi fatte, ai frasemi, alle parole sintagmatiche e alle espressioni multiparola (Masini 2009). La categoria degli atti linguistici attenti alla lingua (Leisen 2010) può essere definita in relazione alla lingua franca, caratterizzata da un minor uso di parole sintagmatiche, di espressioni multiparola, di frasi idiomatiche (Aguado 2002b) come anche da una minore attenzione alle relative forme prosodiche. Se da un lato sembra in questo modo definito uno degli obiettivi principali della formazione linguistica degli insegnanti DNL (discipline non linguistiche), dall’altro devono essere analizzate e verificate le vie che permettano il raggiungimento di tale obiettivo. Il lavoro di ricerca svolto traccia un modello di insegnamento e di apprendimento basato proprio sulle espressioni multiparola e sulla loro espressione prosodica. Tre i filoni scientifici alla base del modello: il "competition model" (Bates 1999), la "idiomatisch geprägte Sprache" e la "Anschließbarkeit" (Feilke 1994, 1996) e la lettura silente (Perrone-Bertolotti et al. 2013). 
Da essi nasce l’indicazione della "prosodische Prägung" di grande importanza per il modello di insegnamento e apprendimento proposto che trova nelle fonti letterarie tedesche testi eccellenti per una didattica interdisciplinare, che sappia valorizzare il ruolo del docente di lingua straniera nel suo apporto all’apprendimento della lingua tedesca nella classe CLIL. / Teaching a CLIL (Content and Language Integrated Learning) class is a major challenge for teachers of non-linguistic disciplines (NLD), as they need to pay particular attention not only to the specialized language but also to everyday language, in particular to chunks and formulaic sequences. The discourse structure necessary for a ‘good’ CLIL lesson contrasts with the lingua franca, which is characterized by a reduced use of chunks and formulaic sequences (Aguado 2002, Wray 2002) and by reduced attention to prosodic features. Teacher training in Content and Language Integrated Learning therefore needs to raise and foster the language awareness of discipline teachers. The research presented in this work proposes a model for teaching and learning in CLIL classes based on chunks, formulaic sequences and their prosodic features. The scientific foundations of this model are the competition model (Bates 1999), "idiomatisch geprägte Sprache" and "Anschließbarkeit" (Feilke 1994, 1996), and silent reading (Perrone-Bertolotti et al. 2013). The central point of the model is the "prosodische Prägung", which can be successfully practised and learned with poems from great German literature, giving new value to the role of the foreign language teacher in supporting the discipline teachers in CLIL classes. L-LIN/02: DIDATTICA DELLE LINGUE MODERNE
17	Resonance in storytelling: verbal, prosodic and embodied practices of stance taking Niemelä, M. (Maarit) 27 April 2011 Abstract This study examines stories as they appear in everyday conversation, focusing on the high degree of parallelism observed in them. Such parallelism is shown to be a vehicle of stance taking in interaction. Stance taking is here viewed as a highly intersubjective and interactive, public, multi-layered activity, which involves words, linguistic structures, voices, the body and the surrounding environment, and is embedded in the sequential organisation of social interaction. Stance taking involves various types of resonance between two interaction participants and also between the interactional turns of one participant. The concept of resonance is treated as the process of activating affinity across dialogic turns of talk within a telling or a series of tellings. The present study uses both audio and video recordings of naturally occurring everyday interactions as data. The study first shows that voiced direct reported speech (DRS) utterances displaying a shared stance are an appropriate response to prior voiced DRS utterances and that a sequence of subsequent resonant voiced DRS utterances is an orderly phenomenon in interaction and a sequentially relevant practice of stance taking. Secondly, the study explicates the way in which participants use resonant words, structures, voicing and embodiment, and implicate the surrounding environment in constructing a reporting space. The reporting space enables and invites active participation, in the form of multimodal enactments, from all the participants of the telling event in the overall stance-taking activity within the storytelling sequence. Thirdly, the study details the use of resonating formal storytelling elements functioning as a resource for stance taking, e.g. the preface of a second telling by second tellers ties back to the preface and the high point of a prior telling.
Finally, the study examines the way in which multiple actions, such as troubles telling, delivering news, giving an explanation and requesting advice, are accomplished via repeated tellings of a story in different interactional contexts. Similar structural units of such tellings resonate in form, whereas some lexico-syntactic details of these units vary according to the social actions that are being accomplished via the tellings, according to the engagement of the recipient in the telling and to the physical circumstances of the telling. / Tiivistelmä Tutkimus tarkastelee arkikertomuksissa ilmeneviä parallellismin muotoja ja sitä miten nämä rakentavat vuorovaikutuksellista asennoitumistoimintaa. Asennoituminen nähdään monisäikeisenä intersubjektiivisena ja interaktiivisena toimintana, joka rakentuu puhujien sanojen, kielellisten rakenteiden, äänen ja kehon keinoin. Samanaikaisesti se rakentuu vuorovaikutuksen sosiaalisten toimintojen ja niiden sekventiaalisen järjestyksen tuloksena. Asennoitumistoimintaa ilmentää eriasteinen resonanssi pääasiassa eri puhujien mutta myös yhden puhujan eri vuorojen välillä: Puhujan resonoiva vuoro sitoo sen edellisen arkikertomuksen tai arkikertomussarjan vuoroihin aktivoiden näin yhtäläisyyden vuorojen välillä. Ilmiöitä tarkastellaan vuorovaikutuslingvistiikan ja keskustelunanalyysin menetelmin. Tutkimuksen aineisto koostuu englannin- ja suomenkielisistä äänitetyistä ja videoiduista arkikeskusteluista. Tutkimus osoittaa, että kertomistapahtumaan osallistuvat puhujat tuottavat kertomusten huippukohdissa kohosteisia referointivuoroja vastauksina aiempien kertojien kohosteisiin referointivuoroihin. Puhujat ilmaisevat tällä tavalla asennoitumistaan yhtäältä kerronnan sisältöön ja toisaalta edeltävien vuorojen ilmentämään asennoitumistoimintaan. Tutkimuksessa kartoitetaan myös sitä, miten puhujat rakentavat asennoitumista sanojen, kielellisten rakenteiden, prosodian ja kehollisten keinojen avulla. 
Kertomusten huippukohdissa puhujat referoivat roolihenkilöitä puheen lisäksi myös kehollisin keinoin, mitä tutkimuksessa kutsutaan roolissa toimimiseksi. Vastaanottajat voivat vastata asettumalla itsekin rooliin. He osoittavat ymmärtävänsä kertojan näkökulman tuottamalla kertomuksen sisältöön ja kertojan ilmentämiin asenteisiin sopivia samanlinjaisia lisävuoroja. Edelleen tutkimus kuvailee nk. toisen kertomuksen kielellisiä, prosodisia ja kehollisia elementtejä, jotka resonoivat edeltävän kertomuksen vuorojen elementtien kanssa ja joiden avulla asennoitumistoiminta rakentuu. Kertojat viittaavat toisen kertomuksen vuoroillaan edellisen kertomuksen vuoroihin aktivoiden yhtäläisyyksiä yhtäältä kyseisten resonoivien vuorojen ja toisaalta edeltävän ja toisen kertomuksen asennoitumistoimintojen välillä. Lisäksi tutkimuksessa tarkastellaan samansisältöisiä peräkkäisiä arkikertomuksia, jotka on tuotettu eri vastaanottajille. Kertoja tuottaa samansisältöisten kertomusten avulla eri toimintoja vastaanottajasta ja vuorovaikutusympäristöstä riippuen. Kertomusten välillä on resonoivia rakenteellisia yhtäläisyyksiä, mutta ne myös poikkeavat toisistaan sosiaalisen toiminnon sekä vastaanottajan sitoutumisen asteen ja ympäröivien olosuhteiden mukaan. direct reported speech (DRS) embodiment enactment intersubjectivity multimodality parallelism recycling reporting space resonance retelling second story stance taking storytelling arkikertomus asennoitumistoiminta intersubjektiivisuus kehollisuus multimodaalisuus parallellismi prosodia referointi resonanssi roolissa toimiminen roolitila toinen kertomus
18	Producción de un corpus oral y modelado prosódico para la síntesis del habla expresiva Iriondo Sanz, Ignasi 18 June 2008 Aquesta tesi aborda diferents aspectes relacionats amb la síntesi de la parla expressiva. Es parteix de l'experiència prèvia en sistemes de conversió de text a parla del Grup en Processament Multimodal (GPMM) d'Enginyeria i Arquitectura La Salle, amb l'objectiu de millorar la capacitat expressiva d'aquest tipus de sistemes. La parla expressiva transmet informació paralingüística com, per exemple, l'emoció del parlant, el seu estat d'ànim, una determinada intenció o aspectes relacionats amb l'entorn o amb el seu interlocutor. Els dos objectius principals de la present tesi consisteixen, d'una banda, en el desenvolupament d'un corpus oral expressiu i, d'una altra, en la proposta d'un sistema de modelatge i predicció de la prosòdia per a la seva utilització en l'àmbit de la síntesi expressiva del parla. En primer lloc, es requereix un corpus oral adequat per a la generació d'alguns dels mòduls que componen un sistema de síntesi del parla expressiva. La falta de disponibilitat d'un recurs d'aquest tipus va motivar el desenvolupament d'un nou corpus. A partir de l'estudi dels procediments d'obtenció de parla emocionada o expressiva i de l'experiència prèvia del grup, es planteja el disseny, l'enregistrament, l'etiquetatge i la validació del nou corpus. El principal objectiu consisteix a aconseguir una elevada qualitat del senyal i una cobertura fonètica suficient (segmental i prosòdica), sense renunciar a l'autenticitat des del punt de vista de l'expressivitat oral. El corpus desenvolupat té una durada de més de cinc hores i conté cinc estils expressius: neutre, alegre, sensual, agressiu i trist.
En tractar-se de parla expressiva obtinguda mitjançant la lectura de textos semànticament relacionats amb els estils definits, s'ha requerit un procés de validació que garanteixi que les locucions que formen el corpus incorporin el contingut expressiu desitjat. L'avaluació exhaustiva de tots els enunciats del corpus seria excessivament costosa en un corpus de gran grandària. D'altra banda, no existeix suficient coneixement científic per a emular completament la percepció subjectiva mitjançant tècniques automàtiques que permetin una validació exhaustiva i fiable dels corpus orals. En el present treball s'ha proposat un mètode que suposa un avanç cap a una solució pràctica i eficient d'aquest problema, mitjançant la combinació d'una avaluació subjectiva amb tècniques d'identificació automàtica de l'emoció en el parla. El mètode proposat s'utilitza per a portar a terme una revisió automàtica de l'expressivitat del corpus desenvolupat. Finalment, una prova subjectiva ha permès validar el correcte funcionament d'aquest procés automàtic. En segon lloc i, sobre la base dels coneixements actuals, de l'experiència adquirida i dels reptes que es desitjaven abordar, s'ha desenvolupat un sistema d'estimació de la prosòdia basat en corpus. Tal sistema es caracteritza per modelar de forma conjunta les funcions lingüística i paralingüística de la prosòdia a partir de l'extracció automàtica d'atributs prosòdics del text, que constitueixen l'entrada d'un sistema d'aprenentatge automàtic que prediu els trets prosòdics modelats prèviament. El sistema de modelatge prosòdic presentat en aquest treball es fonamenta en el raonament basat en casos, que es tracta d'una tècnica d'aprenentatge automàtic per analogia. Per a l'ajustament d'alguns paràmetres del sistema desenvolupat i per a la seva avaluació s'han utilitzat mesures objectives de l'error i de la correlació calculades en les locucions del conjunt de prova. 
Atès que les mesures objectives sempre es refereixen a casos concrets, no aporten informació sobre el grau d'acceptació que tindrà la parla sintetitzada en els oïdors. Per tant, s'han portat a terme una sèrie de proves de percepció en les quals un conjunt d'avaluadors ha puntuat un grup d'estímuls en cada estil. Finalment, s'han analitzat els resultats per a cada estil i s'han comparat amb les mesures objectives obtingudes, el que ha permès extreure algunes conclusions sobre la rellevància dels trets prosòdics en la parla expressiva, així com constatar que els resultats generats pel mòdul prosòdic han tingut una bona acceptació, encara que s'han produït diferències segons l'estil. / Esta tesis aborda diferentes aspectos relacionados con la síntesis del habla expresiva. Se parte de la experiencia previa en sistemas de conversión de texto en habla del Grup en Processament Multimodal (GPMM) de Enginyeria i Arquitectura La Salle, con el objetivo de mejorar la capacidad expresiva de este tipo de sistemas. El habla expresiva transmite información paralingüística como, por ejemplo, la emoción del hablante, su estado de ánimo, una determinada intención o aspectos relacionados con el entorno o con su interlocutor. Los dos objetivos principales de la presente tesis consisten, por una parte, en el desarrollo de un corpus oral expresivo y, por otra, en la propuesta de un sistema de modelado y predicción de la prosodia para su utilización en el ámbito de la síntesis expresiva del habla. En primer lugar, se requiere un corpus oral adecuado para la generación de algunos de los módulos que componen un sistema de síntesis del habla expresiva. La falta de disponibilidad de un recurso de este tipo motivó el desarrollo de un nuevo corpus. A partir del estudio de los procedimientos de obtención de habla emocionada o expresiva y de la experiencia previa del grupo, se plantea el diseño, la grabación, el etiquetado y la validación del nuevo corpus. 
El principal objetivo consiste en conseguir una elevada calidad de la señal y una cobertura fonética suficiente (segmental y prosódica), sin renunciar a la autenticidad desde el punto de vista de la expresividad oral. El corpus desarrollado tiene una duración de más de cinco horas y contiene cinco estilos expresivos: neutro, alegre, sensual, agresivo y triste. Al tratarse de habla expresiva obtenida mediante la lectura de textos semánticamente relacionados con los estilos definidos, se ha requerido un proceso de validación que garantice que las locuciones que forman el corpus incorporen el contenido expresivo deseado. La evaluación exhaustiva de todos los enunciados del corpus sería excesivamente costosa en un corpus de gran tamaño. Por otro lado, no existe suficiente conocimiento científico para emular completamente la percepción subjetiva mediante técnicas automáticas que permitan una validación exhaustiva y fiable de los corpus orales. En el presente trabajo se ha propuesto un método que supone un avance hacia una solución práctica y eficiente de este problema, mediante la combinación de una evaluación subjetiva con técnicas de identificación automática de la emoción en el habla. El método propuesto se utiliza para llevar a cabo una revisión automática de la expresividad del corpus desarrollado. Finalmente, una prueba subjetiva con oyentes ha permitido validar el correcto funcionamiento de este proceso automático. En segundo lugar y, sobre la base de los conocimientos actuales, de la experiencia adquirida y de los retos que se deseaban abordar, se ha desarrollado un sistema de estimación de la prosodia basado en corpus. Tal sistema se caracteriza por modelar de forma conjunta las funciones lingüística y paralingüística de la prosodia a partir de la extracción automática de atributos prosódicos del texto, que constituyen la entrada de un sistema de aprendizaje automático que predice los rasgos prosódicos modelados previamente. 
El sistema de modelado prosódico presentado en este trabajo se fundamenta en el razonamiento basado en casos, que se trata de una técnica de aprendizaje automático por analogía. Para el ajuste de algunos parámetros del sistema desarrollado y para su evaluación se han utilizado medidas objetivas del error y de la correlación calculadas en las locuciones del conjunto de prueba. Dado que las medidas objetivas siempre se refieren a casos concretos, no aportan información sobre el grado de aceptación que tendrá el habla sintetizada en los oyentes. Por lo tanto, se han llevado a cabo una serie de pruebas de percepción en las que un conjunto de oyentes ha puntuado un grupo de estímulos en cada estilo. Finalmente, se han analizado los resultados para cada estilo y se han comparado con las medidas objetivas obtenidas, lo que ha permitido extraer algunas conclusiones sobre la relevancia de los rasgos prosódicos en el habla expresiva, así como constatar que los resultados generados por el módulo prosódico han tenido una buena aceptación, aunque se han producido diferencias según el estilo. / This thesis deals with different aspects related to expressive speech synthesis (ESS). Building on the previous experience in text-to-speech (TTS) systems of the Grup en Processament Multimodal (GPMM) of Enginyeria i Arquitectura La Salle, its main aim is to improve the expressive capabilities of such systems. Expressive speech conveys paralinguistic information such as the emotion of the speaker, his/her mood, a certain intention or aspects related to the environment or to his/her conversational partner. The present thesis tackles two main objectives: on the one hand, the development of an expressive speech corpus and, on the other, the modelling and prediction of prosody from text for use in the ESS framework. First, an ESS system requires a speech corpus suitable for the development and operation of some of its modules. 
The unavailability of a resource of this kind motivated the development of a new corpus. Based on the study of the strategies to obtain expressive speech and the previous experience of the group, the different tasks have been defined: design, recording, segmentation, tagging and validation. The main objective is to achieve a high-quality speech signal and sufficient phonetic coverage (segmental and prosodic), while preserving authenticity in terms of oral expressiveness. The recorded corpus has 4638 sentences and is 5 h 12 min long; it contains five expressive styles: neutral, happy, sensual, aggressive and sad. Expressive speech has been obtained by reading texts semantically related to the defined styles. Therefore, a validation process has been required to guarantee that the recorded utterances incorporate the desired expressive content. A comprehensive assessment of the whole corpus would be too costly. Moreover, there is insufficient scientific knowledge to completely emulate subjective perception through automated techniques that yield a reliable validation of speech corpora. In this thesis, we propose an approach that represents a step towards a practical solution to this problem, by combining subjective evaluation with techniques for the automatic identification of emotion in speech. The proposed method is used to perform an automatic review of the expressiveness of the corpus developed. Finally, a subjective test with listeners has validated the correct operation of this automatic process. Second, based on current knowledge, the experience gained and the challenges to be addressed, a corpus-based system for prosody estimation has been developed. This system is characterized by jointly modelling the linguistic and the paralinguistic functions of prosody. A set of prosodic attributes is automatically extracted from text. This information is the input to an automatic learning system that predicts the prosodic features previously modelled through supervised training. 
The root mean squared error and the correlation coefficient have been used both in the adjustment of some system parameters and in the objective evaluation. However, these measures refer to specific utterances delivered by the speaker in the recording session, so they do not provide information about the degree to which listeners will accept the synthesized speech. Therefore, we have conducted several perception tests in which a group of listeners scored a set of stimuli in each expressive style. Finally, the results for each style have been analyzed and compared with the objective measures, which has allowed us to draw some conclusions about the relevance of prosodic features in expressive speech, as well as to verify that the results generated by the prosodic module were well accepted, although with differences depending on the style. Speech Technology Text-to-speech Expressive speech synthesis Prosody Speech corpora Tecnologías del Habla Síntesis del habla expresiva Conversión de texto en habla Prosodia Corpus orales Tecnologies de la Parla Conversió de text a parla Prosòdia Síntesi de la parla expressiva Corpus Orals Les TIC i la seva gestió 621.3 81
