  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

Lire l'oral. Pour une typologie linguistique des représentations écrites de l'oralité. (Le cas du français). / Orality effects. Toward a linguistic typology of written representations of spoken language. The case of French.

Mahrer, Rudolf 26 June 2014
This thesis reviews the ways in which writing represents orality, using literary texts as the material for theorization. The central question (how does writing represent speech?) is first situated and reformulated within the framework of the linguistique de la parole, the linguistics of speech (I). The relations between orality and writing are then studied from three angles. The biotechnological angle compares the materiality and affordances of graphic and acoustic signals (II 1). The semiotic examination recognizes in written French a so-called phonographic system whose function is to represent the expression of the signs of spoken French. It analyzes the relations between the sign systems involved, the diversity of possible actualizations of the phonographic system (voice effects), and various analogical semiotics (II 2). The role of prosody in reading is studied next. The position adopted is that, although prosody is optional in reading, it is especially called upon by writings that can be characterized linguistically. Prosodic interpretation gives these writings additional meaning while producing a specific mode of representing speech, called the prosodic effect (II 3). Finally, the semantic angle is sketched; it brings out two further modes of representation. In the first, speech lies on the semantic-referential plane of the written expression (writing about speech); in the second, speech is a discursive exterior that modalizes the written utterance, so that the writing is recognized as uttered in the manner of speech (oral style effect).
52

Ritmo e escrita em L'innommable, Comment c'est e Compagnie de Samuel Beckett / Rhythm and writing in L'innommable, Comment c'est, and Compagnie by Samuel Beckett

Tereza Cristina Bulla 21 September 2012
Writing is a powerful tool that humans have used ever since they discovered that they could communicate on a fixed medium and not only through speech. Over the years, they developed both this tool and the medium that carried it, transforming them through papyri, parchments, and codices until the arrival of the printing press, which revolutionized writing for good by introducing punctuation marks into it. Over time, and with the advent of modern literature, the textual medium became increasingly valued and worked upon, leading to modern writers such as Samuel Beckett. But why rhythm and writing? Because the two are intimately related: there is no writing without rhythm. Indeed, there is no discourse without rhythm, since discourse is organized by rhythm. Syntax and punctuation are thus part of the rhythmic play of the text. Rhythm, syntax, and punctuation can be said to form a powerful triad, and analyzing these elements in Samuel Beckett's last three novels is an important way of showing his innovative work with rhythm and writing.
53

Stil. Punkt. : Zur Übersetzung von Interpunktion und Satzaufteilung als stilistische Merkmale. / Style. Period. : The translation of punctuation and sentence splitting as stylistic features

Eriksson, Josefine January 2022
This paper studies the translation of style in the book Die Welt auf dem Teller by Doris Dörrie (2020) from German into Swedish. It argues that the style resides partly in punctuation and sentence length, and the study focuses on how these can be translated from German into Swedish given their importance for the style of the text. The analysis shows that the source text uses punctuation in a more differentiated way, whereas the target text is more restricted and neutral. Differences in how punctuation is translated are mainly due to grammatical differences, but they also depend on whether the punctuation can be experienced in the same way by source-text and target-text readers. Both circumstances influence the translation. It is argued that Swedish readers expect shorter sentences and accessible texts, while German accepts higher density and more complex constructions. On average, the sentences in this translation are shorter, and the deviation in sentence length is lower. When sentence length in the source text is seen as a significant stylistic feature, the structure is kept in the translation. Otherwise, sentences are often split to make them more accessible. Even when the sentence construction is kept, the sentence is often shorter than in the source text. As target-text readers expect this, the stylistic effect can arguably still be considered preserved. The translation can therefore be said to be more neutral in both sentence length and punctuation. It gives stylistic features space when they are considered characteristic; otherwise, they are changed to fit the norms of the target language.
54

Užití interpunkce, emoji a emotikonů v urážlivých komentářích na YouTube / The Use of Punctuation, Emoji and Emoticons in YouTube Abusive Comments

Bočková, Renata January 2019
This thesis attempts to contribute to the study of punctuation marks (including emoji and emoticons) used in computer-mediated communication. It aims to describe their role in abusive comments on YouTube videos with LGBT content and the extent to which their use differs between respectful and hateful comments on such videos. The analysis also examines how the distribution of punctuation marks differs in relation to the polarity, content, and length of comments. The thesis compares the frequency of punctuation marks in respectful and hateful comments and, in addition, attempts to classify emoji and emoticons according to their role in the text. Keywords: computer-mediated communication, YouTube, emoji, emoticons, punctuation, Internet communication
55

Ensino de pontuação em coleções didáticas de português: uma análise dialógica / Teaching punctuation in Portuguese textbook collections: a dialogic analysis

Silva, Anderson Cristiano da 08 December 2015
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Results from large-scale assessments show that, at the end of elementary school, students have gaps in the reading and writing skills expected for their school year. Among these are a lack of skill in placing punctuation marks and, consequently, a lack of awareness of their importance for constructing meaning in writing. This difficulty results from several factors, among which we highlight possible inadequacies in the approaches to teaching punctuation found in the mother-tongue teaching materials in circulation. The motivation for this thesis therefore lies in the way punctuation marks are treated in the Portuguese textbooks for elementary school approved by the Programa Nacional do Livro Didático (PNLD) and distributed in Brazilian public schools. As a way of questioning didactic and methodological approaches, the study is justified by its reflection on the subject and by the possibility of offering new perspectives on the topic within Applied Linguistics and Language Studies. The research analyzes the punctuation activities in the 6th to 9th grade volumes of two collections: Português: uma proposta para o letramento, by Magda Soares, and Português: linguagens, by William Roberto Cereja and Thereza Cochar Magalhães. We established the following research questions: (1) What theoretical and methodological guidelines do the two collections offer for teaching punctuation marks in the final years of elementary school? (2) How do the didactic proposals on the use of punctuation in the works analyzed connect (or fail to connect) to the education of readers and producers of texts, as set out in the guidelines of official documents? From these questions, we worked with the hypothesis that the punctuation activities in the two selected collections may not contribute to the full development of students' writing competence. Our thesis is that the two collections present unsatisfactory didactic approaches to punctuation, which do not match what the official documents prescribe and which only partially develop critical reflection and the ability to use punctuation marks properly. The theoretical framework is Dialogic Discourse Analysis, conceived from the works of Bakhtin and the Circle, with concrete utterance and dialogical relations as key concepts. The work is organized along two axes: (i) the state of knowledge on punctuation, based on Brazilian academic work on the subject from recent decades, together with a description of the research context and the collection and delimitation of the corpus; (ii) the description and analysis of the didactic activities on teaching punctuation in the two selected collections, as well as the utterances that parameterize the constitution of these works. The results showed that, despite the differences between the collections, there are considerable similarities in their approach to punctuation, notably the heterogeneous distribution of content across school years and the lack of coordination between the content and the text-production proposals. The most relevant difference identified between the works, however, is the choice of the oral modality as the privileged space for carrying out the exercises in the collection Português: uma proposta para o letramento.
56

Studier i Mikael Agricolas bibliska företal / Studies in Mikael Agricola's biblical prefaces

Fredriksson, Inger January 1985
A study has been undertaken of the biblical prefaces of Mikael Agricola. All the prefaces are based on those of Luther and/or the translations of them in the Swedish Bible of 1541 (GVB). The New Testament prefaces, like those in GVB, keep closely to the originals. There are, however, visual differences in punctuation, capital letter usage, and paragraphing. The literal translation makes the material very suitable for a study of Agricola's use of capital letters and punctuation in comparison with the Lutheran Bible and GVB. The material is too limited for any conclusions to be drawn about the principles underlying paragraphing. Agricola, like Luther (from 1539 onwards) and GVB, sometimes uses capital initial letters in the substantive designations of: God and Christ; the Holy Spirit; the Bible and its books; the Church and its sections; occupations; nationalities. Adjectives relating to the above groups may also have capital initials. Personal pronouns relating to God or important persons may also be written with capital initials. Unlike the originals, Agricola's texts may also give prominence to pronouns other than the personal, and to verbs, adverbs, numerals, and intensifiers. The punctuation corresponds partly with present usage: complete clauses are usually separated by punctuation marks. One basic difference is that in Agricola, Luther, and GVB, breathing pauses for reading aloud are indicated with commas or full stops. As Agricola stressed his utterances differently from his models, it follows that his punctuation differs from theirs. Agricola's prefaces to Old Testament writings are also based on Luther's, but only two of them are direct translations. Agricola's exclusions and additions have been studied. The former include many of the brief descriptions of contents in Luther's prefaces. The additions are interesting; sometimes Agricola does not accept Luther's brief biographical summaries about the authors of various biblical books, and uses instead Hieronymus' prefaces in the Vulgata. He also refers to the old Jewish work on the human condition, Seder Olam. Agricola's longest preface, to the Psalms, is very much his own. A few pages are devoted to Agricola's summary of commentaries on the Psalms made by two Fathers of the Church, Augustinus Aurelius and Basilius the Great. Here Agricola makes generous use of a popular stylistic device of the time: amplification, an accumulation of more or less synonymous expressions for the same idea. Even in sentences directly translated from Luther and GVB, Agricola often extends the amplifications. Agricola's four rhyming prefaces are not based on any model at all. They were written in the Germanic doggerel metre and have much in common with late mediaeval rhyming chronicles. Agricola's often drastic way of expressing himself makes delightful reading.
57

The punctuation and intonation of parentheticals

Bodenbender, Christel 17 May 2010
From a historical perspective, punctuation marks are often assumed to represent only some of the phonetic structure of the spoken form of a text. It has been argued recently that punctuation today is a linguistic system that represents not only some of the phonetic sentence structure but also syntactic and semantic information. One case in point is the observation that the semantic difference between differently punctuated parenthetical phrases is not reflected in the intonation contour. This study provides the acoustic evidence for this observation. Furthermore, this study makes recommendations to achieve natural-sounding text-to-speech output for English parentheticals by incorporating the study's findings with respect to parenthetical intonation. The experiment conducted for this study involved three male and three female native speakers of Canadian English reading aloud a set of 20 sentences with parenthetical and non-parenthetical phrases. These sentences were analyzed with respect to acoustic characteristics due to differences in punctuation as well as due to differences between parenthetical and non-parenthetical phrases.
A number of conclusions were drawn based on the results of the experiment: (1) a difference in punctuation, although entailing a semantic difference, is not reflected in the intonation pattern; (2) in contrast to the general understanding that parenthetical phrases are lower-leveled and narrower in pitch range than the surrounding sentence, this study shows that it is not the parenthetical phrase itself that is implemented differently from its non-parenthetical counterpart; rather, the phrase that precedes the parenthetical exhibits a lower baseline and with that a wider pitch range than the corresponding phrase in a non-parenthetical sentence; (3) sentences with two adjacent parenthetical phrases or one embedded in the other exhibit the same pattern for the parenthetical-preceding phrase as the sentences in (2) above and a narrowed pitch range for the parenthetical phrases that are not in the final position of the sequence of parentheticals; (4) no pausing pattern could be found; (5) the characteristics found for parenthetical phrases can be implemented in synthesized speech through the use of SABLE speech markup as part of the SABLE speech synthesis system. This is the first time that the connection between punctuation and intonation in parenthetical sentences has been investigated; it is also the first look at sentences with more than one parenthetical phrase. This study contributes to our understanding of the intonation of parenthetical phrases in English and their implementation in text-to-speech systems, by providing an analysis of their acoustic characteristics.
58

Poétique du point de suspension : valeur et interprétations. / Poetics of the ellipsis: value and interpretations.

Rault, Julien 11 December 2014
Starting from the hypothesis of a general movement that brings punctuation marks into the system of the language, supported by grammaticalization, this work treats the punctuation mark as a genuine sign of written language, a ''ponctème'' to which a differential value and a meaning can be assigned by relating a graphic signifier to a signified. We present a linguistic study of the suspension points (the ellipsis), a complex and versatile sign traditionally credited with antithetical properties and countless functions. Distinguishing the semiotic level from the semantic and metasemantic levels, we define the ''ponctème'' as the "sign of the latent", whose interpretation can be summarized in three realizations (suppression, suspension, supplementation) that carry major syntactic, semantic, and enunciative stakes. The reflexive, intrinsically contestatory value of latency makes it possible to account for the various discursive realizations of the ''ponctème'' in literary, journalistic, and metalinguistic (imaginary) discourse. It also offers a way of approaching, from the perspective of poetics (stylistic, generic, socio-historical, epistemological), a genre of discourse that transcends the opposition between literary and non-literary discourse. From its appearance in French printed theater in the seventeenth century to its many contemporary uses in various genres of discourse, this ideogram of the latent, a half-saying that makes a possible appearance appear, is a sign whose labile, excessive value, which leaves meaning unfinished, stems fundamentally from an oblique discourse.
59

Enseigner et apprendre la grammaire : le cas de la phrase et de la ponctuation au cycle II / Teaching and learning grammar: the case of the sentence and punctuation in Cycle II

Jarno-El Hilali, Guénola 04 July 2011
Grammar, long regarded as routine, tedious, and formalistic, is now an object of reflection at the center of new concerns in teaching. This is shown by curricula that stress the importance of notions relating to enunciation and textual cohesion alongside the study of sentence grammar. This work aims to contribute to thinking about how the discipline is acquired. It centers on the sentence and on punctuation. What is a sentence? What is punctuation? Are they spontaneous knowledge, or knowledge learned at school? How can this teaching be made meaningful? How should it be conceived and organized? The work is organized around these questions. The first part brings together the main "classical" and "contemporary" grammatical currents that have had a lasting impact on the teaching of the sentence and of punctuation. The second synthesizes the foundations of grammar teaching and learning as brought to light by recent research in cognitive psychology and didactics. The third analyzes the textbooks and other teaching documents used in classrooms. The fourth presents the conclusions reached after several experiments conducted in CP and CE1 classes, one on approaching the sentence in context and the other on approaching the punctuation system. From this work, we conclude that our didactic proposals can reconcile pupils with grammar, and more particularly with punctuation, the bête noire of written production.
60

Punctuation Restoration as Post-processing Step for Swedish Language Automatic Speech Recognition

Gupta, Ishika January 2023
This thesis focuses on the Swedish language, where punctuation restoration, especially as a post-processing step for the output of Automatic Speech Recognition (ASR) applications, needs further research. I have collaborated with NewsMachine AB, a company that provides large-scale media monitoring services for its clients, for which it employs ASR technology to convert spoken content into text. This thesis follows an approach initially designed for high-resource languages such as English. The method is based on KB-BERT, a pre-trained Swedish neural network language model developed by the National Library of Sweden. The project uses KB-BERT with a Bidirectional Long Short-Term Memory (BiLSTM) layer on top for the task of punctuation restoration. The model is fine-tuned using the TED Talk 2020 dataset in Swedish, which is acquired from OPUS (an open-source parallel corpus). The punctuation marks comma, period, question mark, and colon are considered for this project. A comparative analysis is conducted between two KB-BERT models: bert-base-swedish-cased and albert-base-swedish-cased-alpha. The fine-tuned Swedish BERT-BiLSTM model, trained on 5 classes, achieved an overall F1-score of 81.6%, surpassing the performance of the ALBERT-BiLSTM model, which was also trained on 5 classes and obtained an overall F1-score of 66.6%. Additionally, the BERT-BiLSTM model, trained on 4 classes (excluding colon), outperformed prestoBERT, an existing model designed for the same task in Swedish, with an overall F1-score of 82.8%. In contrast, prestoBERT achieved an overall F1-score of 78.9%. As a further evaluation of the model's performance on ASR transcribed text, noise was injected based on four probabilities (0.05, 0.1, 0.15, 0.2) into a copy of the test data in the form of three word-level errors (deletion, substitution, and insertion). The performance of the BERT-BiLSTM model substantially decreased for all the errors as the probability of noise injected increased. In contrast, the model still performed comparatively better when dealing with deletion errors as compared to substitution and insertion errors. Lastly, the data resources received from NewsMachine AB were used to perform a qualitative assessment of how the model performs in punctuating real transcribed data as compared to human judgment.
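
The noise-injection evaluation described in this abstract can be illustrated with a minimal Python sketch. The abstract gives only the three error types and the four probabilities; the function name inject_noise, the per-word sampling scheme, the filler vocabulary, and the sample sentence are assumptions made for illustration, not the thesis's actual procedure. Realigning punctuation labels after words are deleted or inserted is left out.

```python
import random

def inject_noise(words, p, kind, vocab, rng=None):
    """Return a noisy copy of `words` containing one type of word-level error.

    Hypothetical helper: each word is independently affected with probability p.
    """
    if rng is None:
        rng = random.Random(0)
    noisy = []
    for word in words:
        hit = rng.random() < p
        if kind == "deletion":
            # Drop the word when it is selected for corruption.
            if not hit:
                noisy.append(word)
        elif kind == "substitution":
            # Replace the word with a random word from the filler vocabulary.
            noisy.append(rng.choice(vocab) if hit else word)
        elif kind == "insertion":
            # Keep the word and, when selected, add a random word after it.
            noisy.append(word)
            if hit:
                noisy.append(rng.choice(vocab))
        else:
            raise ValueError(f"unknown error type: {kind}")
    return noisy

# Illustrative input only; not taken from the thesis data.
words = "jag tror att det kommer att regna i morgon".split()
vocab = ["och", "men", "hus", "bil"]
for p in (0.05, 0.1, 0.15, 0.2):
    for kind in ("deletion", "substitution", "insertion"):
        noisy = inject_noise(words, p, kind, vocab, random.Random(42))
        print(p, kind, " ".join(noisy))
```

Applying one error type at a time, as here, keeps the effect of each type separable, which matches the abstract's comparison of deletion against substitution and insertion errors.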
