Global ETD Search

181	Analýza anglo-francouzských pravých protějšků v korpusu autentických vzorků textů / Analysis of English and French true friends (vrais amis) in a corpus of authentic text samples. Pípalová, Mariana (January 2017). This final project provides a parole-level analysis of vrais amis (true counterparts) in present-day French and English. To this end, a specialized English-French translation corpus was assembled, composed of three subcorpora of equal length (religious, political and fiction discourse) and amounting altogether to approximately 60,000 words. Using the AntConc tool, the true friends occurring in the corpus were extracted; they are conceived of here exclusively as a register-specific phenomenon. Using frequency criteria, a central set of the 64 most frequent counterparts was delineated. These central counterparts, marked by (almost) identical frequencies, identical contexts and the same registers, were subjected to a multi-aspectual analysis scrutinizing their pronunciation, spelling, word classes, share of derivation, and frequency of types and tokens. Since English proved to be the borrowing language in all instances, the research also indirectly addressed the degree of their integration into the English word stock by reference to frequency bands. For most of the researched aspects, three zones of counterparts were identified, namely those exhibiting identity, close similarity and relative difference. As a result, employing the Theory of Centre and Periphery (Daneš 1966), we may arrange true counterparts...
182	Analýza vybraných lingvistických aspektů zjednodušené beletrie ve srovnání s originály / Analysis of Selected Linguistic Aspects of Simplified Fiction as against the Originals. Romanenko, Elena (January 2017). The thesis presents a multi-aspectual analysis of simplified fiction at the B2 and C1 levels and of its original counterparts. It aims to explore the simplification and language transformation performed on authentic texts to adapt them to particular CEFR levels. The thesis also endeavors to provide insight into whether there are common linguistic features that characterize authentic and adapted texts of different levels, thus helping teachers and learners justify their choice between original and simplified texts. Based on the theoretical framework, the thesis analyses a specialized corpus of six texts, comprising the first chapters of two original novels and their simplified versions adapted to the B2 and C1 levels by two different publishers. Each sample was scrutinized for selected linguistic features, unveiling the tendencies in language, discourse, and information control in the graded readers. The results of the text analysis were then compared with the CEFR to set the actual text complexity against the assigned CEFR level. The results of the analysis seem to indicate certain discrepancies in this respect. Keywords: CEFR, specialized corpus, graded readers, authentic texts, simplification, language control, discourse control,...
183	Hypotéza unique items v překladu. Korpusová studie. / Unique items hypothesis in translation. A corpus-based study. Špínová, Adéla (January 2017). This thesis tests the so-called unique items hypothesis on Czech language data. Supposed Czech unique items were chosen from among lexical units, word-formation phenomena, syntactic structures and language-use phenomena. Their frequency in a comparable monolingual corpus of contemporary Czech was established, and the differences in frequency were statistically tested. This quantitative research was accompanied by a qualitative probe, using an aligned parallel corpus of English-Czech translations, into the English source texts from which sentences containing selected unique items were translated. The results reveal a general tendency of unique items to be underrepresented in translated language and a variety of source-language phenomena that underlie unique-item usage in the target language.
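The abstract of entry 183 does not say which statistical test was applied to the frequency differences. As a minimal sketch only, assuming a log-likelihood (G2) comparison of the kind often used in corpus linguistics, the frequency of one item in translated and non-translated text could be compared as follows. The function names and all counts are hypothetical and are not taken from the thesis.

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood (G2) for one item's frequency in two corpora.

    freq_a, freq_b: occurrences of the item in corpus A and corpus B.
    size_a, size_b: total token counts of corpus A and corpus B.
    """
    total = freq_a + freq_b
    # Expected frequencies if the item were equally likely in both corpora.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def per_million(freq, size):
    """Relative frequency per million tokens, for readable comparison."""
    return freq * 1_000_000 / size


# Hypothetical counts: an item in non-translated vs. translated Czech text.
g2 = log_likelihood(120, 1_000_000, 60, 1_000_000)
underrepresented = per_million(60, 1_000_000) < per_million(120, 1_000_000)
```

With one degree of freedom, a G2 value above 3.84 corresponds to p < 0.05, so in this made-up case the lower frequency in the translated subcorpus would count as significant.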
184	Predicting the N400 Component in Manipulated and Unchanged Texts with a Semantic Probability Model. Bjerva, Johannes (January 2012). Within the field of computational linguistics, recent research has made successful advances in integrating word space models with n-gram models. This is of particular interest when a model that captures both semantic and syntactic information is desirable. A potential application can be found in psycholinguistics, where the neural response N400 has been found to occur in contexts with semantic incongruities. Previous research has found correlations between cloze probabilities and the N400, while more recent research has found correlations between cloze probabilities and language models. This essay attempts to uncover whether a more direct connection between integrated models and the N400 can be found, hypothesizing that low probabilities elicit strong N400 responses and vice versa. In an EEG experiment, participants read a text manipulated using a language model and a text left unchanged. Analysis of the results shows that the manipulations to some extent yielded results supporting the hypothesis. Corresponding results are found when analysing responses to the unchanged text. However, no significant correlations between the N400 and the computational model are found. Future research should improve the experimental paradigm so that larger-scale EEG recording can be used to construct a large EEG corpus. Keywords: computational semantics, EEG corpus, model integration, N400. Subject: General Language Studies and Linguistics.
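The hypothesis in entry 184, that low-probability words elicit stronger N400 responses, rests on assigning each word a probability in its context. The thesis uses a model that integrates word space and n-gram information; the sketch below is deliberately simpler and is not the author's model. It assumes a plain bigram model with add-one smoothing and reports surprisal (negative log probability), so a higher value marks a less expected word. The training text and function names are hypothetical.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Count bigrams and their left contexts in a token list."""
    contexts = Counter(tokens[:-1])            # words that have a successor
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab_size = len(set(tokens))
    return contexts, bigrams, vocab_size


def surprisal(prev, word, contexts, bigrams, vocab_size):
    """Surprisal in bits of `word` after `prev`, with add-one smoothing."""
    prob = (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
    return -math.log2(prob)


# Hypothetical toy training text.
tokens = "the cat drank the milk and the cat slept".split()
contexts, bigrams, vocab_size = train_bigram(tokens)

expected_word = surprisal("the", "cat", contexts, bigrams, vocab_size)
unexpected_word = surprisal("the", "slept", contexts, bigrams, vocab_size)
```

Here "cat" follows "the" twice in the toy text while "slept" never does, so the second value is the larger one. In an experiment of the kind described, per-word values like these would be set against the measured N400 amplitudes.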
185	The Typology of Focus Marking in South Asian Englishes. Lange, Claudia; Bernaisch, Tobias (January 2012). The emergence of grammatical norms in postcolonial varieties of English has been argued to manifest itself in quantitative preferences rather than in categorical distinctions (cf. Schneider 2007: 46). Several studies on Indian English, however, have shown that this South Asian variety has developed innovative uses, i.e. marked qualitative differences, for the additive focus marker "also" and the restrictive focus markers "only" and "itself" as presentational focus markers (Bhatt 2000, Lange 2007, Balasubramanian 2009), e.g. "Since 7 am itself, schoolchildren started to reach the venue smartly dressed and armed with their queries and waited patiently for more than two hours for the programme to begin." (IN_TI_38032) Number-related mismatches in agreement between a plural antecedent and the singular focus marker have also been attested. This structural phenomenon may indicate a grammaticalization of the focus marker "itself" into an invariant focus particle, as illustrated in the following example: "He said the temporary peace achieved by leaders of the country was a victory for the Sri Lankan Security Forces itself as it was gained by the Security Forces at the expense of their lives." (LK_DN_2004-07-02) The present study is concerned with variation and convergence in the use of focus marking with "itself" in South Asian Englishes, i.e. Bangladeshi English, Indian English, Maldivian English, Nepali English, Pakistani English and Sri Lankan English. On the basis of the South Asian Varieties of English (SAVE) corpus, an 18-million-word web-based newspaper corpus featuring acrolectal language use of the varieties under scrutiny (cf. Bernaisch et al. 2011), we report on the pervasiveness of (presentational) focus marking with "itself".
Although the novel usage of "itself" illustrated above certainly represents a feature of South Asian English, there is a clear pattern of unity and diversity across the individual varieties of English in South Asia. Despite the pan-South Asian presence of presentational "itself", its quantity, grammaticalization processes and structural combinability provide grounds to argue that presentational "itself" is more firmly rooted in some South Asian varieties of English (e.g. Indian English and Sri Lankan English) than in others (Bangladeshi English or Maldivian English).
186	Užití spojovacích prostředků v textech nerodilých mluvčích češtiny / The Use of Connectors in the Texts of Non-Native Speakers of Czech. Pečený, Pavel (January 2017). Department: Institute of Czech Language and Theory of Communication, Faculty of Arts, Charles University. Supervisor: doc. RNDr. Vladimír Petkevič, CSc. Abstract: Over the last few years, two learner corpora of authentic texts by non-native speakers of Czech have been created (the MERLIN Corpus and the CzeSL Corpus), giving linguists an important source of data for researching Czech as a foreign language. Consequently, it is now possible for the first time to analyse the language of non-native speakers of Czech with the tools of corpus linguistics and to formulate evidence-based research findings. The present thesis takes advantage of this, focusing on the study and description of the use of connectors in the written production of non-native speakers of Czech. It is based primarily on evidence from the MERLIN learner corpus, which, unlike other corpora, reliably links written production to the proficiency levels of the Common European Framework of Reference for Languages (CEFR), covering levels A2-B2. The corpus also contains error annotation, making it possible to ascertain what effect language proficiency has on the extent of the repertoire and frequency of using connectors,...
187	Rozšiřování slovní zásoby u romských žáků na druhém stupni ZŠ / Expanding the Vocabulary of Roma Pupils in the Second Level of Primary School. Kráčmarová, Drahoslava (January 2019). The work deals with the acquisition of vocabulary in the period of advanced language development in second-level primary school pupils from different socio-economic and socio-cultural backgrounds. The theoretical section summarises the basic findings as well as the existing empirical research, focusing in particular on the influence of the individual's family environment on his or her linguistic development. It provides a complex picture of the current state of education of Roma children in the Czech Republic. The investigative section presents independent research into the vocabulary of pupils, with regard to selected word groups in the acquisition corpus SKRIPT2015. The texts of pupils attending schools in socially deprived areas, or in areas under threat of social deprivation and with a strong Roma presence, are compared with the texts of pupils attending average schools. On the basis of this comparison, a picture emerges of the impact of different socio-economic and socio-cultural environments on the written language of pupils, as well as of the effectiveness (or lack thereof) of the existing support mechanisms provided by educational institutions. Keywords: acquisition of language, advanced language development, expansion of vocabulary, education of Roma children, acquisition corpus, SKRIPT2015
188	Lexikální zápor ve španělštině / Lexical negation in Spanish. Malinová, Markéta (January 2019). The diploma thesis deals with negation in Spanish, especially lexical negation and negative prefixes. It is divided into two parts: a theoretical part, followed by a practical part presenting the author's own research. The aim of the first part is to define negation in general terms from the perspective of various disciplines and to characterize the means used to form the negative in Spanish and Czech. The greatest emphasis is placed on word-formation negation, which is why the basic word-formation processes are defined here. Within the theoretical part, most attention is paid to prefixation: the basic categories of prefixes are described, with emphasis on negative prefixes. The second part presents research based on work with a language corpus and, subsequently, on a questionnaire created for the purpose of the thesis. The aim of the practical part is to discover specific features in the behaviour of individual negative prefixes (a-, des-, in-, anti- and contra-) and their possible interchangeability. The conclusions drawn from this part build on the theoretical part and complete the overall view of lexical negation in Spanish.
189	Forma a funkce u substantiv v češtině: vztah pádu a syntaktické funkce. Na materiálu korpusu současné psané češtiny (SYN2005) / Form and function of nouns in Czech: relation between nominal case and syntactic function. Based on a synchronic written corpus of Czech (SYN2005). Jelínek, Tomáš (January 2012). Case is the basic morphological means by which Czech nouns express their function in a sentence. The objective of this thesis is to describe, from a frequency point of view, the relation between the form and function of nouns or, more precisely, how frequently cases (both simple and prepositional) are used to realise syntactic functions in sentences. The thesis is based on one of the largest corpora of written synchronic Czech, the 100-million-token corpus SYN2005. In order to obtain data on the frequencies of syntactic functions of nouns in relation to their cases, we annotated the SYN2005 corpus with dependency syntactic annotation. For this annotation, we adopted the format of the analytical layer of the Prague Dependency Treebank. The syntactic annotation was performed by a stochastic parser, the MST parser. Since the reliability of this annotation was not high enough, we built an automatic correction module, which identifies errors of syntactic annotation in the output of the stochastic parser and corrects them by means of linguistic rules. We implemented 26 different rules, but annotation errors were reduced by merely 6-8%. However, this correction module can be further developed. It can be used to correct the output of any dependency parser trained on the data from...
190	Interjekce v německo-českých slovnících / Interjections in German-Czech Dictionaries. Felbr, Lukáš (January 2021). This master's thesis analyses the lexicographical treatment of interjections in current German-Czech commercial dictionaries, with a special focus on the user perspective. The goal of the thesis is to provide an overview of possible issues in the lexicographical processing of interjections. The focus is on both the macrostructure, with the selection of lemmas, and the microstructure of the analysed entries. A comparison with corpus data and with monolingual dictionaries shows that the problems include, in particular, part-of-speech specifications, the selection of Czech translation equivalents, a lack of examples demonstrating the pragmatic function of the interjections, and insufficient metalanguage comments. The dictionary entry for the lemma "ach" is used as an example to show which steps and decisions lexicographical work requires. Finally, concrete solutions for the lexicographical treatment of interjections are proposed, such as exact classification of the respective meanings, metalanguage comments on these meanings, due attention to prosodic and syntactic properties, and dialogical examples.
