  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

Fine-tuning a LLM using Reinforcement Learning from Human Feedback for a Therapy Chatbot Application / Finjustering av en LLM med hjälp av förstärkande inlärning från mänsklig återkoppling (eng. RLHF) för en Psykolog-chatbot applikation

Bill, Desirée, Eriksson, Theodor January 2023
The field of AI and machine learning has seen exponential growth in the last decade, and even more so in the past year, with considerable public interest in large language models (LLMs) such as ChatGPT. LLMs can be used for several purposes; one possible application is fine-tuning a model to perform a particular function in a specific field. The goal of this work is therefore to fine-tune an LLM in the field of psychology using a new method called Reinforcement Learning from Human Feedback (RLHF), to determine whether it is a viable method in such cases. The theory behind LLMs and RLHF, as well as the ethical perspective on developing a psychological AI, is presented. Previous studies on both RLHF and AI in psychology are presented, showing that the goal is feasible. The method is then explained for both training and evaluating the model, which is done by comparing a pre-trained model with the fine-tuned one. The study is considered scientifically relevant because, although RLHF has been used to fine-tune LLMs before, it has not been used with the intent of specializing a model in a particular field. The results did not show any clear difference between the pre-trained and the fine-tuned model; therefore, more tests are required. However, given the limitations regarding hardware, training time, and available data, there is much room for improvement in future studies. An ethical framework applied to a digital psychology assistant is discussed, and a suitable introduction to the market and division of responsibilities is proposed. / Området AI och maskininlärning har sett exponentiell tillväxt under det senaste decenniet och ännu mer under det senaste året med det stora allmänintresset för stora språkmodeller som chat-GPT. Stora språkmodeller kan användas till flera saker där en möjlig tillämpning är att finjustera en modell för att fylla en viss funktion inom ett specifikt yrke. 
Målet med arbetet är därför att finjustera en språkmodell inom området psykologi med hjälp av en ny metod kallad Reinforcement Learning from Human Feedback för att undersöka metodens tillämplighet. Teorin bakom stora språkmodeller och RLHF samt det etiska perspektivet på att utveckla en digital psykologiassistent förklaras. Därefter presenteras tidigare studier om både RLHF och AI inom psykologi som visar att målet är genomförbart. Metoden för att både träna och utvärdera modellen förklaras som görs genom att jämföra den förtränade modellen med den finjusterade. Studien bedöms som vetenskapligt relevant även fast RLHF har använts för att finjustera språkmodeller tidigare, har det inte gjorts med målet att finjustera en språkmodell till ett visst yrke. Resultatet visade inte på någon tydlig skillnad mellan den förtränade och den finjusterade modellen, därför krävs fler tester. Men med de begränsningar som fanns gällande hårdvara, tid att träna och tillgänglig data är det mycket som kan förbättras i framtida studier. Det etiska ramverket applicerat på en digital psykologiassistent diskuteras och en lämplig introduktion till marknaden och ansvarsfördelning föreslås.
52

An In-Depth study on the Utilization of Large Language Models for Test Case Generation

Johnsson, Nicole January 2024
This study investigates the use of large language models for test case generation. It uses the large language model and embedding model provided by Llama, specifically Llama2 of size 7B, to generate test cases from a defined input. The implementation uses customization techniques called Retrieval Augmented Generation (RAG) and Prompt Engineering. In this study, RAG stores organisation information locally, which is then used to create test cases. This stored data complements the data on which the large language model was pre-trained. With this method, the implementation can draw on organisation-specific data and therefore gain a greater understanding of the required domains. The objective of the study is to investigate how AI-driven test case generation impacts overall software quality and development efficiency. This is evaluated by comparing the output of the AI-based system with manually created test cases, which were the company standard at the time of the study. The AI-driven test cases are analyzed mainly in terms of coverage and time, meaning that we compare the degree to which the AI system can generate test cases against the manually created test cases. Likewise, time is taken into consideration to understand how development efficiency is affected. The results reveal that, by using Retrieval Augmented Generation in combination with Prompt Engineering, the system is able to identify test cases to a certain degree. The results show that 66.67% of a specific project was identified using the AI; however, minor noise could appear and results might differ depending on the project’s complexity. Overall, the results indicate that the system can positively impact development efficiency and could also be argued to have a positive effect on software quality. 
However, it is important to understand that the implementation, at its current stage, is not sufficient to be used independently, but should rather be used as a tool for creating test cases more efficiently.
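
The retrieval step that the abstract above attributes to RAG can be illustrated with a small sketch. This is not the thesis implementation: the function names (`retrieve`, `build_prompt`), the sample documents, and the use of bag-of-words cosine similarity in place of a learned embedding model are all assumptions made to keep the example self-contained. It only shows the general idea of ranking locally stored text against a request and prepending the best matches to a prompt.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Very rough stand-in for an embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return the k locally stored documents most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(documents, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:k]


def build_prompt(requirement, documents, k=2):
    """Prepend the retrieved context to a (hypothetical) test-generation instruction."""
    context = "\n".join(f"- {d}" for d in retrieve(requirement, documents, k))
    return f"Context:\n{context}\n\nWrite test cases for: {requirement}"


if __name__ == "__main__":
    # Hypothetical organisation notes; any real system would load its own data.
    docs = [
        "login requires a password",
        "invoices are exported as pdf",
        "password reset sends an email",
    ]
    print(build_prompt("password login", docs, k=1))
```

In a real pipeline the word-count vectors would be replaced by vectors from an embedding model, and the resulting prompt would be passed to the language model; both steps are left out here.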
53

Exploring artificial intelligence bias : a comparative study of societal bias patterns in leading AI-powered chatbots.

Udała, Katarzyna Agnieszka January 2023
The development of artificial intelligence (AI) has revolutionised the way we interact with technology and each other, both in society and in professional careers. Although they come with great potential for productivity and automation, AI systems have been found to exhibit biases that reflect and perpetuate existing societal inequalities. With the recent rise of artificial intelligence tools built on large language model (LLM) technology, such as ChatGPT, Bing Chat and Bard AI, this research project aims to investigate the extent of AI bias in these tools and explore its ethical implications. By reviewing and analysing responses to carefully crafted prompts generated by three different AI chatbot tools, the author aims to determine whether the content generated by these tools exhibits patterns of bias related to various social identities, and to compare the extent to which such bias is present across all three tools. This study will contribute to the growing body of literature on AI ethics and inform efforts to develop more equitable and inclusive AI systems. By exploring the ethical dimensions of AI bias in selected LLMs, this research will shed light on the broader societal implications of AI and the role of technology in shaping our future.
54

Domain Adaptation with N-gram Language Models for Swedish Automatic Speech Recognition : Using text data augmentation to create domain-specific n-gram models for a Swedish open-source wav2vec 2.0 model / Domänanpassning Med N-gram Språkmodeller för Svensk Taligenkänning : Datautökning av text för att skapa domänspecifika n-gram språkmodeller för en öppen svensk wav2vec 2.0 modell

Enzell, Viktor January 2022
Automatic Speech Recognition (ASR) enables a wide variety of practical applications. However, many applications have their own domain-specific words, creating a gap between training and test data when used in practice. Domain adaptation can be achieved through model fine-tuning, but it requires domain-specific speech data paired with transcripts, which is labor intensive to produce. Fortunately, the dependence on audio data can be mitigated to a certain extent by incorporating text-based language models during decoding. This thesis explores approaches for creating domain-specific 4-gram models for a Swedish open-source wav2vec 2.0 model. The three main approaches extend a social media corpus with domain-specific data to estimate the models. The first approach utilizes a relatively small set of in-domain text data, and the second approach utilizes machine transcripts from another ASR system. Finally, the third approach utilizes Named Entity Recognition (NER) to find words of the same entity type in a corpus to replace with in-domain words. The 4-gram models are evaluated by the error rate (ERR) of recognizing in-domain words in a custom dataset. Additionally, the models are evaluated by the Word Error Rate (WER) on the Common Voice test set to ensure good overall performance. Compared to not having a language model, the base model improves the WER on Common Voice by 2.55 percentage points and the in-domain ERR by 6.11 percentage points. Next, adding in-domain text to the base model results in a 2.61 WER improvement and a 10.38 ERR improvement over not having a language model. Finally, adding in-domain machine transcripts and using the NER approach results in the same 10.38 ERR improvement as adding in-domain text but slightly less significant WER improvements of 2.56 and 2.47, respectively. These results contribute to the exploration of state-of-the-art Swedish ASR and have the potential to enable the adoption of open-source ASR models for more use cases. 
/ Automatisk taligenkänning (ASR) möjliggör en mängd olika praktiska tillämpningar. Men många tillämpningsområden har sin egen uppsättning domänspecifika ord vilket kan skapa problem när en taligenkänningsmodell används på data som skiljer sig från träningsdatan. Taligenkänningsmodeller kan anpassas till nya domäner genom fortsatt träning med taldata, men det kräver tillgång till domänspecifik taldata med tillhörande transkript, vilket är arbetskrävande att producera. Lyckligtvis kan beroendet av ljuddata mildras till viss del genom användande av textbaserade språkmodeller tillsammans med taligenkänningsmodellerna. Detta examensarbete utforskar tillvägagångssätt för att skapa domänspecifika 4-gram-språkmodeller för en svensk wav2vec 2.0-modell som tränats av Kungliga Biblioteket. Utöver en basmodell så används tre huvudsakliga tillvägagångssätt för att utöka en korpus med domänspecifik data att träna modellerna från. Det första tillvägagångssättet använder en relativt liten mängd domänspecifik textdata, och det andra tillvägagångssättet använder transkript från ett annat ASR-system (maskintranskript). Slutligen använder det tredje tillvägagångssättet Named Entity Recognition (NER) för att hitta ord av samma entitetstyp i en korpus som sedan ersätts med domänspecifika ord. Språkmodellerna utvärderas med ett nytt domänspecifikt evalueringsdataset samt på testdelen av Common Voice datasetet. Jämfört med att inte ha en språkmodell förbättrar basmodellen Word Error Rate (WER) på Common Voice med 2,55 procentenheter och Error Rate (ERR) inom domänen med 6,11 procentenheter. Att lägga till domänspecifik text till basmodellens korpus resulterar i en 2,61 WER-förbättring och en 10,38 ERR-förbättring jämfört med att inte ha en språkmodell. Slutligen, att lägga till domänspecifika maskintranskript och att använda NER-metoden resulterar i samma 10,38 ERR-förbättringar som att lägga till domänspecifik text men något mindre WER-förbättringar på 2,56 respektive 2,47 procentenheter. 
Den här studien bidrar till svensk ASR och kan möjliggöra användandet av öppna taligenkänningsmodeller för fler användningsområden.
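
For readers unfamiliar with the Word Error Rate (WER) metric reported in the abstract above, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer's hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example sentences are assumptions for illustration; the thesis's own evaluation code and text normalization are not shown in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("b" -> "x") and one deletion ("d") over four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

An improvement "in percentage points", as in the abstract, is the difference between two such rates expressed in percent, for example a WER that drops from 10.00% to 7.45% improves by 2.55 percentage points (illustrative numbers, not taken from the thesis).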
55

Improving the Accessibility of Arabic Electronic Theses and Dissertations (ETDs) with Metadata and Classification

Abdelrahman, Eman January 2021
Much research work has been done to extract data from scientific papers, journals, and articles. However, Electronic Theses and Dissertations (ETDs) remain an unexplored genre of data in the research fields of natural language processing and machine learning. Moreover, much of the related research involved data that is in the English language. Arabic data such as news and tweets have begun to receive some attention in the past decade. However, Arabic ETDs remain an untapped source of data despite the vast number of benefits to students and future generations of scholars. Some ways of improving the browsability and accessibility of data include data annotation, indexing, parsing, translation, and classification. Classification is essential for the searchability and management of data, which can be manual or automated. The latter is beneficial when handling growing volumes of data. There are two main roadblocks to performing automatic subject classification on Arabic ETDs. The first is the unavailability of a public corpus of Arabic ETDs. The second is the Arabic language’s linguistic complexity, especially in academic documents. This research presents the Otrouha project, which aims at building a corpus of key metadata of Arabic ETDs as well as providing a methodology for their automatic subject classification. The first goal is aided by collecting data from the AskZad Digital Library. The second goal is achieved by exploring different machine learning and deep learning techniques. The experiments’ results show that deep learning using pretrained language models gave the highest classification performance, indicating that language models significantly contribute to natural language understanding. / M.S. / An Electronic Thesis or Dissertation (ETD) is an openly-accessible electronic version of a graduate student’s research thesis or dissertation. 
It documents their main research effort that has taken place and becomes available in the University Library instead of a paper copy. Over time, collections of ETDs have been gathered and made available online through different digital libraries. ETDs are a valuable source of information for scholars and researchers, as well as librarians. With the digitalization move in most Middle Eastern Universities, the need to make Arabic ETDs more accessible significantly increases as their numbers increase. One of the ways to improve their accessibility and searchability is through providing automatic classification instead of manual classification. This thesis project focuses on building a corpus of metadata of Arabic ETDs and building a framework for their automatic subject classification. This is expected to pave the way for more exploratory research on this valuable genre of data.
56

Preserving Knowledge in Power Line Engineering with Language Models and Design

Götling, Axel January 2024
The loss of senior expertise in power line design poses a critical challenge to the sustainable energy transition. Current methods of knowledge transfer fail to prevent the loss of invaluable knowledge necessary for future junior power line designers. Additionally, the rise of informal deployment of generative language models may also threaten to bury hand-written knowledge documents before this knowledge can be extracted, structured, and preserved for future guidance. This thesis proposes a framework where large language models are integrated into knowledge transfer and decision-making guidance for an engineering enterprise. Using this framework, this thesis further explores how data-driven knowledge tools can assist junior design engineers by supporting information retrieval and directing to knowledge sources. The ability of a large language model to retrieve relevant knowledge from an engineering design document was validated by comparing the process of human designers manually completing a similar task. In this evaluation involving six participants and the large language model, responses to questions on mechanical dimensioning of stays for utility poles were ranked by experts. The results showed that the large language model responses were ranked similarly to the junior designers on average. Additionally, a small-scale demonstrative knowledge tool, insights from interviews, literature studies as well as the results from the validation study lead to the conclusion that large language models can assist power line designers via a knowledge tool. Beyond power line design, this thesis contributes to the understanding of how data-driven language models can assist knowledge retrieval and decision-making across other engineering design domains. 
This work uses a professional education document on the mechanical dimensioning of wooden power line poles, including an analysis of the effect of the wind and weight spans on the dimensions of the pole. The document was developed in parallel with this thesis as a case study, and its original design data supported the tests conducted here. The work also discusses risks and ethical aspects of implementing such a knowledge tool. Risks such as leakage of classified information are emphasized; avoiding them requires comprehensive systems and methods. It is therefore highlighted how important it is to carry out the project with care and expertise to avoid damage to companies and society. Local language models or highly trusted AI system providers are recommended to ensure that no sensitive information is leaked to an unwanted third party. With a high degree of caution and consideration of risks, an effective knowledge tool can contribute to increased efficiency, faster and more sustainable development of power line infrastructure, and thus a faster energy transition. / Förlusten av senior expertis inom kraftledningskonstruktion utgör en kritisk utmaning för den hållbara energiomställningen. Nuvarande metoder för kunskapsöverföring är otillräckliga för att förhindra förlusten av ovärderlig kunskap som är nödvändig för framtida juniora kraftledningsprojektörer. Dessutom kan den ökade informella användningen av generativa språkmodeller hota att begrava mänskligt skrivna kunskapsdokument. Detta arbete presenterar ett ramverk där storskaliga språkmodeller används för att underlätta kunskapsöverföring och tillhandahålla vägledning vid beslutsfattande inom ingenjörsföretag. 
Med hjälp av detta ramverk utforskar arbetet ytterligare hur datadrivna kunskapsverktyg kan hjälpa juniora kraftledningskonstruktörer genom att stödja informationsinhämtning med hänvisning till kunskapskällorna. En storskalig språkmodells förmåga att hämta relevant kunskap från ett tekniskt designdokument validerades genom att jämföra processen för mänskliga designers som manuellt slutförde en liknande uppgift. I denna utvärdering, som involverade sex deltagare och den storskaliga språkmodellen, rankades svaren på frågor om mekanisk dimensionering av stag för kraftledningsstolpar av experter. Resultaten visade att den storskaliga språkmodellens svar i genomsnitt rankades på liknande nivå som de juniora ingenjörerna. Tillsammans med ett småskaligt demonstrativt kunskapsverktyg, insikter från intervjuer med kraftledningskonstruktörer, litteraturstudier samt resultat från valideringsstudien dras slutsatsen att storskaliga språkmodeller kan stödja kraftledningskonstruktörer via ett kunskapsverktyg. Utöver kraftledningskonstruktion bidrar detta arbete till förståelsen av hur datadrivna språkmodeller kan hjälpa till med kunskapsinhämtning och beslutsfattande inom andra tekniska designområden. Arbetet använder ett professionellt utbildningsunderlag om mekanisk dimensionering av kraftledningsstolpar i träkonstruktion, inklusive en analys av vertikala och horisontella linspannets påverkan på stolpens dimension, utvecklat parallellt med detta arbete. Originaldesigndata från underlaget stödde de tester som genomfördes. Arbetet belyser även risker och etiska aspekter vid implementering av ett sådant kunskapsverktyg. Risker som läckage av sekretessbelagd information betonas, och omfattande system och metoder behövs för att undvika dem. Därför understryks hur viktigt det är att genomföra liknande projekt med noggrannhet, försiktighet och expertis för att undvika skador på företag och samhälle. 
Lokala språkmodeller eller API-leverantörer med högt förtroende rekommenderas för att minimera risken att känslig information läcker ut till en oönskad tredje part. Med stor försiktighet och hänsyn till riskerna kan ett effektivt kunskapsverktyg bidra till ökad effektivitet, snabbare och mer hållbar utveckling av kraftledningsinfrastruktur, och därmed en snabbare energiomställning.
57

Generativ AI i gymnasieskolan : Undersökning av en lektionsseries påverkan på gymnasieelevernas färdigheter / Generative AI in Upper Secondary School : Investigating the impact of a lesson series on upper secondary students' skills

Piorkowski, Bartosz Michal January 2024
Denna kvasiexperimentella studie syftade till att undersöka hur en lektionsserie kan struktureras och implementeras med mål att utveckla gymnasieelevers förmåga att använda sig av generativ artificiell intelligens som ett pedagogiskt verktyg. För att möta detta syfte genomfördes tre lektioner om artificiell intelligens, maskininlärning, neurala nätverk och stora språkmodeller med fokus på utveckling av teknisk kunskap och praktiska färdigheter med inslag av etik och kritik. Valet av dessa teman grundades i ett tidigare etablerat ramverk för undervisning inom AI-läskunnighet. Vidare tas dessa teman upp som del av teknikprogrammet och den kommande AI-kursen enligt Skolverkets förslag. Lektionsseriens påverkan kvantifierades med hjälp av två enkäter – en innan och en efter genomförandet av lektionsserien. Lektionsserien presenterades för två gymnasieklasser vilka bestod av totalt ungefär 50 elever. Urvalet av gymnasieklasserna grundades i deras anslutning till den uppdragsgivande läraren. Vidare valdes respondenterna till enkäten utifrån de elever som fysiskt deltog på den första och sista lektionen och frivilligt valde att svara på enkäten. Dessutom intervjuades fyra tekniklärare för att bättre anpassa lektionsinnehållet till målgruppen. Analysen av svarsfrekvensen till enkätfrågorna visade att lektionsserien hade en statistiskt signifikant påverkan på elevernas tekniska kunskaper, men dess påverkan på elevernas praktiska färdigheter var i stort statistiskt insignifikant. Samtidigt påvisade frekvensanalysen att gymnasieeleverna i regel överskattade sin förmåga att kritiskt granska datorgenererad text och var i stort omedvetna om relevanta etiska frågeställningar. Explorativa faktoranalysen visade att det existerar åtminstone två typer av elever. En elevgrupp av okänd storlek använder sig av stora språkmodeller för att accelerera sina studier genom att lösa problem de annars inte kunde lösa. 
I detta fall har artificiell intelligens en multiplicerande effekt på elevernas produktivitet. En annan elevgrupp av okänd storlek har i stället som mål att förbättra sina skolresultat genom att använda sig av stora språkmodeller för att lösa deras problem åt dem. Samtidigt överskattar dessa elever sin förmåga att granska datorgenererad text. I detta fall har artificiell intelligens en dämpande effekt på elevernas lärande. Studiens slutsats är att det i dagsläget finns behov för undervisning av gymnasieelever på teknikprogrammet om artificiell intelligens. Detta utrymme kan i stort uppfyllas av en tre lektioner lång lektionsserie. Dock erkänner studien att det finns ytterligare utrymme för praktiska moment där läraren handleder eleverna i deras användning av verktyg såsom ChatGPT. Vidare finns det utrymme för kontinuerligt arbete med kritik och etik, möjligtvis som del av de tidigare nämnda praktiska momenten. / This quasi-experimental study aimed to investigate how a series of lessons could be structured and implemented with the goal of developing upper secondary students’ ability to use generative artificial intelligence as an educational tool. To meet this goal, three lessons on artificial intelligence, machine learning, neural networks, and large language models were conducted, focusing on the development of technical knowledge and practical skills with the inclusion of ethics and critical thinking. The choice of these topics was based on a previously established framework for AI-literacy education. Further, these topics are covered as part of the Swedish upper secondary school technology programme as well as the upcoming AI course proposed by the Swedish National Agency for Education (Skolverket). The impact of the lesson series was quantified using two surveys – one before and one after the implementation of the lesson series. The lesson series was presented to two student classes totalling roughly 50 students. 
The selection of student classes was based on their affiliation with the assigning teacher. Further, the survey respondents were sampled from the students who physically attended the first and last lesson and voluntarily elected to respond. Additionally, four technology teachers were interviewed to better adapt the teaching material to the student demographic. Response analysis showed that the lesson series had a statistically significant impact on students’ technical knowledge, but its impact on students’ practical skills was largely statistically insignificant. At the same time, the frequency analysis indicated that students generally overestimated their ability to critically evaluate computer-generated text and were largely unaware of relevant ethical issues. Exploratory factor analysis showed that there exist at least two types of students. One group of students, of unknown size, uses large language models to accelerate their studies by solving problems they could not otherwise solve. In this case, artificial intelligence has a multiplying effect on the students’ productivity. Another group of students, also of unknown size, instead uses large language models to solve their problems for them with the goal of improving their academic performance. At the same time, these students overestimate their ability to evaluate computer-generated text critically. In this case, artificial intelligence has a dampening effect on the students’ learning. The study concludes that there is a need for teaching upper secondary students in the technology programme about artificial intelligence. This need can largely be met by a series of three lessons. However, the study acknowledges that there remains room for practical activities where the teacher guides students in their use of tools such as ChatGPT. Furthermore, there is room for ongoing work on critical thinking and ethics, possibly as part of the aforementioned practical activities.
58

L'atténuation statistique des surdétections d'un correcteur grammatical symbolique

Gotti, Fabrizio 02 1900
Les logiciels de correction grammaticale commettent parfois des détections illégitimes (fausses alertes), que nous appelons ici surdétections. La présente étude décrit les expériences de mise au point d’un système créé pour identifier et mettre en sourdine les surdétections produites par le correcteur du français conçu par la société Druide informatique. Plusieurs classificateurs ont été entraînés de manière supervisée sur 14 types de détections faites par le correcteur, en employant des traits couvrant diverses informations linguistiques (dépendances et catégories syntaxiques, exploration du contexte des mots, etc.) extraites de phrases avec et sans surdétections. Huit des 14 classificateurs développés sont maintenant intégrés à la nouvelle version d’un correcteur commercial très populaire. Nos expériences ont aussi montré que les modèles de langue probabilistes, les SVM et la désambiguïsation sémantique améliorent la qualité de ces classificateurs. Ce travail est un exemple réussi de déploiement d’une approche d’apprentissage machine au service d’une application langagière grand public robuste. / Grammar checking software sometimes erroneously flags a correct word sequence as an error, a problem we call overdetection in the present study. We describe the development of a system for identifying and filtering out the overdetections produced by the French grammar checker designed by the firm Druide Informatique. Various families of classifiers have been trained in a supervised way for 14 types of detections flagged by the grammar checker, using features that capture diverse linguistic phenomena (syntactic dependency links, POS tags, word context exploration, etc.), extracted from sentences with and without overdetections. Eight of the 14 classifiers we trained are now part of the latest version of a very popular commercial grammar checker. 
Moreover, our experiments have shown that statistical language models, SVMs and word sense disambiguation can all contribute to the improvement of these classifiers. This project is a striking illustration of a machine learning component successfully integrated within a robust, commercial natural language processing application.
59

Reconnaissance de la parole pour l’aide à la communication pour les sourds et malentendants / Speech recognition as a communication aid for deaf and hearing impaired people

Orosanu, Luiza 11 December 2015
Cette thèse fait partie du projet RAPSODIE dont l’objectif est de proposer une reconnaissance vocale spécialisée sur les besoins des personnes sourdes et malentendantes. Deux axes sont étudiés : la modélisation lexicale et l’extraction d’informations para-lexicales. Concernant la modélisation lexicale, nous avons étudié les modèles de langage hybrides combinant mots et syllabes, et nous avons proposé une nouvelle approche basée sur une notion de similarité entre mots pour l’ajout de nouveaux mots dans le modèle de langage. Concernant l’extraction d’informations para-lexicales, nous avons étudié l'utilisation des paramètres prosodiques, des paramètres linguistiques ou de leur combinaison pour la détection des questions et des affirmations. Cette détection a comme but de signaler aux personnes sourdes ou malentendantes quand une question leur est adressée. / This thesis is part of the RAPSODIE project, which aims to provide a speech recognition device tailored to the needs of deaf and hearing-impaired people. Two aspects are studied: optimizing the lexical models and extracting para-lexical information. Regarding lexical modeling, we studied hybrid language models combining words and syllables, and we proposed a new approach based on a similarity measure between words for adding new words to the language model. Regarding the extraction of para-lexical information, we investigated the use of prosodic features, linguistic features, and their combination for the detection of questions and statements. This detection aims to inform deaf and hearing-impaired people when a question is addressed to them.
60

Chinese students' perception of, orientation towards and identification with English through transnational higher education

Du, Xiangping January 2009
Given the international status and importance of English, English language study has attracted millions of Chinese learners. Apart from those who study abroad, more and more Chinese students are motivated to study in English-medium Transnational Higher Education (THE) programmes inside China. English is a diversifying and fragmenting language that has various functions and can be used for different purposes. Whilst, according to many scholars, English has broken free from the ownership of ‘native English’ speakers, Chinese learners of English are still worried about conforming to ‘native-speaker models’ of English and so falling victim to an English linguistic imperialism project, driven by English-medium THE programmes. Accordingly, this research sets out to investigate the extent to which Chinese learners in a UK-affiliated THE programme in China feel the need to orientate to or identify with ‘native English’ and its speakers, and so run the risk of becoming victims of English linguistic imperialism. Results from a combination of methods (questionnaires, focus group discussions and interviews) show that students’ orientations towards and identification with English and its speakers are diverse, complex and multi-dimensional, and have gone beyond affiliation with ‘native English’ speakers. Studying in English-medium THE programmes does not necessarily lead to English linguistic imperialism, but is a process of interaction where learners may consciously mediate ‘native English’ norms and express individual, local, national or international identities, literally taking advantage of the programmes’ material benefits and deliberately learning the language for international communication. This research suggests that learners in THE programmes are conscious of the overall context individually, nationally and internationally and feel free to orientate to English in ways that are suitable for their own purposes and which represent their preferred identity.
