1 |
Low-Resource Natural Language Understanding in Task-Oriented Dialogue
Louvan, Samuel, 11 March 2022
Task-oriented dialogue (ToD) systems need to interpret the user's input to understand the user's needs (intent) and corresponding relevant information (slots). This process is performed by a Natural Language Understanding (NLU) component, which maps the text utterance into a semantic frame representation, involving two subtasks: intent classification (text classification) and slot filling (sequence tagging). Typically, new domains and languages are regularly added to the system to support more functionalities. Collecting domain-specific data and performing fine-grained annotation of large amounts of data every time a new domain and language is introduced can be expensive. Thus, developing an NLU model that generalizes well across domains and languages with less labeled data (low-resource) is crucial and remains challenging.
This thesis investigates transfer learning and data augmentation methods for low-resource NLU in ToD. Our first contribution is a study of the potential of non-conversational text as a source for transfer. Most transfer learning approaches assume labeled conversational data as the source task and adapt the NLU model to the target task. We show that leveraging similar tasks from non-conversational text improves performance on target slot filling tasks through multi-task learning in low-resource settings. Second, we propose a set of lightweight augmentation methods that transform data at the token and sentence levels through slot value substitution and syntactic manipulation. Despite their simplicity, these methods perform comparably to deep learning-based augmentation models and are effective on NLU tasks in six languages. Third, we investigate the effectiveness of domain-adaptive pre-training for zero-shot cross-lingual NLU. In terms of overall performance, continued pre-training in English is effective across languages, which indicates that the domain knowledge learned in English is transferable to other languages. Domain similarity is also essential: we show that intermediate pre-training data that is more similar to the target dataset, in terms of data distribution, yields better performance.
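To make the semantic frame and the slot value substitution idea from the abstract above concrete, here is a minimal sketch. It is not the thesis's implementation: the BIO tagging scheme, the example utterance, the intent label `book_flight`, and the helper names `extract_slots` and `substitute_slot` are all assumptions chosen for illustration. The sketch also assumes well-formed BIO tags and a non-empty replacement value.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, BIO tag); the BIO scheme is an assumption


def extract_slots(tagged: List[Token]) -> Dict[str, str]:
    """Collect slot values from a BIO-tagged utterance."""
    slots: Dict[str, str] = {}
    current = None  # name of the slot span we are currently inside
    for word, tag in tagged:
        if tag.startswith("B-"):
            current = tag[2:]
            slots[current] = word
        elif tag.startswith("I-") and current == tag[2:]:
            slots[current] += " " + word
        else:
            current = None
    return slots


def substitute_slot(tagged: List[Token], slot: str, new_value: str) -> List[Token]:
    """Return a copy of the utterance with each span of `slot` replaced
    by `new_value`, re-tagged so the labels stay aligned with the tokens."""
    words = new_value.split()
    out: List[Token] = []
    for word, tag in tagged:
        if tag == "B-" + slot:
            # Emit the replacement span where the original span began.
            out.append((words[0], "B-" + slot))
            out.extend((w, "I-" + slot) for w in words[1:])
        elif tag == "I-" + slot:
            continue  # drop the remainder of the original span
        else:
            out.append((word, tag))
    return out


# Hypothetical example utterance, not taken from any dataset in the thesis.
utterance: List[Token] = [
    ("book", "O"), ("a", "O"), ("flight", "O"), ("to", "O"),
    ("new", "B-destination"), ("york", "I-destination"),
    ("tomorrow", "B-date"),
]

# Semantic frame: one intent label plus the extracted slot values.
frame = {"intent": "book_flight", "slots": extract_slots(utterance)}

# Augmented training example with a different destination value.
augmented = substitute_slot(utterance, "destination", "rome")
```

The substitution keeps the intent label unchanged and only swaps the slot value, so the new utterance can be added to the training set with its labels already in place.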
|
2 |
Understand me, do you? : An experiment exploring the natural language understanding of two open source chatbots
Olofsson, Linnéa; Patja, Heidi, January 2021
What do you think of when you hear the word chatbot? A helpful assistant when booking flight tickets? Maybe a frustrating encounter with a company’s customer support, or smart technologies that will eventually take over your job? The field of chatbots is under constant development and bots are increasingly taking a place in our everyday life, but how well do they really understand us humans? The objective of this thesis is to investigate how capable two open source chatbots are of understanding human language when given input containing spelling errors, synonyms or faulty syntax. The study further investigates whether the bots get better at identifying the user’s intention when supplied with more training data to base their analysis on. Two different chatbot frameworks, Botpress and Rasa, were used to carry out this experiment. The two bots were created with basic configurations and trained using the same data. The chatbots underwent three rounds of training and testing, where they were given additional training and asked control questions to see if they managed to interpret the correct intent. All tests were documented and scores were calculated to create comparable data. The results from these tests showed that both chatbots performed well when it came to simpler spelling errors and syntax variations. Their understanding of more complex spelling errors was lower in the first testing phase but increased with more training data. Synonyms followed a similar pattern, but showed a minor tendency towards becoming overconfident and producing incorrect results with high confidence in the last phase. The scores pointed to both chatbots getting better at understanding the input when receiving additional training. In conclusion, both chatbots showed signs of understanding language variations when given minimal training, but got significantly better results when provided with more data.
These results show the potential to create a bot with a substantial understanding of human language, even for developers with no previous experience of creating chatbots, particularly given the vast possibilities for customising a chatbot.
|
3 |
Leveraging Sequential Nature of Conversations for Intent Classification
Gotteti, Shree, January 2021
No description available.
|
4 |
Computer Enabled Interventions to Communication and Behavioral Problems in Collaborative Work Environments
Shivakumar, Ashutosh, 23 May 2022
No description available.
|
5 |
Intent classification through conversational interfaces : Classification within a small domain
Lekic, Sasa; Liu, Kasper, January 2019
Natural language processing (NLP) and machine learning (ML) are subjects undergoing intense study. These fields are continually expanding and are more interrelated than ever before. A case in point is text classification, which is an application of ML within NLP. Although these subjects have evolved over recent years, they still have problems that need to be considered. Some relate to the computing power that their techniques require, others to how much training data they require. The research problem addressed in this thesis is the lack of knowledge on whether machine learning techniques such as Word2Vec, Bidirectional Encoder Representations from Transformers (BERT) and a support vector machine (SVM) classifier can be used for text classification when only a small training set is available. Furthermore, it is not known whether these techniques can be run on regular laptops. To address the research problem, the main purpose of this thesis was to develop two separate conversational interfaces utilizing text classification techniques. Given user input, these interfaces can recognise the intent behind it, that is, classify the input sentence within a small set of pre-defined categories. First, a conversational interface utilizing Word2Vec and an SVM classifier was developed. Second, an interface utilizing BERT and an SVM classifier was developed. The goal of the thesis was to determine whether a small dataset can be used for intent classification and with what accuracy, and whether it can be run on regular laptops. The research reported in this thesis followed a standard applied research method. The main purpose was achieved and the two conversational interfaces were developed.
Regarding the conversational interface utilizing a pre-trained Word2Vec dataset and an SVM classifier, the main results showed that it can be used for intent classification with an accuracy of 60%, and that it can be run on regular computers. Concerning the conversational interface utilizing BERT and an SVM classifier, the results showed that this interface cannot be trained and run on regular laptops: the training ran for over 24 hours and then crashed. The results showed that it is possible to build a conversational interface that is able to classify intents given only a small training set. However, due to the small training set and the consequently low accuracy, this conversational interface is not a suitable option for important tasks, but it can be used for some non-critical classification tasks.
|
6 |
Mining Behavior of Citizen Sensor Communities to Improve Cooperation with Organizational Actors
Purohit, Hemant, 01 September 2015
No description available.
|