Global ETD Search

31	Prototyputveckling för skalbar motor med förståelse för naturligt språk / Prototype development for a scalable engine with natural language understanding Galdo, Carlos, Chavez, Teddy January 2018 Förståelse för naturligt språk, språk som har utvecklats av människan ex. talspråk eller teckenspråk, är en del av språkteknik. Det är ett brett ämnesområde där utvecklingen har gått fram i snabb takt senaste 20 åren. En bidragande faktor till denna utveckling är framgångarna med neurala nätverk som är en matematisk modell inspirerad av biologiska hjärnor. Förståelse för naturligt språk används inom många områden där det krävs att applikationer förstår innebörden av textinmatning. Exempel på applikationer som använder förståelse för naturligt språk är Google translate, Googles sökmotor och rättstavningsfunktionen i textredigerarprogram. A Great Thing AB har utvecklat applikationen Thing Launcher. Thing Launcher är en applikation som hanterar andra applikationer med hjälp av användarens olika kriterier i samband med mobilens olika funktionaliteter som väder, geografisk position, tid mm. Ett exempel kan vara att användaren vill att Spotify ska spela en specifik låt när användaren kommer hem, eller att en taxi ska vara på plats när användaren anländer till en geografisk position. I dagsläget styr man Thing Launcher med hjälp av textinmatningar. A Great Thing AB behöver hjälp att ta fram en prototyp på en motor med förståelse för naturligt språk som kan styras av både textinmatning och röstinmatning. Motorn ska användas i applikationen Thing Launcher. Med skalbarhet menas att motorn ska kunna utvecklas, att nya funktioner och applikationer ska kunna läggas till, samtidigt som systemet ska kunna vara i drift och att prestandan påverkas så lite som möjligt. Detta examensarbete har som syfte att undersöka vilka algoritmer som är lämpliga för att bygga en skalbar motor med förståelse av naturligt språk. Utifrån detta utveckla en prototyp. 
En litteraturstudie gjordes mellan dolda Markovmodeller och neurala nätverk. Resultatet visade att neurala nätverk var överlägset i förståelse av naturligt språk. Flera typer av neurala nätverk finns implementerade i TensorFlow och det är mycket flexibelt med sitt breda utbud av kompatibla mobila enheter, vilket gynnar utvecklingen med den modulära aspekten och därför valdes detta som ramverk för att utveckla prototypen. De två viktigaste komponenterna i prototypen bestod av Command tagger, som ska kunna identifiera vilken applikation som användaren vill styra och NER tagger, som ska identifiera vad användaren vill att applikationen ska utföra. För att mäta träffsäkerheten utfördes det två tester, en för respektive tagger, flera gånger som mätte hur ofta komponenterna gissade rätt efter varje träningsrunda. Varje träningsrunda bestod av att komponenterna fick tiotusentals meningar som de fick gissa på följt av facit för att ge feedback. Med hjälp av feedback kunde komponenterna anpassas för hur de agerar i framtiden i samma situation. Command tagger gissade rätt 94 procent av gångerna och NER tagger gissade rätt 96 procent av gångerna efter de sista träningsrundorna. I prototypen användes Androids inbyggda mjukvara för taligenkänning. Det är en funktion som omvandlar ljudvågor till text. En serverbaserad lösning med REST applikationsgränssnitt utvecklades för att göra motorn skalbar. Resultatet är en fungerande prototyp som kan vidareutvecklas till en skalbar motor för naturligt språk. / Natural Language Understanding is a field within Natural Language Processing. Big improvements have been made in the broad field of Natural Language Understanding during the past two decades. One major contributor to this improvement is neural networks, a mathematical model inspired by biological brains. Natural Language Understanding is used in fields where applications need a deeper understanding of their input. 
Google Translate, the Google search engine and grammar/spelling checkers are some examples of applications requiring deeper understanding. Thing Launcher is an application developed by A Great Thing AB. It manages other applications according to parameters set by the user, such as geographic position and time. For example, the user can choose which song is played on arriving home, or have an Uber ordered on arriving at a certain destination. Today Thing Launcher is controlled by text input. A Great Thing AB needs help developing a prototype capable of understanding both text input and speech. Scalable here means that the engine can be developed further, with new functions and applications added, with as little impact as possible on the uptime and performance of the service. This thesis compares suitable algorithms, tools and frameworks in order to investigate what it takes to develop a scalable engine with natural language understanding, and then builds a prototype based on the findings. A theoretical comparison was made between Hidden Markov Models and neural networks. The results showed that neural networks are superior in the field of natural language understanding. The tests made in this thesis indicated that high accuracy could be achieved using neural networks. The TensorFlow framework was chosen because it provides many types of neural network implemented in C/C++ and ready to be used from Python, and because of its wide compatibility with mobile devices. The prototype should be able to identify voice commands. The prototype has two important components: the Command tagger, which identifies which application the user wants to control, and the NER tagger, which identifies what the user wants the application to do. 
To calculate the accuracy, two types of tests, one for each component, were executed several times to measure how often the components guessed right after each training iteration. Each training iteration consisted of giving the components thousands of sentences to guess and then giving them feedback in the form of the right answers. With the help of this feedback, the components were adjusted to act correctly in situations like those seen in training. After the final training iterations, the Command tagger guessed right 94% of the time and the NER tagger guessed right 96% of the time. The built-in Android software was used for speech recognition, a function that converts sound waves to text. A server-based solution with a REST interface was developed to make the engine scalable. The thesis resulted in a working prototype that can be further developed into a scalable engine. natural language understanding natural language neural network NLU natural language processing speech recognition hidden Markov model naturligt språk neurala nätverk NLU språkteknik taligenkänning dold Markovmodell Software Engineering Programvaruteknik
32	Bidirectional Encoder Representations from Transformers (BERT) for Question Answering in the Telecom Domain: Adapting a BERT-like language model to the telecom domain using the ELECTRA pre-training approach / BERT för frågebesvaring inom telekomdomänen : Anpassning till telekomdomänen av en BERT-baserad språkmodell genom ELECTRA-förträningsmetoden Holm, Henrik January 2021 The Natural Language Processing (NLP) research area has seen notable advancements in recent years, one being the ELECTRA model, which improves the sample efficiency of BERT pre-training by introducing a discriminative pre-training approach. Most publicly available language models are trained on general-domain datasets. Thus, research is lacking for niche domains with domain-specific vocabulary. In this paper, the process of adapting a BERT-like model to the telecom domain is investigated. For efficiency in training the model, the ELECTRA approach is selected. For measuring target-domain performance, the Question Answering (QA) downstream task within the telecom domain is used. Three domain adaption approaches are considered: (1) continued pre-training on telecom-domain text starting from a general-domain checkpoint, (2) pre-training on telecom-domain text from scratch, and (3) pre-training from scratch on a combination of general-domain and telecom-domain text. Findings indicate that approach 1 is both inexpensive and effective, as target-domain performance increases are seen already after small amounts of training, while generalizability is retained. Approach 2 shows the highest performance on the target-domain QA task by a wide margin, albeit at the expense of generalizability. Approach 3 combines the benefits of the former two by achieving good performance on QA both in the general domain and the telecom domain. At the same time, it allows for a tokenization vocabulary well-suited for both domains. 
In conclusion, the suitability of a given domain adaption approach is shown to depend on the available data and computational budget. Results highlight the clear benefits of domain adaption, even when the QA task is learned through behavioral fine-tuning on a general-domain QA dataset due to insufficient amounts of labeled target-domain data being available. / Dubbelriktade språkmodeller som BERT har på senare år nått stora framgångar inom språkteknologiområdet. Flertalet vidareutvecklingar av BERT har tagits fram, bland andra ELECTRA, vars nyskapande diskriminativa träningsprocess förkortar träningstiden. Majoriteten av forskningen inom området utförs på data från den allmänna domänen. Med andra ord finns det utrymme för kunskapsbildning inom domäner med områdesspecifikt språk. I detta arbete utforskas metoder för att anpassa en dubbelriktad språkmodell till telekomdomänen. För att säkerställa hög effektivitet i förträningsstadiet används ELECTRA-modellen. Uppnådd prestanda i måldomänen mäts med hjälp av ett frågebesvaringsdataset för telekom-området. Tre metoder för domänanpassning undersöks: (1) fortsatt förträning på text från telekom-området av en modell förtränad på den allmänna domänen; (2) förträning från grunden på telekom-text; samt (3) förträning från grunden på en kombination av text från telekom-området och den allmänna domänen. Experimenten visar att metod 1 är både kostnadseffektiv och fördelaktig ur ett prestanda-perspektiv. Redan efter kort fortsatt förträning kan tydliga förbättringar inom frågebesvaring inom måldomänen urskiljas, samtidigt som generaliserbarhet kvarhålls. Tillvägagångssätt 2 uppvisar högst prestanda inom måldomänen, om än med markant sämre förmåga att generalisera. Metod 3 kombinerar fördelarna från de tidigare två metoderna genom hög prestanda dels inom måldomänen, dels inom den allmänna domänen. Samtidigt tillåter metoden användandet av ett tokenizer-vokabulär väl anpassat för båda domäner. 
Sammanfattningsvis bestäms en domänanpassningsmetods lämplighet av den respektive situationen och datan som tillhandahålls, samt de tillgängliga beräkningsresurserna. Resultaten påvisar de tydliga vinningar som domänanpassning kan ge upphov till, även då frågebesvaringsuppgiften lärs genom träning på ett dataset hämtat ur den allmänna domänen på grund av otillräckliga mängder frågebesvaringsdata inom måldomänen. Deep Learning Natural Language Understanding Transformers Language Models Representation Learning Domain Adaption Representationsinlärning Djupinlärning Språkteknologi Transformatorer Språkmodeller Domänanpassning Computer and Information Sciences Data- och informationsvetenskap
33	A Framework to Understand Emoji Meaning: Similarity and Sense Disambiguation of Emoji using EmojiNet Wijeratne, Sanjaya January 2018 No description available. Artificial Intelligence Computer Science Computer Engineering Sociolinguistics Emoji EmojiNet Emoji Similarity Emoji Sense Disambiguation Emoji Understanding Emoji Research Twitter Word Embedding Social Media Linguistics Natural Language Processing Machine Learning Natural Language Understanding Unicode Emoji Semiotics
34	Question-answering chatbot for Northvolt IT Support Hjelm, Daniel January 2023 Northvolt is a Swedish battery manufacturing company that specializes in the production of sustainable lithium-ion batteries for electric vehicles and energy storage systems. Established in 2016, the company has experienced significant growth in recent years. This growth has presented a major challenge for the IT Support team, as they face a substantial volume of IT-related inquiries. To address this challenge and allow the IT Support team to concentrate on more complex support tasks, a question-answering chatbot has been implemented as part of this thesis project. The chatbot has been developed using the Microsoft Bot Framework and leverages Microsoft cloud services, specifically Azure Cognitive Services, to provide intelligent and cognitive capabilities for answering employee questions directly within Microsoft Teams. The chatbot has undergone testing by a diverse group of employees from various teams within the organization and was evaluated based on three key metrics: effectiveness (including accuracy, precision, and intent recognition rate), efficiency (including response time and scalability), and satisfaction. The test results indicate that the accuracy, precision, and intent recognition rate fall below the required thresholds for production readiness. However, these metrics can be improved by expanding the knowledge base of the bot. The chatbot demonstrates impressive efficiency in terms of response time and scalability, and its user-friendly nature contributes to a positive user experience. Users express high levels of satisfaction with their interactions with the bot, and the majority would recommend it to their colleagues, recognizing it as a valuable service solution that will benefit all employees at Northvolt in the future. 
Moving forward, the primary focus should be on expanding the knowledge base and effectively communicating the bot’s purpose and scope to enhance effectiveness and satisfaction. Additionally, integrating the bot with advanced AI features, such as OpenAI’s language models available within Microsoft’s ecosystem, would elevate the bot to the next level. Artificial intelligence Chatbot Natural language processing Natural language understanding Machine learning Deep learning Transformer Question answering Conversational agents Conversational AI Computer Sciences Datavetenskap (datalogi)
35	The Effect of Data Quantity on Dialog System Input Classification Models / Datamängdens effekt på modeller för avsiktsklassificering i chattkonversationer Lipecki, Johan, Lundén, Viggo January 2018 This paper researches how different amounts of data affect different word vector models for classification of dialog system user input. A hypothesis is tested that there is a data threshold for dense vector models to reach the state-of-the-art performance that has been shown in recent research, and that character-level n-gram word-vector classifiers are especially suited for Swedish classifiers, because of compounding and the ability of character-level n-gram models to vectorize out-of-vocabulary words. A second hypothesis is also put forward: that models trained with single statements are more suitable for chat user input classification than models trained with full conversations. The results support neither of our hypotheses but show that sparse vector models perform very well on the binary classification tasks used. Further, the results show that 799,544 words of data is insufficient for training dense vector models, but that training the models with full conversations is sufficient for single-statement classification, as the single-statement-trained models do not show any improvement in classifying single statements. / Detta arbete undersöker hur olika datamängder påverkar olika slags ordvektormodeller för klassificering av indata till dialogsystem. Hypotesen att det finns ett tröskelvärde för träningsdatamängden där täta ordvektormodeller når den högsta moderna utvecklingsnivån samt att n-gram-ordvektor-klassificerare med bokstavs-noggrannhet lämpar sig särskilt väl för svenska klassificerare söks bevisas med stöd i att sammansättningar är särskilt produktiva i svenskan och att bokstavs-noggrannhet i modellerna gör att tidigare osedda ord kan klassificeras. 
Dessutom utvärderas hypotesen att klassificerare som tränas med enkla påståenden är bättre lämpade att klassificera indata i chattkonversationer än klassificerare som tränats med hela chattkonversationer. Resultaten stödjer ingendera hypotes utan visar istället att glesa vektormodeller presterar väldigt väl i de genomförda klassificeringstesterna. Utöver detta visar resultaten att datamängden 799 544 ord inte räcker till för att träna täta ordvektormodeller väl men att konversationer räcker gott och väl för att träna modeller för klassificering av frågor och påståenden i chattkonversationer, detta eftersom de modeller som tränats med användarindata, påstående för påstående, snarare än hela chattkonversationer, inte resulterar i bättre klassificerare för chattpåståenden. Chatbot Chatterbot Virtual Assistant Dialog System Natural Language Understanding Word Embedding Word Vector Models Text Classification Chattbot Virtuell Assistent Dialogsystem Naturlig språkbehandling Ordinbäddning Ordvektormodeller Textklassificering
36	Mobilní personální asistenti / Mobile personal assistants Techl, Jan January 2013 This thesis focuses on the analysis, definition and description of mobile personal assistants as a phenomenon emerging in the past few years. Mobile personal assistants are first discussed in the context of computational linguistics and information needs, which are one of the motivations for using them. The main interest of this thesis is an introduction to the core technologies for natural language communication between the assistant and its user, followed by an introduction to host environments and possible uses. The thesis also presents the limitations and risks of using such assistants, which in some ways affect their usability. Besides the analysis, the main focus is on the design and implementation of a natural language understanding (NLU) system that can be used in a particular personal assistant application. This system is implemented as a web service and consists of an annotation scheme with a set of components. The results show that the system architecture and the tools used are a suitable solution for constructing a basic NLU system, which has been created and which complies with the requested parameters. It is still a difficult task to achieve high precision, which depends on many factors, including the amount of training data, which was very small in this case. However, the resulting application is a solid starting point for further development and extensions.
37	Speech-To-Model: A Framework for Creating Software Models Using Voice Commands Bhandari, Nabin 21 July 2023 No description available. Computer Science Engineering Voice-driven Software Modeling VDSM UML IML Natural Language Understanding NLU Natural Language Processing NLP Voice commands Rasa Regular Expression Regex Voice-driven Modeling VDM
