  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
261

Giant Pigeon and Small Person: Prompting Visually Grounded Models about the Size of Objects

Yi Zhang (12438003) 22 April 2022
Empowering machines to understand our physical world should go beyond models with only natural language and models with only vision. Vision and language is a growing field of study that attempts to bridge the gap between the natural language processing and computer vision communities by enabling models to learn visually grounded language. However, as an increasing number of pre-trained visual linguistic models focus on the alignment between visual regions and natural language, it is difficult to claim that these models capture certain properties of objects, such as size, in their latent space. Inspired by recent trends in prompt learning, this study designs a prompt learning framework for two visual linguistic models, ViLBERT and ViLT, and uses different manually crafted prompt templates to evaluate how consistently these models perform when comparing the size of objects. The results show that ViLT is more consistent in prediction accuracy on the given task, with six pairs of objects under four prompt designs. However, the overall prediction accuracy on this object size comparison task is lower than expected; even the better model in this study, ViLT, outperforms the proposed random chance baseline in only 16 of 24 cases. As this is a preliminary study exploring the potential of pre-trained visual linguistic models for object size comparison, there are many directions for future work, such as investigating more models, choosing more object pairs, and trying different methods for feature engineering and prompt engineering.
262

Natural Language Document and Event Association Using Stochastic Petri Net Modeling

Mills, Michael Thomas 29 May 2013
No description available.
263

On Advancing Natural Language Interfaces: Data Collection, Model Development, and User Interaction

Yao, Ziyu January 2021
No description available.
264

Skadligt innehåll på nätet - Toxiskt språk på TikTok [Harmful Content Online - Toxic Language on TikTok]

Wester, Linn, Stenvall, Elin January 2024
Toxic language on the internet, often referred to in everyday terms as online hate, includes insults, threats and offensive language. Toxic language is particularly noticeable on social media. It can be detected with the help of machine learning, including Natural Language Processing (NLP) techniques, which automatically recognize typical characteristics of toxic language. Previous Swedish research has investigated the presence of toxic language on social media using machine learning, but there is still a lack of research on the increasingly popular platform TikTok. The purpose of this study is to investigate the prevalence and characteristics of toxic comments on TikTok using both machine learning and manual methods. The study is meant to provide a better understanding of what young people encounter in the comments on TikTok. The study applies a mixed method in a document survey of 69,895 comments. The machine learning model Hatescan was used to automatically classify the likelihood of toxic language appearing in the comments. Based on this probability, a sample of the comments was analysed manually, leading to both quantitative and qualitative findings. The results showed that the prevalence of toxic language was relatively small, with 0.24% of the 69,895 comments considered toxic according to both the automated and the manual assessment. The type of toxic language that occurred most often in the study was obscene language, the majority of which contained swear words.
265

Conversational artificial intelligence - demystifying statistical vs linguistic NLP solutions

Panesar, Kulvinder 05 October 2020
This paper aims to demystify the hype and attention surrounding chatbots and their association with conversational artificial intelligence. Both are slowly emerging as a real presence in our lives thanks to impressive technological developments in machine learning, deep learning and natural language understanding solutions. However, our question is what lies under the hood, and how far and to what extent chatbot and conversational artificial intelligence solutions can work. Natural language is the most easily understood knowledge representation for people, but certainly not the best for computers because of its inherently ambiguous, complex and dynamic nature. We critique the knowledge representation of heavily statistical chatbot solutions against linguistic alternatives. In order to react intelligently to the user, natural language solutions must critically consider other factors such as context, memory, intelligent understanding, previous experience, and personalized knowledge of the user. We delve into the spectrum of conversational interfaces and focus on a strong artificial intelligence concept. This is explored via text-based conversational software agents with a deep strategic role: to hold a conversation, to enable the mechanisms needed to plan and decide what to do next, and to manage the dialogue to achieve a goal. To demonstrate this, a deep linguistically aware and knowledge-aware text-based conversational agent (LING-CSA) presents a proof of concept of a non-statistical conversational AI solution.
266

Developing an enriched natural language grammar for prosodically-improved concept-to-speech synthesis

Marais, Laurette 04 1900
The need for interacting with machines using spoken natural language is growing, along with the expectation that synthetic speech in this context sound natural. Such interaction includes answering questions, where prosody plays an important role in producing natural English synthetic speech by communicating the information structure of utterances. CCG is a theoretical framework that exploits the notion that, in English, information structure, prosodic structure and syntactic structure are isomorphic. This provides a way to convert a semantic representation of an utterance into a prosodically natural spoken utterance. GF is a framework for writing grammars, where abstract tree structures capture the semantic structure and concrete grammars render these structures in linearised strings. This research combines these frameworks to develop a system that converts semantic representations of utterances into linearised strings of natural language that are marked up to inform the prosody-generating component of a speech synthesis system. / Computing / M. Sc. (Computing)
267

JSreal : un réalisateur de texte pour la programmation web [JSreal: a text realiser for web programming]

Daoust, Nicolas 09 1900
Website associated with the thesis: http://daou.st/JSreal / Natural language generation, a branch of artificial intelligence, studies the development of systems that produce text for different applications, for example the textual description of massive datasets or the automation of routine text writing. A text generation project consists of several major steps: determining the content to be expressed, organising it into structures such as paragraphs and sentences, and producing character strings for a human reader. This last step is called realisation, and it is the step this thesis takes on. The web is a constantly growing platform whose contents, increasingly dynamic, often lend themselves well to automation by a realiser. However, existing realisers are not designed with the web in mind and require considerable knowledge to operate, which complicates their use. This master's thesis presents JSreal, a realiser designed specifically for the web that is easy to learn and use. JSreal allows its user to build a variety of French expressions and sentences that respect the rules of grammar and syntax, to add HTML tags to them, and to integrate them easily into web pages.
268

MULTILINGUAL CYBERBULLYING DETECTION SYSTEM

Rohit Sidram Pawar (6613247) 11 June 2019
As the use of social media has evolved, the ability of its users to bully others has increased. One of the prevalent forms of bullying is cyberbullying, which occurs on social media sites such as Facebook©, WhatsApp©, and Twitter©. The past decade has witnessed a growth in cyberbullying, a form of bullying that occurs virtually through electronic devices, for example via messaging, e-mail, online gaming, social media, or images and messages sent to a mobile phone. This bullying is not limited to the English language; it occurs in other languages as well. Hence, it is of the utmost importance to detect cyberbullying in multiple languages. Since current approaches to identifying cyberbullying are mostly focused on English-language texts, this thesis proposes a new approach (called the Multilingual Cyberbullying Detection System) for the detection of cyberbullying in multiple languages (English, Hindi, and Marathi). It uses two techniques, namely machine learning-based and lexicon-based, to classify the input data as bullying or non-bullying. The aim of this research is not only to detect cyberbullying but also to provide a distributed infrastructure for detecting it. We have developed multiple prototypes (standalone, collaborative, and cloud-based) and carried out experiments with them to detect cyberbullying on different datasets from multiple languages. The outcomes of our experiments show that the machine learning model outperforms the lexicon-based model in all the languages. In addition, the results show that collaboration techniques can help to improve the accuracy of a poor-performing node in the system. Finally, we show that the cloud-based configurations performed better than the local configurations.
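The lexicon-based technique mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not describe the actual lexicon, tokenisation, or decision rule, so everything here is an assumption: the word list `BULLYING_LEXICON`, the function name `lexicon_classify`, and the hit-count threshold are hypothetical and only show the general idea of matching tokens against a curated list.

```python
import string

# Hypothetical English lexicon; a real system would use curated lists per language.
BULLYING_LEXICON = {"idiot", "loser", "stupid"}


def lexicon_classify(text, lexicon=BULLYING_LEXICON, threshold=1):
    """Label text as 'bullying' when at least `threshold` tokens are in the lexicon."""
    # Split on whitespace, lowercase, and strip ASCII punctuation from each token.
    tokens = [token.strip(string.punctuation) for token in text.lower().split()]
    hits = sum(1 for token in tokens if token in lexicon)
    return "bullying" if hits >= threshold else "non-bullying"
```

A lexicon rule like this is transparent and needs no training data, but it cannot weigh context, which may be one reason a learned model could outperform it as the abstract reports.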
269

A study of the use of natural language processing for conversational agents

Wilkens, Rodrigo Souza January 2016
Language is a mark of humanity and consciousness, with conversation (or dialogue) being one of the most fundamental forms of communication that we learn as children. Therefore, one way to make a computer more attractive for interaction with users is through the use of natural language. Among the systems developed with some degree of language capability, the Eliza chatterbot is probably the first with a focus on dialogue. In order to make the interaction more interesting and useful to the user, there are other applications besides chatterbots, such as conversational agents. These agents generally have, to some degree, properties such as: a body (with cognitive states, including beliefs, desires and intentions or objectives); an interactive embodiment in the real or virtual world (including perception of events, communication, and the ability to manipulate the world and communicate with other agents); and human-like behavior (including affective abilities). This type of agent has been called by several names, including animated agents or embodied conversational agents (ECA). A dialogue system has six basic components. (1) The speech recognition component is responsible for translating the user's speech into text. (2) The natural language understanding component produces a semantic representation suitable for dialogues, usually using grammars and ontologies. (3) The task manager chooses the concepts to be expressed to the user. (4) The natural language generation component defines how to express these concepts in words. (5) The dialogue manager controls the structure of the dialogue. (6) The speech synthesizer is responsible for translating the agent's response into speech. However, there is no consensus about the resources needed to develop conversational agents or about the difficulty involved (especially in resource-poor languages). This work focuses on the influence of the natural language components (understanding and dialogue management) and analyses in particular the use of syntactic analysis systems (parsers) as part of developing conversational agents with more flexible language capabilities. It analyses which parser resources contribute to conversational agents and discusses how to develop them, targeting Portuguese, a resource-poor language. To do so, we analyse approaches to natural language understanding and identify parsing approaches that offer good performance. Based on this analysis, we develop a prototype to evaluate the impact of using a parser in a conversational agent.
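The six components listed in this abstract can be sketched as a simple pipeline. This is only an illustration under stated assumptions: the abstract does not specify how the components are wired together, so the call order in `respond`, the class name `DialogueSystem`, and the stub lambdas standing in for each component are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class DialogueSystem:
    """Six pluggable components; the wiring order below is an assumption."""
    recognize_speech: Callable[[bytes], str]       # (1) speech to text
    understand: Callable[[str], dict]              # (2) text to semantic representation
    manage_task: Callable[[dict], list]            # (3) choose concepts to express
    generate_text: Callable[[list], str]           # (4) concepts to words
    manage_dialogue: Callable[[dict, list], dict]  # (5) track dialogue structure
    synthesize: Callable[[str], bytes]             # (6) text to speech
    history: list = field(default_factory=list)

    def respond(self, audio: bytes) -> bytes:
        text = self.recognize_speech(audio)
        meaning = self.understand(text)
        state = self.manage_dialogue(meaning, self.history)
        concepts = self.manage_task(state)
        reply = self.generate_text(concepts)
        self.history.append((text, reply))
        return self.synthesize(reply)


# Toy stand-ins: "audio" is just UTF-8 bytes so the sketch runs without a speech stack.
system = DialogueSystem(
    recognize_speech=lambda audio: audio.decode("utf-8"),
    understand=lambda text: {"intent": "greet" if "hello" in text.lower() else "unknown"},
    manage_task=lambda state: ["greeting"] if state["intent"] == "greet" else ["clarification"],
    generate_text=lambda concepts: "Hello!" if concepts == ["greeting"] else "Could you rephrase that?",
    manage_dialogue=lambda meaning, history: {**meaning, "turn": len(history) + 1},
    synthesize=lambda reply: reply.encode("utf-8"),
)
```

Keeping each component behind a plain callable makes it possible to swap one stage, for example replacing the `understand` stub with a parser-based implementation, without touching the others.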
270

Biomedical Concept Association and Clustering Using Word Embeddings

Setu Shah (5931128) 12 February 2019
Biomedical data exists in the form of journal articles, research studies, electronic health records, care guidelines, etc. While text mining and natural language processing tools have been widely employed across various domains, they are just taking off in the healthcare space.
A primary hurdle in building artificial intelligence models that use biomedical data is the limited amount of labelled data available. Since most models rely on supervised or semi-supervised methods, generating large amounts of pre-processed labelled data for training becomes extremely costly. Even for datasets that are labelled, the lack of normalization of biomedical concepts further affects the quality of the results and limits the application to a restricted dataset. This affects the reproducibility of results and techniques across datasets, making it difficult to deploy research solutions to improve healthcare services.
The research presented in this thesis focuses on reducing the need to create labels for biomedical text mining by using unsupervised recurrent neural networks. The proposed method utilizes word embeddings to generate vector representations of biomedical concepts based on semantics and context. Experiments with unsupervised clustering of these biomedical concepts show that concepts that are similar to each other are clustered together. While this clustering captures different synonyms of the same concept, it also captures the similarities between various diseases and their associated symptoms.
To test the performance of the concept vectors on corpora of documents, a document vector generation method that utilizes these concept vectors is also proposed. The document vectors thus generated are used as input to clustering algorithms, and the results show that across multiple corpora, the proposed methods of concept and document vector generation outperform the baselines and provide more meaningful clustering. This document clustering has broad applications, especially in the search and retrieval space, providing clinicians, researchers and patients with more holistic and comprehensive results than relying on the exact term that they search for.
Finally, a framework is presented for extracting clinical information from preventive care guidelines that can be mapped to electronic health records. The extracted information can be integrated with the clinical decision support system of an electronic health record. A visualization tool to better understand and observe patient trajectories is also explored. Both of these methods have the potential to improve the preventive care services provided to patients.
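The idea of building document vectors from concept vectors can be sketched as follows. The abstract does not say how the thesis combines concept vectors, so simple averaging is an assumption made here purely for illustration, as are the three-dimensional toy vectors in `CONCEPT_VECTORS` and the function names.

```python
import math

# Hypothetical toy embeddings; real concept vectors would be learned from a corpus.
CONCEPT_VECTORS = {
    "fever": [0.9, 0.1, 0.0],
    "influenza": [0.8, 0.2, 0.1],
    "fracture": [0.0, 0.1, 0.9],
}


def document_vector(concepts, vectors=CONCEPT_VECTORS):
    """Average the vectors of the known concepts in a document (assumed method)."""
    known = [vectors[c] for c in concepts if c in vectors]
    if not known:
        raise ValueError("document contains no known concepts")
    dims = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dims)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```

With vectors like these, a document mentioning "fever" and "influenza" ends up closer to one mentioning only "fever" than to one about "fracture", which is the property a clustering algorithm would rely on.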
