Global ETD Search

71	Algorithms for assessing the quality and difficulty of multiple choice exam questions Luger, Sarah Kaitlin Kelly January 2016 (has links) Multiple Choice Questions (MCQs) have long been the backbone of standardized testing in academia and industry. Correspondingly, there is a constant need for the authors of MCQs to write and refine new questions for new versions of standardized tests as well as to support measuring performance in the emerging massive open online courses, (MOOCs). Research that explores what makes a question difficult, or what questions distinguish higher-performing students from lower-performing students can aid in the creation of the next generation of teaching and evaluation tools. In the automated MCQ answering component of this thesis, algorithms query for definitions of scientific terms, process the returned web results, and compare the returned definitions to the original definition in the MCQ. This automated method for answering questions is then augmented with a model, based on human performance data from crowdsourced question sets, for analysis of question difficulty as well as the discrimination power of the non-answer alternatives. The crowdsourced question sets come from PeerWise, an open source online college-level question authoring and answering environment. The goal of this research is to create an automated method to both answer and assesses the difficulty of multiple choice inverse definition questions in the domain of introductory biology. The results of this work suggest that human-authored question banks provide useful data for building gold standard human performance models. The methodology for building these performance models has value in other domains that test the difficulty of questions and the quality of the exam takers. 006.3
72	Facilitating Efficient Information Seeking in Social Media January 2017 (has links) abstract: Online social media is popular due to its real-time nature, extensive connectivity and a large user base. This motivates users to employ social media for seeking information by reaching out to their large number of social connections. Information seeking can manifest in the form of requests for personal and time-critical information or gathering perspectives on important issues. Social media platforms are not designed for resource seeking and experience large volumes of messages, leading to requests not being fulfilled satisfactorily. Designing frameworks to facilitate efficient information seeking in social media will help users to obtain appropriate assistance for their needs and help platforms to increase user satisfaction. Several challenges exist in the way of facilitating information seeking in social media. First, the characteristics affecting the user’s response time for a question are not known, making it hard to identify prompt responders. Second, the social context in which the user has asked the question has to be determined to find personalized responders. Third, users employ rhetorical requests, which are statements having the syntax of questions, and systems assisting information seeking might be hindered from focusing on genuine questions. Fouth, social media advocates of political campaigns employ nuanced strategies to prevent users from obtaining balanced perspectives on issues of public importance. Sociological and linguistic studies on user behavior while making or responding to information seeking requests provides concepts drawing from which we can address these challenges. We propose methods to estimate the response time of the user for a given question to identify prompt responders. We compute the question specific social context an asker shares with his social connections to identify personalized responders. We draw from theories of political mobilization to model the behaviors arising from the strategies of people trying to skew perspectives. We identify rhetorical questions by modeling user motivations to post them. / Dissertation/Thesis / Doctoral Dissertation Electrical Engineering 2017 Social research Computer science Elections Information Seeking Question Answering Social Search
73	Representing, Reasoning and Answering Questions about Biological Pathways Various Applications January 2014 (has links) abstract: Biological organisms are made up of cells containing numerous interconnected biochemical processes. Diseases occur when normal functionality of these processes is disrupted, manifesting as disease symptoms. Thus, understanding these biochemical processes and their interrelationships is a primary task in biomedical research and a prerequisite for activities including diagnosing diseases and drug development. Scientists studying these interconnected processes have identified various pathways involved in drug metabolism, diseases, and signal transduction, etc. High-throughput technologies, new algorithms and speed improvements over the last decade have resulted in deeper knowledge about biological systems, leading to more refined pathways. Such pathways tend to be large and complex, making it difficult for an individual to remember all aspects. Thus, computer models are needed to represent and analyze them. The refinement activity itself requires reasoning with a pathway model by posing queries against it and comparing the results against the real biological system. Many existing models focus on structural and/or factoid questions, relying on surface-level information. These are generally not the kind of questions that a biologist may ask someone to test their understanding of biological processes. Examples of questions requiring understanding of biological processes are available in introductory college level biology text books. Such questions serve as a model for the question answering system developed in this thesis. Thus, the main goal of this thesis is to develop a system that allows the encoding of knowledge about biological pathways to answer questions demonstrating understanding of the pathways. To that end, a language is developed to specify a pathway and pose questions against it. Some existing tools are modified and used to accomplish this goal. The utility of the framework developed in this thesis is illustrated with applications in the biological domain. Finally, the question answering system is used in real world applications by extracting pathway knowledge from text and answering questions related to drug development. / Dissertation/Thesis / Ph.D. Computer Science 2014 Computer science Artificial intelligence biological pathways knowledge representation question answering reasoning
74	Content Abuse and Privacy Concerns in Online Social Networks Kayes, Md Imrul 16 November 2015 (has links) Online Social Networks (OSNs) have seen an exponential growth over the last decade, with Facebook having more than 1.49 billion monthly active users and Twitter having 135,000 new users signing up every day as of 2015. Users are sharing 70 million photos per day on the Instagram photo-sharing network. Yahoo Answers question-answering community has more than 1 billion posted answers. The meteoric rise in popularity has made OSNs important social platforms for computer-mediated communications and embedded themselves into society’s daily life, with direct consequences to the offline world and activities. OSNs are built on a foundation of trust, where users connect to other users with common interests or overlapping personal trajectories. They leverage real-world social relationships and/or common preferences, and enable users to communicate online by providing them with a variety of interaction mechanisms. This dissertation studies abuse and privacy in online social networks. More specifically, we look at two issues: (1) the content abusers in the community question answering (CQA) social network and, (2) the privacy risks that comes from the default permissive privacy settings of the OSNs. Abusive users have negative consequences for the community and its users, as they decrease the community’s cohesion, performance, and participation. We investigate the reporting of 10 million editorially curated abuse reports from 1.5 million users in Yahoo Answers, one of the oldest, largest, and most popular CQA platforms. We characterize the contribution and position of the content abusers in Yahoo Answers social networks. Based on our empirical observations, we build machine learning models to predict such users. Users not only face the risk of exposing themselves to abusive users or content, but also face leakage risks of their personal information due to weak and permissive default privacy policies. We study the relationship between users’ privacy concerns and their engagement in Yahoo Answers social networks. We find privacy-concerned users have higher qualitative and quantitative contributions, show higher retention, report more abuses, have higher perception on answer quality and have larger social circles. Next, we look at users’ privacy concerns, abusive behavior, and engagement through the lenses of national cultures and discover cross-cultural variations in CQA social networks. However, our study in Yahoo Answers reveals that the majority of users (about 87%) do not change the default privacy policies. Moreover, we find a similar story in a different type of social network (blogging): 92% bloggers’ do not change their default privacy settings. These results on default privacy are consistent with general-purpose social networks (such as Facebook) and warn about the importance of user-protecting default privacy settings. We model and implement default privacy as contextual integrity in OSNs. We present a privacy framework, Aegis, and provide a reference implementation. Aegis models expected privacy as contextual integrity using semantic web tools and focuses on defining default privacy policies. Finally, this dissertation presents a comprehensive overview of the privacy and security attacks in the online social networks projecting them in two directions: attacks that exploit users’ personal information and declared social relationships for unintended purposes; and attacks that are aimed at the OSN service provider itself, by threatening its core business. Community Question Answering Crowdsourcing User Behavior Cross-cultural Variations Contextual Integrity Computer Sciences
75	Un système de question-réponse simple appliqué à SQuAD Elbaz, Ilan 03 1900 (has links) La tâche de question-réponse (Question-Answering, QA) est bien ancrée dans la communauté de Traitement Automatique du Langage Naturel (TALN) depuis de nombreuses années. De manière générale, celle-ci consiste à répondre à des questions données à l’aide de documents (textuels ou autres) ou de conversations en faisant au besoin usage de connaissances et en mettant en oeuvre des mécanismes d’inférence. Ainsi, dépendamment du jeu de données et de la tâche lui étant associée, il faut que le système puisse détecter et comprendre les éléments utiles pour répondre correctement à chacune des questions posées. De nombreux progrès ont été réalisés depuis quelques années avec des modèles neuronaux de plus en plus complexes, ces derniers sont cependant coûteux en production, et relativement opaques. Du à leur opacité, il est difficile d’anticiper avec précision le comportement de certains modèles et d’ainsi prévoir quand ces systèmes vont retourner de mauvaises réponses. Contrairement à la très grande majorité des systèmes proposés actuellement, nous allons dans ce mémoire tenter de résoudre cette tâche avec des modèles de taille contrôlable, on s’intéressera principalement aux approches basées sur les traits (features). Le but visé en restreignant la taille des modèles est qu’ils généralisent mieux. On pourra alors mesurer ce que ces modèles capturent afin d’évaluer la granularité de leur "compréhension" de la langue. Aussi, en analysant les lacunes de modèles de taille contrôlable, on pourra mettre en valeur ce que des modèles plus complexes ont capturé. Pour réaliser notre étude, on s’évalue ici sur SQuAD: un jeu de données populaire proposé par l’Université Standford. / The Question-Answering task (QA) is a well established Natural Language Processing (NLP) task. Generally speaking, it consists in answering questions using documents (textual or otherwise) or conversations, making use of knowledge if necessary and implementing inference mechanisms. Thus, depending on the data set and the task associated with it, the system must be able to detect and understand the useful elements to correctly answer each of the questions asked. A lot of progress has been made in recent years with increasingly complex neural models. They are however expensive in production, and relatively opaque. Due to this opacity, it is diÿcult to accurately predict the behavior of some models and thus, to predict when these systems will return wrong answers. Unlike the vast majority of systems currently proposed, in this thesis we will try to solve this task with models with controllable size. We will focus mainly on feature-based approaches. The goal in restricting the size of the models is that they generalize better. So we will measure what these models capture in order to assess the granularity of their "understanding" of the language. Also, by analyzing the gaps of controllable size models, we will be able to highlight what more complex models have captured. To carry out our study, we evaluate ourselves here on SQuAD: a popular data set o˙ered by Standford University. Question-Réponse SQuAD Question-Answering
76	Question Answering System in a Business Intelligence Context / Système de questions/réponses dans un contexte de business intelligence Kuchmann-Beauger, Nicolas 15 February 2013 (has links) Le volume et la complexité des données générées par les systèmes d’information croissent de façon singulière dans les entrepôts de données. Le domaine de l’informatique décisionnelle (aussi appelé BI) a pour objectif d’apporter des méthodes et des outils pour assister les utilisateurs dans leur tâche de recherche d’information. En effet, les sources de données ne sont en général pas centralisées, et il est souvent nécessaire d’interagir avec diverses applications. Accéder à l’information est alors une tâche ardue, alors que les employés d’une entreprise cherchent généralement à réduire leur charge de travail. Pour faire face à ce constat, le domaine « Enterprise Search » s’est développé récemment, et prend en compte les différentes sources de données appartenant aussi bien au réseau privé d’entreprise qu’au domaine public (telles que les pages Internet). Pourtant, les utilisateurs de moteurs de recherche actuels souffrent toujours de du volume trop important d’information à disposition. Nous pensons que de tels systèmes pourraient tirer parti des méthodes du traitement naturel des langues associées à celles des systèmes de questions/réponses. En effet, les interfaces en langue naturelle permettent aux utilisateurs de rechercher de l’information en utilisant leurs propres termes, et d’obtenir des réponses concises et non une liste de documents dans laquelle l’éventuelle bonne réponse doit être identifiée. De cette façon, les utilisateurs n’ont pas besoin d’employer une terminologie figée, ni de formuler des requêtes selon une syntaxe très précise, et peuvent de plus accéder plus rapidement à l’information désirée. Un challenge lors de la construction d’un tel système consiste à interagir avec les différentes applications, et donc avec les langages utilisés par ces applications d’une part, et d’être en mesure de s’adapter facilement à de nouveaux domaines d’application d’autre part. Notre rapport détaille un système de questions/réponses configurable pour des cas d’utilisation d’entreprise, et le décrit dans son intégralité. Dans les systèmes traditionnels de l’informatique décisionnelle, les préférences utilisateurs ne sont généralement pas prises en compte, ni d’ailleurs leurs situations ou leur contexte. Les systèmes état-de-l’art du domaine tels que Soda ou Safe ne génèrent pas de résultats calculés à partir de l’analyse de la situation des utilisateurs. Ce rapport introduit une approche plus personnalisée, qui convient mieux aux utilisateurs finaux. Notre expérimentation principale se traduit par une interface de type search qui affiche les résultats dans un dashboard sous la forme de graphes, de tables de faits ou encore de miniatures de pages Internet. En fonction des requêtes initiales des utilisateurs, des recommandations de requêtes sont aussi affichées en sus, et ce dans le but de réduire le temps de réponse global du système. En ce sens, ces recommandations sont comparables à des prédictions. Notre travail se traduit par les contributions suivantes : tout d’abord, une architecture implémentée via des algorithmes parallélisés et qui prend en compte la diversité des sources de données, à savoir des données structurées ou non structurées dans le cadre d’un framework de questions/réponses qui peut être facilement configuré dans des environnements différents. De plus, une approche de traduction basée sur la résolution de contrainte, qui remplace le traditionnel langage-pivot par un modèle conceptuel et qui conduit à des requêtes multidimensionnelles mieux personnalisées. En outre, en ensemble de patrons linguistiques utilisés pour traduire des questions BI en des requêtes pour bases de données, qui peuvent être facilement adaptés dans le cas de configurations différentes. / The amount and complexity of data generated by information systems keep increasing in Warehouses. The domain of Business Intelligence (BI) aims at providing methods and tools to better help users in retrieving those data. Data sources are distributed over distinct locations and are usually accessible through various applications. Looking for new information could be a tedious task, because business users try to reduce their work overload. To tackle this problem, Enterprise Search is a field that has emerged in the last few years, and that takes into consideration the different corporate data sources as well as sources available to the public (e.g. World Wide Web pages). However, corporate retrieval systems nowadays still suffer from information overload. We believe that such systems would benefit from Natural Language (NL) approaches combined with Q&A techniques. Indeed, NL interfaces allow users to search new information in their own terms, and thus obtain precise answers instead of turning to a plethora of documents. In this way, users do not have to employ exact keywords or appropriate syntax, and can have faster access to new information. Major challenges for designing such a system are to interface different applications and their underlying query languages on the one hand, and to support users’ vocabulary and to be easily configured for new application domains on the other hand. This thesis outlines an end-to-end Q&A framework for corporate use-cases that can be configured in different settings. In traditional BI systems, user-preferences are usually not taken into account, nor are their specific contextual situations. State-of-the art systems in this field, Soda and Safe do not compute search results on the basis of users’ situation. This thesis introduces a more personalized approach, which better speaks to end-users’ situations. Our main experimentation, in this case, works as a search interface, which displays search results on a dashboard that usually takes the form of charts, fact tables, and thumbnails of unstructured documents. Depending on users’ initial queries, recommendations for alternatives are also displayed, so as to reduce response time of the overall system. This process is often seen as a kind of prediction model. Our work contributes to the following: first, an architecture, implemented with parallel algorithms, that leverages different data sources, namely structured and unstructured document repositories through an extensible Q&A framework, and this framework can be easily configured for distinct corporate settings; secondly, a constraint-matching-based translation approach, which replaces a pivot language with a conceptual model and leads to more personalized multidimensional queries; thirdly, a set of NL patterns for translating BI questions in structured queries that can be easily configured in specific settings. In addition, we have implemented an iPhone/iPad™ application and an HTML front-end that demonstrate the feasibility of the various approaches developed through a series of evaluation metrics for the core component and scenario of the Q&A framework. To this end, we elaborate on a range of gold-standard queries that can be used as a basis for evaluating retrieval systems in this area, and show that our system behave similarly as the well-known WolframAlpha™ system, depending on the evaluation settings. Entrepôts de données Systèmes de questions/réponses Langage naturel Data warehouses Question Answering systems Natural Language
77	Knowledge Representation, Reasoning and Learning for Non-Extractive Reading Comprehension January 2019 (has links) abstract: While in recent years deep learning (DL) based approaches have been the popular approach in developing end-to-end question answering (QA) systems, such systems lack several desired properties, such as the ability to do sophisticated reasoning with knowledge, the ability to learn using less resources and interpretability. In this thesis, I explore solutions that aim to address these drawbacks. Towards this goal, I work with a specific family of reading comprehension tasks, normally referred to as the Non-Extractive Reading Comprehension (NRC), where the given passage does not contain enough information and to correctly answer sophisticated reasoning and ``additional knowledge" is required. I have organized the NRC tasks into three categories. Here I present my solutions to the first two categories and some preliminary results on the third category. Category 1 NRC tasks refer to the scenarios where the required ``additional knowledge" is missing but there exists a decent natural language parser. For these tasks, I learn the missing ``additional knowledge" with the help of the parser and a novel inductive logic programming. The learned knowledge is then used to answer new questions. Experiments on three NRC tasks show that this approach along with providing an interpretable solution achieves better or comparable accuracy to that of the state-of-the-art DL based approaches. The category 2 NRC tasks refer to the alternate scenario where the ``additional knowledge" is available but no natural language parser works well for the sentences of the target domain. To deal with these tasks, I present a novel hybrid reasoning approach which combines symbolic and natural language inference (neural reasoning) and ultimately allows symbolic modules to reason over raw text without requiring any translation. Experiments on two NRC tasks shows its effectiveness. The category 3 neither provide the ``missing knowledge" and nor a good parser. This thesis does not provide an interpretable solution for this category but some preliminary results and analysis of a pure DL based approach. Nonetheless, the thesis shows beyond the world of pure DL based approaches, there are tools that can offer interpretable solutions for challenging tasks without using much resource and possibly with better accuracy. / Dissertation/Thesis / Doctoral Dissertation Computer Science 2019 Artificial intelligence Commonsense QA Inductive Logic Programming Propara Question Answering Word Math problems
78	Sentence Pair Modeling and Beyond Lan, Wuwei January 2021 (has links) No description available. Computer Science
79	Question Classification in Question Answering Systems Sundblad, Håkan January 2007 (has links) Question answering systems can be seen as the next step in information retrieval, allowing users to pose questions in natural language and receive succinct answers. In order for a question answering system as a whole to be successful, research has shown that the correct classification of questions with regards to the expected answer type is imperative. Question classification has two components: a taxonomy of answer types, and a machinery for making the classifications. This thesis focuses on five different machine learning algorithms for the question classification task. The algorithms are k nearest neighbours, naïve bayes, decision tree learning, sparse network of winnows, and support vector machines. These algorithms have been applied to two different corpora, one of which has been used extensively in previous work and has been constructed for a specific agenda. The other corpus is drawn from a set of users' questions posed to a running online system. The results showed that the performance of the algorithms on the different corpora differs both in absolute terms, as well as with regards to the relative ranking of them. On the novel corpus, naïve bayes, decision tree learning, and support vector machines perform on par with each other, while on the biased corpus there is a clear difference between them, with support vector machines being the best and naïve bayes being the worst. The thesis also presents an analysis of questions that are problematic for all learning algorithms. The errors can roughly be divided as due to categories with few members, variations in question formulation, the actual usage of the taxonomy, keyword errors, and spelling errors. A large portion of the errors were also hard to explain. / <p>Report code: LiU-Tek-Lic-2007:29.</p> Question classification question answering machine learning taxonomy evaluation
80	Low-resource Language Question Answering Systemwith BERT Jansson, Herman January 2021 (has links) The complexity for being at the forefront regarding information retrieval systems are constantly increasing. Recent technology of natural language processing called BERT has reached superhuman performance in high resource languages for reading comprehension tasks. However, several researchers has stated that multilingual model’s are not enough for low-resource languages, since they are lacking a thorough understanding of those languages. Recently, a Swedish pre-trained BERT model has been introduced which is trained on significantly more Swedish data than the multilingual models currently available. This study compares both multilingual and Swedish monolingual inherited BERT model’s for question answering utilizing both a English and a Swedish machine translated SQuADv2 data set during its fine-tuning process. The models are evaluated with SQuADv2 benchmark and within a implemented question answering system built upon the classical retriever-reader methodology. This study introduces a naive and more robust prediction method for the proposed question answering system as well finding a sweet spot for each individual model approach integrated into the system. The question answering system is evaluated and compared against another question answering library at the leading edge within the area, applying a custom crafted Swedish evaluation data set. The results show that the fine-tuned model based on the Swedish pre-trained model and the Swedish SQuADv2 data set were superior in all evaluation metrics except speed. The comparison between the different systems resulted in a higher evaluation score but a slower prediction time for this study’s system. BERT Question Answering system Reading Comprehension Low resource language SQuADv2 Computer Systems Datorsystem

Search results