  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Handling inconsistency in databases and data integration systems /

Bravo, Loreto, January 1900 (has links)
Thesis (Ph.D.) - Carleton University, 2007. / Includes bibliographical references (p. 238-250). Also available in electronic format on the Internet.
42

Machine Learning for Question Answering in Czech

Pastorek, Peter January 2020 (has links)
This Master's thesis deals with teaching neural networks to answer questions in Czech. The neural networks are implemented in Python using the PyTorch library and are based on the LSTM architecture. They are trained on the Czech SQAD dataset. Because the Czech dataset is smaller than comparable English datasets, I opted to extend the neural networks with algorithmic procedures. For easier application of these procedures and better accuracy, I divide question answering into smaller subtasks.
43

Using Gaze Tracking to Tackle Duplicate Questions on Community Based Question Answering Websites: A Case Study of iFixit

Gandhi, Pankti 01 June 2018 (has links)
The number of unanswered questions on Community-based Question Answering (CQA) websites has increased significantly due to the rising number of duplicate questions. This is a serious problem, one that could lead to the decline of such beneficial websites. This thesis presents novel avenues that use gaze tracking technology and behavioral testing to tackle this problem. Based on prior studies of web search behaviors, we assumed that adding contextual information (snippets) to the related questions proposed on the `Ask a Question' page of the CQA website iFixit would improve the asker experience and reduce their tendency to post a new duplicate question. The first lab experiment, in which this web page was redesigned and compared to the original one, was conducted with 8 participants. Results confirmed that participants were more likely to find an answer to their question on the redesigned page. A second experiment, conducted remotely on a larger sample of 74 participants, aimed to discover strategic attributes that increase the perceived similarity of question pairs. These attributes were used in the third lab experiment (20 participants) to redesign and assess the snippets from Experiment 1. Results indicated that snippets containing `symptom(s)' and `cause(s)' attributes constitute an incremental improvement over basic snippets: they are perceived as slightly more relevant and require significantly fewer gaze fixations on the asker's part.
44

Analyzing Answer Acceptance on Stack Overflow Using the Asker's Participation in Answer Comments

Yiqun Zhang (16326174) 14 June 2023 (has links)
CQA platforms face problems, particularly inactive participants and low-quality content, that hurt long-term sustainability (Srba & Bielikova, 2016). Recent CQA studies have revealed the great value of answer comments in contributing to crowdsourced knowledge and in investigating answer acceptance. As a practical step forward from recent work aiming to remedy the sustainability issue of CQA, this study offers insights into the impact of the asker's participation in the comments section of an answer on the acceptance of that answer on Stack Overflow (a technical CQA site). A literature review was carefully carried out to show the general scope of CQA research and to position this study among related work. Compared with existing work, this study demonstrates its novelty by using attributes derived from answer comments (e.g., AskerInCommentsOrNot) in the models for analyzing answer acceptance. The data collected was broadly about machine learning (ML) along with various other topics, making it representative of Stack Overflow. The 19,555 records were analyzed using the Chi-Square test and logistic regression. The findings indicate that the asker's participation in the comments section of an answer is associated with the acceptance of that answer, and that answers with more of the asker's participation in answer comments are more likely to be accepted. Broadly, this research supports the idea that answer comments are a valuable type of social interaction and feedback in CQA. This research also has beneficial implications for stakeholders on Stack Overflow and potentially other technical CQA sites, including facilitating CQA flow, effectively evaluating helpful information, improving system designs, and motivating user participation.
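The abstract above reports analyzing answer acceptance with a Chi-Square test alongside logistic regression. As an illustration only (the counts below are invented, not the study's data), the Pearson chi-square statistic for a 2x2 table of asker participation versus answer acceptance can be computed as:

```python
def chi_square(table):
    """Pearson chi-square statistic for a 2x2 contingency table.

    table: [[a, b], [c, d]] of observed counts, e.g. rows = asker
    commented / did not comment, columns = answer accepted / not.
    """
    rows = [sum(r) for r in table]            # row totals
    cols = [sum(c) for c in zip(*table)]      # column totals
    total = sum(rows)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = rows[i] * cols[j] / total   # expected count under independence
            stat += (obs - exp) ** 2 / exp
    return stat

# Hypothetical counts: larger statistic = stronger deviation from independence.
stat = chi_square([[10, 20], [30, 40]])
```

The statistic would then be compared against the chi-square distribution with one degree of freedom to obtain a p-value.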
45

Encyclopaedic question answering

Dornescu, Iustin January 2012 (has links)
Open-domain question answering (QA) is an established NLP task which enables users to search for specific pieces of information in large collections of texts. Instead of using keyword-based queries and a standard information retrieval engine, QA systems allow the use of natural language questions and return the exact answer (or a list of plausible answers) with supporting snippets of text. In the past decade, open-domain QA research has been dominated by evaluation fora such as TREC and CLEF, where shallow techniques relying on information redundancy have achieved very good performance. However, this performance is generally limited to simple factoid and definition questions because the answer is usually explicitly present in the document collection. Current approaches are much less successful in finding implicit answers and are difficult to adapt to more complex question types which are likely to be posed by users. In order to advance the field of QA, this thesis proposes a shift in focus from simple factoid questions to encyclopaedic questions: list questions composed of several constraints. These questions have more than one correct answer which usually cannot be extracted from one small snippet of text. To correctly interpret the question, systems need to combine classic knowledge-based approaches with advanced NLP techniques. To find and extract answers, systems need to aggregate atomic facts from heterogeneous sources as opposed to simply relying on keyword-based similarity. Encyclopaedic questions promote QA systems which use basic reasoning, making them more robust and easier to extend with new types of constraints and new types of questions. A novel semantic architecture is proposed which represents a paradigm shift in open-domain QA system design, using semantic concepts and knowledge representation instead of words and information retrieval. 
The architecture consists of two phases: analysis, responsible for interpreting questions and finding answers, and feedback, responsible for interacting with the user. This architecture provides the basis for EQUAL, a semantic QA system developed as part of the thesis, which uses Wikipedia as a source of world knowledge and employs simple forms of open-domain inference to answer encyclopaedic questions. EQUAL combines the output of a syntactic parser with semantic information from Wikipedia to analyse questions. To address natural language ambiguity, the system builds several formal interpretations containing the constraints specified by the user and addresses each interpretation in parallel. To find answers, the system then tests these constraints individually for each candidate answer, considering information from different documents and/or sources. The correctness of an answer is not proved using a logical formalism; instead, a confidence-based measure is employed. This measure reflects the validation of constraints from raw natural language, automatically extracted entities, relations and available structured and semi-structured knowledge from Wikipedia and the Semantic Web. When searching for and validating answers, EQUAL uses the Wikipedia link graph to find relevant information. This method achieves good precision and allows only pages of a certain type to be considered, but is affected by the incompleteness of the existing markup targeted towards human readers. In order to address this, a semantic analysis module which disambiguates entities is developed to enrich Wikipedia articles with additional links to other pages. The module increases recall, enabling the system to rely more on the link structure of Wikipedia than on word-based similarity between pages. It also allows authoritative information from different sources to be linked to the encyclopaedia, further enhancing the coverage of the system. 
The viability of the proposed approach was evaluated in an independent setting by participating in two competitions at CLEF 2008 and 2009. In both competitions, EQUAL outperformed standard textual QA systems as well as semi-automatic approaches. Having established a feasible way forward for the design of open-domain QA systems, future work will attempt to further improve performance to take advantage of recent advances in information extraction and knowledge representation, as well as by experimenting with formal reasoning and inferencing capabilities.
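The strategy the abstract describes for EQUAL, testing question constraints individually against each candidate answer and scoring the result with a confidence measure, can be sketched roughly as follows. The example question, attribute names and the fraction-based score are illustrative assumptions, not the system's actual implementation:

```python
def score_candidate(candidate, constraints):
    """Confidence of a candidate answer as the fraction of satisfied constraints.

    candidate: dict of facts extracted for the candidate;
    constraints: list of (attribute, predicate) pairs derived from the question.
    """
    satisfied = sum(1 for attr, pred in constraints
                    if attr in candidate and pred(candidate[attr]))
    return satisfied / len(constraints)

# Hypothetical encyclopaedic question: "Which German cities have more
# than a million inhabitants?" decomposed into two constraints.
constraints = [
    ("country", lambda c: c == "Germany"),
    ("population", lambda p: p > 1_000_000),
]
candidates = {
    "Berlin": {"country": "Germany", "population": 3_700_000},
    "Zurich": {"country": "Switzerland", "population": 430_000},
}
ranked = sorted(candidates,
                key=lambda n: score_candidate(candidates[n], constraints),
                reverse=True)
```

Each constraint can be validated from a different document or source, which is the property that distinguishes encyclopaedic list questions from single-snippet factoid QA.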
46

Question Answering on RDF Data Cubes

Höffner, Konrad 26 March 2021 (has links)
The Semantic Web, a Web of Data, is an extension of the World Wide Web (WWW), a Web of Documents. A large amount of such data is freely available as Linked Open Data (LOD) for many areas of knowledge, forming the LOD Cloud. While this data conforms to the Resource Description Framework (RDF) and can thus be processed by machines, users need to master a formal query language and learn a specific vocabulary. Semantic Question Answering (SQA) systems remove those access barriers by letting the user ask natural language questions that the systems translate into formal queries. Thus, the research area of SQA plays an important role for the acceptance and benefit of the Semantic Web. The original contributions of this thesis to SQA are: First, we survey the current state of the art of SQA. We complement existing surveys by systematically identifying SQA publications in the chosen timeframe. 72 publications describing 62 different systems are systematically and manually selected using predefined inclusion and exclusion criteria out of 1960 candidates from the end of 2010 to July 2015. The survey identifies common challenges, structured solutions, and recommendations on research opportunities for future systems. From that point on, we focus on multidimensional numerical data, which is immensely valuable as it influences decisions in health care, policy and finance, among others. With the growth of the open data movement, more and more of it is becoming freely available. A large amount of such data is included in the LOD cloud using the RDF Data Cube (RDC) vocabulary. However, consuming multidimensional numerical data requires experts and specialized tools. Traditional SQA systems cannot process RDCs because their meta-structure is opaque to applications that expect facts to be encoded in single triples. This motivates our second contribution, the design and implementation of the first SQA algorithm on RDF Data Cubes. 
We kick-start this new research subfield by creating a user question corpus and a benchmark over multiple data sets. The evaluation of our system on the benchmark, which is included in the public Question Answering over Linked Data (QALD) challenge of 2016, shows the feasibility of the approach, but also highlights challenges, which we discuss in detail as a starting point for future work in the field. The benchmark is based on our final contribution, the addition of 955 financial government spending data sets to the LOD cloud by transforming data sets of the OpenSpending project to RDF Data Cubes. Open spending data has the power to reduce corruption by increasing accountability, and strengthens democracy because voters can make better informed decisions. An informed and trusting public also strengthens the government itself because it is more likely to commit to large projects. OpenSpending.org is an open platform that provides public finance data from governments around the world. The transformation result, called LinkedSpending, consists of more than five million planned and carried-out financial transactions in 955 data sets from all over the world, published as Linked Open Data, freely available and openly licensed.
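The core idea of answering a question over an RDF Data Cube (mapping question phrases to dimension constraints and an aggregate over a measure, then emitting a SPARQL query) can be sketched as below. The `ex:` properties and the overall query shape are illustrative assumptions, not the thesis system's actual output:

```python
def cube_query(measure, constraints, agg="SUM"):
    """Build a SPARQL query over an RDF Data Cube.

    measure: the numeric measure property to aggregate;
    constraints: dict mapping dimension names to matched question values.
    Prefixes qb: (Data Cube vocabulary) and ex: (dataset-specific
    properties) are assumed to be declared elsewhere.
    """
    patterns = ["?obs a qb:Observation .",
                f"?obs ex:{measure} ?value ."]
    for dim, val in constraints.items():
        patterns.append(f'?obs ex:{dim} "{val}" .')
    where = "\n  ".join(patterns)
    return f"SELECT ({agg}(?value) AS ?answer) WHERE {{\n  {where}\n}}"

# Hypothetical question: "How much did Finland spend in 2013?"
query = cube_query("amount", {"year": "2013", "country": "Finland"})
```

A real system must additionally resolve which phrases name dimensions, values or measures, which is exactly the matching problem the thesis addresses.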
47

Knowledge Extraction for Hybrid Question Answering

Usbeck, Ricardo 22 May 2017 (has links) (PDF)
Since the proposal of hypertext by Tim Berners-Lee to his employer CERN on March 12, 1989, the World Wide Web has grown to more than one billion Web pages and continues to grow. With the later proposed Semantic Web vision, Berners-Lee et al. suggested an extension of the existing (Document) Web to allow better reuse, sharing and understanding of data. Both the Document Web and the Web of Data (which is the current implementation of the Semantic Web) grow continuously. This is a mixed blessing, as the two forms of the Web grow concurrently and most commonly contain different pieces of information. Modern information systems must thus bridge a Semantic Gap to allow holistic and unified access to information about a particular topic, independent of the representation of the data. One way to bridge the gap between the two forms of the Web is the extraction of structured data, i.e., RDF, from the growing amount of unstructured and semi-structured information (e.g., tables and XML) on the Document Web. Note that unstructured data stands for any type of textual information, such as news, blogs or tweets. While extracting structured data from unstructured data allows the development of powerful information systems, it requires high-quality and scalable knowledge extraction frameworks to lead to useful results. The dire need for such approaches has led to the development of a multitude of annotation frameworks and tools. However, most of these approaches are not evaluated on the same datasets or using the same measures. The resulting Evaluation Gap needs to be tackled by a concise evaluation framework to foster fine-grained and uniform evaluations of annotation tools and frameworks over any knowledge base. Moreover, with the constant growth of data and the ongoing decentralization of knowledge, intuitive ways for non-experts to access the generated data are required. 
Humans have adapted their search behavior to current Web data through access paradigms such as keyword search so as to retrieve high-quality results. Hence, most Web users only expect Web documents in return. However, humans think and most commonly express their information needs in their natural language rather than using keyword phrases. Answering complex information needs often requires the combination of knowledge from various, differently structured data sources. Thus, we observe an Information Gap between natural-language questions and current keyword-based search paradigms, which in addition do not make use of the available structured and unstructured data sources. Question Answering (QA) systems provide an easy and efficient way to bridge this gap by allowing users to query data via natural language, thus reducing (1) a possible loss of precision and (2) a potential loss of time while reformulating the search intention into a machine-readable form. Furthermore, QA systems enable answering natural language queries with concise results instead of links to verbose Web documents. Additionally, they allow as well as encourage the access to and the combination of knowledge from heterogeneous knowledge bases (KBs) within one answer. Consequently, three main research gaps are considered and addressed in this work: First, addressing the Semantic Gap between the unstructured Document Web and the Web of Data requires the development of scalable and accurate approaches for the extraction of structured data in RDF. This research challenge is addressed by several approaches within this thesis. This thesis presents CETUS, an approach for recognizing entity types to populate RDF KBs. Furthermore, our knowledge base-agnostic disambiguation framework AGDISTIS can efficiently detect the correct URIs for a given set of named entities. 
Additionally, we introduce REX, a Web-scale framework for RDF extraction from semi-structured (i.e., templated) websites which makes use of the semantics of the reference knowledge base to check the extracted data. The ongoing research on closing the Semantic Gap has already yielded a large number of annotation tools and frameworks. However, these approaches are currently still hard to compare since the published evaluation results are calculated on diverse datasets and evaluated based on different measures. On the other hand, the issue of comparability of results is not to be regarded as being intrinsic to the annotation task. Indeed, it is now well established that scientists spend between 60% and 80% of their time preparing data for experiments. Data preparation being such a tedious problem in the annotation domain is mostly due to the different formats of the gold standards as well as the different data representations across reference datasets. We tackle the resulting Evaluation Gap in two ways: First, we introduce a collection of three novel datasets, dubbed N3, to leverage the possibility of optimizing NER and NED algorithms via Linked Data and to ensure maximal interoperability to overcome the need for corpus-specific parsers. Second, we present GERBIL, an evaluation framework for semantic entity annotation. The rationale behind our framework is to provide developers, end users and researchers with easy-to-use interfaces that allow for the agile, fine-grained and uniform evaluation of annotation tools and frameworks on multiple datasets. The decentralized architecture behind the Web has led to pieces of information being distributed across data sources with varying structure. Moreover, the increasing demand for natural-language interfaces, as seen in current mobile applications, requires systems to deeply understand the underlying user information need. 
In conclusion, the natural language interface for asking questions requires a hybrid approach to data usage, i.e., simultaneously performing a search on full-texts and semantic knowledge bases. To close the Information Gap, this thesis presents HAWK, a novel entity search approach developed for hybrid QA based on combining structured RDF and unstructured full-text data sources.
48

Zodpovídání dotazů o obrázcích / Visual Question Answering

Hajič, Jakub January 2017 (has links)
Visual Question Answering (VQA) is a recently proposed multimodal task in the general area of machine learning. The input to this task consists of a single image and an associated natural language question, and the output is the answer to that question. In this thesis we propose two incremental modifications to an existing model which won the VQA Challenge in 2016 using multimodal compact bilinear pooling (MCB), a novel way of combining modalities. First, we added a language attention mechanism, and on top of that we introduce an image attention mechanism focusing on objects detected in the image ("region attention"). We also experiment with ways of combining these in a single end-to-end model. The thesis describes the MCB model and our extensions and their two different implementations, and evaluates them on the original VQA challenge dataset for direct comparison with the original work.
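Multimodal compact bilinear pooling, mentioned in the abstract above, combines two modalities by count-sketching each feature vector and circularly convolving the sketches, which approximates a compressed outer product of the two vectors. A naive stdlib sketch might look as follows; the real technique computes the convolution via FFT and uses much larger dimensions, and the seeds and sizes here are purely illustrative:

```python
import random

def count_sketch_params(n, d, seed):
    """Fixed random hash buckets and signs for an n-dim input, d-dim sketch."""
    rnd = random.Random(seed)
    h = [rnd.randrange(d) for _ in range(n)]      # target bucket per input dim
    s = [rnd.choice((-1, 1)) for _ in range(n)]   # random sign per input dim
    return h, s

def count_sketch(v, d, h, s):
    """Project vector v into d dimensions via the count sketch (h, s)."""
    out = [0.0] * d
    for i, x in enumerate(v):
        out[h[i]] += s[i] * x
    return out

def circular_conv(a, b):
    """Naive O(d^2) circular convolution (FFT-based in practice)."""
    d = len(a)
    return [sum(a[j] * b[(k - j) % d] for j in range(d)) for k in range(d)]

def mcb(img_feat, txt_feat, d=32):
    """MCB pooling: the circular convolution of the two count sketches
    approximates the count sketch of the flattened outer product."""
    h1, s1 = count_sketch_params(len(img_feat), d, seed=1)
    h2, s2 = count_sketch_params(len(txt_feat), d, seed=2)
    return circular_conv(count_sketch(img_feat, d, h1, s1),
                         count_sketch(txt_feat, d, h2, s2))
```

Because the sketch is linear and the convolution bilinear, scaling either input scales the pooled output proportionally, mirroring the bilinearity of the outer product it approximates.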
49

Evidence Based Medical Question Answering System Using Knowledge Graph Paradigm

Aqeel, Aya 22 June 2022 (has links)
No description available.
50

Exploring Knowledge Vaults with ChatGPT : A Domain-Driven Natural Language Approach to Document-Based Answer Retrieval

Hammarström, Mathias January 2023 (has links)
Problem solving is a key aspect of many professions, including factory settings, where problems can cause production to slow down or even halt completely. The specific domain for this project is a pulp factory, in collaboration with SCA Pulp. This study explores the potential of a question-answering system to enhance workers' ability to solve a problem by providing possible solutions from a natural language description of the problem. This is accomplished by giving workers a natural language interface to a large corpus of domain-specific documents. 

More specifically, the system works by augmenting ChatGPT with domain-specific documents as context for a question. The relevant documents are found using a retriever, which computes a vector representation for each document and compares the document vectors with the question vector. The results show that the system generated a correct answer 92% of the time, an incorrect answer 5% of the time, and no answer 3% of the time. The conclusion drawn from this study is that the implemented question-answering system is promising, especially when used by an expert or skilled worker who is less likely to be misled by incorrect answers. However, due to the study's small scale, further work is required before the system can be considered ready for deployment in real-world scenarios.
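The retriever described in the abstract above can be sketched with a toy bag-of-words embedding and cosine similarity. The real system would use neural sentence embeddings, so everything below (the `embed` function, the sample documents) is an illustrative stand-in:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'vector representation'; a production retriever
    would use a neural embedding model instead."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, documents, k=2):
    """Return the k documents whose vectors are closest to the question's;
    these would then be passed to the language model as context."""
    q = embed(question)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "the digester pressure valve manual",
    "cafeteria menu for friday",
    "pump maintenance and pressure checks",
]
top = retrieve("pressure valve problem", docs, k=1)
```

The retrieved snippets are concatenated into the prompt, so the answer quality hinges on this similarity step returning the right documents.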
