561

Ontology for cultural variations in interpersonal communication: building on theoretical models and crowdsourced knowledge

Thakker, Dhaval, Karanasios, S., Blanchard, E., Lau, L., Dimitrova, V. 05 May 2017 (has links)
Yes / The domain of cultural variations in interpersonal communication is becoming increasingly important in various areas, including human-human interaction (e.g. business settings) and human-computer interaction (e.g. during simulations, or with social robots). User-generated content (UGC) in social media can provide an invaluable source of culturally diverse viewpoints for supporting the understanding of cultural variations. However, discovering and organizing UGC is notoriously challenging and laborious for humans, especially in ill-defined domains such as culture. This calls for computational approaches that automate the UGC sensemaking process through tagging, linking and exploration. Semantic technologies allow automated structuring and qualitative analysis of UGC, but are dependent on the availability of an ontology representing the main concepts in a specific domain. For the domain of cultural variations in interpersonal communication, no ontological model exists. This paper presents the first such ontological model, called AMOn+, which defines cultural variations and enables tagging culture-related mentions in textual content. AMOn+ is designed based on a novel interdisciplinary approach that combines theoretical models of culture with crowdsourced knowledge (DBpedia). An evaluation of AMOn+ demonstrated its fitness-for-purpose regarding domain coverage for annotating culture-related concepts mentioned in text corpora. This ontology can underpin computational models for making sense of UGC.
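To make the tagging idea concrete, here is a minimal sketch of ontology-based annotation of a culture-related mention using rdflib; the namespace, classes and properties are hypothetical placeholders for illustration, not the actual AMOn+ vocabulary.

```python
# Minimal sketch of ontology-based tagging of a culture-related mention.
# The namespace, class and property names below are hypothetical, not AMOn+.
from rdflib import Graph, Namespace, Literal, RDF, RDFS, URIRef

AMON = Namespace("http://example.org/amon-plus#")   # placeholder namespace

g = Graph()
g.bind("amon", AMON)

# Declare an illustrative concept for a cultural variation in communication.
g.add((AMON.GreetingConvention, RDF.type, RDFS.Class))
g.add((AMON.GreetingConvention, RDFS.label, Literal("Greeting convention")))
g.add((AMON.CultureRelatedMention, RDF.type, RDFS.Class))

# Tag a user-generated comment that mentions a greeting practice.
comment = URIRef("http://example.org/ugc/comment-42")
g.add((comment, RDF.type, AMON.CultureRelatedMention))
g.add((comment, AMON.mentionsConcept, AMON.GreetingConvention))
g.add((comment, RDFS.comment,
       Literal("In Japan people usually bow instead of shaking hands.")))

print(g.serialize(format="turtle"))
```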
562

A note on exploration of IoT generated big data using semantics

Ranjan, R., Thakker, Dhaval, Haller, A., Buyya, R. 27 July 2017 (has links)
Yes / Welcome to this special issue of the Future Generation Computer Systems (FGCS) journal. The special issue compiles seven technical contributions that significantly advance the state-of-the-art in exploration of Internet of Things (IoT) generated big data using semantic web techniques and technologies.
563

A note on intelligent exploration of semantic data

Thakker, Dhaval, Schwabe, D., Garcia, D., Kozaki, K., Brambilla, M., Dimitrova, V. 15 July 2019 (has links)
Yes / Welcome to this special issue of the Semantic Web (SWJ) journal. The special issue compiles three technical contributions that significantly advance the state-of-the-art in exploration of semantic data using semantic web techniques and technologies.
564

Towards Robust and Accurate Text-to-Code Generation

Almohaimeed, Saleh 01 January 2024 (has links) (PDF)
Databases play a vital role in today's digital landscape, enabling effective data storage, management, and retrieval for businesses and other organizations. However, interacting with databases often requires knowledge of query languages (e.g., SQL) and analysis code, which can be a barrier for many users. In natural language processing, the text-to-code task, which converts natural language text into query and analysis code, bridges this gap by allowing users to access and manipulate data using everyday language. This dissertation investigates different challenges in text-to-code (including text-to-SQL as a subtask), with a focus on four primary contributions to the field. First, as a solution to the lack of statistical analysis in current text-to-code tasks, we introduce SIGMA, a text-to-code dataset with statistical analysis, featuring 6000 questions with Python code labels. Baseline models show promising results, indicating that our new task can support both statistical analysis and SQL queries simultaneously. Second, we present Ar-Spider, the first Arabic cross-domain text-to-SQL dataset, which addresses multilingual limitations. We have conducted experiments with the LGESQL and S²SQL models, enhanced by our Context Similarity Relationship (CSR) approach, which demonstrates competitive performance and reduces the performance gap between the Arabic and English text-to-SQL datasets. Third, we address the context-dependent text-to-SQL task, often overlooked by current models. We explore the SParC dataset using different question representations and in-context learning prompt engineering techniques, and then propose GAT-SQL, an advanced prompt engineering approach that improves both zero-shot and in-context learning experiments. GAT-SQL sets new benchmarks on both the SParC and CoSQL datasets. Finally, we introduce Ar-SParC, a context-dependent Arabic text-to-SQL dataset that enables users to interact with the model through a series of interrelated questions. In total, 40 experiments were conducted to investigate this dataset using various prompt engineering techniques, and a novel technique called GAT Corrector was developed, which significantly improved the performance of all baseline models.
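As a rough illustration of the schema-aware prompting that context-dependent text-to-SQL work builds on (not the GAT-SQL method itself), the sketch below assembles a zero-shot prompt from a schema, an interaction history and a question; the schema, questions and the LLM call are assumed placeholders.

```python
# Minimal sketch of zero-shot, schema-aware prompting for text-to-SQL.
# This illustrates the general idea only, not the GAT-SQL technique;
# call_llm is a placeholder for whatever model endpoint is available.

def build_prompt(schema: str, question: str, history: list[str] | None = None) -> str:
    """Compose a prompt from the database schema, prior turns, and the question."""
    parts = ["You are given the following database schema:", schema, ""]
    if history:  # context-dependent setting (SParC / CoSQL style interactions)
        parts.append("Previous questions in this interaction:")
        parts.extend(f"- {q}" for q in history)
        parts.append("")
    parts.append(f"Write a single SQL query answering: {question}")
    return "\n".join(parts)

schema = "CREATE TABLE singer(singer_id INT, name TEXT, country TEXT, age INT);"
prompt = build_prompt(schema, "How many singers are from France?",
                      history=["List all singers."])

# sql = call_llm(prompt)  # placeholder: send the prompt to an LLM and get SQL back
print(prompt)
```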
565

Mining Biomedical Data for Hidden Relationship Discovery

Dharmavaram, Sirisha 08 1900 (has links)
With an ever-growing number of publications in the biomedical domain, it becomes likely that important implicit connections between individual concepts of biomedical knowledge are overlooked. Literature-based discovery (LBD) has been in practice for many years to identify plausible associations between previously unrelated concepts. In this paper, we present a new, completely automatic and interactive system that creates a graph-based knowledge base to capture multifaceted, complex associations among biomedical concepts. For a given pair of input concepts, our system auto-generates a list of ranked subgraphs uncovering possible previously unnoticed associations based on context information. To rank these subgraphs, we implement a novel ranking method that uses context information obtained by performing random walks on the graph. In addition, we enhance the system by training a neural network classifier to output the likelihood that the two concepts are related, which provides better insights to the end user.
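The following sketch illustrates the general idea of ranking intermediate concepts by random walks, using personalized PageRank from networkx as a stand-in for the paper's context-based ranking; the toy graph, weights and concept names are invented for illustration.

```python
# Sketch of ranking intermediate concepts between two biomedical terms via
# random walks. Personalized PageRank stands in for the paper's context-based
# ranking; the toy graph and its weights are made up.
import networkx as nx

G = nx.Graph()
edges = [
    ("Raynaud disease", "blood viscosity", 1.0),
    ("blood viscosity", "fish oil", 1.0),
    ("Raynaud disease", "vasoconstriction", 0.8),
    ("vasoconstriction", "fish oil", 0.5),
    ("fish oil", "omega-3 fatty acids", 1.0),
]
G.add_weighted_edges_from(edges)

source, target = "Raynaud disease", "fish oil"
# Bias the walk toward the two query concepts and rank the remaining nodes.
scores = nx.pagerank(G, personalization={source: 0.5, target: 0.5}, weight="weight")
intermediates = sorted(
    (n for n in G.nodes if n not in (source, target)),
    key=lambda n: scores[n],
    reverse=True,
)
print(intermediates)  # candidate linking concepts, highest-scoring first
```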
566

Encyclopaedic question answering

Dornescu, Iustin January 2012 (has links)
Open-domain question answering (QA) is an established NLP task which enables users to search for specific pieces of information in large collections of texts. Instead of using keyword-based queries and a standard information retrieval engine, QA systems allow the use of natural language questions and return the exact answer (or a list of plausible answers) with supporting snippets of text. In the past decade, open-domain QA research has been dominated by evaluation fora such as TREC and CLEF, where shallow techniques relying on information redundancy have achieved very good performance. However, this performance is generally limited to simple factoid and definition questions because the answer is usually explicitly present in the document collection. Current approaches are much less successful in finding implicit answers and are difficult to adapt to more complex question types which are likely to be posed by users. In order to advance the field of QA, this thesis proposes a shift in focus from simple factoid questions to encyclopaedic questions: list questions composed of several constraints. These questions have more than one correct answer which usually cannot be extracted from one small snippet of text. To correctly interpret the question, systems need to combine classic knowledge-based approaches with advanced NLP techniques. To find and extract answers, systems need to aggregate atomic facts from heterogeneous sources as opposed to simply relying on keyword-based similarity. Encyclopaedic questions promote QA systems which use basic reasoning, making them more robust and easier to extend with new types of constraints and new types of questions. A novel semantic architecture is proposed which represents a paradigm shift in open-domain QA system design, using semantic concepts and knowledge representation instead of words and information retrieval. The architecture consists of two phases: analysis, responsible for interpreting questions and finding answers, and feedback, responsible for interacting with the user. This architecture provides the basis for EQUAL, a semantic QA system developed as part of the thesis, which uses Wikipedia as a source of world knowledge and employs simple forms of open-domain inference to answer encyclopaedic questions. EQUAL combines the output of a syntactic parser with semantic information from Wikipedia to analyse questions. To address natural language ambiguity, the system builds several formal interpretations containing the constraints specified by the user and addresses each interpretation in parallel. To find answers, the system then tests these constraints individually for each candidate answer, considering information from different documents and/or sources. The correctness of an answer is not proved using a logical formalism; instead, a confidence-based measure is employed. This measure reflects the validation of constraints from raw natural language, automatically extracted entities, relations and available structured and semi-structured knowledge from Wikipedia and the Semantic Web. When searching for and validating answers, EQUAL uses the Wikipedia link graph to find relevant information. This method achieves good precision and allows only pages of a certain type to be considered, but is affected by the incompleteness of the existing markup targeted towards human readers. In order to address this, a semantic analysis module which disambiguates entities is developed to enrich Wikipedia articles with additional links to other pages.
The module increases recall, enabling the system to rely more on the link structure of Wikipedia than on word-based similarity between pages. It also allows authoritative information from different sources to be linked to the encyclopaedia, further enhancing the coverage of the system. The viability of the proposed approach was evaluated in an independent setting by participating in two competitions at CLEF 2008 and 2009. In both competitions, EQUAL outperformed standard textual QA systems as well as semi-automatic approaches. Having established a feasible way forward for the design of open-domain QA systems, future work will attempt to further improve performance by taking advantage of recent advances in information extraction and knowledge representation, as well as by experimenting with formal reasoning and inferencing capabilities.
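A minimal sketch of the confidence-based constraint validation described above, assuming hypothetical per-constraint checkers and a simple product combination; it illustrates the idea, not EQUAL's implementation.

```python
# Sketch of confidence-based constraint checking over candidate answers,
# in the spirit of the approach described above (not EQUAL's actual code).
# Each checker returns a confidence in [0, 1] that the candidate satisfies one
# constraint of the question; the toy data stands in for Wikipedia evidence.
from typing import Callable

Candidate = str
Checker = Callable[[Candidate], float]

def score(candidate: Candidate, checkers: list[Checker]) -> float:
    """Combine per-constraint confidences into an overall answer confidence."""
    total = 1.0
    for check in checkers:
        total *= check(candidate)   # simple product; the thesis may combine differently
    return total

# Question: "Which German cities have a university founded before 1500?"
is_german_city = lambda c: {"Heidelberg": 0.95, "Lyon": 0.05}.get(c, 0.0)
has_old_university = lambda c: {"Heidelberg": 0.9, "Lyon": 0.8}.get(c, 0.0)

candidates = ["Heidelberg", "Lyon"]
ranked = sorted(candidates,
                key=lambda c: score(c, [is_german_city, has_old_university]),
                reverse=True)
print(ranked)  # candidates ordered by overall confidence
```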
567

Role of description logic reasoning in ontology matching

Reul, Quentin H. January 2012 (has links)
Semantic interoperability is essential on the Semantic Web to enable different information systems to exchange data. Ontology matching has been recognised as a means to achieve semantic interoperability on the Web by identifying similar information in heterogeneous ontologies. Existing ontology matching approaches have two major limitations. The first limitation relates to similarity metrics, which provide a pessimistic value when considering complex objects such as strings and conceptual entities. The second limitation relates to the role of description logic reasoning. In particular, most approaches disregard implicit information about entities as a source of background knowledge. In this thesis, we first present a new similarity function, called the degree of commonality coefficient, to compute the overlap between two sets based on the similarity between their elements. The results of our evaluations show that the degree of commonality performs better than traditional set similarity metrics in the ontology matching task. Secondly, we have developed the Knowledge Organisation System Implicit Mapping (KOSIMap) framework, which differs from existing approaches by using description logic reasoning (i) to extract implicit information as background knowledge for every entity, and (ii) to remove inappropriate correspondences from an alignment. The results of our evaluation show that the use of description logic reasoning in the ontology matching task can increase coverage. We identify people interested in ontology matching and reasoning techniques as the target audience of this work.
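The sketch below illustrates the idea of measuring set overlap through element-level similarity, in the spirit of the degree of commonality coefficient; the token-based string similarity and the way the element similarities are combined are assumptions, and the thesis's exact definition may differ.

```python
# Sketch of an element-similarity-based set overlap, illustrating the idea
# behind the degree of commonality coefficient (exact definition may differ).

def token_similarity(a: str, b: str) -> float:
    """Jaccard similarity between the token sets of two labels."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def commonality(set_a: set[str], set_b: set[str]) -> float:
    """Average best-match similarity of set_a's elements against set_b."""
    if not set_a or not set_b:
        return 0.0
    return sum(max(token_similarity(a, b) for b in set_b) for a in set_a) / len(set_a)

labels_a = {"postal_address", "phone_number"}
labels_b = {"address", "telephone_number"}
print(commonality(labels_a, labels_b))  # non-zero despite no exact label match
```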
568

A general purpose semantic parser using FrameNet and WordNet®.

Shi, Lei 05 1900 (has links)
Syntactic parsing is one of the best understood language processing applications. Since language and grammar have been formally defined, it is easy for computers to parse the syntactic structure of natural language text. Does meaning have structure as well? If so, how can we analyze that structure? Previous systems rely on a one-to-one correspondence between syntactic rules and semantic rules, but such systems can only be applied to limited fragments of English. In this thesis, we propose a general-purpose shallow semantic parser which utilizes a semantic network (WordNet) and a frame dataset (FrameNet). The semantic relations recognized by the parser are based on how human beings represent knowledge of the world. Parsing semantic structure allows semantic units and constituents to be accessed and processed in a more meaningful way than syntactic parsing, moving the automation of natural language understanding to a higher level.
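The kind of lexical-resource lookup such a parser depends on can be sketched with NLTK's WordNet and FrameNet interfaces, as below; the predicate and the number of results shown are arbitrary, and this is only the lookup step, not the full parser.

```python
# Sketch of the resource lookups a FrameNet/WordNet-based semantic parser
# relies on: find the WordNet senses of a predicate and the FrameNet frames
# it evokes. Requires the NLTK data packages 'wordnet' and 'framenet_v17'.
import nltk
from nltk.corpus import wordnet as wn
from nltk.corpus import framenet as fn

# nltk.download("wordnet"); nltk.download("framenet_v17")  # first run only

predicate = "give"

# WordNet senses of the predicate (word-level semantics).
for synset in wn.synsets(predicate, pos=wn.VERB)[:3]:
    print(synset.name(), "-", synset.definition())

# FrameNet frames evoked by the predicate (frame and role information).
for frame in fn.frames_by_lemma(r"(?i)\bgive\b")[:3]:
    roles = list(frame.FE.keys())[:5]
    print(frame.name, "roles:", roles)
```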
569

Statistical Extraction of Multilingual Natural Language Patterns for RDF Predicates: Algorithms and Applications

Gerber, Daniel 29 August 2016 (has links) (PDF)
The Data Web has undergone a tremendous growth period. It currently consists of more than 3,300 publicly available knowledge bases describing millions of resources from various domains, such as life sciences, government or geography, with over 89 billion facts. In the same way, the Document Web has grown to the state where approximately 4.55 billion websites exist, 300 million photos are uploaded on Facebook and 3.5 billion Google searches are performed on average every day. However, there is a gap between the Document Web and the Data Web, since, for example, knowledge bases available on the Data Web are most commonly extracted from structured or semi-structured sources, while the majority of information available on the Web is contained in unstructured sources such as news articles, blog posts, photos, forum discussions, etc. As a result, data on the Data Web not only misses a significant fragment of information but also suffers from a lack of actuality, since typical extraction methods are time-consuming and can only be carried out periodically. Furthermore, provenance information is rarely taken into consideration and therefore gets lost in the transformation process. In addition, users are accustomed to entering keyword queries to satisfy their information needs; with the availability of machine-readable knowledge bases, lay users could be empowered to issue more specific questions and get more precise answers. In this thesis, we address the problem of Relation Extraction, one of the key challenges in closing the gap between the Document Web and the Data Web, by four means. First, we present a distant supervision approach that finds multilingual natural language representations of formal relations already contained in the Data Web. We use these natural language representations to find sentences on the Document Web that contain unseen instances of these relations between two entities. Second, we address the problem of data actuality by presenting a real-time RDF extraction framework for data streams and use it to extract RDF from RSS news feeds. Third, we present a novel fact validation algorithm, based on natural language representations, that can not only verify or falsify a given triple but also find trustworthy sources for it on the Web and estimate a time scope in which the triple holds true. The features used by this algorithm to determine whether a website is trustworthy serve as provenance information and thereby help to create metadata for facts in the Data Web. Finally, we present a question answering system that uses the natural language representations to map natural language questions to formal SPARQL queries, allowing lay users to make use of the large amounts of data available on the Data Web to satisfy their information needs.
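A minimal sketch of the distant supervision step, under the assumption that seed entity pairs for a relation (e.g. dbo:capital) are available from the Data Web: sentences mentioning both entities are harvested and the text between the mentions is kept as a candidate natural language pattern. The seed pairs and sentences are made-up examples.

```python
# Minimal sketch of the distant-supervision idea: harvest candidate natural
# language patterns for a relation from sentences that mention a known pair.
import re

seed_pairs = [("Berlin", "Germany"), ("Paris", "France")]   # known dbo:capital pairs
sentences = [
    "Berlin is the capital city of Germany.",
    "Paris, the capital of France, lies on the Seine.",
    "Paris hosted the 2024 Olympic Games.",
]

patterns = {}
for subj, obj in seed_pairs:
    for sentence in sentences:
        if subj in sentence and obj in sentence:
            start = sentence.index(subj) + len(subj)
            end = sentence.index(obj)
            if start < end:
                pattern = re.sub(r"\s+", " ", sentence[start:end]).strip(" ,")
                patterns[pattern] = patterns.get(pattern, 0) + 1

print(patterns)  # e.g. {'is the capital city of': 1, 'the capital of': 1}
```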
570

Indigo: a multi-strategy, adaptive approach for semantic alignment integrating the context of the data to be matched

Bououlid Idrissi, Youssef January 2008 (has links)
Thesis digitized by the Division de la gestion de documents et des archives of the Université de Montréal.
