Global ETD Search

21	Predição de tags usando linked data: um estudo de caso no banco de dados Arquigrafia / Tag prediction using linked data: a case study in the Arquigrafia database Souza, Ricardo Augusto Teixeira de 17 December 2013 Given the large volume of user-created content on the Web, one way to support search and organization is to build tagging systems, usually based on keywords that are either extracted from the content itself or suggested by users. This work applies a data mining algorithm to an RDF database, whose instances may reference the DBpedia Linked Data repository, to recommend tags using taxonomic, relational and literal similarity measures over RDF descriptions. The database used is Arquigrafia, a Web database system whose goal is to catalog images of architectural projects and which allows users to add tags to the images. Experiments were performed to evaluate the quality of the tag recommendations under different models of the Arquigrafia database, including an extended model that references DBpedia. The results show that the recommendation quality for certain tags can improve when different models (with references to the DBpedia Linked Data repository) are considered in the learning phase. Keywords: Data Mining; Linked Data; Semantic Web; Tag Recommendation
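The abstract above names three similarity aspects (taxonomic, relational and literal) but not how they are computed or combined. The following is a minimal sketch only: it substitutes Jaccard overlap for each aspect, combines them with equal weights, and recommends tags by nearest-neighbour voting. The function names, the weights, and the sample items are all hypothetical and are not taken from the thesis.

```python
from collections import Counter

# Hypothetical tagged items: each has a set of classes (taxonomic aspect),
# linked resources (relational aspect) and description words (literal aspect).
ITEMS = [
    {"classes": {"Building"}, "links": {"dbr:Oscar_Niemeyer"},
     "words": {"concrete", "curve"}, "tags": ["modernism", "concrete"]},
    {"classes": {"Building"}, "links": {"dbr:Oscar_Niemeyer"},
     "words": {"curve"}, "tags": ["modernism"]},
    {"classes": {"Park"}, "links": set(),
     "words": {"garden"}, "tags": ["landscape"]},
]

# Hypothetical untagged item for which tags should be recommended.
TARGET = {"classes": {"Building"}, "links": {"dbr:Oscar_Niemeyer"},
          "words": {"curve", "glass"}}


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(x, y, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of per-aspect Jaccard scores (an assumed combination)."""
    aspects = ("classes", "links", "words")
    return sum(w * jaccard(x[k], y[k]) for w, k in zip(weights, aspects))


def recommend_tags(target, tagged_items, k=2):
    """Rank the tags of the k most similar items by summed similarity."""
    nearest = sorted(tagged_items,
                     key=lambda item: similarity(target, item),
                     reverse=True)[:k]
    scores = Counter()
    for item in nearest:
        for tag in item["tags"]:
            scores[tag] += similarity(target, item)
    return [tag for tag, _ in scores.most_common()]


print(recommend_tags(TARGET, ITEMS))
```

In this sketch, extending the model with DBpedia references would correspond to adding more entries to the `links` sets, which changes which neighbours are considered closest.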
22	Extracting structured information from Wikipedia articles to populate infoboxes Lange, Dustin, Böhm, Christoph, Naumann, Felix January 2010 Roughly every third Wikipedia article contains an infobox, a table that displays important facts about the subject in attribute-value form. The schema of an infobox, i.e., the attributes that can be expressed for a concept, is defined by an infobox template. Often, authors do not specify all template attributes, resulting in incomplete infoboxes. With iPopulator, we introduce a system that automatically populates infoboxes of Wikipedia articles by extracting attribute values from the article's text. In contrast to prior work, iPopulator detects and exploits the structure of attribute values in order to extract value parts independently. We have tested iPopulator on the entire set of infobox templates and provide a detailed analysis of its effectiveness. For instance, we achieve an average extraction precision of 91% for 1,727 distinct infobox template attributes. Keywords: Information Extraction; Wikipedia; Linked Data; Data processing; Computer science
23	Linked-OWL: A new approach for dynamic linked data service workflow composition Ahmad, Hussien, Dowaji, Salah 01 June 2013 The shift from the Web of Documents to the Web of Data, based on the Linked Data principles defined by Tim Berners-Lee, poses a major challenge for building applications that work in the Web of Data environment. There have been several attempts to build service and application models for the Linked Data Cloud. In this paper, we propose a new service model for linked data, "Linked-OWL", which is based on RESTful services and OWL-S and complies with linked data principles. This new model shifts the service concept from functions to linked data things and opens the road for Linked Oriented Architecture (LOA) and a Web of Services as part of, and on top of, the Web of Data. The model also provides a high level of dynamic service composition capability, enabling more accurate dynamic composition and execution of complex business processes in the Web of Data environment. Keywords: Linked data; Linked Data Services; Dynamic Workflow; OWL-S; RDF; SPARQL
24	Linked Data Quality Assessment and its Application to Societal Progress Measurement Zaveri, Amrapali 19 May 2015 In recent years, the Linked Data (LD) paradigm has emerged as a simple mechanism for employing the Web as a medium for data and knowledge integration where both documents and data are linked. Moreover, the semantics and structure of the underlying data are kept intact, making this the Semantic Web. LD essentially entails a set of best practices for publishing and connecting structured data on the Web, which allows publishing and exchanging information in an interoperable and reusable fashion. Many different communities on the Internet, such as geographic, media, life sciences and government, have already adopted these LD principles. This is confirmed by the dramatically growing Linked Data Web, where currently more than 50 billion facts are represented. With the emergence of the Web of Linked Data, several use cases have become possible thanks to the rich and disparate data integrated into one global information space. Linked Data, in these cases, not only assists in building mashups by interlinking heterogeneous and dispersed data from multiple sources but also empowers the uncovering of meaningful and impactful relationships. These discoveries have paved the way for scientists to explore the existing data and uncover meaningful outcomes that they might not have been aware of previously. In all these use cases utilizing LD, one crippling problem is the underlying data quality. Incomplete, inconsistent or inaccurate data gravely affects the end results, making them unreliable. Data quality is commonly conceived as fitness for use, be it for a certain application or use case. Datasets that contain quality problems may still be useful for certain applications, depending on the use case at hand.
Thus, LD consumption has to deal with the problem of getting the data into a state in which it can be exploited for real use cases. Insufficient data quality can be caused by the LD publication process or can be intrinsic to the data source itself. A key challenge is to assess the quality of datasets published on the Web and make this quality information explicit. Assessing data quality is particularly challenging in LD because the underlying data stems from a set of multiple, autonomous and evolving data sources. Moreover, the dynamic nature of LD makes quality assessment crucial for measuring how accurately the data represents the real world. On the document Web, data quality can only be indirectly or vaguely defined, but LD requires more concrete and measurable data quality metrics. Such metrics include correctness of facts with respect to the real world, adequacy of semantic representation, quality of interlinks, interoperability, timeliness, and consistency with regard to implicit information. Even though data quality is an important concept in LD, few methodologies have been proposed to assess the quality of these datasets. Thus, in this thesis, we first unify 18 data quality dimensions and provide a total of 69 metrics for the assessment of LD. The first methodology employs LD experts for the assessment. This assessment is performed with the help of the TripleCheckMate tool, which was developed specifically to assist LD experts in assessing the quality of a dataset, in this case DBpedia. The second methodology is a semi-automatic process, in which the first phase involves the detection of common quality problems by the automatic creation of an extended schema for DBpedia. The second phase involves the manual verification of the generated schema axioms. Thereafter, we employ the wisdom of the crowds, i.e., workers on online crowdsourcing platforms such as Amazon Mechanical Turk (MTurk), to assess the quality of DBpedia.
We then compare the two approaches (the previous assessment by LD experts and the assessment by MTurk workers in this study) in order to measure the feasibility of each type of user-driven data quality assessment methodology. Additionally, we evaluate another semi-automated methodology for LD quality assessment, which also involves human judgement. In this semi-automated methodology, selected metrics are formally defined and implemented as part of a tool, namely R2RLint. The user is provided not only with the results of the assessment but also with the specific entities that cause the errors, which helps users understand the quality issues and fix them. Finally, we consider a domain-specific use case that consumes LD and relies on data quality. In particular, we identify four LD sources, assess their quality using the R2RLint tool and then utilize them in building the Health Economic Research (HER) Observatory. We show the advantages of this semi-automated assessment over the other types of quality assessment methodologies discussed earlier. The Observatory aims at evaluating the impact of research development on the economic and healthcare performance of each country per year. We illustrate the usefulness of LD in this use case and the importance of quality assessment for any data analysis. Keywords: Linked Data; Data Quality; Semantic Web; ddc:500
25	[en] CRAWLING THE LINKED DATA CLOUD / [pt] COLETA DE DADOS INTERLIGADOS RAPHAEL DO VALE AMARAL GOMES 26 April 2016 The Linked Data best practices recommend publishing a new tripleset using well-known ontologies and interlinking the new tripleset with existing triplesets. However, both are difficult tasks. This thesis describes frameworks for building metadata crawlers that help select the ontologies and triplesets to be used, respectively, in the publication and interlinking processes. Briefly, the publisher of a new tripleset first selects a set of terms that describe the application domain of interest. The publisher then submits the set of terms to a metadata crawler, constructed using one of the frameworks described in the thesis, which searches for triplesets whose vocabularies include terms directly or transitively related to those in the initial set. The crawler returns a list of ontologies that can be used for publishing the new tripleset, as well as a list of triplesets with which the new tripleset can be interlinked. Hence, the crawler focuses on specific metadata properties, including subclass-of relationships, and returns only metadata, which justifies calling it a metadata-focused crawler. Keywords: Linked Data; Focused Crawlers; Tripleset Recommendation
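The search step described in this abstract (finding triplesets whose vocabularies contain terms directly or transitively related to the publisher's seed terms) can be illustrated with a small sketch. It assumes the subclass metadata has already been collected into an in-memory dictionary; the term names, tripleset names and both function names are hypothetical, and a real crawler would fetch this metadata from the triplesets themselves.

```python
from collections import deque

# Hypothetical subclass metadata: child term -> set of parent terms.
SUBCLASS_OF = {
    "ex:Museum": {"ex:Building"},
    "ex:Building": {"ex:Place"},
    "ex:Person": {"ex:Agent"},
}

# Hypothetical index: tripleset name -> vocabulary terms it uses.
TRIPLESET_VOCAB = {
    "tripleset-a": {"ex:Place", "ex:Agent"},
    "tripleset-b": {"ex:Person"},
    "tripleset-c": {"ex:Event"},
}


def related_terms(seeds, subclass_of):
    """Return the seeds plus every term reachable via subclass links (BFS)."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        term = queue.popleft()
        for parent in subclass_of.get(term, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def candidate_triplesets(seeds, subclass_of, vocab_index):
    """List triplesets whose vocabulary shares at least one related term."""
    terms = related_terms(seeds, subclass_of)
    return sorted(name for name, vocab in vocab_index.items() if vocab & terms)


print(candidate_triplesets({"ex:Museum"}, SUBCLASS_OF, TRIPLESET_VOCAB))
```

The sketch follows subclass links in one direction only (towards superclasses); whether the thesis also follows them towards subclasses, or uses other properties, is not stated in the abstract.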
26	Uma abordagem para publicação de visões RDF de dados relacionais / One approach to publishing RDF views of relational data Luis Eufrasio Teixeira Neto 21 January 2014 The Linked Data initiative brought new opportunities for building the next generation of Web applications. However, the full potential of linked data depends on how easy it is to transform data stored in conventional relational databases into RDF triples. Recently, the W3C RDB2RDF Working Group proposed a standard mapping language, called R2RML, to specify customized mappings between relational schemas and target RDF vocabularies. However, generating customized R2RML mappings is not an easy task. It is therefore necessary to define: (a) a solution that maps concepts from a relational schema to terms of an RDF schema; (b) a process to support the publication of relational data as RDF; and (c) a tool that facilitates applying this process. Correspondence assertions are proposed to formalize the mappings between relational schemas and RDF schemas. Views are used to publish data from a database in a new structure or schema. Defining RDF views over relational data makes this data available in terms of an OWL ontology without having to change the database schema. In this work, we propose a three-tier architecture (database, SQL views and RDF views) in which the SQL views layer maps the database concepts into the terms of the RDF views layer. This intermediate layer facilitates the generation of R2RML mappings and prevents changes in the data layer from forcing changes to the R2RML mappings. Additionally, we define a three-step process to generate the RDF views of relational data. In the first step, the user defines the schema of the relational database and the target OWL ontology, and then creates correspondence assertions that map the concepts of the relational schema to terms of the target ontology. From these assertions, an exported ontology is generated automatically. The second step automatically produces the SQL views derived from the exported ontology and an R2RML mapping between these views and the exported ontology. In the third step, the RDF views are published at a SPARQL endpoint. This dissertation describes a formalization of the correspondence assertions, the three-tier architecture, the publishing process steps, the algorithms needed, a tool that supports the entire process and a case study to validate the results obtained. Keywords: Relational Databases; Semantic Web; Linked Data; Computer Science
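To make the idea of an RDF view over relational data concrete, the sketch below turns rows from a SQL view into N-Triples lines using a small mapping dictionary. This is an illustration only: the dictionary is a simplified stand-in for what an R2RML mapping expresses (a subject URI template, a class, and one predicate per column), not R2RML syntax, and the view name, URIs and function names are hypothetical rather than taken from the dissertation.

```python
# Hypothetical rows, as they might come out of a SQL view named "person_view".
ROWS = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Alan", "city": None},
]

# Hypothetical mapping: subject URI template, class, and column -> predicate.
MAPPING = {
    "subject_template": "http://example.org/person/{id}",
    "class": "http://xmlns.com/foaf/0.1/Person",
    "columns": {
        "name": "http://xmlns.com/foaf/0.1/name",
        "city": "http://example.org/vocab/city",
    },
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(rows, mapping):
    """Yield a type triple per row and one triple per non-NULL mapped column."""
    for row in rows:
        subject = mapping["subject_template"].format(**row)
        yield f"<{subject}> <{RDF_TYPE}> <{mapping['class']}> ."
        for column, predicate in mapping["columns"].items():
            if row.get(column) is not None:
                yield f'<{subject}> <{predicate}> "{escape_literal(row[column])}" .'


for line in rows_to_ntriples(ROWS, MAPPING):
    print(line)
```

Because the mapping refers only to the view's column names, the underlying tables can change without touching the mapping as long as the view keeps the same columns, which is the benefit the abstract attributes to the intermediate SQL views layer.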
27	Um Ambiente para Processamento de Consultas Federadas em Linked Data Mashups / An Environment for Federated Query Processing in Linked Data Mashups Regis Pires Magalhães 25 May 2012 Coordenação de Aperfeiçoamento de Pessoal de Nível Superior Semantic Web technologies such as the RDF model, URIs and the SPARQL query language can reduce the complexity of data integration by making use of properly established and described links between sources. However, the difficulty of formulating distributed queries has been an obstacle to harnessing the potential of these technologies, owing to the autonomy, distribution and heterogeneous vocabularies of the data sources. This scenario demands efficient mechanisms for integrating data on Linked Data. Linked Data Mashups allow users to query and integrate structured and linked data on the Web. This work proposes two architectures for Linked Data Mashups: one based on the use of mediators and the other based on the use of Linked Data Mashup Services (LIDMS). A module for efficient execution of federated query plans on Linked Data has been developed and is a component common to both proposed architectures. The feasibility of the execution module has been demonstrated through experiments. Furthermore, a Web environment for executing LIDMS has been defined and implemented as a contribution of this work. Keywords: Linked Data Mashups; Data Integration; Federated Queries; Computer Science
28	Hromadná extrakce dat veřejné správy do RDF / Bulk extraction of public administration data to RDF Pomykacz, Michal January 2013 The purpose of this work was to extract data from various formats (HTML, XML, XLS) and transform it for further processing. The data sources were Czech public contracts and related code lists and classifications. The main goal was to implement periodic data extraction, transformation to RDF, and publication of the output as Linked Data through a SPARQL endpoint. This required designing and implementing extraction modules for the UnifiedViews tool, which was used for the periodic extractions. The theoretical part of the thesis explains the principles of linked data and the key tools used for data extraction and manipulation. The practical part deals with the design and implementation of the extractors, showing methods for parsing data in various dataset formats and transforming it to RDF. The conclusion presents the success of each extractor implementation along with thoughts on its usability in the real world.
29	Predição de tags usando linked data: um estudo de caso no banco de dados Arquigrafia / Tag prediction using linked data: a case study in the Arquigrafia database Ricardo Augusto Teixeira de Souza 17 December 2013 Given the large volume of user-created content on the Web, one way to support search and organization is to build tagging systems, usually based on keywords that are either extracted from the content itself or suggested by users. This work applies a data mining algorithm to an RDF database, whose instances may reference the DBpedia Linked Data repository, to recommend tags using taxonomic, relational and literal similarity measures over RDF descriptions. The database used is Arquigrafia, a Web database system whose goal is to catalog images of architectural projects and which allows users to add tags to the images. Experiments were performed to evaluate the quality of the tag recommendations under different models of the Arquigrafia database, including an extended model that references DBpedia. The results show that the recommendation quality for certain tags can improve when different models (with references to the DBpedia Linked Data repository) are considered in the learning phase. Keywords: Linked Data; Data Mining; Semantic Web; Tag Recommendation
30	Nástroj pro vytváření konfigurací vizuálního procházení znalostních grafů / A tool for configuring knowledge graph visual browser Emeiri, Mahran January 2021 The main aim of this research is to provide a tool that helps users create, manage and validate configuration files visually. The tool compiles the user input into a valid RDF representation of the configuration, which can be published as a linked open data resource. These configurations can then be used as input to a Knowledge Graph Browser and visualized as an interactive knowledge graph.

Search results