11 |
Semantisk interoperabilitet för hantering av XML [Semantic interoperability for handling XML]
Lindgren, Ida; Norman, Isabelle (January 2014)
Business Analytics is increasingly used by organizations as a basis for decision-making. Analyzing data from several sources with Business Analytics requires interoperability between those sources, so that the data can be interpreted correctly. The aim of this study is therefore to investigate whether it is possible to create an IT artifact that can retrieve data from several XML documents with different structures, achieving semantic interoperability and thus enabling Business Analytics. By different structures we mean that the tag names differ linguistically but carry the same semantic meaning. The solution was created using the Design Science research strategy, under which the knowledge contribution is an IT artifact. The artifact is a proof of concept demonstrating that a solution to the semantic problems identified in this report can be implemented. The result of the development is a flexible application with which a user can link and retrieve data from XML files with different structures. The link is created by letting the user build and use an ontology containing the words used as tags in the XML files. By using ontologies in this way, we show that it is possible to achieve semantic interoperability between XML files with different structures. From the resulting artifact we conclude that a general solution to this type of problem can be created.
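The ontology idea in this abstract can be illustrated with a minimal sketch. This is not the thesis's artifact: the ontology contents, the sample tag names, and the `extract` helper are all assumptions made for illustration. It shows how a user-built mapping from concepts to tag names lets one query run over XML documents whose tags differ only in language.

```python
import xml.etree.ElementTree as ET

# Hypothetical user-built ontology: each concept lists the tag names that carry it.
ONTOLOGY = {
    "customer": {"customer", "kund"},
    "name": {"name", "namn"},
    "price": {"price", "pris"},
}

# Reverse lookup from a concrete tag name to its concept.
TAG_TO_CONCEPT = {
    tag: concept for concept, tags in ONTOLOGY.items() for tag in tags
}


def extract(xml_text, concept):
    """Return the text of every element whose tag maps to `concept`."""
    root = ET.fromstring(xml_text)
    return [
        (el.text or "").strip()
        for el in root.iter()
        if TAG_TO_CONCEPT.get(el.tag.lower()) == concept
    ]


# Two documents with the same meaning but differently named tags (illustrative data).
DOC_EN = "<orders><customer><name>Ada</name></customer><price>10</price></orders>"
DOC_SV = "<ordrar><kund><namn>Bo</namn></kund><pris>20</pris></ordrar>"

names = extract(DOC_EN, "name") + extract(DOC_SV, "name")
prices = extract(DOC_EN, "price") + extract(DOC_SV, "price")
```

In this sketch, supporting a further document only requires adding its tag names to the ontology; the query code stays unchanged.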
|
12 |
Uma técnica de indexação de dados semi-estruturados para o processamento eficiente de consultas com ramificação [A semi-structured data indexing technique for the efficient processing of branching queries]
Viana, Talles Brito (20 April 2012)
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES
The explosive growth of web-based information systems has created various sources and vast
quantities of semi-structured data, which need to be indexed by search engines in order to
allow the retrieval of documents according to user needs. However, one of the major
challenges in the development of indexing techniques for semi-structured data is related to
how to index not only textual but also structural content. The main issue is how to efficiently
handle branching path expressions without introducing precision loss as well as undesired
growth of query processing costs and index file sizes. Several proposals for indexing semi-structured
data can be found in the literature. Despite their relevant contributions, existing
proposals suffer from at least one of the problems related to precision loss, storage space
requirements and query processing costs. In this context, this thesis proposes an efficient,
lossless path-based indexing technique (named BranchGuide) for semi-structured data,
which deals with a well-defined class of branching path expressions. This class
includes branching paths that express parent-child dependencies between elements,
with optional restrictions on the textual values of the attributes of those elements. As
shown by experimental evaluation, the adoption of the BranchGuide technique results in
excellent query processing time and generates smaller index files than a structural join
indexing technique.
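To make the query class concrete, here is a minimal sketch of a path-based index answering one branching path with a parent-child dependency and an attribute restriction. It is not the BranchGuide structure itself, whose internals the abstract does not describe. The functions `build_index` and `query`, the sample document, and the query shape `trunk[branch/@attr = value]/target` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(root):
    """Index every element by its root-to-node label path."""
    path_index = defaultdict(list)   # label path -> node ids in document order
    nodes = {}                       # node id -> (parent id, attributes, text)
    counter = 0
    stack = [(root, (), None)]
    while stack:
        el, parent_path, parent_id = stack.pop()
        node_id = counter
        counter += 1
        path = parent_path + (el.tag,)
        path_index[path].append(node_id)
        nodes[node_id] = (parent_id, dict(el.attrib), (el.text or "").strip())
        # Push children in reverse so they are popped in document order.
        for child in reversed(list(el)):
            stack.append((child, path, node_id))
    return path_index, nodes


def query(path_index, nodes, trunk, branch, attr, value, target):
    """Evaluate trunk[branch/@attr = value]/target using parent-child links."""
    # Trunk nodes that own a child satisfying the attribute restriction.
    matching = {
        nodes[n][0]
        for n in path_index.get(trunk + (branch,), [])
        if nodes[n][1].get(attr) == value
    }
    # Target children whose parent passed the branch predicate.
    return [
        nodes[n][2]
        for n in path_index.get(trunk + (target,), [])
        if nodes[n][0] in matching
    ]


# Illustrative document, not taken from the thesis.
DOC = (
    "<library>"
    '<book><author country="BR"/><title>First</title></book>'
    '<book><author country="SE"/><title>Second</title></book>'
    '<book><author country="BR"/><title>Third</title></book>'
    "</library>"
)

path_index, nodes = build_index(ET.fromstring(DOC))
titles = query(path_index, nodes, ("library", "book"), "author", "country", "BR", "title")
```

Because the branch and the target are matched through the same parent node identifier, the sketch returns no false positives. That is the kind of precision the abstract calls lossless, although the thesis may achieve it by different means.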
|
13 |
ADVANCED INTERFACE FOR QUERYING GRAPH DATA
Mayes, Stephen Frederick (January 2008)
No description available.
|
14 |
[en] QEEF: AN EXTENSIBLE QUERY EXECUTION ENGINE / [pt] QEEF: UMA MÁQUINA DE EXECUÇÃO DE CONSULTAS
FAUSTO VERAS MARANHAO AYRES (30 June 2004)
[en] Query processing in traditional Database Management
Systems (DBMS) has been extensively studied in the
literature and widely adopted in industry. This success is
due, in part, to the efficiency of their Query Execution
Engines (QEE) in supporting the traditional query
execution model. The advent of new application scenarios,
mainly due to the web computational model, has motivated
research on new execution models, such as the adaptive and
continuous models, and on semistructured data models, such
as XML, neither of which is natively supported by
traditional query engines. This thesis proposes the
development of a QEE that is extensible with respect to
different execution and data models. To achieve this goal,
we use a software framework technique to produce the Query
Execution Engine Framework (QEEF). Moreover, the proposal
treats the execution model and the data model orthogonally,
which allows the evaluation of query execution plans (QEP)
with fragments in different models. The extensibility of
the solution is reflected in a meta-model named QUEM
(QUery Execution Meta-model), which can express different
models in a meta-QEP. The QEEF pre-processes a meta-QEP
and produces a final QEP to be evaluated by the
instantiated QEE. As part of the validation of this
proposal, the QEEF was instantiated for different
execution and data models.
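The step from a declarative meta-plan to an executable plan can be illustrated with a small sketch. This is not the QEEF or QUEM API. The operator registry, the `instantiate` function, the nested-dictionary plan format, and the sample rows are assumptions made for illustration, using a simple pull-based iterator model.

```python
# Registry of operator implementations; new execution or data models
# would plug in here by registering further operators.
OPERATORS = {}


def operator(name):
    """Register a function as the implementation of a plan operator."""
    def register(fn):
        OPERATORS[name] = fn
        return fn
    return register


@operator("scan")
def scan(rows):
    # Leaf operator: yields stored rows one at a time.
    yield from rows


@operator("select")
def select(child, predicate):
    # Passes through only the rows that satisfy the predicate.
    return (row for row in child if predicate(row))


@operator("project")
def project(child, fields):
    # Keeps only the requested fields of each row.
    return ({f: row[f] for f in fields} for row in child)


def instantiate(meta_plan):
    """Turn a declarative meta-plan (nested dicts) into a chain of iterators."""
    args = dict(meta_plan.get("args", {}))
    if "child" in meta_plan:
        args["child"] = instantiate(meta_plan["child"])
    return OPERATORS[meta_plan["op"]](**args)


# Illustrative meta-plan: project(select(scan(rows))).
META_PLAN = {
    "op": "project",
    "args": {"fields": ["title"]},
    "child": {
        "op": "select",
        "args": {"predicate": lambda row: row["year"] > 2000},
        "child": {
            "op": "scan",
            "args": {"rows": [
                {"title": "a", "year": 2004},
                {"title": "b", "year": 1994},
            ]},
        },
    },
}

result = list(instantiate(META_PLAN))
```

In this sketch, the plan description stays separate from the operator implementations, so an operator for another data model (for example, one that scans XML elements) could be registered without changing `instantiate`.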
|
15 |
Streamlining Certification Management with Automation and Certification Retrieval : System development using ABP Framework, Angular, and MongoDB / Effektivisering av certifikathantering med automatisering och certifikathämtning : Systemutveckling med ABP Framework, Angular och MongoDB
Hassan, Nour Al Dine (January 2024)
This thesis examines the certification management challenge faced by Integrity360. The current decentralized approach, characterized by manual processes and disparate data sources, leads to inefficient tracking of certification status and study progress. The main objective of this project was to construct a system that automates data retrieval, ensures complete auditing, and increases security and privacy. Leveraging the ASP.NET Boilerplate (ABP) framework, Angular, and MongoDB, an efficient and scalable system was designed and built on domain-driven design (DDD) principles for a modular and maintainable architecture. The implemented system automates data retrieval from the Credly API, tracks exam information, manages exam vouchers, and implements a reliable authentication system with role-based access control. Although time limitations prevented full implementation of all planned features, such as a dashboard with aggregated charts and automatic report generation, the platform significantly increases the efficiency and precision of employee certification management. Future work will add these advanced functionalities, along with integrations with external platforms, to improve the system and increase its impact on operations at Integrity360.
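Role-based access control combined with auditing, as mentioned in this abstract, can be sketched in a few lines. The sketch is written in Python for illustration only and does not use the ABP, Angular, or Credly APIs. The role names, permission names, and the `assign_voucher` function are hypothetical, since the abstract does not list them.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission table; the abstract does not name its roles.
ROLE_PERMISSIONS = {
    "employee": {"view_own_certifications"},
    "manager": {"view_own_certifications", "assign_voucher"},
}


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)


def is_allowed(user, permission):
    """A user may act if any of their roles grants the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set()) for role in user.roles
    )


def assign_voucher(actor, employee, voucher_code, audit_log):
    """Assign an exam voucher, recording every attempt for auditing."""
    allowed = is_allowed(actor, "assign_voucher")
    # Log the attempt before enforcing, so denied attempts are audited too.
    audit_log.append((actor.name, "assign_voucher", employee.name, allowed))
    if not allowed:
        raise PermissionError(f"{actor.name} may not assign vouchers")
    return {"employee": employee.name, "voucher": voucher_code}


audit_log = []
manager = User("manager-1", {"manager"})
employee = User("employee-1", {"employee"})

assignment = assign_voucher(manager, employee, "VOUCHER-001", audit_log)
try:
    assign_voucher(employee, manager, "VOUCHER-002", audit_log)
    denied = False
except PermissionError:
    denied = True
```

Writing the audit entry before the permission check is enforced means that refused attempts also appear in the log, which is one way to approach the complete auditing the abstract aims for.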
|