71.
E-model: event-based graph data model theory and implementation. Kim, Pilho. 06 July 2009
The need to manage disparate data models is growing across all IT areas. Emerging hybrid relational-XML systems are being developed in this context to support both relational and XML data models. However, there are ever-growing needs for adequate data models for text and multimedia, which require proper storage; their ability to coexist and collaborate with other data models is as important as that of a relational-XML hybrid. This work proposes a new data model, named E-model, that supports rich relations and reflects the dynamic nature of information. The E-model introduces abstract data-typing objects and relation rules that support: (1) the notion of time in object definitions and relations, (2) multiple-type relations, (3) complex schema modeling using a relational directed acyclic graph, and (4) interoperation with popular data models. To implement the E-model prototype, extensive data-operation APIs were developed on top of a relational database. In processing dynamic queries, the prototype achieves an order-of-magnitude speed improvement over popular data models. Building on these APIs, a new language named EML is proposed. EML extends the SQL-89 standard with several E-model features: (1) unstructured queries, (2) unified object namespaces, (3) temporal queries, (4) ranking orders, (5) path queries, and (6) semantic expansions. With its rich relations and flexible structure, the E-model system can interoperate with popular data models and support complex ones. It can act as a stand-alone database server, provide materialized views for interoperation with other data models, or coexist with established database systems as a centralized online archive or a proxy database server. Because the prototype is implemented on top of a relational database, it benefits substantially from established database engines in application development.
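To make the E-model's core ideas concrete (typed objects connected by time-stamped, multiple-type relations), the following minimal sketch illustrates them. All class and method names here are hypothetical; the actual prototype exposes such operations through relational-database-backed APIs and the EML language, whose syntax is not reproduced here.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass(frozen=True)
class Relation:
    """A typed, time-stamped edge between two objects (hypothetical shape)."""
    source: str
    target: str
    rel_type: str         # multiple relation types are first-class
    valid_from: datetime  # the notion of time inside the relation itself

@dataclass
class EModelGraph:
    """Minimal event-based graph: objects plus typed temporal relations."""
    relations: list[Relation] = field(default_factory=list)

    def relate(self, source, target, rel_type, when=None):
        self.relations.append(
            Relation(source, target, rel_type, when or datetime.now()))

    def neighbors(self, obj, rel_type=None, as_of=None):
        """A dynamic query: follow edges, optionally filtered by type and time."""
        return [r.target for r in self.relations
                if r.source == obj
                and (rel_type is None or r.rel_type == rel_type)
                and (as_of is None or r.valid_from <= as_of)]

g = EModelGraph()
g.relate("doc:42", "topic:xml", "about", datetime(2009, 1, 5))
g.relate("doc:42", "doc:7", "cites", datetime(2009, 2, 1))
print(g.neighbors("doc:42", as_of=datetime(2009, 1, 31)))  # ['topic:xml']
```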
72.
Uma arquitetura orientada a serviços para integração de redes de sensores e atuadores heterogêneos na internet das coisas / A service-oriented architecture for integrating heterogeneous sensor and actuator networks in the Internet of Things. GOMES, Yuri Farias. 16 May 2018
The vision of the Internet of Things has enabled the development of a diverse range of applications and services that were not possible before due to a number of limitations. Although some hardware difficulties remain, such as limited processing power and battery constraints, research indicates that more than 6.4 billion devices will be connected in 2016. The high diversity of these devices creates the need for infrastructures capable of managing highly heterogeneous devices and their hardware limitations. This work proposes a service-oriented architecture to integrate devices in the Internet of Things and to solve most of the problems that this integration brings. Using this architecture, services and applications can access sensors and actuators through the web using data models defined from Internet standards. The management of the nodes connected to this infrastructure is performed by a middleware connected to devices or gateways that translate information to the communication technology in use (e.g., Bluetooth, ZigBee, among others). The proposal was evaluated by developing a middleware based on the UPnP specification and an Android application that simulates sensor data. Experimental results demonstrate the feasibility of using the proposed architecture to integrate with applications, services, and other architectures available on the Internet through the web and standardized data models.
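One way to picture the middleware's role is as a translation layer that maps technology-specific readings onto a single web-facing data model. The sketch below is a simplified illustration with invented adapter names and payload formats; it is not the dissertation's UPnP-based implementation.

```python
import json

# Hypothetical adapters: each translates one communication technology's
# native payload into the middleware's uniform sensor representation.
def from_zigbee(frame: bytes) -> dict:
    value, unit = frame.decode().split(";")
    return {"kind": "temperature", "value": float(value), "unit": unit}

def from_bluetooth(payload: dict) -> dict:
    return {"kind": payload["type"], "value": payload["val"], "unit": payload["u"]}

ADAPTERS = {"zigbee": from_zigbee, "bluetooth": from_bluetooth}

def to_uniform(technology, raw):
    """Expose any device reading as the same JSON document over the web."""
    return json.dumps(ADAPTERS[technology](raw))

print(to_uniform("zigbee", b"21.5;C"))
print(to_uniform("bluetooth", {"type": "humidity", "val": 40, "u": "%"}))
```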
73.
Síntese automática de interfaces gráficas de usuário para sistemas de informação em saúde / Automatic synthesis of graphical user interfaces for health information systems. Teixeira, Iuri Malinoski. 26 February 2013
The modeling of clinical data for Health Information Systems (HIS) requires domain expertise. Model-Driven Development (MDD) techniques provide a better articulation between domain experts and HIS developers and enable a reduction in the development cost of these systems. Clinical data models based on open standard specifications such as openEHR greatly facilitate the application of MDD techniques to HISes. Nevertheless, the use of clinical data models alone does not solve the fundamental problem of the high development cost of HISes. One cause of this problem is the lack of architectural information in clinical data models. Without such architectural information, the development cost is shifted to the specification of the transformation rules from clinical data models to HIS code (rules that are fundamental in MDD techniques), since each new HIS to be generated requires the specification of a new set of rules. In this context, this work presents a strategy for HIS code generation that combines clinical data models and architectural information. In this strategy, the developer can categorize HISes into distinct families and define a set of transformation rules common to all HISes in a family. Each family is defined by a set of systems with similar architectural structures and distinct clinical data models. The expected result of this strategy is better reuse of model transformation rules. The strategy is employed to achieve the main objective of this work: the design of a transformation system for the automatic synthesis of graphical user interfaces (GUIs) for HISes, considering the openEHR specifications and some constructs present in architectural description languages (ADLs) such as Acme. As a proof of concept, the framework is applied to some HIS families.
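A rough sketch of the family-based transformation idea: one rule set, shared by a family, maps clinical data model field types to GUI widgets, so a new HIS in the family needs only a new clinical model, not new rules. The field types and widget mappings below are illustrative assumptions, not the openEHR or Acme vocabulary.

```python
# Hypothetical rule set shared by one HIS family: each rule maps a clinical
# field type to a GUI widget description (here, plain HTML strings).
WEB_FAMILY_RULES = {
    "quantity": lambda f: f'<input type="number" name="{f}">',
    "coded_text": lambda f: f'<select name="{f}"></select>',
    "date_time": lambda f: f'<input type="datetime-local" name="{f}">',
}

def synthesize_form(clinical_model, rules):
    """Generate a GUI form from a clinical data model using one family's
    transformation rules; a new model needs no new rules."""
    widgets = [rules[ftype](fname) for fname, ftype in clinical_model.items()]
    return "<form>\n  " + "\n  ".join(widgets) + "\n</form>"

# A toy clinical data model (field name -> field type), loosely openEHR-like.
blood_pressure = {"systolic": "quantity", "diastolic": "quantity",
                  "measured_at": "date_time"}
print(synthesize_form(blood_pressure, WEB_FAMILY_RULES))
```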
74.
Online horizontal partitioning of heterogeneous data. Herrmann, Kai; Voigt, Hannes; Lehner, Wolfgang. 30 November 2020
In an increasing number of use cases, databases face the challenge of managing heterogeneous data. Heterogeneous data is characterized by a quickly evolving variety of entities without a common set of attributes; these entities do not show enough regularity to be captured in a traditional database schema. A common solution is to centralize the diverse entities in a universal table, which usually leads to a very sparse table. Although today's techniques allow efficient storage of sparse universal tables, query efficiency remains a problem: queries that address only a subset of attributes have to read the whole universal table, including many irrelevant entities. A solution is to partition the table, which allows partitions of irrelevant entities to be pruned before they are touched. Creating and maintaining such a partitioning manually is very laborious or even infeasible due to the enormous complexity, so an autonomous solution is desirable. In this article, we define the Online Partitioning Problem for heterogeneous data and sketch how an optimal solution can be determined based on hypergraph partitioning. Although it leads to the optimal partitioning, the hypergraph approach is inappropriate for an implementation in a database system. We therefore present Cinderella, an autonomous online algorithm for horizontal partitioning of heterogeneous entities in universal tables. Cinderella keeps its overhead low by operating online: it incrementally assigns entities to partitions while they are touched anyway during modifications. This enables a reasonable physical database design at runtime instead of static modeling.
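The incremental flavor of Cinderella can be caricatured as a greedy rule: when an entity is written anyway, place it in the partition whose attribute set it overlaps most, so that queries over an attribute subset can prune the other partitions. The sketch below, including the width bound, is a simplification and not the published algorithm.

```python
def assign_online(entity_attrs, partitions, max_width=8):
    """Greedily place an entity in the partition with the largest attribute
    overlap; open a new partition if nothing overlaps or merging would make
    the partition too wide."""
    best, best_overlap = None, 0
    for i, part in enumerate(partitions):
        overlap = len(entity_attrs & part)
        if overlap > best_overlap and len(part | entity_attrs) <= max_width:
            best, best_overlap = i, overlap
    if best is None:
        partitions.append(set(entity_attrs))
        return len(partitions) - 1
    partitions[best] |= entity_attrs
    return best

partitions = []
for entity in [{"name", "price"}, {"name", "color"}, {"isbn", "author"}]:
    print(assign_online(entity, partitions), partitions)
```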
75.
Portál pro agregaci dat z webových zdrojů / Portal for Aggregation of Data from Web Sources. Mikita, Tibor. January 2019
This thesis deals with data extraction and aggregation from heterogeneous web sources. The goal is to create a platform and a functional web application using appropriate technologies, with the main focus on application design and implementation. The application domain is accommodation, specifically apartment rentals. For data extraction, we use a portal's API or a wrapper. The obtained data are stored in a document database. We designed and implemented a system that obtains rental ads from multiple web sources at the same time and presents them in a uniform way.
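The wrapper-based extraction and uniform presentation can be sketched as a pair of source-specific wrappers that normalize ads into one document shape before storage. Source names and field mappings are invented for illustration.

```python
# Hypothetical wrappers: each maps one source's ad format onto a shared schema.
def wrap_source_a(ad):
    return {"title": ad["headline"], "price": ad["monthly_rent"],
            "city": ad["location"], "source": "A"}

def wrap_source_b(ad):
    return {"title": ad["name"], "price": ad["price_eur"],
            "city": ad["address"]["city"], "source": "B"}

def aggregate(feeds):
    """Merge heterogeneous feeds into one uniformly shaped collection,
    ready to be stored as documents in a document database."""
    return [wrapper(ad) for wrapper, ads in feeds for ad in ads]

feeds = [(wrap_source_a, [{"headline": "2+1 flat", "monthly_rent": 650,
                           "location": "Brno"}]),
         (wrap_source_b, [{"name": "Studio", "price_eur": 480,
                           "address": {"city": "Praha"}}])]
print(aggregate(feeds))
```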
76.
Management of multidimensional aggregates for efficient online analytical processing. Lehner, Wolfgang; Albrecht, J.; Bauer, A.; Deyerling, O.; Günzel, H.; Hummer, W.; Schlesinger, J. 02 June 2022
Proper management of multidimensional aggregates is a fundamental prerequisite for efficient OLAP. The experimental OLAP server CUBESTAR, whose concepts are described here, was designed exactly for that purpose. All logical query processing is based solely on a specific algebra for multidimensional data; a relational database system is used for the physical storage of the data, so in popular terms CUBESTAR can be classified as a ROLAP system. Compared with commercially available systems, CUBESTAR is superior in two respects. First, the implemented multidimensional data model allows more adequate modeling of hierarchical dimensions, because properties that apply only to certain dimensional elements can be modeled context-sensitively; this is reflected by an extended star schema on the relational side. Second, CUBESTAR supports multidimensional query optimization by caching multidimensional aggregates. Since summary tables are not created in advance but as needed, hot spots can be adequately represented. The dynamic, partition-oriented caching method allows cost reductions of up to 60% with space requirements of less than 10% of the size of the fact table.
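The on-demand aggregate caching idea can be sketched as a memo keyed by the grouping dimensions: the first query over a grouping scans the fact table, and repeated hot-spot queries are served from the cache. The flat cache key below is an assumption; CUBESTAR's cache is partition-oriented and cost-driven.

```python
from collections import defaultdict

# A tiny fact table: (year, region, product, sales).
FACTS = [("2009", "east", "tv", 120), ("2009", "west", "tv", 80),
         ("2010", "east", "radio", 40), ("2010", "west", "tv", 95)]
DIMS = ("year", "region", "product")
_cache = {}

def aggregate(group_by):
    """Compute a multidimensional aggregate, caching it on first use so that
    repeated hot-spot queries avoid rescanning the fact table."""
    if group_by in _cache:
        return _cache[group_by]  # cache hit: no fact-table scan
    idx = [DIMS.index(d) for d in group_by]
    result = defaultdict(int)
    for row in FACTS:
        result[tuple(row[i] for i in idx)] += row[3]
    _cache[group_by] = dict(result)
    return _cache[group_by]

print(aggregate(("year",)))  # computed and cached: {('2009',): 200, ('2010',): 135}
print(aggregate(("year",)))  # served from the cache
```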
77.
Datenmodelle für fachübergreifende Wissensbasen in der interdisziplinären Anwendung / Data models for cross-disciplinary knowledge bases in interdisciplinary applications. Molch, Silke. 17 December 2019
The aim of this contribution from teaching practice is to present, by way of the teaching example of an applied engineering discipline, the approaches required for building cross-disciplinary knowledge bases and for using them within student semester projects.
78.
Clustering Uncertain Data with Possible Worlds. Lehner, Wolfgang; Volk, Peter Benjamin; Rosenthal, Frank; Hahmann, Martin; Habich, Dirk. 16 August 2022
The topic of managing uncertain data has been explored in many ways, and different methodologies for data storage and query processing have been proposed. As the availability of management systems grows, research on analytics over uncertain data is gaining importance. Similar to the challenges faced in data management, algorithms for mining uncertain data suffer a high performance degradation compared to their counterparts for certain data. To overcome this degradation, the MCDB approach was developed for uncertain data management based on the possible-worlds scenario. As this methodology shows significant performance and scalability gains, we adopt it for mining uncertain data. In this paper, we introduce a clustering methodology for uncertain data and illustrate current issues with this approach within the field of clustering uncertain data.
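The possible-worlds idea carried over from MCDB can be sketched as follows: sample concrete worlds from each uncertain object's distribution, cluster every sampled world, and compare the resulting groupings. The Gaussian model and the toy single-link clustering below are illustrative assumptions.

```python
import random

# Uncertain 1-D objects: each is (name, mean, spread) instead of one value.
UNCERTAIN = [("a", 1.0, 0.2), ("b", 1.2, 0.2), ("c", 5.0, 0.3), ("d", 5.3, 0.3)]

def sample_world(rng):
    """Draw one possible world: a concrete value for every uncertain object."""
    return {name: rng.gauss(mu, sd) for name, mu, sd in UNCERTAIN}

def cluster(world, threshold=1.5):
    """Toy single-link clustering on a line: adjacent values closer than
    the threshold end up in the same group."""
    groups = []
    for name, val in sorted(world.items(), key=lambda kv: kv[1]):
        if groups and val - groups[-1][-1][1] < threshold:
            groups[-1].append((name, val))
        else:
            groups.append([(name, val)])
    return [sorted(n for n, _ in g) for g in groups]

rng = random.Random(42)
for _ in range(3):
    print(cluster(sample_world(rng)))  # groupings tend to agree: [a, b], [c, d]
```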
79.
Forecasting the data cube. Lehner, Wolfgang; Fischer, Ulrike; Schildt, Christopher; Hartmann, Claudio. 12 January 2023
Forecasting time series data is crucial in a number of domains such as supply chain management and display advertising. In these areas, the time series data to forecast is typically organized along multiple dimensions, leading to a high number of time series that need to be forecasted. Most current approaches focus only on selecting and optimizing a forecast model for a single time series. In this paper, we explore how we can utilize time series at different dimensions to increase forecast accuracy and, optionally, reduce model maintenance overhead. Solving this problem is challenging due to the large space of possibilities and the potentially high model-creation costs. We propose a model configuration advisor that automatically determines the best set of models, a model configuration, for a given multi-dimensional data set. Our approach is based on a general process that iteratively examines more and more models and simultaneously controls the search space depending on the data set, model type, and available hardware. The final model configuration is integrated into F2DB, an extension of PostgreSQL that processes forecast queries and maintains the configuration as new data arrives. We comprehensively evaluated our approach on real and synthetic data sets. The evaluation shows that our approach significantly increases forecast query accuracy while ensuring low model costs.
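The advisor's iterative search can be caricatured as enumerating candidate (series, model) pairs under a budget and keeping the per-member model with the lowest holdout error. The candidate models and the error metric are assumptions, not F2DB's actual procedure.

```python
def holdout_error(model, series, horizon=2):
    """Mean absolute error of a model's forecasts for the held-out tail."""
    train, test = series[:-horizon], series[-horizon:]
    forecast = model(train, horizon)
    return sum(abs(f - t) for f, t in zip(forecast, test)) / horizon

# Two toy candidate model types (assumed, not F2DB's model library).
def naive(train, h):
    return [train[-1]] * h

def mean(train, h):
    return [sum(train) / len(train)] * h

def advise(series_by_member, budget=10):
    """Pick, per dimension member, the candidate model with the lowest
    holdout error, examining at most `budget` (member, model) pairs."""
    config, examined = {}, 0
    for member, series in series_by_member.items():
        best = None
        for model in (naive, mean):
            if examined >= budget:
                break
            err = holdout_error(model, series)
            examined += 1
            if best is None or err < best[0]:
                best = (err, model.__name__)
        config[member] = best[1] if best else None
    return config

data = {"east": [10, 11, 12, 13, 14], "west": [7, 7, 7, 7, 7]}
print(advise(data))  # {'east': 'naive', 'west': 'naive'}
```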
80.
Exploiting big data in time series forecasting: A cross-sectional approach. Lehner, Wolfgang; Hartmann, Claudio; Hahmann, Martin; Rosenthal, Frank. 12 January 2023
Forecasting time series data is an integral component of management, planning, and decision making. Following the Big Data trend, large amounts of time series data are available from many heterogeneous data sources in more and more application domains. The highly dynamic and often fluctuating character of these domains, combined with the logistical problems of collecting such data from a variety of sources, poses new challenges for forecasting. Traditional approaches rely heavily on extensive and complete historical data to build time series models and are thus no longer applicable if time series are short or, even more importantly, intermittent. In addition, large numbers of time series have to be forecasted on different aggregation levels, preferably with low latency, while forecast accuracy should remain high. This is almost impossible with the traditional focus on creating one forecast model for each individual time series. In this paper we tackle these challenges with a novel forecasting approach called cross-sectional forecasting. This method is especially designed for Big Data sets with a multitude of time series. Our approach breaks with existing concepts by creating only one model for a whole set of time series and requiring only a fraction of the available data to provide accurate forecasts. By utilizing available data from all time series of a data set, missing values can be compensated and accurate forecasts can be calculated quickly at arbitrary aggregation levels.
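The cross-sectional idea, one model fitted on pooled data from all series so that short or intermittent series borrow strength from the rest, can be sketched as a single least-squares fit over lag-1 pairs pooled across the whole set. The lag-1 feature and plain least squares are simplifying assumptions.

```python
def fit_pooled_lag1(series_set):
    """Fit y = a*x + b by least squares over lag-1 pairs pooled from ALL
    series, so short series borrow strength from the rest of the data set."""
    xs, ys = [], []
    for series in series_set:
        xs += series[:-1]
        ys += series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

def forecast_next(model, series):
    a, b = model
    return a * series[-1] + b

series_set = [[10, 12, 14, 16], [3, 5, 7], [20, 22]]  # the last one is very short
model = fit_pooled_lag1(series_set)
print(forecast_next(model, [20, 22]))  # 24.0: the pooled model extends it
```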