Global ETD Search

341	SISTEMA INTEGRADO DE MONITORAMENTO E CONTROLE DA QUALIDADE DE COMBUSTÍVEL / INTEGRATED SYSTEMS OF TRACKING AND QUALITY CONTROL OF FUEL Marques, Delano Brandes 27 February 2004 This work studies the deployment of an Integrated System that, besides allowing better, more efficient and more practical monitoring, makes it possible to control and optimize problems related to the oil industry. To guarantee fuel quality and standardization, it is indispensable to develop efficient tools that allow monitoring from any location and for any type of fuel. Given the variety of criteria, decision making should be based on the evaluation of many types of spatial and non-spatial data. To this end, the Knowledge Discovery in Databases (KDD) process is used, with emphasis on the Data Warehouse and Data Mining steps combined with a Geographic Information System. The system is intended to cover several fuel monitoring regions. Based on a survey and analysis of the information held in the ANP databases, a Data Warehouse model was proposed. Data Mining techniques (Principal Component Analysis, Cluster Analysis and Multiple Regression) were then applied in order to obtain knowledge (patterns). Fuel Analysis KDD process Data Warehouse Data Mining Geographic Information Systems
342	Nutzung von Datenbankdiensten in Data-Warehouse-Anwendungen Schlesinger, Lutz, Lehner, Wolfgang, Hümmer, Wolfgang, Bauer, Andreas 26 November 2020 The interplay between the application and the underlying database system is crucial for analyzing the data stored in a data warehouse system efficiently. This paper classifies and discusses different ways of coupling data warehouse applications with the database system so that complex OLAP scenarios are computed inside the database system rather than outside it in the application. Four categories are discussed and compared in detail: language extension (SQL), the design of a new application-specific query language (MDX), the use of specific object models (JOLAP), and XML-based web services (XCube). info:eu-repo/classification/ddc/004 ddc:004 info:eu-repo/classification/ddc/620 ddc:620
343	Cardinality estimation in ETL processes Lehner, Wolfgang, Thiele, Maik, Kiefer, Tim 22 April 2022 Cardinality estimation in ETL processes is particularly difficult. Aside from the well-known SQL operators, which are also used in ETL processes, there are a variety of operators without exact counterparts in the relational world. In addition to those, we find operators that support very specific data integration aspects. For such operators, there are no well-examined statistical approaches for cardinality estimation. Therefore, we propose a black-box approach and estimate the cardinality using a set of statistical models for each operator. We discuss different model granularities and develop an adaptive cardinality estimation framework for ETL processes. We map the abstract model operators to specific statistical learning approaches (regression, decision trees, support vector machines, etc.) and evaluate our cardinality estimations in an extensive experimental study. info:eu-repo/classification/ddc/004 ddc:004
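As a rough illustration of the black-box idea in entry 343, the sketch below fits a simple linear regression that predicts an operator's output cardinality from its input cardinality. It is not the authors' framework: the function names and the observed numbers are invented for illustration, and a single straight line stands in for the richer model families the abstract lists.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical past runs of one opaque ETL operator (made-up numbers):
# rows going in versus rows coming out.
observed_in = [1000, 2000, 4000, 8000]
observed_out = [800, 1600, 3200, 6400]

slope, intercept = fit_line(observed_in, observed_out)


def estimate_cardinality(rows_in):
    """Predict output rows for an unseen input size; never negative."""
    return max(0.0, slope * rows_in + intercept)


print(round(estimate_cardinality(5000)))
```

The operator is treated purely as a black box: only its observed input and output sizes feed the model, so the same code applies to operators with no relational counterpart.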
344	How to Juggle Columns: An Entropy-Based Approach for Table Compression Paradies, Marcus, Lemke, Christian, Plattner, Hasso, Lehner, Wolfgang, Sattler, Kai-Uwe, Zeier, Alexander, Krueger, Jens 25 August 2022 Many relational databases exhibit complex dependencies between data attributes, caused either by the nature of the underlying data or by explicitly denormalized schemas. In data warehouse scenarios, calculated key figures may be materialized or hierarchy levels may be held within a single dimension table. Such column correlations and the resulting data redundancy may result in additional storage requirements. They may also result in bad query performance if inappropriate independence assumptions are made during query compilation. In this paper, we tackle the specific problem of detecting functional dependencies between columns to improve the compression rate for column-based database systems, which both reduces main memory consumption and improves query performance. Although a huge variety of algorithms have been proposed for detecting column dependencies in databases, we maintain that increased data volumes and recent developments in hardware architectures demand novel algorithms with much lower runtime overhead and smaller memory footprint. Our novel approach is based on entropy estimations and exploits a combination of sampling and multiple heuristics to render it applicable for a wide range of use cases. We demonstrate the quality of our approach by means of an implementation within the SAP NetWeaver Business Warehouse Accelerator. Our experiments indicate that our approach scales well with the number of columns and produces reliable dependence structure information. This both reduces memory consumption and improves performance for nontrivial queries. info:eu-repo/classification/ddc/004 ddc:004
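The link between entropy and functional dependencies in entry 344 can be sketched with a standard information-theoretic fact: column A functionally determines column B exactly when the joint entropy H(A, B) equals H(A). The code below checks this exactly on a tiny in-memory table. It is only an illustration, not the paper's algorithm, which relies on entropy estimation, sampling and heuristics; the function names and sample data are assumptions.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy in bits of the empirical distribution of values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def determines(col_a, col_b, tol=1e-9):
    """True if col_a functionally determines col_b on this data,
    i.e. H(A, B) == H(A) up to a small numerical tolerance."""
    joint = list(zip(col_a, col_b))
    return abs(entropy(joint) - entropy(col_a)) <= tol


# Illustrative dimension-table columns (invented data).
city = ["Dresden", "Leipzig", "Dresden", "Munich"]
state = ["Saxony", "Saxony", "Saxony", "Bavaria"]

print(determines(city, state))   # each city maps to exactly one state
print(determines(state, city))   # one state maps to several cities
```

A column that is determined by another carries no extra information, which is what makes such pairs candidates for better compression.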
345	Transparent Forecasting Strategies in Database Management Systems Fischer, Ulrike, Lehner, Wolfgang 02 February 2023 Whereas traditional data warehouse systems assume that data is complete or has been carefully preprocessed, more and more data is imprecise, incomplete, and inconsistent. This is especially true in the context of big data, where massive amounts of data arrive continuously in real time from a vast number of data sources. Nevertheless, modern data analysis involves sophisticated statistical algorithms that go well beyond traditional BI and, additionally, is increasingly performed by non-expert users. Both trends require transparent data mining techniques that efficiently handle missing data and present a complete view of the database to the user. Time series forecasting estimates future, not yet available, data of a time series and represents one way of dealing with missing data. Moreover, it enables queries that retrieve a view of the database at any point in time: past, present, and future. This article presents an overview of forecasting techniques in database management systems. After discussing possible application areas for time series forecasting, we give a short mathematical background of the main forecasting concepts. We then outline various general strategies for integrating time series forecasting inside a database and discuss some individual techniques from the database community. We conclude this article by introducing a novel forecasting-enabled database management architecture that natively and transparently integrates forecast models. info:eu-repo/classification/ddc/004 ddc:004
346	A Sample Advisor for Approximate Query Processing Rösch, Philipp, Lehner, Wolfgang 25 January 2023 The rapid growth of current data warehouse systems makes random sampling a crucial component of modern data management systems. Although there is a large body of work on database sampling, the problem of automatic sample selection has remained (almost) unaddressed. In this paper, we tackle the problem with a sample advisor. We propose a cost model to evaluate a sample for a given query. Based on this, our sample advisor determines the optimal set of samples for a given set of queries specified by an expert. We further propose an extension to utilize recorded workload information. In this case, the sample advisor takes the set of queries and a given memory bound into account for the computation of a sample advice. Additionally, we consider the merging of samples in the case of overlapping sample advice and present both an exact and a heuristic solution. Within our evaluation, we analyze the properties of the cost model and compare the proposed algorithms. We further demonstrate the effectiveness and the efficiency of the heuristic solutions with a variety of experiments. info:eu-repo/classification/ddc/004 ddc:004
347	Echtzeit-Data-Warehouse-Systeme Thiele, Maik, Lehner, Wolfgang 26 January 2023 The increasingly central role of data warehouses at all decision-making levels of a company leads to a demand for highly up-to-date data and real-time-capable data warehouse systems. This article asks to what extent existing data warehouse architectures can guarantee information delivery in real time, exposes the weaknesses of these architectures, and discusses various solution approaches. info:eu-repo/classification/ddc/004 ddc:004
348	Die Datenbankforschungsgruppe der Technischen Universität Dresden stellt sich vor Lehner, Wolfgang 27 January 2023 In autumn 2012, the Database Chair at Technische Universität Dresden celebrates its 10th anniversary under the leadership of Wolfgang Lehner. During this period, the research focus on database support for analyzing large data sets was sharpened further and significantly broadened at the system level. The research group around Wolfgang Lehner is visible internationally through publications and collaborations and is also active in regional research alliances, both to participate in Dresden's extremely young and agile software industry and, as far as a research group is able, to support it. [From: Introduction] info:eu-repo/classification/ddc/004 ddc:004
349	Merging OLTP and OLAP: Back to the Future Lehner, Wolfgang 13 January 2023 When the terms “Data Warehousing” and “Online Analytical Processing” were coined in the 1990s by Kimball, Codd, and others, there was an obvious need to separate data and workload for operational, transactional-style processing from those for decision-making, which implies complex analytical queries over large and historic data sets. Large data warehouse infrastructures have been set up to cope with the special requirements of analytical query answering for multiple reasons: For example, analytical thinking heavily relies on predefined navigation paths to guide the user through the data set and to provide different views on different aggregation levels. Multi-dimensional queries exploiting hierarchically structured dimensions lead to complex star queries at a relational backend, which could hardly be handled by classical relational systems. [From: Introduction] info:eu-repo/classification/ddc/004 ddc:004
350	Duomenų gavimas iš daugialypių šaltinių ir jų struktūrizavimas / Data Mining from Multiple Sources and Structurization Barauskas, Antanas 19 June 2014 The aim of this work is to create an ETL (Extract-Transform-Load) system that extracts data from different types of data sources, transforms the extracted data appropriately, and only then loads it into the selected storage location. The main data extraction techniques and the most popular ETL tools currently available have been analyzed. A cloud-based, multi-component architecture and a prototype of a system for extracting data from multiple sources and structuring it in a unified format have been created. Unlike traditional systems that accumulate data, the proposed system extracts data only when it is needed. A graph database is used for storage, which makes it possible to store not only the data but also information about the relations between entities. Structure: 48 pages, 19 figures, 10 tables and 30 references. Informatics Data extraction Data transformation Data loading Data warehouse

Search results