11

Formal Assessment and Measurement of Data Utilization and Value for Mines

Rogers, William Pratt January 2015 (has links)
Most large contemporary mines already have considerable amounts of data, much of which goes largely unused. The key challenge in big data is increasing data utilization. Much of the data in the mine (not the plant) comes from a variety of systems, each with its own database and reporting environment. Standard technology deployments create a "silo-ification" of data, leading to poor system usage. Through modern server monitoring, data utilization can be measured quantifiably. A host of other quantifiable, often automated, approaches to measuring data use and value can also be incorporated as a means of monitoring value generation. A data valuation tool is presented to measure the data assets at an operation. The Data Value Index (DVI) quantifies business intelligence best practices and user interaction, taking into account managerial flexibility and data utilization rates. The DVI is built from case studies of data warehousing at various mining companies, some of which are presented.
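A minimal Python sketch of how utilization measurements gathered through server monitoring might be rolled up into a single index score. The metric names, weights, and the weighted-average form are assumptions for illustration; the thesis defines the actual DVI components.

```python
# A sketch of a weighted data-value index. Metric names, weights, and the
# weighted-average form are hypothetical; the thesis defines the real DVI.

def data_value_index(metrics, weights):
    """Weighted average of normalized (0..1) data-utilization metrics."""
    total = sum(weights.values())
    return sum(metrics[name] * w for name, w in weights.items()) / total

# Example inputs: utilization rates as they might be gathered by server
# monitoring of reporting and database systems.
metrics = {
    "report_open_rate": 0.62,  # share of published reports actually opened
    "query_hit_rate": 0.48,    # share of warehouse tables touched by queries
    "freshness": 0.90,         # share of datasets updated on schedule
}
weights = {"report_open_rate": 0.4, "query_hit_rate": 0.4, "freshness": 0.2}

print(f"DVI = {data_value_index(metrics, weights):.2f}")  # DVI = 0.62
```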
12

Evaluation of view maintenance with complex joins in a data warehouse environment

Asthorsson, Kjartan January 2002 (has links)
Data warehouse maintenance and its cost have been well studied in the literature. Integrating data sources in a data warehouse environment often requires data cleaning, transformation, or other functions applied to the data. The impact on view maintenance when data is integrated using comparison operators other than those defined in a theta join has, however, not been closely examined in previous studies. This study analyzes the impact of using complex joins in a data warehouse environment, measuring how different maintenance strategies are affected when data must be integrated using comparison operators beyond those of a theta join. The analysis shows that maintenance cost increases greatly with complex joins, since such joins often lack the optimization techniques available for theta joins. The study shows, among other things, that the join-aware capability of sources is not important when performing complex joins, and that incremental view maintenance is a better approach than recomputing the view. Strategies for maintaining data warehouses when data is integrated using a complex join therefore differ from those used with theta joins, and different maintenance strategies need to be applied.
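To illustrate the cost difference, here is a minimal Python sketch of a view defined over a complex join, using a hypothetical string-similarity predicate that no theta-join comparison operator (=, <>, <, <=, >, >=) expresses. Incremental maintenance evaluates the predicate only between the delta and the other source, while recomputation re-evaluates it over the whole cross product:

```python
# The "complex join" here is a hypothetical fuzzy string match, which no
# theta-join comparison operator expresses, so the usual join optimizations
# do not apply and every candidate pair must be tested.

from difflib import SequenceMatcher

def similar(a, b, threshold=0.7):
    """Complex join condition: a similarity score instead of =, <, <=, ..."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

customers = ["Acme Corp", "Globex", "Initech"]
invoices = ["ACME Corp.", "Globex Inc", "Umbrella"]

# Recomputation: re-evaluate the predicate over the entire cross product.
view = [(c, i) for c in customers for i in invoices if similar(c, i)]

# Incremental maintenance: a new invoice is joined only against the other
# source -- len(customers) predicate evaluations instead of
# len(customers) * len(invoices) for a full recomputation.
new_invoice = "Initech LLC"
view += [(c, new_invoice) for c in customers if similar(c, new_invoice)]

print(view)
```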
13

Question Answering System in a Business Intelligence Context

Kuchmann-Beauger, Nicolas 15 February 2013 (has links)
The amount and complexity of data generated by information systems keep increasing in data warehouses. The domain of Business Intelligence (BI) aims at providing methods and tools to help users retrieve those data. Data sources are distributed over distinct locations and are usually accessible through various applications. Looking for new information can be a tedious task, as business users try to reduce their workload. To tackle this problem, Enterprise Search has emerged as a field in the last few years, taking into consideration the different corporate data sources as well as sources available to the public (e.g., World Wide Web pages). However, corporate retrieval systems still suffer from information overload. We believe that such systems would benefit from Natural Language (NL) approaches combined with question-answering (Q&A) techniques. Indeed, NL interfaces allow users to search for information in their own terms and to obtain precise answers instead of sifting through a plethora of documents. In this way, users do not have to employ exact keywords or a precise syntax, and can access new information faster. Major challenges in designing such a system are to interface different applications and their underlying query languages on the one hand, and to support users' vocabulary and be easily configurable for new application domains on the other hand. This thesis outlines an end-to-end Q&A framework for corporate use cases that can be configured in different settings. In traditional BI systems, user preferences are usually not taken into account, nor are users' specific contextual situations. State-of-the-art systems in this field, such as Soda and Safe, do not compute search results on the basis of the user's situation. This thesis introduces a more personalized approach, which better suits end users. Our main experiment takes the form of a search interface that displays results on a dashboard as charts, fact tables, and thumbnails of unstructured documents. Depending on users' initial queries, recommendations for alternative queries are also displayed, so as to reduce the overall response time of the system; in this sense, the recommendations act as predictions. Our work makes the following contributions: first, an architecture, implemented with parallel algorithms, that leverages diverse data sources, namely structured and unstructured document repositories, through an extensible Q&A framework that can easily be configured for distinct corporate settings; second, a constraint-matching-based translation approach, which replaces the traditional pivot language with a conceptual model and leads to more personalized multidimensional queries; third, a set of NL patterns for translating BI questions into structured queries that can easily be adapted to specific settings. In addition, we have implemented an iPhone/iPad™ application and an HTML front end that demonstrate the feasibility of these approaches through a series of evaluation metrics for the core component and scenario of the Q&A framework. To this end, we elaborate a range of gold-standard queries that can serve as a basis for evaluating retrieval systems in this area, and show that our system behaves similarly to the well-known WolframAlpha™ system, depending on the evaluation settings.
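As a minimal illustration of the pattern idea, the Python sketch below translates a BI question into a structured query using one hypothetical linguistic pattern. The pattern format and schema names are assumptions; the thesis's actual approach additionally relies on a conceptual model with constraint matching rather than simple regular expressions:

```python
import re

# Hypothetical linguistic pattern: "<measure> by <dimension> in <year>".
PATTERN = re.compile(r"(?P<measure>\w+) by (?P<dimension>\w+) in (?P<year>\d{4})")

def translate(question):
    """Map a natural-language BI question to a star-schema SQL query."""
    m = PATTERN.search(question.lower())
    if m is None:
        raise ValueError("question matches no known pattern")
    # Fact and dimension table names are assumptions for this example.
    return (
        f"SELECT d.{m['dimension']}, SUM(f.{m['measure']})\n"
        f"FROM fact_sales f\n"
        f"JOIN dim_{m['dimension']} d ON f.{m['dimension']}_id = d.id\n"
        f"WHERE f.year = {m['year']}\n"
        f"GROUP BY d.{m['dimension']}"
    )

print(translate("What was the revenue by country in 2012?"))
```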
14

Using the Tools of Business Intelligence in a Customer Care Center

Mencner, Jacek January 2017 (has links)
This master's thesis presents a proposal for a Business Intelligence solution. The main task is an analysis of the customer care processes, on the basis of which the Business Intelligence solution has been designed. Based on the knowledge obtained, changes have been made to make the contact process more effective. The first part of the thesis describes the theoretical foundations of Business Intelligence; the second part analyzes the current situation and presents the proposed solution.
15

Proposal of a Business Intelligence Solution

Drdla, Tomáš January 2016 (has links)
The aim of this thesis is to propose a Business Intelligence solution, to consider its impact on decision making, its implementation costs, and its overall contribution to the company, and to make a proposal that will help remedy the currently unsatisfactory state of data management in the enterprise JáNěkdo.CZ s.r.o.
16

Data Cleaning: Problems and Current Approaches

Rahm, Erhard, Do, Hong Hai 04 February 2019 (has links)
We classify data quality problems that are addressed by data cleaning and provide an overview of the main solution approaches. Data cleaning is especially required when integrating heterogeneous data sources and should be addressed together with schema-related data transformations. In data warehouses, data cleaning is a major part of the so-called ETL process. We also discuss current tool support for data cleaning.
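A minimal Python sketch of two typical instance-level cleaning steps, normalization and duplicate elimination, as they might appear in the transform phase of ETL. The field names and validation rules are illustrative only, not taken from the paper:

```python
# Field names and validation rules are illustrative only.

records = [
    {"name": "  Alice Smith ", "email": "ALICE@EXAMPLE.COM"},
    {"name": "Alice Smith",    "email": "alice@example.com"},  # duplicate
    {"name": "Bob Jones",      "email": "bob@example"},        # malformed
]

def clean(record):
    """Normalize fields; return None for records failing validation."""
    name = " ".join(record["name"].split())   # collapse stray whitespace
    email = record["email"].strip().lower()
    if "@" not in email or "." not in email.split("@")[-1]:
        return None                           # instance-level error: reject
    return {"name": name, "email": email}

seen, cleaned = set(), []
for r in records:
    c = clean(r)
    if c is None:
        continue
    key = (c["name"], c["email"])             # dedup on the normalized key
    if key not in seen:
        seen.add(key)
        cleaned.append(c)

print(cleaned)  # a single clean Alice record survives
```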
17

A Provenance-based Approach Towards Impact Assessment of Schema Changes in a Data Warehouse Environment

Aggarwal, Dippy January 2017 (has links)
No description available.
18

RiTE: Providing On-Demand Data for Right-Time Data Warehousing

Lehner, Wolfgang, Thomsen, Christian, Bach Pedersen, Torben 20 June 2022 (has links)
Data warehouses (DWs) have traditionally been loaded with data at regular intervals, e.g., monthly, weekly, or daily, using fast bulk-loading techniques. Recently, the trend is to insert all (or some) new source data into DWs very quickly, yielding near-real-time DWs (right-time DWs). This is done using regular INSERT statements, resulting in insert speeds that are far too low. There is thus a great need for a solution that makes inserted data available quickly while still providing bulk-load insert speeds. This paper presents RiTE ('Right-Time ETL'), a middleware system that provides exactly that. A data producer (the ETL process) can insert data that becomes available to data consumers on demand. RiTE includes an innovative main-memory-based catalyst that provides fast storage and offers concurrency control. A number of policies controlling the bulk movement of data, based on user requirements for persistence, availability, freshness, etc., are supported. The system works transparently to both producer and consumers. It is integrated with an open-source DBMS, and experiments show that it provides 'the best of both worlds', i.e., INSERT-like data availability with bulk-load speeds (up to 10 times faster).
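The following Python sketch illustrates the core idea of such a catalyst under simplified assumptions: a producer appends rows to an in-memory buffer, rows move to the table in bulk, and a consumer read forces an on-demand flush so fresh data is always visible. The flush policy here is a plain size threshold; RiTE itself supports richer policies plus concurrency control:

```python
class Catalyst:
    """In-memory staging area between an ETL producer and DW consumers."""

    def __init__(self, flush_threshold=1000):
        self.buffer = []               # fast main-memory storage
        self.table = []                # stands in for the warehouse table
        self.flush_threshold = flush_threshold

    def insert(self, row):
        """Producer-side insert: a cheap append, not a per-row INSERT."""
        self.buffer.append(row)
        if len(self.buffer) >= self.flush_threshold:
            self.flush()               # periodic bulk movement

    def flush(self):
        """Move buffered rows to the table in one bulk operation."""
        self.table.extend(self.buffer)
        self.buffer.clear()

    def read(self):
        """Consumer-side read: flush on demand so the data seen is fresh."""
        self.flush()
        return list(self.table)

c = Catalyst(flush_threshold=3)
c.insert(("2024-01-01", 42))
c.insert(("2024-01-02", 17))
print(c.read())  # both rows visible on demand, before the threshold is hit
```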
19

DIPBench Toolsuite

Lehner, Wolfgang, Böhm, Matthias, Habich, Dirk, Wloka, Uwe 27 May 2022 (has links)
The optimization of integration processes between heterogeneous data sources remains an open challenge. A first step toward adequate techniques was the specification of a universal benchmark for integration systems. DIPBench allows solutions to be compared under controlled conditions and should help generate interest in this research area. However, we see the need for a sophisticated toolsuite that minimizes the effort of benchmark execution. This demo illustrates the use of the DIPBench toolsuite. We show the macro-architecture as well as the micro-architecture of each tool. Furthermore, we present the first reference benchmark implementation, using a federated DBMS, and discuss the impact of the defined benchmark scale factors. Finally, we give guidance on how to benchmark other integration systems and how to extend the toolsuite with new distribution functions or other functionality.
20

fAST Refresh using Mass Query Optimization

Lehner, Wolfgang, Cochrane, Bobbie, Pirahesh, Hamid, Zaharioudakis, Markos 02 June 2022 (has links)
Automatic summary tables (ASTs), more commonly known as materialized views, are widely used to enhance query performance, particularly for aggregate queries. Such queries access a huge number of rows to retrieve aggregated summary data while performing multiple joins in the context of a typical data warehouse star schema. To keep ASTs consistent with their underlying base data, they are either immediately synchronized or fully recomputed. This paper proposes an optimization strategy for refreshing multiple ASTs simultaneously, thus avoiding multiple scans of a large fact table (one pass for AST computation). A query stacking strategy detects common subexpressions using the query matching technology available in DB2. Since exact common subexpressions are rare, the novel query sharing approach systematically generates common subexpressions for a given set of 'related' queries, considering different predicates, grouping expressions, and sets of base tables. The paper presents the theoretical framework, a prototype implementation of both strategies in the IBM DB2 UDB/UWO database system, and performance evaluations based on the TPC/R data schema.
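A minimal sketch of the sharing idea, using Python with SQLite: two ASTs over the same fact table are refreshed from a single fact-table scan by computing the finer-grained aggregate once and rolling it up for the coarser AST. Table and column names are illustrative, and the roll-up in application code merely stands in for the optimizer reusing a generated common subexpression:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE fact_sales (store_id INT, month TEXT, amount REAL);
    INSERT INTO fact_sales VALUES
        (1, '2024-01', 100), (1, '2024-02', 150),
        (2, '2024-01', 200), (2, '2024-02', 250);
""")

# One scan of the fact table refreshes the finer-grained AST.
ast1 = con.execute("""
    SELECT store_id, month, SUM(amount) AS total
    FROM fact_sales
    GROUP BY store_id, month
""").fetchall()

# The coarser AST is rolled up from AST 1 instead of rescanning fact_sales;
# this stands in for the optimizer reusing a generated common subexpression.
ast2 = {}
for _store, month, total in ast1:
    ast2[month] = ast2.get(month, 0.0) + total

print(ast1)  # [(1, '2024-01', 100.0), (1, '2024-02', 150.0), ...]
print(ast2)  # {'2024-01': 300.0, '2024-02': 400.0}
```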
