151 |
Utvinningsmetoder och användningsområden av data ur datalager inom industribranschen / Extraction methods and areas of use for data from data warehouses in the industrial sector
Andersson, Jonas January 2005
A data warehouse collects information from several different systems and stores it in a common format. The information in a data warehouse gives a holistic view of the organization and should be used for decision support. Several methods exist for extracting information from a data warehouse, and the information can then be used in several different areas. The aim of this study is to investigate which methods are used to extract information from data warehouses in the industrial sector and in which areas of use the information is applied. To answer the research question, five interviews were conducted with companies in the industrial sector. The results show that certain methods for extracting information from data warehouses are preferred over others, and that within data mining the predictive category is not used at all. Above all, the companies used the information on the customer side and regarded the data warehouse as a living project that is intended to cover the entire process in the future.
|
152 |
Utvärdering av riktlinjer för inkorporering av syndikat data i datalager : Praktikfältets syn på tillämpbarhet och nyttoeffekt av Strands riktlinjer för inkorporering av syndikat data i datalager / Evaluation of guidelines for incorporating syndicated data into data warehouses: practitioners' views on the applicability and benefits of Strand's guidelines for incorporating syndicated data into data warehouses
Helander, Magnus January 2005
Incorporating external data into data warehouses is problematic, and the difficulties are confirmed by current studies in the field. This has led to the development of various forms of support for addressing and analyzing the problems that organizations face. It is of the utmost importance for organizations that their decision makers are well informed and able to select information from large volumes of data. It is in this context that a data warehouse solution is an important cornerstone for supporting the analysis and presentation of data that is originally stored in different data sources (both internal and external). By incorporating external data, the data warehouse achieves considerably higher potential, and thus organizations, and above all their decision makers, can gain major benefits. Strand (2005) has developed guidelines to support the process of incorporating external data into data warehouses. However, an evaluation of the guidelines is lacking. An evaluation helps strengthen the credibility of the guidelines and brings them into a maintenance process at an early stage.
|
153 |
Metadatadriven transformering mellan datamodeller / Metadata-driven transformation between data models
Åhlfeldt, Fredrik January 2000
Various techniques are used today to move information from a database to a data warehouse. Existing transformation techniques rely on an application to handle this. This thesis project creates and investigates a method that instead performs the transformation inside a database. The transformation is metadata-driven, since metadata is the information about data that is required for a transformation to be possible. The work therefore builds on a metadata study covering the representation and structure of metadata. The goal of the work is to arrive at a transformation method that is as general as possible; the method transforms data from a normalized database structure to a denormalized data warehouse structure.
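The idea of a metadata-driven transformation executed inside the database can be illustrated with a minimal sketch. It is not the method developed in the thesis: the metadata tables `meta_column` and `meta_source`, the sample schema, and the function names are all hypothetical, and SQLite stands in for whichever database system the thesis used. The mapping from normalized source tables to a denormalized target lives in metadata rows, and a single generated `INSERT ... SELECT` statement lets the database do the work.

```python
import sqlite3


def build_transform_sql(conn, target_table):
    """Generate one INSERT ... SELECT statement from the (hypothetical) metadata tables."""
    cols = conn.execute(
        "SELECT target_column, source_expr FROM meta_column "
        "WHERE target_table = ? ORDER BY position",
        (target_table,),
    ).fetchall()
    (from_clause,) = conn.execute(
        "SELECT from_clause FROM meta_source WHERE target_table = ?",
        (target_table,),
    ).fetchone()
    target_cols = ", ".join(col for col, _ in cols)
    select_list = ", ".join(f"{expr} AS {col}" for col, expr in cols)
    # The metadata is assumed to be trusted: it is spliced into the SQL text.
    return (
        f"INSERT INTO {target_table} ({target_cols}) "
        f"SELECT {select_list} FROM {from_clause}"
    )


def demo():
    """Load a normalized sample schema, run the generated statement, return the flat rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Normalized source tables (illustrative sample data).
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        -- Denormalized warehouse table.
        CREATE TABLE sales_flat (order_id INTEGER, customer_name TEXT, city TEXT, amount REAL);
        -- Metadata describing how each target column is derived.
        CREATE TABLE meta_column (target_table TEXT, position INTEGER,
                                  target_column TEXT, source_expr TEXT);
        CREATE TABLE meta_source (target_table TEXT, from_clause TEXT);

        INSERT INTO customer VALUES (1, 'Acme', 'Skovde'), (2, 'Bolt', 'Lund');
        INSERT INTO orders VALUES (10, 1, 99.5), (11, 2, 20.0), (12, 1, 5.0);
        INSERT INTO meta_column VALUES
            ('sales_flat', 1, 'order_id', 'o.id'),
            ('sales_flat', 2, 'customer_name', 'c.name'),
            ('sales_flat', 3, 'city', 'c.city'),
            ('sales_flat', 4, 'amount', 'o.amount');
        INSERT INTO meta_source VALUES
            ('sales_flat', 'orders o JOIN customer c ON c.id = o.customer_id');
        """
    )
    conn.execute(build_transform_sql(conn, "sales_flat"))
    return conn.execute("SELECT * FROM sales_flat ORDER BY order_id").fetchall()
```

Calling `demo()` returns the denormalized rows. Changing the mapping then means editing metadata rows rather than application code, which is the property that makes such an approach general.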
|
154 |
Internal control risks within the data warehouse environment
De la Rosa, Sean Paul 21 January 2008
Please read the abstract in the section 00front of this document / Dissertation (MCom (Computer Auditing))--University of Pretoria, 2008. / Accounting / MCom / unrestricted
|
155 |
Budování rozsáhlých datových skladů na platformě MS SQL 2008 / Building of large Data Warehouses on MS SQL Server 2008 platform
Gottwald, Tomáš January 2009
This diploma thesis deals with building large data warehouses on the Microsoft SQL Server 2008 platform. The first part of the thesis discusses the new features in MS SQL Server 2008 that I consider important for the Business Intelligence area. The following chapters focus on the approach to implementing a real data warehouse for the Customs Administration of the Czech Republic on MS SQL Server 2008. These parts of the thesis cover the specifics of the Customs Administration of the Czech Republic, the current state of its data warehouse, and the reasons for migrating to MS SQL Server 2008. The thesis then describes both the logical and physical architecture of the proposed solution and the way the data warehouse is implemented on MS SQL Server 2008. The main aim of the thesis is to create a list of critical success factors (CSF) for building large data warehouses on the MS SQL Server 2008 platform. It is not merely a plain list of CSF, but rather a set of best practices for implementing data warehouses on MS SQL Server. The most significant contribution of this thesis is that it offers an instruction manual for designing and implementing a large data warehouse on MS SQL Server. The summary of CSF and the recommendations are based on the vendor's publications and especially on my own experience with the product. My colleagues from the company Adastra also provided advisory opinions.
|
156 |
Optimalizace databázového systému Teradata / Teradata Database System Optimization
Krejčík, Jan January 2008
The Teradata database system is designed specifically for the data warehousing environment. This thesis explores the use of Teradata in this environment and describes its characteristics and potential areas for optimization. The theoretical part is intended as study material for users; it presents the main principles of how the Teradata system operates and describes the factors that significantly affect system performance. The following sections build on this information and use it for analysis and optimization in several areas of the described environment.
|
157 |
Návrh projektu business intelligence / Project Draft Business Intelligence
Šídlo, Petr January 2007
This dissertation, titled Designing Project Business Intelligence, focuses on the data warehouse in a medium-sized company. The theoretical portion of the thesis describes data warehousing and dimensional modeling. In the first theoretical chapter, I compare and contrast two approaches to building a data warehouse: Bill Inmon's and Ralph Kimball's. The thesis then describes the basic terminology used in the field, the principles of dimensional modeling, and various approaches to database use. Furthermore, it illustrates the role of the Mondrian system as a mediator between the relational database and the OLAP server. Possible uses of the MDX language for working with multidimensional databases are outlined. The theoretical portion ends with a description of the XMLA application interface, which can be used to facilitate communication between applications and the OLAP server. The practice-oriented portion of the thesis includes a complete design of the Business Intelligence project. The project is divided into the following parts: Feasibility Study, Project Planning, Business Requirements, Design and Development, and Deployment and Maintenance. The entire project runs exclusively on open source software. Consequently, the project shows that small and medium-sized companies can afford to run a fully operational Business Intelligence system.
|
158 |
Datový sklad v prostředí Amazon Web Services / Data warehouse in the Amazon Web Services
Kuželka, Kryštof January 2015
The primary objective of this work is to investigate the potential of using Hadoop and Amazon Redshift in the Amazon Web Services ("AWS") cloud to design and implement a data warehouse, whose efficacy is then tested. Contributions of this work include documentation of the AWS cloud technologies in Czech and a demonstration of the design and performance tests of the data warehouse and its ETL part. Another considerable benefit is the added value to the company for which the project was designed and which currently uses the project's output.
|
159 |
Provoz a udržitelný rozvoj datového skladu / Operation and sustainable development of the data warehouse
Hník, Pavel January 2011
This diploma thesis focuses on large data warehouses from the perspective of their sustainable operation and development, and seeks to analyze the key factors that influence the operation of data warehouses at large businesses. The thesis is divided into three main sections, preceded by a short theoretical introduction. The first section explains and illustrates the architecture of data warehouses; most of it is devoted to describing the key components of data warehouse solutions as a whole and the available alternatives for their technical implementation. The second section familiarizes the reader with the way data warehouses are managed and operated, discussing the most important operational tasks in individual subsections. The final section focuses on working with client tools within the Informatica PowerCenter ETL platform. The original contribution of this work lies in comparing how the design of individual components affects the simplicity and quality of routine operations. Data warehouses can only be operated as conveniently and competently as they were designed and implemented. In this thesis, I draw primarily on my direct experience and knowledge acquired from data warehouse operations at a large domestic enterprise.
|
160 |
Ad Hoc Information Extraction in a Clinical Data Warehouse with Case Studies for Data Exploration and Consistency Checks / Ad Hoc Informationsextraktion in einem Klinischen Data-Warehouse mit Fallstudien zur Datenexploration und Konsistenzüberprüfungen
Dietrich, Georg January 2019
The importance of Clinical Data Warehouses (CDW) has increased significantly in recent years as they support or enable many applications such as clinical trials, data mining, and decision making.
CDWs integrate Electronic Health Records which still contain a large amount of text data, such as discharge letters or reports on diagnostic findings in addition to structured and coded data like ICD-codes of diagnoses.
Existing CDWs offer hardly any features for retrieving the information contained in texts.
Information extraction methods offer a solution to this problem, but they require long and costly development effort that can only be carried out by computer scientists.
Moreover, such systems only exist for a few medical domains.
This paper presents a method empowering clinicians to extract information from texts on their own. Medical concepts can be extracted ad hoc from, e.g., discharge letters, so physicians can work promptly and autonomously. The proposed system achieves these improvements through efficient data storage, preprocessing, and powerful query features. Negations in texts are recognized and automatically excluded; the context of information is also determined, and undesired facts such as historical events or references to other persons (family history) are filtered out.
Context-sensitive queries ensure the semantic integrity of the concepts to be extracted.
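As a rough illustration of negation and context filtering, the sketch below uses a simplified trigger-word approach in the spirit of rule-based negation detection. It is an assumption for illustration only, not the algorithm of the described system: the trigger lists and the function `find_affirmed` are hypothetical, and only negation triggers that precede the concept within the same sentence are considered.

```python
import re

# Hypothetical, deliberately tiny trigger lists; a real system needs far more.
NEGATION_TRIGGERS = ("no", "not", "without", "denies")
FAMILY_TRIGGERS = ("mother", "father", "sister", "brother", "family history")


def _contains_trigger(text, triggers):
    """True if any trigger occurs in `text` as a whole word or phrase."""
    return any(re.search(rf"\b{re.escape(t)}\b", text) for t in triggers)


def find_affirmed(text, concept):
    """Return sentences mentioning `concept` that are neither negated nor about relatives."""
    affirmed = []
    needle = concept.lower()
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        lowered = sentence.lower()
        if needle not in lowered:
            continue
        # Only text before the concept is checked for negation triggers.
        prefix = lowered.split(needle, 1)[0]
        negated = _contains_trigger(prefix, NEGATION_TRIGGERS)
        family = _contains_trigger(lowered, FAMILY_TRIGGERS)
        if not negated and not family:
            affirmed.append(sentence)
    return affirmed
```

For a note such as "No atrial fibrillation was found. Mother had atrial fibrillation. Patient presents with atrial fibrillation.", only the last sentence is kept for the concept "atrial fibrillation".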
A new feature not available in other CDWs is to query numerical concepts in texts and even filter them (e.g. BMI > 25).
The retrieved values can be extracted and exported for further analysis.
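A query on a numerical concept with a filter such as BMI > 25 can be sketched with a regular expression. This is a minimal illustration under stated assumptions, not the PaDaWaN implementation: the function `query_numeric`, the accepted separators, and the sample texts are hypothetical.

```python
import operator
import re

# Comparison operators supported by this sketch.
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def query_numeric(texts, concept, op, threshold):
    """Return (doc_id, value) pairs where `concept` is followed by a number passing the filter."""
    pattern = re.compile(
        rf"\b{re.escape(concept)}\b\s*(?:[:=]|of|is)?\s*(\d+(?:[.,]\d+)?)",
        re.IGNORECASE,
    )
    compare = OPS[op]
    hits = []
    for doc_id, text in texts.items():
        for match in pattern.finditer(text):
            # Accept both decimal point and decimal comma.
            value = float(match.group(1).replace(",", "."))
            if compare(value, threshold):
                hits.append((doc_id, value))
    return hits


# Illustrative sample texts, not real patient data.
SAMPLE_TEXTS = {
    "a": "Patient obese, BMI: 31.2 at admission.",
    "b": "BMI 22,5, no complaints.",
    "c": "LVEF 40 %, BMI = 27",
}
```

With the sample texts, `query_numeric(SAMPLE_TEXTS, "BMI", ">", 25)` yields the values from documents "a" and "c", which could then be written to a file for further analysis.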
This technique is implemented within the efficient architecture of the PaDaWaN CDW and evaluated with comprehensive and complex tests.
The results outperform similar approaches reported in the literature.
Ad hoc IE returns results within a few (milli)seconds, and a user-friendly GUI enables interactive work, allowing flexible adaptation of the extraction.
In addition, the applicability of this system is demonstrated in three real-world applications at the Würzburg University Hospital (UKW).
Several drug trend studies are replicated: findings of five studies on high blood pressure, atrial fibrillation and chronic renal failure can be partially or completely confirmed at the UKW. Another case study evaluates the prevalence of heart failure among hospital inpatients using an algorithm that extracts information with ad hoc IE from discharge letters, echocardiogram reports (e.g. LVEF < 45) and other sources of the hospital information system.
This study reveals that the use of ICD codes leads to a significant underestimation (31%) of the true prevalence of heart failure.
The third case study evaluates the consistency of diagnoses by comparing structured ICD-10-coded diagnoses with the diagnoses described in the diagnostic section of the discharge letter.
These diagnoses are extracted from texts with ad hoc IE, using synonyms generated with a novel method.
The developed approach can extract diagnoses from the discharge letter with high accuracy, and it can furthermore determine the degree of consistency between the coded and reported diagnoses.
|