271

A Plan for OLAP

Jaecksch, Bernhard, Lehner, Wolfgang, Faerber, Franz 30 May 2022 (has links)
So far, data warehousing has often been discussed in the light of complex OLAP queries and as a reporting facility for operative data. We argue that business planning, as a means to generate plan data, is an equally important cornerstone of a data warehouse system, and we propose that it be a first-class citizen within an OLAP engine. We introduce an abstract model describing relevant aspects of the planning process in general and the requirements it poses to a planning engine. Furthermore, we show that business planning lends itself well to parallelization and benefits from a column-store much like traditional OLAP does. We then develop a physical model specifically targeted at a highly parallel column-store, and with our implementation, we show nearly linear scaling behavior.
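A rough illustration of why planning parallelizes well on a columnar layout (a generic sketch, not the physical model from the paper; the proportional top-down disaggregation rule, the column contents, and the partition boundaries are all invented for the example):

```python
"""Illustrative sketch: top-down disaggregation of a plan value over a
columnar fact table, processed partition by partition."""
from concurrent.futures import ThreadPoolExecutor

# Column of reference values (e.g., last year's sales per product/region cell).
reference = [120.0, 80.0, 200.0, 40.0, 160.0, 100.0, 60.0, 240.0]
plan_total = 1200.0                      # planning target to distribute over all cells
total_ref = sum(reference)

def disaggregate(partition):
    """Distribute the plan total proportionally to the reference values of one
    column partition; partitions are independent, so they can run in parallel."""
    lo, hi = partition
    return [plan_total * reference[i] / total_ref for i in range(lo, hi)]

# Split the column into ranges, one per worker.
partitions = [(0, 4), (4, 8)]
with ThreadPoolExecutor(max_workers=2) as pool:
    plan_column = [v for chunk in pool.map(disaggregate, partitions) for v in chunk]

assert abs(sum(plan_column) - plan_total) < 1e-9
print(plan_column)
```

Each partition of the fact column is disaggregated independently, which is what makes near-linear scaling plausible as more workers are added.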
272

Managing high data availability in dynamic distributed derived data management system (D4M) under Churn

Mirza, Ahmed Kamal January 2012 (has links)
The popularity of decentralized systems is increasing day by day. These decentralized systems are preferable to centralized systems for many reasons; in particular, they are more reliable and more resource-efficient. Decentralized systems are particularly effective for information management when the data is distributed across multiple peers and maintained in a synchronized manner. This data synchronization is the main requirement for information management systems deployed in a decentralized environment, especially when data is needed for monitoring purposes or when dependent data artifacts rely upon it. In order to ensure consistent and cohesive synchronization of dependent/derived data in a decentralized environment, a dependency management system is needed. In such a system, when one chunk of data relies on another piece of data, the resulting derived data artifacts can use a decentralized approach but must consider several critical issues, such as how the system behaves if any peer goes down, how the dependent data can be recalculated, and how the data that was stored on a failed peer can be recovered. In case of churn (resulting from failing peers), how does the system adapt the transmission of data artifacts with respect to their access patterns, and how does it provide consistency management? The major focus of this thesis is to address these churn issues and to suggest and evaluate potential solutions while ensuring a load-balanced network, within the scope of a dependency information management system running in a decentralized network. Additionally, peer-to-peer (P2P) algorithms commonly assume that all peers in the network have similar resources and capacities, which is not true in real-world networks. Peer characteristics can differ considerably in actual P2P systems: peers may differ in available bandwidth, CPU load, available storage space, stability, etc. As a consequence, peers with low capacities are forced to handle the same computational load as high-capacity peers, resulting in poor overall system performance. To handle this situation, the concept of utility-based replication is introduced in this thesis to avoid the assumption of peer equality, enabling efficient operation even in heterogeneous environments where peers have different configurations. In addition, the proposed protocol ensures a load-balanced network while meeting the requirement for high data availability, thus keeping the distributed dependent data consistent and cohesive across the network. Furthermore, an integrated dependency management framework, D4M, was implemented and evaluated in the PeerfactSim.KOM P2P simulator. To benchmark the implementation of the proposed protocol, performance and fairness tests were conducted. A conclusion is that the proposed solution adds little overhead to managing data availability in a distributed data management system, despite the heterogeneous P2P environment. Additionally, the results show that various P2P clusters can be introduced in the network based on peer capabilities. / Populariteten av decentraliserade system ökar varje dag. Dessa decentraliserade system är att föredra framför centraliserade system för många anledningar, speciellt de är mer säkra och mer resurseffektiv.
Decentraliserade system är mer effektiva inom informationshantering i fall när data delas ut över flera Peers och underhållas på ett synkroniserat sätt. Dessa data synkronisering är huvudkravet för informationshantering som utplacerade i en decentraliserad miljö, särskilt när data / information behövs för att kontrollera eller några beroende artefakter uppgifter lita på dessa data. För att säkerställa en konsistent och härstammar synkronisering av beroende / härledd data i en decentraliserad miljö, är ett beroende ledningssystem behövs. I ett beroende ledningssystem, när en bit av data som beror på en annan bit av data, kan de resulterande erhållna uppgifterna artefakter använd decentraliserad system approach, men måste tänka på flera viktiga frågor, såsom hur systemet fungerar om någon peer går ner, hur beroende data kan omräknas, och hur de data som lagrats på en felaktig peer kan återvinnas. I fall av churn (på grund av brist Peers), hur systemet anpassar sändning av data artefakter med avseende på deras tillgång mönster och hur systemet ger konsistens förvaltning? Den viktigaste fokus för denna avhandling var att behandlas churn beteende frågor och föreslå och bedöma möjliga lösningar samtidigt som en belastning välbalanserat nätverk, inom ramen för ett beroende information management system som kör i ett decentraliserade nätverket. Dessutom, i peer-to-peer (P2P) algoritmer, är det en mycket vanlig uppfattning att alla Peers i nätverket har liknande resurser och kapacitet vilket inte är sant i verkliga nätverk. Peers egenskaper kan vara ganska olika i verkliga P2P system, som de Peers kan skilja sig tillgänglig bandbredd, CPU, tillgängligt lagringsutrymme, stabilitet, etc. Som en följd, är peers har låg kapacitet tvingade att hantera samma beräkningsbelastningen som har hög kapacitet peer hanterar vilket resulterar i dålig systemets totala prestanda. För att hantera den här situationen, är begreppet verktygsbaserad replikering införs i denna uppsats att undvika antagandet om peer jämlikhet, så att effektiv drift även i heterogena miljöer där Peers har olika konfigurationer. Dessutom säkerställer det föreslagna protokollet en belastning välbalanserat nätverk med iakttagande kraven på hög tillgänglighet och därför hålla distribuerade beroende data konsekvent och kohesiv över nätverket. Vidare ett genomförande och utvärdering i PeerfactSim.KOM P2P simulatorn av en integrerad beroende förvaltningsram, D4M, var gjort. De prestandatester och tester rättvisa undersöktes för att riktmärka genomförandet av föreslagna protokollet. En slutsats är att den föreslagna lösningen tillagt lite overhead för förvaltningen av tillgången till uppgifterna inom ett distribuerade system för datahantering, trots med användning av en heterogen P2P miljö. Dessutom visar resultaten att de olika P2P-kluster kan införas i nätverket baserat på peer-möjligheter.
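The thesis defines its own utility function and replication protocol; the sketch below only illustrates the general idea of utility-based replica placement in a heterogeneous P2P network. The peer capacities, weights, load penalty, and scoring formula are invented for the example:

```python
"""Hypothetical sketch of utility-based replica placement: score heterogeneous
peers and pick the best candidates for each derived-data artifact, discounting
peers that already hold many replicas to keep the load balanced."""

peers = {
    # peer_id: (bandwidth Mbit/s, free storage GB, observed uptime ratio)
    "p1": (100.0, 50.0, 0.99),
    "p2": (10.0, 200.0, 0.60),
    "p3": (50.0, 20.0, 0.95),
    "p4": (25.0, 100.0, 0.80),
}
load = {p: 0 for p in peers}             # replicas already assigned per peer

def utility(peer_id, w_bw=0.4, w_store=0.2, w_up=0.4, penalty=0.15):
    """Weighted capacity score, reduced for peers that already carry replicas."""
    bw, store, up = peers[peer_id]
    score = w_bw * bw / 100.0 + w_store * store / 200.0 + w_up * up
    return score - penalty * load[peer_id]

def place_replicas(artifact, k=2):
    """Choose the k highest-utility peers for one derived-data artifact."""
    chosen = sorted(peers, key=utility, reverse=True)[:k]
    for p in chosen:
        load[p] += 1
    return artifact, chosen

for artifact in ["derived_A", "derived_B", "derived_C"]:
    print(place_replicas(artifact))
```

Because low-capacity or heavily loaded peers score lower, replicas of derived-data artifacts drift toward peers that can actually serve them, which is the intuition behind combining high availability with load balancing.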
273

FDM-Handbuch für HAW: Handlungshilfe für aktives Forschungsdatenmanagement an Hochschulen für angewandte Wissenschaften

Hesse, Elfi, Baier, Juliane, Schmidtke, Knut 24 January 2020 (has links)
Das hier vorliegende Handbuch ist im Rahmen des Projektes „Vernetztes Forschungsdatenmanagement an Hochschulen für angewandte Wissenschaften am Beispiel der HTW Dresden – FoDaMa-HTWD“ entstanden. Es stellt eine kurze und übersichtliche Zusammenfassung der wichtigsten Erkenntnisse dar, welche während der Projektlaufzeit an der Hochschule für Technik und Wirtschaft Dresden (HTWD) zum Forschungsdatenmanagement (FDM) gewonnen wurden. Die Autor/innen möchten mit diesem Handbuch andere Hochschulen für angewandte Wissenschaften (HAW) bei der Strategieentwicklung und dem notwendigen FDM-Strukturaufbau unterstützen. Es richtet sich demnach vorrangig an Personen, die sich an Hochschulen mit der strategischen Weiterentwicklung im Bereich Forschung beschäftigen und sich vielleicht die Frage stellen, welche unterstützenden FDM-Services und Maßnahmen ergriffen werden sollten, damit die Forschenden der eigenen Institution der zunehmenden Forderung nach offener und nachhaltiger Arbeitsweise im Umgang mit Forschungsdaten gerecht werden können. / This handbook was developed within the project 'Vernetztes Forschungsdatenmanagement an Hochschulen für angewandte Wissenschaften am Beispiel der HTW Dresden – FoDaMa-HTWD'. It is a short and clear summary of the most important findings, which were gained during the project at the University of Applied Sciences Dresden (HTWD) on research data management (FDM). With this handbook, the authors would like to support other Universities of Applied Sciences (HAW) in developing strategies and the necessary FDM structure. It is therefore primarily aimed at people who are involved in the strategic development of research at universities and who may ask themselves the question of which supporting FDM services and measures should be taken to ensure that the researchers of their own institution are able to meet the increasing demand for open and sustainable working methods in dealing with research data.
274

Gestaltung nutzerzentrierter Assistenzen im Produktdatenmanagement

Scheele, Stephan, Mantwill, Frank 06 January 2020 (has links)
Die Verwaltung von Produkt- und Prozessdaten in industrieller Produktentstehung ist seit dem Aufkommen der rechnergestützten Assistenzsysteme einer der großen Hebel bei der Suche nach Effizienzsteigerungen. Neben dem Flugzeugbau ist es insbesondere die Automobilindustrie, an deren komplexen Arbeitsabläufen Neuerungen auf den Gebieten der Datenhaltung, -verwaltung und der prozessübergreifenden Zusammenarbeit erprobt werden. Die rechnergestützte Umsetzung entlang des Produktentstehungsprozesses hat zum Ziel, eine bessere Abbildbarkeit, Durchgängigkeit und Verfolgbarkeit der virtuellen Geschäftsobjekte und letztlich der tatsächlichen Produkte sicherzustellen. [... aus der Einleitung] / The management of product and process data in industrial product development has been one of the major levers in the search for efficiency gains ever since computer-aided assistance systems emerged. Besides aircraft construction, it is above all the automotive industry whose complex workflows serve as the testing ground for innovations in the fields of data storage, data management, and cross-process collaboration. The computer-aided implementation along the product development process aims to ensure better representability, continuity, and traceability of the virtual business objects and, ultimately, of the actual products. [... from the introduction]
275

Modelo de Referencia para la Gestión de la Seguridad de Datos de Salud Soportado en una Plataforma Blockchain / Reference Model for Health Data Security Management Supported in a Blockchain Platform

Espíritu Aranda, Walter Augusto, Machuca Nieva, Christian Fernando 17 March 2021 (has links)
En la actualidad, los centros de salud tales como hospitales y clínicas necesitan guías específicas que ayuden en la creación de controles para administrar y salvaguardar la confidencialidad, disponibilidad e integridad de la información de sus sistemas. En consecuencia, se utiliza como guía la ISO/IEC 27002 que sirve para la administración de la información que cuenta con controles generales que pueden ser tomados como ejemplo. Sin embargo, ante la necesidad de aplicar controles específicos en el sector salud, se creó la ISO/IEC 27799 que tiene como objetivo brindar controles de seguridad para proteger la información personal (pacientes) en cuanto a los temas de salud. Este proyecto consiste en implementar un modelo de referencia para entidades del sector salud que permita gestionar la seguridad de los datos sensibles y confidenciales soportado en una plataforma Blockchain integrando la norma ISO/IEC 27799. Además, para el sector salud tienen que cumplirse los principios fundamentales de seguridad de la información, si uno de ellos se incumple, podría traer repercusiones negativas hacia el paciente. Los resultados obtenidos indican que al utilizar la tecnología Blockchain, los datos de salud de los pacientes están mejor resguardados ante cualquier incidente como por ejemplo un Ciberataque o manipulación mal intencionada de datos. Se espera que nuestro modelo de referencia ayude en la gestión de la seguridad de datos de salud en los hospitales y clínicas, con el fin de reducir el impacto de riesgos encontrados durante el proceso de validación. / Currently, health centers such as hospitals and clinics need specific guidelines that help them create controls to manage and safeguard the confidentiality, availability, and integrity of the information in their systems. ISO/IEC 27002 is therefore used as a guide for information management; it provides general controls that can be taken as an example. However, given the need to apply specific controls in the health sector, ISO/IEC 27799 was created, which aims to provide security controls to protect personal (patient) information regarding health issues. This project consists of implementing a reference model for entities in the health sector that allows the security of sensitive and confidential data to be managed, supported on a Blockchain platform and integrating the ISO/IEC 27799 standard. In addition, the fundamental principles of information security must be met in the health sector; if one of them is breached, it could have negative repercussions for the patient. The results obtained indicate that by using Blockchain technology, patients' health data is better protected against incidents such as a cyber-attack or malicious manipulation of data. Our reference model is expected to assist in the management of health data security in hospitals and clinics, in order to reduce the impact of the risks found during the validation process. / Tesis
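The thesis builds on an actual Blockchain platform together with ISO/IEC 27799 controls; purely as an illustration of why an append-only, hash-linked ledger makes tampering detectable, here is a toy hash chain over health-record entries (the record fields and values are invented, and this is not the model proposed in the thesis):

```python
"""Toy append-only hash chain: each block commits to the previous block's hash,
so any later modification of a stored health record is detectable."""
import hashlib, json

def block_hash(block):
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

chain = [{"index": 0, "record": "genesis", "prev": "0" * 64}]

def append_record(record):
    block = {"index": len(chain), "record": record, "prev": block_hash(chain[-1])}
    chain.append(block)

def verify():
    """Recompute the links; returns False if any earlier block was altered."""
    return all(chain[i]["prev"] == block_hash(chain[i - 1]) for i in range(1, len(chain)))

append_record({"patient": "12345", "event": "lab result", "value": "HbA1c 6.1%"})
append_record({"patient": "12345", "event": "prescription", "value": "metformin 500 mg"})
print(verify())                              # True: chain intact

chain[1]["record"]["value"] = "HbA1c 5.0%"   # malicious in-place edit
print(verify())                              # False: the tampering breaks the chain
```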
276

GCIP: Exploiting the Generation and Optimization of Integration Processes

Lehner, Wolfgang, Böhm, Matthias, Wloka, Uwe, Habich, Dirk 22 April 2022 (has links)
As a result of the changing scope of data management towards the management of highly distributed systems and applications, integration processes have gained in importance. Such integration processes represent an abstraction of workflow-based integration tasks. In practice, integration processes are pervasive, and the performance of complete IT infrastructures strongly depends on the performance of the central integration platform that executes the specified integration processes. In this area, the three major problems are: (1) significant development efforts, (2) low portability, and (3) inefficient execution. To overcome those problems, we follow a model-driven generation approach for integration processes. In this demo proposal, we want to introduce the so-called GCIP Framework (Generation of Complex Integration Processes), which allows the modeling of integration processes and the generation of different concrete integration tasks. The model-driven approach opens opportunities for rule-based and workload-based optimization techniques.
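GCIP's actual model and rule set are not given in the abstract; the following sketch only shows what rule-based optimization on a modeled integration process can look like, using a single hypothetical rewrite that moves a selection before a costly translation step (valid only if the predicate can be evaluated on the untranslated data; all operator names are invented):

```python
"""Hypothetical rule-based rewrite of an integration process, modeled as a
simple operator pipeline. The single rule moves Selection operators in front
of Translation operators so that less data has to be translated."""
from dataclasses import dataclass

@dataclass
class Op:
    kind: str        # e.g. "Receive", "Translation", "Selection", "Invoke"
    arg: str = ""

process = [Op("Receive", "orders"), Op("Translation", "XML->relational"),
           Op("Selection", "status = 'open'"), Op("Invoke", "ERP")]

def push_selection_before_translation(ops):
    """Apply the rewrite rule until a fixpoint is reached."""
    ops = list(ops)
    changed = True
    while changed:
        changed = False
        for i in range(1, len(ops)):
            if ops[i].kind == "Selection" and ops[i - 1].kind == "Translation":
                ops[i - 1], ops[i] = ops[i], ops[i - 1]
                changed = True
    return ops

optimized = push_selection_before_translation(process)
print([f"{o.kind}({o.arg})" for o in optimized])
# ['Receive(orders)', "Selection(status = 'open')", 'Translation(XML->relational)', 'Invoke(ERP)']
```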
277

Quality of Service and Predictability in DBMS

Sattler, Kai-Uwe, Lehner, Wolfgang 03 May 2022 (has links)
DBMS are a ubiquitous building block of the software stack in many complex applications. Middleware technologies, application servers, and mapping approaches hide the core database technologies just like power, networking infrastructure, and operating system services. Furthermore, many enterprise-critical applications demand a certain degree of quality of service (QoS) or guarantees, e.g., with respect to response time, transaction throughput, and latency, but also completeness or, more generally, quality of results. Examples of such applications are billing systems in telecommunication, where each telephone call has to be monitored and registered in a database; e-commerce applications, where orders have to be accepted even in times of heavy load and the waiting time of customers should not exceed a few seconds; ERP systems processing a large number of transactions in parallel; or systems for processing streaming or sensor data in real time, e.g., in process automation or traffic control. As part of a complex multilevel software stack, database systems have to share or contribute to these QoS requirements, which means that guarantees have to be given by the DBMS, too, and that the processing of database requests is predictable. Today's mainstream DBMS typically follow a best-effort approach: requests are processed as fast as possible without any guarantees; the optimization goal of query optimizers and tuning approaches is rather to minimize resource consumption than to fulfill given service-level agreements. However, motivated by the situation described above, there is an emerging need for database services that provide guarantees or simply behave in a predictable manner and at the same time interact with other components of the software stack in order to fulfill the requirements. This is also driven by the paradigm of service-oriented architectures widely discussed in industry. Currently, this is addressed only by very specialized solutions. Nevertheless, database researchers have developed several techniques contributing to the goal of QoS-aware database systems. The purpose of the tutorial is to introduce database researchers and practitioners to the scope, the challenges, and the available techniques related to the problem of predictability and QoS agreements in DBMS.
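The tutorial surveys techniques rather than prescribing one; as a minimal contrast to best-effort processing, the sketch below shows earliest-deadline-first admission with a feasibility check, where a request is rejected up front instead of silently degrading everyone's response time. The cost estimates and deadlines are invented; a QoS-aware DBMS would derive them from the optimizer and the service-level agreements:

```python
"""Minimal earliest-deadline-first (EDF) admission sketch for database requests."""
import heapq, itertools

_tie = itertools.count()
queue = []   # heap of (deadline, tie_breaker, name, estimated_cost)

def feasible(candidates, now=0.0):
    """Check whether all requests meet their deadlines when run in EDF order."""
    finish = now
    for deadline, _, _, cost in sorted(candidates):
        finish += cost
        if finish > deadline:
            return False
    return True

def admit(name, estimated_cost, deadline, now=0.0):
    """Admit a request only if the whole queue stays feasible; otherwise reject
    it, rather than breaking the guarantees of already-admitted requests."""
    entry = (deadline, next(_tie), name, estimated_cost)
    if feasible(list(queue) + [entry], now):
        heapq.heappush(queue, entry)
        return True
    return False

print(admit("billing_insert", estimated_cost=0.2, deadline=1.0))   # True
print(admit("order_lookup", estimated_cost=0.5, deadline=0.9))     # True
print(admit("big_report", estimated_cost=5.0, deadline=2.0))       # False: rejected
```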
278

Iterative and Expressive Querying for Big Data Series / Requêtes itératives et expressives pour l’analyse de grandes séries de données

Gogolou, Anna 15 November 2019 (has links)
Les séries temporelles deviennent omniprésentes dans la vie moderne et leur analyse de plus en plus difficile compte tenu de leur taille. L’analyse des grandes séries de données implique des tâches telles que l’appariement de modèles (motifs), la détection d’anomalies, l’identification de modèles fréquents, et la classification ou le regroupement (clustering). Ces tâches reposent sur la notion de similarité. La communauté scientifique a proposé de plusieurs techniques, y compris de nombreuses mesures de similarité pour calculer la distance entre deux séries temporelles, ainsi que des techniques et des algorithmes d’indexation correspondants, afin de relever les défis de l’évolutivité lors de la recherche de similarité.Les analystes, afin de s’acquitter efficacement de leurs tâches, ont besoin de systèmes d’analyse visuelle interactifs, extrêmement rapides, et puissants. Lors de la création de tels systèmes, nous avons identifié deux principaux défis: (1) la perception de similarité et (2) la recherche progressive de similarité. Le premier traite de la façon dont les gens perçoivent des modèles similaires et du rôle de la visualisation dans la perception de similarité. Le dernier point concerne la rapidité avec laquelle nous pouvons redonner aux utilisateurs des mises à jour des résultats progressifs, lorsque les temps de réponse du système sont longs et non interactifs. Le but de cette thèse est de répondre et de donner des solutions aux défis ci-dessus.Dans la première partie, nous avons étudié si différentes représentations visuelles (Graphiques en courbes, Graphiques d’horizon et Champs de couleur) modifiaient la perception de similarité des séries temporelles. Nous avons essayé de comprendre si les résultats de recherche automatique de similarité sont perçus de manière similaire, quelle que soit la technique de visualisation; et si ce que les gens perçoivent comme similaire avec chaque visualisation s’aligne avec différentes mesures de similarité. Nos résultats indiquent que les Graphes d’horizon s’alignent sur des mesures qui permettent des variations de décalage temporel ou d’échelle (i.e., ils promeuvent la déformation temporelle dynamique). En revanche, ils ne s’alignent pas sur des mesures autorisant des variations d’amplitude et de décalage vertical (ils ne promeuvent pas des mesures basées sur la z-normalisation). L’inverse semble être le cas pour les Graphiques en courbes et les Champs de couleur. Dans l’ensemble, nos travaux indiquent que le choix de la visualisation affecte les schémas temporels que l’homme considère comme similaires. Donc, la notion de similarité dans les séries temporelles est dépendante de la technique de visualisation.Dans la deuxième partie, nous nous sommes concentrés sur la recherche progressive de similarité dans de grandes séries de données. Nous avons étudié la rapidité avec laquelle les premières réponses approximatives et puis des mises à jour des résultats progressifs sont détectées lors de l’exécuton des requêtes progressives. Nos résultats indiquent qu’il existe un écart entre le moment où la réponse finale s’est trouvée et le moment où l’algorithme de recherche se termine, ce qui entraîne des temps d’attente gonflés sans amélioration. Des estimations probabilistes pourraient aider les utilisateurs à décider quand arrêter le processus de recherche, i.e., décider quand l’amélioration de la réponse finale est improbable. 
Nous avons développé et évalué expérimentalement une nouvelle méthode probabiliste qui calcule les garanties de qualité des résultats progressifs de k-plus proches voisins (k-NN). Notre approche apprend d’un ensemble de requêtes et construit des modèles de prédiction basés sur deux observations: (i) des requêtes similaires ont des réponses similaires; et (ii) des réponses progressives renvoyées par les indices de séries de données sont de bons prédicteurs de la réponse finale. Nous fournissons des estimations initiales et progressives de la réponse finale. / Time series are becoming ubiquitous in modern life, and given their sizes, their analysis is becoming increasingly challenging. Time series analysis involves tasks such as pattern matching, anomaly detection, frequent pattern identification, and time series clustering or classification. These tasks rely on the notion of time series similarity. The data-mining community has proposed several techniques, including many similarity measures (or distance measure algorithms), for calculating the distance between two time series, as well as corresponding indexing techniques and algorithms, in order to address the scalability challenges during similarity search.To effectively support their tasks, analysts need interactive visual analytics systems that combine extremely fast computation, expressive querying interfaces, and powerful visualization tools. We identified two main challenges when considering the creation of such systems: (1) similarity perception and (2) progressive similarity search. The former deals with how people perceive similar patterns and what the role of visualization is in time series similarity perception. The latter is about how fast we can give back to users updates of progressive similarity search results and how good they are, when system response times are long and do not support real-time analytics in large data series collections. The goal of this thesis, that lies at the intersection of Databases and Human-Computer Interaction, is to answer and give solutions to the above challenges.In the first part of the thesis, we studied whether different visual representations (Line Charts, Horizon Graphs, and Color Fields) alter time series similarity perception. We tried to understand if automatic similarity search results are perceived in a similar manner, irrespective of the visualization technique; and if what people perceive as similar with each visualization aligns with different automatic similarity measures and their similarity constraints. Our findings indicate that Horizon Graphs promote as invariant local variations in temporal position or speed, and as a result they align with measures that allow variations in temporal shifting or scaling (i.e., dynamic time warping). On the other hand, Horizon Graphs do not align with measures that allow amplitude and y-offset variations (i.e., measures based on z-normalization), because they exaggerate these differences, while the inverse seems to be the case for Line Charts and Color Fields. Overall, our work indicates that the choice of visualization affects what temporal patterns humans consider as similar, i.e., the notion of similarity in time series is visualization-dependent.In the second part of the thesis, we focused on progressive similarity search in large data series collections. We investigated how fast first approximate and then updates of progressive answers are detected, while we execute similarity search queries. 
Our findings indicate that there is a gap between the time the final answer is found and the time when the search algorithm terminates, resulting in inflated waiting times without any improvement. Computing probabilistic estimates of the final answer could help users decide when to stop the search process. We developed and experimentally evaluated using benchmarks, a new probabilistic learning-based method that computes quality guarantees (error bounds) for progressive k-Nearest Neighbour (k-NN) similarity search results. Our approach learns from a set of queries and builds prediction models based on two observations: (i) similar queries have similar answers; and (ii) progressive best-so-far (bsf) answers returned by the state-of-the-art data series indexes are good predictors of the final k-NN answer. We provide both initial and incrementally improved estimates of the final answer.
279

LANDNETZ trifft Feldschwarm: Landwirtschaft von morgen, heute erleben

Technische Universität Dresden 14 October 2021 (has links)
Am 23. September 2021 fand auf dem Gutshof Raitzen in Naundorf/Sachsen der Feldtag „LANDNETZ trifft Feldschwarm® – Landwirtschaft von morgen, heute erleben“ statt. Seit 2017 forschen Wissenschaftlerinnen und Wissenschaftler der TU Dresden in den Projekten LANDNETZ und Feldschwarm®. Gemeinsam mit dem Sächsischen Landesamt für Umwelt, Landwirtschaft und Geologie (LfULG) und dem Fraunhofer-Institut für Verkehrs- und Infrastruktursysteme IVI werden im LANDNETZ neue Technologien zur flächendeckenden drahtlosen Datenübertragung und Vernetzung als grundlegende Bedingung für eine Landwirtschaft 4.0 in der Praxis überprüft. Im Testfeld werden dabei zahlreiche digitale landwirtschaftliche Anwendungen in Zusammenarbeit mit Praxisbetrieben konzipiert, erprobt und optimiert. Der Feldschwarm® - das sind kleine, intelligente Maschineneinheiten, die sich flexibel kombinieren lassen und sich so einfach an die lokalen Feldbedingungen anpassen können. Statt sechs bis zwölf Metern Arbeitsbreite koppelt das Feldschwarmkonsortium zwei oder drei technische Einheiten des Feldschwarms und macht damit Produktivität in der Landwirtschaft wieder besser skalierbar. Die neue Feldbearbeitungstechnik ist damit nicht nur sehr anpassungsfähig und hochautomatisiert, sondern schont bei gleichem Ertrag auch den Boden und erhöht die Qualität der Bearbeitung bei gleichzeitiger Einsparung von Dieselkraftstoff. / On 23 September 2021, the field day 'LANDNETZ trifft Feldschwarm® – Landwirtschaft von morgen, heute erleben' ('LANDNETZ meets Feldschwarm: experience tomorrow's agriculture today') was held at the Gutshof Raitzen estate in Naundorf, Saxony. Researchers at TU Dresden have been working on the LANDNETZ and Feldschwarm® projects since 2017. Together with the Saxon State Office for Environment, Agriculture and Geology (LfULG) and the Fraunhofer Institute for Transportation and Infrastructure Systems IVI, LANDNETZ tests new technologies for area-wide wireless data transmission and networking in practice, a fundamental prerequisite for Agriculture 4.0. In the test field, numerous digital agricultural applications are designed, trialled, and optimized in cooperation with working farms. The Feldschwarm® consists of small, intelligent machine units that can be combined flexibly and thus adapt easily to local field conditions. Instead of a working width of six to twelve metres, the Feldschwarm consortium couples two or three Feldschwarm units, making agricultural productivity more scalable again. The new field-working technology is therefore not only highly adaptable and highly automated; at the same yield it also protects the soil and improves the quality of tillage while saving diesel fuel.
280

SaxFDM – ein Service für Forschende in Sachsen

Nagel, Stefanie 28 June 2023 (has links)
In diesem 'Snack' stellen wir SaxFDM - die Sächsische Landesinitiative für Forschungsdatenmanagement - und deren Serviceangebote vor. / In this 'snack' session, we introduce SaxFDM, the Saxon state initiative for research data management, and its service offerings.
