11 |
Delphin 6 Output File Specification
Vogelsang, Stefan; Nicolai, Andreas. 12 April 2016
This paper describes the file formats of the output data and geometry files generated by the Delphin program, a simulation model for hygrothermal transport in porous media. The output data format is suitable for any kind of simulation output generated by transient transport simulation models. Implementing support for the Delphin output format enables use of the advanced post-processing functionality provided by the Delphin post-processing tool and its dedicated physical analysis functionality.
|
12 |
Verification of Data-aware Business Processes in the Presence of Ontologies
Santoso, Ario. 14 November 2016
Bringing together data, processes, and structural knowledge when modeling complex enterprise systems is a challenging task that has led to the study of combining formalisms from knowledge representation, database theory, and process management. Moreover, to ensure system correctness, formal verification also comes into play as a promising approach that offers well-established techniques. In line with this, significant results have been obtained within the research on data-aware business processes, which studies the marriage between static and dynamic aspects of a system within a unified framework. However, several limitations are still present. Various formalisms for data-aware processes that have been studied typically use a simple mechanism for specifying the system dynamics. The majority of works also assume a rather simple treatment of inconsistency (i.e., they reject inconsistent system states). Many studies in this area that consider structural domain knowledge typically also assume that such knowledge remains fixed along the system evolution (context-independent), and this might be too restrictive. Moreover, the information model of data-aware processes sometimes relies on relatively simple structures. This situation might cause an abstraction gap between the high-level conceptual view that business stakeholders have, and the low-level representation of information. When it comes to verification, taking into account all of the aspects above makes the problem more challenging.
In this thesis, we investigate the verification of data-aware processes in the presence of ontologies while at the same time addressing all limitations above. Specifically, we provide the following contributions: (1) We propose a formal framework called Golog-KABs (GKABs), by leveraging state-of-the-art formalisms for data-aware processes equipped with ontologies. GKABs enable us to specify semantically-rich data-aware business processes, where the system dynamics are specified using a high-level action language inspired by the Golog programming language. (2) We propose a parametric execution semantics for GKABs that is able to elegantly accommodate a plethora of inconsistency-aware semantics based on the well-known notion of repair, and this leads us to consider several variants of inconsistency-aware GKABs. (3) We enhance GKABs towards context-sensitive GKABs that take into account the contextual information during the system evolution. (4) We marry these two settings and introduce inconsistency-aware context-sensitive GKABs. (5) We introduce the so-called Alternating-GKABs that allow for a more fine-grained analysis over the evolution of inconsistency-aware context-sensitive systems. (6) In addition to GKABs, we introduce a novel framework called Semantically-Enhanced Data-Aware Processes (SEDAPs) that, by utilizing ontologies, enables us to have a high-level conceptual view over the evolution of the underlying system. We not only provide theoretical results but have also implemented this concept of SEDAPs.
We also provide numerous reductions for the verification of sophisticated first-order temporal properties over all of the settings above, and show that verification can be addressed using existing techniques developed for Data-Centric Dynamic Systems (which is a well-established data-aware processes framework), under suitable boundedness assumptions for the number of objects freshly introduced in the system while it evolves. Notably, all proposed GKAB extensions have no negative impact on computational complexity.
|
13 |
Human Mobility and Application Usage Prediction Algorithms for Mobile Devices
Baumann, Paul. 27 October 2016
Mobile devices such as smartphones and smart watches are ubiquitous companions in people’s daily lives. Since 2014, there have been more mobile devices on Earth than humans. Mobile applications utilize sensors and actuators of these devices to support individuals in their daily life. In particular, 24% of the Android applications leverage users’ mobility data. For instance, this data allows applications to understand which places an individual typically visits. This makes it possible to provide her with transportation information or location-based advertisements, or to enable smart home heating systems. These and similar scenarios require the possibility to access the Internet from everywhere and at any time. To realize these scenarios, 83% of the applications available in the Android Play Store require the Internet to operate properly and therefore need access to it from everywhere and at any time.
Mobile applications such as Google Now or Apple Siri utilize human mobility data to anticipate where a user will go next or which information she is likely to access en route to her destination. However, predicting human mobility is a challenging task. Existing mobility prediction solutions are typically optimized a priori for a particular application scenario and mobility prediction task. There is no approach that allows for automatically composing a mobility prediction solution depending on the underlying prediction task and other parameters. Such an approach is required to allow mobile devices to support a plethora of mobile applications running on them, while each of the applications supports its users by leveraging mobility predictions in a distinct application scenario.
Mobile applications rely strongly on the availability of the Internet to work properly. However, mobile cellular network providers are struggling to provide necessary cellular resources. Mobile applications generate a monthly average mobile traffic volume that ranged between 1 GB in Asia and 3.7 GB in North America in 2015. The Ericsson Mobility Report Q1 2016 predicts that by the end of 2021 this mobile traffic volume will experience a 12-fold increase. The consequences are higher costs for both providers and consumers and a reduced quality of service due to congested mobile cellular networks. Several countermeasures can be applied to cope with these problems. For instance, mobile applications apply caching strategies to prefetch application content by predicting which applications will be used next. However, existing solutions suffer from two major shortcomings. They either (1) do not incorporate traffic volume information into their prefetching decisions and thus generate a substantial amount of cellular traffic or (2) require a modification of mobile application code.
In this thesis, we present novel human mobility and application usage prediction algorithms for mobile devices. These two major contributions address the aforementioned problems of (1) selecting a human mobility prediction model and (2) prefetching of mobile application content to reduce cellular traffic.
First, we address the selection of human mobility prediction models. We report on an extensive analysis of the influence of temporal, spatial, and phone context data on the performance of mobility prediction algorithms. Building upon our analysis results, we present (1) SELECTOR – a novel algorithm for selecting individual human mobility prediction models and (2) MAJOR – an ensemble learning approach for human mobility prediction. Furthermore, we introduce population mobility models and demonstrate their practical applicability. In particular, we analyze techniques that focus on detection of wrong human mobility predictions. Among these techniques, an ensemble learning algorithm, called LOTUS, is designed and evaluated.
Second, we present EBC – a novel algorithm for prefetching mobile application content. EBC’s goal is to reduce cellular traffic consumption to improve application content freshness. Compared with existing solutions, EBC presents novel techniques (1) to incorporate different strategies for prefetching mobile applications depending on the available network type and (2) to incorporate application traffic volume predictions into the prefetching decisions. EBC also achieves a reduction in application launch time at the cost of a negligible increase in energy consumption.
Developing human mobility and application usage prediction algorithms requires access to human mobility and application usage data. To this end, we leverage three publicly available data sets in this thesis. Furthermore, we address the shortcomings of these data sets, namely, (1) the lack of ground-truth mobility data and (2) the lack of human mobility data at short-term events like conferences. We contribute two human mobility data sets, JK2013 and the UbiComp Data Collection Campaign (UbiDCC), that address these shortcomings. We also develop and make publicly available a mobile application called LOCATOR, which was used to collect our data sets.
In summary, the contributions of this thesis provide a step further towards supporting mobile applications and their users. With SELECTOR, we contribute an algorithm that allows optimizing the quality of human mobility predictions by appropriately selecting parameters. To reduce the cellular traffic footprint of mobile applications, we contribute EBC, a novel approach for prefetching mobile application content by leveraging application usage predictions. Furthermore, we provide insights about how and to what extent wrong and uncertain human mobility predictions can be detected. Lastly, with our mobile application LOCATOR and two human mobility data sets, we contribute practical tools for researchers in the human mobility prediction domain.
|
14 |
Delphin 6 Output File Specification
Vogelsang, Stefan; Nicolai, Andreas. 29 June 2011
This paper describes the file formats of the output data and geometry files generated by the Delphin program, a simulation model for hygrothermal transport in porous media. The output data format is suitable for any kind of simulation output generated by transient transport simulation models. Implementing support for the Delphin output format enables use of the advanced post-processing functionality provided by the Delphin post-processing tool and its dedicated physical analysis functionality. The article also discusses the application programming interface of the DataIO library that can be used to read/write Delphin output data and geometry files conveniently and efficiently.
|
15 |
Datenqualität in Sensordatenströmen / Data Quality in Sensor Data Streams
Klein, Anja. 23 March 2010
The continuous development of intelligent sensor systems enables the automation and improvement of complex process and business decisions in a wide range of application scenarios. Sensors can be used, for example, to determine optimal maintenance dates or to control production lines. A fundamental problem here is sensor data quality, which is limited by environmental influences and sensor failures. The goal of this thesis is to develop a data quality model that provides applications and data consumers with quality information for a comprehensive assessment of uncertain sensor data. In addition to data structures for efficient data quality management in data streams and databases, a comprehensive data quality algebra for computing the quality of data processing results is presented. Furthermore, methods for data quality improvement are developed that are specifically tailored to the requirements of sensor data processing. The thesis is completed by approaches for user-friendly data quality querying and visualization.
|
16 |
Architectural Principles for Database Systems on Storage-Class Memory
Oukid, Ismail. 23 January 2018
Database systems have long been optimized to hide the higher latency of storage media, yielding complex persistence mechanisms. With the advent of large DRAM capacities, it became possible to keep a full copy of the data in DRAM. Systems that leverage this possibility, such as main-memory databases, keep two copies of the data in two different formats: one in main memory and the other one in storage. The two copies are kept synchronized using snapshotting and logging. This main-memory-centric architecture yields nearly two orders of magnitude faster analytical processing than traditional, disk-centric ones. The rise of Big Data emphasized the importance of such systems with an ever-increasing need for more main memory. However, DRAM is hitting its scalability limits: It is intrinsically hard to further increase its density.
Storage-Class Memory (SCM) is a group of novel memory technologies that promise to alleviate DRAM’s scalability limits. They combine the non-volatility, density, and economic characteristics of storage media with the byte-addressability and a latency close to that of DRAM. Therefore, SCM can serve as persistent main memory, thereby bridging the gap between main memory and storage. In this dissertation, we explore the impact of SCM as persistent main memory on database systems. Assuming a hybrid SCM-DRAM hardware architecture, we propose a novel software architecture for database systems that places primary data in SCM and directly operates on it, eliminating the need for explicit IO. This architecture yields many benefits: First, it obviates the need to reload data from storage to main memory during recovery, as data is discovered and accessed directly in SCM. Second, it allows replacing the traditional logging infrastructure by fine-grained, cheap micro-logging at data-structure level. Third, secondary data can be stored in DRAM and reconstructed during recovery. Fourth, system runtime information can be stored in SCM to improve recovery time. Finally, the system may retain and continue in-flight transactions in case of system failures.
However, SCM is no panacea as it raises unprecedented programming challenges. Given its byte-addressability and low latency, processors can access, read, modify, and persist data in SCM using load/store instructions at a CPU cache line granularity. The path from CPU registers to SCM is long and mostly volatile, including store buffers and CPU caches, leaving the programmer with little control over when data is persisted. Therefore, there is a need to enforce the order and durability of SCM writes using persistence primitives, such as cache line flushing instructions. This in turn creates new failure scenarios, such as missing or misplaced persistence primitives.
We devise several building blocks to overcome these challenges. First, we identify the programming challenges of SCM and present a sound programming model that solves them. Then, we tackle memory management, as the first required building block to build a database system, by designing a highly scalable SCM allocator, named PAllocator, that fulfills the versatile needs of database systems. Thereafter, we propose the FPTree, a highly scalable hybrid SCM-DRAM persistent B+-Tree that bridges the gap between the performance of transient and persistent B+-Trees. Using these building blocks, we realize our envisioned database architecture in SOFORT, a hybrid SCM-DRAM columnar transactional engine. We propose an SCM-optimized MVCC scheme that eliminates write-ahead logging from the critical path of transactions. Since SCM-resident data is near-instantly available upon recovery, the new recovery bottleneck is rebuilding DRAM-based data. To alleviate this bottleneck, we propose a novel recovery technique that achieves nearly instant responsiveness of the database by accepting queries right after recovering SCM-based data, while rebuilding DRAM-based data in the background. Additionally, SCM brings new failure scenarios that existing testing tools cannot detect. Hence, we propose an online testing framework that is able to automatically simulate power failures and detect missing or misplaced persistence primitives. Finally, our proposed building blocks can serve to build more complex systems, paving the way for future database systems on SCM.
|
17 |
Erweiterung des CRC-Karten-Konzeptes um Rollen
Hamann, Markus. 11 January 2018
Role-based modeling is a current branch of research that requires methods for analysis and teaching. To this end, this thesis presents an extension of the classical, object-oriented CRC card method with role-based concepts. The extension is based on fundamental properties of role-based elements, such as roles, objects, and contexts, which are integrated modularly into the CRC card method. Furthermore, an empirical study is to determine how well the role-extended R-CRC card method is suited to tasks in analysis and teaching. Ultimately, the R-CRC card method is intended to offer an efficient way to analyze problems in a role-based manner and to teach role-based concepts.
|
18 |
Managing and Consuming Completeness Information for RDF Data Sources
Darari, Fariz. 04 July 2017
The ever-increasing amount of Semantic Web data gives rise to the question: How complete is the data? Though data on the Semantic Web is generally incomplete, many parts of it are indeed complete, such as the children of Barack Obama and the crew of Apollo 11. This thesis aims to study how to manage and consume completeness information about Semantic Web data. In particular, we first discuss how completeness information can guarantee the completeness of query answering. Next, we propose optimization techniques for completeness reasoning and conduct experimental evaluations to show the feasibility of our approaches. We also provide a technique to check the soundness of queries with negation via reduction to query completeness checking. We further enrich completeness information with timestamps, enabling query answers to be checked up to when they are complete. We then introduce two demonstrators, i.e., CORNER and COOL-WD, to show how our completeness framework can be realized. Finally, we investigate an automated method to generate completeness statements from text on the Web via relation cardinality extraction.
|
19 |
Clustering of Distributed Word Representations and its Applicability for Enterprise Search
Korger, Christina. 04 October 2016
Machine learning of distributed word representations with neural embeddings is a state-of-the-art approach to modelling semantic relationships hidden in natural language. The thesis “Clustering of Distributed Word Representations and its Applicability for Enterprise Search” covers different aspects of how such a model can be applied to knowledge management in enterprises. A review of distributed word representations and related language modelling techniques, combined with an overview of applicable clustering algorithms, constitutes the basis for practical studies. The latter have two goals: firstly, they examine the quality of German embedding models trained with gensim and a selection of parameter configurations. Secondly, clusterings conducted on the resulting word representations are evaluated against the objective of retrieving immediate semantic relations for a given term. The application of the final results to company-wide knowledge management is subsequently outlined using the example of the platform intergator and conceptual extensions.
|
20 |
Energy-Efficient Key/Value Store
Tena, Frezewd Lemma. 11 September 2017
Energy conservation is a major concern in today's data centers, which are the data processing factories of the 21st century and where large and complex software systems such as distributed data management stores run and serve billions of users. The two main drivers of this concern are the environmental impact of data centers due to their waste heat and the high cost data centers incur due to their enormous energy demand. Among the many subsystems of data centers, the storage system is one of the main sources of energy consumption. Among the many types of storage systems, key/value stores are among the most widely used in data centers. In this work, I investigate energy-saving techniques that enable a key/value store based on consistent hashing to save energy during low-activity times and whenever there is an opportunity to reuse the waste heat of data centers.
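For readers unfamiliar with consistent hashing, the following is a minimal generic sketch of key placement on a hash ring, not the design described in the thesis. The class `HashRing`, the helper `_hash`, and the `vnodes` parameter are hypothetical names introduced only for illustration. It shows the property that makes the technique interesting for powering nodes down: removing one node reassigns only the keys that node owned.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring (illustrative choice of hash)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Place `vnodes` positions for this node on the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue  # skip the (very unlikely) position collision
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node):
        """Take a node off the ring, e.g. when it is powered down."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        """Return the node owning the first ring position clockwise of the key."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]
```

A possible use, under the same assumptions: build a ring with `HashRing(["a", "b", "c", "d"])`, call `remove("d")` during a low-activity period, and every key previously served by `a`, `b`, or `c` still resolves to the same node. The sketch covers key placement only; a real store would also have to migrate or replicate the data held by the removed node.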
|