Global ETD Search

1	Batch regions : process instance synchronization based on data Pufahl, Luise, Meyer, Andreas, Weske, Mathias January 2013 (has links) Business process automation improves organizations’ efficiency to perform work. In existing business process management systems, process instances run independently from each other. However, synchronizing instances carrying similar characteristics, i.e., sharing the same data, can reduce process execution costs. For example, if an online retailer receives two orders from one customer, there is a chance that they can be packed and shipped together to save shipment costs. In this paper, we use concepts from the database domain and introduce data views to business processes to identify instances which can be synchronized. Based on data views, we introduce the concept of batch regions for a context-aware instance synchronization over a set of connected activities. We also evaluate the concepts introduced in this paper with a case study comparing costs for normal and batch processing. / Die Automatisierung von Geschäftsprozessen unterstützt Unternehmen, die Ausführung ihrer Prozesse effizienter zu gestalten. In existierenden Business Process Management Systemen, werden die Instanzen eines Prozesses völlig unabhängig voneinander ausgeführt. Jedoch kann das Synchronisieren von Instanzen mit ähnlichen Charakteristiken wie z.B. den gleichen Daten zu reduzierten Ausführungskosten führen. Zum Beispiel, wenn ein Onlinehändler zwei Bestellungen vom selben Kunden mit der gleichen Lieferanschrift erhält, können diese zusammen verpackt und versendet werden, um Versandkosten zu sparen. In diesem Papier verwenden wir Konzepte aus dem Datenbankbereich und führen Datensichten für Geschäftsprozesse ein, um Instanzen zu identifizieren, welche synchronisiert werden können. Auf Grundlage der Datensichten führen wir das Konzept der Batch-Regionen ein. Eine Batch-Region ermöglicht eine kontext-bewusste Instanzen-Synchronisierung über mehrere verbundene Aktivitäten. Das eingeführte Konzept wird mit einer Fallstudie evaluiert, bei der ein Kostenvergleich zwischen der normalen Prozessausführung und der Batchverarbeitung durchgeführt wird. BPM Batchverarbeitung Gruppierung von Prozessinstanzen Datensicht BPM batch processing process instance grouping data view Data processing Computer science
2	Big Data causing Big (TLB) Problems: Taming Random Memory Accesses on the GPU Karnagel, Tomas, Ben-Nun, Tal, Werner, Matthias, Habich, Dirk, Lehner, Wolfgang 13 June 2022 (has links) GPUs are increasingly adopted for large-scale database processing, where data accesses represent the major part of the computation. If the data accesses are irregular, like hash table accesses or random sampling, the GPU performance can suffer. Especially when scaling such accesses beyond 2GB of data, a performance decrease of an order of magnitude is encountered. This paper analyzes the source of the slowdown through extensive micro-benchmarking, attributing the root cause to the Translation Lookaside Buffer (TLB). Using the micro-benchmarks, the TLB hierarchy and structure are fully analyzed on two different GPU architectures, identifying never-before-published TLB sizes that can be used for efficient large-scale application tuning. Based on the gained knowledge, we propose a TLB-conscious approach to mitigate the slowdown for algorithms with irregular memory access. The proposed approach is applied to two fundamental database operations - random sampling and hash-based grouping - showing that the slowdown can be dramatically reduced, and resulting in a performance increase of up to 13×. info:eu-repo/classification/ddc/004 ddc:004
3	Cache-Efficient Aggregation: Hashing Is Sorting Müller, Ingo, Sanders, Peter, Lacurie, Arnaud, Lehner, Wolfgang, Färber, Franz 14 June 2022 (has links) For decades researchers have studied the duality of hashing and sorting for the implementation of the relational operators, especially for efficient aggregation. Depending on the underlying hardware and software architecture, the specifically implemented algorithms, and the data sets used in the experiments, different authors came to different conclusions about which is the better approach. In this paper we argue that in terms of cache efficiency, the two paradigms are actually the same. We support our claim by showing that the complexity of hashing is the same as the complexity of sorting in the external memory model. Furthermore we make the similarity of the two approaches obvious by designing an algorithmic framework that allows to switch seamlessly between hashing and sorting during execution. The fact that we mix hashing and sorting routines in the same algorithmic framework allows us to leverage the advantages of both approaches and makes their similarity obvious. On a more practical note, we also show how to achieve very low constant factors by tuning both the hashing and the sorting routines to modern hardware. Since we observe a complementary dependency of the constant factors of the two routines to the locality of the input, we exploit our framework to switch to the faster routine where appropriate. The result is a novel relational aggregation algorithm that is cache-efficient---independently and without prior knowledge of input skew and output cardinality---, highly parallelizable on modern multi-core systems, and operating at a speed close to the memory bandwidth, thus outperforming the state-of-the-art by up to 3.7x. info:eu-repo/classification/ddc/004 ddc:004

1

Page generated in 0.0462 seconds