61

A Brilliant Expansion of Horizons: A Multilingual Semantic Search for the SLUB-Katalog

Bonte, Achim, Glass, Robert, Mittelbach, Jens 19 December 2011 (has links)
With the launch of its new SLUB-Katalog, built on the Primo discovery software from Ex Libris, the Sächsische Landesbibliothek – Staats- und Universitätsbibliothek Dresden (SLUB) left the increasingly inadequate world of traditional electronic library catalogues behind in December 2010. Within nine months, an overarching catalogue front end was built that sits on top of older systems (for data harvesting and for using the local patron management), yet remains largely independent of them. A particular challenge was the ambition not to deploy Primo 'out of the box', that is, as a faceless off-the-shelf product, but to shape it individually as the centrepiece of the entire information offering and to integrate it extensively into the general website. The circulation and patron management was also to be incorporated into the overall concept as seamlessly as possible. Today, behind an attractive user interface, the SLUB-Katalog offers very good result ranking, spelling correction, rich drill-down options, flexible sorting algorithms, and other features familiar from search engines.
62

On-line analytical processing in distributed data warehouses

Lehner, Wolfgang, Albrecht, Jens 14 April 2022 (has links)
The concepts of 'data warehousing' and 'on-line analytical processing' have attracted growing interest in the research and commercial product communities. Today, the trend is moving away from complex centralized data warehouses towards distributed data marts integrated under a common conceptual schema. However, as the first part of this paper demonstrates, there are many problems and few solutions for large distributed decision support systems in corporations operating worldwide. After presenting the benefits and problems of the distributed approach, the paper outlines possibilities for achieving good performance in distributed on-line analytical processing. Finally, the architectural framework of the prototypical distributed OLAP system CUBESTAR is outlined.
63

Shrinked Data Marts Enabled for Negative Caching

Lehner, Wolfgang, Thiele, Maik 15 June 2022 (has links)
Data marts storing pre-aggregated data, prepared for further roll-ups, play an essential role in data warehouse environments and lead to significant performance gains in query evaluation. However, in order to guarantee the completeness of query results on the data mart without accessing the underlying data warehouse, null values need to be stored explicitly; this process is denoted as negative caching. Such null values typically occur in multidimensional data sets, which are naturally very sparse. To our knowledge, there is no prior work on shrinking the null tuples of a multi-dimensional data set within ROLAP. For these tuples, we propose a lossless compression technique that leads to a dramatic reduction in the size of the data mart. Queries that depend on null value information can be answered with 100% precision by partially inflating the shrunken data mart. We complement our analytical approach with an experimental evaluation on real and synthetic data sets and demonstrate our results.
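To make the negative-caching idea concrete, the following minimal Python sketch records the cached cube region explicitly and derives null tuples as the complement of the stored facts, so they never have to be materialized individually. All names (NegativeCachingMart, lookup, the example dimensions) are illustrative assumptions, not the paper's actual data structures or compression scheme.

```python
class NegativeCachingMart:
    """Toy data mart that caches pre-aggregated facts plus negative information.

    Instead of storing every null tuple of the cached (sparse) cube region, the
    region itself is recorded and null tuples are derived as the complement of
    the stored facts -- a simple stand-in for the lossless shrinking idea.
    """

    def __init__(self, dimension_domains, facts):
        # dimension_domains: e.g. {"store": {"S1", "S2"}, "month": {"2024-01"}}
        # facts: coordinate tuple -> pre-aggregated measure
        self.domains = dimension_domains
        self.facts = facts

    def _in_cached_region(self, coord):
        return len(coord) == len(self.domains) and all(
            value in domain for value, domain in zip(coord, self.domains.values())
        )

    def lookup(self, coord):
        """Return (value, is_complete) for a point query on the data mart."""
        if coord in self.facts:
            return self.facts[coord], True
        if self._in_cached_region(coord):
            # Known empty cell: the null tuple is implied, never stored.
            return None, True
        # Outside the cached region: fall back to the underlying warehouse.
        return None, False


mart = NegativeCachingMart(
    {"store": {"S1", "S2"}, "month": {"2024-01", "2024-02"}},
    {("S1", "2024-01"): 120.0, ("S2", "2024-02"): 75.5},
)
print(mart.lookup(("S2", "2024-01")))  # (None, True): complete answer, no null tuple stored
print(mart.lookup(("S3", "2024-01")))  # (None, False): outside the cache, ask the warehouse
```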
64

Optimistic Coarse-Grained Cache Semantics for Data Marts

Lehner, Wolfgang, Thiele, Maik, Albrecht, Jens 15 June 2022 (has links)
Data marts and caching are two closely related concepts in the domain of multi-dimensional data. Both store pre-computed data to provide fast response times for complex OLAP queries, and for both it must be guaranteed that every query can be completely processed. However, they differ greatly in their update behaviour, which we exploit to build a specific data mart extended by cache semantics. In this paper, we introduce a novel cache exploitation concept for data marts - coarse-grained caching - in which the containedness check for a multi-dimensional query is done by comparing the expected and the actual cardinalities. To this end, we subdivide the multi-dimensional data into coarse partitions, the so-called cubelets, which make it possible to specify completeness criteria for incoming queries. We show that during query processing the completeness check is performed at no additional cost.
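A minimal sketch of the cardinality-based containedness check: the cube is split into coarse partitions, each with a known expected tuple count, and a query restricted to a set of partitions is answered from the cache only if the actual counts match. The class name, the cubelet identifiers, and the source of the expected cardinalities are assumptions made purely for illustration.

```python
class CoarseGrainedCache:
    """Toy sketch of a cardinality-based containedness check.

    The multi-dimensional data is split into coarse partitions ("cubelets");
    for each cubelet the cache knows how many tuples it should contain, and a
    query restricted to a set of cubelets is answered from the cache only if
    the actual tuple counts match those expectations.
    """

    def __init__(self, expected_counts):
        self.expected = expected_counts                    # cubelet id -> expected cardinality
        self.tuples = {cid: [] for cid in expected_counts}

    def insert(self, cubelet_id, row):
        self.tuples[cubelet_id].append(row)

    def is_complete(self, cubelet_ids):
        """Containedness check: expected vs. actual cardinality per cubelet."""
        return all(len(self.tuples[c]) == self.expected[c] for c in cubelet_ids)

    def answer(self, cubelet_ids):
        if not self.is_complete(cubelet_ids):
            raise LookupError("cache incomplete for this query; delegate to the warehouse")
        return [row for c in cubelet_ids for row in self.tuples[c]]


cache = CoarseGrainedCache({"Q1/Germany": 2, "Q1/France": 3})
cache.insert("Q1/Germany", ("Berlin", 10))
cache.insert("Q1/Germany", ("Munich", 7))
print(cache.is_complete(["Q1/Germany"]))   # True: cardinalities match, answer from cache
print(cache.is_complete(["Q1/France"]))    # False: cubelet not fully loaded yet
```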
65

Efficient Query Processing for Dynamically Changing Datasets

Idris, Muhammad, Ugarte, Martín, Vansummeren, Stijn, Voigt, Hannes, Lehner, Wolfgang 11 August 2022 (has links)
The ability to efficiently analyze changing data is a key requirement of many real-time analytics applications. Traditional approaches to this problem were developed around the notion of Incremental View Maintenance (IVM) and are based either on the materialization of subresults (to avoid their recomputation) or on the recomputation of subresults (to avoid the space overhead of materialization). Both techniques are suboptimal: instead of materializing results and subresults, one may instead maintain a data structure that supports efficient maintenance under updates and from which the full query result can quickly be enumerated. In two previous articles, we presented algorithms for dynamically evaluating queries that are easy to implement, efficient, and naturally extensible to queries from a wide range of application domains. In this paper, we discuss our algorithm and its complexity, explaining the main components behind its efficiency. Finally, we present experiments comparing our algorithm to a state-of-the-art (higher-order) IVM engine as well as to a prominent complex event recognition engine. Our approach outperforms the competitor systems by up to two orders of magnitude in processing time and one order of magnitude in memory consumption.
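The core idea, maintaining a lightweight index structure under updates and enumerating the join result only on demand, can be sketched for a single equi-join as follows. This toy DynamicJoin class is a deliberately simplified illustration under assumed names, not the algorithm from the articles.

```python
from collections import defaultdict


class DynamicJoin:
    """Minimal sketch of "maintain a structure, enumerate on demand" for a single
    equi-join R(a, b) JOIN S(b, c). Updates only touch small per-key indexes; the
    join result is never materialized and is enumerated only when requested.
    """

    def __init__(self):
        self.r_by_b = defaultdict(set)   # b -> set of a values from R
        self.s_by_b = defaultdict(set)   # b -> set of c values from S

    def insert_r(self, a, b):
        self.r_by_b[b].add(a)

    def insert_s(self, b, c):
        self.s_by_b[b].add(c)

    def delete_r(self, a, b):
        self.r_by_b[b].discard(a)

    def enumerate(self):
        """Yield all current join tuples (a, b, c)."""
        for b, a_values in self.r_by_b.items():
            for c in self.s_by_b.get(b, ()):
                for a in a_values:
                    yield (a, b, c)


join = DynamicJoin()
join.insert_r("x", 1)
join.insert_s(1, "u")
join.insert_s(1, "v")
print(sorted(join.enumerate()))   # [('x', 1, 'u'), ('x', 1, 'v')]
```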
66

Designing Random Sample Synopses with Outliers

Lehner, Wolfgang, Rösch, Philipp, Gemulla, Rainer 12 August 2022 (has links)
Random sampling is one of the most widely used means to build synopses of large datasets because random samples can be used for a wide range of analytical tasks. Unfortunately, the quality of the estimates derived from a sample is negatively affected by the presence of 'outliers' in the data. In this paper, we show how to circumvent this shortcoming by constructing outlier-aware sample synopses. Our approach extends the well-known outlier indexing scheme to multiple aggregation columns.
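The following sketch illustrates the general principle for one aggregation column: keep the most extreme values exactly in an outlier index, sample the remainder uniformly, and combine the exact outlier sum with the scaled sample sum. The function names and the way outliers are picked (largest absolute values) are simplifying assumptions, not the construction described in the paper.

```python
import random


def outlier_aware_synopsis(values, sample_size, outlier_count):
    """Toy outlier-aware synopsis for a single aggregation column.

    The `outlier_count` most extreme values are kept exactly (the outlier index);
    the remaining values are sampled uniformly at random.
    """
    ranked = sorted(values, key=abs, reverse=True)
    outliers, rest = ranked[:outlier_count], ranked[outlier_count:]
    sample = random.sample(rest, min(sample_size, len(rest)))
    return outliers, sample, len(rest)


def estimate_sum(outliers, sample, rest_size):
    """Exact sum over the outliers plus the scaled-up sum over the sample."""
    scale = rest_size / len(sample) if sample else 0.0
    return sum(outliers) + scale * sum(sample)


data = [5, 7, 6, 4, 8, 5, 9_000, 6, 5, -7_500]   # two extreme outliers
o, s, n = outlier_aware_synopsis(data, sample_size=4, outlier_count=2)
print(estimate_sum(o, s, n), "vs exact", sum(data))
```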
67

A Sample Advisor for Approximate Query Processing

Rösch, Philipp, Lehner, Wolfgang 25 January 2023 (has links)
The rapid growth of current data warehouse systems makes random sampling a crucial component of modern data management systems. Although there is a large body of work on database sampling, the problem of automatic sample selection has remained (almost) unaddressed. In this paper, we tackle this problem with a sample advisor. We propose a cost model to evaluate a sample for a given query. Based on this model, our sample advisor determines the optimal set of samples for a given set of queries specified by an expert. We further propose an extension that utilizes recorded workload information; in this case, the sample advisor takes the set of queries and a given memory bound into account when computing the sample advice. Additionally, we consider merging samples in the case of overlapping sample advice and present both an exact and a heuristic solution. In our evaluation, we analyze the properties of the cost model, compare the proposed algorithms, and demonstrate the effectiveness and efficiency of the heuristic solutions with a variety of experiments.
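A rough sketch of how a cost-based sample advisor might trade benefit against memory: each candidate sample gets a benefit score from a (here drastically simplified) cost model, and candidates are chosen greedily under a memory bound. Both functions, their signatures, and the candidate fields are hypothetical illustrations, not the advisor's actual cost model or selection algorithm.

```python
def sample_benefit(query_cost_full, query_cost_sample, sample_size_bytes):
    """Simplified stand-in for a cost model: saving in query cost when answering
    from the sample instead of the base data, per byte the sample occupies."""
    return (query_cost_full - query_cost_sample) / sample_size_bytes


def greedy_sample_advice(candidates, memory_bound):
    """Pick candidate samples (dicts with 'size' and 'benefit') greedily by
    benefit density until the memory bound is exhausted -- a heuristic sketch."""
    chosen, used = [], 0
    for cand in sorted(candidates, key=lambda c: c["benefit"] / c["size"], reverse=True):
        if used + cand["size"] <= memory_bound:
            chosen.append(cand)
            used += cand["size"]
    return chosen


candidates = [
    {"name": "sales_by_region", "size": 40, "benefit": 900},
    {"name": "sales_by_product", "size": 70, "benefit": 1000},
    {"name": "returns_by_month", "size": 20, "benefit": 300},
]
print([c["name"] for c in greedy_sample_advice(candidates, memory_bound=100)])
# ['sales_by_region', 'returns_by_month']
```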
68

General dynamic Yannakakis: Conjunctive queries with theta joins under updates

Idris, Muhammad, Ugarte, Martín, Vansummeren, Stijn, Voigt, Hannes, Lehner, Wolfgang 17 July 2023 (has links)
The ability to efficiently analyze changing data is a key requirement of many real-time analytics applications. In prior work, we proposed general dynamic Yannakakis (GDYN), a general framework for dynamically processing acyclic conjunctive queries with θ-joins in the presence of data updates. Whereas traditional approaches face a trade-off between materialization of subresults (to avoid inefficient recomputation) and recomputation of subresults (to avoid the potentially large space overhead of materialization), GDYN avoids this trade-off: it intelligently maintains a succinct data structure that supports efficient maintenance under updates and from which the full query result can quickly be enumerated. In this paper, we consolidate and extend the development of GDYN. First, we give a full formal proof of GDYN's correctness and complexity. Second, we present a novel algorithm for computing GDYN query plans. Finally, we instantiate GDYN to the case where all θ-joins are inequalities and present an extended experimental comparison against state-of-the-art engines. Our approach performs consistently better than the competitor systems, with improvements of multiple orders of magnitude in both processing time and memory consumption.
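For the inequality-join instantiation, the flavour of "maintain a small structure, enumerate on demand" can be sketched as below: the values of one relation are kept sorted so that, after each update, all partners satisfying a < b can be enumerated directly. This is a deliberately tiny stand-in under assumed names; GDYN's query plans over acyclic conjunctive queries are far more general.

```python
import bisect


class DynamicInequalityJoin:
    """Toy maintenance structure for R(a) JOIN S(b) on the condition a < b.

    S-values are kept sorted, so matching partners for each a can be enumerated
    with a single binary search, without materializing the join result.
    """

    def __init__(self):
        self.r_values = []
        self.s_sorted = []

    def insert_r(self, a):
        self.r_values.append(a)

    def insert_s(self, b):
        bisect.insort(self.s_sorted, b)

    def enumerate(self):
        """Yield all pairs (a, b) with a < b currently in the join."""
        for a in self.r_values:
            start = bisect.bisect_right(self.s_sorted, a)
            for b in self.s_sorted[start:]:
                yield (a, b)


join = DynamicInequalityJoin()
for b in (3, 7, 1):
    join.insert_s(b)
join.insert_r(2)
print(list(join.enumerate()))   # [(2, 3), (2, 7)]
```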
69

Querying databases privately: a new approach to private information retrieval

Asonov, Dmitri. January 1900 (has links)
Thesis (doctoral) - Humboldt Universität, Berlin, 2003. Includes bibliographical references (p. 107-113) and index. Also issued online.
