Global ETD Search

41	An Integrative and Uniform Model for Metadata Management in Data Warehousing Environment Stöhr, Thomas, Müller, Robert, Rahm, Erhard 05 February 2019 Due to the increasing complexity of data warehouses, a centralized and declarative management of metadata is essential for data warehouse administration, maintenance and usage. Metadata are usually divided into technical and semantic metadata. Typically, current approaches only support subsets of these metadata types, such as data movement metadata or multidimensional metadata for OLAP. In particular, the interdependencies between technical and semantic metadata have not yet been investigated sufficiently. The representation of these interdependencies forms an important prerequisite for the translation of queries formulated at the business concept level to executable queries on physical data. Therefore, we suggest a uniform and integrative model for data warehouse metadata. This model uses a uniform representation approach based on the Unified Modeling Language (UML) to integrate technical and semantic metadata and their interdependencies. info:eu-repo/classification/ddc/004 ddc:004
42	HematoWork: a Knowledge-based Workflow System for Distributed Cancer Therapy Müller, Robert, Heller, Barbara, Löffler, Markus, Rahm, Erhard, Winter, Alfred, Mantovani, Luisa, Klöss, M., Berger, H. Stephen, Brümmer, Franz, Fiebig, F., Jödecke, E., Neubert, Ulrike, Speer, Runa 05 February 2019 The domain of hemato-oncology is characterized by a complex and data-intensive treatment and the involvement of geographically distributed institutions (e.g. oncological ward, central commission, external panels) in the context of protocol-directed trials. Current research efforts in this domain (e.g. [1-3]) focus on specialized subtasks such as chemotherapy calculation and toxicity monitoring, but fail to support inter-application data flow and coordination aspects which have been identified as essential for integration in heterogeneous and distributed clinical environments (e.g. [4,5]). Therefore, the distributed workflow system HEMATOWORK, which has explicit knowledge about the oncological treatment and the associated communication paths between the involved institutions, is currently being developed at Leipzig University. In particular, HEMATOWORK intends to support the following basic tasks: Informatics, Computer science, Medicine info:eu-repo/classification/ddc/004 ddc:004
43	DOL - an Interoperable Document Server Melnik, Sergey, Rahm, Erhard, Sosna, Dieter 05 February 2019 We describe the design of and experiences gained with the database and web-based document server DOL, which we developed at the University of Leipzig (http://dol.uni-leipzig.de). The server provides a central repository for a variety of fulltext documents. In Leipzig, it has been used since 1998 as a university-wide digital library for documents by local authors, in particular Ph.D. theses, master theses, research papers, lecture notes etc., offering a central access point to the university's research results and educational material. Decentralized administration and different workflows are supported to meet organizational and legal requirements of specific document types (e.g., Ph.D. theses). All documents are converted into several formats, and can be downloaded or viewed online in a page-wise fashion. The documents are searchable in a flexible way using fulltext and bibliographic queries. Moreover, a multi-level navigation interface is provided, supporting browsing along several dimensions. DOL is interoperable with global digital libraries such as NCSTRL and can be ported to the needs of different organisations. It is also in use at Stanford University. Informatics, Computer science, Databases info:eu-repo/classification/ddc/004 ddc:004
44	Skew-Insensitive Join Processing in Shared-Disk Database Systems Märtens, Holger 05 February 2019 Skew effects are still a significant problem for efficient query processing in parallel database systems. Especially in shared-nothing environments, this problem is aggravated by the substantial cost of data redistribution. Shared-disk systems, on the other hand, promise much higher flexibility in the distribution of workload among processing nodes because all input data can be accessed by any node at equal cost. In order to verify this potential for dynamic load balancing, we have devised a new technique for skew-tolerant join processing. In contrast to conventional solutions, our algorithm is not restricted to estimating processing costs in advance and assigning tasks to nodes accordingly. Instead, it monitors the actual progression of work and dynamically allocates tasks to processors, thus capitalizing on the uniform access pathlength in shared-disk architectures. This approach has the potential to alleviate not only any kind of data-inherent skew, but also execution skew caused by query-external workloads, by disk contention, or simply by inaccurate estimates used in predictive scheduling. We employ a detailed simulation system to evaluate the new algorithm under different types and degrees of skew. info:eu-repo/classification/ddc/004 ddc:004
45	Generic Schema Matching with Cupid Madhavan, Jayant, Bernstein, Philip A., Rahm, Erhard 05 February 2019 Schema matching is a critical step in many applications, such as XML message mapping, data warehouse loading, and schema integration. In this paper, we investigate algorithms for generic schema matching, outside of any particular data model or application. We first present a taxonomy for past solutions, showing that a rich range of techniques is available. We then propose a new algorithm, Cupid, that discovers mappings between schema elements based on their names, data types, constraints, and schema structure, using a broader set of techniques than past approaches. Some of our innovations are the integrated use of linguistic and structural matching, context-dependent matching of shared types, and a bias toward leaf structure where much of the schema content resides. After describing our algorithm, we present experimental results that compare Cupid to two other schema matching systems. info:eu-repo/classification/ddc/004 ddc:004
46	Training Selection for Tuning Entity Matching Köpcke, Hanna, Rahm, Erhard 06 February 2019 Entity matching is a crucial and difficult task for data integration. An effective solution strategy typically has to combine several techniques and find suitable settings for critical configuration parameters such as similarity thresholds. Supervised (training-based) approaches promise to reduce the manual work for determining (learning) effective strategies for entity matching. However, they critically depend on training data selection, which is a difficult problem that has so far mostly been addressed manually by human experts. In this paper, we propose a training-based framework called STEM for entity matching and present different generic methods for automatically selecting training data to combine and configure several matching techniques. We evaluate the proposed methods for different match tasks and small- and medium-sized training sets. Informatics, Computer science, Databases info:eu-repo/classification/ddc/004 ddc:004
47	Effects of Mobile Business Processes on the Software Process Köhler, André, Gruhn, Volker 06 February 2019 The adoption of mobile technologies into companies frequently follows a technology-driven approach without precise knowledge about the potential benefits that may be realised. Especially in larger organisations with complex business processes, a systematic procedure is required if a verifiable economic benefit is to be created by the use of mobile technologies. Therefore, the term “mobile business process” is defined in this paper. Subsequently, we introduce a procedure for the systematic analysis of the distributed structure of a business process model in order to identify requirements for software engineering in mobile sub-processes. For that purpose, the method Mobile Process Landscaping is used to decompose a process model into different levels of detail. The method aims to manage the complexity and limit the process analysis to the potentially mobile sub-processes from the beginning. The result of the analysis can be used on the one hand as a foundation for the redesign of the business processes and on the other hand for the requirements engineering of mobile information systems. info:eu-repo/classification/ddc/004 ddc:004
48	Analysis of Mobile Business Processes for the Design of Mobile Information Systems Köhler, André, Gruhn, Volker 06 February 2019 The adoption of mobile technologies into companies frequently follows a technology-driven approach without precise knowledge about the potential benefits that may be realised. Especially in larger organisations with complex business processes, a systematic procedure is required if a verifiable economic benefit is to be created by the use of mobile technologies. Therefore, the term “mobile business process”, as well as requirements for information systems applied in such processes, are defined in this paper. Subsequently, we introduce a procedure for the systematic analysis of the distributed structure of a business process model in order to identify mobile sub-processes. For that purpose, the method Mobile Process Landscaping is used to decompose a process model into different levels of detail. The method aims to manage the complexity and limit the process analysis to the potentially mobile sub-processes from the beginning. The result of the analysis can be used on the one hand as a foundation for the redesign of the business processes and on the other hand for the requirements engineering of mobile information systems. An application of this method is shown by the example of business processes in the insurance industry. info:eu-repo/classification/ddc/004 ddc:004
49	A special-purpose peer-to-peer file sharing system for mobile ad hoc networks Klemm, Alexander, Lindemann, Christoph, Waldhorst, Oliver P. 06 February 2019 Establishing peer-to-peer (P2P) file sharing for mobile ad hoc networks (MANET) requires the construction of a search algorithm for transmitting queries and search results as well as the development of a transfer protocol for downloading files matching a query. In this paper, we present a special-purpose system for searching and file transfer tailored to both the characteristics of MANET and the requirements of peer-to-peer file sharing. Our approach is based on an application layer overlay network. As an innovative feature, overlay routes are set up on demand by the search algorithm, closely matching network topology and transparently aggregating redundant transfer paths on a per-file basis. The transfer protocol guarantees low transmission overhead and a high fraction of successful downloads by utilizing overlay routes. In a detailed ns-2 simulation study, we show that both the search algorithm and the transfer protocol outperform off-the-shelf approaches based on a P2P file sharing system for the wireline Internet, TCP and a MANET routing protocol. info:eu-repo/classification/ddc/004 ddc:004
50	Relating Query Popularity and File Replication in the Gnutella Peer-to-Peer Network Klemm, Alexander, Lindemann, Christoph, Waldhorst, Oliver P. 06 February 2019 In this paper, we characterize the user behavior in a peer-to-peer (P2P) file sharing network. Our characterization is based on the results of an extensive passive measurement study of the messages exchanged in the Gnutella P2P file sharing system. Using the data recorded during this measurement study, we analyze which queries a user issues and which files a user shares. The investigation of users' queries leads to the characterization of query popularity. Furthermore, the analysis of the files shared by the users leads to a characterization of file replication. As a major contribution, we relate query popularity and file replication by an analytical formula characterizing the matching of files to queries. The analytical formula defines a matching probability for each pair of query and file, which depends on the rank of the query with respect to query popularity, but is independent of the rank of the file with respect to file replication. We validate this model by conducting a detailed simulation study of a Gnutella-style overlay network and comparing simulation results to the results obtained from the measurement. Informatics, Computer science, Networks info:eu-repo/classification/ddc/004 ddc:004
