  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Medical Data Management on the cloud / Gestion de données médicales sur le cloud

Mohamad, Baraa 23 June 2015
French abstract unavailable / Medical data management has become a real challenge due to the emergence of new imaging technologies providing high image resolutions. This thesis focuses in particular on the management of DICOM files. DICOM is one of the most important medical standards. DICOM files have a special data format in which one file may contain regular data, multimedia data and services. These files are extremely heterogeneous (the schema of a file cannot be predicted) and have large data sizes. The characteristics of DICOM files, added to the requirements of medical data management in general in terms of availability and accessibility, have led us to formulate our research question as follows: is it possible to build a system that (1) is highly available, (2) supports any medical images (different specialties, modalities and physicians' practices), (3) can store extremely large, ever-increasing data, (4) provides expressive access and (5) is cost-effective? To answer this question we have built a hybrid (row-column) cloud-enabled storage system. The idea of this solution is to distribute DICOM attributes thoughtfully, depending on their characteristics, over both data layouts in a way that provides the best of row-oriented and column-oriented storage models in one system, while exploiting the features of the cloud that enable us to ensure the availability and portability of medical data. Storing data in such a hybrid layout opens the door to a second research question: how to process queries efficiently over this hybrid data storage while enabling new and more efficient query plans. The originality of our proposal comes from the fact that there is currently no system that stores data in such hybrid storage (i.e., an attribute is either in a row-oriented database or in a column-oriented one, and a given query can interrogate both storage models at the same time) and studies query processing over it. The experimental prototypes implemented in this thesis show interesting results and open the door to multiple optimizations and research questions.
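The hybrid layout described in this abstract can be illustrated with a toy sketch. It is not the thesis prototype: the class name HybridStore, its methods, and the choice of which DICOM attributes go to which layout are assumptions made only for illustration. Some attributes are kept together in row records, the others are stored one column per attribute, and a single query touches both.

```python
from typing import Any, Callable, Dict, List, Set


class HybridStore:
    """Toy hybrid layout (hypothetical): row-stored and column-stored attributes."""

    def __init__(self, row_attrs: Set[str]) -> None:
        self.row_attrs = row_attrs                       # attributes kept in row records
        self.rows: Dict[int, Dict[str, Any]] = {}        # record id -> row-stored part
        self.columns: Dict[str, Dict[int, Any]] = {}     # attribute -> {record id: value}

    def insert(self, rid: int, record: Dict[str, Any]) -> None:
        """Split one record across the two layouts."""
        row_part: Dict[str, Any] = {}
        for attr, value in record.items():
            if attr in self.row_attrs:
                row_part[attr] = value
            else:
                self.columns.setdefault(attr, {})[rid] = value
        self.rows[rid] = row_part

    def select(self, column_attr: str, predicate: Callable[[Any], bool],
               project: List[str]) -> List[Dict[str, Any]]:
        """Filter on a column-stored attribute, then project row-stored attributes."""
        column = self.columns.get(column_attr, {})
        matches = sorted(rid for rid, value in column.items() if predicate(value))
        return [{attr: self.rows[rid].get(attr) for attr in project} for rid in matches]


store = HybridStore(row_attrs={"PatientName", "StudyDate"})
store.insert(1, {"PatientName": "A", "StudyDate": "2015-01-02", "Modality": "CT"})
store.insert(2, {"PatientName": "B", "StudyDate": "2015-01-03", "Modality": "MR"})
ct_patients = store.select("Modality", lambda m: m == "CT", ["PatientName"])
```

The filter scans only the one column it needs, while the projection reads whole row records; that split is the general motivation for mixing the two layouts, under the assumptions stated above.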
42

Efficient and Reliable In-Network Query Processing in Wireless Sensor Networks

Malhotra, Baljeet Singh 11 1900
Wireless Sensor Networks (WSNs) have emerged as a new paradigm for collecting and processing data from physical environments, such as wildlife sanctuaries, large warehouses, and battlefields. Users can access sensor data by issuing queries over the network, e.g., to find the 10 highest temperature values in the network. Typically, a WSN operates by constructing a logical topology, such as a spanning tree, built on top of the physical topology of the network. The constructed logical topology is then used to disseminate queries in the network, and also to process and return the results of such queries back to the user. A major challenge in this context is prolonging the network's lifetime, which mainly depends on the energy cost of data communication via wireless radios, known to be very expensive compared to the cost of data processing within the network. In this research, we investigate some of the core problems that deal with the different aspects of in-network query processing in WSNs. In that context, we propose an efficient filtering-based algorithm for top-k query processing in WSNs. Through a systematic study of top-k query processing in WSNs we propose several solutions in this thesis, which are applicable not only to top-k queries, but also to in-network query processing problems in general. Specifically, we consider broadcasting and convergecasting, which are two basic operations that are required by many in-network query processing solutions. Scheduling broadcasting and convergecasting is another problem that is important for energy efficiency in WSNs. Failure of communication links, which is common in WSNs, is yet another important issue that needs to be addressed. In this research, we take a holistic approach to deal with the above problems while processing top-k queries in WSNs. To this end, the thesis makes several contributions. In particular, our proposed solutions include new logical topologies, scheduling algorithms, and an overall sophisticated communication framework, which allow top-k queries to be processed efficiently and with increased reliability. Extensive simulation studies reveal that our solutions are not only energy efficient, saving up to 50% of the energy cost compared to the current state-of-the-art solutions, but also robust to link failures.
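As a rough illustration of in-network top-k processing over a spanning tree, the sketch below performs a simple convergecast in which every node merges its own reading with the lists received from its children and forwards at most k values to its parent. This is a generic baseline, not the filtering-based algorithm proposed in the thesis; the function name, the tree encoding and the readings are hypothetical.

```python
import heapq
from typing import Dict, List


def topk_convergecast(tree: Dict[int, List[int]], readings: Dict[int, float],
                      node: int, k: int) -> List[float]:
    """Return the k largest readings in the subtree rooted at `node`.

    `tree` maps a node to its children in the logical spanning tree.
    Each node sends at most k values upward, which bounds the radio traffic per link.
    """
    candidates = [readings[node]]
    for child in tree.get(node, []):
        candidates.extend(topk_convergecast(tree, readings, child, k))
    return heapq.nlargest(k, candidates)


# Node 0 is the sink (root); nodes 3 and 4 are leaves below node 1.
tree = {0: [1, 2], 1: [3, 4]}
readings = {0: 20.5, 1: 31.0, 2: 18.2, 3: 35.4, 4: 29.9}
top2 = topk_convergecast(tree, readings, node=0, k=2)  # [35.4, 31.0]
```

Because no link ever carries more than k values, the communication cost per link is independent of the subtree size, which is the kind of saving that in-network processing aims for.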
43

Data Processing Techniques on Modern Hardware Architectures

Tsirogiannis, Dimitrios 31 August 2011
The last decade has been characterized by radical changes in the computing landscape. We have witnessed the advent of multi-core processors, flash-based storage systems and the proliferation of scale out architectures, such as map-reduce-based systems and massively parallel databases. Although data management systems have embraced modern hardware technologies to some extent, they have not realized their full potential. The goal of this thesis is two-fold. Primarily, it demonstrates the staggering potential for performance improvement offered by modern hardware architectures and, then, proposes how data management systems must alter in order to realize this potential. Additionally, this thesis demonstrates that utilizing modern hardware architectures is important both for performance and energy-efficiency. Towards this goal, we propose query processing and indexing techniques for chip multiprocessors and we analyze the trade-offs of executing complex database queries on modern processor technologies. Subsequently, we propose query processing methods tailored to flash-based storage systems. Finally, we analyze the power consumption of database systems and we reveal opportunities for improving their energy efficiency.
45

Query Processing in a Traceable P2P Record Exchange Framework

ISHIKAWA, Yoshiharu, LI, Fengrong 01 June 2010
No description available.
46

Flexible techniques for heterogeneous XML data retrieval

Sanz Blasco, Ismael 31 October 2007
The progressive adoption of XML by new communities of users has motivated the appearance of applications that require the management of large and complex collections, which present a large amount of heterogeneity. Some relevant examples are present in the fields of bioinformatics, cultural heritage, ontology management and geographic information systems, where heterogeneity is not only reflected in the textual content of documents, but also in the presence of rich structures which cannot be properly accounted for using fixed schema definitions. Current approaches for dealing with heterogeneous XML data are, however, mainly focused on the content level, whereas at the structural level only a limited amount of heterogeneity is tolerated; for instance, weakening the parent-child relationship between nodes into the ancestor-descendant relationship. The main objective of this thesis is devising new approaches for querying heterogeneous XML collections. This general objective has several implications: First, a collection can present different levels of heterogeneity at different granularity levels; this fact has a significant impact on the selection of specific approaches for handling, indexing and querying the collection. Therefore, several metrics are proposed for evaluating the level of heterogeneity at different levels, based on information-theoretical considerations. These metrics can be employed for characterizing collections, and clustering together those collections which present similar characteristics. Second, the high structural variability implies that query techniques based on exact tree matching, such as the standard XPath and XQuery languages, are not suitable for heterogeneous XML collections. As a consequence, approximate querying techniques based on similarity measures must be adopted.
Within the thesis, we present a formal framework for the creation of similarity measures which is based on a study of the literature that shows that most approaches for approximate XML retrieval (i) are highly tailored to very specific problems and (ii) use similarity measures for ranking that can be expressed as ad-hoc combinations of a set of 'basic' measures. Some examples of these widely used measures are tf-idf for textual information and several variations of edit distances. Our approach wraps these basic measures into generic, parametrizable components that can be combined into complex measures by exploiting the composite pattern, commonly used in Software Engineering. This approach also allows us to integrate seamlessly highly specific measures, such as protein-oriented matching functions. Finally, these measures are employed for the approximate retrieval of data in a context of high structural heterogeneity, using a new approach based on the concepts of pattern and fragment. In our context, a pattern is a concise representation of the information needs of a user, and a fragment is a match of a pattern found in the database. A pattern consists of a set of tree-structured elements (basically an XML subtree that is intended to be found in the database), but with flexible semantics that are strongly dependent on a particular similarity measure. For example, depending on a particular measure, the particular hierarchy of elements, or the ordering of siblings, may or may not be deemed to be relevant when searching for occurrences in the database. Fragment matching, as a query primitive, can deal with a much higher degree of flexibility than existing approaches. In this thesis we provide exhaustive and top-k query algorithms. In the latter case, we adopt an approach that does not require the similarity measure to be monotonic, as all previous XML top-k algorithms (usually based on Fagin's algorithm) do.
We also present two extensions which are important in practical settings: a specification for the integration of the aforementioned techniques into XQuery, and a clustering algorithm that is useful to manage complex result sets. All of the algorithms have been implemented as part of ArHeX, a toolkit for the development of multi-similarity XML applications, which supports fragment-based queries through an extension of the XQuery language, and includes graphical tools for designing similarity measures and querying collections. We have used ArHeX to demonstrate the effectiveness of our approach using both synthetic and real data sets, in the context of a biomedical research project.
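The idea of wrapping basic measures into components that are combined through the composite pattern can be sketched as follows. This is an illustrative sketch only and does not reproduce the ArHeX API; the class names Measure, ExactMatch, TokenJaccard and WeightedSum are hypothetical, and token-set Jaccard similarity stands in for the basic measures named in the abstract.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class Measure(ABC):
    """Common interface shared by basic and composite similarity measures."""

    @abstractmethod
    def score(self, a: str, b: str) -> float:
        """Return a similarity in [0, 1]."""


class ExactMatch(Measure):
    def score(self, a: str, b: str) -> float:
        return 1.0 if a == b else 0.0


class TokenJaccard(Measure):
    """Jaccard similarity over lower-cased whitespace tokens."""

    def score(self, a: str, b: str) -> float:
        ta, tb = set(a.lower().split()), set(b.lower().split())
        if not ta and not tb:
            return 1.0
        return len(ta & tb) / len(ta | tb)


class WeightedSum(Measure):
    """Composite: a weighted average of child measures, itself a Measure."""

    def __init__(self, parts: List[Tuple[float, Measure]]) -> None:
        self.parts = parts

    def score(self, a: str, b: str) -> float:
        total = sum(weight for weight, _ in self.parts)
        return sum(weight * m.score(a, b) for weight, m in self.parts) / total


combined = WeightedSum([(1.0, ExactMatch()), (3.0, TokenJaccard())])
similarity = combined.score("protein kinase", "protein kinase C")  # about 0.5
```

Because WeightedSum is itself a Measure, composites can be nested inside other composites, which is what lets complex measures be assembled from a small set of basic ones.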
47

Ranked Retrieval in Uncertain and Probabilistic Databases

Soliman, Mohamed January 2011
Ranking queries are widely used in data exploration, data analysis and decision making scenarios. While most of the currently proposed ranking techniques focus on deterministic data, several emerging applications involve data that are imprecise or uncertain. Ranking uncertain data raises new challenges in query semantics and processing, making conventional methods inapplicable. Furthermore, the interplay between ranking and uncertainty models introduces new dimensions for ordering query results that do not exist in the traditional settings. This dissertation introduces new formulations and processing techniques for ranking queries on uncertain data. The formulations are based on a marriage of traditional ranking semantics with possible worlds semantics under widely adopted uncertainty models. In particular, we focus on studying the impact of tuple-level and attribute-level uncertainty on the semantics and processing techniques of ranking queries. Under the tuple-level uncertainty model, we introduce a processing framework leveraging the capabilities of relational database systems to recognize and handle data uncertainty in score-based ranking. The framework encapsulates a state space model, and efficient search algorithms that compute query answers by lazily materializing the necessary parts of the space. Under the attribute-level uncertainty model, we give a new probabilistic ranking model, based on partial orders, to encapsulate the space of possible rankings originating from uncertainty in attribute values. We present a set of efficient query evaluation algorithms, including sampling-based techniques based on the theory of Markov chains and the Monte Carlo method, to compute query answers. We build on our techniques for ranking under attribute-level uncertainty to support rank join queries on uncertain data. We show how to extend current rank join methods to handle uncertainty in scoring attributes.
We provide a pipelined query operator implementation of an uncertainty-aware rank join algorithm integrated with sampling techniques to compute query answers.
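To make the possible worlds semantics under tuple-level uncertainty concrete, the sketch below enumerates every possible world for a handful of independent tuples and accumulates the probability that each tuple ranks first. It is a brute-force illustration and not the dissertation's method, which avoids full enumeration through lazy state space search and sampling; the function name, the independence assumption and the example data are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple


def top1_probabilities(tuples: List[Tuple[str, float, float]]) -> Dict[str, float]:
    """Probability that each tuple has the highest score among existing tuples.

    Each input tuple is (name, score, membership_probability); tuples are
    assumed independent. Runs in O(2^n) worlds, so it is only for tiny inputs.
    """
    result = {name: 0.0 for name, _, _ in tuples}
    for world in product([True, False], repeat=len(tuples)):
        world_prob = 1.0
        present = []
        for exists, (name, score, p) in zip(world, tuples):
            world_prob *= p if exists else 1.0 - p
            if exists:
                present.append((score, name))
        if present:
            # The top-ranked tuple of this world receives the world's probability.
            result[max(present)[1]] += world_prob
    return result


probs = top1_probabilities([("t1", 90.0, 0.5), ("t2", 80.0, 1.0)])
# t1 is on top whenever it exists (0.5); otherwise t2 is on top (0.5).
```

The exponential number of worlds is exactly why lazy materialization and sampling are attractive for realistic data sizes.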
48

Data Integration Over Horizontally Partitioned Databases In Service-oriented Data Grids

Sonmez Sunercan, Hatice Kevser 01 September 2010
Information integration over distributed and heterogeneous resources has been challenging in many respects: coping with various kinds of heterogeneity, including data model, platform and access interfaces; coping with various forms of data distribution and maintenance policies; scalability; performance; security and trust; reliability and resilience; legal issues; and so on. Each of these dimensions deserves a separate thread of research. The challenge most relevant to the work presented in this thesis is coping with various forms of data distribution and maintenance policies. This thesis aims to provide a service-oriented data integration solution over data Grids for cases where distributed data sources are partitioned with overlapping sections of various proportions. This is an interesting variation which combines both replicated and partitioned data within the same data management framework. Thus, the data management infrastructure has to deal with specific challenges regarding the identification, access and aggregation of partitioned data with varying proportions of overlapping sections. To provide a solution we have extended OGSA-DAI DQP, a well-known service-oriented data access and integration middleware with distributed query processing facilities, by incorporating a UnionPartitions operator into its algebra in order to cope with various unusual forms of horizontally partitioned databases. As a result, our solution extends OGSA-DAI DQP in two respects: (1) a new operator type is added to the algebra to perform a specialized union of partitions with different characteristics, and (2) the OGSA-DAI DQP Federation Description is extended to include additional metadata to facilitate the successful execution of the newly introduced operator.
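The core difficulty of unioning horizontal partitions that overlap can be shown with a small sketch: rows replicated in several partitions must appear only once in the result. This is not the UnionPartitions operator of the thesis nor any part of the OGSA-DAI DQP API; the function name, the use of a single key column to detect duplicates and the sample rows are assumptions for illustration.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def union_partitions(partitions: Iterable[Iterable[Row]], key: str) -> List[Row]:
    """Union overlapping horizontal partitions, keeping the first copy of each key."""
    seen = set()
    merged: List[Row] = []
    for partition in partitions:
        for row in partition:
            row_key = row[key]
            if row_key not in seen:      # skip rows replicated in an earlier partition
                seen.add(row_key)
                merged.append(row)
    return merged


site_a = [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]
site_b = [{"id": 2, "name": "beta"}, {"id": 3, "name": "gamma"}]  # overlaps on id 2
merged = union_partitions([site_a, site_b], key="id")
```

A plain bag union would return id 2 twice, whereas a specialized union can use partition metadata to decide which overlapping sections need deduplication at all.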
49

A Query Language and Its Processing for Time-Series Document Clusters

Khy, Sophoin, Ishikawa, Yoshiharu, Kitagawa, Hiroyuki 12 1900
No description available.
50

Design and Implementation of Query Processing Strategies for Video Data

Yang, Wen-Haur 09 July 2002
Traditional database systems only support textual and numerical data. Video data stored in these database systems can only be retrieved through their video identifiers, titles or descriptions. In video data, frame-by-frame object change is one of the most obvious kinds of information. Each video contains temporal and spatial relationships between content objects. The temporal relationships can be specified between frame sequences and the spatial relationships can be specified by the relationships between objects in a single frame. The difficulty in designing a content-based video database system is how to store and describe the relationships between moving objects completely. Many studies on content-based video retrieval represent the content of a video as a set of frames, but they either leave out the temporal ordering of frames in the shot or only store the relationships between objects in a single frame. Based on these observations, we conclude that a content-based video database system requires video indexing, query processing and a convenient user interface to fit the requirements and characteristics of videos. In this thesis, we design and implement a query processing strategy for video data. In the proposed strategy, we consider three query types: the exact object match, the spatial-temporal object retrieval and the motion query, where an exact object match finds the video files that contain the specified objects, a spatial-temporal object retrieval retrieves the object pairs that satisfy some spatial-temporal relationships, and a motion query finds the set of frames that contain the object movements. Moreover, we consider three design issues: video indexing, video query processing and the video query interface. When there are a large number of videos in a video database and each video contains many shots, frames and objects, the processing time for content retrieval is tremendous.
Thus, we need a proper video indexing strategy to reduce the search time. In order to capture the spatial-temporal relationships of objects between different frames, we build indexes on both the spatial and temporal axes. In the temporal index file structure, we propose the shot-based B+-tree to index the temporal data. In the spatial index file structure, we use an R-tree to store not only the relationships between objects in one frame, but also the relationships of one object at its first and last appearances in the shot. Based on this strategy, we can describe the status of a moving object in detail. For query processing, we propose a signature file structure to filter out the videos that cannot possibly be the answer. After that, in order to determine whether the answer exists in the candidate videos, we use a multi-dimensional string, called a binary string, to represent the spatial-temporal relationships between objects. The video query processing problem then becomes a binary string matching problem. Finally, we design and implement a user-friendly user interface. Our system runs on a Pentium III machine with one CPU at a clock rate of 550 MHz and 256 MB of main memory, running Windows 2000 Professional edition; it uses an Access 2000 database and is coded in Delphi 6 in about 10,000 lines. From our experience, we show that the proposed system supports efficient query processing, fast searching and a user-friendly user interface.
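The filtering role of a signature file can be sketched with superimposed coding: each video gets a bit signature built from the objects it contains, and a query signature is compared against it with a bitwise test. This is a generic sketch of the technique, not the signature structure designed in the thesis; the 64-bit width, the use of CRC32 as the hash and all names are assumptions.

```python
import zlib
from typing import Dict, Iterable, List, Set

BITS = 64  # signature width; an arbitrary choice for this sketch


def signature(objects: Iterable[str]) -> int:
    """Superimpose one hashed bit per object name into a single bit mask."""
    sig = 0
    for name in objects:
        sig |= 1 << (zlib.crc32(name.encode("utf-8")) % BITS)
    return sig


def candidates(videos: Dict[str, Set[str]], query: Set[str]) -> List[str]:
    """Keep videos whose signature covers the query signature.

    Videos that fail the test cannot contain all query objects; videos that
    pass may still be false positives because different names can share a bit.
    """
    q = signature(query)
    return [vid for vid, objs in videos.items() if (signature(objs) & q) == q]


def exact_match(videos: Dict[str, Set[str]], query: Set[str]) -> List[str]:
    """Verify the surviving candidates against the real object sets."""
    return [vid for vid in candidates(videos, query) if query <= videos[vid]]


videos = {"v1": {"car", "tree"}, "v2": {"dog"}}
hits = exact_match(videos, {"car"})
```

The signature test never discards a true answer, so the more expensive verification step (binary string matching in the thesis) only has to run on the candidates that survive.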
