Global ETD Search

361	Ranked Retrieval in Uncertain and Probabilistic Databases Soliman, Mohamed January 2011 Ranking queries are widely used in data exploration, data analysis and decision making scenarios. While most of the currently proposed ranking techniques focus on deterministic data, several emerging applications involve data that are imprecise or uncertain. Ranking uncertain data raises new challenges in query semantics and processing, making conventional methods inapplicable. Furthermore, the interplay between ranking and uncertainty models introduces new dimensions for ordering query results that do not exist in the traditional settings. This dissertation introduces new formulations and processing techniques for ranking queries on uncertain data. The formulations are based on a marriage of traditional ranking semantics with possible worlds semantics under widely adopted uncertainty models. In particular, we focus on studying the impact of tuple-level and attribute-level uncertainty on the semantics and processing techniques of ranking queries. Under the tuple-level uncertainty model, we introduce a processing framework leveraging the capabilities of relational database systems to recognize and handle data uncertainty in score-based ranking. The framework encapsulates a state space model and efficient search algorithms that compute query answers by lazily materializing the necessary parts of the space. Under the attribute-level uncertainty model, we give a new probabilistic ranking model, based on partial orders, to encapsulate the space of possible rankings originating from uncertainty in attribute values. We present a set of efficient query evaluation algorithms, including sampling-based techniques based on the theory of Markov chains and the Monte Carlo method, to compute query answers. We build on our techniques for ranking under attribute-level uncertainty to support rank join queries on uncertain data.
We show how to extend current rank join methods to handle uncertainty in scoring attributes. We provide a pipelined query operator implementation of an uncertainty-aware rank join algorithm integrated with sampling techniques to compute query answers. Ranking Uncertainty Probabilistic Models Query Processing Top-k Partial Order Computer Science
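To make the possible worlds semantics in the abstract above concrete, here is a minimal brute-force sketch in Python. It is not the dissertation's method, which searches a state space lazily; it simply enumerates every possible world. It assumes tuple-level uncertainty with independent tuples, each given as a hypothetical (name, score, membership probability) triple, and the function name `topk_probabilities` is made up for illustration.

```python
from itertools import product


def topk_probabilities(tuples, k):
    """Return, for each tuple, the probability that it appears in the top-k.

    `tuples` is a list of (name, score, prob) triples. Tuples are assumed to
    exist independently of one another (an assumption of this sketch).
    """
    probs = {name: 0.0 for name, _, _ in tuples}
    # Each possible world is one choice of present/absent for every tuple.
    for present in product([False, True], repeat=len(tuples)):
        world_prob = 1.0
        world = []
        for (name, score, p), here in zip(tuples, present):
            world_prob *= p if here else (1.0 - p)
            if here:
                world.append((score, name))
        if world_prob == 0.0:
            continue  # impossible world, contributes nothing
        # Rank the world deterministically by score, highest first.
        world.sort(reverse=True)
        for _, name in world[:k]:
            probs[name] += world_prob
    return probs


if __name__ == "__main__":
    data = [("t1", 90, 0.5), ("t2", 80, 0.9), ("t3", 70, 1.0)]
    print(topk_probabilities(data, k=1))
```

The enumeration visits 2^n worlds for n tuples, which is only workable for toy inputs; that exponential cost is the reason lazy or sampling-based evaluation, as described in the abstract, is of interest.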
362	CDAR : contour detection aggregation and routing in sensor networks Pulimi, Venkat 05 May 2010 Wireless sensor networks offer the advantages of low cost, flexible measurement of phenomena in a wide variety of applications, and easy deployment. Since sensor nodes are typically battery powered, energy efficiency is an important objective in designing sensor network algorithms. These algorithms are often application-specific, owing to the need to carefully optimize energy usage, and because deployments usually support a single or very few applications. This thesis concerns applications in which the sensors monitor a continuous scalar field, such as temperature, and addresses the problem of determining the location of a contour line in this scalar field, in response to a query, and communicating this information to a designated sink node. An energy-efficient solution to this problem is proposed and evaluated. This solution includes new contour detection and query propagation algorithms, in-network processing algorithms, and routing algorithms. Only a small fraction of network nodes may be adjacent to the desired contour line, and the contour detection and query propagation algorithms attempt to minimize processing and communication by the other network nodes. The in-network processing algorithms reduce communication volume through suppression, compression and aggregation techniques. Finally, the routing algorithms attempt to route the contour information to the sink as efficiently as possible, while meshing with the other algorithms. Simulation results show that the proposed algorithms yield significant improvements in data and message volumes compared to baseline models, while maintaining the integrity of the contour representation. Contour Wireless sensor networks Adhoc networks Aggregation Routing Query propagation Sensor networks Isolines Contour data
363	Efficient Access Methods on the Hilbert Curve Wu, Chen-Chang 18 June 2012 The design of multi-dimensional access methods is more difficult than that of one-dimensional ones because there is no total ordering that preserves spatial locality. One approach is to look for a total order that preserves spatial proximity at least to some extent. A space-filling curve is a continuous path that passes through every point in a space exactly once, giving a one-to-one correspondence between the coordinates of the points and the 1D sequence numbers of the points on the curve. The Hilbert curve is a well-known space-filling curve that has been shown to have strong locality-preserving properties; that is, it is the best space-filling curve in minimizing the number of clusters. Hence, it has been extensively used to maintain spatial locality of multidimensional data in a wide variety of applications. A window query is an important query operation in spatial (image) databases. Given a Hilbert curve, a window query reports its corresponding orders without the need to decode all the points inside this window into the corresponding Hilbert orders. Chung et al. have proposed an algorithm for decomposing a window into the corresponding Hilbert orders. However, the Hilbert curve requires that the region be of size 2^k x 2^k, where k∈N. The intuitive method, such as Chung et al.'s algorithm, is to directly use Hilbert curves in the decomposed areas and then connect them. Such methods must additionally generate a sequence of the scanned quadrants before encoding and decoding the Hilbert order of a pixel, and scan this sequence once while encoding and decoding each pixel. In this dissertation, on the design of methods for window queries on a Hilbert curve, we propose an efficient algorithm, named Quad-Splitting, for decomposing a window into the corresponding Hilbert orders on a Hilbert curve without individual sorting and merging steps.
The proposed algorithm does not perform the individual sorting and merging steps that are needed in Chung et al.'s algorithm. From our experimental results, we show that the Quad-Splitting algorithm outperforms Chung et al.'s algorithm. On the design of methods for generating the Hilbert curve of an arbitrary-sized image, we propose an approximately even partition approach to generate a pseudo Hilbert curve of an arbitrary-sized image. From our experimental results, we show that our proposed pseudo Hilbert curve preserves a strong locality property similar to that of the Hilbert curve. On the design of methods for coding the Hilbert curve of an arbitrary-sized image, we propose encoding and decoding algorithms. From our experimental results, we show that our encoding and decoding algorithms outperform Chung et al.'s algorithms. Image Processing Hilbert Curve Space Filling Curve Window Query Image Compression
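As a point of reference for the abstract above, the following Python sketch shows the widely used bit-manipulation encoding from grid coordinates to a Hilbert order, plus a brute-force window query that encodes every cell and merges consecutive orders into runs (the "clusters" the abstract refers to). This is the naive baseline that a decomposition algorithm such as Quad-Splitting is meant to avoid, not the dissertation's algorithm; the function names and the particular curve orientation are assumptions of this sketch.

```python
def xy2d(n, x, y):
    """Map cell (x, y) of an n x n grid (n a power of two) to its Hilbert order."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        # Quadrant index along the curve is (3 * rx) XOR ry; each quadrant
        # at this level holds s * s cells.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip so the sub-quadrant is traversed in standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def window_orders(n, x0, y0, x1, y1):
    """Brute-force window query: encode every cell in the inclusive window
    [x0, x1] x [y0, y1], then merge consecutive orders into (start, end) runs."""
    orders = sorted(
        xy2d(n, x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)
    )
    runs = []
    for d in orders:
        if runs and d == runs[-1][1] + 1:
            runs[-1][1] = d  # extend the current run
        else:
            runs.append([d, d])  # start a new run
    return [tuple(r) for r in runs]


if __name__ == "__main__":
    print(window_orders(8, 1, 1, 4, 3))
```

The number of runs returned for a window is a simple measure of how well the curve preserves locality: fewer runs mean fewer separate ranges to read from a one-dimensional index.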
364	Web Service Composition and Selection Using Query Rewriting and Bayesian Network Hsieh, I-Hsuan 24 July 2012 Web services can be broadly classified into two types, namely effect providing (EP) services and data providing (DP) services. In this work, we address the DP service composition problem, aiming to satisfy user preferences specified at the instance level, namely the expected occurrence. We first use the query rewriting method to identify a composition of service types that satisfies the user's requirement, and employ a Bayesian Network model to express the causal relationships between exchange variables of DP service types. Service selection is then conducted by computing the posterior probability in the Bayesian Network. We conduct experiments to show that our proposed Bayesian Network-based method outperforms the baseline methods in terms of execution success rate and data quality. It also has reasonable execution time. Web service composition Query rewriting Bayesian Network Service selection Data providing services
365	Retrieval of Line-drawing Images Based on Surrounding Text Lin, Shih-Hsiu 06 August 2004 With advances in information technology, engineering consulting firms have gradually digitized their documents and line-drawing images. Such digital libraries greatly facilitate document retrieval. However, engineers still face a challenging issue: searching for and retrieving line-drawing images in a digital library. With a small number of line-drawing images in a digital library, engineers can browse thumbnails to locate relevant images. As the number of line-drawing images increases, the manual browsing process becomes time-consuming and frustrating. In response to the need for efficient and effective retrieval of line-drawing images, this thesis aims to develop a line-drawing image retrieval system. Typically, a line-drawing image within an engineering document is associated with surrounding text for description or illustration purposes. Such surrounding text provides important information for automatically indexing the line-drawing image. With extracted indexes (or keywords), retrieval of line-drawing images can be accomplished using a traditional information retrieval technique. Specifically, in this study, we propose a line-drawing image retrieval system based on surrounding text. We develop four models for defining surrounding text boundaries for line-drawing images. Furthermore, two information retrieval techniques (one with and one without query expansion) are implemented and evaluated. According to our empirical evaluations, the surrounding text boundary model that combines the image caption with three sentences (the preceding, image-anchoring, and succeeding sentences) yields the best retrieval effectiveness, as measured by recall and precision rates. Line-drawing image retrieval Query expansion Line-drawing image database Information retrieval Surrounding text
366	Genetic Algorithms For Distributed Database Design And Distributed Database Query Optimization Sevinc, Ender 01 October 2009 The increasing performance of computers, reduced prices and the ability to connect systems with low-cost gigabit Ethernet LAN and ATM WAN networks make distributed database systems an attractive research area. However, the complexity of distributed database query optimization is still a limiting factor. Optimal techniques, such as dynamic programming, used in centralized database query optimization are not feasible because of the increased problem size. The recently developed genetic algorithm (GA) based optimization techniques present a promising alternative. We compared the best known GA with a random algorithm and showed that it achieves almost no improvement over the random search algorithm generating an equal number of random solutions. Then, we analyzed a set of possible GA parameters and determined that a GA using the two-point truncate technique gives the best results. The new mutation and crossover operators defined in our GA are experimentally analyzed within a synthetic distributed database with increasing numbers of relations and nodes. The synthetic database was designed with replicated relations, but without horizontal/vertical fragmentation. We can translate a select-project-join query including a fragmented relation with N fragments into a corresponding query with N relations. Comparisons show that the results produced by our new GA formulation are only 20% off the optimal results found by exhaustive search, a 50% improvement over the previously known GA-based algorithm.
367	A New Approach For Better Load Balancing Of Visibility Detection And Target Acquisition Calculations Filiz, Anil Yigit 01 August 2010 Calculating the visual perception of entities in simulations requires complex intersection tests between the line of sight and the virtual world. In this study, we focus on outdoor environments, which consist of a terrain and various objects located on the terrain. Using hardware capabilities of graphics cards, such as occlusion queries, provides a fast method for implementing these tests. In this thesis, we introduce an approach for better load balancing of visibility detection and target acquisition calculations by the use of occlusion queries. Our results show that the proposed approach is, on average, 1.5 to 2 times more efficient than the existing algorithms.
368	Data Integration Over Horizontally Partitioned Databases In Service-oriented Data Grids Sonmez Sunercan, Hatice Kevser 01 September 2010 Information integration over distributed and heterogeneous resources has been challenging in many respects: coping with various kinds of heterogeneity, including data model, platform and access interfaces; coping with various forms of data distribution and maintenance policies; scalability; performance; security and trust; reliability and resilience; legal issues; etc. Each of these dimensions deserves a separate thread of research efforts. The challenge most relevant to the work presented in this thesis is coping with various forms of data distribution and maintenance policies. This thesis aims to provide a service-oriented data integration solution over data Grids for cases where distributed data sources are partitioned with overlapping sections of various proportions. This is an interesting variation which combines both replicated and partitioned data within the same data management framework. Thus, the data management infrastructure has to deal with specific challenges regarding the identification, access and aggregation of partitioned data with varying proportions of overlapping sections. To provide a solution, we have extended OGSA-DAI DQP, a well-known service-oriented data access and integration middleware with distributed query processing facilities, by incorporating a UnionPartitions operator into its algebra in order to cope with various unusual forms of horizontally partitioned databases. As a result, our solution extends OGSA-DAI DQP in two ways: (1) a new operator type is added to the algebra to perform a specialized union of partitions with different characteristics; (2) the OGSA-DAI DQP Federation Description is extended to include additional metadata to facilitate the successful execution of the newly introduced operator.
QA Computer Software 76.75-76.765
369	Design and Implementation of Query Processing Strategies for Video Data Yang, Wen-Haur 09 July 2002 Traditional database systems only support textual and numerical data. Video data stored in these database systems can only be retrieved through their video identifiers, titles or descriptions. In video data, frame-by-frame object change is one of the most obvious kinds of information. Each video contains temporal and spatial relationships between content objects. The temporal relationships can be specified between frame sequences, and the spatial relationships can be specified by the relationships between objects in a single frame. The difficulty in designing a content-based video database system is how to store and describe the relationships between moving objects completely. Many studies on content-based video retrieval represented the content of a video as a set of frames, but they either left out the temporal ordering of frames in the shot or only stored the relationships between objects in a single frame. From these observations, we conclude that a content-based video database system requires video indexing, query processing and a convenient user interface to fit the requirements and characteristics of videos. In this thesis, we design and implement a query processing strategy for video data. In the proposed strategy, we consider three query types: the exact object match, the spatial-temporal object retrieval and the motion query, where an exact object match finds the video files that contain the specified objects, a spatial-temporal object retrieval retrieves the object pairs that satisfy some spatial-temporal relationships, and a motion query finds the set of frames that contain the object movements. Moreover, we consider three design issues: the video indexing, the video query processing and the video query interface.
When there are a large number of videos in a video database and each video contains many shots, frames and objects, the processing time for content retrieval is very long. Thus, we need a proper video indexing strategy to reduce the search time. In order to capture the spatial-temporal relationships of objects between different frames, we build indexes along both the spatial and temporal axes. In the temporal index file structure, we propose the shot-based B+-tree to index the temporal data. In the spatial index file structure, we use an R-tree to store not only the relationships between objects in one frame, but also the relationships of one object between its first and last appearances in the shot. Based on this strategy, we can describe the status of a moving object in detail. For query processing, we propose a signature file structure to filter out the videos that cannot possibly be the answer. After that, in order to determine whether the answer exists in the candidate videos, we use a multi-dimensional string, called a binary string, to represent the spatial-temporal relationships between objects. The video query processing problem then becomes a binary string matching problem. Finally, we design and implement a user-friendly user interface. Our system runs on a Pentium III machine with a 550 MHz CPU and 256 MB of main memory under Windows 2000 Professional edition, uses an Access 2000 database, and is coded in about 10,000 lines of Delphi 6. From our experience, we show that the proposed system supports efficient query processing, fast search capabilities and a user-friendly user interface. Video Query Processing Video Data Spatial-Temporal Relationships Video Indexing shot-based B+-tree
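The signature-file filtering step described in the abstract above can be illustrated with a generic superimposed-coding sketch in Python. The thesis does not spell out its signature layout here, so everything in this sketch is an assumption: the 64-bit signature width, the three bits set per object, the SHA-256 based hashing, and the function names. It only shows the general idea that a cheap bitwise test can discard videos that cannot contain all queried objects, leaving candidates for exact checking.

```python
import hashlib

SIG_BITS = 64  # signature width in bits (assumed for this sketch)


def object_signature(name, bits_per_object=3):
    """Hash an object name to a bit pattern with a few bits set."""
    sig = 0
    for i in range(bits_per_object):
        digest = hashlib.sha256(f"{name}:{i}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig


def video_signature(objects):
    """Superimpose (bitwise OR) the signatures of all objects in a video."""
    sig = 0
    for obj in objects:
        sig |= object_signature(obj)
    return sig


def may_contain(video_sig, query_objects):
    """False means the video definitely lacks some queried object.
    True means it is a candidate that still needs an exact check."""
    query_sig = video_signature(query_objects)
    return video_sig & query_sig == query_sig


if __name__ == "__main__":
    videos = {
        "v1": ["car", "tree", "dog"],
        "v2": ["boat", "bird"],
    }
    sigs = {vid: video_signature(objs) for vid, objs in videos.items()}
    candidates = [vid for vid, s in sigs.items() if may_contain(s, ["car", "dog"])]
    print(candidates)
```

A filter of this kind never rejects a video that truly contains all queried objects, but it can pass false positives, which is why a second, exact matching stage (binary string matching in the abstract) is still needed on the candidates.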
370	A Heap-Structure-Based Approach to On-Line Broadcast Scheduling in Mobile Systems Hsieh, Wu-Han 25 July 2003 Broadcast data delivery is rapidly becoming the method of choice for disseminating information to a massive user population in many new application areas where client-to-server communication is limited. There are two different ways of disseminating data. In the push-based approach, the data items are broadcast periodically on the channels; in the pull-based approach, the client requests a piece of data on the uplink channel and the server responds by sending this piece of data to the client. For the push-based approach, most previous research assumes that each mobile client needs only one data item. However, in many situations, a mobile client might need more than one data item. In the pull-based approach, the data items are broadcast dynamically. Most previous research assumes that the data items requested by the clients are of the same size. However, the data items may be of different sizes in reality. In this thesis, we propose the Improved QDS Expansion Method (Improved-QEM) and the Heuristic On-line Algorithm to overcome these two weaknesses, respectively. The problem of scheduling the broadcast data when each client may access multiple data items cannot simply be treated as multiple sub-issues. Two methods have been proposed, the Query Expansion Method (QEM) and the Modified Query Expansion Method (Modified-QEM). Both are heuristic-based algorithms and do not provide the optimal solution. To improve the performance, our Improved-QEM is an efficient scheduling method for query-set-based broadcasting, which integrates the Query Expansion Method (QEM) with the technique of mining association rules. Mining association rules can globally find the data item sets (large itemsets) that are frequently requested by clients.
From our simulation results, we show that, compared to the locally optimal approach of the previous methods, our Improved-QEM constructs a schedule with a smaller TQD than those constructed by QEM and Modified-QEM, where TQD denotes Total Query Distance and is proportional to the average access time. On-line (push-based) algorithms, which use some decision-making mechanism to determine which data item is to be broadcast next, are easy to adapt to time-varying demands for the data items. Hence, when the number of data items is huge, it is important to schedule the broadcast program so that it provides a small overall mean access time. To this end, Vaidya and Hameed have proposed two on-line algorithms, On-line Algorithm and On-line with Bucketing Algorithm. The main disadvantage of On-line Algorithm is its heavy run-time overhead, and the main disadvantage of On-line with Bucketing Algorithm is its poor overall mean access time. Therefore, we propose the heuristic on-line algorithm to solve these two problems. From our simulation results, we show that our heuristic algorithm performs close to the best overall mean access time while incurring low run-time overhead. Data Broadcasting On-Line Scheduling Query-Set-Based Dynamic Scheduling Mobile Databases
