Global ETD Search

1	Probabilistic Databases and Their Applications Zhao, Wenzhong 01 January 2004 Probabilistic reasoning in databases has been an active area of research during the last two decades. However, the previously proposed database approaches, including the probabilistic relational approach and the probabilistic object approach, are not good fits for storing and managing diverse probability distributions along with their auxiliary information. The work in this dissertation extends significantly the initial semistructured probabilistic database framework proposed by Dekhtyar, Goldsmith and Hawkes in [20]. We extend the formal Semistructured Probabilistic Object (SPO) data model of [20]. Accordingly, we also extend the Semistructured Probabilistic Algebra (SP-algebra), the query algebra proposed for the SPO model. Based on the extended framework, we have designed and implemented a Semistructured Probabilistic Database Management System (SPDBMS) on top of a relational DBMS. The SPDBMS is flexible enough to meet the need of storing and manipulating diverse probability distributions along with their associated information. Its query language supports standard database queries as well as queries specific to probabilities, such as conditionalization and marginalization. Currently the SPDBMS serves as a storage backbone for the project Decision Making and Planning under Uncertainty with Constraints (partially supported by the National Science Foundation under Grant No. ITR-0325063), which involves managing large quantities of probabilistic information. We also report our experimental results evaluating the performance of the SPDBMS. We describe an extension of the SPO model for handling interval probability distributions. The Extended Semistructured Probabilistic Object (ESPO) framework improves the flexibility of the original semistructured data model in two important features: (i) support for interval probabilities and (ii) association of context and conditionals with individual random variables. An extended SPO (ESPO) data model has been developed, and an extended query algebra for ESPO has also been introduced to manipulate probability distributions for probability intervals. The Bayesian Network Development Suite (BaNDeS), a system which builds Bayesian networks with full data management support of the SPDBMS, has been described. It allows experts with particular expertise to work only on specific subsystems during the Bayesian network construction process independently and asynchronously while updating the model in real-time. There are three major foci of our ongoing and future work: (1) implementation of a query optimizer and performance evaluation of query optimization, (2) extension of the SPDBMS to handle interval probability distributions, and (3) incorporation of machine learning techniques into the BaNDeS.
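For readers unfamiliar with the two probability-specific queries named in the abstract above, the following is a minimal sketch of marginalization and conditionalization over a small discrete joint distribution. It uses the standard textbook definitions and is not the SP-algebra's operators or the SPDBMS query language; the dictionary representation and the function names `marginalize` and `conditionalize` are assumptions made for illustration.

```python
from collections import defaultdict

def marginalize(joint, variables, keep):
    """Sum out every variable that is not listed in `keep`.

    `joint` maps a tuple of values (one per entry of `variables`) to a
    probability. This representation is an assumption for illustration.
    """
    idx = [variables.index(v) for v in keep]
    out = defaultdict(float)
    for assignment, p in joint.items():
        out[tuple(assignment[i] for i in idx)] += p
    return dict(out)

def conditionalize(joint, variables, var, value):
    """Keep rows where `var == value`, drop that variable, and renormalize."""
    i = variables.index(var)
    rows = {a[:i] + a[i + 1:]: p for a, p in joint.items() if a[i] == value}
    total = sum(rows.values())
    if total == 0:
        raise ValueError("conditioning event has probability zero")
    return {a: p / total for a, p in rows.items()}

# Hypothetical joint distribution over two random variables A and B.
variables = ["A", "B"]
joint = {("a", "x"): 0.1, ("a", "y"): 0.3, ("b", "x"): 0.2, ("b", "y"): 0.4}

marginal_a = marginalize(joint, variables, ["A"])       # P(A)
given_b_x = conditionalize(joint, variables, "B", "x")  # P(A | B = x)
```

Marginalization reduces the table to the kept variables, while conditionalization both filters and rescales so the remaining probabilities sum to one.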
2	Efficient XML Stream Processing with Automata and Query Algebra Jian, Jinhuj 27 August 2003 "XML Stream Processing is an emerging technology designed to support declarative queries over continuous streams of data. Interest in this technology is growing due to the increasing number of real-world applications, such as monitoring systems for stock, email, and sensor data, that need to analyze incoming data streams. There are, however, several open challenges. One, we must develop efficient techniques for pattern matching over the nested tag structure of XML as data streams in, token by token. Two, we must develop techniques for query optimization to cope with complex user queries given only incomplete knowledge of the source data. Considered separately, automata models have been shown by several recent works to be suited to the first problem, while algebraic query models have been regarded as appropriate foundations for the second. The question remains, however, how best to put these two models together into an overall effective system. This thesis aims to fill exactly this gap. We propose a unified query framework that augments automata-style processing with algebra-based query optimization capabilities. We use the automata model to handle the token-oriented streaming XML data and the algebraic model to support set-oriented optimization techniques. The framework has been designed in two layers such that the logical layer provides a uniform abstraction across the two models, and any optimization technique can be applied in either model uniformly using query rewritings. The physical layer, on the other hand, allows us to refine the implementation details after the logical layer optimization. We have successfully applied this framework in the Raindrop stream processing system. We have identified several trade-offs regarding which query functionality should be realized in which specific query model. We have developed novel optimization techniques to exploit these trade-offs. For example, a query rewrite rule can flexibly push down a pattern matching into the automata model when the optimizer decides that it is more efficient to do so. To deal with incomplete knowledge of source data, we have also developed novel techniques to monitor data statistics, based on which we can apply optimization techniques to choose the optimal query plan at runtime. Our experimental study confirms that considerable performance gains are achieved when these optimization techniques are applied in our system." stream runtime optimization xml automata xquery query algebra Query languages (Computer science) XML (Document markup language) Mathematical optimization
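To make the idea of automata-style pattern matching over a token stream concrete, here is a minimal sketch that matches a child-only path such as `a/b` against start and end tag tokens as they arrive. It is not the Raindrop implementation; the token format, the function name `match_path`, and the restriction to child steps from the document root are all assumptions made for illustration.

```python
def match_path(tokens, path):
    """Yield the 0-based index of each start token whose chain of open
    elements, from the root down, equals `path`.

    `tokens` is an iterable of (kind, name) pairs with kind "start" or
    "end". This token format is an assumption for illustration.
    """
    # stack[-1] is the automaton state for the currently open element:
    # the number of path steps matched so far, or -1 for a dead state.
    stack = [0]
    for i, (kind, name) in enumerate(tokens):
        if kind == "start":
            state = stack[-1]
            if 0 <= state < len(path) and path[state] == name:
                nxt = state + 1
            else:
                nxt = -1
            stack.append(nxt)
            if nxt == len(path):
                yield i
        elif kind == "end":
            # Closing a tag restores the parent's state.
            stack.pop()

# Hypothetical stream for <a><b/><c><b/></c><b/></a>.
tokens = [
    ("start", "a"), ("start", "b"), ("end", "b"),
    ("start", "c"), ("start", "b"), ("end", "b"), ("end", "c"),
    ("start", "b"), ("end", "b"), ("end", "a"),
]
matches = list(match_path(tokens, ["a", "b"]))
```

The stack mirrors the nesting of open elements, so the matcher needs only one pass over the stream and memory proportional to the document depth.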
3	Multilingual Information Processing On Relational Database Architectures Kumaran, A 12 1900 No description available. Relational Databases Computer Software Architectures Multilingual Processing Multiscript Database Systems Homophonic/Homosemic Query Processing Multilingual Information Retrieval Multilingual Names/Semantic Matching Database Performance Multiscript Text Database Systems Multilingual Query Algebra Multilingual Semantic Matching Multilexical Matching Multiscript Matching Computer Science
