111. Queries, Data, and Statistics: Pick Two
Mishra, Chaitanya (21 April 2010)
The query processor of a relational database system executes declarative queries on relational data using query evaluation plans. The cost of a query evaluation plan depends on various statistics determined by the query and the data, including intermediate and base table sizes and data distributions on columns. Beyond being an important factor in query optimization, such statistics also influence various runtime properties of the query evaluation plan.

This thesis explores the interactions between queries, data, and statistics in the query processor of a relational database system. Specifically, we consider problems where any two of the three (queries, data, and statistics) are provided, with the objective of instantiating the missing element so that the query, when executed on the data, satisfies the statistics on the associated subexpressions. We present multiple query processing problems that can be abstractly formulated in this manner.

The first contribution of this thesis is a monitoring framework for collecting and estimating statistics during query execution. We apply this framework to the problems of monitoring the progress of query execution and adaptively reoptimizing query execution plans. Our monitoring and adaptivity framework has low overhead while significantly reducing query execution times. This work demonstrates the feasibility and utility of overlaying statistics estimators on query evaluation plans.

Our next contribution is a framework for testing the performance of a query processor by generating targeted test queries and databases. We present techniques for data-aware query generation and query-aware data generation that satisfy test cases specifying statistical constraints. We formally analyze the hardness of the problems considered and present systems that support best-effort semantics for targeted query and data generation.

The final contribution of this thesis is a set of techniques for designing queries for business intelligence applications that specify cardinality constraints on the result. We present an interactive query refinement framework that explicitly incorporates user feedback into query design, refining queries that return too many or too few answers.
Each of these contributions is accompanied by a formal analysis of the problem, and a detailed experimental evaluation of an associated system.
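As a rough illustration of the cardinality-constrained refinement idea from the final contribution (the abstract includes no code; the table, column, and target numbers below are invented for the demo), here is a minimal sketch that binary-searches the upper bound of a range predicate until the result count lands inside a target interval:

```python
import sqlite3

def refine_range(conn, table, column, lo, hi, target_min, target_max, max_iters=40):
    """Binary-search the upper bound of a range predicate until the query's
    cardinality lands inside [target_min, target_max]."""
    low, high = lo, hi
    mid, count = high, None
    for _ in range(max_iters):
        mid = (low + high) / 2.0
        count = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE {column} BETWEEN ? AND ?",
            (lo, mid)).fetchone()[0]
        if count < target_min:        # too few answers: widen the range
            low = mid
        elif count > target_max:      # too many answers: narrow the range
            high = mid
        else:
            break
    return (lo, mid), count

# Demo on synthetic data: find a predicate returning roughly 1000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales(amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?)", [(i * 0.5,) for i in range(10000)])
(lo, hi), n = refine_range(conn, "sales", "amount", 0, 5000, 900, 1100)
print(f"amount BETWEEN {lo} AND {hi:.2f} -> {n} rows")
```

A real refinement framework would also handle multiple predicates and incorporate explicit user feedback on which answers to keep; the binary search here stands in for that feedback loop.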
112. Query Interactions in Database Systems
Ahmad, Mumtaz (January 2012)
The typical workload in a database system consists of a mix of multiple queries of different types, running concurrently and interacting with each other. The same query may perform differently in different mixes. Hence, optimizing performance requires reasoning about query mixes and their interactions rather than about individual queries or query types. In this dissertation, we demonstrate how queries affect each other when they execute concurrently in different mixes, and we show the significant impact that query interactions can have on end-to-end workload performance.

A major hurdle in understanding query interactions in database systems is the large spectrum of possible causes: interactions can arise from any of the resource-related, data-related, or configuration-related dependencies in the system. This variation in underlying causes makes it very difficult to build robust analytical performance models that capture query interactions. We present a new approach for modeling performance in the presence of interactions, based on conducting experiments to measure the effect of query interactions and fitting statistical models to the data collected in these experiments. The experiments collect samples of the different possible query mixes and measure the performance metrics of interest for the queries in these sample mixes. Statistical models such as simple regression and instance-based learning techniques are then trained on these samples. This approach requires no prior assumptions about the internal workings of the database system or the nature or cause of the interactions, making it portable across systems.

We demonstrate the potential of capturing, modeling, and exploiting query interactions by developing techniques for two database performance tasks: workload scheduling and estimating the completion time of a workload. These are important workload management problems that database administrators deal with routinely. For scheduling, we consider workloads of report-generation queries; our scheduling algorithms employ statistical performance models to schedule appropriate query mixes for the given workload. Our experimental evaluation demonstrates that our interaction-aware scheduling algorithms outperform scheduling policies typically used in database systems. For completion-time estimation, the state of the art offers no systematic solution; database administrators typically rely on heuristics or observations of past behavior. We propose a more rigorous solution based on a workload simulator that employs performance models to simulate the execution of the different mixes that make up a workload. This mix-based simulator gives database administrators a systematic tool for estimating workload completion time, and our experimental evaluation shows that it estimates completion times with a high degree of accuracy.

Overall, this dissertation demonstrates that reasoning about query interactions holds significant potential for realizing performance improvements in database systems. The techniques developed in this work can be viewed as initial steps in this interesting area of research, with ample room for future work.
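As a loose illustration of the experiment-driven modeling idea (not the dissertation's actual models; the mix samples and timings below are fabricated for the demo), this sketch fits an ordinary least-squares model mapping a mix-composition vector to measured average completion time, then reuses the model inside a crude batch-by-batch workload simulator:

```python
import numpy as np

# Each sample mix: how many queries of each of three types run concurrently,
# paired with the measured average completion time of that mix (seconds).
mixes = np.array([
    [4, 0, 0], [0, 4, 0], [0, 0, 4],
    [2, 2, 0], [2, 0, 2], [0, 2, 2],
    [1, 1, 2], [2, 1, 1], [3, 1, 0],
])
avg_time = np.array([8.0, 14.0, 21.0, 12.5, 16.0, 18.5, 16.5, 13.0, 10.0])

# Fit time ~ w . mix + b by ordinary least squares.
X = np.hstack([mixes, np.ones((len(mixes), 1))])
coef, *_ = np.linalg.lstsq(X, avg_time, rcond=None)

def predict_mix_time(mix):
    """Predicted average completion time for a given concurrent mix."""
    return float(np.dot(coef[:-1], mix) + coef[-1])

def estimate_workload_time(pending, batch):
    """Crude mix-based simulator: run the workload in batches shaped like
    `batch` and accumulate the predicted time of each batch."""
    total, pending = 0.0, np.array(pending, dtype=float)
    while pending.sum() > 0:
        step = np.minimum(pending, batch)
        total += predict_mix_time(step)
        pending -= step
    return total

print(predict_mix_time([2, 1, 1]))
print(estimate_workload_time([8, 4, 4], batch=np.array([2, 1, 1])))
```

An interaction-aware scheduler would search over which batch compositions to run, using the fitted model to compare candidate mixes; instance-based learners (e.g., nearest-neighbor over sampled mixes) can replace the linear model without changing the surrounding machinery.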
113. Engineering truly automated data integration and translation systems
Warren, Robert H (10 December 2007)
This thesis presents an automated, data-driven integration process for relational databases. Whereas previous integration methods assumed a large amount of user involvement as well as the availability of database meta-data, we make no use of meta-data and require little end-user input. This is achieved using a novel join- and translation-finding algorithm that searches for the proper key/foreign-key relationships while inferring the instance transformations from one database to another. Because we rely only on the relations that bind the attributes together, we make no use of database schema information. A novel searching method allows us to search the database for relevant objects without requiring server-side indexes or cooperative databases.
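The thesis's algorithm is more involved, but the instance-level intuition behind key/foreign-key discovery without schema metadata can be sketched simply (toy data and invented names below): rank candidate column pairs by how fully one column's distinct values are contained in another's.

```python
def containment(a_vals, b_vals):
    """Fraction of distinct values in column A that also appear in column B."""
    a, b = set(a_vals), set(b_vals)
    return len(a & b) / len(a) if a else 0.0

def candidate_joins(db1, db2, threshold=0.8):
    """Propose key/foreign-key pairs between two databases given only
    instance data: each database is {table: {column: list_of_values}}."""
    candidates = []
    for t1, cols1 in db1.items():
        for c1, v1 in cols1.items():
            for t2, cols2 in db2.items():
                for c2, v2 in cols2.items():
                    score = containment(v1, v2)
                    if score >= threshold:
                        candidates.append((f"{t1}.{c1}", f"{t2}.{c2}", score))
    return sorted(candidates, key=lambda x: -x[2])

orders = {"orders": {"cust_id": [1, 2, 2, 3], "total": [9.5, 3.0, 7.2, 1.1]}}
crm = {"customers": {"id": [1, 2, 3, 4], "name": ["ann", "bo", "cy", "di"]}}
for src, dst, s in candidate_joins(orders, crm):
    print(f"{src} -> {dst} (containment {s:.2f})")
```

A full system would sample large columns rather than materialize all values, and would also search for the value translations (formatting, unit, and encoding differences) that the thesis infers alongside the join paths.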
115. Using metadata to implement eForms and their associated databases
Lelei, Edgar David Kiprop (18 January 2011)
Web forms (eForms) and databases are at present widely used for data handling in most web applications. While eForms are used for data gathering and display, databases are used for data storage. To connect an eForm to a database, an eForm processor is used; it supports saving, retrieving, updating, and deleting data. In most web applications, eForms, eForm processors, and databases are designed and implemented separately. This leads to two main challenges: one, complexity in the manipulation of eForms and their associated databases; and two, difficulty in reproducing and reusing existing eForms.

To address these challenges, this thesis proposes the use of metadata in the creation and implementation of both eForms and their associated databases. Our approach comprises a two-part solution: one, modeling domain metadata; and two, creating a tool called Delk eForm Creator. Resource Description Framework Schema (RDFS) was used to model the domain metadata, and the Putting Usability First (PUF) approach was used to analyse the tool's requirements.

To demonstrate the applicability of our approach, Delk eForm Creator was used to create a set of metadata and three specific eForms based on a generic eForm. The created eForms were rendered in different web browsers and used to enter data into the associated databases. We observed that Delk eForm Creator successfully generated a generic eForm from the domain metadata, and that three different specific eForms were successfully generated from one generic eForm, yielding a reusable generic eForm.

We conclude that the metadata-based approach to implementing eForms, as proposed in this thesis, is a viable technique for creating eForms and their associated databases. The approach enables users to easily create, maintain, and reuse eForms and databases.
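The thesis drives this from RDFS metadata through the Delk eForm Creator tool; as a rough sketch of the underlying single-source idea (field names, types, and the table name below are invented for the demo), one metadata description can generate both the HTML form and the matching database schema:

```python
FIELDS = [  # toy domain metadata: one definition drives both form and schema
    {"name": "patient_name", "label": "Patient name", "type": "text",   "required": True},
    {"name": "visit_date",   "label": "Visit date",   "type": "date",   "required": True},
    {"name": "weight_kg",    "label": "Weight (kg)",  "type": "number", "required": False},
]

SQL_TYPES = {"text": "TEXT", "date": "TEXT", "number": "REAL"}

def render_eform(fields, action="/submit"):
    """Generate the eForm's HTML from the field metadata."""
    rows = []
    for f in fields:
        req = " required" if f["required"] else ""
        rows.append(f'  <label>{f["label"]}'
                    f' <input name="{f["name"]}" type="{f["type"]}"{req}></label>')
    return (f'<form method="post" action="{action}">\n' + "\n".join(rows) +
            '\n  <button type="submit">Save</button>\n</form>')

def render_ddl(fields, table="visits"):
    """Generate the associated table definition from the same metadata."""
    cols = ", ".join(
        f'{f["name"]} {SQL_TYPES[f["type"]]}' + (" NOT NULL" if f["required"] else "")
        for f in fields)
    return f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, {cols});"

print(render_eform(FIELDS))
print(render_ddl(FIELDS))
```

Because the form and the schema share one source of truth, a change to the metadata propagates to both, which is the reuse and maintainability property the thesis argues for.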
116. Design and Implementation of a Mapping Technique between XML Documents and Relational Databases
Lee, Chia-He (18 July 2001)
In recent years, many people use the World Wide Web to find the information they want. HTML is a document markup language for publishing hypertext on the WWW, and it has been the target format for content developers around the world. HTML tags, however, primarily describe how to display a data item, so content is mixed with display markup, which makes it difficult to extract useful information from HTML documents. XML, on the other hand, is a data format for inter-enterprise data exchange on the Internet. To facilitate such exchange, industry groups define public Document Type Definitions (DTDs) that specify the format of the XML documents exchanged between their applications. Moreover, Web/EDI and electronic commerce are very popular, and a lot of business data is exchanged on the World Wide Web as XML. XML tags describe the data itself, separating the content (meaning) of a document from its display format, which makes it easy to find and analyze meaningful information in XML documents.

When a large volume of business data exists as XML documents, we must transform those documents into relational databases; conversely, to exchange business data between applications, we must construct XML documents from the relational database. In this thesis, we design a mapping technique between XML documents and relational databases and present the implementation of the corresponding mapping tools. XML documents are fundamentally different from relational data: they are hierarchical, and elements may be nested and repeated (i.e., set-valued and recursive), so XML documents cannot be mapped to relational databases straightforwardly. Our mapping technique resolves these problems. We design and implement a mapping between XML documents and relational databases such that the mapping can be performed automatically for any kind of XML document and any commercial relational database. The tools are implemented in Visual Basic and SQL Server 2000. Our experience shows that the mapping technique is efficient and can be applied to any kind of relational database without extra requirements or changes to the databases.
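The thesis's tools were built in Visual Basic against SQL Server 2000; as a minimal sketch of the shredding idea in Python with SQLite (toy document, invented table names), nested repeating elements become a child table linked to the parent by a foreign key, and the XML can be rebuilt from the tables:

```python
import sqlite3
import xml.etree.ElementTree as ET

xml_doc = """
<orders>
  <order id="1"><customer>ann</customer>
    <item sku="A1" qty="2"/><item sku="B7" qty="1"/>
  </order>
  <order id="2"><customer>bo</customer>
    <item sku="A1" qty="5"/>
  </order>
</orders>"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders(order_id INTEGER PRIMARY KEY, customer TEXT)")
# Repeating <item> elements (set-valued) go to a child table with a foreign key.
conn.execute("""CREATE TABLE items(item_id INTEGER PRIMARY KEY AUTOINCREMENT,
                order_id INTEGER REFERENCES orders(order_id),
                sku TEXT, qty INTEGER)""")

for order in ET.fromstring(xml_doc).iter("order"):
    oid = int(order.get("id"))
    conn.execute("INSERT INTO orders VALUES (?, ?)",
                 (oid, order.findtext("customer")))
    for item in order.iter("item"):
        conn.execute("INSERT INTO items(order_id, sku, qty) VALUES (?, ?, ?)",
                     (oid, item.get("sku"), int(item.get("qty"))))

# Round trip: rebuild each order's XML fragment from the relational data.
for oid, cust in conn.execute("SELECT * FROM orders"):
    items = conn.execute("SELECT sku, qty FROM items WHERE order_id=?", (oid,))
    inner = "".join(f'<item sku="{s}" qty="{q}"/>' for s, q in items)
    print(f'<order id="{oid}"><customer>{cust}</customer>{inner}</order>')
```

Recursion (elements containing elements of the same type) needs an extra self-referencing parent column, which is the harder case the thesis's mapping technique has to handle.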
117. Fractal-based Image Database Retrieval
Tien, Fu-Ming (24 July 2001)
With the advent of multimedia computers, voice and images can be stored in databases, and retrieving the information a user wants is a hard problem: querying the large collections of digital images that people need is not a simple task. Traditional image database retrieval studies analyze a digital image by color, shape, and content, and build an index file from the analysis. But they cannot guarantee that similar index files lead to similar images, or that similar images produce similar index files.

In this thesis, we propose a new method that analyzes a digital image by its fractal code. Fractal coding is an effective method for compressing digital images: the image is partitioned into a set of non-overlapping range blocks, and a set of overlapping domain blocks is chosen from the same image. For each range block, we find one domain block and one iterated function such that the mapping from the domain block closely approximates the range block. Two similar images have similar iterated functions, and two similar iterated functions have similar attractors. For these two reasons, we use the iterated functions to create the index file; Chapter 3 proves that fractal codes can serve as a good index.

In Chapter 4, we implement the fractal-based image database. In this system, the fractal code creates the index file, and the Fisher discriminant function, color, complexity, and illumination decide the output order.
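As a toy version of the indexing idea (not the thesis's system; block sizes and the brightness map below are arbitrary choices), this sketch finds, for each range block, the best-matching downsampled domain block under an affine map s*D + o and keeps the mapping parameters as a feature vector, so comparing two images reduces to comparing vectors:

```python
import numpy as np

def fractal_features(img, rb=4):
    """For each rb x rb range block, find the best 2*rb domain block
    (downsampled to rb x rb) under s*D + o; keep (domain_index, s, o)."""
    h, w = img.shape
    db = 2 * rb
    domains = []  # downsample every domain block by 2x2 averaging
    for y in range(0, h - db + 1, db):
        for x in range(0, w - db + 1, db):
            d = img[y:y+db, x:x+db]
            domains.append(d.reshape(rb, 2, rb, 2).mean(axis=(1, 3)))
    feats = []
    for y in range(0, h, rb):
        for x in range(0, w, rb):
            r = img[y:y+rb, x:x+rb].ravel()
            best = None
            for i, d in enumerate(domains):
                A = np.vstack([d.ravel(), np.ones(rb * rb)]).T
                # Least-squares fit of brightness scale s and offset o.
                (s, o), *_ = np.linalg.lstsq(A, r, rcond=None)
                err = np.sum((A @ np.array([s, o]) - r) ** 2)
                if best is None or err < best[0]:
                    best = (err, i, s, o)
            feats.append(best[1:])
    return np.array(feats)

rng = np.random.default_rng(0)
img_a = rng.random((16, 16))
img_b = img_a + 0.05 * rng.random((16, 16))       # near-duplicate image
fa, fb = fractal_features(img_a), fractal_features(img_b)
print("index distance:", np.linalg.norm(fa - fb))  # typically small for similar images
```

The thesis's system additionally weights the comparison with the Fisher discriminant function, color, complexity, and illumination when ranking results; the plain vector distance here is only the core of the index.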
118. Effective and efficient analysis of spatio-temporal data
Zhang, Zhongnan (2008)
Thesis (Ph.D.), University of Texas at Dallas, 2008. Includes vita. Includes bibliographical references (leaves 106-114).
119. Data mining-driven approaches for process monitoring and diagnosis
Sukchotrat, Thuntee (2008)
Thesis (Ph.D.), University of Texas at Arlington, 2008.
120. Data warehouse schema design
Lechtenbörger, Jens (2001)
Thesis (doctoral), University of Münster (Westfalen), 2001. Includes bibliographical references (p. 207-216) and index.