Global ETD Search

1	Adaptive query processing in pipelined plans Eurviriyanukul, Kwanchai January 2008 This thesis presents an approach that enables changes to partially evaluated pipelined Query-Execution Plans (QEPs) at run-time, to recover from optimisation mistakes in cardinality estimation caused by the absence of accurate statistics. These mistakes may lead an optimiser to select inefficient QEPs. 005.743
2	Two-level text classification using hybrid machine learning techniques Tripathi, Nandita January 2012 Nowadays, documents are increasingly being associated with multi-level category hierarchies rather than a flat category scheme. To access these documents in real time, we need fast automatic methods to navigate these hierarchies. Today’s vast data repositories such as the web also contain many broad domains of data which are quite distinct from each other, e.g. medicine, education, sports and politics. Each domain constitutes a subspace of the data within which the documents are similar to each other but quite distinct from the documents in another subspace. The data within these domains is frequently further divided into many subcategories. Subspace Learning is a technique popular with non-text domains such as image recognition to increase speed and accuracy. Subspace analysis lends itself naturally to the idea of hybrid classifiers. Each subspace can be processed by a classifier best suited to the characteristics of that particular subspace. Instead of using the complete set of full space feature dimensions, classifier performance can be boosted by using only a subset of the dimensions. This thesis presents a novel hybrid parallel architecture using separate classifiers trained on separate subspaces to improve two-level text classification. The classifier to be used on a particular input and the relevant feature subset to be extracted are determined dynamically by using a novel method based on the maximum significance value. A novel vector representation which enhances the distinction between classes within the subspace is also developed. This novel system, the Hybrid Parallel Classifier, was compared against the baselines of several single classifiers such as the Multilayer Perceptron and was found to be faster and to achieve higher two-level classification accuracy. The improvement in performance achieved was even higher when dealing with more complex category hierarchies. 005.743 Information Systems
3	Model-driven data migration Aboulsamh, Mohammed A. January 2012 Information systems often hold data of considerable value. Their continuing development or maintenance will often necessitate evolution of the system and migration of the data from one version to the next: a process that may be expensive, time-consuming, and prone to error. That such a process remains a source of challenges is recognized by both academia and industry. In current practice, data migration is often considered only in the later stages of development, leaving critical data to be transformed and loaded by hand-written scripts, long after the design process has been completed. The advent of model-driven engineering offers an opportunity to consider the question of information system evolution and data migration earlier in the development process. A precise account of the proposed changes to an existing system model can be used to predict the consequences for existing data, and to generate the necessary data migration implementation. This dissertation shows how automatic data migration can be achieved by extending the definition of a data modeling language to include model-level operations, each of which corresponds to the addition, modification, or deletion of a model component. Using the Unified Modeling Language (UML) notation as an example, we show how the specification of these operations may be translated into an abstract program in the Abstract Machine Notation (AMN), employed in the B-method, and then formally checked for consistency and applicability prior to translation into a concrete programming notation, such as Structured Query Language (SQL). 005.743

