611 |
Efficient Concurrent Operations in Spatial Databases / Dai, Jing 16 November 2009 (has links)
Driven by applications such as GIS, CAD, ecological analysis, and space research, efficient spatial data access methods have attracted much research attention. In particular, moving object management and continuous spatial queries have become prominent topics in the spatial database area. However, most existing spatial query processing approaches were designed for single-user environments, and may not ensure correctness and data consistency in multiple-user environments. This research focuses on designing efficient concurrent operations on spatial datasets.
Current multidimensional data access methods fall into two categories: 1) pure multidimensional indexing structures, such as the R-tree family and the grid file; 2) linear spatial access methods, represented by the Space-Filling Curve (SFC) combined with B-trees. Concurrency control protocols have been designed for some pure multidimensional indexing structures, but none of them is suitable for variants of R-trees with object clipping, which are efficient in searching. On the other hand, no concurrency control protocol has been designed for linear spatial indexing structures, to which one-dimensional concurrency control protocols cannot be directly applied. Furthermore, the recently designed query processing approaches for moving objects are not protected by any efficient concurrency control protocol.
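The linear spatial access methods described above reduce multidimensional data to one dimension so that ordinary B-trees can index it. As an illustrative sketch only (the thesis does not prescribe a specific curve here), a Z-order (Morton) encoding looks like this:

```python
def morton_encode(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y to produce a Z-order (Morton) key.

    Nearby points in 2-D tend to receive nearby keys, which lets a
    one-dimensional structure such as a B-tree index spatial data.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return key

# A sorted list of Morton keys stands in for the B-tree leaf level here.
points = [(0, 0), (1, 1), (2, 0), (5, 7), (6, 6)]
index = sorted(morton_encode(x, y) for x, y in points)
```

A range query over such an index scans the key intervals that cover the query rectangle, which is where the concurrency control questions studied in this thesis arise.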
In this research, efficient concurrent access frameworks are provided for both types of spatial indexing structures, as well as for continuous query processing on moving objects, in multiple-user environments. These frameworks satisfy concurrency control requirements while providing outstanding performance for concurrent queries. The major contributions of this research are: (1) a new efficient spatial indexing approach with an object clipping technique, the ZR+-tree, which outperforms the R-tree and R+-tree in searching; (2) a concurrency control protocol, GLIP, that provides high throughput and phantom update protection on spatial indexing with object clipping; (3) efficient concurrent operations for indices based on linear spatial access methods, which together form the CLAM protocol; (4) efficient concurrent continuous query processing on moving objects for both R-tree-based and linear spatial indexing frameworks; (5) a generic access framework, Disposable Index, for optimal location update and parallel search. / Ph. D.
|
612 |
Flexible environments in dynamic lexical analysis systems / Denman, Matthew G. January 1984 (has links)
In this thesis, a system for studying human/computer interfaces is introduced. The human/computer interface provides several features, the most notable of which is TOKEN COMPLETION. These features permit the user to define and/or redefine command tokens, define and/or delete synonym and noiseword tokens, and establish a terminal environment. The terminal environment includes the ability to specify automatic comment blocking and token look-ahead, and to control the source of data input (keyboard, VMS file, or I/O buffer).
The ability to complete tokens is based on a forest of generalized trees used to implement dynamic deterministic finite state automata (DDFA). These trees are built during IPL and loaded with command, synonym, and noiseword tokens, all of which are stored in separate VMS files. Synonym and noiseword translation is carried out in the lexical analysis process, thereby removing any need to specify these functions in the grammar of the language. Insertion into and deletion from the forest may be executed at any time, permitting the dynamic definition and deletion of synonyms and noisewords. During synonym and/or noiseword definition, lexical analysis switches to a deterministic finite state automaton (DFA) mode of operation. Upon completion, lexical analysis reverts to DDFA mode.
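The token-completion idea, prefix lookup over dynamically insertable token trees, can be sketched in miniature. This standalone Python sketch illustrates only prefix completion, not the thesis's DDFA machinery or its VMS integration:

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # char -> TrieNode
        self.is_token = False

class TokenTrie:
    """Minimal prefix trie supporting dynamic insertion and completion."""
    def __init__(self):
        self.root = TrieNode()

    def insert(self, token: str) -> None:
        node = self.root
        for ch in token:
            node = node.children.setdefault(ch, TrieNode())
        node.is_token = True

    def complete(self, prefix: str):
        """Return all stored tokens beginning with `prefix`."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        results = []
        def walk(n, acc):
            if n.is_token:
                results.append(acc)
            for ch, child in n.children.items():
                walk(child, acc + ch)
        walk(node, prefix)
        return results

trie = TokenTrie()
for cmd in ("print", "printall", "purge"):
    trie.insert(cmd)
```

Because insertion is a plain tree operation, tokens can be added or removed at any time, which is the property the thesis exploits for dynamic synonym and noiseword definition.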
A sample grammar is provided in the APPENDICES. The lexical analysis process is not tied to this grammar; rather, it is very general and will process any tokens stored in the command, synonym, and noiseword files. The sample grammar is LALR(1). / Master of Science
|
613 |
An analysis of IMS through simulation / Chrissis, James W. January 1977 (has links)
This thesis presents the development and validation of a simulation model for an on-line IMS/VS configuration. A conceptual queueing model is developed from an actual system configuration, and the simulation model parallels this conceptualization. The validation of the simulation model is carried out in three phases, utilizing operational data from IMS logs and system monitors. The model is then used to evaluate system performance under a variety of operating configurations and system loads. The objective of the simulation analysis is a configuration that improves IMS response. / Master of Science
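The abstract does not reproduce the conceptual queueing model itself; as a generic illustration of the kind of response-time reasoning such a model supports, a textbook M/M/1 estimate is:

```python
def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean response time W = 1 / (mu - lambda) for a stable M/M/1 queue.

    A generic queueing illustration only; the thesis's actual conceptual
    model of the IMS configuration is not specified in the abstract.
    """
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate >= service rate")
    return 1.0 / (service_rate - arrival_rate)

# Example: 8 transactions/sec arriving against a 10/sec service capacity
w = mm1_response_time(8.0, 10.0)  # 0.5 seconds mean response time
```

Varying the arrival and service rates in such a model is analogous to the abstract's evaluation of system performance under different loads and configurations.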
|
614 |
Modeling and Computation of Complex Interventions in Large-scale Epidemiological Simulations using SQL and Distributed Database / Kaw, Rushi 30 August 2014 (has links)
Scalability is an important problem in epidemiological applications that simulate complex intervention scenarios over large datasets. Indemics is one such interactive, data-intensive framework for High-Performance Computing (HPC) based large-scale epidemic simulations. In the Indemics framework, interventions are supplied from an external, standalone database, which proved to be an effective way of implementing them. Although this setup performs well for simple interventions and small datasets, the performance and scalability of complex interventions over large datasets remain an issue. In this thesis, we present IndemicsXC, a scalable and massively parallel high-performance data engine for Indemics in a supercomputing environment. IndemicsXC can implement complex interventions over large datasets. Our distributed database solution retains the simplicity of Indemics by using the same SQL query interface for expressing interventions. We show that our solution handles the most complex interventions by intelligently offloading them to the supercomputer nodes and processing them in parallel. We present an extensive performance evaluation of our database engine through intervention case studies over synthetic population datasets. The evaluation of our parallel and distributed database framework illustrates its scalability over a standalone database. Our results show that the distributed data engine is a parallel, scalable, and cost-efficient means of implementing interventions. The cost model proposed in this thesis can approximate intervention query execution time with reasonable accuracy. Our distributed database framework could support fast, accurate, and sensible decisions by public health officials during an outbreak. Finally, we discuss the considerations for using distributed databases to drive large-scale simulations. / Master of Science
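As a hedged illustration of expressing an intervention through a SQL interface, a targeted intervention might be phrased as below. The actual Indemics/IndemicsXC schema is not given in the abstract, so the tables and columns here are invented:

```python
import sqlite3

# Hypothetical schema: a synthetic population and an infection log.
# The real Indemics/IndemicsXC schema is not described in the abstract.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (pid INTEGER PRIMARY KEY, age INTEGER, household INTEGER);
    CREATE TABLE infected (pid INTEGER, day INTEGER);
    INSERT INTO person VALUES (1, 34, 10), (2, 7, 10), (3, 61, 11);
    INSERT INTO infected VALUES (1, 5);
""")

# An intervention expressed as SQL: select all household members of anyone
# infected by day 5 (a common targeted-intervention pattern, e.g. vaccination).
targets = conn.execute("""
    SELECT DISTINCT p2.pid
    FROM infected i
    JOIN person p1 ON p1.pid = i.pid
    JOIN person p2 ON p2.household = p1.household
    WHERE i.day <= 5
""").fetchall()
# targets contains pids 1 and 2: the infected person and their household member
```

The appeal of this interface, as the abstract notes, is that the same declarative query can be executed against a standalone database or offloaded to parallel database nodes without changing how the intervention is expressed.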
|
615 |
Runway Exit Speed Estimation Models / Bollempalli, Mani Bhargava Reddy 11 September 2018 (links)
Increasing air traffic in the U.S. has led to runway capacity limitations at airports. Increasing the capacity of existing runways involves reducing the runway occupancy time of an aircraft landing on a runway, and the location of runway exits plays an important role in minimizing that time. Finding an optimal exit location has become more complex with the rapid increase in the number of aircraft types. To address this, the Air Transportation and Systems Laboratory at Virginia Tech developed the Runway Exit Interactive Design Model (abbreviated as REDIM). This model finds the optimal exit location considering multiple aircraft types and a variety of environmental conditions.
To find the optimal exit location, REDIM simulates landing aircraft behavior. The kinematic model in REDIM simulates aircraft landing behavior using a pseudo-nonlinear deceleration heuristic algorithm, dividing a landing into five phases: 1) a flare phase; 2) a free roll period between aircraft touchdown and brake initiation; 3) the braking phase; 4) a second free roll phase starting after the braking phase and ending before the turnoff maneuver; and 5) a turnoff maneuver phase. The major contributors to runway occupancy time (ROT) are the braking phase (60% of ROT) and the turnoff phase (25% of ROT).
Calculating the turnoff time requires a few input variables, such as the deceleration rate along the turnoff and the speed at which an aircraft takes an exit (the exit speed at the point of curvature). The deceleration rate along the turnoff is specific to each aircraft.
This study involves predicting the exit speed at the point of curvature based on the type of exit taken. It begins with collecting the exit geometry parameters of 37 airports in the U.S.; these parameters define the type of exit. ASDE-X data provide the observed exit speeds at the point of curvature for these exits. This study examines several models with observed exit speeds as the response variable and exit geometry parameters as the predictors. / MS
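The braking and turnoff phases described above lend themselves to standard constant-deceleration kinematics; the sketch below uses illustrative numbers that are not taken from the thesis:

```python
def braking_distance(v_touchdown: float, v_exit: float, decel: float) -> float:
    """Distance to slow from v_touchdown to v_exit at constant deceleration,
    from the kinematic relation v_exit^2 = v_touchdown^2 - 2*a*d."""
    return (v_touchdown**2 - v_exit**2) / (2.0 * decel)

def braking_time(v_touchdown: float, v_exit: float, decel: float) -> float:
    """Duration of the same constant-deceleration braking phase."""
    return (v_touchdown - v_exit) / decel

# Illustrative values only: touch down at 70 m/s, take the exit at 25 m/s,
# brake at a constant 1.5 m/s^2 (not figures from the thesis).
dist = braking_distance(70.0, 25.0, 1.5)   # 1425.0 m
time = braking_time(70.0, 25.0, 1.5)       # 30.0 s
```

The exit speed at the point of curvature, the quantity this study predicts from exit geometry, is the `v_exit` term in this relation, which is why it drives both braking distance and runway occupancy time.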
|
616 |
A RNA Virus Reference Database (RVRD) to Enhance Virus Detection in Metagenomic Data / Lei, Shaohua 16 October 2018 (has links)
With the great promise that metagenomics holds for exploring virome composition and discovering novel virus species, there is a pressing demand for comprehensive and up-to-date reference databases to enhance downstream bioinformatics analysis. In this study, an RNA virus reference database (RVRD) was developed by manual and computational curation of RNA virus genomes downloaded from three major virus sequence databases: NCBI, ViralZone, and ViPR. To reduce redundancy caused by multiple identical or nearly identical sequences, sequences were first clustered, and within each cluster of sequences sharing more than 98% identity, all but one representative were removed. Other identity cutoffs were also examined, and Hepatitis C virus genomes were studied in detail as an example. Using the 98% identity cutoff, sequences obtained from ViPR were combined with the unique RNA virus references from NCBI and ViralZone to generate the final RVRD. The resulting RVRD contained 23,085 sequences, nearly 5 times the size of the NCBI RNA virus reference set, and had broad coverage of RNA virus families, with significant expansion of circular ssRNA virus and pathogenic virus families. Compared to the NCBI RNA virus reference set in a performance evaluation, using RVRD as the reference database identified more RNA virus species in RNA-seq data derived from wastewater samples. Moreover, using RVRD as the reference database also led to the discovery of porcine rotavirus as the etiology of unexplained diarrhea observed in pigs. RVRD is publicly available for enhancing RNA virus metagenomics. / Master of Science / Next-generation sequencing technology has demonstrated its capability for the detection of viruses in various samples, but one challenge in bioinformatics analysis is the lack of well-curated reference databases, especially for RNA viruses.
In this study, an RNA virus reference database (RVRD) was developed by manual and computational curation from three commonly used resources: NCBI, ViralZone, and ViPR. While RVRD was designed to be comprehensive, with broad coverage of RNA virus families, clustering was performed to reduce redundant sequences. The performance of RVRD was compared with the NCBI RNA virus reference database using FastViromeExplorer, a pipeline recently developed by our lab; more RNA viruses were identified in several metagenomic datasets using RVRD, indicating improved performance in practice.
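The 98% identity dereplication step can be sketched as greedy clustering. Real pipelines typically use alignment-based tools such as CD-HIT, so the naive difflib identity measure below is purely illustrative:

```python
from difflib import SequenceMatcher

def identity(a: str, b: str) -> float:
    """Crude pairwise identity via difflib's similarity ratio.
    Real curation pipelines use alignment-based clustering tools;
    this stand-in is only for illustration."""
    return SequenceMatcher(None, a, b).ratio()

def dereplicate(seqs, cutoff=0.98):
    """Greedy clustering: keep a sequence only if it falls below the
    identity cutoff against every representative kept so far."""
    representatives = []
    for s in seqs:
        if all(identity(s, r) < cutoff for r in representatives):
            representatives.append(s)
    return representatives

genomes = [
    "ACGT" * 15,            # 60-nt reference sequence
    "ACGT" * 14 + "ACGA",   # one mismatch: ~98.3% identical, clustered out
    "TGCA" * 15,            # unrelated sequence, kept
]
reps = dereplicate(genomes)  # two representatives remain
```

Lowering the cutoff collapses more near-duplicates into one representative, which is the trade-off the study explores when comparing identity thresholds on Hepatitis C virus genomes.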
|
617 |
Exploiting Update Leakage in Searchable Symmetric Encryption / Haltiwanger, Jacob Sayid 15 March 2024 (has links)
Dynamic Searchable Symmetric Encryption (DSSE) provides efficient techniques for securely searching and updating an encrypted database. However, efficient DSSE schemes leak some sensitive information to the server. Recent works have implemented forward and backward privacy as security properties to reduce the amount of information leaked during update operations. Many attacks have shown that leakage from search operations can be abused to compromise the privacy of client queries. However, the attack literature has not rigorously investigated techniques to abuse update leakage.
In this work, we investigate update leakage under DSSE schemes with forward and backward privacy from the perspective of a passive adversary. We propose two attacks based on a maximum likelihood estimation approach, the UFID Attack and the UF Attack, which target forward-private DSSE schemes with no backward privacy and Level 2 backward privacy, respectively. These are the first attacks to show that it is possible to leverage the frequency and contents of updates to recover client queries. We propose a variant of each attack which allows the update leakage to be combined with search pattern leakage to achieve higher accuracy. We evaluate our attacks against a real-world dataset and show that using update leakage can improve the accuracy of attacks against DSSE schemes, especially those without backward privacy. / Master of Science / Remote data storage is a ubiquitous application made possible by the prevalence of cloud computing. Dynamic Symmetric Searchable Encryption (DSSE) is a privacy-preserving technique that allows a client to search and update a remote encrypted database while greatly restricting the information the server can learn about the client's data and queries. However, all efficient DSSE schemes have some information leakage that can allow an adversarial server to infringe upon the privacy of clients. Many prior works have studied the dangers of leakage caused by the search operation, but have neglected the leakage from update operations. As such, researchers have been unsure about whether update leakage poses a threat to user privacy.
To address this research gap, we propose two new attacks which exploit leakage from DSSE update operations. Our attacks are aimed at learning what keywords a client is searching and updating, even in DSSE schemes with forward and backward privacy, two security properties implemented by the strongest DSSE schemes. Our UFID Attack compromises forward-private schemes while our UF Attack targets schemes with both forward privacy and Level 2 backward privacy. We evaluate our attacks on a real-world dataset and show that they efficiently compromise client query privacy under realistic conditions.
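The frequency-matching intuition behind a maximum likelihood attack can be caricatured as follows. This toy version only matches update counts against public keyword statistics; the UFID and UF Attacks in the thesis rely on richer leakage and a proper MLE formulation:

```python
from itertools import permutations

def best_assignment(observed_counts, keyword_counts):
    """Toy frequency-matching attack: assign each opaque update token to
    the keyword whose expected update count it most plausibly matches,
    by brute-force minimization of squared error over all assignments.

    A caricature of the maximum-likelihood idea only; not the thesis's
    actual UFID/UF attack algorithms.
    """
    tokens = list(observed_counts)
    keywords = list(keyword_counts)
    best, best_cost = None, float("inf")
    for perm in permutations(keywords):
        cost = sum((observed_counts[t] - keyword_counts[k]) ** 2
                   for t, k in zip(tokens, perm))
        if cost < best_cost:
            best, best_cost = dict(zip(tokens, perm)), cost
    return best

# Server-observed update counts per opaque token vs. public keyword
# frequencies (all numbers invented for illustration).
observed = {"tok_a": 98, "tok_b": 41, "tok_c": 7}
known = {"flu": 100, "cold": 40, "rash": 5}
guess = best_assignment(observed, known)
```

Even this crude matching recovers the intended mapping when update frequencies are distinctive, which is why update leakage is dangerous despite forward and backward privacy guarantees.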
|
618 |
Domain knowledge specification using fact schema / Parthasarathy, S. 21 April 2010 (has links)
The advantages of integrating artificial intelligence (AI) technology with database management system (DBMS) technology are widely recognized, as indicated by the results of a survey of AI and database (DB) researchers. ...In our work, we have focused on the use of database systems to store a large number of facts and rules for a rule-based AI system. / Master of Science
|
619 |
Using ontologies to semantify a Web information portal / Chimamiwa, Gibson 01 1900 (has links)
Ontology, an explicit specification of a shared conceptualisation, captures knowledge about a specific domain of interest. The realisation of ontologies revolutionised the way data stored in relational databases is accessed and manipulated, through ontology and database integration.
When integrating ontologies with relational databases, several choices exist regarding aspects such as database implementation, ontology language features, and mappings. However, it is unclear which aspects are relevant and when they affect specific choices. This imposes difficulties in deciding which choices to make and their implications on ontology and database integration solutions.
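One concrete mapping choice from the space described above is the direct-mapping pattern, in which rows become subjects and columns become predicates; the table and vocabulary names in this sketch are invented:

```python
def rows_to_triples(table: str, rows, key: str, base: str = "http://example.org/"):
    """Map relational rows to RDF-style (subject, predicate, object) triples:
    each row becomes a subject URI and each non-key column a predicate.

    A minimal sketch of one common ontology/database mapping pattern
    (direct mapping); the table and vocabulary names are hypothetical.
    """
    triples = []
    for row in rows:
        subject = f"{base}{table}/{row[key]}"
        for column, value in row.items():
            if column != key:
                triples.append((subject, f"{base}vocab/{column}", value))
    return triples

rows = [{"id": 1, "name": "Ontology", "topic": "semantics"}]
triples = rows_to_triples("document", rows, key="id")
```

Richer mappings, aligning columns to an existing ontology's classes and properties rather than minting a vocabulary from column names, are exactly the kind of design choice the decision-making tool in this study is meant to guide.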
Within this study, a decision-making tool is developed that guides users when selecting a technology and developing a solution that integrates ontologies with relational databases. A theory analysis is conducted to determine the current status of technologies that integrate ontologies with databases. Furthermore, a theoretical study is conducted to determine important features affecting ontology and database integration, ontology language features, and the choices that one needs to make given each technology. Based on the building blocks stated above, an artifact-building approach is used to develop the decision-making tool, and the tool is verified through a proof-of-concept to prove its usefulness.
Key terms: Ontology, semantics, relational database, ontology and database integration, mapping, Web information portal. / Information Science / M. Sc. (Information Systems)
|
620 |
Managerial data management applications utilising periodic data outputs from multiple legacy systems : a case within DaimlerChrysler AG / Theron, Frederik J 12 1900 (has links)
Thesis (MBA)--Stellenbosch University, 2006. / ENGLISH ABSTRACT: In project environments where periodical, standardised data exports from large relational databases serve as the source data for further repetitive manipulation, relational principles can be applied to automate or facilitate this process. The resulting data model is only valid in environments where the recipient of these data exports has no influence on the data content or structure, and where it can be relied upon that these standardised exports will not change significantly over time. This paper discusses the development of a data application within the Development department of DaimlerChrysler AG that utilises standardised data objects as data sources, along with various aspects of the Relational and Entity models that enabled additional user-generated data to be related to the data structure. It further provides a brief introduction to Agile development strategies and iterative problem-solving techniques as they pertain to database development. A working build of the application containing all the source code, along with a representative data set, is supplied on a CD. / AFRIKAANSE OPSOMMING: In project environments where standard, periodic data sets serve as the source for further repetitive data manipulation, relational data models can be used to automate the process. The resulting working data model is only valid if the client can exercise no control over the data structure or content form used as the source. It must also be reasonably assumed that the standardised data structure will not change materially over time. The study investigates the development of a data application within the development department of DaimlerChrysler AG, as well as various principles of the relational and entity models as applied to the developed application. Standardised data sets serve as a periodic data source for this application and are linked through relational principles to user-generated data. A working copy of the application, together with a representative data set and all original source code, is supplied on a CD.
|