1111 |
Web-based Performance Benchmarking Data Collection and Preliminary Analysis for Drinking Water and Wastewater Utility
Rathor, Ankur 12 January 2013
High-quality drinking water and wastewater systems are essential to public health, business, and quality of life in the United States. Although the current performance of these systems is moderate, their future performance is a concern. Improvements can be planned once the current performance of utilities has been evaluated and areas with scope for improvement have been identified. Benchmarking and performance evaluation are key components of continuous improvement in utility performance. Benchmarking helps utilities make policy and programmatic decisions that reduce operational expenses and increase productivity by understanding areas of underperformance, understanding customer needs, developing future plans, and setting goals. This study establishes a strong case for implementing benchmarking methodologies among utilities to evaluate and improve performance.
There are many initiatives on performance benchmarking of utilities, but some of them focus on only one or a few areas of performance, and some use subjective indicators. Additionally, consultants visit the utilities for performance evaluation. This research focuses on creating a web-based benchmarking platform for performance evaluation using holistic and quantitative indicators. Practical and robust methodologies are used, and the research presents current performance comparisons among utilities for the areas that affect overall utility performance. Web-based benchmarking consists of two major parts: data collection and result visualization. A major contribution of this study is the creation of an online performance benchmarking database. Over time, more data will be collected, giving utilities access to a better database for performance evaluation. Future work in this research will analyze the data and results for each participant and each set of indicators, find possible reasons for underperformance, and suggest solutions for improvement based on best practices. / Master of Science
|
1112 |
An update expert and response generator for a transportable natural language interface to database management systems
Bessasparis, Michael J. 01 November 2008
Fully transportable natural language interfaces to database management systems (DBMS) have been under study for some years, but until now, all have suffered from a lack of response ability and a lack of natural language update ability. The lack of response generation is relatively easy to overcome, but the second problem, lack of update ability, is more serious. Adding update capacity involves primarily three tasks. First, the system must be able to recognize and process update requests. Processing an update typically involves both altering the knowledge base to reflect the new state of the database and performing dynamic extensions to the lexicon. Second, the intermediate language used to communicate with the database manager must be extended to cover update information. Third, the post-processor must be extended to transform commands into DBMS update requests.
The system described here uses a flexible and unified knowledge base to recognize and process update requests. Through information stored in the knowledge base, the system can recognize and resolve certain classes of ambiguity. The update request is then converted into an unambiguous intermediate query language. This language is easily translated to the target database management language using simple syntactic methods. The response generator uses the intermediate query language, the knowledge base, and the results returned by the target DBMS to form a response for all database accesses. / Master of Science
|
1113 |
Development of a Web-Based System for Water Quality Data Management and Visualization
Yang, Wei 18 June 2010
With increasing urbanization and population growth, humankind faces multiple environmental challenges. Stresses on limited resources, especially water resources, are now greater than ever before. Watershed monitoring and management are important components of programs to abate water resource stresses. The increase in water quantity and quality monitoring has produced a need for better techniques to manage the vast amount of watershed monitoring data being collected. These data must be stored, error-checked, manipulated, retrieved, and shared with the watershed management community. Web-based data visualization and analysis technology has played a critical role in all aspects of watershed management. In recent years especially, computer-assisted data analysis has matured enormously. As it matures, web-based visualization and analysis technology is changing its role to become an integrated system that combines database applications and internet technology.
The main objective of this study is to develop a prototype system capable of data visualization and analysis. Microsoft SQL Server is used to build a comprehensive database, which includes all datasets collected by OWML. A Web-Based Data Visualization and Analysis System, which provides an integrated interface for permitted users to explore, analyze, and download data, has been developed. / Master of Science
|
1114 |
A relational database management systems approach to system design
Moolman, George Christiaan 10 July 2009
Systems are developed to fulfill certain requirements. Several system design configurations usually can fulfill the technical requirements, but at different equivalent life-cycle costs. The problem is how to manipulate and evaluate different system configurations so that the required system effectiveness can be achieved at a minimum equivalent cost. It is also important to have a good definition of all the major consequences of each design configuration. For each alternative configuration considered, it is useful to know the number of units to deploy, the inventory and other logistic requirements, as well as the sensitivity of the system to changes in input variable values.
An intelligent relational database management system is defined to solve the problem described. Table structures are defined to maintain the required data elements and algorithms are constructed to manipulate the data to provide the necessary information. The methodology is as follows: Customer requirements are analyzed in functional terms. Feasible design alternatives are considered and defined as system design configurations. The reliability characteristics of each system configuration are determined, initially from a system-level allocation, and later determined from test and evaluation data. A maintenance analysis is conducted to determine the inventory requirements (using reliability data) and the other logistic requirements for each design configuration. A vector of effectiveness measures can be developed for each customer, depending on objectives, constraints, and risks. These effectiveness measures, consisting of a combination of performance and cost measures, are used to aid in objectively deciding which alternative is preferred.
Relationships are defined between the user requirements, the reliability and maintainability of the system, the number of units deployed, the inventory level, and other logistic characteristics of the system. A heuristic procedure is developed to interactively manipulate these parameters to obtain a good solution to the problem with technical performance and cost measures as criteria. Although it is not guaranteed that the optimal solution will be found, a feasible solution close to the optimal will be found. Eventually the user will have, at any time, the ability to change the value of any parameter modeled. The impact on the total system will subsequently be made visible. / Master of Science
|
1115 |
A system for document analysis, translation, and automatic hypertext linking
Averboch, Guillermo Andres 21 July 2009
A digital library database is a heterogeneous collection of documents. Documents may become available in different formats (e.g., ASCII, SGML, typesetter languages) and they may have to be translated to a standard document representation scheme used by the digital library.
This work focuses on the design of a framework that can be used to convert text documents in any format to equivalent documents in different formats and, in particular, to SGML (Standard Generalized Markup Language). In addition, the framework must be able to extract information about the analyzed documents, store that information in a permanent database, and construct hypertext links between documents and the information contained in that database and between the documents themselves. For example, information about the author of a document could be extracted and stored in the database. A link can then be established between the document and the information about its author and from there to other documents by the same author. These tasks must be performed without any human intervention, even at the risk of making a small number of mistakes.
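The author-linking example can be sketched in a few lines of Python. This is only an illustration of the idea, not the framework described in the abstract: the `Author:` header format, the function names, and the sample documents are all assumptions made for the sketch.

```python
import re
from collections import defaultdict

# Hypothetical plain-text format: each document carries an "Author:" header line.
AUTHOR_PATTERN = re.compile(r"^Author:\s*(.+)$", re.MULTILINE)


def extract_author(text):
    """Return the author named in the header, or None if no header is found."""
    match = AUTHOR_PATTERN.search(text)
    return match.group(1).strip() if match else None


def build_author_links(documents):
    """Map each document to its author and to other documents by that author."""
    by_author = defaultdict(list)  # stands in for the permanent database
    for doc_id, text in documents.items():
        author = extract_author(text)
        if author is not None:
            by_author[author].append(doc_id)

    links = {}
    for doc_id, text in documents.items():
        author = extract_author(text)
        related = [d for d in by_author.get(author, []) if d != doc_id]
        links[doc_id] = {"author": author, "same_author": related}
    return links


# Made-up sample documents for illustration only.
documents = {
    "doc1": "Title: A\nAuthor: Jane Doe\nBody text",
    "doc2": "Title: B\nAuthor: Jane Doe\nBody text",
    "doc3": "Title: C\nAuthor: John Roe\nBody text",
}
links = build_author_links(documents)
```

Here `links["doc1"]` points to the author record and from there to `doc2`, mirroring the document-to-author-to-document chain described above. A missing header simply yields no links, which is one way to tolerate the small number of mistakes the abstract accepts.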
To accomplish these goals we developed a language called DELTO (Description Language for Textual Objects) that can be used to describe a document format. Given a description for a particular format, our system is able to extract information from documents in that format, to store part of that information in a permanent database, and to use that information in constructing an abstract representation of those documents that can be used to generate equivalent documents in different formats.
The system that originated from this work is used for constructing the database of Envision, a Virginia Tech digital library research project. / Master of Science
|
1116 |
DJ: Bridging Java and Deductive Databases
Hall, Andrew Brian 07 July 2008
Modern society is intrinsically dependent on the ability to manage data effectively. While relational databases have been the industry standard for the past quarter century, recent growth in data volumes and complexity requires novel data management solutions. These trends revitalized the interest in deductive databases and highlighted the need for column-oriented data storage. However, programming technologies for enterprise computing were designed for the relational data management model (i.e., row-oriented data storage). Therefore, developers cannot easily incorporate emerging data management solutions into enterprise systems.
To address the problem above, this thesis presents Deductive Java (DJ), a system that enables enterprise programmers to use a column-oriented deductive database in their Java applications. DJ does so without requiring that the programmer become proficient in deductive databases and their non-standardized, vendor-specific APIs. The design of DJ incorporates three novel features: (1) tailoring orthogonal persistence technology to the needs of a deductive database with column-oriented storage; (2) using Java interfaces as a primary mapping construct, thereby simplifying method call interception; (3) providing facilities to deploy light-weight business rules.
DJ was developed in partnership with LogicBlox Inc., an Atlanta-based technology startup. / Master of Science
|
1117 |
Efficient Spatio-Temporal Network Analytics in Epidemiological Studies using Distributed Databases
Khan, Mohammed Saquib Akmal 26 January 2015
Real-time Spatio-Temporal Analytics has become an integral part of Epidemiological studies. The size of the spatio-temporal data has been increasing tremendously over the years, gradually evolving into Big Data. Processing in such domains is highly data- and compute-intensive. High-performance computing resources are actively being used to handle such workloads over massive datasets. This confluence of high-performance computing and datasets with Big Data characteristics poses great challenges pertaining to data handling and processing. The resource management of supercomputers is in conflict with the data-intensive nature of spatio-temporal analytics. This is further exacerbated because data management is decoupled from the computing resources. Problems of this nature have provided great opportunities for the growth and development of tools and concepts centered around MapReduce-based solutions. However, we believe that advanced relational concepts can still be employed to provide an effective solution to handle these issues and challenges.
In this study, we explore distributed databases to efficiently handle spatio-temporal Big Data for epidemiological studies. We propose DiceX (Data Intensive Computational Epidemiology using supercomputers), which couples high-performance, Big Data and relational computing by embedding distributed data storage and processing engines within the supercomputer. It is characterized by scalable strategies for data ingestion, a unified framework to set up and configure various processing engines, and the ability to pause, materialize, and restore images of a data session. In addition, we have successfully configured DiceX to support approximation algorithms from MADlib Analytics Library [54], primarily Count-Min Sketch or CM Sketch [33][34][35].
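For readers unfamiliar with the Count-Min Sketch mentioned above, the following is a minimal, generic sketch of the data structure in Python. It is not MADlib's implementation and not DiceX code; the class name, the default `width` and `depth`, the hashing scheme, and the sample items are assumptions chosen for illustration.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter: estimates never undercount."""

    def __init__(self, width=1024, depth=4):
        self.width = width    # counters per row
        self.depth = depth    # number of independent hash rows
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Derive one hash function per row by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        """Increment one counter in every row."""
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        """Take the minimum across rows; collisions can only inflate a counter."""
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )


# Made-up stream of region identifiers for illustration only.
sketch = CountMinSketch()
for region in ["region-a", "region-b", "region-a", "region-a"]:
    sketch.add(region)
estimate_a = sketch.estimate("region-a")
```

The estimate for an item is always at least its true count and at most the total number of insertions. Widening the table reduces the expected overestimate, and adding rows reduces the chance of a large one, all in memory that does not grow with the number of distinct items.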
DiceX enables a new style of Big Data processing, which is centered around the use of clustered databases and exploits supercomputing resources. It can effectively exploit the cores, memory and compute nodes of supercomputers to scale processing of spatio-temporal queries on datasets of large volume. Thus, it provides a scalable and efficient tool for data management and processing of spatio-temporal data. Although DiceX has been designed for computational epidemiology, it can be easily extended to different data-intensive domains facing similar issues and challenges.
We thank our external collaborators and members of the Network Dynamics and Simulation Science Laboratory (NDSSL) for their suggestions and comments. This work has been partially supported by DTRA CNIMS Contract HDTRA1-11-D-0016-0001, DTRA Validation Grant HDTRA1-11-1-0016, NSF - Network Science and Engineering Grant CNS-1011769, NIH and NIGMS - Models of Infectious Disease Agent Study Grant 5U01GM070694-11.
Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government. / Master of Science
|
1118 |
Forest Change Dynamics Across Levels of Urbanization in the Eastern US
Wu, Yi-Jei 03 September 2014
The forests of the eastern United States reflect complex and highly dynamic patterns of change. This thesis seeks to explore the highly variable nature of these changes and to develop techniques that will enable researchers to examine their temporal and spatial patterns. The objectives of this research are to: 1) determine whether the forest change dynamics in the eastern US differ across levels of the urban hierarchy; 2) identify and explore key micropolitan areas that deviate from anticipated trends in forest change; and 3) develop and apply techniques for Big Data exploration of Landsat satellite images for forest cover analysis over large regions.
Results demonstrate that forest change at the micropolitan level of urbanization differs from rural and metropolitan forest dynamics. The work highlights the dynamic nature of forest change within the Piedmont Atlantic megaregion, largely attributed to the forestry industry. This is by far the most dominant change phenomenon in the region but is not necessarily indicative of permanent forest change. A longer temporal analysis may be required to separate the contribution of the forest industry from permanent forest conversion in the region.
Techniques utilized in this work suggest that emerging tools that provide supercomputing/parallel processing capabilities for the analysis of big satellite data open the door for researchers to better address different landscape signals and to investigate large regions at a high temporal and spatial resolution. The opportunity now exists to conduct initial assessments regarding spatio-temporal land cover trends in the southeast in a manner previously not possible. / Master of Science
|
1119 |
Development of Ground-Level Hyperspectral Image Datasets and Analysis Tools, and their use towards a Feature Selection based Sensor Design Method for Material Classification
Brown, Ryan Charles 31 August 2018
Visual sensing in robotics, especially in the context of autonomous vehicles, has advanced quickly, and many important contributions have been made in the area of target classification. Typical of these studies is the use of the Red-Green-Blue (RGB) camera. Separately, in the field of remote sensing, the hyperspectral camera has been used to perform classification tasks on natural and man-made objects, typically from aerial or satellite platforms. Hyperspectral data is characterized by a very fine spectral resolution, resulting in a significant increase in the ability to identify materials in the image. This hardware has not been studied in the context of autonomy as the sensors are large, expensive, and have non-trivial image capture times.
This work presents three novel contributions: a Labeled Hyperspectral Image Dataset (LHID) of ground-level, outdoor objects based on typical scenes that a vehicle or pedestrian may encounter, an open-source hyperspectral interface software package (HSImage), and a feature selection based sensor design algorithm for object detection sensors (DLSD). These three contributions are novel and useful in the fields of hyperspectral data analysis, visual sensor design, and hyperspectral machine learning. The hyperspectral dataset and hyperspectral interface software were used in the design and testing of the sensor design algorithm.
The LHID is shown to be useful for machine learning tasks through experimentation and provides a unique data source for hyperspectral machine learning. HSImage is shown to be useful for manipulating, labeling and interacting with hyperspectral data, and allows wavelength and classification based data retrieval, storage of labeling information and ambient light data. DLSD is shown to be useful for creating wavelength bands for a sensor design that increase the accuracy of classifiers trained on data from the LHID. DLSD shows accuracy near that of the full spectrum hyperspectral data, with a reduction in features on the order of 100 times. It compared favorably to other state-of-the-art wavelength feature selection techniques and exceeded the accuracy of an RGB sensor by 10%. / Ph. D.
To allow for better performance of autonomous vehicles in the complex road environment, identifying different objects in the roadway or near it is very important. Typically, cameras are used to identify objects and there has been much research into this task. However, the type of camera used is an RGB camera, the same used in consumer electronics, and it has a limited ability to identify colors. Instead, it only detects red, green, and blue and combines the results of these three measurements to simulate color. Hyperspectral cameras are specialized hardware that can detect individual colors, without having to simulate them. This study details an algorithm that will design a sensor for autonomous vehicle object identification that leverages the higher amount of information in a hyperspectral camera, but keeps the simpler hardware of the RGB camera.
This study presents three separate novel contributions: A database of hyperspectral images useful for tasks related to autonomous vehicles, a software tool that allows scientific study of hyperspectral images, and an algorithm that provides a sensor design that is useful for object identification.
Experiments using the database show that it is useful for research tasks related to autonomous vehicles. The software tool is shown to be useful for interfacing between image files, algorithms and external software, and the sensor design algorithm is shown to be comparable to other such algorithms in accuracy, while outperforming them in the size of the data required to complete the goal.
|
1120 |
Towards A Sufficient Set of Mutation Operators for Structured Query Language (SQL)
McCormick II, Donald W. 25 May 2010
Test suites for database applications depend on adequate test data and real-world test faults for success. An automated tool is available that quantifies test data coverage for database queries written in SQL. An automated tool is also available that mimics real-world faults by mutating SQL; however, tests have revealed that these simulated faults do not completely represent real-world faults. This paper demonstrates how half of the mutation operators used by the SQL mutation tool in real-world test suites generated significantly lower detection scores than those from research test suites. Three revised mutation operators are introduced that improve detection scores and contribute toward re-defining a sufficient set of mutation operators for SQL. Finally, a procedure is presented that reduces the test burden by automatically comparing SQL mutants with their original queries. / Master of Science
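The idea of mutating a query and comparing the mutant with the original can be illustrated with a small Python sketch using the standard `sqlite3` module. This is not the tool or the operator set from the thesis: the relational-operator replacement shown here is one commonly described kind of mutation, and the `orders` table, its rows, and the function names are assumptions made for the example.

```python
import sqlite3


def mutate_relational_operator(query, old=" > ", new=" >= "):
    """Return a mutant with the first occurrence of one comparison replaced."""
    return query.replace(old, new, 1)


def is_killed(conn, original, mutant):
    """A mutant is 'killed' when the test data makes its result differ."""
    original_rows = sorted(conn.execute(original).fetchall())
    mutant_rows = sorted(conn.execute(mutant).fetchall())
    return original_rows != mutant_rows


# Hypothetical test database; the row with total = 100 sits on the boundary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 50), (2, 100), (3, 150)]
)

original = "SELECT id FROM orders WHERE total > 100"
mutant = mutate_relational_operator(original)
killed = is_killed(conn, original, mutant)

# Without the boundary row, the same mutant survives: the test data is too weak.
conn.execute("DELETE FROM orders WHERE total = 100")
killed_without_boundary = is_killed(conn, original, mutant)
```

The second check shows why detection scores depend on the test data: the mutant is only detected when a row lies exactly on the boundary that the mutated operator changes.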
|