Global ETD Search

Hybrid database: Dynamic selection of database infrastructure to improve query performance

Distributed file systems have enabled the storage and parsing of arbitrarily large datasets, with capacity that scales linearly with hardware resources. However, the latency they impose on small queries against large datasets becomes untenable in a production environment. By storing data on both a distributed file system and a traditional relational database, this product will deliver low-latency data service to users while maintaining a complete archive. The software stack will use the Apache Hadoop Distributed File System (HDFS) for distributed storage and Apache Hive for queries against it. A MySQL backend will provide the traditional database service, and a J2EE web application will serve as the user interface. The system will evaluate each query to decide which data service can return the requested data with the lowest latency.
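
The abstract does not state which criteria are used to evaluate a query, so the following is only a minimal sketch of how such a routing decision might look. It assumes, purely for illustration, that MySQL holds only recent rows while HDFS holds the complete archive, and that a query can be characterized by the date range it touches. The names QueryRouter, Backend, hotDataCutoff, and route are hypothetical and do not come from the thesis.

```java
import java.time.LocalDate;

// Hypothetical sketch: the class, enum, and field names are illustrative
// assumptions, not taken from the thesis.
final class QueryRouter {

    /** The two data services named in the abstract. */
    enum Backend { MYSQL, HIVE }

    // Assumption: MySQL holds only rows dated on or after this cutoff,
    // while the distributed file system holds the complete archive.
    private final LocalDate hotDataCutoff;

    QueryRouter(LocalDate hotDataCutoff) {
        this.hotDataCutoff = hotDataCutoff;
    }

    /**
     * Chooses a backend from the date range a query touches.
     * A range entirely on or after the cutoff can be answered by MySQL;
     * a range reaching further back must go to Hive over HDFS.
     */
    Backend route(LocalDate from, LocalDate to) {
        if (from.isAfter(to)) {
            throw new IllegalArgumentException("from must not be after to");
        }
        return from.isBefore(hotDataCutoff) ? Backend.HIVE : Backend.MYSQL;
    }
}
```

Under these assumptions, a router built with a cutoff of 1 January 2016 would send a query for June 2016 to MySQL and a query spanning 2010 to 2016 to Hive. A real evaluator would likely weigh other factors as well, such as estimated result size or which tables are involved.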

http://pqdtopen.proquest.com/#viewpdf?dispub=10195971

Computer science

Identifier	oai:union.ndltd.org:PROQUEST/oai:pqdtoai.proquest.com:10195971
Date	23 December 2016
Creators	Williams, Michael
Publisher	California State University, Long Beach
Source Sets	ProQuest.com
Language	English
Detected Language	English
Type	thesis
