Towards a big data analytics platform with Hadoop/MapReduce framework using simulated patient data of a hospital system

Background: Big data analytics (BDA) is important for reducing healthcare costs, but implementing it poses many challenges. The objective of this study was to establish a high-performance, interactive BDA platform for a hospital system.

Methods: A Hadoop/MapReduce framework formed the BDA platform, with HBase (a NoSQL database) built from hospital-specific metadata and file ingestion. Query performance was tested with Apache tools in the Hadoop ecosystem.

Results: At the optimized iteration, Hadoop distributed file system (HDFS) ingestion required three seconds, but HBase required four to twelve hours to complete the Reducer phase of MapReduce. HBase bulk loads took a week for one billion (10TB) and over two months for three billion (30TB). Both simple and complex queries returned results in about two seconds at one billion and at three billion.

Interpretations: The BDA platform, with HBase distributed by Hadoop, ran successfully with high performance at large volumes representing the Province’s entire data. Inconsistencies in MapReduce limited operational efficiency. The importance of Hadoop/MapReduce for the representation of health informatics is further discussed.

Graduate / 0566 / 0769 / 0984 / dillon.chrimes@viha.ca

Identifier: oai:union.ndltd.org:uvic.ca/oai:dspace.library.uvic.ca:1828/7645
Date: 28 November 2016
Creators: Chrimes, Dillon
Contributors: Kuo, Alex (Mu - Hsing)
Source Sets: University of Victoria
Language: English
Detected Language: English
Type: Thesis
Rights: Available to the World Wide Web
