There is currently considerable enthusiasm around the MapReduce paradigm and the distributed computing paradigm for the analysis of large volumes of data. Apache Hadoop is the most popular open source implementation of the MapReduce model, and LINQ to HPC is Microsoft's alternative to open source Hadoop. In this thesis, the performance of LINQ to HPC and Hadoop is compared using different benchmarks.
To this end, we identified four benchmarks (Grep, Word Count, Read and Write) and ran them on both LINQ to HPC and Hadoop. For each benchmark, we measured each system’s performance metrics (execution time, average CPU utilization and average memory utilization) for various degrees of parallelism on clusters of different sizes. The results revealed some interesting trade-offs. For example, LINQ to HPC performed better on three of the four benchmarks (Grep, Read and Write), whereas Hadoop performed better on the Word Count benchmark. While extensive research has focused on Hadoop, there are few references to similar research on the LINQ to HPC platform, which was still slowly evolving during the writing of this thesis.
Identifier | oai:union.ndltd.org:unf.edu/oai:digitalcommons.unf.edu:etd-1381 |
Date | 01 January 2013 |
Creators | Sivasubramaniam, Ravishankar |
Publisher | UNF Digital Commons |
Source Sets | University of North Florida |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | UNF Theses and Dissertations |