Opportunities for near data computing in MapReduce workloads

In-memory big data applications are growing in popularity, including in-memory versions of the MapReduce framework. The move away from disk-based datasets shifts the performance bottleneck from slow disk accesses to memory bandwidth. MapReduce is a data-parallel application, and is therefore amenable to being executed on as many parallel processors as possible, with each processor requiring high amounts of memory bandwidth. We propose using Near Data Computing (NDC) as a means to develop systems that are optimized for in-memory MapReduce workloads, offering high compute parallelism and even higher memory bandwidth. This dissertation explores three different implementations and styles of NDC to improve MapReduce execution. First, we use 3D-stacked memory+logic devices to process the Map phase on compute elements in close proximity to database splits. Second, we attempt to replicate the performance characteristics of the 3D-stacked NDC using only commodity memory and inexpensive processors to improve performance of both Map and Reduce phases. Finally, we incorporate fixed-function hardware accelerators to improve sorting performance within the Map phase. This dissertation shows that it is possible to improve in-memory MapReduce performance by potentially two orders of magnitude by designing system and memory architectures that are specifically tailored to that end.

http://pqdtopen.proquest.com/#viewpdf?dispub=3704952

Engineering, Computer|Computer Science

Identifier	oai:union.ndltd.org:PROQUEST/oai:pqdtoai.proquest.com:3704952
Date	25 June 2015
Creators	Pugsley, Seth Hintze
Publisher	The University of Utah
Source Sets	ProQuest.com
Language	English
Detected Language	English
Type	thesis
