DOI: 10.5176/978-981-08-5837_188
Authors: Yu Shyang Tan, Bu-Sung Lee, Hui Ping Chak, Jiaqi Tan and more
Abstract:
Hadoop, based on the popular MapReduce framework, is an open-source distributed computing framework that has been gaining popularity and usage. It aims to let programmers focus on building applications that process large amounts of data, without having to handle the other issues that arise when performing parallel computations. However, tuning the performance of Hadoop applications is not an easy task because of the level of abstraction at which the framework operates. In this paper, we present some of the challenges and issues to be considered in performance tuning when running applications on Hadoop.
