Authors: Byungsang Kim, Youngjoo Han and Chan-Hyun Youn
As the global grid has grown in size, large-scale distributed data analysis schemes have gained momentum. Over the last few years, a number of methods have been introduced for allocating data-intensive tasks across distributed, heterogeneous computing platforms. However, these approaches have limited potential for scaling up computing nodes to serve more tasks simultaneously. This paper tackles the scalability problem in a user-centric task distribution overlay. We propose a distributed data cache framework for storing and allocating tasks. We also provide task procurement algorithms, based on the steady states of the distributed caches, that maximize the distribution efficiency between the data source and the computing nodes. Experimental results show that our system achieves both scalability and distribution efficiency.