Adaptive Dynamic Data Placement Algorithm for Hadoop in Heterogeneous Environments

سال انتشار: 1395
نوع سند: مقاله ژورنالی
زبان: انگلیسی
مشاهده: 241

فایل این مقاله در 14 صفحه با فرمت PDF قابل دریافت می باشد

استخراج به نرم افزارهای پژوهشی:

لینک ثابت به این مقاله:

شناسه ملی سند علمی:

JR_JACET-2-4_003

تاریخ نمایه سازی: 18 تیر 1398

چکیده مقاله:

Hadoop MapReduce framework is an important distributed processing model for large-scale data intensive applications. The current Hadoop and the existing Hadoop distributed file system’s rack-aware data placement strategy in MapReduce in the homogeneous Hadoop cluster assume that each node in a cluster has the same computing capacity and a same workload is assigned to each node. Default Hadoop doesn’t consider load state of each node in distribution input data blocks, which may cause inappropriate overhead and reduce Hadoop performance, but in practice, such data placement policy can noticeably reduce MapReduce performance and may increase extra energy dissipation in heterogeneous environments. This paper proposes a resource aware adaptive dynamic data placement algorithm (ADDP) .With ADDP algorithm, we can resolve the unbalanced node workload problem based on node load status. The proposed method can dynamically adapt and balance data stored on each node based on node load status in a heterogeneous Hadoop cluster. Experimental results show that data transfer overhead decreases in comparison with DDP and traditional Hadoop algorithms. Moreover, the proposed method can decrease the execution time and improve the system’s throughput by increasing resource utilization

نویسندگان

Avishan Sharafi

Department of Computer Engineering, Islamic Azad University South Tehran Branch

Ali Rezaee

Department of Computer Engineering, Islamic Azad University, Science and Research Branch,Tehran, Iran.