An intelligent and distributed crawling algorithm using Map-Reduce

Year of publication: 1398 (Solar Hijri)
Document type: Conference paper
Language: English

The full text of this paper is available as a 6-page PDF.

National scientific document ID:

ETECH04_068

Indexing date: 27 Bahman 1398

Abstract:

In this paper, we present a Map-Reduce version of an effective crawler called FICA. The approach is based on logarithmic distance and has reasonable time complexity. We present an improved, distributed implementation of this crawler. Comprehensive test cases are designed and the results are analyzed. These experiments identify a major bottleneck in the distributed version of FICA, and an improved version is presented. We achieved a 3x speedup in total execution time compared with the naïve Map-Reduce implementation.
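
The abstract does not spell out how the logarithmic distance is computed or how it is distributed, so the following is only an illustrative sketch and not the authors' implementation. It assumes that following a link out of a page costs log10 of that page's out-degree, and that distances from a seed page are relaxed in repeated Map-Reduce rounds until nothing changes. The function names (`map_page`, `reduce_page`, `relax_round`, `logarithmic_distances`) and the in-process grouping step are hypothetical stand-ins for a real Map-Reduce framework.

```python
import math
from collections import defaultdict

INF = float("inf")

def map_page(page, dist, out_links):
    """Map step: re-emit the page's own distance, then offer every out-link
    a candidate distance of dist + log10(out-degree) (assumed cost model)."""
    yield page, dist
    if dist < INF and out_links:
        step = math.log10(len(out_links))
        for target in out_links:
            yield target, dist + step

def reduce_page(candidates):
    """Reduce step: keep the smallest candidate distance seen for a page."""
    return min(candidates)

def relax_round(graph, dist):
    """One Map-Reduce round over the link graph, simulated in-process.
    The defaultdict plays the role of the shuffle/group-by-key phase."""
    grouped = defaultdict(list)
    for page, out_links in graph.items():
        for key, value in map_page(page, dist[page], out_links):
            grouped[key].append(value)
    return {page: reduce_page(values) for page, values in grouped.items()}

def logarithmic_distances(graph, seed):
    """Repeat rounds until no distance changes.
    Assumes every page, including pages without out-links, is a key of graph."""
    dist = {page: INF for page in graph}
    dist[seed] = 0.0
    while True:
        new_dist = relax_round(graph, dist)
        if new_dist == dist:
            return dist
        dist = new_dist

# Hypothetical four-page link graph; "d" is unreachable from the seed.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}
distances = logarithmic_distances(graph, "a")
crawl_order = sorted(distances, key=distances.get)
```

Under these assumptions, pages with a smaller distance would be crawled first, and each round needs a full pass over the graph, which is the kind of repeated work a naïve Map-Reduce formulation tends to incur.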

Authors

Saeed Rahmani

Department of Computer Science and Engineering, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran

Esmaeil Nourani

Department of Computer Engineering, Azarbaijan Shahid Madani University, Tabriz, Iran

Farshad Khunjush

Department of Computer Science and Engineering, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran