Genetic-based Feature Selection for Spam Detection

Publication year: 1392 (Solar Hijri)
Document type: Conference paper
Language: English
Views: 1,118

The full text of this paper is available as a 6-page PDF.

National scientific document ID:

ICEE21_072

Indexing date: 27 Mordad 1392

Abstract:

In recent years, email has become a pervasive and economical means of communication, but spam has reduced its usefulness. To address this challenge, email filtering emerged and developed as a special kind of text classification. A main problem in text classification tasks, and one that is more serious in email filtering, is the large number of features. To address it, various feature selection methods have been considered; they extract a lower-dimensional feature space from the original one and pass it as input to the classifier. In this regard, we examined the effectiveness of two existing individual methods and propose a new combined method. The individual methods are Information Gain (IG) and the χ2 statistic (CHI); the combined method applies a genetic algorithm (GA) to the top features selected by IG. We used a Perceptron neural network as the classifier. To evaluate the system, experiments were conducted on the PU data set. The results showed that the individual methods are very effective at reducing the dimensionality of the input space while increasing classifier performance, and that the combined method further improves performance while reducing dimensionality even further.
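As an illustration of the two individual scoring methods named in the abstract, the sketch below computes Information Gain and the χ2 statistic for binary term-presence features and keeps the top-ranked terms. It is not the authors' implementation: the function names, the two-class (spam/legitimate) setup, and the four toy messages are assumptions made for this example, and the formulas are the standard textbook definitions for a 2×2 contingency table.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def contingency(docs, labels, term):
    """Count messages by (term present?, is spam?).

    Returns (a, b, c, d): present+spam, present+legit, absent+spam, absent+legit.
    """
    a = b = c = d = 0
    for words, is_spam in zip(docs, labels):
        present = term in words
        if present and is_spam:
            a += 1
        elif present:
            b += 1
        elif is_spam:
            c += 1
        else:
            d += 1
    return a, b, c, d


def information_gain(a, b, c, d):
    """IG = H(class) - H(class | term presence)."""
    n = a + b + c + d
    h_class = entropy([(a + c) / n, (b + d) / n])
    h_cond = 0.0
    for spam, legit in ((a, b), (c, d)):
        total = spam + legit
        if total:
            h_cond += total / n * entropy([spam / total, legit / total])
    return h_class - h_cond


def chi_square(a, b, c, d):
    """Chi-square statistic of a 2x2 term/class contingency table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def top_k(docs, labels, score, k):
    """Rank the vocabulary by a scoring function and keep the k best terms."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(
        vocab,
        key=lambda t: score(*contingency(docs, labels, t)),
        reverse=True,
    )
    return ranked[:k]


# Toy corpus (hypothetical): each message is a set of words; True marks spam.
docs = [
    {"free", "money"},
    {"free", "offer"},
    {"meeting", "notes"},
    {"project", "notes"},
]
labels = [True, True, False, False]

selected = top_k(docs, labels, information_gain, k=2)
print(selected)
```

In the combined method described in the abstract, a list such as `selected` would be the starting pool from which a genetic algorithm searches for a smaller subset; that search step is not shown here.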

Authors

Seyyed Hossein Seyyedi Arani

Islamic Azad University, Qazvin Branch, Qazvin, Iran

Saeed Mozaffari

Electrical and Computer Engineering Department, Semnan University