CIVILICA We Respect the Science
(Specialized publisher of national conference proceedings / Publishing license number from the Ministry of Culture and Islamic Guidance: 8971)

Paper title: Improving Text Mining with Featured Word Selection
National paper ID: IRANWEB04_020
Published in: 4th International Conference on Web Research, 1397 (Solar Hijri calendar)
Authors:

M.Amin Abolghasemi - Master Student of Artificial Intelligence, Amirkabir University of Technology, Tehran, Iran
Saeedeh Momtazi - Assistant Professor of Artificial Intelligence, Amirkabir University of Technology, Tehran, Iran

Abstract:
Text mining is one of the main tasks in web research. It aims at classifying or clustering the texts available on the web for applications such as news analysis and social network analysis. Because a very large amount of textual data is available on the web, reducing the dimensionality of the data with feature extraction techniques plays an important role in improving the efficiency and effectiveness of text mining algorithms. Various techniques proposed for machine learning tasks can also be applied in the text mining domain. In this paper we study the available techniques and compare their impact on Persian text classification performance. Our experimental results on the Hamshahri corpus show that using an appropriate feature selection technique can improve the classification F-measure from 88.12% to 93.07%.
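
The abstract does not name the feature selection techniques that were compared, so the following is only a minimal sketch of one widely used option, chi-square term scoring, and not the authors' method. It assumes documents are already tokenized; the function names `chi_square_scores` and `select_features` and the toy data are hypothetical and chosen for illustration.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Score each term by its highest chi-square statistic over all classes.

    docs   -- list of token lists (one list per document)
    labels -- list of class labels, aligned with docs
    """
    n = len(docs)
    class_counts = Counter(labels)   # number of documents per class
    term_counts = Counter()          # number of documents containing each term
    joint_counts = Counter()         # documents of a class that contain a term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):     # count presence, not frequency
            term_counts[term] += 1
            joint_counts[(term, label)] += 1

    scores = {}
    for term, n_term in term_counts.items():
        best = 0.0
        for label, n_class in class_counts.items():
            a = joint_counts[(term, label)]  # term present, in class
            b = n_term - a                   # term present, other classes
            c = n_class - a                  # term absent, in class
            d = n - n_term - c               # term absent, other classes
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:                        # zero when a margin is empty
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores


def select_features(docs, labels, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi_square_scores(docs, labels)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:k])


# Toy data (hypothetical, not drawn from the Hamshahri corpus).
toy_docs = [
    ["goal", "match", "the"],
    ["team", "goal", "the"],
    ["market", "stock", "the"],
    ["stock", "price", "the"],
]
toy_labels = ["sport", "sport", "economy", "economy"]

if __name__ == "__main__":
    print(sorted(select_features(toy_docs, toy_labels, k=2)))
```

In this toy example the word "the" occurs in every document and therefore scores zero, while "goal" and "stock" each occur in exactly one class and rank highest. A classifier trained only on the selected terms works with a much smaller vocabulary, which is the kind of dimensionality reduction the abstract describes.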

Keywords:
Web Mining, Text Mining, Text Classification, Feature Selection

Paper page and full-text download: https://civilica.com/doc/773318/