Comparing classification algorithms of data mining in diagnosis of diabetes and assessing the effectiveness of k-fold cross validation in the accuracy of the constructed model

Publication year: 1395 (Solar Hijri)
Document type: Conference paper
Language: English

The full text of this paper is available as a 6-page PDF.

National scientific document ID:

ICCSE01_030

Indexing date: 14 Shahrivar 1396

Abstract:

One application of data mining is in medicine, where models are constructed for disease diagnosis. The more a model learns from previous data, the more accurately it performs. The essential issue is that the training and testing data for classification must be selected so that the model learns as efficiently as possible from previous data and achieves the highest accuracy in diagnosing the disease. In this study, the Pima diabetes dataset is used; models for predicting and diagnosing diabetes are developed based on the KNN, SVM, Naive Bayes, and Decision Tree classification methods, and the accuracy of each model is evaluated. The effect of k-fold cross validation on the accuracy of each model is also assessed. According to the findings, k-fold cross validation increases model accuracy, and no single classification technique always delivers the best performance and accuracy; the outcome depends on the nature and complexity of the dataset. The simulations are carried out with the RapidMiner tool.
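The k-fold procedure mentioned in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' RapidMiner workflow: it assumes a plain k-nearest-neighbours classifier written with the Python standard library, and it runs on hypothetical toy points rather than the Pima dataset. The function names (`k_fold_splits`, `knn_predict`, `cross_val_accuracy`) are invented for this sketch.

```python
import math
from collections import Counter


def k_fold_splits(n_samples, k):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def knn_predict(train_X, train_y, x, n_neighbors=3):
    """Classify x by majority vote among its nearest training points."""
    by_distance = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in by_distance[:n_neighbors])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k=5, n_neighbors=3):
    """Average the per-fold test accuracy of the KNN classifier over k folds."""
    fold_scores = []
    for train_idx, test_idx in k_fold_splits(len(X), k):
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], n_neighbors) == y[i]
            for i in test_idx
        )
        fold_scores.append(correct / len(test_idx))
    return sum(fold_scores) / len(fold_scores)


# Hypothetical toy data (NOT the Pima dataset): two well-separated clusters,
# interleaved so that every fold contains samples from both classes.
X = []
y = []
for i in range(10):
    X.append((0.1 * i, 0.1 * i))
    y.append(0)
    X.append((5.0 + 0.1 * i, 5.0 + 0.1 * i))
    y.append(1)

accuracy = cross_val_accuracy(X, y, k=5, n_neighbors=3)
print(f"mean 5-fold accuracy: {accuracy:.2f}")
```

Because each sample serves as test data in exactly one fold, the averaged score uses the whole dataset for evaluation while never testing a model on points it was trained on. Swapping `knn_predict` for another classifier would allow the same loop to compare methods, under the same assumptions.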

Authors

Nasim Nikbakhsh

Department of Computer, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran

GholamReza Dehghani

Department of Computer, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran

Dr. Farsad Zamani

Department of Computer, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran