Machine Learning Algorithms Capable of Type 2 Diabetes Mellitus Early Diagnosis using Explored Important Features
Publication year: 1401 (Solar Hijri calendar)
Document type: Conference paper
Language: English
The full text of this paper has not been provided and is not available.
National scientific document ID:
IBIS11_066
Indexing date: 19 Azar 1402
Abstract:
Diabetes mellitus is a chronic metabolic disease that, according to the World Health Organization (WHO), is the cause of death of more than 1.6 million people. In this study, the Random Forest algorithm was used to determine the six most important of the twenty features in a public dataset of type 2 diabetes patients. Using the extracted features, six machine learning algorithms were developed: Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Decision Tree (DT), Extremely Randomized Trees (ERT), and XGBoost. The algorithms were trained and tested for diabetes diagnosis using 4-fold cross-validation and hold-out approaches (with 25% of the data excluded from training and reserved for testing). The accuracies of the LR, SVM, KNN, DT, ERT, and XGBoost algorithms were 92.31%, 90.77%, 96.15%, 95.38%, 95.92%, and 96.92%, respectively, with XGBoost outperforming the other algorithms. On the F1-score metric, the LR, SVM, KNN, DT, ERT, and XGBoost algorithms achieved 93.72%, 92.21%, 96.77%, 96.15%, 96.44%, and 97.44%, respectively, consistent with the lead XGBoost showed on the accuracy metric. In addition to the results obtained with 4-fold cross-validation, XGBoost also offered the best accuracy and F1-score under the hold-out approach: the accuracies of the LR, SVM, KNN, DT, ERT, and XGBoost algorithms were 92.31%, 93.08%, 95.38%, 95.38%, 94.62%, and 96.15%, and their F1-scores were 93.90%, 94.34%, 96.30%, 96.20%, 95.65%, and 96.86%, respectively. Using the most important features (age, gender, polyuria, polydipsia, sudden weight loss, and partial paresis), XGBoost was capable of diagnosing type 2 diabetes and outperformed the other algorithms evaluated in this study, as validated with both the 4-fold cross-validation and hold-out methods.
This algorithm can act as a supplementary tool for faster and earlier diagnosis of type 2 diabetes.
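The evaluation protocol described in the abstract (accuracy and F1-score, measured with a 25% hold-out split and with 4-fold cross-validation) can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the toy labels, and the sample count of 20 are hypothetical and chosen only for illustration, and no classifier or real dataset is involved. In practice a library such as scikit-learn would typically supply the splitting and scoring routines.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def holdout_split(n_samples, test_fraction=0.25, seed=0):
    """Shuffle sample indices and reserve a fraction of them for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)


def kfold_splits(n_samples, k=4, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]


# Hypothetical labels (1 = diabetic, 0 = non-diabetic), for illustration only.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]
print(f"accuracy = {accuracy(y_true, y_pred):.2%}")
print(f"F1-score = {f1_score(y_true, y_pred):.2%}")

# Hypothetical sample count of 20: 25% hold-out, then 4-fold cross-validation.
train_idx, test_idx = holdout_split(20, test_fraction=0.25)
print("hold-out sizes:", len(train_idx), len(test_idx))
for fold_no, (train, test) in enumerate(kfold_splits(20, k=4), start=1):
    print(f"fold {fold_no}: train={len(train)}, test={len(test)}")
```

With hold-out, each model is scored once on the reserved 25%. With 4-fold cross-validation, every sample is used for testing exactly once, and the per-fold scores are usually averaged. The abstract does not state how its cross-validation figures were aggregated.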
Keywords:
Authors
Samin Babaei Rikan
Urmia University
Ali Ghafari
Tehran University of Medical Sciences
Reza Ghafari
Urmia University of Medical Sciences
Amir Sorayaie Azar
Urmia University