Classification and Biomarker Genes Selection for Cancer Gene Expression Data Using Random Forest

Publication year: 1396 (Solar Hijri)
Document type: Journal article
Language: English

The full text of this article is available as a 9-page PDF.

National scientific document ID: JR_IJP-12-4_003

Indexing date: 1 Mehr 1398

Abstract:

Background & objective: Microarray and next-generation sequencing (NGS) data are important sources for finding useful molecular patterns. However, the large volume of gene expression data makes it harder to identify the biomarkers associated with cancer. Random forest (RF) can effectively analyze large-p, small-n problems, in which the number of variables (genes) far exceeds the number of samples. RF can therefore be used to select and rank genes for the diagnosis and effective treatment of cancer.

Methods: Microarray gene expression data for colon, leukemia, and prostate cancers were collected from public databases. Initial preprocessing was performed with the limma package, and the RF classification method was then applied to each data set separately in R. Finally, the genes selected for each cancer were compared with those reported in previous experimental studies, and their functions in molecular cancer processes were assessed.

Result: The RF method extracted very small sets of genes while retaining its predictive performance. For the colon cancer data set, the key genes DIEXF, GUCA2A, CA7, and IGHA1 were selected, with an accuracy of 87.39 and a precision of 85.45. For prostate cancer, SNCA, USP20, and SNRPA1 were selected, with an accuracy of 73.33 and a precision of 66.67. For the leukemia data set, the key genes were BAG4, ANKHD1-EIF4EBP3, PLXNC1, and PCDH9, with an accuracy of 100 and a precision of 95.24.

Conclusion: The results showed that most of the selected genes are involved in cancer-related processes and pathways, have been reported previously, and play an important role in the transition from normal to abnormal cells.
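The abstract reports accuracy and precision for each data set without defining them. As a reading aid, the standard definitions for a two-class problem can be sketched as follows. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the paper and do not reproduce its figures. The sketch also assumes one class (for example, tumor) is treated as the positive class, which the abstract does not state.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all samples that were classified correctly."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def precision(tp, fp):
    """Fraction of samples predicted positive that are truly positive."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        raise ValueError("no positive predictions")
    return tp / predicted_positive


# Hypothetical counts (assumption, not data from the paper):
# tp/fn = positive samples predicted positive/negative,
# tn/fp = negative samples predicted negative/positive.
tp, fp, tn, fn = 40, 5, 45, 10

# Reported as percentages, matching the scale used in the abstract.
print(f"accuracy  = {100 * accuracy(tp, fp, tn, fn):.2f}")
print(f"precision = {100 * precision(tp, fp):.2f}")
```

Accuracy counts every correct call, whereas precision looks only at the samples called positive, so the two can differ noticeably on small or imbalanced data sets.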

Authors

Malihe Ram

Dept. of Biostatistics, Public Health School, Mashhad University of Medical Sciences, Mashhad, Iran

Ali Najafi

Molecular Biology Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran

Mohammad Taghi Shakeri

Dept. of Biostatistics, Public Health School, Mashhad University of Medical Sciences, Mashhad, Iran
