Feature selection techniques in bioinformatics

Publication year: 1396 (Solar Hijri)
Document type: Conference paper
Language: English

The full text of this paper is available as a 14-page PDF file.

National scientific document ID:

KAUCEE01_196

Indexing date: 29 Mehr 1396 (Solar Hijri)

Abstract:

Machine learning methods are often used to classify objects described by hundreds of attributes; however, as the dimensionality of the data rises, the amount of data required for a reliable analysis grows exponentially. A popular approach to this problem of high-dimensional datasets is to search for a projection of the data onto a smaller number of variables (or features) that preserves as much of the information as possible. Feature selection is an important step in data mining and is used in various fields, including genetics, medicine, and bioinformatics. In many bioinformatics problems the number of features is significantly larger than the number of samples (datasets with a high feature-to-sample ratio), and feature selection techniques have become a clear necessity in many bioinformatics applications. This article makes the reader aware of the possibilities of feature selection, providing a basic taxonomy of feature selection techniques and discussing their uses in bioinformatics applications, including sequence analysis, microarray analysis, the discovery of statistically equivalent feature subsets with the R package MXM, classification of pre-miRNAs, and mass spectra analysis.
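As a rough illustration of what selecting a smaller number of features can look like, the sketch below ranks features with a simple filter criterion (the absolute value of Welch's t-statistic between two classes) and keeps the top k. It is only a minimal example under stated assumptions: the abstract does not name this particular criterion, and the function names welch_t and select_top_k, as well as the toy data, are hypothetical and not taken from the paper.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two samples (each needs at least 2 values)."""
    mean_diff = statistics.mean(a) - statistics.mean(b)
    denom = math.sqrt(statistics.variance(a) / len(a) +
                      statistics.variance(b) / len(b))
    if denom == 0.0:
        # Both groups are constant: no separation if the means match,
        # perfect separation otherwise.
        return 0.0 if mean_diff == 0.0 else math.copysign(math.inf, mean_diff)
    return mean_diff / denom


def select_top_k(X, y, k):
    """Return the indices of the k features with the largest |t|.

    X is a list of samples (rows), y holds binary labels 0/1.
    This is a univariate filter: each feature is scored on its own.
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        class0 = [row[j] for row, label in zip(X, y) if label == 0]
        class1 = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(welch_t(class0, class1)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Hypothetical toy data: only feature 0 differs between the two classes.
X = [
    [1.0, 5.1, 0.2, 3.0],
    [1.2, 4.9, 0.1, 2.9],
    [0.9, 5.0, 0.3, 3.1],
    [3.0, 5.2, 0.2, 3.0],
    [3.1, 4.8, 0.1, 2.8],
    [2.9, 5.0, 0.3, 3.2],
]
y = [0, 0, 0, 1, 1, 1]

selected = select_top_k(X, y, k=1)
print(selected)
```

A univariate filter like this scores each feature independently, so it is cheap even when features far outnumber samples, but it cannot detect features that are informative only in combination; wrapper and embedded methods trade extra computation for that ability.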

Authors

Mohamad Reza Hosseini

Department of Computer Engineering, Shahid Ashrafi Esfahani University, Isfahan, Iran

Naser Nematbakhsh

Department of Computer Engineering, Shahid Ashrafi Esfahani University, Isfahan, Iran

Motahareh Nadimi

Department of Biology, Faculty of Science, University of Isfahan, Isfahan, Iran