Feature selection using feature dissimilarity measure and density-based clustering: Application to biological data.
Date
2015-10
Abstract
Reduction of dimensionality has emerged as a routine process in modelling complex biological systems. A large number
of feature selection techniques have been reported in the literature to improve model performance in terms of accuracy and
speed. In the present article, an unsupervised feature selection technique is proposed, using the maximum information
compression index as the dissimilarity measure and the well-known density-based cluster identification technique
DBSCAN for identifying the largest natural group of dissimilar features. The algorithm is fast and less sensitive to
user-supplied parameters. Moreover, the method automatically determines the required number of features and identifies
them. We used the proposed method to reduce the dimensionality of a number of benchmark data sets of varying sizes. Its
performance was also extensively compared with that of other well-known feature selection methods.
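
The abstract does not define the maximum information compression index. As a minimal sketch, the code below assumes the commonly cited definition, the smallest eigenvalue of the 2x2 covariance matrix of a pair of features, which is consistent with the "eigenvalue" keyword listed for this record. The function names (`mici`, `mici_matrix`) are hypothetical and are not taken from the paper; the sketch covers only the dissimilarity computation, not the authors' full selection procedure.

```python
import math


def variance(values):
    """Population variance of a sequence of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def covariance(x, y):
    """Population covariance of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / len(x)


def mici(x, y):
    """Assumed definition: smallest eigenvalue of the covariance matrix of (x, y).

    For the symmetric matrix [[var_x, cov], [cov, var_y]] the eigenvalues are
    (trace -/+ sqrt(trace^2 - 4 * det)) / 2; the smaller one is returned.
    It is zero when the two features are linearly dependent.
    """
    var_x, var_y, cov_xy = variance(x), variance(y), covariance(x, y)
    trace = var_x + var_y
    det = var_x * var_y - cov_xy ** 2
    # Guard against tiny negative values caused by floating-point rounding.
    disc = max(trace * trace - 4.0 * det, 0.0)
    return 0.5 * (trace - math.sqrt(disc))


def mici_matrix(features):
    """Pairwise dissimilarity matrix; `features` is a list of feature columns."""
    n = len(features)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = mici(features[i], features[j])
    return dist
```

A matrix of this kind could then be passed to a density-based clusterer that accepts precomputed distances, for example scikit-learn's `DBSCAN` with `metric="precomputed"`. How the paper selects the final group of features from the clustering result is not stated in the abstract, so that step is left out here.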
Keywords
Clustering, dissimilarity, eigenvalue, feature selection
Citation
Sengupta Debarka, Aich Indranil, Bandyopadhyay Sanghamitra. Feature selection using feature dissimilarity measure and density-based clustering: Application to biological data. Journal of Biosciences. 2015 Oct; 40(4): 721-730.