FEATURE SELECTION ON HIGH DIMENSIONAL DATASETS

Authors

  • Dharmender Kumar, Sunita Beniwal, Poonam (Guru Jambeshwar University of Science and Technology, Hisar, Haryana)

Abstract

High-dimensional data can contain billions of entries, typically with a small number of instances and a large number of attributes. Such data pose several problems, including the sheer number of dimensions, overfitting, class imbalance, and outliers, all of which degrade the performance of learning algorithms. Feature selection addresses these problems by reducing dimensionality: it removes irrelevant attributes. Three main families of techniques are used for feature selection: filter, wrapper, and embedded methods. This paper gives an overview of feature selection techniques for high-dimensional data and describes related work from the literature.
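As an illustration of the filter family named in the abstract, the following is a minimal sketch, not the method of this paper. It assumes that each attribute is scored independently by its absolute Pearson correlation with the class label and that the k highest-scoring attributes are kept; the scoring measure, the function names `pearson` and `filter_select`, and the toy data are all assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant attribute carries no information about the label.
        return 0.0
    return cov / sqrt(var_x * var_y)


def filter_select(rows, labels, k):
    """Return the sorted indices of the k attributes most correlated with the labels."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties broken by the lower attribute index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return sorted(j for _, j in scores[:k])


# Toy data (hypothetical): 4 instances, 3 attributes.
# Attribute 0 tracks the label, attribute 1 is constant, attribute 2 is noisy.
rows = [
    [1.0, 5.0, 0.3],
    [2.0, 5.0, 0.9],
    [3.0, 5.0, 0.1],
    [4.0, 5.0, 0.7],
]
labels = [0, 0, 1, 1]

selected = filter_select(rows, labels, k=1)
print(selected)
```

Because a filter scores attributes without training a classifier, it scales to many attributes cheaply. A wrapper method would instead evaluate candidate attribute subsets by training a model on each, and an embedded method would perform the selection as part of model training.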

Published

2017-07-31
