MPA-SVM: An Effective Feature Selection Approach for High-Dimensional Datasets
DOI:
https://doi.org/10.32792/jeps.v15i2.526

Keywords:
Marine Predators Algorithm, Feature selection, Support Vector Machine, Data mining, High-Dimensional Datasets

Abstract
Selecting a subset of informative features is a crucial phase of the data mining process. The goal of feature selection is to find the smallest set of high-quality features that maximizes the learning algorithm's performance. This problem becomes harder as the number of features in a dataset grows, so modern optimization techniques are employed to search for the best feature combinations. The Marine Predators Algorithm (MPA) is a recent metaheuristic that has proven effective on many optimization problems, and the support vector machine (SVM) is a well-established method for classification. In this work, the MPA is adapted to the feature selection problem in high-dimensional datasets, using an SVM as the classifier; the resulting method is called MPA-SVM. Its effectiveness was validated on ten high-dimensional datasets from the Arizona State University (ASU) repository, and the results were compared against six state-of-the-art feature selection algorithms: Atom Search Optimization (ASO), the Equilibrium Optimizer (EO), the Emperor Penguin Optimizer (EPO), Monarch Butterfly Optimization (MBO), the Satin Bowerbird Optimizer (SBO), and the Sine Cosine Algorithm (SCA). The results confirm that MPA-SVM outperformed these metaheuristics and showed a strong ability to select the most relevant features: across all datasets, MPA-SVM achieved the lowest average error rates, the smallest classification standard deviation (STD) values, and the lowest feature selection (FS) rates.
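The abstract describes a wrapper approach: a metaheuristic proposes feature subsets, and an SVM scores each subset. The sketch below illustrates the kind of fitness function such wrappers typically minimize — a weighted sum of the SVM's cross-validated error rate and the fraction of selected features. This is an illustrative sketch of the general technique, not the paper's exact implementation; the weight `alpha = 0.99`, the RBF kernel, and 5-fold cross-validation are assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def fs_fitness(mask, X, y, alpha=0.99):
    """Wrapper fitness for a candidate feature subset (lower is better).

    mask  : binary vector, 1 = feature selected
    alpha : trade-off between classification error and subset size
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return 1.0  # an empty subset is invalid: worst possible fitness

    # SVM accuracy on the selected columns, estimated by cross-validation
    acc = cross_val_score(SVC(kernel="rbf"), X[:, mask], y, cv=5).mean()
    error = 1.0 - acc

    # penalize large subsets so the search prefers compact feature sets
    ratio = mask.sum() / mask.size
    return alpha * error + (1.0 - alpha) * ratio
```

A metaheuristic such as the MPA would evaluate this function for each candidate solution in its population and keep the masks with the lowest fitness; the large `alpha` makes accuracy dominate, with subset size as a tie-breaker.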
References
Agyeman, M., Guerrero, A., et al. (2022). A review of classification techniques for arrhythmia patterns using convolutional neural networks and Internet of Things (IoT) devices. IEEE Access. Retrieved October 13, 2022, from https://ieeexplore.ieee.org/abstract/document/9832886/
Al-Betar, M. A., Awadallah, M. A., Heidari, A. A., Chen, H., Al-khraisat, H., & Li, C. (2021). Survival exploration strategies for Harris Hawks Optimizer. Expert Systems with Applications, 168(December 2019), 114243. https://doi.org/10.1016/j.eswa.2020.114243
Al-Qaness, M. A. A., Ewees, A. A., Fan, H., Abualigah, L., & Elaziz, M. A. (2020). Marine predators algorithm for forecasting confirmed cases of COVID-19 in Italy, USA, Iran and Korea. International Journal of Environmental Research and Public Health, 17(10). https://doi.org/10.3390/ijerph17103520
Battiti, R. (1994). Using Mutual Information for Selecting Features in Supervised Neural Net Learning. IEEE Transactions on Neural Networks, 5(4), 537–550. https://doi.org/10.1109/72.298224
Beheshti, Z. (2022). BMPA-TVSinV: A Binary Marine Predators Algorithm using time-varying sinus and V-shaped transfer functions for wrapper-based feature selection. Knowledge-Based Systems, 252. https://doi.org/10.1016/j.knosys.2022.109446
Bradley, P. S., & Mangasarian, O. L. (1998). Feature Selection via Concave Minimization and Support Vector Machines. Proceedings of the Fifteenth International Conference on Machine Learning (ICML ’98), 6, 82–90.
Datasets | Feature Selection @ ASU. (n.d.). Retrieved August 1, 2024, from https://jundongl.github.io/scikit-feature/OLD/datasets_old.html
De Stefano, C., Fontanella, F., & Scotto di Freca, A. (2016). A novel GA-based feature selection approach for high dimensional data. GECCO 2016 Companion - Proceedings of the 2016 Genetic and Evolutionary Computation Conference, 87–88. https://doi.org/10.1145/2908961.2909049
Dhiman, G., & Kumar, V. (2018). Emperor penguin optimizer: A bio-inspired algorithm for engineering problems. Knowledge-Based Systems, 159, 20–50. https://doi.org/10.1016/j.knosys.2018.06.001
Dorigo, M., & Stützle, T. (2009). Ant colony optimization: overview and recent advances. Technical report, IRIDIA, Université Libre de Bruxelles, May.
Drucker, H., Wu, D., & Vapnik, V. N. (1999). Support vector machines for spam categorization. IEEE Transactions on Neural Networks, 10(5), 1048–1054. https://doi.org/10.1109/72.788645
Dupin, N., & Talbi, E. G. (2020). Machine learning-guided dual heuristics and new lower bounds for the refueling and maintenance planning problem of nuclear power plants. Algorithms, 13(8). https://doi.org/10.3390/A13080185
Einstein, A. (n.d.). Investigations on the Theory of the Brownian Movement (R. Fürth, Ed.; A. D. Cowper, Trans.).
Elminaam, D. S. A., Nabil, A., Ibraheem, S. A., & Houssein, E. H. (2021). An Efficient Marine Predators Algorithm for Feature Selection. IEEE Access, 9, 60136–60153. https://doi.org/10.1109/ACCESS.2021.3073261
Emary, E., & Zawbaa, H. M. (2019). Feature selection via Lèvy Antlion optimization. Pattern Analysis and Applications, 22(3), 857–876. https://doi.org/10.1007/s10044-018-0695-2
Faramarzi, A., Heidarinejad, M., Mirjalili, S., & Gandomi, A. H. (2020). Marine Predators Algorithm: A nature-inspired metaheuristic. Expert Systems with Applications, 152. https://doi.org/10.1016/j.eswa.2020.113377
Faramarzi, A., Heidarinejad, M., Stephens, B., & Mirjalili, S. (2020). Equilibrium optimizer: A novel optimization algorithm. Knowledge-Based Systems, 191, 105190. https://doi.org/10.1016/j.knosys.2019.105190
Feature Selection using Salp Swarm Algorithm for Real Biomedical Datasets. (2017). IJCSNS International Journal of Computer Science and Network Security.
Gheyas, I. A., & Smith, L. S. (2010). Feature subset selection in large dimensionality domains. Pattern Recognition, 43(1), 5–13. https://doi.org/10.1016/j.patcog.2009.06.009
Patro, S. G. K., & Sahu, K. K. (2015). Normalization: A Preprocessing Stage. arXiv preprint arXiv:1503.06462.
Hermes, L., & Buhmann, J. M. (2000). Feature Selection for Support Vector Machines. Proc. {IEEE} Intl. Conf. Pattern Recognition ({ICPR’00}), 2, 716–719. https://doi.org/10.1109/ICPR.2000.906174
Hussain, K., Neggaz, N., Zhu, W., & Houssein, E. H. (2021). An efficient hybrid sine-cosine Harris hawks optimization for low and high-dimensional feature selection. Expert Systems with Applications, 176. https://doi.org/10.1016/j.eswa.2021.114778
Ibrahim, H. T., Mazher, W. J., Ucan, O. N., & Bayat, O. (2018). A grasshopper optimizer approach for feature selection and optimizing SVM parameters utilizing real biomedical data sets. Neural Computing and Applications. https://doi.org/10.1007/s00521-018-3414-4
Izmailov, A. F. (2010). Solution sensitivity for Karush-Kuhn-Tucker systems with non-unique Lagrange multipliers. Optimization, 59(5), 747–775. https://doi.org/10.1080/02331930802434922
Jha, K., & Saha, S. (2021). Incorporation of multimodal multiobjective optimization in designing a filter based feature selection technique. Applied Soft Computing, 98. https://doi.org/10.1016/j.asoc.2020.106823
Jia, L., Gong, W., & Wu, H. (n.d.). An Improved Self-adaptive Control Parameter of Differential Evolution for Global Optimization.
Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. Neural Networks, 1995. Proceedings., IEEE International Conference On, 4, 1942–1948 vol.4. https://doi.org/10.1109/ICNN.1995.488968
Kohavi, R., & John, G. H. (1997). Wrappers for feature subset selection. Artificial Intelligence, 97(1–2), 273–324. https://doi.org/10.1016/S0004-3702(97)00043-X
Kumar, L., & Bharti, K. K. (2021). A novel hybrid BPSO–SCA approach for feature selection. Natural Computing, 20(1), 39–61. https://doi.org/10.1007/s11047-019-09769-z
Luo, J., Zhou, D., Jiang, L., & Ma, H. (2022). A particle swarm optimization based multiobjective memetic algorithm for high-dimensional feature selection. Memetic Computing, 14(1), 77–93. https://doi.org/10.1007/s12293-022-00354-z
Mafarja, M. M., & Mirjalili, S. (2017). Hybrid Whale Optimization Algorithm with simulated annealing for feature selection. Neurocomputing, 260, 302–312. https://doi.org/10.1016/j.neucom.2017.04.053
Mammone, A., Turchi, M., & Cristianini, N. (2009). Support vector machines. Engineering, December, 1–39. https://doi.org/10.1002/wics.049
Mantegna, R. N. (1994). Fast, accurate algorithm for numerical simulation of Lévy stable stochastic processes. Physical Review E, 49(5), 4677–4683. https://doi.org/10.1103/PhysRevE.49.4677
Mirjalili, S. (2016). SCA: A Sine Cosine Algorithm for solving optimization problems. Knowledge-Based Systems, 96, 120–133. https://doi.org/10.1016/j.knosys.2015.12.022
Moorthy, U., & Gandhi, U. D. (2021). A novel optimal feature selection technique for medical data classification using ANOVA based whale optimization. Journal of Ambient Intelligence and Humanized Computing, 12(3), 3527–3538. https://doi.org/10.1007/s12652-020-02592-w
Weston, J., Mukherjee, S., Chapelle, O., Pontil, M., Poggio, T., & Vapnik, V. (2000). Feature selection for SVMs. Advances in Neural Information Processing Systems. https://www.researchgate.net/publication/221619995
Parouha, R. P., & Das, K. N. (2016). A memory based differential evolution algorithm for unconstrained optimization. Applied Soft Computing, 38, 501–517. https://doi.org/10.1016/J.ASOC.2015.10.022
Pehlivanlı, A. Ç. (2016). A novel feature selection scheme for high-dimensional data sets: four-Staged Feature Selection. Journal of Applied Statistics, 43(6), 1140–1154. https://doi.org/10.1080/02664763.2015.1092112
Song, Q., Ni, J., & Wang, G. (2011). A Fast Clustering-Based Feature Subset Selection Algorithm for High Dimensional Data. IEEE Transactions on Knowledge and Data Engineering. https://doi.org/10.1109/TKDE.2011.181
Salcedo-Sanz, S., Prado-Cumplido, M., Pérez-Cruz, F., & Bousoño-Calzón, C. (2002). Feature Selection via Genetic Optimization. 547–548.
Samareh Moosavi, S. H., & Khatibi Bardsiri, V. (2017). Satin bowerbird optimizer: A new optimization algorithm to optimize ANFIS for software development effort estimation. Engineering Applications of Artificial Intelligence, 60, 1–15. https://doi.org/10.1016/j.engappai.2017.01.006
Shaheen, M. A. M., Yousri, D., Fathy, A., Hasanien, H. M., Alkuhayli, A., & Muyeen, S. M. (2020). A Novel Application of Improved Marine Predators Algorithm and Particle Swarm Optimization for Solving the ORPD Problem. Energies, 13(21). https://doi.org/10.3390/en13215679
Shen, C., & Zhang, K. (2022). Two-stage improved Grey Wolf optimization algorithm for feature selection on high-dimensional classification. Complex and Intelligent Systems, 8(4), 2769–2789. https://doi.org/10.1007/s40747-021-00452-4
Soliman, M. A., Hasanien, H. M., & Alkuhayli, A. (2020). Marine Predators Algorithm for Parameters Identification of Triple-Diode Photovoltaic Models. IEEE Access, 8, 155832–155842. https://doi.org/10.1109/ACCESS.2020.3019244
Storn, R., & Price, K. (1997). Differential Evolution – A Simple and Efficient Heuristic for global Optimization over Continuous Spaces. Journal of Global Optimization, 11(4), 341–359. https://doi.org/10.1023/A:1008202821328
Tawhid, M. A., & Dsouza, K. B. (2018). Hybrid binary bat enhanced particle swarm optimization algorithm for solving feature selection problems. Applied Computing and Informatics, 16(1–2), 117–136. https://doi.org/10.1016/j.aci.2018.04.001
Too, J., Mafarja, M., & Mirjalili, S. (2021). Spatial bound whale optimization algorithm: an efficient high-dimensional feature selection approach. Neural Computing and Applications, 33(23), 16229–16250. https://doi.org/10.1007/s00521-021-06224-y
Tran, B., Xue, B., & Zhang, M. (2019). Adaptive multi-subswarm optimisation for feature selection on high-dimensional classification. GECCO 2019 - Proceedings of the 2019 Genetic and Evolutionary Computation Conference, 481–489. https://doi.org/10.1145/3321707.3321713
Vapnik, V. (1995). The Nature of Statistical Learning Theory. Springer, New York.
Wang, G. G., Deb, S., & Cui, Z. (2019). Monarch butterfly optimization. Neural Computing and Applications, 31(7), 1995–2014. https://doi.org/10.1007/s00521-015-1923-y
Winter, G., Periaux, J., Galan, M., & Cuesta, P. (1996). Genetic Algorithms in Engineering and Computer Science. http://dl.acm.org/citation.cfm?id=547504
Xu, Z., Huang, G., Weinberger, K. Q., & Zheng, A. X. (2014). Gradient boosted feature selection. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 522–531. https://doi.org/10.1145/2623330.2623635
Yan, C., Ma, J., Luo, H., & Patel, A. (2019). Hybrid binary Coral Reefs Optimization algorithm with Simulated Annealing for Feature Selection in high-dimensional biomedical datasets. Chemometrics and Intelligent Laboratory Systems, 184, 102–111. https://doi.org/10.1016/j.chemolab.2018.11.010
Zhang, B., & Cao, P. (2019). Classification of high dimensional biomedical data based on feature selection using redundant removal. PLoS ONE, 14(4), 1–19. https://doi.org/10.1371/journal.pone.0214406
Zhao, W., Wang, L., & Zhang, Z. (2019). Atom search optimization and its application to solve a hydrogeologic parameter estimation problem. Knowledge-Based Systems, 163, 283–304. https://doi.org/10.1016/j.knosys.2018.08.030
الرخصة
Copyright (c) 2025 Journal of Education for Pure Science

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The authors understand that the copyright of the articles shall be assigned to the Journal of Education for Pure Science (JEPS), University of Thi-Qar, as publisher of the journal.
Copyright encompasses the exclusive rights to reproduce and deliver the article in all forms and media, including reprints, photographs, microfilms and any other similar reproductions, as well as translations. The reproduction of any part of this journal, its storage in databases and its transmission by any form or media, such as electronic, electrostatic and mechanical copies, photocopies, recordings, magnetic media, etc., will be allowed only with written permission from the Journal of Education for Pure Science (JEPS), University of Thi-Qar.
The Journal of Education for Pure Science (JEPS), University of Thi-Qar, the Editors and the International Advisory Editorial Board make every effort to ensure that no wrong or misleading data, opinions or statements are published in the journal. In any case, the contents of the articles and advertisements published in the Journal of Education for Pure Science (JEPS), University of Thi-Qar are the sole and exclusive responsibility of their respective authors and advertisers.