Enhanced Wasserstein Generative Adversarial Network (EWGAN) to Oversample Imbalanced Datasets

Authors

  • Muhammad Hassan Ajmal Hashmi, CEO Tech Solutions, Lahore-54000, Pakistan; Faculty of Computer Science, KIPS College, Lahore-5400, Pakistan
  • Muhammad Ashraf, IT Department, Gulab Devi Teaching Hospital, Lahore 54000, Pakistan
  • Saleem Zubair Ahmad, Department of Software Engineering, Superior University, Lahore-54000, Pakistan
  • Muhammad Waseem Iqbal, Department of Software Engineering, Superior University, Lahore-54000, Pakistan
  • Adeel Hamid, Faculty of Computer Science, Virtual University, Lahore-5400, Pakistan
  • Dr. Abid Ali Hashmi, Educational Complex, Lahore-54000, Pakistan
  • Muhammad Ameer Hamza, Department of Computer Science, Superior University, Lahore-54000, Pakistan

DOI:

https://doi.org/10.61506/01.00505

Keywords:

WGAN, Imbalanced Data, Synthetic Data, Machine Learning, Cancer Diagnosis, Data Sampling, Model Stability, Data Generation, GAN Models

Abstract

This paper examines an enhanced Wasserstein GAN (WGAN) as a technique for handling imbalanced datasets in machine learning. Imbalanced datasets affect many domains, including medical diagnosis and image generation, because a model can only be trained well, and varied data can only be generated, when the minority class is adequately represented. To address this, the WGAN adds residual connections in the critic network, improved sampling of minority classes, and noise and sample reshaping. These changes improve model stability, the quality of the synthetic data, and the class balance of the dataset. A comparative analysis against a basic GAN and an Improved GAN shows that the approach produces high-quality, diverse synthetic data that lies closer to the real data distribution. The study also outlines future research directions for WGAN-based oversampling, in which reliable and diverse synthesized data supports machine learning on imbalanced data, both in further studies and in practical applications.
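
The ingredients named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy one-parameter residual block, and the benign/malignant labels are assumptions made for illustration only, and the Lipschitz constraint a real WGAN critic needs (weight clipping or a gradient penalty) is omitted. It shows how many synthetic samples each class would need to reach balance, what a residual connection does to a critic's input, and how the standard WGAN critic loss is formed.

```python
from collections import Counter
from statistics import fmean


def synthetic_counts(labels):
    """Synthetic samples each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def residual_block(x, weight, bias):
    """Toy residual unit: x + relu(weight * x + bias), applied elementwise."""
    return [xi + max(0.0, weight * xi + bias) for xi in x]


def critic_score(x, weight=0.5, bias=0.0):
    """Hypothetical critic: one residual block, then a mean over features."""
    return fmean(residual_block(x, weight, bias))


def critic_loss(real_batch, fake_batch):
    """WGAN critic loss: mean score on fakes minus mean score on reals."""
    real = fmean([critic_score(x) for x in real_batch])
    fake = fmean([critic_score(x) for x in fake_batch])
    return fake - real


# Illustrative labels only: 8 majority samples, 2 minority samples.
labels = ["benign"] * 8 + ["malignant"] * 2
needed = synthetic_counts(labels)
```

In this toy setup `needed` tells a generator how many minority samples to produce, and minimizing `critic_loss` pushes the critic to score real samples above generated ones.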

Published

2024-08-28

Section

Articles

How to Cite

Hashmi, M. H. A., Ashraf, M., Ahmad, S. Z., Iqbal, M. W., Hamid, A., Hashmi, A. A., & Hamza, M. A. (2024). Enhanced Wasserstein Generative Adversarial Network (EWGAN) to Oversample Imbalanced Datasets. Bulletin of Business and Economics (BBE), 13(3), 385-395. https://doi.org/10.61506/01.00505