UNVEILING HIDDEN INSIGHTS: DIMENSIONALITY REDUCTION FOR PROSTATE CANCER DATA WITH PCA AND GAUSSIAN MIXTURE MODEL
Keywords: mixture model, GMM, principal component analysis, PCA mixture, BIC, EM algorithm
DOI: https://doi.org/10.17654/0972361725024
Abstract
Applying machine learning techniques for dimensionality reduction is a common way to improve the computational efficiency and accuracy of data analysis. Dimensionality reduction techniques can be categorized as supervised or unsupervised. One of the oldest and most popular unsupervised techniques is Principal Component Analysis (PCA), which uses no class labels; it reduces the dimensionality of a dataset while preserving as much variability (i.e., statistical information) as possible. In this study, we apply a three-component multivariate Gaussian mixture model to simulated and real prostate cancer datasets to reduce the volume of data to be processed and learned. To this end, the significant components are identified under two initializations of the location parameters: values obtained from k-means, and the default values of the R package. The expectation-maximization (EM) algorithm is used to estimate the model parameters, and the Bayesian Information Criterion (BIC) is evaluated for both sets of initial values, showing that both initializations are computationally efficient. The proposed technique thus effectively reduces dimensionality, retains high variance, and achieves reliable clustering, which increases computational efficiency and improves result quality, making it well suited to high-dimensional datasets.
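To make the workflow concrete, here is a minimal R sketch of the pipeline the abstract describes: PCA to retain the leading high-variance components, followed by a three-component Gaussian mixture fitted by the EM algorithm under two initializations (a k-means partition versus the package default), with the two fits compared by BIC. The paper does not name its R package, so the use of mclust below is an assumption, the data are simulated stand-ins rather than the actual prostate cancer measurements, and the 90% variance threshold is illustrative only.

```r
## Hedged sketch of the abstract's pipeline. Assumptions: the 'mclust'
## package (the paper does not name its package) and simulated stand-in
## data instead of the real prostate cancer dataset.
library(mclust)

set.seed(1)

## Simulated stand-in: n = 300 observations, p = 8 features, drawn from
## three Gaussian groups with distinct centers.
n <- 300; p <- 8
centers <- matrix(rnorm(3 * p, sd = 4), nrow = 3)
grp <- sample(1:3, n, replace = TRUE)
X <- centers[grp, ] + matrix(rnorm(n * p), nrow = n)

## Step 1: PCA; keep the leading components explaining ~90% of variance.
pca <- prcomp(X, center = TRUE, scale. = TRUE)
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
k <- which(cum_var >= 0.90)[1]
scores <- pca$x[, 1:k, drop = FALSE]

## Step 2a: three-component GMM fitted by EM with mclust's default
## (model-based hierarchical) initialization. mclust reports BIC as
## 2*loglik - npar*log(n), so larger values are better.
fit_default <- Mclust(scores, G = 3)

## Step 2b: the same mixture family, but with EM started from a k-means
## partition, so the k-means centers act as initial location parameters.
km <- kmeans(scores, centers = 3, nstart = 25)
fit_km <- me(data = scores, modelName = fit_default$modelName,
             z = unmap(km$cluster))
bic_km <- bic(modelName = fit_default$modelName, loglik = fit_km$loglik,
              n = nrow(scores), d = ncol(scores), G = 3)

## Compare the two initializations on the common BIC scale.
c(default = fit_default$bic, kmeans = bic_km)
```

Because both runs fit the same covariance model and the same number of components, any BIC gap reflects only where EM converged from each starting point; on well-separated data the two initializations typically agree.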
Received: October 17, 2024
Revised: January 13, 2025
Accepted: January 20, 2025
License
Copyright (c) 2025 Pushpa Publishing House, Prayagraj, India

This work is licensed under a Creative Commons Attribution 4.0 International License.
Attribution: Credit Pushpa Publishing House as the original publisher, including title and author(s) if applicable.
No Derivatives: Modifying or creating derivative works is not allowed without written permission.
Contact Pushpa Publishing House for more information or permissions.