Principal Components Based on a Subset of Qualitative Variables and Its Accelerated Computational Algorithm
Masahiro Kuroda, Masaya Iizuka, Yuichi Mori, Michio Sakakihara
Department of Socio-Information, Okayama University of Science, Okayama, Japan; Graduate School of Environmental Science, Okayama University, Okayama, Japan; Department of Socio-Information, Okayama University of Science, Okayama, Japan

Principal components analysis (PCA) is a descriptive multivariate method that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables. Selecting a subset of variables in PCA is useful in many practical applications.

Modified PCA (M.PCA), proposed by Tanaka and Mori (1996), derives principal components that are computed as linear combinations of a subset of variables but can still reproduce all the variables very well. M.PCA can therefore select a reasonable subset of any number of variables. When M.PCA is applied to qualitative data, PRINCIPALS by Young et al. (1978) and PRINCALS by Gifi (1989), both based on the alternating least squares (ALS) algorithm, can be used as the quantification method. In this situation, computation time has been a major issue: the ALS algorithm must be iterated many times in total and takes a long time to converge, even when a cost-saving selection procedure such as backward, forward or stepwise selection is employed. Kuroda et al. (2011) derive a new iterative algorithm that accelerates the convergence by using the vector epsilon (VE) algorithm of Wynn (1962). In this paper, we investigate how much the proposed VE acceleration algorithm improves computational efficiency when it is applied to the variable selection problem in M.PCA for qualitative data.
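
As a rough illustration of the kind of extrapolation the VE algorithm performs, the following Python sketch applies one vector epsilon step to three consecutive iterates of a slowly converging sequence. It is a minimal sketch and not the algorithm of Kuroda et al. (2011): the function names `samelson_inverse` and `ve_extrapolate` and the toy sequence are assumptions made here for illustration, and the sequence stands in for the estimates that an ALS iteration would produce.

```python
def samelson_inverse(v):
    """Samelson inverse of a vector: v / ||v||^2 (assumes v is nonzero)."""
    norm_sq = sum(c * c for c in v)
    return [c / norm_sq for c in v]


def ve_extrapolate(x0, x1, x2):
    """One vector epsilon step built from three consecutive iterates.

    Computes x1 + [ (x2 - x1)^{-1} - (x1 - x0)^{-1} ]^{-1},
    where ^{-1} denotes the Samelson inverse.
    """
    d1 = [b - a for a, b in zip(x0, x1)]  # first difference
    d2 = [c - b for b, c in zip(x1, x2)]  # second difference
    inv_diff = [p - q for p, q in
                zip(samelson_inverse(d2), samelson_inverse(d1))]
    correction = samelson_inverse(inv_diff)
    return [b + c for b, c in zip(x1, correction)]


# Hypothetical toy sequence x_n = limit + 0.9**n * direction, which
# converges linearly and slowly, as ALS iterates often do.
limit = [1.0, -2.0, 0.5]
direction = [3.0, 1.0, -4.0]


def iterate(n):
    return [l + 0.9 ** n * d for l, d in zip(limit, direction)]


accelerated = ve_extrapolate(iterate(0), iterate(1), iterate(2))
plain_error = max(abs(a - l) for a, l in zip(iterate(2), limit))
accel_error = max(abs(a - l) for a, l in zip(accelerated, limit))
print(plain_error, accel_error)
```

For this purely geometric toy sequence the extrapolated point coincides with the limit up to rounding error, while the third plain iterate is still far from it. Real ALS sequences are only approximately geometric, so the gain in practice would be smaller.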

Keywords: Principal components analysis; Alternating least squares algorithm; Vector epsilon acceleration; Model selection

Biography: Masahiro Kuroda is Associate Professor at Okayama University of Science. His research area is computational statistics. He is especially interested in the acceleration of convergence of statistical computation algorithms.