A Weighted Score Derived from a Multiple Correspondence Analysis Solution
Márcio L.M. Souza, Ronaldo R. Bastos, Marcel De Toledo Vieira
Departamento de Estatística, Universidade Federal de Juiz de Fora, Juiz de Fora, Minas Gerais, Brazil

Multivariate data analysis treats each variable as a dimension of a space, and it has gained considerable momentum in recent decades with the development and improvement of computer technology. One of the main aims of such techniques is the reduction of data dimensionality. In this paper, we propose a methodology for calculating weighted scores from a set of categorical variables, based on the mathematical properties of Multiple Correspondence Analysis (MCA).

Survey data used to measure attitudes, satisfaction and other latent variables of interest to social scientists often rely on a set of Likert-type statements, for each of which respondents choose one category among the possible answers. For each respondent, a score is usually calculated by summing the values assigned to each response. However, such scores take integer values only and assume equal distances between adjacent ordered categories. Furthermore, summation scores may represent latent traits less accurately, because different response profiles may yield identical score values. The scoring method proposed in this paper seeks to reduce this problem.
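The loss of information in summation scores can be illustrated with a small sketch. It is written in Python purely for illustration, and the setup (three Likert-type items, each coded 1 to 5) is a hypothetical example, not the survey data analysed in this paper.

```python
from collections import defaultdict
from itertools import product

# Hypothetical setup: 3 Likert-type items, each coded 1..5.
N_ITEMS = 3
CATEGORIES = range(1, 6)


def summation_score(profile):
    """Unweighted score: the plain sum of the coded responses."""
    return sum(profile)


# Group every possible response profile by its summation score.
profiles_by_score = defaultdict(list)
for profile in product(CATEGORIES, repeat=N_ITEMS):
    profiles_by_score[summation_score(profile)].append(profile)

n_profiles = sum(len(group) for group in profiles_by_score.values())
n_scores = len(profiles_by_score)

print(f"{n_profiles} distinct profiles map onto {n_scores} distinct scores")
print("profiles sharing the score 9:", len(profiles_by_score[9]))
```

In this toy setting, 125 distinct response profiles collapse onto only 13 integer scores. For instance, the profiles (1, 3, 5) and (3, 3, 3) both receive the score 9, although they describe quite different respondents.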

The methodology developed in this paper allows the subsequent application of explanatory models to both cross-sectional and longitudinal data, since the average profile of scores may change over time. We apply the proposed weighted score technique to attitudinal data from the British Household Panel Survey (Taylor et al., 2001). The scores are computed in R (R Development Core Team, 2010) from the MCA solution obtained with the ca package (Nenadic and Greenacre, 2007). To evaluate the stability of the results, we have been undertaking simulation-based analyses with the original data and with data generated from different population scenarios. Our results suggest that the proposed weighted score has the potential to represent latent variables better than the simple summation of the values of the categorical variables over all responses.
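As a rough illustration of how an MCA solution can yield category weights, the sketch below performs a correspondence analysis of the indicator matrix and scores each respondent on the first axis. It is only a sketch under stated assumptions: this abstract does not give the weighting formula, so the choice of the first-axis principal coordinate as the score is an assumption, the function name mca_first_axis_scores and the toy data are hypothetical, and the code uses Python with NumPy rather than R and the ca package.

```python
import numpy as np


def mca_first_axis_scores(responses):
    """Score respondents on the first MCA axis (indicator-matrix approach).

    responses: (n, q) array-like of integer category codes.
    Returns (row_scores, category_weights), where category_weights maps
    (item_index, category_code) to a first-axis standard coordinate.
    ASSUMPTION: using the first axis as "the" score is an illustrative choice.
    """
    responses = np.asarray(responses)
    n, q = responses.shape

    # Indicator (dummy) matrix Z: one column per observed category of each item.
    blocks, labels = [], []
    for j in range(q):
        cats = np.unique(responses[:, j])
        blocks.append((responses[:, [j]] == cats).astype(float))
        labels.extend((j, int(c)) for c in cats)
    Z = np.hstack(blocks)

    # Correspondence matrix with row and column masses.
    P = Z / Z.sum()
    r = P.sum(axis=1)
    c = P.sum(axis=0)

    # Standardized residuals and their singular value decomposition.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)

    # Category weights: standard coordinates on the first axis.
    col_std = Vt[0] / np.sqrt(c)
    # Respondent scores: principal coordinates on the first axis.
    row_scores = U[:, 0] * sv[0] / np.sqrt(r)
    return row_scores, dict(zip(labels, col_std))


# Toy data (hypothetical): 8 respondents, 3 items coded 1..3.
data = [
    [1, 1, 1], [1, 2, 1], [2, 2, 2], [3, 3, 3],
    [3, 2, 3], [1, 1, 2], [2, 3, 3], [3, 3, 2],
]
scores, weights = mca_first_axis_scores(data)
for row, score in zip(data, scores):
    print(row, "sum =", sum(row), "weighted =", round(float(score), 3))
```

Under this construction each respondent's score equals the average of the first-axis weights of the categories he or she chose. The scores are therefore real-valued, and the distances between adjacent categories are estimated from the data rather than fixed at one. The sign of the axis is arbitrary, so it may need to be flipped to match the direction of the underlying scale.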

References:

Nenadic, O. and Greenacre, M. (2007) Correspondence Analysis in R, with Two- and Three-dimensional Graphics: The ca Package. Journal of Statistical Software, vol. 20, issue 3. http://www.jstatsoft.org/

R Development Core Team (2010) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL.

Taylor, M.F. (ed.), Brice, J., Buck, N. and Prentice-Lane, E. (2001) British Household Panel Survey - User Manual - Vol. A: Introduction, Technical Report and Appendices. Colchester: University of Essex.

Keywords: scoring method; multivariate analysis; multiple correspondence analysis; attitudinal data

Biography: Vieira, Marcel D.T. (B.Sc. M.Sc. Ph.D; b 1976).

Marcel Vieira is Senior Lecturer of Statistics and Head of the Department of Statistics at the Federal University of Juiz de Fora, Brazil. His academic interests focus on statistical methods in sample surveys, official statistics, and the social sciences, including the design and analysis of longitudinal sample surveys. He was the CASS-ESRC Fellowship Award winner at the Southampton Statistical Sciences Research Institute (S3RI) at the University of Southampton in 2006, and the IASS Cochran-Hansen Prize winner in 2007.