Predictive Performance of Bayesian Model Diagnostics
Naoki Isogawa1, Masashi Goto2
1Clinical Statistics, Pfizer Japan Inc., Shibuya-ku, Tokyo, Japan; 2Biostatistical Research Association, NPO, Toyonaka-shi, Osaka, Japan

In traditional Bayesian model selection based on the Bayesian information criterion (BIC) or the Bayes factor (BF), the model with the highest posterior probability is selected. In practice, however, we are often interested in data or information that will be obtained in the future.

The prior and posterior predictive checking approaches (PCA) (Box, 1980; Rubin, 1984; Gelman, Meng and Stern, 1996; Daimon and Goto, 2007) and the Bayesian predictive information criterion (BPIC) (Ando, 2007) are two diagnostic methods that focus on prediction.

In the prior and posterior predictive checking approaches, a model is evaluated by comparing the prior and posterior predictive distributions of future observations with the data actually observed. In the Bayesian predictive information criterion, the model with the highest expected log-likelihood is selected.
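The posterior predictive check described above can be illustrated with a minimal sketch. It is not the setup used in this study: it assumes a normal model with known SD and a conjugate normal prior on the mean, and the function name, the parameter names (mu0, tau0, sigma) and the example data are hypothetical. Drawing mu from the prior instead of the posterior would give the corresponding prior predictive check.

```python
import random
import statistics


def posterior_predictive_pvalue(y, stat, mu0=0.0, tau0=10.0, sigma=1.0,
                                n_rep=2000, seed=0):
    """Estimate P(stat(y_rep) >= stat(y) | y) by simulation.

    Assumed model (illustrative only): y_i ~ N(mu, sigma^2) with sigma known,
    and prior mu ~ N(mu0, tau0^2).
    """
    rng = random.Random(seed)
    n = len(y)
    ybar = statistics.mean(y)

    # Conjugate normal update for the mean.
    precision = 1.0 / tau0**2 + n / sigma**2
    mu_n = (mu0 / tau0**2 + n * ybar / sigma**2) / precision
    tau_n = precision ** -0.5

    t_obs = stat(y)
    exceed = 0
    for _ in range(n_rep):
        mu = rng.gauss(mu_n, tau_n)                       # draw from the posterior
        y_rep = [rng.gauss(mu, sigma) for _ in range(n)]  # replicated data set
        if stat(y_rep) >= t_obs:
            exceed += 1
    return exceed / n_rep


# Hypothetical data: the mean is consistent with the model, and so is the SD.
y_ok = [0.2, -0.5, 1.1, 0.4, -0.3]
p_mean = posterior_predictive_pvalue(y_ok, statistics.mean)

# Hypothetical data whose spread is far larger than the assumed sigma = 1.
y_wide = [-6.0, -3.0, 0.0, 3.0, 6.0]
p_sd = posterior_predictive_pvalue(y_wide, statistics.stdev)
```

A predictive probability close to 0 or 1 for a given index (here the mean or the SD) suggests that the model does not reproduce that feature of the observed data, which is how the approach can diagnose individual indices of interest as well as the model as a whole.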

We evaluated these two methods under various predictive distributions, namely under various scenarios of prior distributions, sample sizes, and indices of interest (e.g., mean, SD).

As a result, we obtained a clearer interpretation of their characteristics. The main findings of our study are as follows. For models with weak prior information, BPIC was more sensitive in model selection than PCA; that is, the rates of selecting the correct model were higher with BPIC than with PCA. For models with strong prior information, comparable in amount to the information in the observed data, BPIC was as sensitive as PCA. Furthermore, when models with weak and strong prior distributions were evaluated simultaneously, PCA gave much the same predictive checking probabilities for models with weak and strong prior distributions that included the true parameter value, so we could not distinguish between them. BPIC, on the other hand, chose models with a strong prior distribution including the true parameter value more often than models with a weak prior distribution including the true parameter value. However, PCA was useful for discriminating among many models because it could diagnose not only the model itself but also indices of interest from a frequentist perspective.

We conclude that model selection can be improved by combining the two methods.


Ando, T. (2007). Bayesian predictive information criterion for the evaluation of hierarchical Bayesian and empirical Bayes models. Biometrika, 94, 443-458.

Box, G. E. P. (1980). Sampling and Bayes' inference in scientific modelling and robustness. J. Roy. Statist. Soc., A, 143, 383-430.

Daimon, T. and Goto, M. (2007). Predictive checking approach to Bayesian interim monitoring. Japanese J. Appl. Statist., 36(2 & 3), 119-137.

Gelman, A., Meng, X. L. and Stern, H. (1996). Posterior predictive assessment of model fitness via realized discrepancies (with discussion). Statistica Sinica, 6, 733-807.

Rubin, D. B. (1984). Bayesianly justifiable and relevant frequency calculations for the applied statistician. The Annals of Statistics, 12, 1151-1172.

Keywords: Prior and posterior predictive checking approaches; Bayesian predictive information criterion; Predictive model diagnostics

Biography: Isogawa, N., Ikebe, T., Sakamoto, W. and Goto, M. (2011). A preliminary evaluation about health guidance (in Japanese). Behaviormetrika (in press).

Isogawa, N., Shirahata, S. and Goto, M. (2008). Effect of asymmetry in underlying distributions on performance of paired tests. Proceedings of the Joint Meeting of 4th World Conference of the IASC and 6th Conference of the Asian Regional Section of the IASC on Computational Statistics and Data Analysis, Yokohama, Japan.

Sakamoto, W., Isogawa, N. and Goto, M. (2008). Statistical issues on Japanese criteria of metabolic syndrome (in Japanese). Behaviormetrika, 35(2), 177-192.