In statistical disclosure risk estimation, efficient computation of a posterior predictive distribution, based on a contingency table of key variables, can facilitate accurate and well-calibrated risk identification. However, exact computation is generally infeasible because the tables concerned are large (in number of cells) and highly sparse, and because of the nature of the posterior predictive summaries required. Furthermore, interest is typically focussed on cells which have small sample frequencies, which distinguishes the problem from standard contingency table estimation, where interest is usually focussed on regions of high probability.
We propose an approach for summarising the posterior predictive distribution arising from a large sparse contingency table, which combines exact computations and accurate approximations. The predictive distributions are model-averaged, to take account of model uncertainty, and hence provide appropriate smoothing in the estimation of cell probabilities in regions of low probability.
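As a rough illustration of what model averaging means here, the generic Bayesian model-averaging identity can be written as follows. The notation is assumed for illustration and is not taken from the abstract: n denotes the observed sample table, F_k an unobserved quantity for cell k (for example its population frequency), and m indexes the candidate models in a set M.

```latex
% Generic Bayesian model averaging (assumed notation, not the authors' own):
% each model's predictive distribution is weighted by its posterior probability.
\[
  p(F_k \mid n) \;=\; \sum_{m \in \mathcal{M}} p(m \mid n)\, p(F_k \mid n, m),
  \qquad
  p(m \mid n) \;=\; \frac{p(n \mid m)\, p(m)}{\sum_{m' \in \mathcal{M}} p(n \mid m')\, p(m')} .
\]
```

Because the weights p(m | n) spread mass over several models, the averaged predictive distribution for a sparse cell borrows strength from simpler models as well as richer ones, which is one way to read the smoothing referred to above.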
Keywords: Bayesian inference; Predictive distribution; Contingency table; Statistical disclosure risk estimation
Biography: Jon Forster is Deputy Director of Southampton Statistical Sciences Research Institute and Professor of Statistics at the University of Southampton, UK. His research interests include Bayesian inference and computation, particularly in the context of categorical data analysis, and statistical applications in demography.