Statisticians have yet to deal with privacy protection for large-scale sparse statistical databases in an adequate and systematic fashion. Here I review some of the traditional approaches to disclosure limitation used for more standard rectangular n × p arrays and discuss them from the perspective of usability (freedom from systematic distortions), transparency (the provision of information on bias and variability), and duality (balancing the risk-utility trade-off). Then I explain why extensions of these approaches to the domain of network data pose even greater challenges, and I review progress on the topic to date.
Keywords: Privacy; Disclosure; Network Data
Biography: Stephen Fienberg is the Maurice Falk University Professor of Statistics and Social Science in the Department of Statistics, the Machine Learning Department, CyLab, and i-Lab at Carnegie Mellon University. He is senior editor of the Annals of Applied Statistics and editor-in-chief of the Journal of Privacy and Confidentiality. His research interests include the analysis of categorical data, confidentiality and disclosure control, Bayesian methods, statistics and the law, and the foundations of statistical inference.