It is now generally accepted that a precursor to carrying out statistical disclosure risk analysis is the grounded generation of appropriate key variable sets. Work by Paass (1990) and Elliot and Dale (1999) laid the groundwork for an approach based on attack scenarios. This approach has been further extended by Mackey (2009), who describes the notion of the data environment. Recent developments by Elliot et al. (2009) show how this notion might be formalised and describe a pilot system, the key variable mapping system. This paper reports on updates to this work, incorporating the notion of data accessibility and other parameters affecting the degree of effort required by a would-be data intruder who wishes to use external data sources to identify individual population units within anonymised data sets. This is the first attempt to metricise the likelihood of a disclosure attempt, and the work provides an indicator of how we might move beyond the data-centric approach (using just the properties of the to-be-released data to estimate risk).
Keywords: Disclosure risk; Key variable; Data environment; Data intrusion
Biography: Mark Elliot is a Senior Lecturer at Manchester University.