For some decades now, administrative registers have been an important data source for official statistics, alongside sample surveys and population censuses. Reduced response burden, long-term cost efficiency, and the potential for detailed spatial-demographic and longitudinal statistics are some of the major advantages associated with the use of administrative registers. However, administrative registers certainly do not provide perfect statistical data. Despite the absence of sampling errors, there exist a variety of non-sampling errors, such as over- and under-coverage, lack of relevance, misclassification, missing data, delays and mistakes in the data registration process, and inconsistency across the administrative sources. Until recently, there was clearly a lack of statistical theory for assessing the quality of register statistics. We believe that a key issue here is the conceptualization and measurement of statistical accuracy in register statistics, which will enable us to apply rigorous statistical concepts such as bias, variance, efficiency and consistency, as one is able to do in survey sampling. In this paper we review the recent developments in statistical theory for register-based statistics and present a systematic approach to the various potential error sources. A shared understanding of these error sources will hopefully help us to collocate and coordinate efforts in future research and development.
Keywords: Register-based statistics; Statistical theory; Error source; Data integration
Biography: Li-Chun Zhang obtained the Dr. Scient. degree at the University of Tromsø. He is currently a senior methodologist at Statistics Norway.