A Proposal for the Treatment of Non Response To Improve the Quality of Business Census-Like Data in Italian National Accounts Estimations
Augusto Puggioni, Giuseppe Sacco
National Accounts Directorate, Istat - Italian National Statistical Institute, Rome, Italy

The availability of census-like data sources on enterprises is a key requirement for obtaining sound estimates of economic aggregates. The annual survey on Business Accounts run by Istat is one of the main sources used in National Accounts, as it gathers information from the annual reports of all large enterprises with at least 100 workers in industry, construction, trade and services. Treating non-response in the survey results is therefore an issue of paramount importance. Experience has shown that the traditional imputation methods, which use administrative data on the same enterprise to impute the main items of the questionnaire and a “donor” selected among the respondents to impute the sub-items, are inadequate for reconstructing the information for large enterprises in a time series approach. The paper analyzes alternative approaches to imputing missing items in the survey questionnaire and provides comparative results on how the traditional and the newly proposed methods affect the estimation of annual changes in economic aggregates.

Keywords: Business Survey Data imputation; National Accounts; Administrative data

Biography: Giuseppe Sacco holds a degree in Statistics from the University of Rome La Sapienza. He is currently a technologist at the National Accounts Directorate of the National Institute of Statistics, where he is in charge of the analysis of business data. He has participated in several projects in collaboration with the Universities of Bologna, Perugia and Rome, where he worked on algorithms for combinatorial optimization and the integration of information from different sources. His current area of research is the use of learning methods based on Bayesian networks for the imputation of missing data.