Study of Record Linkage Software for the 2010 Brazilian Census Post Enumeration Survey
Andrea Diniz da Silva, Otavio Santana Martins Romeo, Thaigo Silva Soares, Vinicius Layter Xavier
Directorate of Surveys, Instituto Brasileiro de Geografia e Estatística - IBGE, Rio de Janeiro, Brazil

One of the biggest improvements of the 2010 Brazilian Census Post Enumeration Survey (PES) is the incorporation of new methodologies and technologies developed or improved over the last decade. One area that has shown great advances is the development of software packages for the implementation of computational models. It is now possible to find more than one software package for implementing record linkage projects. While such advances offer the possibility of improving the process and its results, they also require a great investment of time and resources to study the options and choose the one that best meets the specific needs of each setting. To find a tool that could be used for linking records from the Brazilian Census and the PES, the Brazilian Institute of Geography and Statistics (IBGE) invested in software studies over the last two years. During this period, covering 2009 and 2010, commercial and free software packages were evaluated, including Data Quality (SAS), Quality Stage (IBM), RecLink (Universidade do Estado do Rio de Janeiro & Universidade Federal do Rio de Janeiro), LinkPlus (CDC-Atlanta), FEBRL (Australian National University) and RELAIS (Istat). The study took into account not only operational but also technical aspects, since it was conducted in parallel with the study of the methodological models. The experiment was based on the documents and licenses available at the time. It led to the conclusion that no existing software package properly met the specific requirements of the Brazilian PES. Among these requirements is the need to process the PES while Census data collection is still under way. Therefore, IBGE developed its own software, written in the R language, for the record linkage of the 2010 Census PES.

Keywords: Record Linkage; Software studies; Post Enumeration Survey; Census Evaluation

Biography: Andrea Diniz da Silva works for the Brazilian Institute of Geography and Statistics (IBGE) as a coordinator of the 2010 Brazilian Census Post Enumeration Survey. Previously she was part of the team for the preparation of the 2010 Census and worked on pilot and cognitive tests on disability and literacy. In the 2000 Census, she worked on the data analyses during and after the data collection operation. She holds a master's degree in Social Survey and Sampling and is a member of the Brazilian Association of Population Studies and the Brazilian Association of Statistics.