This presentation introduces a class of algorithms for stratifying a population using a known positive stratification variable X. The range of X is divided into non-overlapping intervals that define the strata: the units whose X-values fall in a given interval form a stratum. The presentation discusses how to determine the stratum boundaries. Several criteria can be envisaged. One can minimize the anticipated sample size needed to estimate the population total of a survey variable with a given level of precision. One can also minimize the variance of an estimate of this total for a fixed sample size. In these calculations one can assume that the survey and the stratification variables coincide, or use a model to account for a discrepancy between the two. The optimal stratum boundaries depend on the allocation rule used to distribute the sample units among the strata, and several rules are available. In some instances one might be interested in having a take-all stratum for the “large” units and/or a take-none stratum for the small ones. The R package stratification, which implements most of these procedures, will be used. The basic methods of Dalenius and Hodges and of Gunning and Horgan will be presented together with iterative procedures similar to that of Lavallée and Hidiroglou. Two algorithms for determining optimal boundaries, attributable to Sethi and Kozak, will be compared. The presentation will consist of a series of case studies illustrating various aspects of these procedures using real populations.
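As a rough illustration of the simplest ideas mentioned above, the following Python sketch places stratum boundaries in geometric progression between the smallest and largest X-values (the rule associated with Gunning and Horgan) and then distributes a sample of size n by Neyman allocation, n_h proportional to N_h S_h. It is not the stratification R package and does not reproduce its interface: the function names are hypothetical, the survey variable is assumed to coincide with X, strata are taken as intervals closed on the left, each stratum is assumed to contain at least two units, and the allocation is left unrounded with no take-all or take-none adjustment.

```python
from bisect import bisect_right
from statistics import stdev


def geometric_boundaries(x, n_strata):
    """Interior boundaries b_1 < ... < b_{L-1} in geometric progression.

    Assumes every value in x is strictly positive.
    """
    lo, hi = min(x), max(x)
    ratio = (hi / lo) ** (1.0 / n_strata)
    return [lo * ratio ** h for h in range(1, n_strata)]


def assign_strata(x, boundaries):
    """0-based stratum index of each unit; stratum h is [b_h, b_{h+1})."""
    return [bisect_right(boundaries, value) for value in x]


def neyman_allocation(x, strata, n):
    """Unrounded Neyman allocation: n_h = n * N_h * S_h / sum_k(N_k * S_k).

    S_h is the within-stratum standard deviation of x (divisor N_h - 1),
    so every stratum must contain at least two units.
    """
    n_strata = max(strata) + 1
    groups = [[] for _ in range(n_strata)]
    for value, h in zip(x, strata):
        groups[h].append(value)
    weights = [len(g) * stdev(g) for g in groups]
    total = sum(weights)
    return [n * w / total for w in weights]


if __name__ == "__main__":
    # Small made-up population, for illustration only.
    population = [1, 2, 4, 16, 32, 64]
    bounds = geometric_boundaries(population, 2)
    strata = assign_strata(population, bounds)
    print(bounds, strata, neyman_allocation(population, strata, 17))
```

The geometric rule needs no iteration, which makes it a convenient starting point; the iterative procedures discussed in the presentation instead adjust the boundaries to optimize the chosen criterion under a given allocation rule.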
Keywords: Neyman allocation; Take-all stratum; Take-none stratum
Biography: Louis-Paul Rivest is Professor of Statistics at Université Laval. He holds a Canada Research Chair in Statistical Sampling and Data Analysis. He has been an active researcher in survey sampling for more than 20 years.