6–9 Jul 2026
Europe/Warsaw timezone

Solving Optimum Allocation Problems in Stratified Sampling Using the R Package stratallo

7 Jul 2026, 16:25
5m
Lightning Talk (5 minutes) Lightning Talks

Speaker

Wojciech Wójciak (Warsaw University of Technology)

Description

Optimum allocation of the sample across strata is a fundamental problem in survey sampling. When designing a stratified survey, researchers must decide how to distribute the sample among strata to meet a specific statistical goal, such as minimizing estimator variance for a fixed total sample size or minimizing survey cost subject to precision constraints. Classical results, such as Neyman allocation, solve the problem in simple settings without constraints. However, practical survey designs frequently require additional restrictions, including lower or upper bounds on stratum sample sizes, heterogeneous unit costs, or precision requirements across multiple domains.
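As a point of reference for the unconstrained case, classical Neyman allocation can be sketched in a few lines. The sketch below is illustrative only and is not taken from stratallo: the function name `neyman_allocation` and the inputs `N` (stratum sizes) and `S` (stratum standard deviations) are assumptions introduced here. Each stratum receives a share of the total sample proportional to A_h = N_h * S_h.

```python
def neyman_allocation(n, N, S):
    """Unconstrained Neyman allocation (illustrative sketch).

    n : total sample size
    N : stratum population sizes
    S : stratum standard deviations of the study variable
    Returns real-valued (unrounded) stratum sample sizes.
    """
    # A_h = N_h * S_h drives the allocation.
    A = [Nh * Sh for Nh, Sh in zip(N, S)]
    total = sum(A)
    # Each stratum gets a share of n proportional to A_h.
    return [n * a / total for a in A]


# Hypothetical three-stratum population.
allocation = neyman_allocation(120, N=[100, 200, 300], S=[3, 3, 1])
```

The result is real-valued and ignores any bounds on stratum sample sizes, which is exactly the limitation the constrained methods address.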

This talk presents stratallo, an R package developed to solve several optimum sample allocation problems arising in stratified sampling designs. The package implements efficient algorithms for computing optimal allocations under a range of practical constraints. In particular, it supports three main classes of problems. First, the function opt() computes variance-minimizing allocations under a fixed total sample size, optionally subject to lower bounds, upper bounds, or box constraints on stratum sample sizes. Depending on the constraint structure, the function employs specialized algorithms such as RNA, SGA, SGAPLUS, COMA, and RNABOX.
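To give a feel for how upper bounds change the problem, the following sketch applies a recursive Neyman-style procedure: allocate proportionally to A_h, fix every stratum whose share reaches its upper bound M_h at that bound, and repeat on the remaining strata with the remaining sample. This is a minimal illustration, not the package's code; the name `recursive_neyman` and its arguments are assumptions, and whether it matches the package's RNA implementation in detail is likewise an assumption.

```python
def recursive_neyman(n, A, M):
    """Variance-minimizing allocation under upper bounds (illustrative sketch).

    n : total sample size, assumed to satisfy n <= sum(M)
    A : positive stratum constants, A_h = N_h * S_h
    M : upper bounds on stratum sample sizes
    Returns real-valued (unrounded) stratum sample sizes.
    """
    if n > sum(M):
        raise ValueError("total sample size exceeds the sum of upper bounds")

    alloc = [0.0] * len(A)
    active = set(range(len(A)))   # strata not yet fixed at their bound
    remaining = float(n)          # sample still to be distributed

    while active:
        # Neyman-type allocation restricted to the active strata.
        s = remaining / sum(A[h] for h in active)
        over = [h for h in active if s * A[h] >= M[h]]
        if not over:
            for h in active:
                alloc[h] = s * A[h]
            break
        # Fix violating strata at their upper bounds and recurse on the rest.
        for h in over:
            alloc[h] = float(M[h])
            remaining -= M[h]
            active.remove(h)
    return alloc


# Hypothetical example: the second stratum is capped at 40 units.
capped = recursive_neyman(120, A=[300, 600, 300], M=[100, 40, 100])
```

In this hypothetical example the capped stratum is fixed at its bound in the first pass, and the sample it cannot absorb is redistributed over the other two strata in the second pass.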

Second, the function optcost() determines minimum-cost allocations for a specified target variance of a stratified estimator. This formulation is useful when survey planners seek to minimize survey cost while maintaining a prescribed level of precision. The implementation relies on the LRNA algorithm, which efficiently handles upper-bound constraints and heterogeneous unit costs.
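The minimum-cost formulation has a closed-form solution when no upper bounds are active, which the sketch below illustrates. It assumes the standard stratified variance form V(x) = sum(A_h^2 / x_h) - A0 and a linear cost sum(c_h * x_h); the function names and argument names are hypothetical and do not reproduce the optcost() interface or the LRNA algorithm.

```python
import math


def min_cost_allocation(V, A, A0, c):
    """Minimum-cost allocation for a target variance V, without bounds
    (illustrative sketch).

    Assumes variance sum(A_h**2 / x_h) - A0 and cost sum(c_h * x_h).
    The Lagrange conditions give x_h proportional to A_h / sqrt(c_h).
    """
    scale = sum(a * math.sqrt(ch) for a, ch in zip(A, c)) / (V + A0)
    return [scale * a / math.sqrt(ch) for a, ch in zip(A, c)]


def stratified_variance(x, A, A0):
    """Variance of the stratified estimator under the assumed form."""
    return sum(a * a / xh for a, xh in zip(A, x)) - A0


# Hypothetical inputs: the second stratum is four times as costly per unit.
A, A0, c = [300, 600, 300], 3000, [1, 4, 1]
x = min_cost_allocation(9000, A, A0, c)
achieved = stratified_variance(x, A, A0)  # should reproduce the target variance
```

Strata with a higher unit cost receive fewer units than Neyman allocation would give them, while the target variance is met exactly.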

Third, the package includes methods for multi-domain optimum allocation with controlled precision. The function dopt() computes allocations that balance precision requirements across multiple domains while respecting total sample size and population constraints. The problem is solved using the RDCA algorithm, which provides exact solutions and, to the best of the author’s knowledge, constitutes the first well-established exact algorithm for this class of allocation problems.

The talk will introduce the optimization formulations underlying these problems and demonstrate how they can be solved using stratallo. Several examples will illustrate the use of the package in practical survey design scenarios, including cases with large numbers of strata and domains. The package also provides artificial population datasets that facilitate benchmarking and experimentation.

The stratallo package aims to make advanced optimal allocation methods easily accessible within the R ecosystem, enabling survey practitioners and researchers to implement principled and computationally efficient sampling designs.

If you used AI tools or services to support the preparation of this submission, please state the name and reason for using each of them.

ChatGPT (OpenAI): used for grammar and style checking and for language suggestions, as I am not a native English speaker.

Keywords: survey sampling, optimum sample allocation
Virtual Option: This submission is for onsite presentation only
Video Recording: Please don't share recordings of my talk

Author

Wojciech Wójciak (Warsaw University of Technology)

Co-authors

Prof. Jacek Wesołowski (Warsaw University of Technology)
Dr Robert Wieczorkowski (Statistics Poland)
