Survey sampling

In statistics, survey sampling describes the process of selecting a sample of elements from a target population to conduct a survey. The term "survey" may refer to many different types or techniques of observation. In survey sampling it most often involves a questionnaire used to measure the characteristics and/or attitudes of people. The different ways of contacting members of a sample once they have been selected are the subject of survey data collection. The purpose of sampling is to reduce the cost and/or the amount of work that it would take to survey the entire target population. A survey that measures the entire target population is called a census. A sample is a group or section of a population from which information is to be obtained.

Survey samples can be broadly divided into two types: probability samples and non-probability samples. Probability-based samples implement a sampling plan with specified probabilities (perhaps adapted probabilities specified by an adaptive procedure). Probability-based sampling allows design-based inference about the target population. The inferences are based on a known objective probability distribution that was specified in the study protocol. Inferences from probability-based surveys may still suffer from many types of bias.
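As an illustration of design-based inference, the following Python sketch weights each sampled value by the inverse of its specified inclusion probability (the Horvitz-Thompson estimator of a population total). The population values, the probabilities, the function names, and the choice of independent (Poisson) selection are assumptions made for this example only; they are not taken from the sources cited here.

```python
import random
from itertools import product


def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total by weighting each sampled value by 1/p."""
    return sum(y / p for y, p in zip(values, inclusion_probs))


def poisson_sample(inclusion_probs, rng):
    """Select each unit independently with its own specified probability."""
    return [i for i, p in enumerate(inclusion_probs) if rng.random() < p]


def design_expectation(population, probs):
    """Average the estimator over every possible sample under the design.

    Enumerates all 2**N samples, so it is only practical for tiny populations.
    """
    expectation = 0.0
    for indicators in product([0, 1], repeat=len(population)):
        sample_prob = 1.0
        for chosen, p in zip(indicators, probs):
            sample_prob *= p if chosen else (1.0 - p)
        values = [y for y, c in zip(population, indicators) if c]
        weights = [p for p, c in zip(probs, indicators) if c]
        expectation += sample_prob * horvitz_thompson_total(values, weights)
    return expectation


# Hypothetical population values and inclusion probabilities (illustrative only).
population = [12.0, 7.0, 30.0, 18.0, 5.0, 22.0]
probs = [0.5, 0.25, 0.8, 0.5, 0.25, 0.8]

rng = random.Random(0)  # fixed seed so the draw is reproducible
sample_idx = poisson_sample(probs, rng)
estimate = horvitz_thompson_total(
    [population[i] for i in sample_idx],
    [probs[i] for i in sample_idx],
)
print("one-sample estimate:", estimate)
print("design expectation:", design_expectation(population, probs))
print("true total:", sum(population))
```

Any single sample's estimate can be far from the true total, but averaged over all samples the design can produce, the estimator equals the true total. That property depends only on the specified selection probabilities, not on any model of the population values.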

Surveys that are not based on probability sampling have greater difficulty measuring their bias or sampling error.[1] Surveys based on non-probability samples often fail to represent the people in the target population.[2]

In academic and government survey research, probability sampling is a standard procedure. In the United States, the Office of Management and Budget's "List of Standards for Statistical Surveys" states that federally funded surveys must be performed:

selecting samples using generally accepted statistical methods (e.g., probabilistic methods that can provide estimates of sampling error). Any use of nonprobability sampling methods (e.g., cut-off or model-based samples) must be justified statistically and be able to measure estimation error.[3]

Random sampling and design-based inference are supplemented by other statistical methods, such as model-assisted sampling and model-based sampling.[4][5]

For example, many surveys have substantial amounts of nonresponse. Even though the units are initially chosen with known probabilities, the nonresponse mechanisms are unknown. For surveys with substantial nonresponse, statisticians have proposed statistical models with which the data sets are analyzed.
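One simple model of this kind is a weighting-class adjustment. The Python sketch below assumes that, within each class, respondents and nonrespondents are alike, and inflates the respondents' weights so that they also stand in for the nonrespondents. The class labels, weights, and function name are hypothetical, and this is only one of many adjustments in use, not necessarily the approach of the works cited here.

```python
from collections import defaultdict


def adjust_for_nonresponse(units):
    """Return respondents with weights inflated within each weighting class.

    Each unit is a dict with keys 'cls' (weighting class), 'weight'
    (design weight, i.e. inverse selection probability) and 'responded'.
    Assumes response is unrelated to the survey variable within a class.
    """
    sampled_weight = defaultdict(float)
    respondent_weight = defaultdict(float)
    for unit in units:
        sampled_weight[unit["cls"]] += unit["weight"]
        if unit["responded"]:
            respondent_weight[unit["cls"]] += unit["weight"]

    adjusted = []
    for unit in units:
        if unit["responded"]:
            # Respondents carry the weight of the nonrespondents in their class.
            factor = sampled_weight[unit["cls"]] / respondent_weight[unit["cls"]]
            adjusted.append({**unit, "weight": unit["weight"] * factor})
    return adjusted


# Hypothetical sample: class A has 2 of 4 respondents, class B has 1 of 2.
units = [
    {"cls": "A", "weight": 10.0, "responded": True},
    {"cls": "A", "weight": 10.0, "responded": True},
    {"cls": "A", "weight": 10.0, "responded": False},
    {"cls": "A", "weight": 10.0, "responded": False},
    {"cls": "B", "weight": 5.0, "responded": True},
    {"cls": "B", "weight": 5.0, "responded": False},
]
adjusted_units = adjust_for_nonresponse(units)
for unit in adjusted_units:
    print(unit["cls"], unit["weight"])
```

A class with no respondents at all contributes nothing to the adjusted sample, so in practice such classes would need to be merged with others before adjusting. If the within-class assumption is wrong, the adjusted estimates remain biased, which is why the choice of model matters.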

Issues related to survey sampling are discussed in several sources, including Salant and Dillman (1994).[6]

  1. ^ "Non-Probability Sampling - AAPOR". www.aapor.org. Retrieved 2020-05-24.
  2. ^ Weisberg, Herbert F. (2005), The Total Survey Error Approach, University of Chicago Press: Chicago. p.231.
  3. ^ "Archived copy" (PDF). Office of Management and Budget. Retrieved 2009-06-17 – via National Archives.
  4. ^ Lohr. Brewer. Swedes
  5. ^ Richard Valliant, Alan H. Dorfman, and Richard M. Royall (2000), Finite Population Sampling and Inference: A Prediction Approach, Wiley, New York, p. 19
  6. ^ Salant, Priscilla; Dillman, Don A. (1994), How to Conduct Your Own Survey.