Methods
Healthy Kids Colorado Survey (HKCS) [9]
HKCS is a biennial statewide survey on the health and well-being of young Coloradans. The sampling and data-analysis methods of HKCS are aligned with those of the Youth Risk Behavior Survey conducted by the Centers for Disease Control and Prevention [10], which has been administered on a two-year cycle since 1991. The 2019 HKCS high school dataset was used as our analytical baseline dataset because of its relatively high response rates (83.4% for schools, 82.1% for classes, and 71.1% for students). The survey was administered from September to December of 2019 and included over 120 questions in domains such as physical activity, nutrition, bullying, substance use, school and teacher connections, mental health, and sexual behaviors.
In the first stage, 199 high schools (primary sampling units) were systematically sampled from 21 health statistic regions (strata). Of those, 33 sampled high schools refused to participate, resulting in a school nonresponse rate of 16.6%. In the second stage, four or more classes were selected within each participating school, and all students within the sampled classes were recruited. Of 3634 sampled classes, 651 failed to participate (class-level nonresponse rate of 17.9%). A total of 65,468 9th–12th grade students were sampled; of those, 18,931 failed to complete the surveys (third-stage student nonresponse rate of 28.9%). Weights were constructed to account for the selection probabilities, nonresponse, and differences in demographic distribution between the sample and the population of Colorado’s high school students [11, 12]. Weighting factors included: the school base weight (\(W_{1}\)); a school nonresponse adjustment factor (\(F_{1}\)); the classroom selection weight (\(W_{2}\)); a classroom nonresponse factor (\(F_{2}\)); an adjustment factor accounting for student nonresponse (\(F_{3}\)); and a post-stratification factor adjusting for the difference between the sample and the population (\(F_{4}\)). The final weights are the products of the base weights and adjustment factors (final weight = \(W_{1} F_{1} \times W_{2} F_{2} F_{3} \times F_{4}\)), with extreme weights trimmed.
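The weighting scheme above can be sketched in code. This is only an illustration of the multiplicative structure, not the actual HKCS weighting program: the numeric factor values and the cap-style trimming rule below are hypothetical assumptions (the paper states only that extreme weights were trimmed).

```python
def final_weight(w1, f1, w2, f2, f3, f4):
    """Final weight = W1*F1 x W2*F2*F3 x F4 (before trimming)."""
    return w1 * f1 * w2 * f2 * f3 * f4

def trim_weights(weights, cap):
    """Trim extreme weights at a cap.

    The cap rule is an assumption for illustration; the paper does not
    specify how extreme weights were trimmed."""
    return [min(w, cap) for w in weights]

# Hypothetical example: a school base weight of 4.0, modest nonresponse
# adjustments at each stage, and a post-stratification factor near 1.
w = final_weight(w1=4.0, f1=1.2, w2=3.0, f2=1.1, f3=1.4, f4=0.95)
```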
Baseline dataset
The sample included 46,537 9th–12th graders from 2983 classes in 166 high schools across 21 health statistic regions. Twelve survey questions across several domains, and their constructed binary variables, were included in the baseline dataset for illustration: (1) Been active 60 min on 5+ of the past 7 days; (2) Had 1+ drinks in the past 30 days; (3) Ate breakfast on all of the past 7 days; (4) Been bullied at school in the past 12 months; (5) Fought 1+ times in the past 12 months; (6) Described grades as mostly A’s or B’s over the past 12 months; (7) Used marijuana 1+ times in the past 30 days; (8) Never/rarely wore a seat belt; (9) Ever had sex; (10) Slept 8+ hours on an average school night; (11) Smoked on 1+ days in the past 30 days; (12) Attempted suicide 1+ times in the past 12 months. Weighted prevalence estimates for these indicators in the baseline state dataset ranged from slightly below 10% to above 70%.
Simulation of nonresponding schools, classes, and students
Simulation was used to generate datasets with nonresponding schools, classes, and students at different rates. For example, to simulate first-stage school nonresponse, 5%, 10%, 20%, 30%, 40%, 50%, or 60% of schools were randomly dropped from the baseline dataset. The simulation was repeated 1000 times at each nonresponse rate, creating 7000 datasets across the seven school nonresponse rates. A pre-compiled macro program was applied to construct survey weights for each simulated dataset, with the nonresponse adjustment factor (\(F_{1}\)) and post-stratification factor (\(F_{4}\)) recalculated so that the sum of the weights in each simulated dataset was identical to that of the original baseline state dataset. Similar procedures were used to simulate second-stage class nonresponse and third-stage student nonresponse.
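As a rough sketch of the first-stage dropping procedure (the actual analysis used a pre-compiled SAS macro; this Python version is only an illustration, and the integer school identifiers are placeholders):

```python
import random

def simulate_school_nonresponse(schools, rate, rng):
    """Randomly drop a `rate` fraction of schools to mimic
    first-stage (school) nonresponse."""
    n_drop = round(len(schools) * rate)
    dropped = set(rng.sample(schools, n_drop))
    return [s for s in schools if s not in dropped]

rng = random.Random(42)      # fixed seed for reproducibility
schools = list(range(166))   # 166 participating schools in the baseline dataset
rates = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60]

# 1000 replicates per rate -> 7 x 1000 = 7000 simulated datasets
simulated = {r: [simulate_school_nonresponse(schools, r, rng)
                 for _ in range(1000)]
             for r in rates}
```

The subsequent reweighting step, recalibrating \(F_{1}\) and \(F_{4}\) so the weights in each simulated dataset sum to the baseline total, is omitted here.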
Statistical analysis
Weighted prevalence and the corresponding 95% confidence interval (CI) for each of the 12 binary outcome variables were estimated for each of the 21,000 simulated datasets. For each outcome variable, the mean margin of error of the point prevalence estimates at each nonresponse rate was calculated. For instance, to calculate the mean margin of error for the outcome variable “Attempted suicide” at a 10% school nonresponse rate, the margin of error was obtained as the half-width of each CI and then averaged across the 1000 simulated datasets for that nonresponse rate. The mean margins of error were plotted and compared at each nonresponse rate. Figure 1 is a flowchart illustrating the simulation and data analysis procedure. The simulation and survey data analysis were performed using SAS 9.4 (SAS Institute, Cary, NC) [13].
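The margin-of-error computation can be illustrated with a small sketch. The paper's analysis used design-based variance estimation in SAS; the CI formula below is a simplified weighted approximation using an effective-sample-size adjustment (an assumption, not the authors' estimator), included only to show how the half-width is extracted and averaged.

```python
import math

def weighted_prevalence_ci(values, weights, z=1.96):
    """Weighted prevalence with a simplified 95% CI.

    A design-based estimator would account for strata and clustering;
    the effective-sample-size approximation here stands in purely for
    illustration."""
    total = sum(weights)
    p = sum(v * w for v, w in zip(values, weights)) / total
    n_eff = total ** 2 / sum(w * w for w in weights)  # effective sample size
    se = math.sqrt(p * (1 - p) / n_eff)
    return p, (p - z * se, p + z * se)

def mean_margin_of_error(cis):
    """Average the CI half-widths across simulated datasets."""
    return sum((hi - lo) / 2 for lo, hi in cis) / len(cis)
```

For each outcome and nonresponse rate, `mean_margin_of_error` would be applied to the 1000 CIs from that rate's simulated datasets.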
Results
The mean margin of error increased with increasing nonresponse rates in the simulated data with nonresponse at the first (school), second (class), and third (student) stages. However, at the same nonresponse rate, the mean margin of error was greater for data with higher-stage nonresponse (i.e., nonresponding schools) than for data with lower-stage nonresponse (i.e., nonresponding classes or students). Furthermore, as nonresponse rates increased, the margin of error grew more steeply for data with higher-stage nonresponse.
Because schools, classes, and students were randomly dropped from the baseline dataset, the point prevalence estimates did not fluctuate substantially across the different nonresponse rates in the simulated data. However, the increase in the margin of error was larger for survey items with higher prevalence (e.g., school grades vs. current smoking). The mean margins of error for the 12 outcomes are shown in Fig. 2.
Discussion
Although both the risk of biased estimates from nonresponse and the inverse relationship between sample size and variance are well known, the effect of nonresponse at different sampling stages on variance estimation has not been thoroughly assessed. This study used a simulation approach to assess the impact of nonresponse at different sampling stages on variance estimation for survey data.
The findings from the simulation indicated that, at identical nonresponse rates, higher-stage nonresponse impaired the precision of the estimates more than lower-stage nonresponse. Furthermore, the resulting difference in variance was greater at higher nonresponse rates and more pronounced for survey items with higher prevalence estimates (e.g., school grades).
Our findings reinforce existing knowledge regarding the variance of survey data and further reveal that nonresponse at different sampling stages has different impacts on the precision of estimates. They highlight the need to place extra emphasis and resources on recruitment at higher sampling stages (e.g., primary sampling units), especially for surveys whose instruments involve common or non-rare items. Although the simulation is based on data from a single survey with categorical and ordinal responses, the evidence provided in this study can be generalized to other multistage survey studies with similar structures and outcome types.