

I'm currently working with education data that was collected in different geographic regions. The data come with weights and strata, so the survey design has to be declared with Stata's svyset. However, when I use it, I get an error related to "Missing standard error because of stratum with single sampling unit."

According to the Stata documentation, there are three different ways of dealing with this:

The first one, singleunit(certainty), treats strata with singleton PSUs as certainty units, so those strata contribute nothing to the standard error.

The second option, singleunit(scaled), is a scaled version of singleunit(certainty). The scaling factor comes from using the average of the variances from the strata with multiple sampling units for each stratum with a singleton PSU.

The third option, singleunit(centered), specifies that strata with singleton PSUs be centered at the grand mean instead of the stratum mean.

I do not understand how to pick between these techniques. I've tried each of them and can confirm that the standard errors differ depending on which technique is selected. Given that the data were collected with the intent that all strata be used, the first option (certainty) seems inappropriate. All of the data are represented (I did not exclude observations; I only created a subpopulation for analysis), so either scaled or centered seems viable. Which option is statistically justifiable, and how does one pick?

When searching for help with this, I find plenty about how each option works but little about why one method is appropriate for given circumstances. My question is not about the underlying code, only about which scenarios each technique is most appropriate for, and how to tell when to use each.
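
To make the difference between the three options concrete, here is a minimal Python sketch of the ideas as described above, applied to the variance of an estimated total in a one-stage stratified design. It is an illustration under stated assumptions, not Stata's implementation: the function names and the toy data are hypothetical, finite population corrections are ignored, and the exact form of the "centered" contribution (squared deviation of the lone PSU total from the grand mean of all PSU totals) is my assumption.

```python
from statistics import mean


def stratum_variance(psu_totals):
    """Variance contribution of one stratum with at least two PSUs.

    psu_totals holds the weighted total for each PSU in the stratum.
    With a single PSU the n / (n - 1) factor is undefined, which is
    why a singleton stratum yields a missing standard error.
    """
    n = len(psu_totals)
    if n < 2:
        raise ValueError("stratum has a single sampling unit")
    center = mean(psu_totals)
    return n / (n - 1) * sum((y - center) ** 2 for y in psu_totals)


def total_variance(strata, method):
    """Variance of the estimated total under one singleton-PSU treatment."""
    multi = [s for s in strata if len(s) > 1]
    single = [s for s in strata if len(s) == 1]
    if not multi:
        raise ValueError("need at least one stratum with multiple PSUs")
    base = sum(stratum_variance(s) for s in multi)

    if method == "certainty":
        # Singleton strata contribute nothing.
        return base
    if method == "scaled":
        # Each singleton stratum is assigned the average contribution
        # of the strata that have multiple PSUs.
        return base + len(single) * base / len(multi)
    if method == "centered":
        # Assumption: the lone PSU total is compared with the grand mean
        # of all PSU totals instead of its own stratum mean.
        grand_mean = mean([y for s in strata for y in s])
        return base + sum((s[0] - grand_mean) ** 2 for s in single)
    raise ValueError(f"unknown method: {method}")


# Hypothetical PSU totals: two ordinary strata and one singleton stratum.
strata = [[10.0, 14.0], [20.0, 26.0], [30.0]]

for method in ("certainty", "scaled", "centered"):
    print(method, total_variance(strata, method))
```

On this toy data the variance is 52 under certainty, 78 under scaled, and 152 under centered. In the sketch, certainty can only understate the variance relative to the other two, scaled assumes the singleton stratum is about as variable as a typical stratum, and centered grows when the singleton stratum sits far from the overall mean.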
