Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
Stratified sampling
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
==Disadvantages== It would be a misapplication of the technique to make subgroups' sample sizes proportional to the amount of data available from the subgroups, rather than scaling sample sizes to subgroup sizes (or to their variances, if known to vary significantly—e.g. using an [[F-test|F test]]). Data representing each subgroup are taken to be of equal importance if suspected variation among them warrants stratified sampling. If subgroup variances differ significantly and the data needs to be stratified by variance, it is not possible to simultaneously make each subgroup sample size proportional to subgroup size within the total population. For an efficient way to partition sampling resources among groups that vary in their means, variance and costs, see [[Sample size#Stratified sample size|"optimum allocation"]]. The problem of stratified sampling in the case of unknown class priors (ratio of subpopulations in the entire population) can have a deleterious effect on the performance of any analysis on the dataset, e.g. classification.<ref name=minimax-sampling/> In that regard, [[minimax|minimax sampling ratio]] can be used to make the dataset robust with respect to uncertainty in the underlying data generating process.<ref name=minimax-sampling/> Combining sub-strata to ensure adequate numbers can lead to [[Simpson's paradox]], where trends that exist in different groups of data disappear or even reverse when the groups are combined.
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)