===Randomization-based models===
{{Main|Randomization}}
{{See also|Random sample|Random assignment}}

For a given dataset that was produced by a randomization design, the randomization distribution of a statistic (under the null hypothesis) is defined by evaluating the test statistic for all of the plans that could have been generated by the randomization design. In frequentist inference, the randomization allows inferences to be based on the randomization distribution rather than a subjective model, and this is important especially in survey sampling and design of experiments.<ref>[[Jerzy Neyman|Neyman, J.]] (1934) "On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection", ''[[Journal of the Royal Statistical Society]]'', 97 (4), 557–625 {{JSTOR|2342192}}</ref><ref name="Hinkelmann and Kempthorne2">Hinkelmann and Kempthorne (2008) {{page needed|date=June 2011}}</ref> Statistical inference from randomized studies is also more straightforward than in many other situations.<ref>ASA Guidelines for the first course in statistics for non-statisticians. (available at the ASA website)</ref><ref>[[David A. Freedman]] et al.'s ''Statistics''.</ref><ref>Moore et al. (2015).</ref> In [[Bayesian inference]], randomization is also of importance: in [[survey sampling]], use of [[sampling without replacement]] ensures the [[exchangeability]] of the sample with the population; in randomized experiments, randomization warrants a [[missing at random]] assumption for [[covariate]] information.<ref>[[Andrew Gelman|Gelman A.]] et al. (2013). ''Bayesian Data Analysis'' ([[Chapman & Hall]]).</ref>

Objective randomization allows properly inductive procedures.<ref>Peirce (1877–1878)</ref><ref>Peirce (1883)</ref>{{sfn|Freedman|Pisani|Purves|1978}}<ref>[[David A. Freedman]] ''Statistical Models''.</ref><ref>[[C. R. Rao|Rao, C.R.]] (1997) ''Statistics and Truth: Putting Chance to Work'', World Scientific. {{isbn|981-02-3111-3}}</ref> Many statisticians prefer randomization-based analysis of data that was generated by well-defined randomization procedures.<ref>Peirce; Freedman; Moore et al. (2015).{{Citation needed|date=March 2010}}</ref> (However, it is true that in fields of science with developed theoretical knowledge and experimental control, randomized experiments may increase the costs of experimentation without improving the quality of inferences.<ref>Box, G.E.P. and Friends (2006) ''Improving Almost Anything: Ideas and Essays, Revised Edition'', Wiley. {{isbn|978-0-471-72755-2}}</ref><ref>Cox (2006), p. 196.</ref>) Similarly, results from [[randomized experiment]]s are recommended by leading statistical authorities as allowing inferences with greater reliability than do observational studies of the same phenomena.<ref>ASA Guidelines for the first course in statistics for non-statisticians. (available at the ASA website); [[David A. Freedman]] et al.'s ''Statistics''; Moore et al. (2015).</ref> However, a good observational study may be better than a bad randomized experiment.

The statistical analysis of a randomized experiment may be based on the randomization scheme stated in the experimental protocol and does not need a subjective model.<ref>Neyman, Jerzy. 1923 [1990]. "On the Application of Probability Theory to Agricultural Experiments. Essay on Principles. Section 9." ''Statistical Science'' 5 (4): 465–472. Trans. [[Dorota Dabrowska|Dorota M. Dabrowska]] and Terence P. Speed.</ref><ref>Hinkelmann & Kempthorne (2008) {{page needed|date=June 2011}}</ref>

However, at any time some hypotheses cannot be tested using objective statistical models that accurately describe randomized experiments or random samples. In some cases, such randomized studies are uneconomical or unethical.
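The logic of randomization-based analysis can be illustrated with a small randomization (permutation) test. The sketch below is purely illustrative and not taken from the cited sources: the data, the group sizes, and the choice of the difference in means as the test statistic are hypothetical. The randomization distribution of the statistic is obtained by re-evaluating it over every treatment assignment that the completely randomized design could have produced.

<syntaxhighlight lang="python">
# Illustrative randomization (permutation) test with hypothetical data.
# The randomization distribution of the mean difference is built by
# evaluating the statistic for every treatment assignment the design
# could have generated; the p-value is the share of assignments giving
# a difference at least as extreme as the one observed.
import itertools
import numpy as np

# Hypothetical responses from a completely randomized two-group experiment.
treated = np.array([23.1, 25.4, 27.9, 24.6, 26.3])
control = np.array([21.0, 22.7, 20.5, 23.8, 22.1])
observed = treated.mean() - control.mean()

pooled = np.concatenate([treated, control])
n_treated = len(treated)

# Enumerate all possible assignments (feasible here; larger experiments
# would instead sample assignments at random).
stats = []
for idx in itertools.combinations(range(len(pooled)), n_treated):
    mask = np.zeros(len(pooled), dtype=bool)
    mask[list(idx)] = True
    stats.append(pooled[mask].mean() - pooled[~mask].mean())
stats = np.array(stats)

# Two-sided p-value from the randomization distribution.
p_value = np.mean(np.abs(stats) >= abs(observed))
print(f"observed difference = {observed:.2f}, randomization p-value = {p_value:.3f}")
</syntaxhighlight>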
==== Model-based analysis of randomized experiments ====
It is standard practice to refer to a statistical model, e.g., a linear or logistic model, when analyzing data from randomized experiments.<ref name="Dinov Palanimalai Khare Christou 20182">{{cite journal |last1=Dinov |first1=Ivo |last2=Palanimalai |first2=Selvam |last3=Khare |first3=Ashwini |last4=Christou |first4=Nicolas |date=2018 |title=Randomization-based statistical inference: A resampling and simulation infrastructure |journal=Teaching Statistics |volume=40 |issue=2 |pages=64–73 |doi=10.1111/test.12156 |pmc=6155997 |pmid=30270947}}</ref> However, the randomization scheme guides the choice of a statistical model, and it is not possible to choose an appropriate model without knowing the randomization scheme.<ref name="Hinkelmann and Kempthorne2" /> Seriously misleading results can be obtained by analyzing data from randomized experiments while ignoring the experimental protocol; common mistakes include forgetting the blocking used in an experiment and confusing repeated measurements on the same experimental unit with independent replicates of the treatment applied to different experimental units.<ref>Hinkelmann and Kempthorne (2008) Chapter 6.</ref>
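As a purely illustrative sketch of this point (the data, effect sizes, and the use of the <code>statsmodels</code> package are assumptions for the example, not part of the cited sources), the analysis below fits a model that includes the block factor of a randomized complete block design and contrasts it with a model that ignores the blocking:

<syntaxhighlight lang="python">
# Hypothetical randomized complete block design: 6 blocks, one treated
# and one control unit per block, with large block-to-block variation.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
blocks = np.repeat(np.arange(6), 2)           # block labels 0..5, two units each
treatment = np.tile([0, 1], 6)                # one treated unit per block
block_effect = rng.normal(0, 3, 6)[blocks]    # block-to-block variation
y = 10 + 2 * treatment + block_effect + rng.normal(0, 1, 12)

df = pd.DataFrame({"y": y, "treatment": treatment, "block": blocks})

# Analysis matching the blocked randomization scheme.
with_blocks = smf.ols("y ~ treatment + C(block)", data=df).fit()
# Analysis that forgets the blocking used in the experiment.
without_blocks = smf.ols("y ~ treatment", data=df).fit()

print("with blocks:   ", with_blocks.params["treatment"], with_blocks.bse["treatment"])
print("without blocks:", without_blocks.params["treatment"], without_blocks.bse["treatment"])
</syntaxhighlight>

With substantial block-to-block variation, the model that omits the block factor typically reports a much larger standard error for the treatment effect, illustrating why the randomization scheme should guide the choice of model.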
==== Model-free randomization inference ====
Model-free techniques provide a complement to model-based methods, which employ reductionist strategies of reality-simplification. The former combine, evolve, ensemble and train algorithms dynamically adapting to the contextual affinities of a process and learning the intrinsic characteristics of the observations.<ref name="Dinov Palanimalai Khare Christou 2018">{{cite journal |last1=Dinov |first1=Ivo |last2=Palanimalai |first2=Selvam |last3=Khare |first3=Ashwini |last4=Christou |first4=Nicolas |date=2018 |title=Randomization-based statistical inference: A resampling and simulation infrastructure |journal=Teaching Statistics |volume=40 |issue=2 |pages=64–73 |doi=10.1111/test.12156 |pmid=30270947 |pmc=6155997}}</ref><ref name="Tang model-based Model-Free 2019">{{cite journal |last1=Tang |first1=Ming |last2=Gao |first2=Chao |last3=Goutman |first3=Stephen |last4=Kalinin |first4=Alexandr |last5=Mukherjee |first5=Bhramar |last6=Guan |first6=Yuanfang |last7=Dinov |first7=Ivo |date=2019 |title=Model-Based and Model-Free Techniques for Amyotrophic Lateral Sclerosis Diagnostic Prediction and Patient Clustering |journal=Neuroinformatics |volume=17 |issue=3 |pages=407–421 |doi=10.1007/s12021-018-9406-9 |pmid=30460455 |pmc=6527505}}</ref>

For example, model-free simple linear regression is based either on:
* a ''random design'', where the pairs of observations <math>(X_1,Y_1), (X_2,Y_2), \cdots , (X_n,Y_n)</math> are independent and identically distributed (iid), or
* a ''deterministic design'', where the variables <math>X_1, X_2, \cdots, X_n</math> are deterministic, but the corresponding response variables <math>Y_1,Y_2, \cdots, Y_n</math> are random and independent with a common conditional distribution, i.e., <math>P\left (Y_j \leq y | X_j =x\right ) = D_x(y)</math>, which is independent of the index <math>j</math>.

In either case, model-free randomization inference for features of the common conditional distribution <math>D_x(\cdot)</math> relies on some regularity conditions, e.g., functional smoothness. For instance, the population feature ''conditional mean'', <math>\mu(x)=E(Y | X = x)</math>, can be consistently estimated via local averaging or local polynomial fitting, under the assumption that <math>\mu(x)</math> is smooth. Also, relying on asymptotic normality or resampling, we can construct confidence intervals for the population feature, in this case the ''conditional mean'' <math>\mu(x)</math>.<ref name="Politis Model-Free Inference 2019">{{cite journal |last1=Politis |first1=D.N. |date=2019 |title=Model-free inference in statistics: how and why |journal=IMS Bulletin |volume=48 |url=http://bulletin.imstat.org/2015/11/model-free-inference-in-statistics-how-and-why/}}</ref>
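As a minimal illustrative sketch of these ideas (hypothetical data, an assumed Gaussian kernel and bandwidth, and a basic pairs bootstrap; none of these choices is prescribed by the cited source), the conditional mean <math>\mu(x)</math> can be estimated by local averaging and given a resampling-based confidence interval as follows:

<syntaxhighlight lang="python">
# Local-averaging (Nadaraya-Watson) estimate of mu(x) = E(Y | X = x)
# with a bootstrap confidence interval from resampling the (X, Y) pairs.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical iid pairs from a "random design".
n = 200
x = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)

def local_average(x0, x, y, h=0.1):
    """Kernel-weighted average of y near x0 (Gaussian kernel, bandwidth h)."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    return np.sum(w * y) / np.sum(w)

x0 = 0.5
estimate = local_average(x0, x, y)

# Pairs bootstrap for a resampling-based 95% confidence interval for mu(x0).
boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)
    boot.append(local_average(x0, x[idx], y[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])

print(f"mu({x0}) estimate = {estimate:.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
</syntaxhighlight>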