General linear model
{{Short description|Statistical linear model}}
{{Distinguish|text=[[Multiple linear regression]], [[Generalized linear model]] or [[General linear methods]]}}
{{Regression bar}}
The '''general linear model''' or '''general multivariate regression model''' is a compact way of simultaneously writing several [[multiple linear regression]] models. In that sense it is not a separate statistical [[linear model]]. The various multiple linear regression models may be compactly written as<ref name="MardiaK1979Multivariate">{{Cite book |last1=Mardia |first1=K. V. |author1-link=Kanti Mardia |last2=Kent |first2=J. T. |last3=Bibby |first3=J. M. |year=1979 |title=Multivariate Analysis |publisher=[[Academic Press]] |isbn=0-12-471252-5}}</ref>

: <math>\mathbf{Y} = \mathbf{X}\mathbf{B} + \mathbf{U},</math>

where '''Y''' is a [[Matrix (mathematics)|matrix]] with a series of multivariate measurements (each column being a set of measurements on one of the [[dependent variable]]s), '''X''' is a matrix of observations on [[independent variable]]s that might be a [[design matrix]] (each column being a set of observations on one of the independent variables), '''B''' is a matrix containing parameters that are usually to be estimated, and '''U''' is a matrix containing [[Errors and residuals in statistics|errors]] (noise). The errors are usually assumed to be uncorrelated across measurements and to follow a [[multivariate normal distribution]]. If the errors do not follow a multivariate normal distribution, [[generalized linear model]]s may be used to relax assumptions about '''Y''' and '''U'''.

The general linear model encompasses several statistical models, including [[Analysis of variance|ANOVA]], [[Analysis of covariance|ANCOVA]], [[Multivariate analysis of variance|MANOVA]], [[Multivariate analysis of covariance|MANCOVA]], and ordinary [[linear regression]]. Within this framework, both the [[t-test|''t''-test]] and the [[F-test|''F''-test]] can be applied.
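As an illustration of the matrix form '''Y''' = '''XB''' + '''U''', the following minimal sketch simulates data and estimates '''B''' by ordinary least squares. The use of NumPy, the matrix sizes, the noise level, the random seed and all variable names are assumptions made for this example only; they are not taken from the cited sources. The sketch also fits each column of '''Y''' separately to show that the joint fit can be read as several multiple linear regressions sharing one design matrix.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (assumption) so the sketch is reproducible

n, p, m = 50, 3, 2                   # observations, predictors, dependent variables (illustrative)
X = rng.normal(size=(n, p))          # design matrix: one column per independent variable
B_true = rng.normal(size=(p, m))     # hypothetical "true" parameter matrix
U = 0.1 * rng.normal(size=(n, m))    # error matrix (noise)
Y = X @ B_true + U                   # general linear model: Y = XB + U

# Joint least-squares estimate of B for all columns of Y at once.
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# The same estimate obtained one dependent variable (column of Y) at a time.
B_cols = np.column_stack(
    [np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(m)]
)

print(B_hat.shape)                  # (3, 2): one column of parameters per dependent variable
print(np.allclose(B_hat, B_cols))   # joint and column-wise fits agree up to rounding
```

The column-wise loop is included only for comparison; in practice a single call with the full matrix '''Y''' is sufficient.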
The general linear model is a generalization of multiple linear regression to the case of more than one dependent variable. If '''Y''', '''B''', and '''U''' were [[column vector]]s, the matrix equation above would represent multiple linear regression.

Hypothesis tests with the general linear model can be made in two ways: [[multivariate statistics|multivariate]] or as several independent [[univariate]] tests. In multivariate tests the columns of '''Y''' are tested together, whereas in univariate tests the columns of '''Y''' are tested independently, i.e., as multiple univariate tests with the same design matrix.

== Comparison to multiple linear regression ==
{{Further|Multiple linear regression}}
Multiple linear regression is a generalization of [[simple linear regression]] to the case of more than one independent variable, and a [[special case]] of general linear models, restricted to one dependent variable. The basic model for multiple linear regression is

:<math> Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \ldots + \beta_p X_{ip} + \epsilon_i</math> or more compactly <math>Y_i = \beta_0 + \sum \limits_{k=1}^{p} {\beta_k X_{ik}} + \epsilon_i</math>

for each observation ''i'' = 1, ... , ''n''.

In the formula above we consider ''n'' observations of one dependent variable and ''p'' independent variables. Thus, ''Y''<sub>''i''</sub> is the ''i''<sup>th</sup> observation of the dependent variable, and ''X''<sub>''ik''</sub> is the ''i''<sup>th</sup> observation of the ''k''<sup>th</sup> independent variable, ''k'' = 1, 2, ..., ''p''. The values ''β''<sub>''k''</sub> represent parameters to be estimated, and ''ε''<sub>''i''</sub> is the ''i''<sup>th</sup> independent identically distributed normal error.
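The single-response model above can be sketched numerically as follows. This is a minimal illustration, assuming NumPy and made-up values for ''n'', ''p'', the coefficients and the noise level; none of these come from the cited sources. The intercept ''β''<sub>0</sub> is handled by prepending a column of ones to the matrix of observations.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed (assumption) for reproducibility

n, p = 100, 2                              # observations and independent variables (illustrative)
X_raw = rng.normal(size=(n, p))            # X_raw[i, k-1] plays the role of X_ik
beta_true = np.array([1.0, 2.0, -0.5])     # hypothetical beta_0, beta_1, beta_2
eps = 0.05 * rng.normal(size=n)            # i.i.d. normal errors

# Y_i = beta_0 + sum_k beta_k * X_ik + eps_i
y = beta_true[0] + X_raw @ beta_true[1:] + eps

# Prepend a column of ones so the intercept is estimated as the first coefficient.
X = np.column_stack([np.ones(n), X_raw])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residuals of the fitted model; with an intercept they average to zero.
residuals = y - X @ beta_hat

print(beta_hat)  # estimates of beta_0, beta_1, beta_2
```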
In the more general multivariate linear regression, there is one equation of the above form for each of ''m'' > 1 dependent variables that share the same set of explanatory variables and hence are estimated simultaneously with each other:

:<math> Y_{ij} = \beta_{0j} + \beta_{1j} X_{i1} + \beta_{2j}X_{i2} + \ldots + \beta_{pj} X_{ip} + \epsilon_{ij}</math> or more compactly <math>Y_{ij} = \beta_{0j} + \sum \limits_{k=1}^{p} { \beta_{kj} X_{ik}} + \epsilon_{ij}</math>

for all observations indexed as ''i'' = 1, ... , ''n'' and for all dependent variables indexed as ''j'' = 1, ... , ''m''.

Note that, since each dependent variable has its own set of regression parameters to be fitted, from a computational point of view the general multivariate regression is simply a sequence of standard multiple linear regressions using the same explanatory variables.

== Comparison to generalized linear model ==
The general linear model and the [[generalized linear model]] (GLM)<ref name=":0">{{Cite book |last1=McCullagh |first1=P. |author1-link=Peter McCullagh |last2=Nelder |first2=J. A. |author2-link=John Nelder |date=January 1, 1983 |chapter=An outline of generalized linear models |title=Generalized Linear Models |pages=21–47 |publisher=Springer US |isbn=9780412317606 |doi=10.1007/978-1-4899-3242-6_2 |doi-broken-date=13 December 2024}}</ref><ref>Fox, J. (2015). ''Applied regression analysis and generalized linear models''. Sage Publications.</ref> are two commonly used families of [[Statistics|statistical methods]] to relate some number of continuous and/or categorical [[Dependent and independent variables|predictors]] to a single [[Dependent and independent variables|outcome variable]]. The main difference between the two approaches is that the general linear model strictly assumes that the [[Errors and residuals|residuals]] will follow a [[Conditional probability distribution|conditionally]] [[normal distribution]],<ref name=":1">{{cite report |last1=Cohen |first1=J.
|last2=Cohen |first2=P. |last3=West |first3=S. G. |last4=Aiken |first4=L. S. |author4-link=Leona S. Aiken |date=2003 |title=Applied multiple regression/correlation analysis for the behavioral sciences}}</ref> while the GLM loosens this assumption and allows for a variety of other [[Distribution (mathematics)|distributions]] from the [[exponential family]] for the residuals.<ref name=":0"/> The general linear model is a special case of the GLM in which the distribution of the residuals follows a conditionally normal distribution.

The distribution of the residuals largely depends on the type and distribution of the outcome variable; different types of outcome variables lead to the variety of models within the GLM family. Commonly used models in the GLM family include [[Logistic regression|binary logistic regression]]<ref>Hosmer Jr, D. W., Lemeshow, S., & Sturdivant, R. X. (2013). ''Applied logistic regression'' (Vol. 398). John Wiley & Sons.</ref> for binary or dichotomous outcomes, [[Poisson regression]]<ref>{{cite journal |last1=Gardner |first1=W. |last2=Mulvey |first2=E. P. |last3=Shaw |first3=E. C. |date=1995 |title=Regression analyses of counts and rates: Poisson, overdispersed Poisson, and negative binomial models |journal=Psychological Bulletin |volume=118 |issue=3 |pages=392–404 |doi=10.1037/0033-2909.118.3.392 |pmid=7501743}}</ref> for count outcomes, and [[linear regression]] for continuous, normally distributed outcomes. This means that GLM may be spoken of as a general family of statistical models or as specific models for specific outcome types.

{| class="wikitable"
!
!General linear model
![[Generalized linear model]]
|-
|Typical estimation method
|[[Least squares]], [[best linear unbiased prediction]]
|[[Maximum likelihood]] or [[Bayesian probability|Bayesian]]
|-
|Examples
|[[ANOVA]], [[ANCOVA]], [[linear regression]]
|[[linear regression]], [[logistic regression]], [[Poisson regression]], gamma regression,<ref name=":02">{{cite book |last1=McCullagh |first1=Peter |author1-link=Peter McCullagh |last2=Nelder |first2=John |author2-link=John Nelder |year=1989 |title=Generalized Linear Models |edition=2nd |publisher=Boca Raton: Chapman and Hall/CRC |isbn=978-0-412-31760-6 |ref=McCullagh1989}}</ref> general linear model
|-
|Extensions and related methods
|[[Multivariate analysis of variance|MANOVA]], [[Multivariate analysis of covariance|MANCOVA]], [[Mixed model|linear mixed model]]
|[[generalized linear mixed model]] (GLMM), [[Generalized estimating equation|generalized estimating equations]] (GEE)
|-
|[[R (programming language)|R]] package and function
|[https://stat.ethz.ch/R-manual/R-devel/library/stats/html/lm.html lm()] and manova() in stats package (base R)
|[https://stat.ethz.ch/R-manual/R-devel/library/stats/html/glm.html glm()] in stats package (base R)
|-
|[[MATLAB]] function
|mvregress()
|glmfit()
|-
|[[SAS (software)|SAS]] procedures
|[https://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#glm_toc.htm PROC GLM], [https://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#reg_toc.htm PROC REG]
|[https://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#genmod_toc.htm PROC GENMOD], [https://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#logistic_toc.htm PROC LOGISTIC] (for binary & ordered or unordered categorical outcomes)
|-
|[[Stata]] command
|regress
|glm
|-
|[[SPSS]] command
|[https://stats.idre.ucla.edu/spss/output/regression-analysis/ regression], [https://stats.idre.ucla.edu/spss/library/spss-librarymanova-and-glm-2/ glm]
|genlin, logistic
|-
|[[Wolfram Language]] & [[Mathematica]] function
|LinearModelFit[]<ref>[http://reference.wolfram.com/language/ref/LinearModelFit.html LinearModelFit], Wolfram Language Documentation Center.</ref>
|GeneralizedLinearModelFit[]<ref>[http://reference.wolfram.com/language/ref/GeneralizedLinearModelFit.html GeneralizedLinearModelFit], Wolfram Language Documentation Center.</ref>
|-
|[[EViews]] command
|ls<ref>[http://www.eviews.com/help/helpintro.html#page/content%2Fcommandcmd-ls.html ls], EViews Help.</ref>
|glm<ref>[http://www.eviews.com/help/helpintro.html#page/content%2Fcommandcmd-glm.html glm], EViews Help.</ref>
|-
|statsmodels Python package
|[https://www.statsmodels.org/dev/user-guide.html#regression-and-linear-models regression-and-linear-models]
|[https://www.statsmodels.org/dev/glm.html GLM]
|}

== Applications ==
An application of the general linear model appears in the analysis of multiple [[brain scan]]s in scientific experiments, where {{var|Y}} contains data from brain scanners and {{var|X}} contains experimental design variables and confounds. It is usually tested in a univariate way (usually referred to as ''mass-univariate'' in this setting) and is often referred to as [[statistical parametric mapping]].<ref>{{Cite journal |last1=Friston |first1=K.J. |last2=Holmes |first2=A.P. |last3=Worsley |first3=K.J. |last4=Poline |first4=J.-B. |last5=Frith |first5=C.D. |last6=Frackowiak |first6=R.S.J.
|year=1995 |title=Statistical Parametric Maps in functional imaging: A general linear approach |journal=Human Brain Mapping |volume=2 |pages=189–210 |issue=4 |doi=10.1002/hbm.460020402 |s2cid=9898609}}</ref>

== See also ==
* [[Bayesian multivariate linear regression]]
* [[F-test]]
* [[t-test]]

== Notes ==
{{Reflist}}

== References ==
* {{cite book |last1=Christensen |first1=Ronald |year=2020 |title=Plane Answers to Complex Questions: The Theory of Linear Models |location=New York |publisher=Springer |edition=5th |isbn=978-3-030-32096-6}}
* {{cite book |last1=Wichura |first1=Michael J. |year=2006 |title=The coordinate-free approach to linear models |series=Cambridge Series in Statistical and Probabilistic Mathematics |publisher=Cambridge University Press |location=Cambridge |pages=xiv+199 |isbn=978-0-521-86842-6 |mr=2283455}}
* {{Cite book |editor1-last=Rawlings |editor1-first=John O. |editor2-last=Pantula |editor2-first=Sastry G. |editor3-last=Dickey |editor3-first=David A. |year=1998 |title=Applied Regression Analysis |series=Springer Texts in Statistics |isbn=0-387-98454-2 |doi=10.1007/b98890}}

{{Statistics}}

[[Category:Regression models]]