===Criteria for determining the number of factors===
Researchers wish to avoid such subjective or arbitrary criteria for factor retention as "it made sense to me". A number of objective methods have been developed to solve this problem, allowing users to determine an appropriate range of solutions to investigate.<ref name="Zwick1986">{{cite journal |last1=Zwick |first1=William R. |last2=Velicer |first2=Wayne F. |title=Comparison of five rules for determining the number of components to retain. |journal=Psychological Bulletin |date=1986 |volume=99 |issue=3 |pages=432–442 |doi=10.1037/0033-2909.99.3.432}}</ref> However, these methods often disagree with one another as to the number of factors that ought to be retained. For instance, [[parallel analysis]] may suggest 5 factors while Velicer's MAP suggests 6, so the researcher may request both the 5- and 6-factor solutions and discuss each in terms of their relation to external data and theory.

====Modern criteria====
[[Horn's parallel analysis]] (PA):<ref name="Horn1965">{{cite journal |last1=Horn |first1=John L. |title=A rationale and test for the number of factors in factor analysis |journal=Psychometrika |date=June 1965 |volume=30 |issue=2 |pages=179–185 |doi=10.1007/BF02289447 |pmid=14306381 |s2cid=19663974}}</ref> A Monte Carlo–based simulation method that compares the observed eigenvalues with those obtained from uncorrelated normal variables. A factor or component is retained if the associated eigenvalue is greater than the 95th percentile of the distribution of eigenvalues derived from the random data.
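As an illustration, parallel analysis can be sketched in a few lines of Python with NumPy. This is a minimal sketch, not the implementation used by any particular package; the function name and defaults are illustrative.

```python
import numpy as np

def parallel_analysis(data, n_sims=1000, percentile=95, seed=0):
    """Horn's parallel analysis: retain components whose observed
    eigenvalues exceed the chosen percentile of eigenvalues obtained
    from uncorrelated normal data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Observed eigenvalues of the correlation matrix, descending.
    obs_eigs = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    # Eigenvalues from simulated uncorrelated normal data.
    rand_eigs = np.empty((n_sims, p))
    for i in range(n_sims):
        sim = rng.standard_normal((n, p))
        rand_eigs[i] = np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False))[::-1]
    # Per-position percentile threshold (95th by default).
    threshold = np.percentile(rand_eigs, percentile, axis=0)
    return int(np.sum(obs_eigs > threshold))
```

For data generated from a single strong common factor, the observed first eigenvalue far exceeds the random threshold while the rest fall below it, so one factor is retained.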
PA is among the more commonly recommended rules for determining the number of components to retain,<ref name="Zwick1986" /><ref>{{Cite arXiv|last=Dobriban|first=Edgar|date=2017-10-02|title=Permutation methods for factor analysis and PCA|class=math.ST|language=en|eprint=1710.00479v2}}</ref> but many programs fail to include this option (a notable exception being [[R (programming language)|R]]).<ref>{{cite journal | last1 = Ledesma | first1 = R.D. | last2 = Valero-Mora | first2 = P. | year = 2007 | title = Determining the Number of Factors to Retain in EFA: An easy-to-use computer program for carrying out Parallel Analysis | url = http://pareonline.net/getvn.asp?v=12&n=2 | journal = Practical Assessment Research & Evaluation | volume = 12 | issue = 2 | pages = 1–11 }}</ref> However, [[Anton Formann|Formann]] provided both theoretical and empirical evidence that its application might not be appropriate in many cases, since its performance is considerably influenced by [[sample size]], [[Item response theory#The item response function|item discrimination]], and type of [[correlation coefficient]].<ref>Tran, U. S., & Formann, A. K. (2009). Performance of parallel analysis in retrieving unidimensionality in the presence of binary data. ''Educational and Psychological Measurement, 69,'' 50–61.</ref>

Velicer's (1976) MAP test<ref name=Velicer>{{cite journal|last=Velicer|first=W.F.|title=Determining the number of components from the matrix of partial correlations|journal=Psychometrika|year=1976|volume=41|issue=3|pages=321–327|doi=10.1007/bf02293557|s2cid=122907389}}</ref> as described by Courtney (2013)<ref name="pareonline.net">Courtney, M. G. R. (2013). Determining the number of factors to retain in EFA: Using the SPSS R-Menu v2.0 to make more judicious estimations. Practical Assessment, Research and Evaluation, 18(8).
Available online: http://pareonline.net/getvn.asp?v=18&n=8 {{Webarchive|url=https://web.archive.org/web/20150317145450/http://pareonline.net/getvn.asp?v=18&n=8 |date=2015-03-17 }}</ref> "involves a complete principal components analysis followed by the examination of a series of matrices of partial correlations" (p. 397; though this quote does not occur in Velicer (1976), and the cited page number lies outside the page range of that article). The squared correlation for Step "0" (see Figure 4) is the average squared off-diagonal correlation of the unpartialed correlation matrix. On Step 1, the first principal component and its associated items are partialed out, and the average squared off-diagonal correlation of the resulting correlation matrix is computed. On Step 2, the first two principal components are partialed out and the resultant average squared off-diagonal correlation is again computed. The computations are carried out for ''k'' − 1 steps (''k'' representing the total number of variables in the matrix). Finally, the average squared correlations for all steps are compared, and the step number yielding the lowest average squared partial correlation determines the number of components or factors to retain.<ref name=Velicer/> By this method, components are retained as long as the variance in the correlation matrix represents systematic variance, as opposed to residual or error variance. Although methodologically akin to principal components analysis, the MAP technique has been shown to perform quite well in determining the number of factors to retain in multiple simulation studies.<ref name="Zwick1986" /><ref name="Warne, R. T. 2014"/><ref name=Ruscio>{{cite journal|last=Ruscio|first=John|author2=Roche, B.|title=Determining the number of factors to retain in an exploratory factor analysis using comparison data of known factorial structure|journal=Psychological Assessment|year=2012|volume=24|issue=2|pages=282–292|doi=10.1037/a0025697|pmid=21966933}}</ref><ref name=Garrido>Garrido, L. E., Abad, F. J., & Ponsoda, V. (2012). A new look at Horn's parallel analysis with ordinal variables. Psychological Methods. Advance online publication. {{doi|10.1037/a0030005}}</ref> This procedure is made available through SPSS's user interface,<ref name="pareonline.net"/> as well as the ''psych'' package for the [[R (programming language)|R programming language]].<ref>{{cite journal |last1=Revelle |first1=William |title=Determining the number of factors: the example of the NEO-PI-R |date=2007 |url=http://www.personality-project.org/r/book/numberoffactors.pdf}}</ref><ref>{{cite web |last1=Revelle |first1=William |title=psych: Procedures for Psychological, Psychometric, and Personality Research |url=https://cran.r-project.org/web/packages/psych/ |date=8 January 2020}}</ref>

==== Older methods ====
Kaiser criterion: The Kaiser rule is to drop all components with eigenvalues under 1.0, this being the eigenvalue equal to the information accounted for by an average single item.<ref name="Kaiser1960">{{cite journal |last1=Kaiser |first1=Henry F. |title=The Application of Electronic Computers to Factor Analysis |journal=Educational and Psychological Measurement |date=April 1960 |volume=20 |issue=1 |pages=141–151 |doi=10.1177/001316446002000116 |s2cid=146138712}}</ref> The Kaiser criterion is the default in [[SPSS]] and most [[statistical software]], but it is not recommended as the sole cut-off criterion for estimating the number of factors, as it tends to over-extract factors.<ref>{{cite book |first1=D.L. |last1=Bandalos |first2=M.R.
|last2=Boehm-Kaufman |chapter=Four common misconceptions in exploratory factor analysis |editor1-first=Charles E. |editor1-last=Lance |editor2-first=Robert J. |editor2-last=Vandenberg |title=Statistical and Methodological Myths and Urban Legends: Doctrine, Verity and Fable in the Organizational and Social Sciences |chapter-url=https://books.google.com/books?id=KFAnkvqD8CgC&pg=PA61 |year=2008 |publisher=Taylor & Francis |isbn=978-0-8058-6237-9 |pages=61–87}}</ref> A variation of this method has been created in which the researcher calculates [[confidence interval]]s for each eigenvalue and retains only factors whose entire confidence interval lies above 1.0.<ref name="Warne, R. T. 2014">{{cite journal | last1 = Warne | first1 = R. T. | last2 = Larsen | first2 = R. | year = 2014 | title = Evaluating a proposed modification of the Guttman rule for determining the number of factors in an exploratory factor analysis | journal = Psychological Test and Assessment Modeling | volume = 56 | pages = 104–123 }}</ref><ref>{{cite journal | last1 = Larsen | first1 = R. | last2 = Warne | first2 = R. T. | year = 2010 | title = Estimating confidence intervals for eigenvalues in exploratory factor analysis | journal = Behavior Research Methods | volume = 42 | issue = 3 | pages = 871–876 | doi = 10.3758/BRM.42.3.871 | pmid = 20805609 | doi-access = free }}</ref>

[[Scree plot]]:<ref>{{cite journal|first1=Raymond |last1=Cattell|journal=Multivariate Behavioral Research|volume=1|number=2|pages=245–276|year=1966|title=The scree test for the number of factors|doi=10.1207/s15327906mbr0102_10|pmid=26828106}}</ref> The Cattell scree test plots the components on the X-axis and the corresponding [[eigenvalue]]s on the [[Y-axis]]. As one moves to the right, toward later components, the eigenvalues drop. When the drop ceases and the curve makes an elbow toward a less steep decline, Cattell's scree test says to drop all further components after the one at the elbow.
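Although the scree test is applied by eye, the eigenvalue sequence it inspects, and the Kaiser count on those same values, can be computed directly. A minimal sketch in Python with NumPy; the function names are illustrative, not from any particular package.

```python
import numpy as np

def scree_eigenvalues(corr):
    """Eigenvalues of a correlation matrix in descending order,
    i.e. the sequence plotted on a scree plot."""
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

def kaiser_count(corr):
    """Kaiser rule: the number of eigenvalues exceeding 1.0,
    the variance of a single standardized variable."""
    return int(np.sum(scree_eigenvalues(corr) > 1.0))
```

For example, a 3-variable correlation matrix with all off-diagonal correlations equal to 0.5 has eigenvalues 2.0, 0.5, 0.5, so the scree curve has a sharp elbow after the first component and the Kaiser rule likewise retains one.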
The scree test is sometimes criticised for being amenable to researcher-controlled "[[Wiktionary:fudge factor|fudging]]". That is, because picking the "elbow" can be subjective (the curve may have multiple elbows or be smooth), the researcher may be tempted to set the cut-off at the number of factors desired by their research agenda.{{Citation needed|date=March 2016}}

Variance explained criteria: Some researchers simply use the rule of keeping enough factors to account for 90% (sometimes 80%) of the variation. Where the researcher's goal emphasizes [[Occam's razor|parsimony]] (explaining variance with as few factors as possible), the criterion could be as low as 50%.

==== Bayesian methods ====
By placing a [[Prior probability|prior distribution]] over the number of latent factors and then applying Bayes' theorem, Bayesian models can return a [[probability distribution]] over the number of latent factors. This has been modeled using the [[Indian buffet process]],<ref>{{cite book|author=Alpaydin|year=2020|title=Introduction to Machine Learning|edition=5th|pages=528–529}}</ref> but can be modeled more simply by placing any discrete prior (e.g. a [[negative binomial distribution]]) on the number of components.
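The variance-explained criterion discussed above reduces to a cumulative-sum threshold on the same eigenvalues used by the other rules. A minimal sketch in Python with NumPy; the function name and default threshold are illustrative.

```python
import numpy as np

def n_factors_variance(corr, threshold=0.80):
    """Smallest number of components whose eigenvalues together
    account for at least `threshold` of the total variance
    (the trace of the correlation matrix)."""
    eigs = np.sort(np.linalg.eigvalsh(corr))[::-1]
    cumulative = np.cumsum(eigs) / eigs.sum()
    # Index of the first cumulative proportion reaching the threshold.
    return int(np.searchsorted(cumulative, threshold) + 1)
```

For a 2-variable correlation matrix with correlation 0.8, the eigenvalues are 1.8 and 0.2; an 80% criterion retains one component (1.8/2 = 90% explained), while a 95% criterion retains both.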