== Sensitivity analysis methods ==
There are a large number of approaches to performing a sensitivity analysis, many of which have been developed to address one or more of the constraints discussed above. They are also distinguished by the type of sensitivity measure, be it based on (for example) [[Variance-based sensitivity analysis|variance decompositions]], [[partial derivatives]] or [[elementary effects method|elementary effects]]. In general, however, most procedures adhere to the following outline:
# Quantify the uncertainty in each input (e.g. ranges, probability distributions). Note that this can be difficult and many methods exist to elicit uncertainty distributions from subjective data.<ref>{{cite book |last=O'Hagan |first=A. |title=Uncertain Judgements: Eliciting Experts' Probabilities |publisher=Wiley |location=Chichester |year=2006 |isbn=9780470033302 |url=https://books.google.com/books?id=H9KswqPWIDQC |display-authors=etal}}</ref>
# Identify the model output to be analysed (the target of interest should ideally have a direct relation to the problem tackled by the model).
# Run the model a number of times using some [[design of experiments]],<ref>{{cite journal |last1=Sacks |first1=J. |first2=W. J. |last2=Welch |first3=T. J. |last3=Mitchell |first4=H. P. |last4=Wynn |year=1989 |title=Design and Analysis of Computer Experiments |journal=Statistical Science |volume=4 |issue=4 |pages=409–435 |doi=10.1214/ss/1177012413 |doi-access=free}}</ref> dictated by the method of choice and the input uncertainty.
# Using the resulting model outputs, calculate the sensitivity measures of interest.

In some cases this procedure will be repeated, for example in high-dimensional problems where the user has to screen out unimportant variables before performing a full sensitivity analysis. The various types of "core methods" (discussed below) are distinguished by the sensitivity measures they calculate; these categories are not mutually exclusive and can overlap. Alternative ways of obtaining these measures, subject to the constraints of the problem, are also available. In addition, an engineering view of the methods that takes into account four important sensitivity analysis parameters has also been proposed.<ref>{{cite book |vauthors=Da Veiga S, Gamboa F, Iooss B, Prieur C |date=2021 |title=Basics and Trends in Sensitivity Analysis |publisher=SIAM |url=https://blackwells.co.uk/bookshop/product/Basics-and-Trends-in-Sensitivity-Analysis-by-Sbastien-da-Veiga-Fabrice-Gamboa-Bertrand-Iooss-Clmentine-Prieur/9781611976687 |doi=10.1137/1.9781611976694 |isbn=978-1-61197-668-7}}</ref>
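As a rough illustration of this outline (not any particular published method), the four steps might be sketched in Python as follows; the model <code>f</code>, the uniform input ranges and the sample size are arbitrary assumptions made for the example:
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Step 1: quantify the uncertainty in each input (here: uniform ranges).
bounds = np.array([[-1.0, 1.0],   # X1
                   [ 0.0, 2.0]])  # X2

# Step 2: identify the model output to be analysed (a hypothetical model).
def f(x):
    return np.sin(x[:, 0]) + 0.5 * x[:, 1] ** 2

# Step 3: run the model over a simple Monte Carlo design.
n = 1000
X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(n, 2))
Y = f(X)

# Step 4: calculate a (crude) sensitivity measure from the model outputs,
# here the sample correlation between each input and the output.
for i in range(2):
    print(f"corr(X{i+1}, Y) = {np.corrcoef(X[:, i], Y)[0, 1]:.2f}")
</syntaxhighlight>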
=== Visual analysis ===
[[File:Scatter plots for sensitivity analysis bis.jpg|thumb|right|upright=2|Figure 2. Sampling-based sensitivity analysis by scatterplots. ''Y'' (vertical axis) is a function of four factors. The points in the four scatterplots are always the same though sorted differently, i.e. by ''Z''<sub>1</sub>, ''Z''<sub>2</sub>, ''Z''<sub>3</sub>, ''Z''<sub>4</sub> in turn. Note that the abscissa is different for each plot: (−5, +5) for ''Z''<sub>1</sub>, (−8, +8) for ''Z''<sub>2</sub>, (−10, +10) for ''Z''<sub>3</sub> and ''Z''<sub>4</sub>. ''Z''<sub>4</sub> is most important in influencing ''Y'' as it imparts more 'shape' on ''Y''.]]
The first intuitive approach (especially useful in less complex cases) is to analyze the relationship between each input <math>Z_i</math> and the output <math>Y</math> using scatter plots, and to observe the behavior of these pairs. The diagrams give an initial idea of the correlation and of which input has an impact on the output. Figure 2 shows an example where two inputs, <math>Z_3</math> and <math>Z_4</math>, are highly correlated with the output.
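A minimal sketch of such scatterplots (the four-input model below is a made-up stand-in, chosen only so that one input visibly imparts more 'shape' on the output):
<syntaxhighlight lang="python">
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
Z = rng.uniform(-1.0, 1.0, size=(1000, 4))   # four hypothetical inputs
Y = Z[:, 3] ** 2 + 0.1 * Z[:, 0]             # Z4 dominates the response

fig, axes = plt.subplots(1, 4, figsize=(12, 3), sharey=True)
for i, ax in enumerate(axes):
    ax.scatter(Z[:, i], Y, s=4)              # same points in every panel
    ax.set_xlabel(f"$Z_{i+1}$")
axes[0].set_ylabel("$Y$")
plt.show()
</syntaxhighlight>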
=== One-at-a-time (OAT) ===
{{main|One-factor-at-a-time method}}
One of the simplest and most common approaches is that of changing one factor at a time (OAT), to see what effect this produces on the output.<ref>{{cite journal |first=J. |last=Campbell |year=2008 |title=Photosynthetic Control of Atmospheric Carbonyl Sulfide During the Growing Season |journal=[[Science (journal)|Science]] |volume=322 |issue=5904 |pages=1085–1088 |doi=10.1126/science.1164015 |display-authors=etal |pmid=19008442 |bibcode=2008Sci...322.1085C |s2cid=206515456 |url=http://www.escholarship.org/uc/item/82r9s2x3}}</ref><ref>{{cite journal |first1=R. |last1=Bailis |first2=M. |last2=Ezzati |first3=D. |last3=Kammen |year=2005 |title=Mortality and Greenhouse Gas Impacts of Biomass and Petroleum Energy Futures in Africa |journal=[[Science (journal)|Science]] |volume=308 |issue=5718 |pages=98–103 |doi=10.1126/science.1106881 |pmid=15802601 |bibcode=2005Sci...308...98B |s2cid=14404609}}</ref><ref>{{cite journal |first=J. |last=Murphy |year=2004 |title=Quantification of modelling uncertainties in a large ensemble of climate change simulations |journal=[[Nature (journal)|Nature]] |volume=430 |issue=7001 |pages=768–772 |doi=10.1038/nature02771 |display-authors=etal |pmid=15306806 |bibcode=2004Natur.430..768M |s2cid=980153}}</ref> OAT customarily involves
* moving one input variable while keeping the others at their baseline (nominal) values, then
* returning the variable to its nominal value, then repeating for each of the other inputs in the same way.
Sensitivity may then be measured by monitoring changes in the output, e.g. by [[partial derivatives]] or [[linear regression]]. This appears a logical approach, as any change observed in the output will unambiguously be due to the single variable changed. Furthermore, by changing one variable at a time, one can keep all other variables fixed at their central or baseline values. This increases the comparability of the results (all 'effects' are computed with reference to the same central point in space) and minimizes the chance of computer program crashes, which are more likely when several input factors are changed simultaneously. OAT is frequently preferred by modelers for practical reasons: in case of model failure under an OAT analysis, the modeler immediately knows which input factor is responsible for the failure.

Despite its simplicity, however, this approach does not fully explore the input space, since it does not take into account the simultaneous variation of input variables. This means that the OAT approach cannot detect the presence of [[Interaction (statistics)|interactions]] between input variables and is unsuitable for nonlinear models.<ref>{{cite journal |last=Czitrom |first=Veronica |author-link=Veronica Czitrom |year=1999 |title=One-Factor-at-a-Time Versus Designed Experiments |journal=American Statistician |volume=53 |issue=2 |pages=126–131 |doi=10.2307/2685731 |jstor=2685731}}</ref> Moreover, the fraction of the input space explored by an OAT approach shrinks superexponentially with the number of inputs. For example, a 3-variable parameter space explored one-at-a-time is equivalent to taking points along the x, y, and z axes of a cube centered at the origin. The [[convex hull]] bounding all these points is an [[octahedron]] which has a volume only 1/6th of the total parameter space. More generally, the convex hull of the axes of a hyperrectangle forms a [[hyperoctahedron]] whose volume is a fraction <math>1/n!</math> of the whole. With 5 inputs, the explored space already drops to less than 1% of the total parameter space (since <math>1/5! = 1/120 \approx 0.8\%</math>). And even this is an overestimate, since the off-axis volume is not actually being sampled at all. Compare this to random sampling of the space, where the convex hull approaches the entire volume as more points are added.<ref>{{cite journal |last1=Gatzouras |first1=D |last2=Giannopoulos |first2=A |title=Threshold for the volume spanned by random points with independent coordinates |journal=[[Israel Journal of Mathematics]] |date=2009 |volume=169 |issue=1 |pages=125–153 |doi=10.1007/s11856-009-0007-z |doi-access=free}}</ref> While the sparsity of OAT is theoretically not a concern for [[linear model]]s, true linearity is rare in nature.
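A minimal OAT loop in Python (the model, baseline point and step size are arbitrary assumptions for the example):
<syntaxhighlight lang="python">
import numpy as np

def f(x):
    """Hypothetical model of three inputs, with an interaction term."""
    return x[0] ** 2 + x[1] * x[2]

x0 = np.array([1.0, 1.0, 1.0])    # baseline (nominal) point
step = 0.1                        # perturbation size

y0 = f(x0)
for i in range(len(x0)):
    x = x0.copy()
    x[i] += step                  # move one input, keep the others at baseline
    effect = (f(x) - y0) / step   # one-sided finite-difference 'effect'
    print(f"OAT effect of x{i+1}: {effect:.3f}")
</syntaxhighlight>
Because <code>f</code> contains the interaction term <code>x[1]*x[2]</code>, no sequence of such one-at-a-time moves can reveal how the effect of one input depends on the level of another.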
=== Morris ===
{{main|Morris method}}
Named after the statistician Max D. Morris, this method is suitable for screening systems with many parameters. It is also known as the method of elementary effects because it combines repeated steps along the various parametric axes.<ref>{{cite journal |vauthors=Morris MD |journal=Technometrics |title=Factorial Sampling Plans for Preliminary Computational Experiments |volume=33 |issue=2 |pages=161–174 |publisher=Taylor & Francis |date=1991 |doi=10.2307/1269043 |jstor=1269043}}</ref>

=== Derivative-based local methods ===
Local derivative-based methods involve taking the [[partial derivative]] of the output <math>Y</math> with respect to an input factor <math>X_i</math>:
:<math> \left| \frac{\partial Y}{\partial X_i} \right|_{\textbf{x}^0}, </math>
where the subscript '''x'''<sup>0</sup> indicates that the derivative is taken at some fixed point in the space of the input (hence the 'local' in the name of the class). Adjoint modelling<ref>{{cite book |last=Cacuci |first=Dan G. |title=Sensitivity and Uncertainty Analysis: Theory |volume=I |publisher=Chapman & Hall}}</ref><ref>{{cite book |last1=Cacuci |first1=Dan G. |first2=Mihaela |last2=Ionescu-Bujor |first3=Michael |last3=Navon |year=2005 |title=Sensitivity and Uncertainty Analysis: Applications to Large-Scale Systems |volume=II |publisher=Chapman & Hall}}</ref> and automatic differentiation<ref>{{cite book |last=Griewank |first=A. |year=2000 |title=Evaluating Derivatives, Principles and Techniques of Algorithmic Differentiation |publisher=SIAM}}</ref> are methods that allow all partial derivatives to be computed at a cost of at most 4–6 times that of evaluating the original function. Similar to OAT, local methods do not attempt to fully explore the input space, since they examine small perturbations, typically one variable at a time. Derivative-based sensitivities can also be used to select similar samples through neural networks and perform uncertainty quantification. One advantage of the local methods is that it is possible to form a matrix representing all the sensitivities in a system, thus providing an overview that cannot be achieved with global methods when there is a large number of input and output variables.<ref name="Possible">[https://ieeexplore.ieee.org/abstract/document/9206746 Kabir HD, Khosravi A, Nahavandi D, Nahavandi S. "Uncertainty Quantification Neural Network from Similarity and Sensitivity". 2020 International Joint Conference on Neural Networks (IJCNN), July 2020, pp. 1–8. IEEE.]</ref>
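When adjoint or automatic differentiation is unavailable, such a local sensitivity matrix can be approximated by central finite differences, as in this sketch (the model and the point '''x'''<sup>0</sup> are arbitrary assumptions for the example):
<syntaxhighlight lang="python">
import numpy as np

def f(x):
    """Hypothetical model with two outputs and three inputs."""
    return np.array([x[0] * x[1], np.sin(x[2]) + x[0]])

x0 = np.array([1.0, 2.0, 0.5])    # the fixed point x^0
h = 1e-6                          # finite-difference step

m, n = len(f(x0)), len(x0)        # numbers of outputs and inputs
S = np.zeros((m, n))              # S[j, i] approximates dY_j/dX_i at x^0
for i in range(n):
    e = np.zeros(n)
    e[i] = h
    S[:, i] = (f(x0 + e) - f(x0 - e)) / (2 * h)
print(S)
</syntaxhighlight>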
=== Regression analysis ===
[[Regression analysis]], in the context of sensitivity analysis, involves fitting a [[linear regression]] to the model response and using [[Standardized coefficient|standardized regression coefficients]] as direct measures of sensitivity. The regression is required to be linear with respect to the data (i.e. a hyperplane, hence with no quadratic terms, etc., as regressors), because otherwise it is difficult to interpret the standardised coefficients. This method is therefore most suitable when the model response is in fact linear; linearity can be confirmed, for instance, if the [[coefficient of determination]] is large. The advantages of regression analysis are that it is simple and has a low computational cost.
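A sketch of standardized regression coefficients (SRCs) computed with ordinary least squares on standardized variables; the linear test model and input distributions are arbitrary assumptions for the example:
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))                 # three hypothetical inputs
Y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.1, size=2000)

# Standardize, then fit: the OLS coefficients are the SRCs.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
Ys = (Y - Y.mean()) / Y.std()
src, *_ = np.linalg.lstsq(Xs, Ys, rcond=None)
print("SRCs:", src)                            # influence roughly in ratio 3:1:0

# A coefficient of determination close to 1 confirms the linear fit is adequate.
r2 = 1.0 - np.sum((Ys - Xs @ src) ** 2) / np.sum(Ys ** 2)
print("R^2:", r2)
</syntaxhighlight>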
=== Variance-based methods ===
{{Main|Variance-based sensitivity analysis}}
Variance-based methods<ref>{{cite journal |last1=Sobol' |first1=I |year=1990 |title=Sensitivity estimates for nonlinear mathematical models |journal=Matematicheskoe Modelirovanie |volume=2 |pages=112–118 |language=ru}}; translated in English in {{cite journal |last1=Sobol' |first1=I |year=1993 |title=Sensitivity analysis for non-linear mathematical models |journal=Mathematical Modeling & Computational Experiment |volume=1 |pages=407–414}}</ref> are a class of probabilistic approaches which quantify the input and output uncertainties as [[random variable]]s, represented via their [[probability distribution]]s, and decompose the output variance into parts attributable to input variables and combinations of variables. The sensitivity of the output to an input variable is therefore measured by the amount of variance in the output caused by that input. This amount is quantified and calculated using '''Sobol indices''', which represent the proportion of variance explained by an input or group of inputs. For an input <math>X_i</math>, the first of these indices is defined as follows:
<math display="block">S_i=\frac{V(\mathbb{E}[Y\vert X_i])}{V(Y)},</math>
where <math>V(\cdot)</math> and <math>\mathbb{E}[\cdot]</math> denote the variance and expected value operators, respectively. This expression essentially measures the contribution of <math>X_i</math> alone to the uncertainty (variance) in <math>Y</math> (averaged over variations in the other variables), and is known as the ''' ''first-order sensitivity index'' ''' (also called the ''main effect index'' or ''main Sobol index''). Importantly, the first-order sensitivity index of <math>X_i</math> does not measure the uncertainty caused by the interactions <math>X_i</math> has with other variables. A further measure, known as the ''' ''total effect index'' ''', gives the total variance in <math>Y</math> caused by <math>X_i</math> and its interactions with any of the other input variables:
<math display="block">S_i^T=1-\frac{V(\mathbb{E}[Y\vert X_{\sim i}])}{V(Y)},</math>
where <math>X_{\sim i} = (X_1,...,X_{i-1},X_{i+1},...,X_p)</math> denotes the set of all input variables except <math>X_i</math>.

Variance-based methods allow full exploration of the input space, accounting for interactions and nonlinear responses. For these reasons they are widely used when it is feasible to calculate them. Typically this calculation involves the use of [[Monte Carlo integration|Monte Carlo]] methods, but since this can involve many thousands of model runs, other methods (such as metamodels) can be used to reduce computational expense when necessary.
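A brute-force sketch of the first-order index <math>S_i</math>, estimating <math>V(\mathbb{E}[Y\vert X_i])</math> by binning the Monte Carlo samples on <math>X_i</math> (the model is an arbitrary assumption; dedicated estimators are far more efficient in practice):
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    """Hypothetical nonlinear model with an interaction term."""
    return np.sin(x[:, 0]) + 5.0 * x[:, 1] ** 2 + x[:, 0] * x[:, 2]

X = rng.uniform(-np.pi, np.pi, size=(100_000, 3))
Y = f(X)
var_Y = Y.var()

n_bins = 50
for i in range(3):
    # E[Y | X_i] approximated by the mean of Y within each quantile bin of X_i.
    edges = np.quantile(X[:, i], np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.digitize(X[:, i], edges) - 1, 0, n_bins - 1)
    cond_mean = np.array([Y[idx == b].mean() for b in range(n_bins)])
    print(f"S_{i+1} ~ {cond_mean.var() / var_Y:.3f}")
</syntaxhighlight>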
=== Moment-independent methods ===
Moment-independent methods extend variance-based techniques by considering the probability density or cumulative distribution function of the model output <math>Y</math>. Thus, they do not refer to any particular [[Moment (mathematics)|moment]] of <math>Y</math>, whence the name. The moment-independent sensitivity measures of <math>X_i</math>, here denoted by <math>\xi_i</math>, can be defined through an equation similar to variance-based indices, replacing the conditional expectation with a distance, as <math>\xi_i=E[d(P_Y,P_{Y|X_i})]</math>, where <math>d(\cdot,\cdot)</math> is a [[statistical distance]] (metric or divergence) between probability measures, and <math>P_Y</math> and <math>P_{Y|X_i}</math> are the marginal and [[conditional probability]] measures of <math>Y</math>.<ref name="Borgonovo2014">{{Cite journal |vauthors=Borgonovo E, Tarantola S, Plischke E, Morris MD |date=2014 |title=Transformations and invariance in the sensitivity analysis of computer experiments |journal=Journal of the Royal Statistical Society |series=Series B (Statistical Methodology) |volume=76 |issue=5 |pages=925–947 |doi=10.1111/rssb.12052 |issn=1369-7412}}</ref> If <math>d(\cdot,\cdot)\geq 0</math> is a [[Statistical distance|distance]], the moment-independent global sensitivity measure satisfies zero-independence, a relevant statistical property also known as Rényi's postulate D.<ref name="Renyi">{{Cite journal |last=Rényi |first=A |date=1 September 1959 |title=On measures of dependence |journal=Acta Mathematica Academiae Scientiarum Hungaricae |volume=10 |issue=3 |pages=441–451 |doi=10.1007/BF02024507 |issn=1588-2632}}</ref>

The class of moment-independent sensitivity measures includes indicators such as the <math>\delta</math>-importance measure,<ref name="Borgonovo2007">{{Cite journal |vauthors=Borgonovo E |date=June 2007 |title=A new uncertainty importance measure |journal=Reliability Engineering & System Safety |volume=92 |issue=6 |pages=771–784 |doi=10.1016/J.RESS.2006.04.015 |issn=0951-8320}}</ref> the new correlation coefficient of Chatterjee,<ref name="Chatterjee">{{Cite journal |vauthors=Chatterjee S |date=2 October 2021 |title=A New Coefficient of Correlation |journal=Journal of the American Statistical Association |volume=116 |issue=536 |pages=2009–2022 |arxiv=1909.10140 |doi=10.1080/01621459.2020.1758115 |issn=0162-1459}}</ref> the Wasserstein correlation of Wiesel,<ref name="Wiesel">{{Cite journal |vauthors=Wiesel JC |date=November 2022 |title=Measuring association with Wasserstein distances |journal=Bernoulli |volume=28 |issue=4 |pages=2816–2832 |arxiv=2102.00356 |doi=10.3150/21-BEJ1438 |issn=1350-7265}}</ref> and the kernel-based sensitivity measures of Barr and Rabitz.<ref name="Barr">{{Cite journal |vauthors=Barr J, Rabitz H |date=31 March 2022 |title=A Generalized Kernel Method for Global Sensitivity Analysis |journal=SIAM/ASA Journal on Uncertainty Quantification |publisher=Society for Industrial and Applied Mathematics |volume=10 |issue=1 |pages=27–54 |doi=10.1137/20M1354829}}</ref> Another measure for global sensitivity analysis, in the category of moment-independent approaches, is the PAWN index.<ref name="PAWN">{{Cite journal |vauthors=Pianosi F, Wagener T |date=2015 |title=A simple and efficient method for global sensitivity analysis based on cumulative distribution functions |journal=Environmental Modelling & Software |volume=67 |pages=1–11 |bibcode=2015EnvMS..67....1P |doi=10.1016/j.envsoft.2015.01.004 |doi-access=free}}</ref> {{citation needed span|It relies on [[cumulative distribution functions|Cumulative Distribution Functions]] (CDFs) to characterize the maximum distance between the unconditional output distribution and the conditional output distributions (obtained by varying all input parameters and by conditioning on the <math>i</math>-th input in turn). The difference between the unconditional and conditional output distributions is usually calculated using the [[Kolmogorov–Smirnov test]] (KS). The PAWN index for a given input parameter is then obtained by calculating a summary statistic over all KS values.|date=October 2024}}

=== Variogram analysis of response surfaces (''VARS'') ===
One of the major shortcomings of the previous sensitivity analysis methods is that none of them considers the spatially ordered structure of the response surface/output of the model <math>Y=f(X)</math> in the parameter space. By utilizing the concepts of directional [[variogram]]s and covariograms, variogram analysis of response surfaces (VARS) addresses this weakness by recognizing a spatially continuous correlation structure in the values of <math>Y</math>, and hence also in the values of <math>\frac{\partial Y}{\partial x_i}</math>.<ref>{{cite journal |last1=Razavi |first1=Saman |last2=Gupta |first2=Hoshin V. |title=A new framework for comprehensive, robust, and efficient global sensitivity analysis: 1. Theory |journal=Water Resources Research |date=January 2016 |volume=52 |issue=1 |pages=423–439 |doi=10.1002/2015WR017558 |language=en |issn=1944-7973 |bibcode=2016WRR....52..423R |doi-access=free}}</ref><ref>{{cite journal |last1=Razavi |first1=Saman |last2=Gupta |first2=Hoshin V. |title=A new framework for comprehensive, robust, and efficient global sensitivity analysis: 2. Application |journal=Water Resources Research |date=January 2016 |volume=52 |issue=1 |pages=440–455 |doi=10.1002/2015WR017559 |language=en |issn=1944-7973 |bibcode=2016WRR....52..440R |doi-access=free}}</ref> Essentially, the higher the variability, the more heterogeneous the response surface is along a particular direction/parameter at a specific perturbation scale. Accordingly, in the VARS framework, the values of directional variograms for a given perturbation scale can be considered as a comprehensive summary of sensitivity information, linking variogram analysis to both direction and perturbation scale concepts. As a result, the VARS framework accounts for the fact that sensitivity is a scale-dependent concept, and thus overcomes the scale issue of traditional sensitivity analysis methods.<ref>{{cite journal |last1=Haghnegahdar |first1=Amin |last2=Razavi |first2=Saman |title=Insights into sensitivity analysis of Earth and environmental systems models: On the impact of parameter perturbation scale |journal=Environmental Modelling & Software |date=September 2017 |volume=95 |pages=115–131 |doi=10.1016/j.envsoft.2017.03.031 |bibcode=2017EnvMS..95..115H}}</ref> More importantly, VARS is able to provide relatively stable and statistically robust estimates of parameter sensitivity at much lower computational cost than other strategies (about two orders of magnitude more efficient).<ref>{{cite book |last1=Gupta |first1=H |last2=Razavi |first2=S |editor1-last=Petropoulos |editor1-first=George |editor2-last=Srivastava |editor2-first=Prashant |title=Sensitivity Analysis in Earth Observation Modelling |date=2016 |isbn=9780128030318 |pages=397–415 |edition=1st |chapter-url=https://www.elsevier.com/books/sensitivity-analysis-in-earth-observation-modelling/petropoulos/978-0-12-803011-0 |language=en |chapter=Challenges and Future Outlook of Sensitivity Analysis |publisher=Elsevier}}</ref> Notably, it has been shown that there is a theoretical link between the VARS framework and the [[Variance-based sensitivity analysis|variance-based]] and derivative-based approaches.

=== Fourier amplitude sensitivity test (FAST) ===
{{Main|Fourier amplitude sensitivity testing}}
The Fourier amplitude sensitivity test (FAST) uses the [[Fourier series]] to represent a multivariate function (the model) in the frequency domain, using a single frequency variable. Therefore, the integrals required to calculate sensitivity indices become univariate, resulting in computational savings.

=== Shapley effects ===
Shapley effects rely on [[Shapley value]]s and represent the average marginal contribution of a given factor across all possible combinations of factors. These values are related to Sobol' indices: the Shapley effect of a factor falls between its first-order Sobol' effect and its total-order effect.<ref>{{cite journal |vauthors=Owen AB |journal=SIAM/ASA Journal on Uncertainty Quantification |title=Sobol' Indices and Shapley Value |volume=2 |issue=1 |pages=245–251 |publisher=Society for Industrial and Applied Mathematics |date=1 January 2014 |doi=10.1137/130936233}}</ref>

=== Chaos polynomials ===
The principle is to project the function of interest onto a basis of orthogonal polynomials. The Sobol indices are then expressed analytically in terms of the coefficients of this decomposition.<ref name="GSA Sudret">{{cite journal |last1=Sudret |first1=B. |year=2008 |title=Global sensitivity analysis using polynomial chaos expansions |url=http://www.sciencedirect.com/science/article/pii/S0951832007001329 |journal=Reliability Engineering & System Safety |volume=93 |issue=7 |pages=964–979 |doi=10.1016/j.ress.2007.04.002}}</ref>
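A minimal polynomial chaos sketch for two inputs uniform on (−1, 1), using an orthonormal [[Legendre polynomials|Legendre]] basis up to total degree 2 and least-squares projection (the model, degree and sample size are arbitrary assumptions for the example):
<syntaxhighlight lang="python">
import numpy as np
from numpy.polynomial import legendre

rng = np.random.default_rng(0)

def f(x):
    """Hypothetical model; inputs assumed uniform on (-1, 1)."""
    return x[:, 0] + 0.5 * x[:, 0] * x[:, 1] + x[:, 1] ** 2

def psi(x, n):
    """Legendre polynomial of degree n, orthonormal w.r.t. U(-1, 1)."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return legendre.legval(x, c) * np.sqrt(2 * n + 1)

# Multi-indices (degree in X1, degree in X2) up to total degree 2.
alphas = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]

X = rng.uniform(-1, 1, size=(500, 2))
Y = f(X)

# Least-squares estimate of the expansion coefficients.
A = np.column_stack([psi(X[:, 0], a) * psi(X[:, 1], b) for a, b in alphas])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)

# Sobol indices follow analytically from the coefficients.
var_Y = sum(c ** 2 for c, a in zip(coef, alphas) if a != (0, 0))
S1 = sum(c ** 2 for c, (a, b) in zip(coef, alphas) if a > 0 and b == 0) / var_Y
S2 = sum(c ** 2 for c, (a, b) in zip(coef, alphas) if b > 0 and a == 0) / var_Y
print(f"S1 ~ {S1:.3f}, S2 ~ {S2:.3f}")   # remainder is the interaction index
</syntaxhighlight>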