==Other formulations==
Laplace's approximation is sometimes written as
:<math>\int_a^b h(x) e^{M g(x)}\, dx \approx \sqrt{\frac{2\pi}{M|g''(x_0)|}} h(x_0) e^{M g(x_0)} \ \text{ as } M\to\infty</math>
where <math>h</math> is positive. Importantly, the accuracy of the approximation depends on the variable of integration, that is, on what stays in <math>g(x)</math> and what goes into <math>h(x)</math>.<ref>{{cite book |last=Butler |first=Ronald W. |date=2007 |title=Saddlepoint Approximations with Applications |publisher=Cambridge University Press |isbn=978-0-521-87250-8}}</ref>

{{hidden begin|border=1px #aaa solid|title={{center|Derivation of its relative error}}}}
First, set <math>x_0=0</math> to denote the global maximum, which simplifies the derivation. We are interested in the relative error <math>|R|</math>, defined by
:<math>\int_a^b h(x) e^{M g(x)}\, dx = h(0)e^{Mg(0)}s \underbrace{\int_{a/s}^{b/s}\frac{h(sy)}{h(0)}e^{M\left[ g(sy)-g(0) \right]} dy}_{1+R},</math>
where
:<math>s\equiv\sqrt{\frac{2\pi}{M\left| g''(0) \right|}}.</math>
If we let
:<math>A\equiv \frac{h(sy)}{h(0)}e^{M\left[g(sy)-g(0) \right]}</math>
and <math>A_0\equiv e^{-\pi y^2}</math>, then
:<math>\left| R\right| = \left| \int_{a/s}^{b/s}A\,dy -\int_{-\infty}^{\infty}A_0\,dy \right|</math>
since <math>\int_{-\infty}^{\infty}A_0\,dy =1</math>. For the upper bound, note that <math>|A+B| \le |A|+|B|,</math> so we can split this integration into five parts of three distinct types (a), (b) and (c).
Therefore,
:<math>|R| < \underbrace{\left| \int_{D_y}^{\infty}A_0 \,dy \right|}_{(a_1)} + \underbrace{\left| \int_{D_y}^{b/s}A \,dy \right|}_{(b_1)}+ \underbrace{\left| \int_{-D_y}^{D_y}\left(A-A_0\right) dy \right|}_{(c)} + \underbrace{\left| \int_{a/s}^{-D_y}A \,dy \right|}_{(b_2)} + \underbrace{\left| \int_{-\infty}^{-D_y}A_0 \,dy \right|}_{(a_2)}.</math>
Since <math>(a_1)</math> and <math>(a_2)</math> are similar, we calculate only <math>(a_1)</math>; likewise, <math>(b_1)</math> and <math>(b_2)</math> are similar, so we calculate only <math>(b_1)</math>.

For <math>(a_1)</math>, after the substitution <math>z\equiv\pi y^2</math>, we get
:<math>(a_1) = \left| \frac{1}{2\sqrt{\pi}}\int_{\pi D_y^2}^{\infty} e^{-z}z^{-1/2}\, dz\right| <\frac{e^{-\pi D_y^2}}{2\pi D_y}.</math>
This means that as long as <math>D_y</math> is large enough, this term tends to zero.

For <math>(b_1)</math>, we get
:<math>(b_1)\le\left| \int_{D_y}^{b/s}\left[\frac{h(sy)}{h(0)}\right]_{\text{max}} e^{Mm(sy)}\,dy \right|</math>
where
:<math>m(x) \ge g(x)-g(0) \text{ as } x\in [sD_y,b]</math>
and <math>h(x)</math> should have the same sign as <math>h(0)</math> on this region. Let us choose <math>m(x)</math> as the tangent line through the point at <math>x=sD_y</math>, i.e.
:<math>m(sy)= g(sD_y)-g(0) +g'(sD_y)\left( sy-sD_y \right),</math>
which is shown in the figure
[[File:For laplace method --- upper limit function m(x).gif|thumb|<math>m(x)</math> is the tangent line through the point at <math>x=sD_y</math>.]]
From this figure one can see that as <math>s</math> or <math>D_y</math> gets smaller, the region satisfying the above inequality gets larger. Therefore, if we want a single <math>m(x)</math> to bound <math>g(x)-g(0)</math> over the whole interval of <math>(b_1)</math>, <math>D_y</math> has an upper limit. Moreover, because the integral of <math>e^{-\alpha x}</math> is easy to evaluate, we use it to estimate the relative error contributed by <math>(b_1)</math>.
Based on Taylor expansion, we get
:<math>\begin{align} M\left[g(sD_y)-g(0)\right] &= M\left[ \frac{g''(0)}{2}s^2D_y^2 +\frac{g'''(\xi)}{6}s^3D_y^3 \right] && \text{as } \xi\in[0,sD_y] \\ & = -\pi D_y^2 +\frac{(2\pi)^{3/2}g'''(\xi)D_y^3}{6\sqrt{M}|g''(0)|^{\frac{3}{2}}}, \end{align}</math>
and
:<math>\begin{align} Msg'(sD_y) &= Ms\left(g''(0)sD_y +\frac{g'''(\zeta)}{2}s^2D_y^2\right) && \text{as } \zeta\in[0,sD_y] \\ &= -2\pi D_y +\sqrt{\frac{2}{M}}\left( \frac{\pi}{|g''(0)|} \right)^{\frac{3}{2}}g'''(\zeta)D_y^2, \end{align}</math>
and then substitute them back into the calculation of <math>(b_1)</math>. The remainders of these two expansions are both inversely proportional to the square root of <math>M</math>, so we drop them to simplify the calculation; keeping them gives a sharper bound at the cost of a more cumbersome formula. Then
:<math>\begin{align} (b_1) &\le \left|\left[ \frac{h(sy)}{h(0)} \right]_{\max} e^{-\pi D_y^2}\int_0^{b/s-D_y}e^{-2\pi D_y y}\, dy \right| \\ &\le \left|\left[ \frac{h(sy)}{h(0)} \right]_{\max} e^{-\pi D_y^2}\frac{1}{2\pi D_y} \right|. \end{align}</math>
Therefore, this term tends to zero as <math>D_y</math> gets larger, but remember that the upper bound of <math>D_y</math> must be taken into account throughout this calculation.

For the integration near <math>x=0</math>, we can also use [[Taylor's theorem]]. When <math>h'(0) \ne 0</math>,
:<math>\begin{align} (c) &\le \int_{-D_y}^{D_y} e^{-\pi y^2} \left| \frac{sh'(\xi)}{h(0)}y \right|\, dy \\ &< \sqrt{\frac{2}{\pi M |g''(0)|}} \left| \frac{h'(\xi)}{h(0)} \right|_\max \left( 1-e^{-\pi D_y^2} \right), \end{align}</math>
which is inversely proportional to the square root of <math>M</math>. In fact, <math>(c)</math> behaves the same way when <math>h(x)</math> is a constant.
In conclusion, the error contribution of the integral near the stationary point decreases as <math>\sqrt{M}</math> increases, and the remaining parts tend to zero as long as <math>D_y</math> is large enough; however, <math>D_y</math> has an upper limit, determined by whether the function <math>m(x)</math> stays larger than <math>g(x)-g(0)</math> in the outer region. As long as one <math>m(x)</math> satisfying this condition exists, the upper bound of <math>D_y</math> can be chosen directly proportional to <math>\sqrt{M}</math>, since <math>m(x)</math> is tangent to <math>g(x)-g(0)</math> at <math>x=sD_y</math>. So, the bigger <math>M</math> is, the larger <math>D_y</math> can be.
{{hidden end}}

In the multivariate case, where <math>\mathbf{x}</math> is a <math>d</math>-dimensional vector and <math>f(\mathbf{x})</math> is a scalar function of <math>\mathbf{x}</math>, Laplace's approximation is usually written as:
:<math>\int h(\mathbf{x})e^{M f(\mathbf{x})}\, d^dx \approx \left(\frac{2\pi}{M}\right)^{d/2} \frac{h(\mathbf{x}_0)e^{M f(\mathbf{x}_0)}}{\left|-H(f)(\mathbf{x}_0)\right|^{1/2}} \text{ as } M\to\infty</math>
where <math>H(f)(\mathbf{x}_0)</math> is the [[Hessian matrix]] of <math>f</math> evaluated at <math>\mathbf{x}_0</math> and <math>|\cdot|</math> denotes the [[matrix determinant]]. Analogously to the univariate case, the Hessian is required to be [[Positive-definite matrix|negative-definite]].<ref>{{cite book | last =MacKay | first =David J. C. | title =Information Theory, Inference and Learning Algorithms | publisher =Cambridge University Press |date=September 2003 | location =Cambridge | url =http://www.inference.phy.cam.ac.uk/mackay/itila/book.html | isbn =9780521642989}}</ref>
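As a quick numerical illustration (not part of the derivation above), the sketch below compares the one-dimensional formula against direct quadrature for the concrete choice <math>h(x)=\cos x</math>, <math>g(x)=-x^2/2</math>, with maximum at <math>x_0=0</math> and <math>g''(0)=-1</math>; the helper names `laplace_approx` and `simpson` are ad hoc, and only the Python standard library is assumed. The relative error should shrink roughly like <math>1/(2M)</math> as <math>M</math> grows.

```python
import math

def laplace_approx(h, g, gpp, x0, M):
    # One-dimensional Laplace approximation:
    # sqrt(2*pi / (M*|g''(x0)|)) * h(x0) * exp(M*g(x0))
    return math.sqrt(2 * math.pi / (M * abs(gpp(x0)))) * h(x0) * math.exp(M * g(x0))

def simpson(f, a, b, n=20000):
    # Composite Simpson's rule with n (even) subintervals,
    # used here as a stand-in for the exact integral.
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * step)
    return total * step / 3

# Example: h(x) = cos(x), g(x) = -x^2/2, so x0 = 0 and g''(0) = -1.
h = math.cos
g = lambda x: -x * x / 2
gpp = lambda x: -1.0

for M in (10, 100, 1000):
    exact = simpson(lambda x: h(x) * math.exp(M * g(x)), -10.0, 10.0)
    approx = laplace_approx(h, g, gpp, 0.0, M)
    print(M, abs(exact - approx) / exact)  # relative error, roughly 1/(2M)
```

For this example the integral is known in closed form, <math>\sqrt{2\pi/M}\,e^{-1/(2M)}</math>, so the relative error of the approximation is exactly <math>e^{1/(2M)}-1\approx 1/(2M)</math>, consistent with the <math>1/\sqrt{M}</math>-or-better decay derived above.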