==Analysis==
Suppose that the function {{mvar|f}} has a zero at {{mvar|α}}, i.e., {{math|{{var|f}}({{var|α}}) {{=}} 0}}, and {{mvar|f}} is differentiable in a [[topological neighborhood|neighborhood]] of {{mvar|α}}. If {{mvar|f}} is continuously differentiable and its derivative is nonzero at {{mvar|α}}, then there exists a [[topological neighborhood|neighborhood]] of {{mvar|α}} such that, for all starting values {{math|{{var|x}}{{sub|0}}}} in that neighborhood, the [[sequence]] {{math|({{var|x}}{{sub|{{var|n}}}})}} will [[limit of a sequence|converge]] to {{mvar|α}}.<ref>{{citation|title=A Theoretical Introduction to Numerical Analysis|first1=Victor S.|last1=Ryaben'kii|first2=Semyon V.|last2=Tsynkov|publisher=CRC Press|year=2006|isbn=9781584886075|page=243|url=https://books.google.com/books?id=V8gWP031-hEC&pg=PA243}}.</ref>

If {{mvar|f}} is continuously differentiable, its derivative is nonzero at {{mvar|α}}, ''and'' it has a [[second derivative]] at {{mvar|α}}, then the convergence is quadratic or faster. If the second derivative is not 0 at {{mvar|α}}, then the convergence is merely quadratic. If the third derivative exists and is bounded in a neighborhood of {{mvar|α}}, then
<math display="block">\Delta x_{i+1} = \frac{f'' (\alpha)}{2 f' (\alpha)} \left(\Delta x_{i}\right)^2 + O\left(\left(\Delta x_{i}\right)^3\right) \,,</math>
where
<math display="block">\Delta x_i \triangleq x_i - \alpha \,.</math>

If the derivative is 0 at {{mvar|α}}, then the convergence is usually only linear. Specifically, if {{mvar|f}} is twice continuously differentiable, {{math|{{var|{{prime|f}}}}({{var|α}}) {{=}} 0}} and {{math|{{var|{{prime|f|c=″}}}}({{var|α}}) ≠ 0}}, then there exists a neighborhood of {{mvar|α}} such that, for all starting values {{math|{{var|x}}{{sub|0}}}} in that neighborhood, the sequence of iterates converges linearly, with [[rate of convergence|rate]] {{sfrac|1|2}}.<ref>{{harvnb|Süli|Mayers|2003|loc=Exercise 1.6}}</ref> Alternatively, if {{math|{{var|{{prime|f}}}}({{var|α}}) {{=}} 0}} and {{math|{{var|{{prime|f}}}}({{var|x}}) ≠ 0}} for {{math|{{var|x}} ≠ {{var|α}}}}, {{mvar|x}} in a [[topological neighborhood|neighborhood]] {{mvar|U}} of {{mvar|α}}, {{mvar|α}} being a zero of [[Multiplicity (mathematics)|multiplicity]] {{mvar|r}}, and if {{math|{{var|f}} ∈ {{var|C}}{{isup|{{var|r}}}}({{var|U}})}}, then there exists a neighborhood of {{mvar|α}} such that, for all starting values {{math|{{var|x}}{{sub|0}}}} in that neighborhood, the sequence of iterates converges linearly. However, even linear convergence is not guaranteed in pathological situations.

In practice, these results are local, and the neighborhood of convergence is not known in advance. But there are also some results on global convergence: for instance, given a right neighborhood {{math|{{var|U}}{{sub|+}}}} of {{mvar|α}}, if {{mvar|f}} is twice differentiable in {{math|{{var|U}}{{sub|+}}}} and if {{math|{{var|{{prime|f}}}} ≠ 0}}, {{math|{{var|f}} · {{var|{{prime|f|c=″}}}} > 0}} in {{math|{{var|U}}{{sub|+}}}}, then, for each {{math|{{var|x}}{{sub|0}}}} in {{math|{{var|U}}{{sub|+}}}}, the sequence {{math|{{var|x}}{{sub|{{var|k}}}}}} is monotonically decreasing to {{mvar|α}}.
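The two regimes can be seen in a few lines of Python. The sketch below is a minimal illustration (the test functions and starting values are arbitrary choices, not taken from the sources above): Newton's method is applied to {{math|''x''<sup>2</sup> − 2}}, whose root {{sqrt|2}} is simple, and to {{math|(''x'' − 1)<sup>2</sup>}}, whose root 1 has multiplicity 2.

<syntaxhighlight lang="python">
import math

# Plain Newton iteration: x_{n+1} = x_n - f(x_n) / f'(x_n).
def newton(f, fprime, x0, steps):
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - f(xs[-1]) / fprime(xs[-1]))
    return xs

# Simple root (f'(alpha) != 0): the error is roughly squared each step.
for x in newton(lambda x: x * x - 2, lambda x: 2 * x, 2.0, 6):
    print(abs(x - math.sqrt(2)))

# Double root (f'(alpha) = 0): the error is halved each step (rate 1/2).
for x in newton(lambda x: (x - 1) ** 2, lambda x: 2 * (x - 1), 2.0, 6):
    print(abs(x - 1))
</syntaxhighlight>

In the second run the errors are exactly 1, {{sfrac|1|2}}, {{sfrac|1|4}}, {{sfrac|1|8}}, ..., the linear rate {{sfrac|1|2}} stated above.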
===Proof of quadratic convergence for Newton's iterative method===
According to [[Taylor's theorem]], any function {{math|{{var|f}}({{var|x}})}} that has a continuous second derivative can be represented by an expansion about a point that is close to a root of {{math|{{var|f}}({{var|x}})}}. Suppose this root is {{mvar|α}}. Then the expansion of {{math|{{var|f}}({{var|α}})}} about {{math|{{var|x}}{{sub|{{var|n}}}}}} is:
{{NumBlk2|:|<math>f(\alpha) = f(x_n) + f'(x_n)(\alpha - x_n) + R_1 \,</math>|1}}
where the [[Lagrange remainder|Lagrange form of the Taylor series expansion remainder]] is
<math display="block">R_1 = \frac{1}{2!}f''(\xi_n)\left(\alpha - x_n\right)^{2} \,,</math>
with {{math|{{var|ξ}}{{sub|{{var|n}}}}}} between {{math|{{var|x}}{{sub|{{var|n}}}}}} and {{mvar|α}}.

Since {{mvar|α}} is the root, ({{EquationNote|1}}) becomes:
{{NumBlk2|:|<math>0 = f(\alpha) = f(x_n) + f'(x_n)(\alpha - x_n) + \tfrac12f''(\xi_n)\left(\alpha - x_n\right)^2 \,</math>|2}}
Dividing equation ({{EquationNote|2}}) by {{math|{{var|{{prime|f}}}}({{var|x}}{{sub|{{var|n}}}})}} and rearranging gives
{{NumBlk2|:|<math> \frac {f(x_n)}{f'(x_n)}+\left(\alpha-x_n\right) = \frac {- f'' (\xi_n)}{2 f'(x_n)}\left(\alpha-x_n\right)^2 </math>|3}}
Remembering that {{math|{{var|x}}{{sub|{{var|n}} + 1}}}} is defined by
{{NumBlk2|:|<math> x_{n+1} = x_{n} - \frac {f(x_n)}{f'(x_n)} \,,</math>|4}}
one finds that
<math display="block"> \underbrace{\alpha - x_{n+1}}_{\varepsilon_{n+1}} = \frac {- f'' (\xi_n)}{2 f'(x_n)} {(\,\underbrace{\alpha - x_n}_{\varepsilon_{n}}\,)}^2 \,.</math>
That is,
{{NumBlk2|:|<math> \varepsilon_{n+1} = \frac {- f'' (\xi_n)}{2 f'(x_n)} \cdot \varepsilon_n^2 \,.</math>|5}}
Taking the absolute value of both sides gives
{{NumBlk2|:|<math> \left| {\varepsilon_{n+1}}\right| = \frac {\left| f'' (\xi_n) \right| }{2 \left| f'(x_n) \right|} \cdot \varepsilon_n^2 \,. </math>|6}}
Equation ({{EquationNote|6}}) shows that the [[order of convergence]] is at least quadratic if the following conditions are satisfied:
# {{math|{{var|{{prime|f}}}}({{var|x}}) ≠ 0}} for all {{math|{{var|x}} ∈ {{var|I}}}}, where {{mvar|I}} is the interval {{math|[{{var|α}} − {{abs|{{var|ε}}{{sub|0}}}}, {{var|α}} + {{abs|{{var|ε}}{{sub|0}}}}]}};
# {{math|{{var|{{prime|f|c=″}}}}({{var|x}})}} is continuous for all {{math|{{var|x}} ∈ {{var|I}}}};
# {{math|{{var|M}} {{abs|{{var|ε}}{{sub|0}}}} < 1}}, where {{mvar|M}} is given by <math display="block"> M = \frac12 \left( \sup_{x \in I} \vert f'' (x) \vert \right) \left( \sup_{x \in I} \frac {1}{ \vert f'(x) \vert } \right) . \,</math>
If these conditions hold,
<math display="block"> \vert \varepsilon_{n+1} \vert \leq M \cdot \varepsilon_n^2 \,. </math>
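Equation ({{EquationNote|5}}) can be checked numerically. The following Python sketch (an illustration, not taken from the cited sources) applies Newton's method to {{math|''f''(''x'') {{=}} ''x''<sup>2</sup> − 2}} and prints the ratio {{math|''ε''<sub>''n''+1</sub>/''ε''<sub>''n''</sub><sup>2</sup>}}, which tends to {{math|−''f''″(''α'')/2''f''′(''α'') {{=}} −1/(2{{sqrt|2}}) ≈ −0.35355}}; high-precision decimals are used because the errors shrink below double precision within a few steps.

<syntaxhighlight lang="python">
from decimal import Decimal, getcontext

getcontext().prec = 60            # enough digits to watch several error doublings
root = Decimal(2).sqrt()          # alpha = sqrt(2)

x = Decimal(2)                    # starting value x_0
err = root - x                    # eps_0 = alpha - x_0, as in equation (5)
for n in range(5):
    x = x - (x * x - 2) / (2 * x)     # Newton step for f(x) = x^2 - 2
    new_err = root - x
    print(n, new_err / err ** 2)      # tends to -1/(2*sqrt(2)) ~ -0.35355
    err = new_err
</syntaxhighlight>

The printed ratios (about −0.25, −0.333, −0.3529, −0.35355, ...) approach the predicted constant as {{math|''ξ''<sub>''n''</sub>}} and {{math|''x''<sub>''n''</sub>}} approach {{mvar|α}}.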
===Fourier conditions===
Suppose that {{math|''f''(''x'')}} is a [[strictly increasing]] [[concave function]] on an interval. If it is negative at the left endpoint and positive at the right endpoint, the [[intermediate value theorem]] guarantees that there is a zero {{math|ζ}} of {{mvar|f}} somewhere in the interval. From geometrical principles, it can be seen that the Newton iteration {{math|''x''<sub>''i''</sub>}} starting at the left endpoint is [[monotonic function|monotonically increasing]] and convergent, necessarily to {{math|ζ}}.<ref name="ostrowski">{{cite book|last1=Ostrowski|first1=A. M.|title=Solution of equations in Euclidean and Banach spaces|year=1973|mr=0359306|series=Pure and Applied Mathematics|volume=9|publisher=[[Academic Press]]|location=New York–London|author-link1=Alexander Ostrowski|zbl=0304.65002|edition=Third edition of 1960 original}}</ref>

[[Joseph Fourier]] introduced a modification of Newton's method starting at the right endpoint:
:<math>y_{i+1}=y_i-\frac{f(y_i)}{f'(x_i)}.</math>
This sequence is monotonically decreasing and convergent. By passing to the limit in this definition, it can be seen that the limit of {{math|''y''<sub>''i''</sub>}} must also be the zero {{math|ζ}}.<ref name="ostrowski" />

So, in the case of a concave increasing function with a zero, initialization is largely irrelevant. Newton iteration starting anywhere left of the zero will converge, as will Fourier's modified Newton iteration starting anywhere right of the zero. Since {{math|ζ}} always lies between {{math|''x''<sub>''i''</sub>}} and {{math|''y''<sub>''i''</sub>}}, the difference {{math|''y''<sub>''i''</sub> − ''x''<sub>''i''</sub>}} bounds the error of either iterate at every step. If {{mvar|f}} is twice continuously differentiable, it can be proved using [[Taylor's theorem]] that
:<math>\lim_{i\to\infty}\frac{y_{i+1}-x_{i+1}}{(y_i-x_i)^2}=-\frac{1}{2}\frac{f''(\zeta)}{f'(\zeta)},</math>
showing that this difference in locations converges quadratically to zero.<ref name="ostrowski" />

All of the above can be extended to systems of equations in multiple variables, although in that context the relevant concepts of [[monotonicity]] and concavity are more subtle to formulate.<ref>Ortega and Rheinboldt, Section 13.3</ref> In the case of single equations in a single variable, the above monotonic convergence of Newton's method can also be generalized to replace concavity by positivity or negativity conditions on an arbitrary higher-order derivative of {{mvar|f}}. However, in this generalization, Newton's iteration is modified so as to be based on [[Taylor polynomial]]s rather than the [[tangent line]]. In the case of concavity, this modification coincides with the standard Newton method.<ref>{{cite book|last1=Traub|first1=J. F.|title=Iterative methods for the solution of equations|year=1964|series=Prentice-Hall Series in Automatic Computation|publisher=[[Prentice-Hall, Inc.]]|location=Englewood Cliffs, NJ|mr=0169356|author-link1=Joseph F. Traub|zbl=0121.11204}}</ref>

===Error for {{math|n>1}} variables===
Suppose Newton's method is used to find a stationary point of a single function <math>f : \mathbf{R}^n \to \mathbf{R}</math>, that is, a root <math>\alpha</math> of its gradient, via the iteration <math>x_{n+1} = x_n - (D^2 f)^{-1}\,\nabla f(x_n)</math>. Then the error <math>\epsilon^{(n)} = x_n - \alpha</math> is a vector whose components obey
<math display="block">\epsilon^{(n+1)}_k = \frac{1}{2} \left(\epsilon^{(n)}\right)^T Q_k\, \epsilon^{(n)} + O\left(\left\|\epsilon^{(n)}\right\|^3\right),</math>
where each <math>Q_k</math> is the quadratic form
<math display="block">(Q_k)_{i,j} = \sum_{\ell} \left((D^2 f)^{-1}\right)_{k,\ell} \frac{\partial^3 f}{\partial x_i \,\partial x_j \,\partial x_\ell},</math>
evaluated at the root <math>\alpha</math> (here <math>D^2 f</math> is the [[Hessian matrix]] of second derivatives).
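This multivariable error formula can also be checked numerically. In the Python sketch below (an illustration only; the test function is an arbitrary choice, not from the cited sources), Newton's iteration on the gradient is applied to {{math|''f''(''x'', ''y'') {{=}} ''x''<sup>3</sup>/3 − ''x'' + ''y''<sup>2</sup>/2 + ''xy''}}, whose gradient vanishes at {{math|(''φ'', −''φ'')}} with {{math|''φ'' {{=}} (1 + {{sqrt|5}})/2}}, and each actual error is compared with the quadratic prediction.

<syntaxhighlight lang="python">
import numpy as np

phi = (1 + np.sqrt(5)) / 2
alpha = np.array([phi, -phi])            # stationary point: grad f(alpha) = 0

def grad(v):                             # gradient of f(x,y) = x^3/3 - x + y^2/2 + x*y
    x, y = v
    return np.array([x * x - 1 + y, x + y])

def hessian(v):                          # Hessian matrix D^2 f
    x, y = v
    return np.array([[2 * x, 1.0], [1.0, 1.0]])

# Third-derivative tensor T[l, i, j] at alpha: only f_xxx = 2 is nonzero.
T = np.zeros((2, 2, 2))
T[0, 0, 0] = 2.0
Q = np.einsum('kl,lij->kij', np.linalg.inv(hessian(alpha)), T)   # Q_k as above

v = np.array([2.0, -1.0])                # starting guess
for n in range(5):
    eps = v - alpha
    predicted = 0.5 * np.einsum('kij,i,j->k', Q, eps, eps)
    v = v - np.linalg.solve(hessian(v), grad(v))                 # Newton step
    print(n, v - alpha, predicted)       # actual next error vs prediction
</syntaxhighlight>

The gap between the actual and predicted errors shrinks like <math>\|\epsilon^{(n)}\|^3</math>, as the remainder term indicates.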