==Practical considerations==
Newton's method is a powerful technique: if the derivative of the function at the root is nonzero, then the [[rate of convergence|convergence]] is at least quadratic. As the method converges on the root, the difference between the root and the approximation is squared (the number of accurate digits roughly doubles) at each step. However, there are some difficulties with the method.

===Difficulty in calculating the derivative of a function===
Newton's method requires that the derivative can be calculated directly. An analytical expression for the derivative may not be easily obtainable or could be expensive to evaluate. In these situations, it may be appropriate to approximate the derivative by the slope of a line through two nearby points on the function. Using this approximation yields something like the [[secant method]], whose convergence is slower than that of Newton's method.

===Failure of the method to converge to the root===
It is important to review the [[#Proof of quadratic convergence for Newton's iterative method|proof of quadratic convergence]] of Newton's method before implementing it, and in particular the assumptions made in the proof. [[#Failure analysis|When the method fails to converge]], it is because those assumptions are not met. For example, [[#Divergence even when initialization is close to the root|in some cases]], if the first derivative is not well behaved in the neighborhood of a particular root, Newton's method can fail to converge no matter where the initialization is set. In some cases, Newton's method can be stabilized by using [[successive over-relaxation#Other applications of the method|successive over-relaxation]], or its speed of convergence can be increased by the same technique.

In a robust implementation of Newton's method, it is common to place limits on the number of iterations, bound the solution to an interval known to contain the root, and combine the method with a more robust root-finding method.

===Slow convergence for roots of multiplicity greater than 1===
If the root being sought has [[Multiplicity (mathematics)#Multiplicity of a root of a polynomial|multiplicity]] greater than one, the convergence rate is merely linear (errors reduced by a constant factor at each step) unless special steps are taken. When there are two or more roots that are close together, it may take many iterations before the iterates get close enough to one of them for the quadratic convergence to be apparent. However, if the multiplicity {{mvar|m}} of the root is known, the following modified algorithm preserves the quadratic convergence rate:<ref>{{cite web|title=Accelerated and Modified Newton Methods|url=http://mathfaculty.fullerton.edu/mathews/n2003/newtonacceleratemod.html|access-date=4 March 2016|archive-url=https://web.archive.org/web/20190524083302/http://mathfaculty.fullerton.edu/mathews/n2003/NewtonAccelerateMod.html|archive-date=24 May 2019|url-status=dead}}</ref>
<math display="block">x_{n+1} = x_n - m\frac{f(x_n)}{f'(x_n)}.</math>
This is equivalent to using [[successive over-relaxation#Other applications of the method|successive over-relaxation]]. On the other hand, if the multiplicity {{mvar|m}} of the root is not known, it is possible to estimate {{mvar|m}} after carrying out one or two iterations and then use that value to increase the rate of convergence.
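The modified iteration above can be sketched in Python as follows. This is an illustration only, not taken from the cited reference; the names <code>f</code>, <code>df</code>, the tolerance, and the iteration cap are assumptions chosen for the example.

<syntaxhighlight lang="python">
def modified_newton(f, df, x0, m=1, tol=1e-12, max_iter=100):
    """Iterate x_{n+1} = x_n - m * f(x_n) / f'(x_n).

    With m equal to the multiplicity of the target root, quadratic
    convergence is recovered; m = 1 gives the ordinary Newton step.
    (Illustrative sketch; tolerance and iteration cap are arbitrary.)
    """
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if fx == 0.0:          # already exactly at a root
            return x
        dfx = df(x)
        if dfx == 0.0:
            raise ZeroDivisionError("zero derivative; cannot continue")
        step = m * fx / dfx
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("no convergence within max_iter iterations")

# Example: f(x) = (x - 1)**2 has a double root (m = 2) at x = 1;
# the modified step lands on it in a single iteration.
root = modified_newton(lambda x: (x - 1)**2, lambda x: 2*(x - 1), x0=3.0, m=2)
</syntaxhighlight>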
If the multiplicity {{mvar|m}} of the root is finite, then {{math|1={{var|g}}({{var|x}}) = {{sfrac|{{var|f}}({{var|x}})|{{var|{{prime|f}}}}({{var|x}})}}}} will have a root at the same location with multiplicity 1. Applying Newton's method to find the root of {{math|{{var|g}}({{var|x}})}} recovers quadratic convergence in many cases, although it generally involves the second derivative of {{math|{{var|f}}({{var|x}})}}. In a particularly simple case, if {{math|1={{var|f}}({{var|x}}) = {{var|x}}{{sup|{{var|m}}}}}} then {{math|1={{var|g}}({{var|x}}) = {{sfrac|{{var|x}}|{{var|m}}}}}} and Newton's method finds the root in a single iteration with
<math display="block">x_{n+1} = x_n - \frac{g(x_n)}{g'(x_n)} = x_n - \frac{\;\frac{x_n}{m}\;}{\frac{1}{m}} = 0\,.</math>

===Slow convergence===
The function {{math|''f''(''x'') {{=}} ''x''<sup>2</sup>}} has a root at 0.<ref name="dennis">J. E. Dennis, Jr. and Robert B. Schnabel. ''Numerical Methods for Unconstrained Optimization and Nonlinear Equations''. SIAM.</ref> Since {{mvar|f}} is continuously differentiable at its root, the theory guarantees that Newton's method, when initialized sufficiently close to the root, will converge. However, since the derivative {{math|''f'' ′}} is zero at the root, quadratic convergence is not ensured by the theory. In this particular example, the Newton iteration is given by
:<math>x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}=\frac{1}{2}x_n.</math>
From this, it can be seen that Newton's method may be initialized anywhere and will converge to zero, but only at a linear rate. If initialized at 1, dozens of iterations would be required before ten digits of accuracy are achieved.

The function {{math|''f''(''x'') {{=}} ''x'' + ''x''<sup>4/3</sup>}} also has a root at 0, where it is continuously differentiable. Although the first derivative {{math|''f'' ′}} is nonzero at the root, the second derivative {{math|''f'' ′′}} is nonexistent there, so that quadratic convergence cannot be guaranteed. In fact, the Newton iteration is given by
:<math>x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}=\frac{x_n^{4/3}}{3+4x_n^{1/3}} \approx x_n \cdot \frac{x_n^{1/3}}{3}.</math>
From this, it can be seen that the rate of convergence is superlinear but subquadratic. This can be seen in the following tables, the left of which shows Newton's method applied to the above {{math|''f''(''x'') {{=}} ''x'' + ''x''<sup>4/3</sup>}} and the right of which shows Newton's method applied to {{math|''f''(''x'') {{=}} ''x'' + ''x''<sup>2</sup>}}. The quadratic convergence of the iteration shown on the right is illustrated by the orders of magnitude of the distance from the iterate to the true root (0, 1, 2, 3, 5, 10, 20, 39, ...) approximately doubling from row to row. While the convergence on the left is superlinear, the order of magnitude is only multiplied by about 4/3 from row to row (0, 1, 2, 4, 5, 7, 10, 13, ...).

{| class=wikitable style="border: none;"
! scope=col | {{math|''x''<sub>''n''</sub>}}
! scope=col | {{math|''x''<sub>''n''</sub> + ''x''{{su|p=4/3|b=''n''}}}}
| rowspan=9 style="border: none;"|
! scope=col | {{math|''x''<sub>''n''</sub>}}
! scope=col | {{math|''x''<sub>''n''</sub> + ''x''{{su|p=2|b=''n''}}}}
|-
| 1 || 2 || 1 || 2
|-
| 1.4286 × 10<sup>−1</sup> || 2.1754 × 10<sup>−1</sup> || 3.3333 × 10<sup>−1</sup> || 4.4444 × 10<sup>−1</sup>
|-
| 1.4669 × 10<sup>−2</sup> || 1.8260 × 10<sup>−2</sup> || 6.6666 × 10<sup>−2</sup> || 7.1111 × 10<sup>−2</sup>
|-
| 9.0241 × 10<sup>−4</sup> || 9.8961 × 10<sup>−4</sup> || 3.9216 × 10<sup>−3</sup> || 3.9369 × 10<sup>−3</sup>
|-
| 2.5750 × 10<sup>−5</sup> || 2.6511 × 10<sup>−5</sup> || 1.5259 × 10<sup>−5</sup> || 1.5259 × 10<sup>−5</sup>
|-
| 2.4386 × 10<sup>−7</sup> || 2.4539 × 10<sup>−7</sup> || 2.3283 × 10<sup>−10</sup> || 2.3283 × 10<sup>−10</sup>
|-
| 5.0366 × 10<sup>−10</sup> || 5.0406 × 10<sup>−10</sup> || 5.4210 × 10<sup>−20</sup> || 5.4210 × 10<sup>−20</sup>
|-
| 1.3344 × 10<sup>−13</sup> || 1.3344 × 10<sup>−13</sup> || 2.9387 × 10<sup>−39</sup> || 2.9387 × 10<sup>−39</sup>
|}

The rate of convergence is distinguished from the number of iterations required to reach a given accuracy. For example, the function {{math|''f''(''x'') {{=}} ''x''<sup>20</sup> − 1}} has a root at 1. Since {{math|''f'' ′(1) ≠ 0}} and {{mvar|f}} is smooth, it is known that any Newton iteration convergent to 1 will converge quadratically. However, if initialized at 0.5, the first few iterates of Newton's method are approximately 26214, 24904, 23658, 22476, decreasing slowly, with only the 200th iterate being 1.0371. The following iterates are 1.0103, 1.00093, 1.0000082, and 1.00000000065, illustrating quadratic convergence. This highlights that quadratic convergence of a Newton iteration does not mean that only a few iterates are required; this only applies once the sequence of iterates is sufficiently close to the root.<ref>Anthony Ralston and Philip Rabinowitz. ''A First Course in Numerical Analysis'', second edition.</ref>

===Convergence dependent on initialization===
The function {{math|''f''(''x'') {{=}} ''x''(1 + ''x''<sup>2</sup>)<sup>−1/2</sup>}} has a root at 0. The Newton iteration is given by
:<math>x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}=x_n-\frac{x_n(1+x_n^2)^{-1/2}}{(1+x_n^2)^{-3/2}}=-x_n^3.</math>
From this, it can be seen that there are three possible phenomena for a Newton iteration. If initialized strictly between {{math|±1}}, the Newton iteration will converge (super-)quadratically to 0; if initialized exactly at {{math|1}} or {{math|−1}}, the Newton iteration will oscillate endlessly between {{math|±1}}; if initialized anywhere else, the Newton iteration will diverge.<ref>Yuri Nesterov. ''Lectures on Convex Optimization'', second edition. Springer Optimization and Its Applications, Volume 137.</ref> This same trichotomy occurs for {{math|''f''(''x'') {{=}} arctan ''x''}}.<ref name="dennis" />

In cases where the function in question has multiple roots, it can be difficult to control, via choice of initialization, which root (if any) is identified by Newton's method. For example, the function {{math|''f''(''x'') {{=}} ''x''(''x''<sup>2</sup> − 1)(''x'' − 3)e<sup>−(''x'' − 1)<sup>2</sup>/2</sup>}} has roots at −1, 0, 1, and 3.{{sfnm|1a1=Süli|1a2=Mayers|1y=2003}} If initialized at −1.488, the Newton iteration converges to 0; if initialized at −1.487, it diverges to {{math|∞}}; if initialized at −1.486, it converges to −1; if initialized at −1.485, it diverges to {{math|−∞}}; if initialized at −1.4843, it converges to 3; if initialized at −1.484, it converges to {{math|1}}. This kind of subtle dependence on initialization is not uncommon; it is frequently studied in the [[complex plane]] in the form of the [[Newton fractal]].
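This sensitivity can be reproduced numerically with the following sketch, which is an illustration rather than code from the cited sources; the helper names are chosen for the example. Converging starts should land on the roots reported above, while diverging starts drift only slowly toward {{math|±∞}}, though exact trajectories are sensitive to floating-point rounding.

<syntaxhighlight lang="python">
# Plain Newton iteration applied to f(x) = x(x^2 - 1)(x - 3) exp(-(x - 1)^2 / 2).
def p(x):
    """Polynomial factor x(x^2 - 1)(x - 3) of f."""
    return x * (x**2 - 1) * (x - 3)

def dp(x):
    """Derivative of p (expanded: p = x^4 - 3x^3 - x^2 + 3x)."""
    return 4*x**3 - 9*x**2 - 2*x + 3

def newton_step(x):
    # The exponential factor cancels in the Newton step:
    # f / f' = p / (p' - (x - 1) * p).
    return x - p(x) / (dp(x) - (x - 1) * p(x))

for x0 in (-1.488, -1.487, -1.486, -1.485, -1.4843, -1.484):
    x = x0
    for _ in range(500):
        x = newton_step(x)
        if abs(x) > 1e6:  # far outside the region of the roots: call it divergent
            break
    print(f"x0 = {x0}: iterate after at most 500 steps = {x:.6g}")
</syntaxhighlight>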
===Divergence even when initialization is close to the root===
Consider the problem of finding a root of {{math|''f''(''x'') {{=}} ''x''<sup>1/3</sup>}}. The Newton iteration is
:<math>x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}=x_n-\frac{x_n^{1/3}}{\frac{1}{3}x_n^{-2/3}}=-2x_n.</math>
Unless Newton's method is initialized at the exact root 0, the sequence of iterates will fail to converge. For example, even if initialized at the reasonably accurate guess of 0.001, the first several iterates are −0.002, 0.004, −0.008, 0.016, reaching 1048.58, −2097.15, ... by the 20th iterate. This failure of convergence is not contradicted by the analytic theory, since in this case {{mvar|f}} is not differentiable at its root.

In the above example, failure of convergence is reflected by the failure of {{math|''f''(''x''<sub>''n''</sub>)}} to get closer to zero as {{mvar|n}} increases, as well as by the fact that successive iterates grow further and further apart. However, the function {{math|''f''(''x'') {{=}} ''x''<sup>1/3</sup>e<sup>−''x''<sup>2</sup></sup>}} also has a root at 0. The Newton iteration is given by
:<math>x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}=x_n\left(1-\frac{3}{1-6x_n^2}\right).</math>
In this example, where again {{mvar|f}} is not differentiable at the root, any Newton iteration not starting exactly at the root will diverge, but with both {{math|''x''<sub>''n'' + 1</sub> − ''x''<sub>''n''</sub>}} and {{math|''f''(''x''<sub>''n''</sub>)}} converging to zero.<ref name="judd">Kenneth L. Judd. ''Numerical Methods in Economics''. MIT Press.</ref> This is seen in the following table showing the iterates with initialization 1:

{| class=wikitable style="border: none;"
! scope=col | {{math|''x''<sub>''n''</sub>}}
! scope=col | {{math|''f''(''x''<sub>''n''</sub>)}}
|-
| 1 || 0.36788
|-
| 1.6 || 9.0416 × 10<sup>−2</sup>
|-
| 1.9342 || 2.9556 × 10<sup>−2</sup>
|-
| 2.2048 || 1.0076 × 10<sup>−2</sup>
|-
| 2.4396 || 3.5015 × 10<sup>−3</sup>
|-
| 2.6505 || 1.2307 × 10<sup>−3</sup>
|-
| 2.8437 || 4.3578 × 10<sup>−4</sup>
|-
| 3.0232 || 1.5513 × 10<sup>−4</sup>
|}

Although the convergence of {{math|''x''<sub>''n'' + 1</sub> − ''x''<sub>''n''</sub>}} to zero is not very rapid in this case, it can be proved from the iteration formula. This example highlights the possibility that a stopping criterion for Newton's method based only on the smallness of {{math|''x''<sub>''n'' + 1</sub> − ''x''<sub>''n''</sub>}} and {{math|''f''(''x''<sub>''n''</sub>)}} might falsely identify a root.

===Oscillatory behavior===
[[Image:NewtonsMethodConvergenceFailure.svg|thumb|upright=1.4|The tangent lines of {{math|{{var|x}}{{sup|3}} − 2{{var|x}} + 2}} at 0 and 1 intersect the {{mvar|x}}-axis at 1 and 0 respectively, illustrating why Newton's method oscillates between these values for some starting points.]]
It is easy to find situations for which Newton's method oscillates endlessly between two distinct values. For example, for Newton's method as applied to a function {{mvar|f}} to oscillate between 0 and 1, it is only necessary that the tangent line to {{mvar|f}} at 0 intersects the {{mvar|x}}-axis at 1 and that the tangent line to {{mvar|f}} at 1 intersects the {{mvar|x}}-axis at 0.<ref name="judd" /> This is the case, for example, if {{math|''f''(''x'') {{=}} ''x''<sup>3</sup> − 2''x'' + 2}}. For this function, it is even the case that Newton's iteration as initialized sufficiently close to 0 or 1 will ''asymptotically'' oscillate between these values. For example, Newton's method as initialized at 0.99 yields iterates 0.99, −0.06317, 1.00628, 0.03651, 1.00196, 0.01162, 1.00020, 0.00120, 1.000002, and so on. This behavior is present despite the presence of a root of {{mvar|f}} approximately equal to −1.76929.
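The two-cycle and its attracting behavior can be checked with a few lines of Python. This is a sketch for illustration; the helper <code>newton</code> and the choice of six steps are assumptions for the example.

<syntaxhighlight lang="python">
def newton(f, df, x0, n):
    """Return the first n+1 Newton iterates starting from x0."""
    xs = [x0]
    for _ in range(n):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs

f  = lambda x: x**3 - 2*x + 2
df = lambda x: 3*x**2 - 2

print(newton(f, df, 0.0, 6))   # exact 2-cycle: 0, 1, 0, 1, ...
print(newton(f, df, 0.99, 6))  # nearby starts are drawn into the cycle
</syntaxhighlight>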
===Undefinedness of Newton's method===
In some cases, it is not even possible to perform the Newton iteration. For example, if {{math|''f''(''x'') {{=}} ''x''<sup>2</sup> − 1}}, then the Newton iteration is defined by
:<math>x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)} = x_n-\frac{x_n^2-1}{2x_n} = \frac{x_n^2+1}{2x_n}.</math>
So Newton's method cannot be initialized at 0, since this would make {{math|''x''<sub>1</sub>}} undefined. Geometrically, this is because the tangent line to {{mvar|f}} at 0 is horizontal (i.e. {{math|''f'' ′(0) {{=}} 0}}), never intersecting the {{mvar|x}}-axis.

Even if the initialization is selected so that the Newton iteration can begin, the same phenomenon can block the iteration from being indefinitely continued. If {{mvar|f}} has an incomplete domain, it is possible for Newton's method to send the iterates outside of the domain, so that it is impossible to continue the iteration.<ref name="judd" /> For example, the [[natural logarithm]] function {{math|''f''(''x'') {{=}} ln ''x''}} has a root at 1, and is defined only for positive {{mvar|x}}. Newton's iteration in this case is given by
:<math>x_{n+1}=x_n-\frac{f(x_n)}{f'(x_n)}=x_n(1- \ln x_n).</math>
So if the iteration is initialized at {{math|e}}, the next iterate is 0; if the iteration is initialized at a value larger than {{math|e}}, then the next iterate is negative. In either case, the method cannot be continued.
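The failure modes described in this section motivate the safeguards mentioned under [[#Failure of the method to converge to the root|failure of the method to converge]]: an iteration limit, a bracketing interval, and a fallback to a more robust method. The following sketch combines a Newton step with bisection; it is an illustration under assumed names and tolerances, not a canonical implementation.

<syntaxhighlight lang="python">
def safeguarded_newton(f, df, a, b, tol=1e-12, max_iter=100):
    """Newton's method safeguarded by bisection on a bracket [a, b].

    Requires f(a) and f(b) to have opposite signs. A Newton step is
    accepted only if it stays strictly inside the current bracket;
    otherwise a bisection step is taken, so progress is guaranteed.
    """
    fa = f(a)
    if fa * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    x = 0.5 * (a + b)
    for _ in range(max_iter):
        fx = f(x)
        if fx == 0.0 or (b - a) < tol:
            return x
        # Shrink the bracket around the root.
        if fa * fx < 0:
            b = x
        else:
            a, fa = x, fx
        dfx = df(x)
        # Try a Newton step; fall back to bisection if the derivative
        # is zero or the step would leave the bracket.
        if dfx != 0:
            x_new = x - fx / dfx
            if a < x_new < b:
                x = x_new
                continue
        x = 0.5 * (a + b)
    return x

# Example: the real cube root, where pure Newton diverges (x -> -2x),
# but the bracketed version still converges to the root at 0.
root = safeguarded_newton(
    lambda x: x ** (1/3) if x >= 0 else -((-x) ** (1/3)),
    lambda x: (1/3) * abs(x) ** (-2/3) if x != 0 else float("inf"),
    a=-1.0, b=2.0)
</syntaxhighlight>

Here the Newton step is rejected at nearly every iterate, so the routine degrades gracefully to bisection instead of diverging.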