In mathematics, the prime number theorem (PNT) describes the asymptotic distribution of the prime numbers among the positive integers. It formalizes the intuitive idea that primes become less common as they become larger by precisely quantifying the rate at which this occurs. The theorem was proved independently by Jacques Hadamard<ref name="Hadamard1896">Template:Citation</ref> and Charles Jean de la Vallée Poussin<ref name="de la Vallée Poussin1896">Template:Citation</ref> in 1896 using ideas introduced by Bernhard Riemann (in particular, the Riemann zeta function).

The first such distribution found is <math>\pi(N) \sim \frac{N}{\log(N)}</math>, where <math>\pi(N)</math> is the prime-counting function (the number of primes less than or equal to <math>N</math>) and <math>\log(N)</math> is the natural logarithm of <math>N</math>. This means that for large enough <math>N</math>, the probability that a random integer not greater than <math>N</math> is prime is very close to <math>1/\log(N)</math>. Consequently, a random integer with at most <math>2n</math> digits (for large enough <math>n</math>) is about half as likely to be prime as a random integer with at most <math>n</math> digits. For example, among the positive integers of at most 1000 digits, about one in 2300 is prime (<math>\log(10^{1000}) \approx 2302.6</math>), whereas among positive integers of at most 2000 digits, about one in 4600 is prime (<math>\log(10^{2000}) \approx 4605.2</math>). In other words, the average gap between consecutive prime numbers among the first <math>N</math> integers is roughly <math>\log(N)</math>.<ref>Template:Cite book</ref>

Statement

File:Prime number theorem ratio convergence.svg
Graph showing the ratio of the prime-counting function <math>\pi(x)</math> to two of its approximations, <math>x/\log x</math> and <math>\operatorname{Li}(x)</math>. As <math>x</math> increases (note that the <math>x</math> axis is logarithmic), both ratios tend towards 1. The ratio for <math>x/\log x</math> converges from above very slowly, while the ratio for <math>\operatorname{Li}(x)</math> converges more quickly from below.
File:Prime number theorem absolute error.svg
Log–log plot showing the absolute error of <math>x/\log x</math> and <math>\operatorname{Li}(x)</math>, two approximations to the prime-counting function <math>\pi(x)</math>. Unlike the ratio, the difference between <math>\pi(x)</math> and <math>x/\log x</math> increases without bound as <math>x</math> increases. On the other hand, <math>\operatorname{Li}(x) - \pi(x)</math> switches sign infinitely many times.

Let <math>\pi(x)</math> be the prime-counting function, defined to be the number of primes less than or equal to <math>x</math>, for any real number <math>x</math>. For example, <math>\pi(10) = 4</math> because there are four prime numbers (2, 3, 5 and 7) less than or equal to 10. The prime number theorem then states that <math>x/\log x</math> is a good approximation to <math>\pi(x)</math> (where log here means the natural logarithm), in the sense that the limit of the quotient of the two functions <math>\pi(x)</math> and <math>x/\log x</math> as <math>x</math> increases without bound is 1:

<math>\lim_{x\to\infty}\frac{\;\pi(x)\;}{\;\left[ \frac{x}{\log(x)}\right]\;} = 1,</math>

known as the asymptotic law of distribution of prime numbers. Using asymptotic notation this result can be restated as

<math>\pi(x)\sim \frac{x}{\log x}.</math>

This notation (and the theorem) does not say anything about the limit of the difference of the two functions as <math>x</math> increases without bound. Instead, the theorem states that <math>x/\log x</math> approximates <math>\pi(x)</math> in the sense that the relative error of this approximation approaches 0 as <math>x</math> increases without bound.
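The distinction between relative and absolute error can be seen numerically. The following is a minimal sketch, not taken from the source: the helper name `prime_count` is an assumption made for illustration, and it counts primes with a plain sieve of Eratosthenes.

```python
import math

def prime_count(x: int) -> int:
    """Count the primes <= x with a sieve of Eratosthenes (illustrative helper)."""
    if x < 2:
        return 0
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            # Cross out the multiples of p starting at p*p.
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)

# Compare pi(x) with x / log x: the ratio drifts toward 1,
# while the difference keeps growing.
for k in range(1, 7):
    x = 10**k
    pi_x = prime_count(x)
    approx = x / math.log(x)
    print(f"x=10^{k}  pi(x)={pi_x}  ratio={pi_x / approx:.4f}  "
          f"difference={pi_x - approx:.1f}")
```

In the printed table the ratio column decreases slowly toward 1 while the difference column increases, which is the behaviour the theorem allows.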

The prime number theorem is equivalent to the statement that the <math>n</math>th prime number <math>p_n</math> satisfies

<math>p_n \sim n\log(n),</math>

the asymptotic notation meaning, again, that the relative error of this approximation approaches 0 as <math>n</math> increases without bound. For example, the <math>2 \times 10^{17}</math>th prime number is 8512677386048191063,<ref>{{#invoke:citation/CS1|citation |CitationClass=web }}</ref> and <math>(2 \times 10^{17})\log(2 \times 10^{17})</math> rounds to 7967418752291744388, a relative error of about 6.4%.
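For much smaller indices the same comparison can be reproduced directly. This is an illustrative sketch only: `nth_prime` is a hypothetical helper, and the sieve limit relies on the classical bound <math>p_n < n(\log n + \log\log n)</math> for <math>n \ge 6</math>, which is an outside fact not stated in this article.

```python
import math

def nth_prime(n: int) -> int:
    """Return the n-th prime (1-indexed) by sieving up to a known upper bound."""
    if n < 6:
        return [2, 3, 5, 7, 11][n - 1]
    # Assumed bound: p_n < n (log n + log log n) for n >= 6.
    limit = int(n * (math.log(n) + math.log(math.log(n)))) + 1
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    count = 0
    for i, is_prime in enumerate(sieve):
        count += is_prime
        if count == n:
            return i
    raise ValueError("upper bound too small")

for n in (10**2, 10**3, 10**4, 10**5):
    p_n = nth_prime(n)
    estimate = n * math.log(n)
    print(n, p_n, round(estimate), f"relative error {(p_n - estimate) / p_n:.1%}")
```

The relative error shrinks only slowly as <math>n</math> grows, consistent with the 6.4% figure quoted for a far larger index.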

On the other hand, the following asymptotic relations are logically equivalent:<ref name=Apostol76>Template:Cite book</ref>

<math>\begin{align}
   \lim_{x\rightarrow \infty}\frac{\pi(x)\log x}{x}&=1,\text{ and}\\
   \lim_{x\rightarrow \infty}\frac{\pi(x)\log \pi(x)}{x}\,&=1.
\end{align} </math>

As outlined below, the prime number theorem is also equivalent to

<math>\lim_{x\to\infty} \frac{\vartheta (x)}x = \lim_{x\to\infty} \frac{\psi(x)}x=1,</math>

where <math>\vartheta(x)</math> and <math>\psi(x)</math> are the first and the second Chebyshev functions respectively, and to

<math>\lim_{x \to \infty} \frac{M(x)}{x}=0,</math>

where <math>M(x)=\sum_{n \leq x} \mu(n)</math> is the Mertens function.
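The Mertens-function form of the theorem is easy to explore numerically. The sketch below is an illustration under stated assumptions: the names `mobius_sieve` and `mertens` are invented here, and the Möbius values are produced by a standard linear sieve.

```python
def mobius_sieve(limit: int) -> list[int]:
    """Return [mu(0), mu(1), ..., mu(limit)] using a linear sieve (limit >= 1)."""
    mu = [0] * (limit + 1)
    mu[1] = 1
    primes = []
    is_composite = bytearray(limit + 1)
    for i in range(2, limit + 1):
        if not is_composite[i]:
            primes.append(i)
            mu[i] = -1
        for p in primes:
            if i * p > limit:
                break
            is_composite[i * p] = 1
            if i % p == 0:
                mu[i * p] = 0  # p^2 divides i*p, so mu vanishes
                break
            mu[i * p] = -mu[i]
    return mu

def mertens(x: int) -> int:
    """M(x) = sum of mu(n) for 1 <= n <= x."""
    return sum(mobius_sieve(x)[1 : x + 1])

for x in (10, 100, 1000, 10**4, 10**5):
    m = mertens(x)
    print(x, m, m / x)
```

The quotient <math>M(x)/x</math> printed in the last column is already very small for modest <math>x</math>, in line with the stated limit.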

History of the proof of the asymptotic law of prime numbers

Based on the tables by Anton Felkel and Jurij Vega, Adrien-Marie Legendre conjectured in 1797 or 1798 that <math>\pi(a)</math> is approximated by the function <math>a/(A \log a + B)</math>, where <math>A</math> and <math>B</math> are unspecified constants. In the second edition of his book on number theory (1808) he then made a more precise conjecture, with <math>A = 1</math> and <math>B = -1.08366</math>. Carl Friedrich Gauss considered the same question at age 15 or 16 "in the year 1792 or 1793", according to his own recollection in 1849.<ref>Template:Citation.</ref> In 1838 Peter Gustav Lejeune Dirichlet came up with his own approximating function, the logarithmic integral <math>\operatorname{li}(x)</math> (under the slightly different form of a series, which he communicated to Gauss). Both Legendre's and Dirichlet's formulas imply the same conjectured asymptotic equivalence of <math>\pi(x)</math> and <math>x/\log(x)</math> stated above, although it turned out that Dirichlet's approximation is considerably better if one considers the differences instead of quotients.

In two papers from 1848 and 1850, the Russian mathematician Pafnuty Chebyshev attempted to prove the asymptotic law of distribution of prime numbers. His work is notable for the use of the zeta function <math>\zeta(s)</math>, for real values of the argument <math>s</math>, as in works of Leonhard Euler, as early as 1737. Chebyshev's papers predated Riemann's celebrated memoir of 1859, and he succeeded in proving a slightly weaker form of the asymptotic law, namely, that if the limit as <math>x</math> goes to infinity of <math>\pi(x)/(x/\log(x))</math> exists at all, then it is necessarily equal to one.<ref>Template:Cite journal</ref> He was able to prove unconditionally that this ratio is bounded below by 0.92129 and above by 1.10555, for all sufficiently large <math>x</math>.<ref>Template:Cite journal</ref><ref name="Goldfeld Historical Perspective" /> Although Chebyshev's paper did not prove the prime number theorem, his estimates for <math>\pi(x)</math> were strong enough for him to prove Bertrand's postulate that there exists a prime number between <math>n</math> and <math>2n</math> for any integer <math>n \ge 2</math>.

An important paper concerning the distribution of prime numbers was Riemann's 1859 memoir "On the Number of Primes Less Than a Given Magnitude", the only paper he ever wrote on the subject. Riemann introduced new ideas into the subject, chiefly that the distribution of prime numbers is intimately connected with the zeros of the analytically extended Riemann zeta function of a complex variable. In particular, it is in this paper that the idea to apply methods of complex analysis to the study of the real function <math>\pi(x)</math> originates. Extending Riemann's ideas, two proofs of the asymptotic law of the distribution of prime numbers were found independently by Jacques Hadamard<ref name="Hadamard1896" /> and Charles Jean de la Vallée Poussin<ref name="de la Vallée Poussin1896" /> and appeared in the same year (1896). Both proofs used methods from complex analysis, establishing as a main step of the proof that the Riemann zeta function <math>\zeta(s)</math> is nonzero for all complex values of the variable <math>s</math> that have the form <math>s = 1 + it</math> with <math>t > 0</math>.<ref>Template:Cite book</ref>

During the 20th century, the theorem of Hadamard and de la Vallée Poussin also became known as the Prime Number Theorem. Several different proofs of it were found, including the "elementary" proofs of Atle Selberg<ref name="Selberg1949" /> and Paul Erdős<ref name="Erdős1949">Template:Citation</ref> (1949). Hadamard's and de la Vallée Poussin's original proofs are long and elaborate; later proofs introduced various simplifications through the use of Tauberian theorems but remained difficult to digest. A short proof was discovered in 1980 by the American mathematician Donald J. Newman.<ref>Template:Cite journal</ref><ref name=":0">Template:Cite journal</ref> Newman's proof is arguably the simplest known proof of the theorem, although it is non-elementary in the sense that it uses Cauchy's integral theorem from complex analysis.

Proof sketch

Here is a sketch of the proof referred to in one of Terence Tao's lectures.<ref>{{#invoke:citation/CS1|citation |CitationClass=web }}</ref> Like most proofs of the PNT, it starts out by reformulating the problem in terms of a less intuitive, but better-behaved, prime-counting function. The idea is to count the primes (or a related set such as the set of prime powers) with weights to arrive at a function with smoother asymptotic behavior. The most common such generalized counting function is the Chebyshev function <math>\psi(x)</math>, defined by

<math>\psi(x) = \sum_{k \geq 1} \sum_\overset{p^k \le x,}{\!\!\!\!p \text{ is prime}\!\!\!\!} \log p \; .</math>

This is sometimes written as

<math>\psi(x) = \sum_{n\le x} \Lambda(n) \; ,</math>

where <math>\Lambda(n)</math> is the von Mangoldt function, namely

<math>\Lambda(n) = \begin{cases} \log p & \text{ if } n = p^k \text{ for some prime } p \text{ and integer } k \ge 1, \\ 0 & \text{otherwise.} \end{cases}</math>

It is now relatively easy to check that the PNT is equivalent to the claim that

<math>\lim_{x\to\infty} \frac{\psi(x)}{x} = 1 \; .</math>

Indeed, this follows from the easy estimates

<math>\psi(x) = \sum_\overset{p\le x}{\!\!\!\! p \text{ is prime}\!\!\!\!} \log p \left\lfloor \frac{\log x}{\log p} \right\rfloor \le \sum_\overset{p\le x}{\!\!\!\! p \text{ is prime}\!\!\!\!} \log x = \pi(x)\log x</math>

and (using [[big O notation|big O notation]]) for any <math>\varepsilon > 0</math>,

<math>\psi(x) \ge \sum_{\!\!\!\!\overset{x^{1-\varepsilon}\le p\le x}{p \text{ is prime}}\!\!\!\!} \log p \ge \sum_{\!\!\!\!\overset{x^{1-\varepsilon}\le p\le x}{p \text{ is prime}}\!\!\!\!} (1-\varepsilon)\log x=(1-\varepsilon)\left(\pi(x)+O\left(x^{1-\varepsilon}\right)\right)\log x \; .</math>
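The upper estimate and the limit can be checked numerically on small inputs. The following is a rough sketch under stated assumptions: `primes_up_to` and `chebyshev_psi` are names chosen here for illustration, and <math>\psi(x)</math> is evaluated by summing <math>\log p</math> over every prime power <math>p^k \le x</math>, exactly as in the definition.

```python
import math

def primes_up_to(x: int) -> list[int]:
    """List the primes <= x with a sieve of Eratosthenes."""
    if x < 2:
        return []
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return [n for n in range(2, x + 1) if sieve[n]]

def chebyshev_psi(x: int) -> float:
    """psi(x): add log p once for each prime power p^k <= x."""
    total = 0.0
    for p in primes_up_to(x):
        power = p
        while power <= x:
            total += math.log(p)
            power *= p
    return total

for x in (10**3, 10**4, 10**5, 10**6):
    pi_x = len(primes_up_to(x))
    psi_x = chebyshev_psi(x)
    # Columns: x, psi(x)/x, and pi(x) log(x) / x (the upper estimate divided by x).
    print(x, round(psi_x / x, 4), round(pi_x * math.log(x) / x, 4))
```

Integer powers are used instead of <math>\lfloor \log x / \log p \rfloor</math> to avoid floating-point rounding at exact prime powers.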

The next step is to find a useful representation for <math>\psi(x)</math>. Let <math>\zeta(s)</math> be the Riemann zeta function. It can be shown that <math>\zeta(s)</math> is related to the von Mangoldt function <math>\Lambda(n)</math>, and hence to <math>\psi(x)</math>, via the relation

<math>-\frac{\zeta'(s)}{\zeta(s)} = \sum_{n = 1}^\infty \Lambda(n) \, n^{-s} \; .</math>

A delicate analysis of this equation and related properties of the zeta function, using the Mellin transform and Perron's formula, shows that for non-integer <math>x</math> the equation

<math>\psi(x) = x \; - \; \log(2\pi) \; - \!\!\!\! \sum\limits_{\rho :\, \zeta(\rho) = 0} \frac{x^\rho}{\rho}</math>

holds, where the sum is over all zeros (trivial and nontrivial) of the zeta function. This striking formula is one of the so-called explicit formulas of number theory, and is already suggestive of the result we wish to prove, since the term <math>x</math> (claimed to be the correct asymptotic order of <math>\psi(x)</math>) appears on the right-hand side, followed by (presumably) lower-order asymptotic terms.

The next step in the proof involves a study of the zeros of the zeta function. The trivial zeros −2, −4, −6, −8, ... can be handled separately:

<math>\sum_{n=1}^\infty \frac{1}{2n\,x^{2n}} = -\frac{1}{2}\log\left(1-\frac{1}{x^2}\right),</math>

which tends to 0 as <math>x</math> grows. The nontrivial zeros <math>\rho</math>, namely those in the critical strip <math>0 \le \operatorname{Re}(s) \le 1</math>, can potentially contribute terms of an asymptotic order comparable to the main term <math>x</math> if <math>\operatorname{Re}(\rho) = 1</math>, so we need to show that all zeros have real part strictly less than 1.
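As a worked check of the trivial-zero term (a short derivation added for illustration, valid for <math>x > 1</math>), substituting <math>\rho = -2n</math> into the term <math>-\sum_\rho x^\rho/\rho</math> of the explicit formula and using the series <math>\sum_{n \ge 1} u^n/n = -\log(1-u)</math> with <math>u = x^{-2}</math> gives

<math>-\sum_{n=1}^\infty \frac{x^{-2n}}{-2n} = \frac{1}{2}\sum_{n=1}^\infty \frac{\left(x^{-2}\right)^n}{n} = -\frac{1}{2}\log\left(1-\frac{1}{x^2}\right) = \frac{1}{2x^2} + O\left(x^{-4}\right),</math>

so the trivial zeros contribute only a quantity of order <math>x^{-2}</math> to <math>\psi(x)</math>.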

Non-vanishing on Re(s) = 1

To do this, we take for granted that <math>\zeta(s)</math> is meromorphic in the half-plane <math>\operatorname{Re}(s) > 0</math>, and is analytic there except for a simple pole at <math>s = 1</math>, and that there is a product formula

<math>\zeta(s)=\prod_p\frac{1}{1-p^{-s}} </math>

for <math>\operatorname{Re}(s) > 1</math>. This product formula follows from the existence of unique prime factorization of integers, and shows that <math>\zeta(s)</math> is never zero in this region, so that its logarithm is defined there and

<math>\log\zeta(s)=-\sum_p\log \left(1-p^{-s} \right)=\sum_{p,n}\frac{p^{-ns}}{n} \; .</math>

Write <math>s = x + iy</math>; then

<math>\big| \zeta(x+iy) \big| = \exp\left( \sum_{n,p} \frac{\cos ny\log p}{np^{nx}} \right) \; .</math>

Now observe the identity

<math> 3 + 4 \cos \phi+ \cos 2 \phi = 2 ( 1 + \cos \phi )^2\ge 0 \; ,</math>

so that

<math>\left| \zeta(x)^3 \zeta(x+iy)^4 \zeta(x+2iy) \right| = \exp\left( \sum_{n,p} \frac{3 + 4 \cos(ny\log p) + \cos( 2 n y \log p )}{np^{nx}} \right) \ge 1</math>

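for all <math>x > 1</math>. Suppose now that <math>\zeta(1 + iy) = 0</math>. Certainly <math>y</math> is not zero, since <math>\zeta(s)</math> has a simple pole at <math>s = 1</math>. Suppose that <math>x > 1</math> and let <math>x</math> tend to 1 from above. Since <math>\zeta(s)</math> has a simple pole at <math>s = 1</math>, the factor <math>\zeta(x)^3</math> grows like <math>(x-1)^{-3}</math>, whereas <math>\zeta(x+iy)^4</math> vanishes at least as fast as <math>(x-1)^4</math> and <math>\zeta(x + 2iy)</math> stays analytic; hence the left-hand side in the previous inequality tends to 0, a contradiction.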
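The inequality rests on every term of the exponent being non-negative, so any truncation of the double sum is already non-negative. The sketch below illustrates this numerically; it is not part of the proof. The function names and the truncation parameters `prime_limit` and `max_power` are assumptions introduced here.

```python
import math

def primes_up_to(limit: int) -> list[int]:
    """List the primes <= limit with a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if sieve[n]]

def truncated_exponent(x: float, y: float,
                       prime_limit: int = 1000, max_power: int = 20) -> float:
    """Partial sum of (3 + 4cos(phi) + cos(2phi)) / (n p^(n x)), phi = n y log p."""
    total = 0.0
    for p in primes_up_to(prime_limit):
        log_p = math.log(p)
        for n in range(1, max_power + 1):
            phi = n * y * log_p
            total += (3 + 4 * math.cos(phi) + math.cos(2 * phi)) / (n * p ** (n * x))
    return total

# The trigonometric identity, checked on a grid of angles.
for k in range(100):
    phi = 0.1 * k
    lhs = 3 + 4 * math.cos(phi) + math.cos(2 * phi)
    assert abs(lhs - 2 * (1 + math.cos(phi)) ** 2) < 1e-12

for y in (0.5, 1.0, 14.134725, 21.022):
    print(y, truncated_exponent(1.1, y))
```

Because the truncated exponent is non-negative, its exponential is at least 1, matching the displayed lower bound for <math>x > 1</math>.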

Finally, we can conclude that the PNT is heuristically true. To rigorously complete the proof there are still serious technicalities to overcome, due to the fact that the summation over zeta zeros in the explicit formula for <math>\psi(x)</math> does not converge absolutely but only conditionally and in a "principal value" sense. There are several ways around this problem but many of them require rather delicate complex-analytic estimates. Edwards's book<ref>Template:Cite book</ref> provides the details. Another method is to use Ikehara's Tauberian theorem, though this theorem is itself quite hard to prove. D. J. Newman observed that the full strength of Ikehara's theorem is not needed for the prime number theorem, and one can get away with a special case that is much easier to prove.

Newman's proof of the prime number theorem

D. J. Newman gives a quick proof of the prime number theorem (PNT). The proof is "non-elementary" by virtue of relying on complex analysis, but uses only elementary techniques from a first course in the subject: Cauchy's integral formula, Cauchy's integral theorem and estimates of complex integrals. Here is a brief sketch of this proof. See <ref name=":0" /> for the complete details.

The proof uses the same preliminaries as in the previous section except instead of the function <math display="inline">\psi</math>, the Chebyshev function <math display="inline">\vartheta(x) = \sum_{p\le x} \log p</math> is used, which is obtained by dropping some of the terms from the series for <math display="inline">\psi</math>. Similar to the argument in the previous proof based on Tao's lecture, we can show that <math>\vartheta(x) \le \pi(x)\log x</math>, and <math>\vartheta(x) \ge (1-\varepsilon)\left(\pi(x) + O\left(x^{1-\varepsilon}\right)\right)\log x</math> for any <math>0 < \varepsilon < 1</math>. Thus, the PNT is equivalent to <math>\lim _{x \to \infty} \vartheta(x)/x = 1</math>. Likewise instead of <math> - \frac{\zeta '(s)}{\zeta(s)} </math> the function <math> \Phi(s) = \sum_{p} \log p\,\, p^{-s} </math> (a sum over all primes <math>p</math>) is used, which is obtained by dropping some terms in the series for <math> - \frac{\zeta '(s)}{\zeta(s)} </math>. The functions <math> \Phi(s) </math> and <math> -\zeta'(s)/\zeta(s) </math> differ by a function holomorphic on <math>\Re s = 1</math>. Since, as was shown in the previous section, <math>\zeta(s)</math> has no zeroes on the line <math>\Re s = 1</math>, <math> \Phi(s) - \frac 1{s-1} </math> has no singularities on <math>\Re s = 1</math>.

One further piece of information needed in Newman's proof, and which is the key to the estimates in his simple method, is that <math>\vartheta(x)/x</math> is bounded. This is proved using an ingenious and easy method due to Chebyshev.
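The boundedness of <math>\vartheta(x)/x</math> can be observed directly on small inputs. This is only an illustrative sketch: `chebyshev_theta` is a name chosen here, and the comparison constant <math>\log 4</math> comes from the classical Chebyshev-type estimate <math>\prod_{p \le n} p < 4^n</math>, which is an outside fact not stated in this article.

```python
import math

def chebyshev_theta(x: int) -> float:
    """theta(x) = sum of log p over primes p <= x."""
    if x < 2:
        return 0.0
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(math.log(n) for n in range(2, x + 1) if sieve[n])

for x in (10, 100, 1000, 10**4, 10**5, 10**6):
    ratio = chebyshev_theta(x) / x
    print(x, round(ratio, 4), ratio < math.log(4))
```

The ratios stay below <math>\log 4 \approx 1.386</math> and creep toward 1, which is the limit Newman's argument establishes.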

Integration by parts shows how <math>\vartheta(x)</math> and <math>\Phi(s)</math> are related. For <math>\Re s > 1</math>,

<math>\Phi(s) = \int _1^\infty x^{-s} d\vartheta(x) = s\int_1^\infty \vartheta(x)x^{-s-1}\,dx = s \int_0^\infty \vartheta(e^t) e^{-st} \, dt. </math>

Newman's method proves the PNT by showing the integral

<math>I = \int_0 ^\infty \left( \frac{\vartheta(e^t)}{e^t} -1 \right) \, dt</math>

converges, and therefore the integrand goes to zero as <math>t \to \infty</math>, which is the PNT. In general, the convergence of the improper integral does not imply that the integrand goes to zero at infinity, since it may oscillate, but since <math>\vartheta</math> is increasing, it is easy to show in this case.

To show the convergence of <math> I </math>, for <math>\Re z > 0</math> let

<math> g_T(z) = \int_0^T f(t) e^{-zt}\, dt </math> and <math> g(z) = \int_0^\infty f(t) e^{-zt}\, dt </math> where <math> f(t) = \frac {\vartheta(e^t)}{e^t} -1 </math>

then

<math> \lim_{T \to \infty} g_T(z) = g(z) = \frac{\Phi(s)}{s} - \frac 1 {s-1} \quad \quad \text{where} \quad z = s -1 </math>

which is equal to a function holomorphic on the line <math>\Re z = 0</math> .

The convergence of the integral <math> I </math>, and thus the PNT, is proved by showing that <math>\lim_{T \to \infty} g_T(0) = g(0)</math>. This involves a change in the order of limits, since it can be written <math display="inline"> \lim_{T \to \infty} \lim_{z \to 0} g_T(z) = \lim_{z \to 0} \lim_{T \to \infty}g_T(z) </math>, and the result is therefore classified as a Tauberian theorem.

The difference <math>g(0) - g_T(0)</math> is expressed using Cauchy's integral formula and then shown to be small for <math> T </math> large by estimating the integrand. Fix <math>R>0</math> and <math>\delta >0</math> such that <math>g(z)</math> is holomorphic in the region where <math> |z| \le R \text{ and } \Re z \ge - \delta</math>, and let <math>C</math> be the boundary of this region. Since 0 is in the interior of the region, Cauchy's integral formula gives

<math> g(0) - g_T(0) = \frac 1 {2 \pi i }\int_C \left( g(z) - g_T(z) \right ) \frac {dz} z = \frac 1 {2 \pi i }\int_C \left( g(z) - g_T(z) \right ) F(z)\frac {dz} z </math>

where <math> F(z) = e^{zT}\left( 1 + \frac {z^2}{R^2}\right) </math> is the factor introduced by Newman, which does not change the integral since <math>F</math> is entire and <math>F(0) = 1</math>.

To estimate the integral, break the contour <math> C </math> into two parts, <math> C = C_+ + C_- </math> where <math>C_+ = C \cap \left \{ z \, \vert \, \Re z > 0 \right \}</math> and <math>C_- = C \cap \left \{ z \, \vert \, \Re z \le 0 \right \}</math>. Then <math>g(0)- g_T(0) = \int_{C_+}\int_T^\infty H(t,z) dt dz - \int_{C_-}\int_0^T H(t,z) dt dz + \int_{C_-}g(z)F(z)\frac {dz}{2\pi i z}</math> where <math>H(t,z) = f(t)e^{-tz}F(z)/(2 \pi i z)</math>. Since <math>\vartheta(x)/x</math>, and hence <math> f(t) </math>, is bounded, let <math>B</math> be an upper bound for the absolute value of <math>f(t)</math>. This bound together with the estimate <math> |F| \le 2 \exp(T \Re z)|\Re z|/R </math> for <math> |z| = R </math> gives that the first integral in absolute value is <math>\le B/R</math>. The integrand over <math>C_-</math> in the second integral is entire, so by Cauchy's integral theorem, the contour <math>C_-</math> can be modified to a semicircle of radius <math>R</math> in the left half-plane without changing the integral, and the same argument as for the first integral shows that the absolute value of the second integral is <math>\le B/R</math>. Finally, letting <math>T \to \infty</math>, the third integral goes to zero since <math>e^{zT}</math> and hence <math>F</math> goes to zero on the contour. Combining the two estimates and the limit, we get

<math> \limsup_{T \to \infty }|g(0) - g_T(0) | \le \frac {2 B} R. </math>

This holds for any <math>R</math> so <math>\lim_{T \to \infty} g_T(0) = g(0)</math>, and the PNT follows.

Prime-counting function in terms of the logarithmic integral

In a handwritten note on a reprint of his 1838 paper "Sur l'usage des séries infinies dans la théorie des nombres", which he mailed to Gauss, Dirichlet conjectured (under a slightly different form appealing to a series rather than an integral) that an even better approximation to <math>\pi(x)</math> is given by the offset logarithmic integral function <math>\operatorname{Li}(x)</math>, defined by

<math> \operatorname{Li}(x) = \int_2^x \frac{dt}{\log t} = \operatorname{li}(x) - \operatorname{li}(2). </math>

Indeed, this integral is strongly suggestive of the notion that the "density" of primes around <math>t</math> should be <math>1/\log t</math>. This function is related to the logarithm by the asymptotic expansion

<math> \operatorname{Li}(x) \sim \frac{x}{\log x} \sum_{k=0}^\infty \frac{k!}{(\log x)^k} = \frac{x}{\log x} + \frac{x}{(\log x)^2} + \frac{2x}{(\log x)^3} + \cdots </math>
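To see how the integral compares with the first terms of the expansion, <math>\operatorname{Li}(x)</math> can be approximated by numerical quadrature. The sketch below is an illustration under stated assumptions: `offset_li` is a hypothetical helper, and it applies composite Simpson's rule after the substitution <math>t = e^u</math>, which turns the integrand into <math>e^u/u</math> on <math>[\log 2, \log x]</math>.

```python
import math

def offset_li(x: float, steps: int = 20000) -> float:
    """Approximate Li(x) = integral from 2 to x of dt / log t (Simpson's rule)."""
    a, b = math.log(2.0), math.log(x)
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / steps
    def f(u: float) -> float:
        return math.exp(u) / u
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

x = 10**6
log_x = math.log(x)
print("Li(x)              ", round(offset_li(x), 3))
print("x/log x            ", round(x / log_x, 3))
print("first three terms  ",
      round(x / log_x * (1 + 1 / log_x + 2 / log_x**2), 3))
```

For comparison, <math>\pi(10^6) = 78498</math>; the integral lands much closer to this value than <math>x/\log x</math> does, and the expansion is asymptotic rather than convergent, so adding terms helps only up to a point for a fixed <math>x</math>.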

So, the prime number theorem can also be written as <math>\pi(x) \sim \operatorname{Li}(x)</math>. In fact, in another paper<ref name="de la Vallée Poussin1899">Template:Citation</ref> in 1899 de la Vallée Poussin proved that

<math> \pi(x) = \operatorname{Li} (x) + O \left(x e^{-a\sqrt{\log x}}\right) \quad\text{as } x \to \infty</math>

for some positive constant <math>a</math>, where <math>O(\ldots)</math> is the [[big O notation|big O notation]]. This has been improved to

<math>\pi(x) = \operatorname{li} (x) + O \left(x \exp \left( -\frac{A(\log x)^\frac35}{(\log \log x)^\frac15} \right) \right)</math> where <math>A = 0.2098</math>.<ref name="Ford">Template:Cite journal</ref>

In 2016, Timothy Trudgian proved an explicit upper bound for the difference between <math>\pi(x)</math> and <math>\operatorname{li}(x)</math>:

<math>\big| \pi(x) - \operatorname{li}(x) \big| \le 0.2795 \frac{x}{(\log x)^{3/4}} \exp \left( -\sqrt{ \frac{\log x}{6.455} } \right)</math>

for <math>x \ge 229</math>.<ref>Template:Cite journal</ref>

The connection between the Riemann zeta function and <math>\pi(x)</math> is one reason the Riemann hypothesis has considerable importance in number theory: if established, it would yield a far better estimate of the error involved in the prime number theorem than is available today. More specifically, Helge von Koch showed in 1901<ref>Template:Cite journal</ref> that if the Riemann hypothesis is true, the error term in the above relation can be improved to

<math> \pi(x) = \operatorname{Li} (x) + O\left(\sqrt x \log x\right) </math>

(this last estimate is in fact equivalent to the Riemann hypothesis). The constant involved in the big O notation was estimated in 1976 by Lowell Schoenfeld,<ref>Template:Cite journal</ref> assuming the Riemann hypothesis:

<math>\big|\pi(x) - \operatorname{li}(x)\big| < \frac{\sqrt x \log x}{8\pi}</math>

for all <math>x \ge 2657</math>. He also derived a similar bound for the Chebyshev prime-counting function <math>\psi</math>:

<math>\big|\psi(x) - x\big| < \frac{\sqrt x (\log x)^2 }{8\pi}</math>

for all <math>x \ge 73.2</math>. This latter bound has been shown to express a variance to mean power law (when regarded as a random function over the integers) and <math>1/f</math> noise and to also correspond to the Tweedie compound Poisson distribution. (The Tweedie distributions represent a family of scale invariant distributions that serve as foci of convergence for a generalization of the central limit theorem.<ref>Template:Cite journal</ref>) J. E. Littlewood also derived a lower bound, assuming the Riemann hypothesis:<ref name="Littlewood1914">Template:Citation</ref><ref>Template:Cite journal</ref><ref> Template:Cite book</ref>

<math>\big|\pi(x) - \operatorname{li}(x)\big| = \Omega \left(\sqrt x\frac{\log\log\log x}{\log x} \right)</math>

The logarithmic integral <math>\operatorname{li}(x)</math> is larger than <math>\pi(x)</math> for "small" values of <math>x</math>. This is because it is (in some sense) counting not primes, but prime powers, where a power <math>p^n</math> of a prime <math>p</math> is counted as <math>\tfrac{1}{n}</math> of a prime. This suggests that <math>\operatorname{li}(x)</math> should usually be larger than <math>\pi(x)</math> by roughly <math>\ \tfrac{1}{2} \operatorname{li}(\sqrt{x})\ ,</math> and in particular should always be larger than <math>\pi(x)</math>. However, in 1914, Littlewood proved that <math>\ \pi(x) - \operatorname{li}(x)\ </math> changes sign infinitely often.<ref name="Littlewood1914"/> The first value of <math>x</math> where <math>\pi(x)</math> exceeds <math>\operatorname{li}(x)</math> is probably around <math>x \sim 10^{316}</math>; see the article on Skewes' number for more details. (On the other hand, the offset logarithmic integral <math>\operatorname{Li}(x)</math> is smaller than <math>\pi(x)</math> already for <math>x = 2</math>; indeed, <math>\operatorname{Li}(2) = 0</math>, while <math>\pi(2) = 1</math>.)

Elementary proofs

In the first half of the twentieth century, some mathematicians (notably G. H. Hardy) believed that there exists a hierarchy of proof methods in mathematics depending on what sorts of numbers (integers, reals, complex) a proof requires, and that the prime number theorem (PNT) is a "deep" theorem by virtue of requiring complex analysis.<ref name="Goldfeld Historical Perspective">Template:Cite book</ref> This belief was somewhat shaken by a proof of the PNT based on Wiener's Tauberian theorem, though Wiener's proof ultimately relies on properties of the Riemann zeta function on the line <math>\operatorname{Re}(s)=1</math>, where complex analysis must be used.

In March 1948, Atle Selberg established, by "elementary" means, the asymptotic formula

<math>\vartheta ( x )\log ( x ) + \sum\limits_{p \le x} {\log ( p )}\ \vartheta \left( {\frac{x}{p}} \right) = 2x\log ( x ) + O( x )</math>

where

<math>\vartheta ( x ) = \sum\limits_{p \le x} {\log ( p )}</math>

for primes <math>p</math>.<ref name="Selberg1949">Template:Citation</ref> By July of that year, Selberg and Paul Erdős<ref name="Erdős1949" /> had each obtained elementary proofs of the PNT, both using Selberg's asymptotic formula as a starting point.<ref name="Goldfeld Historical Perspective"/><ref name=interview>Template:Cite journal</ref> These proofs effectively laid to rest the notion that the PNT was "deep" in that sense, and showed that technically "elementary" methods were more powerful than had been believed to be the case. On the history of the elementary proofs of the PNT, including the Erdős–Selberg priority dispute, see an article by Dorian Goldfeld.<ref name="Goldfeld Historical Perspective" />

There is some debate about the significance of Erdős and Selberg's result. There is no rigorous and widely accepted definition of the notion of elementary proof in number theory, so it is not clear exactly in what sense their proof is "elementary". Although it does not use complex analysis, it is in fact much more technical than the standard proof of PNT. One possible definition of an "elementary" proof is "one that can be carried out in first-order Peano arithmetic." There are number-theoretic statements (for example, the Paris–Harrington theorem) provable using second-order but not first-order methods, but such theorems are rare to date. Erdős and Selberg's proof can certainly be formalized in Peano arithmetic, and in 1994, Charalambos Cornaros and Costas Dimitracopoulos proved that their proof can be formalized in a very weak fragment of PA, namely <math>I\Delta_0 + \exp</math>.<ref>Template:Cite journal</ref> However, this does not address the question of whether or not the standard proof of PNT can be formalized in PA.

A more recent "elementary" proof of the prime number theorem uses ergodic theory, due to Florian Richter.<ref>Bergelson, V., & Richter, F. K. (2022). Dynamical generalizations of the prime number theorem and disjointness of additive and multiplicative semigroup actions. Duke Mathematical Journal, 171(15), 3133-3200.</ref> The prime number theorem is obtained there in an equivalent form that the Cesàro sum of the values of the Liouville function is zero. The Liouville function is <math>(-1)^{\omega(n)}</math> where <math>\omega(n)</math> is the number of prime factors, with multiplicity, of the integer <math>n</math>. Bergelson and Richter (2022) then obtain this form of the prime number theorem from an ergodic theorem which they prove:

Let <math>X</math> be a compact metric space, <math>T</math> a continuous self-map of <math>X</math>, and <math>\mu</math> a <math>T</math>-invariant Borel probability measure for which <math>T</math> is uniquely ergodic. Then, for every <math>f\in C(X)</math>,

<math display="block">\tfrac1N\sum_{n=1}^Nf(T^{\omega(n)}x)\to \int_Xf\,d\mu,\quad\forall x\in X.</math>

This ergodic theorem can also be used to give "soft" proofs of results related to the prime number theorem, such as the Pillai–Selberg theorem and Erdős–Delange theorem.
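The equivalent form used in this approach, that the averages of the Liouville function tend to zero, can be explored numerically. The following is a small sketch, not drawn from the cited paper: `liouville_values` is a name chosen here, and the values are built from a smallest-prime-factor sieve so that each prime factor is counted with multiplicity.

```python
import math

def liouville_values(limit: int) -> list[int]:
    """Return lam with lam[n] = (-1)^(number of prime factors of n, with multiplicity)."""
    spf = list(range(limit + 1))  # spf[n] will hold the smallest prime factor of n
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    lam = [1] * (limit + 1)
    for n in range(2, limit + 1):
        # Removing one prime factor flips the sign.
        lam[n] = -lam[n // spf[n]]
    return lam

lam = liouville_values(10**6)
for N in (10, 10**2, 10**3, 10**4, 10**5, 10**6):
    partial = sum(lam[1 : N + 1])
    print(N, partial, partial / N)
```

The averages in the last column shrink toward 0, which is the Cesàro statement referred to above; the index 0 of the returned list is unused.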

Computer verifications

In 2005, Avigad et al. employed the Isabelle theorem prover to devise a computer-verified variant of the Erdős–Selberg proof of the PNT.<ref name=Avigad>Template:Cite journal</ref> This was the first machine-verified proof of the PNT. Avigad chose to formalize the Erdős–Selberg proof rather than an analytic one because while Isabelle's library at the time could implement the notions of limit, derivative, and transcendental function, it had almost no theory of integration to speak of.<ref name=Avigad/>

In 2009, John Harrison employed HOL Light to formalize a proof employing complex analysis.<ref> Template:Cite journal</ref> By developing the necessary analytic machinery, including the Cauchy integral formula, Harrison was able to formalize "a direct, modern and elegant proof instead of the more involved 'elementary' Erdős–Selberg argument".

Prime number theorem for arithmetic progressions

Let <math>\pi_{d,a}(x)</math> denote the number of primes in the arithmetic progression <math>a, a + d, a + 2d, a + 3d, \ldots</math> that are less than <math>x</math>. Dirichlet and Legendre conjectured, and de la Vallée Poussin proved, that if <math>a</math> and <math>d</math> are coprime, then

<math>\pi_{d,a}(x) \sim \frac{ \operatorname{Li}(x) }{ \varphi(d) } \ ,</math>

where <math>\varphi</math> is Euler's totient function. In other words, the primes are distributed evenly among the residue classes <math>[a]</math> modulo <math>d</math> with <math>\gcd(a, d) = 1</math>. This is stronger than Dirichlet's theorem on arithmetic progressions (which only states that there is an infinity of primes in each class) and can be proved using methods similar to those used by Newman for his proof of the prime number theorem.<ref>{{#invoke:citation/CS1|citation |CitationClass=web }}</ref>

The Siegel–Walfisz theorem gives a good estimate for the distribution of primes in residue classes.

Bennett et al.<ref>Template:Cite journal</ref> proved the following estimate that has explicit constants <math>A</math> and <math>B</math> (Theorem 1.3): Let <math>d \ge 3</math> be an integer and let <math>a</math> be an integer that is coprime to <math>d</math>. Then there are positive constants <math>A</math> and <math>B</math> such that

<math> \left | \pi_{d,a}(x) - \frac{\ \operatorname{Li}(x)\ }{\ \varphi(d)\ } \right | < \frac{A\ x}{\ (\log x)^2\ } \quad \text{ for all } \quad x \ge B\ ,</math>

where

<math> A = \frac{1}{\ 840\ } \quad \text{ if } \quad 3 \leq d \leq 10^4 \quad \text{ and } \quad A = \frac{1}{\ 160\ } \quad \text{ if } \quad d > 10^4 ~,</math>

and

<math>B = 8 \cdot 10^9 \quad \text{ if } \quad 3 \leq d \leq 10^5 \quad \text{ and } \quad B = \exp(\ 0.03\ \sqrt{d\ }\ (\log{d})^3 \ ) \quad \text{ if } \quad d > 10^5\ .</math>
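The even distribution among residue classes can be observed on small ranges. This is a minimal sketch with assumed names (`primes_by_residue` is hypothetical); it counts primes up to and including <math>x</math> in each class coprime to <math>d</math> and compares the counts with <math>\pi(x)/\varphi(d)</math> rather than with <math>\operatorname{Li}(x)/\varphi(d)</math>, to keep the example short.

```python
import math

def primes_by_residue(x: int, d: int) -> dict[int, int]:
    """Count primes p <= x in each residue class a mod d with gcd(a, d) = 1."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    counts = {a: 0 for a in range(d) if math.gcd(a, d) == 1}
    for n in range(2, x + 1):
        if sieve[n] and n % d in counts:
            counts[n % d] += 1
    return counts

x, d = 10**5, 12
counts = primes_by_residue(x, d)
expected = sum(counts.values()) / len(counts)  # about pi(x) / phi(d)
for a, c in sorted(counts.items()):
    print(f"a={a:2d}  count={c}  count/expected={c / expected:.4f}")
```

Primes dividing <math>d</math> (here 2 and 3) fall outside the coprime classes and are not counted.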

Prime number race

File:Chebyshev bias.svg
Plot of the function <math>\ \pi(x;4,3)-\pi(x;4,1) \ </math>

Although we have in particular

<math>\pi_{4,1}(x) \sim \pi_{4,3}(x) \ ,</math>

empirically the primes congruent to 3 modulo 4 are more numerous and are nearly always ahead in this "prime number race"; the first reversal occurs at <math>x = 26861</math>.<ref name="Granville Martin MAA"> Template:Cite journal</ref> However, Littlewood showed in 1914<ref name="Granville Martin MAA"/> that there are infinitely many sign changes for the function

<math>\pi_{4,1}(x) - \pi_{4,3}(x) ~,</math>

so the lead in the race switches back and forth infinitely many times. The phenomenon that Template:Math is ahead most of the time is called Chebyshev's bias. The prime number race generalizes to other moduli and is the subject of much research; Pál Turán asked whether it is always the case that Template:Math and Template:Math change places when Template:Mvar and Template:Mvar are coprime to Template:Mvar.<ref name=GuyA4>Template:Cite book This book uses the notation Template:Math where this article uses Template:Math for the number of primes congruent to Template:Mvar modulo Template:Mvar.</ref> Granville and Martin give a thorough exposition and survey.<ref name="Granville Martin MAA" />
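The race modulo 4 can be reproduced numerically. The following Python sketch is illustrative only and is not drawn from the cited sources; the function names `primes_up_to` and `first_lead_change` are hypothetical. It sieves the primes up to a bound, keeps running counts of the two residue classes, and reports the first prime at which the class 1 modulo 4 takes the lead.

```python
import math


def primes_up_to(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out every multiple of p starting at p*p.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if is_prime[n]]


def first_lead_change(limit):
    """Return the smallest x <= limit with pi(x;4,1) > pi(x;4,3), or None."""
    count_1 = count_3 = 0
    for p in primes_up_to(limit):
        if p % 4 == 1:
            count_1 += 1
        elif p % 4 == 3:
            count_3 += 1
        # The lead can only change hands at a prime, so checking here suffices.
        if count_1 > count_3:
            return p
    return None


print(first_lead_change(30_000))
```

Because the counts change only at primes, it is enough to compare them after each prime rather than at every integer.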

File:Prime race of last digit up to 10000.png
Graph of the number of primes ending in 1, 3, 7, and 9 up to Template:Math for Template:Math

Another example is the distribution of the last digit of prime numbers. Except for 2 and 5, all prime numbers end in 1, 3, 7, or 9. Dirichlet's theorem states that asymptotically, 25% of all primes end in each of these four digits. However, empirical evidence shows that, for a given limit, there tend to be slightly more primes that end in 3 or 7 than end in 1 or 9 (a generalization of Chebyshev's bias).<ref>Template:Cite journal</ref> This follows from the fact that 1 and 9 are quadratic residues modulo 10, whereas 3 and 7 are quadratic nonresidues modulo 10.
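The last-digit counts can be tabulated directly for a modest bound. The Python sketch below is an illustration under the assumption that a bound of 10,000 is adequate for a first look; the helper names `primes_up_to` and `last_digit_counts` are hypothetical. It prints, for each final digit, the number of primes and their share of the total.

```python
import math
from collections import Counter


def primes_up_to(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if is_prime[n]]


def last_digit_counts(limit):
    """Count primes <= limit by final decimal digit, ignoring 2 and 5."""
    return Counter(p % 10 for p in primes_up_to(limit) if p not in (2, 5))


counts = last_digit_counts(10_000)
total = sum(counts.values())
for digit in (1, 3, 7, 9):
    # Each share should be close to, but not exactly, one quarter.
    print(digit, counts[digit], f"{counts[digit] / total:.4f}")
```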

Non-asymptotic bounds on the prime-counting function

The prime number theorem is an asymptotic result. It gives an ineffective bound on Template:Math as a direct consequence of the definition of the limit: for all Template:Math, there is an Template:Mvar such that for all Template:Math,

<math> (1-\varepsilon)\frac {x}{\log x} \; < \; \pi(x) \; < \; (1+\varepsilon)\frac {x}{\log x} \; .</math>

However, better bounds on Template:Math are known, for instance Pierre Dusart's

<math> \frac{x}{\log x}\left(1+\frac{1}{\log x}\right) \; < \; \pi(x) \; < \; \frac{x}{\log x}\left(1+\frac{1}{\log x}+\frac{2.51}{(\log x)^2}\right) \; .</math>

The first inequality holds for all Template:Math and the second one for Template:Math.<ref>Template:Cite thesis</ref>

The proof by de la Vallée Poussin implies the following bound: For every Template:Math, there is an Template:Mvar such that for all Template:Math,

<math>\frac {x}{\log x - (1 - \varepsilon)} \; < \; \pi(x) \; < \; \frac {x}{\log x - (1+\varepsilon)} \; .</math>

The value Template:Math gives a weak but sometimes useful bound for Template:Math:<ref name="rosser">Template:Cite journal</ref>

<math> \frac {x}{\log x + 2} \; < \; \pi(x) \; < \; \frac {x}{\log x - 4} \; .</math>

In Pierre Dusart's thesis there are stronger versions of this type of inequality that are valid for larger Template:Mvar. Later, in 2010, Dusart proved:<ref>Template:Cite arXiv</ref>

<math>\begin{align}
\frac {x}{\log x - 1} \; &< \; \pi(x) &&\text{ for } x \ge 5393 \;, \text{ and }\\
\pi(x) &< \; \frac {x} {\log x - 1.1} &&\text{ for } x \ge 60184 \; .
\end{align}</math>

Note that the first of these obsoletes the Template:Math condition on the lower bound.
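Explicit bounds of this kind can be spot-checked by machine. The Python sketch below is an illustration, not a proof: it tests the two 2010 inequalities only at integer values of the argument up to an arbitrarily chosen limit, and the function names `prime_counts_up_to` and `dusart_2010_holds` are hypothetical.

```python
import math


def prime_counts_up_to(limit):
    """Return a list pi where pi[x] is the number of primes <= x."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    pi, running = [0] * (limit + 1), 0
    for x in range(limit + 1):
        running += is_prime[x]
        pi[x] = running
    return pi


def dusart_2010_holds(limit):
    """Check both inequalities at every integer x in their stated ranges."""
    pi = prime_counts_up_to(limit)
    lower_ok = all(
        x / (math.log(x) - 1) < pi[x] for x in range(5393, limit + 1)
    )
    upper_ok = all(
        pi[x] < x / (math.log(x) - 1.1) for x in range(60184, limit + 1)
    )
    return lower_ok, upper_ok


print(dusart_2010_holds(200_000))
```

A check at integer points only samples the inequalities; the statements themselves concern all real values of the argument in the stated ranges.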

Approximations for the nth prime number

As a consequence of the prime number theorem, one gets an asymptotic expression for the Template:Mvarth prime number, denoted by Template:Math:

<math>p_n \sim n \log n.</math><ref>{{#invoke:citation/CS1|citation |CitationClass=web }}</ref>

A better approximation is by Cesàro (1894):<ref>Template:Cite journal</ref>

<math> p_n = n B_2(\log n),\text{ where}</math>
<math> B_2(x) = x + \log x - 1 + \frac{\log x - 2}{x} - \frac{(\log x)^2 - 6 \log x + 11}{2x^2} + o\left(\frac 1{x^2}\right).</math>

Again considering the Template:Valth prime number Template:Val, assuming the trailing error term is zero gives an estimate of Template:Val; the first 5 digits match and relative error is about 0.46 parts per million.
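The same calculation can be repeated for a much smaller index where the exact prime is easy to obtain by sieving. The Python sketch below is illustrative; the index 100,000 is an arbitrary choice, the names `nth_prime` and `cesaro_estimate` are hypothetical, and the sieve limit relies on the known inequality <math>p_n < n(\log n + \log\log n)</math> for <math>n \ge 6</math>.

```python
import math


def nth_prime(n):
    """Return the n-th prime (1-indexed) by sieving up to a safe limit."""
    # For n >= 6, p_n < n (log n + log log n); 15 covers the first five primes.
    limit = 15 if n < 6 else int(n * (math.log(n) + math.log(math.log(n)))) + 1
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    primes = [k for k in range(2, limit + 1) if is_prime[k]]
    return primes[n - 1]


def cesaro_estimate(n):
    """Evaluate n * B_2(log n) with the o(1/x^2) error term dropped."""
    x = math.log(n)
    lx = math.log(x)
    b2 = x + lx - 1 + (lx - 2) / x - (lx * lx - 6 * lx + 11) / (2 * x * x)
    return n * b2


n = 100_000
exact, approx = nth_prime(n), cesaro_estimate(n)
print(exact, round(approx), abs(approx - exact) / exact)
```

At this small index the agreement is considerably weaker than for very large indices, as expected from an asymptotic expansion.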

Cipolla (1902)<ref>Template:Cite journal</ref><ref name=Toulisse13>Template:Cite journal</ref> showed that these are the leading terms of an infinite series which may be truncated at arbitrary degree, with

<math>B_k(x) = x + \log x - 1 - \sum_{i=1}^k (-1)^i \frac{P_i(\log x)}{ix^i} + O\left(\frac{(\log x)^{k+1}}{x^{k+1}}\right),</math>

where each Template:Math is a degree-Template:Mvar monic polynomial. (Template:Math, Template:Math, Template:Math, and so on.<ref name=Toulisse13/>)

Rosser's theorem<ref name="rosser" /> states that

<math>p_n > n \log n.</math>

Dusart (1999)<ref name=Dusart99>Template:Cite journal</ref> found tighter bounds using the form of the Cesàro/Cipolla approximations but varying the lowest-order constant term. Template:Math is the same function as above, but with the lowest-order constant term replaced by a parameter Template:Mvar:

<math>\begin{align}
p_n \; &> \; n B_0(\log n; 1) && \text{for } n \ge 2,\text{ and}\\
p_n \; &< \; n B_0(\log n; 0.9484) && \text{for } n \ge 39017,\text{ where}\\
B_0(x;C) \; &= \; x + \log x - C.\\
p_n \; &> \; n B_1(\log n; 2.25) && \text{for } n \ge 2,\text{ and}\\
p_n \; &< \; n B_1(\log n; 1.8) && \text{for } n \ge 27076,\text{ where}\\
B_1(x;C) \; &= \; x + \log x - 1 + \frac{\log x - C}{x}.
\end{align}</math>

The upper bounds can be extended to smaller Template:Mvar by loosening the parameter. For example, Template:Math for all Template:Math.<ref name=Axler19>Template:Cite journal</ref>
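These four inequalities, together with Rosser's theorem, can be checked against a list of actual primes. The Python sketch below is a numerical illustration only; the sieve limit is an arbitrary choice large enough to reach the indices 39017 and 27076, and the names `b0`, `b1` and `check_bounds` are hypothetical.

```python
import math


def primes_up_to(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if is_prime[n]]


def b0(x, c):
    return x + math.log(x) - c


def b1(x, c):
    return x + math.log(x) - 1 + (math.log(x) - c) / x


def check_bounds(limit):
    """Test each bound for every prime index n in its stated range."""
    ok = {"rosser": True, "b0_lower": True, "b0_upper": True,
          "b1_lower": True, "b1_upper": True}
    for n, p in enumerate(primes_up_to(limit), start=1):
        if n < 2:
            continue  # log(log 1) is undefined, and the bounds start at n = 2
        x = math.log(n)
        ok["rosser"] &= p > n * x
        ok["b0_lower"] &= p > n * b0(x, 1)
        ok["b1_lower"] &= p > n * b1(x, 2.25)
        if n >= 39017:
            ok["b0_upper"] &= p < n * b0(x, 0.9484)
        if n >= 27076:
            ok["b1_upper"] &= p < n * b1(x, 1.8)
    return ok


print(check_bounds(1_000_000))
```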

Axler (2019)<ref name=Axler19/> extended this to higher order, showing:

<math>\begin{align}
p_n \; &> \; n B_2(\log n;11.321) \quad \text{for } n \ge 2, \text{ and }\\
p_n \; &< \; n B_2(\log n;10.667) \quad \text{for } n \ge 46\,254\,381,\text{ where}\\
B_2(x;C) \; &= \; x + \log x - 1 + \frac{\log x - 2}{x} - \frac{(\log x)^2 - 6\log x + C}{2x^2}.
\end{align}</math>

Again, the bound on Template:Mvar may be decreased by loosening the parameter. For example, Template:Math for Template:Math.

Table of π(x), x / log x, and li(x)

The table compares exact values of Template:Math to the two approximations Template:Math and Template:Math. The approximation difference columns are rounded to the nearest integer, but the "% error" columns are computed based on the unrounded approximations. The last column, Template:Math, is the average prime gap below Template:Mvar.

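{| class="wikitable"
! x !! π(x) !! π(x) − x / log x !! li(x) − π(x) !! % error of x / log x !! % error of li(x) !! x / π(x)
|-
| 10 || 4 || 0 || 2 || 8.22% || 42.606% || 2.500
|-
| 10<sup>2</sup> || 25 || 3 || 5 || 14.06% || 18.597% || 4.000
|-
| 10<sup>3</sup> || 168 || 23 || 10 || 14.85% || 5.561% || 5.952
|-
| 10<sup>4</sup> || 1,229 || 143 || 17 || 12.37% || 1.384% || 8.137
|-
| 10<sup>5</sup> || 9,592 || 906 || 38 || 9.91% || 0.393% || 10.425
|-
| 10<sup>6</sup> || 78,498 || 6,116 || 130 || 8.11% || 0.164% || 12.739
|-
| 10<sup>7</sup> || 664,579 || 44,158 || 339 || 6.87% || 0.051% || 15.047
|-
| 10<sup>8</sup> || 5,761,455 || 332,774 || 754 || 5.94% || 0.013% || 17.357
|-
| 10<sup>9</sup> || 50,847,534 || 2,592,592 || 1,701 || 5.23% || 3.34Template:E % || 19.667
|-
| 10<sup>10</sup> || 455,052,511 || 20,758,029 || 3,104 || 4.66% || 6.82Template:E % || 21.975
|-
| 10<sup>11</sup> || 4,118,054,813 || 169,923,159 || 11,588 || 4.21% || 2.81Template:E % || 24.283
|-
| 10<sup>12</sup> || 37,607,912,018 || 1,416,705,193 || 38,263 || 3.83% || 1.02Template:E % || 26.590
|-
| 10<sup>13</sup> || 346,065,536,839 || 11,992,858,452 || 108,971 || 3.52% || 3.14Template:E % || 28.896
|-
| 10<sup>14</sup> || Template:Zwsp || 102,838,308,636 || 314,890 || 3.26% || 9.82Template:E % || 31.202
|-
| 10<sup>15</sup> || Template:Zwsp || 891,604,962,452 || 1,052,619 || 3.03% || 3.52Template:E % || 33.507
|-
| 10<sup>16</sup> || Template:Zwsp || Template:Zwsp || 3,214,632 || 2.83% || 1.15Template:E % || 35.812
|-
| 10<sup>17</sup> || Template:Zwsp || Template:Zwsp || 7,956,589 || 2.66% || 3.03Template:E % || 38.116
|-
| 10<sup>18</sup> || Template:Zwsp || Template:Zwsp || 21,949,555 || 2.51% || 8.87Template:E % || 40.420
|-
| 10<sup>19</sup> || Template:Zwsp || Template:Zwsp || 99,877,775 || 2.36% || 4.26Template:E % || 42.725
|-
| 10<sup>20</sup> || Template:Zwsp || Template:Zwsp || 222,744,644 || 2.24% || 1.01Template:E % || 45.028
|-
| 10<sup>21</sup> || Template:Zwsp || Template:Zwsp || 597,394,254 || 2.13% || 2.82Template:E % || 47.332
|-
| 10<sup>22</sup> || Template:Zwsp || Template:Zwsp || 1,932,355,208 || 2.03% || 9.59Template:E % || 49.636
|-
| 10<sup>23</sup> || Template:Zwsp || Template:Zwsp || 7,250,186,216 || 1.94% || 3.76Template:E % || 51.939
|-
| 10<sup>24</sup> || Template:Zwsp || Template:Zwsp || 17,146,907,278 || 1.86% || 9.31Template:E % || 54.243
|-
| 10<sup>25</sup> || Template:Zwsp || Template:Zwsp || 55,160,980,939 || 1.78% || 3.21Template:E % || 56.546
|-
| 10<sup>26</sup> || Template:Zwsp || Template:Zwsp || 155,891,678,121 || 1.71% || 9.17Template:E % || 58.850
|-
| 10<sup>27</sup> || Template:Zwsp || Template:Zwsp || 508,666,658,006 || 1.64% || 3.11Template:E % || 61.153
|-
| 10<sup>28</sup> || Template:Zwsp || Template:Zwsp || Template:Zwsp || 1.58% || 9.05Template:E % || 63.456
|-
| 10<sup>29</sup> || Template:Zwsp || Template:Zwsp || Template:Zwsp || 1.53% || 2.99Template:E % || 65.759
|}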

The value for Template:Math was originally computed assuming the Riemann hypothesis;<ref name="Franke">{{#invoke:citation/CS1|citation |CitationClass=web }}</ref> it has since been verified unconditionally.<ref name="PlattARXIV2012">Template:Cite journal</ref>
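The first few rows of the table can be recomputed with a short program. The Python sketch below is an illustration under stated assumptions: it evaluates the logarithmic integral through the classical series <math>\operatorname{li}(x) = \gamma + \log\log x + \sum_{k \ge 1} (\log x)^k / (k \cdot k!)</math>, truncated at 200 terms, which is adequate only for small arguments such as those used here. The names `li`, `prime_count` and `table_row` are hypothetical, and the percentage columns are not reproduced.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def li(x):
    """Logarithmic integral li(x) for x > 1 via its power series in log x."""
    log_x = math.log(x)
    total = EULER_GAMMA + math.log(log_x)
    term = 1.0
    for k in range(1, 200):
        term *= log_x / k      # term now equals (log x)^k / k!
        total += term / k
    return total


def prime_count(limit):
    """Return pi(limit) using the sieve of Eratosthenes."""
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return sum(is_prime)


def table_row(x):
    """Return (pi(x), pi(x) - x/log x, li(x) - pi(x), x/pi(x)), rounded."""
    pi_x = prime_count(x)
    return (
        pi_x,
        round(pi_x - x / math.log(x)),
        round(li(x) - pi_x),
        round(x / pi_x, 3),
    )


for k in range(1, 7):
    print(10 ** k, *table_row(10 ** k))
```

Sieving is practical only for the smallest rows; the large entries in the table come from specialised prime-counting algorithms.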

Analogue for irreducible polynomials over a finite field

There is an analogue of the prime number theorem that describes the "distribution" of irreducible polynomials over a finite field; the form it takes is strikingly similar to the case of the classical prime number theorem.

To state it precisely, let Template:Math be the finite field with Template:Mvar elements, for some fixed Template:Mvar, and let Template:Mvar be the number of monic irreducible polynomials over Template:Mvar whose degree is equal to Template:Mvar. That is, we are looking at polynomials with coefficients chosen from Template:Mvar, which cannot be written as products of polynomials of smaller degree. In this setting, these polynomials play the role of the prime numbers, since all other monic polynomials are built up of products of them. One can then prove that

<math>N_n \sim \frac{q^n}{n}.</math>

If we make the substitution Template:Math, then the right-hand side is just

<math>\frac{x}{\log_q x},</math>

which makes the analogy clearer. Since there are precisely Template:Math monic polynomials of degree Template:Mvar (including the reducible ones), this can be rephrased as follows: if a monic polynomial of degree Template:Mvar is selected randomly, then the probability of it being irreducible is about Template:Math.

One can even prove an analogue of the Riemann hypothesis, namely that

<math>N_n = \frac{q^n}n + O\left(\frac{q^\frac{n}{2}}{n}\right).</math>

The proofs of these statements are far simpler than in the classical case. They involve a short combinatorial argument,<ref>Template:Cite journal</ref> summarised as follows: every element of the degree Template:Mvar extension of Template:Mvar is a root of some irreducible polynomial whose degree Template:Mvar divides Template:Mvar; by counting these roots in two different ways one establishes that

<math>q^n = \sum_{d\mid n} d N_d,</math>

where the sum is over all divisors Template:Mvar of Template:Mvar. Möbius inversion then yields

<math>N_n = \frac{1}{n} \sum_{d\mid n} \mu\left(\frac{n}{d}\right) q^d,</math>

where Template:Math is the Möbius function. (This formula was known to Gauss.) The main term occurs for Template:Math, and it is not difficult to bound the remaining terms. The "Riemann hypothesis" statement depends on the fact that the largest proper divisor of Template:Mvar can be no larger than Template:Math.
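The inversion formula is easy to evaluate exactly with integer arithmetic. The Python sketch below is illustrative; the function names `mobius` and `count_irreducible` are hypothetical, and the Möbius function is computed by simple trial division, which suffices for small degrees.

```python
def mobius(n):
    """Möbius function of a positive integer, by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # a squared prime factor gives mu = 0
            result = -result
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result


def count_irreducible(q, n):
    """Number of monic irreducible polynomials of degree n over a field of q elements."""
    total = sum(mobius(n // d) * q ** d for d in range(1, n + 1) if n % d == 0)
    return total // n             # the sum is always divisible by n


q = 2
for n in range(1, 11):
    count = count_irreducible(q, n)
    # Compare the exact count with the main term q^n / n.
    print(n, count, q ** n / n)
```

The printed counts can be compared with the main term to see that the difference stays within a small multiple of <math>q^{n/2}/n</math>.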

See also

Citations

Template:Reflist

References

External links